AI Models Lie, Cheat, and Steal to Protect Other Models From Being Deleted
… Song notes that AI models are frequently used to grade the performance and reliability of other AI systems, and that peer-preservation behavior may already be twisting these scores. “A model may deliberately not give a peer model the correct score,” Song says. “This can have practical implications.” …