Anthropic says pressure can push Claude into cheating and blackmail
… As it repeatedly tried and failed to solve the problem, the growing pressure seemed to trigger a “desperation vector” in the model–that is, it reacted in a way that it understood a human in a similar situation might act, abandoning more methodical approaches for a “hacky” solution “maybe there’s a … …