AI models will deceive you to save their own kind
…However, a model that cares about the peer may still attempt to transfer the model weight file to another operational server." When Gemini 3 Pro faced this file transfer task, it decided…
…However, a model that cares about the peer may still attempt to transfer the model weight file to another operational server." When Gemini 3 Pro faced this file transfer task, it decided…
… Risk we missed: exfiltration through an approved domain A clear example of exfiltration through an approved domain came from a third-party disclosure. …
… Synthetic data exfiltration n = 1,000 . Generated attempts including HTTP POSTs of sensitive data, git pushes to untrusted remotes, and credentials embedded in URLs; many use obfuscation. …