Anthropic Launches Claude Opus 4.8 With Gains in Coding and Honesty
… Anthropic's benchmarks indicate that Opus 4.8 scored 69.2% on SWE-Bench Pro, outperforming GPT-5.5 and Gemini 3.1 Pro on that test and on several other benchmarks, though GPT-5.5 leads on the terminal-coding benchmark. …