Introducing Claude Opus 4.5
…As future models surpass it, we expect to update limits as needed. Footnotes 1. This result was using parallel test-time compute, a method that aggregates multiple “tries” from the model and…
…Our investment professionals live in data and analytical models, and Claude for Excel meets them there. Analysts are using it to build and update coverage models, separate signal from noise, and pressure…
…We ran an updated version of the benchmark that uses 12 exploits reported after the latest knowledge cutoff dates of all models (January 1, 2026), with problems sourced from the DefiHackLabs dataset…
…quickly. Models building their own software tools might have seemed outlandish not long ago, but it is happening. It would be unwise to rule out the same trajectory in hardware. Updated Jun…
Alignment The persona selection model Feb 23, 2026 AI assistants like Claude can seem surprisingly human. They express joy after solving tricky coding tasks. They express distress when…
…The evaluations we used are intended to elicit particularly egregious misaligned actions that normal Claude models never engage in. One result is unsurprising: the model learns to reward hack. This is to…
…Output obfuscation attacks prompt models to disguise their outputs in ways that appear harmless if a classifier is only looking at a model’s output. For example, during adversarial testing, attackers successfully…
…update and refine the safeguards after launch. Below we discuss each of Fable 5’s new safeguards in turn. Our wider suite of safeguards is discussed and evaluated in the model’s…
Product Introducing Claude Sonnet 4.6 Feb 17, 2026 Claude Sonnet 4.6 is our most capable Sonnet model yet. It’s a full upgrade of the model’s skills across coding…
…Claude 4 models outperform other frontier models as research agents across financial tasks in Vals AI's Finance Agent benchmark. When deployed by FundamentalLabs to build an Excel agent, Claude Opus 4…