Coding agents in the social sciences
Economic Research Coding agents in the social sciences May 27, 2026 Summary We present results from a survey of 1,260 social scientists about AI and coding agent use, fielded in February…
…However, it’s betting that such a plan would be rendered moot by improvements in model capabilities and new signals from developers on how best to use it. That’s the takeaway…
…Pretty cool, but it felt underwhelming for what was marketed as an agentic tool. Anthropic released the new /goal command for Claude Code, and it's the first time the word "agentic…
…improved “scaffolding” (additional tools and training like we mentioned above) is a straightforward path by which Claudius-like agents could be more successful. General improvements to model intelligence and long-context performance…
The engineering practices Claude Code and Codex use to improve AI agents
Multi Agent Continuous Context Harness - MACCHA solves the problem that every AI coding session starts from zero. It combines a file-based 7-tier context architecture with a working memory engine (Memanto) that features …
I have been interested in long-horizon coding tasks for a while, especially with benchmarks like FrontierSWE, where even the best coding agents like Codex and Claude Code struggle to complete tasks. These agents come with…
Data is “the new oil” for AI. What if you could “plug in” to an oil well, and get royalties forever whenever that well’s oil was used? Right now, the people who build those datasets get paid once, if at all. There's no rec…
Claw-Coder is an AI agent that runs locally on your laptop and has access to powerful tools. Instead of configuring Claude or Codex to use a local model, just use claw-coder. Why was claw-coder created? Answer: To solve th…
…The new Claude Opus 4.6 improves on its predecessor’s coding skills. It plans more carefully, sustains agentic tasks for longer, can operate more reliably in larger codebases, and has better…
…As model capabilities improve and agents begin writing increasingly ambitious bash, it becomes harder to notice any such drift. And as users move to multi-agent systems, this approach is also much…
…The company also says Opus 4.8 showed major improvements in legal reasoning, coding, browser agents and long-form analysis tasks, with several early access partners claiming it outperformed previous Opus models…
…Second, Opus 4.7 thinks more at higher effort levels, particularly on later turns in agentic settings. This improves its reliability on hard problems, but it does mean it produces more output…
Announcements Agents for financial services May 5, 2026 We’re releasing ten ready-to-run agent templates for the most time-consuming work in financial services: building pitchbooks, screening KYC files, and…
…GPT-5.6 Sol is the flagship model targeted at demanding reasoning and agentic workloads. GPT-5.6 Terra is positioned as a balanced model for everyday work, featuring performance competitive with…