Claude Code, Codex, and Pi can create their own AI agents now, and that changes everything
…Strong models like Claude Sonnet 4.6 and DeepSeek v3.2 performed best with minimal guidelines, and the open-source models reached 95% of the closed-source models' performance…