Search: AI in systems

Paper page - From AGI to ASI

…This report investigates how AI itself might continue to develop in a post-AGI world along the continuum of machine intelligence. The endpoint of this continuum, Universal AI, is theoretically well understood…

Jun 15, 2026

Paper page - Discovering Cooperative Pipelines: Autoresearch for Sequential Social Dilemmas

…an outer-loop AI agent autonomously redesigns the inner-loop pipeline of an LLM policy-synthesis system for multi-agent Sequential Social Dilemmas (SSDs). A researcher agent R (run as a coding…

May 29, 2026

Paper page - Multi-Stream LLMs: Unblocking Language Models with Parallel Streams of Thoughts, Inputs and Outputs

…Even advanced AI agents function on message exchange formats, successively exchanging messages with users, systems, with itself (i.e. chain-of-thought) and tools in a single stream of computation. This bottleneck…

May 13, 2026

Paper page - FutureSim: Replaying World Events to Evaluate Adaptive Agents

…Intelligent Systems. Authors include Jonas Geiping. Abstract: FutureSim enables evaluation of AI agents' long-term predictive capabilities by simulating chronological real-world event sequences, revealing significant gaps in current forecasting performance. AI-generated…

May 15, 2026

Paper page - AcademiClaw: When Students Set Challenges for AI Agents

…Junjie Yu et al. Abstract: AcademiClaw presents a comprehensive benchmark for evaluating AI agents on complex academic tasks spanning multiple domains, revealing significant capability gaps in current models. (AI-generated summary) Benchmarks within the…

May 5, 2026

Paper page - RealICU: Do LLM Agents Understand Long-Context ICU Data? A Benchmark Beyond Behavior Imitation

…Together, RealICU provides a clinically grounded testbed for measuring and improving AI sequential decision-support in high-stakes care. Project page: https://chengzhi-leo.github.io/RealICU-Bench/…

May 14, 2026

Paper page - Workflow-GYM: Towards Long-Horizon Evaluation of Computer-use Agentic tasks in Real-World Professional Fields

…Published on Jun 9. Submitted by taesiri on Jun 10. ByteDance Seed. Authors include Jingzhe Ding. Abstract: Current AI…

Jun 10, 2026

Paper page - Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models

…2 Om AI Lab. Abstract: A systematic comparison of vision-language models and video generation models reveals complementary strengths for spatial intelligence tasks, with vision-language models excelling in semantic tagging…

Jun 2, 2026

Paper page - Reducing Political Manipulation with Consistency Training

…28. Submitted by Long Phan on May 29. Center for AI Safety. Abstract: Large language models demonstrate systematic political bias in handling opposing viewpoints, which can be mitigated through a reinforcement…