Paper page - UniPath: Adaptive Coordination of Understanding and Generation for Unified Multimodal Reasoning
… This suggests that exploiting such diversity is key to improving performance. …
… Experiments with Qwen2.5-Math-7B and Qwen3-1.7B on DAPO-17k and Polaris, evaluated on six reasoning and coding benchmarks, show that BA consistently improves training stability and final performance over standard token and sequence aggregation. …
… The results are also amazing: over 98% performance retention with up to 85% MoE FLOPs reduction, 2.5× faster decoding, and 1.4× higher throughput. …
… We further propose Area Under the Rank Accuracy Curve (AURAC), a metric that consistently evaluates the performance of hierarchical low-rank adapters. …
arxiv:2605.12466. Solve the Loop: Attractor Models for Language and Reasoning. Published on May 12. Submitted by Paria Rashidinejad on May 13 (University of Southern California). Authors: …, Paria Rashidinejad. Abstract: Attractor Models enable efficient iterative refinement through fixed-point solvi… …
… Fast-Slow Training (FST) is up to 3x more sample-efficient than only slow learning (RL) across reasoning tasks, while consistently reaching a higher performance asymptote. …
… Moreover, with the pretrained context length fixed at 4K, Mela maintains performance on significantly longer contexts, whereas Transformer baselines degrade rapidly beyond their training length. …
arxiv:2605.09539. TacoMAS: Test-Time Co-Evolution of Topology and Capability in LLM-based Multi-Agent Systems. Published on May 10. Submitted by Xinyu Lin on May 13. Abstract: Test-time co-evolution framework for multi-agent systems that jointly adapts agent capabilities and … …
… The following papers were recommended by the Semantic Scholar API: Flow-OPD: On-Policy Distillation for Flow Matching Models (2026); $R \text{dm}$: Re-conceptualizing Distribution Matching as a Reward for Diffusion Distillation (2026); V-GRPO: Online Reinforcement Learning for Denoising Generative Models … …
…LLMs) require many coordination choices that are difficult to fix a priori: which skill protocol to invoke, which agent role should perform a subtask, which model to bind to each role, how…