Paper page - Key-Value Means
…AI-generated summary We present Key-Value Means ("KVM"), a novel block-recurrence for attention that can accommodate either fixed-size or growing state. Equipping a strong transformer baseline with fixed-size…
…points along sampling trajectories rather than only at a few fixed anchors. Second, we propose a continuous-time alignment objective that performs active off-trajectory matching on latents extrapolated via the student…
…However, existing methods often train separate models for each problem setting, which fixes the input-output mapping and limits the modeling of correlations across modalities. We present UniVidX, a unified multimodal framework…
…One nuance is that γ is estimated across different depths, so fixing the horizon cannot directly test γ itself. But I agree that a tighter ablation would be useful: fixing the horizon…
…reliability, generated environments substantially improve embodied agent performance that generalizes to unseen benchmarks, and co-evolution yields an 18-point success-rate gain over fixed-environment learning and a 40-point gain…
…This yields an optimization mechanism that modulates the geometry of weight matrices while keeping their spectral norm fixed. We derive the Pion update rule, systematically examine its design choices, and analyze its…
…Xiangyuan Xue, Yifan Zhou, … Abstract Strategic Trajectory Abstraction framework enhances long-horizon decision making in large language models by introducing trajectory-level strategies that improve sample efficiency and performance across interactive environments…
…However, the performance of consistency-distilled models often degrades as more sampling steps are allocated at test time, limiting their effectiveness for any-step video diffusion. This limitation arises because consistency distillation…
…stages to improve performance in complex agentic environments. AI-generated summary Existing memory-augmented LLM agents often treat memory as a static repository with pre-defined representations and fixed retrieval pipelines, which…
…a parallel visual memory path that retrieves from a fixed visual token bank and gates it back into the main stream, so perception doesn't fade as text length grows. i’ve…