τ_0-WM: A Unified Video-Action World Model for Robotic Manipulation
…The model is trained on approximately 27,300 hours of real-robot teleoperation, UMI-style interaction, egocentric human videos, and rollout or failure trajectories using modality-specific supervision masks. At inference time…
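As a rough illustration of what modality-specific supervision masks could look like, the sketch below averages a video loss and an action loss over a mixed batch while skipping terms a data source does not supervise. The `SUPERVISION` table, the source names, and the `masked_loss` function are hypothetical; in particular, the assumption that egocentric human videos carry no action labels is ours and is not confirmed by the abstract.

```python
import numpy as np

# Hypothetical table: which loss terms each data source supervises.
# These assignments are illustrative assumptions, not taken from the paper.
SUPERVISION = {
    "teleop": {"video": 1.0, "action": 1.0},
    "umi": {"video": 1.0, "action": 1.0},
    "human_video": {"video": 1.0, "action": 0.0},  # assumed: no robot actions
    "rollout": {"video": 1.0, "action": 1.0},
}


def masked_loss(video_err, action_err, sources):
    """Sum of masked mean losses over a batch of per-sample errors."""
    video_mask = np.array([SUPERVISION[s]["video"] for s in sources])
    action_mask = np.array([SUPERVISION[s]["action"] for s in sources])
    # Normalize by the number of supervised samples; the max() guard avoids
    # dividing by zero when no sample in the batch supervises a term.
    video_loss = (video_err * video_mask).sum() / max(video_mask.sum(), 1.0)
    action_loss = (action_err * action_mask).sum() / max(action_mask.sum(), 1.0)
    return float(video_loss + action_loss)
```

Normalizing each term by its own mask count keeps the action loss from shrinking merely because a batch happens to contain many samples without action labels.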