Showing top 106 results for "model rollout timing"

Run High-Throughput Reinforcement Learning Training with End-to-End FP8 Precision | NVIDIA Technical Blog

…Extending FP8 for KV cache and attention With a transformer model, linear layers are not the only bottleneck. KV cache growth and attention computation often dominate the end-to-end rollout time…

Apr 20, 2026 · Guyue Huang

Paper page - ReflectDrive-2: Reinforcement-Learning-Aligned Self-Editing for Discrete Diffusion Driving

…Full-rollout RL proves crucial for coupling drafting and editing: under supervised training alone, inference-time AutoEdit improves PDMS by at most 0.3, whereas RL increases its gain to 1.9…

May 8, 2026

Gmail's 'Help me write' tool is getting a couple of key upgrades

…He has covered tech for over a decade for multiple publications, including Times Internet, Guiding Tech, Android Headlines, and several others. His love for Android dates back to the Samsung/Google Nexus…

May 8, 2026 · Chethan Rao

Paper page - Rebellious Student: Reversing Teacher Signals for Reasoning Exploration with Self-Distilled RLVR

…Time to rebel 🧑‍🎓⚡ So far in LLM post-training, on-policy self-distillation pulls the student toward the teacher (the same model, but one that has seen a correct solution). But…

May 12, 2026

Discussions and forums

Hacker News · u/ttanv · 3d ago

Show HN: Levi – run AlphaEvolve on your Claude Code/Codex for dirt cheap

Hi HN, wanted to share something I'm excited about. I’ve been fascinated by AlphaEvolve and its results for more than a year now, but using open source frameworks seems overwhelming because of the high costs. I can’t reall…

Paper page - TMAS: Scaling Test-Time Compute via Multi-Agent Synergy

…George Wu, Qing Yi, … Abstract: TMAS is a multi-agent framework for test-time scaling that enhances large language model reasoning through structured collaboration and hierarchical memory systems. AI-generated summary: Test…

May 12, 2026

Paper page - RAVEN: Real-time Autoregressive Video Extrapolation with Consistency-model GRPO

…Real-time Autoregressive Video Extrapolation with Consistency-model GRPO. Published on May 14. Submitted by Yanzuo Lu on May 15. MVP Lab. Authors: Yanzuo Lu, … Abstract: RAVEN enables real-time video generation…

May 15, 2026

Muse Spark is Meta’s answer to Gemini — and it’s a full reboot

…This slower rollout gives Meta time to improve performance and address safety concerns, which have become a bigger focus in the AI industry. The company says it is adding stronger safeguards and…

Apr 9, 2026 · Jay Bonggolto

Building Autonomous Vehicles That Reason with NVIDIA Alpamayo | NVIDIA Technical Blog

…the model on it, and visualize the output trajectories and their associated reasoning traces. In particular, the example data contains the ego-vehicle passing a construction zone, with four timesteps (columns) from…

Jan 5, 2026 · Marco Pavone

‘Hopefully people now are talking and not just relying on something pushing them somewhere’: Indie music platform Vocana is fresh off the slab, longing to challenge Spotify’s algorithm and artist payment — I spoke with its President who dissected the company’s vision and teased a global rollout

…As far as algorithms go, Vocana’s structure doesn’t rely on AI models to do it all for you like Spotify does. If anything, it’s a simple model that does…

May 3, 2026 · Rowan Davies

Paper page - MinT: Managed Infrastructure for Training and Serving Millions of LLMs

…Instead of materializing each policy as a merged full checkpoint, MinT keeps the base model resident and moves exported LoRA adapter revisions through rollout, update, export, evaluation, serving, and rollback, hiding distributed…

May 14, 2026
