Accelerating RL Post-Training Rollouts via System-Integrated Speculative Decoding
… We implement speculative decoding in NeMo-RL with a vLLM backend, supporting both synchronous and asynchronous pipelines and enabling speculation during RL rollouts. …
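
For readers unfamiliar with the technique named in the excerpt, the following is a minimal sketch of greedy speculative decoding in plain Python. It is not the NeMo-RL or vLLM implementation and uses none of their APIs; the function names (`greedy_decode`, `speculative_greedy_decode`), the draft length `k`, and the toy "models" are all hypothetical and chosen only for illustration. A cheap draft model proposes a few tokens, the target model checks them, and the longest agreeing prefix is kept. Under greedy decoding this yields exactly the tokens the target model would have produced alone, which is presumably why the approach is attractive for rollouts.

```python
from typing import Callable, List

Token = int
# A "model" here is just a function mapping a token context to its greedy next token.
NextToken = Callable[[List[Token]], Token]


def greedy_decode(target_next: NextToken, prompt: List[Token], max_new_tokens: int) -> List[Token]:
    """Baseline: one target-model call per generated token."""
    out = list(prompt)
    for _ in range(max_new_tokens):
        out.append(target_next(out))
    return out[len(prompt):]


def speculative_greedy_decode(
    target_next: NextToken,
    draft_next: NextToken,
    prompt: List[Token],
    max_new_tokens: int,
    k: int = 4,  # hypothetical draft length per step
) -> List[Token]:
    """Greedy speculative decoding: draft proposes, target verifies."""
    out = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(out) < limit:
        # 1. The draft model proposes up to k tokens autoregressively.
        n = min(k, limit - len(out))
        proposal: List[Token] = []
        for _ in range(n):
            proposal.append(draft_next(out + proposal))

        # 2. The target model checks each proposed position. A real system would
        #    score all positions in one batched forward pass; this loop is a stand-in.
        accepted: List[Token] = []
        for tok in proposal:
            expected = target_next(out + accepted)
            if tok == expected:
                accepted.append(tok)
            else:
                # First mismatch: keep the target's token and discard the rest.
                accepted.append(expected)
                break
        else:
            # Every proposal matched: take one extra target token if there is room.
            if len(out) + len(accepted) < limit:
                accepted.append(target_next(out + accepted))

        out.extend(accepted)
    return out[len(prompt):]


if __name__ == "__main__":
    # Toy deterministic "models" (assumptions for the demo, not real networks).
    def toy_target(ctx: List[Token]) -> Token:
        return (sum(ctx) * 31 + len(ctx)) % 7

    def toy_draft(ctx: List[Token]) -> Token:
        # Agrees with the target except at every third position.
        return 0 if len(ctx) % 3 == 0 else toy_target(ctx)

    prompt = [1, 2, 3]
    baseline = greedy_decode(toy_target, prompt, 12)
    speculative = speculative_greedy_decode(toy_target, toy_draft, prompt, 12, k=4)
    print(baseline == speculative)
```

Every emitted token is either a draft token the target agreed with or the target's own correction, so the output matches plain greedy decoding regardless of draft quality; a better draft only reduces the number of verification steps. With sampling instead of greedy decoding, the verification step is replaced by a rejection-sampling rule, which this sketch does not cover.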