Full-Stack Optimizations for Agentic Inference with NVIDIA Dynamo | NVIDIA Technical Blog
… Please reach out to us if you have any ideas or feedback. { "model": "MiniMaxAI/MiniMax-M2.5", "messages": ... , "tools": ... , "nvext": { "agent hints": { "osl": 256, "speculative prefill": true, "priority": 10 }, "cache control": { "type": "ephemeral", "ttl": "1h" } } } The agent hints fields: pr… …
