Showing top 115 results for "agentic improvements"

People also ask

Why are SLMs beneficial to agentic AI tasks?

SLMs are well-positioned for the agentic era because they use a narrow slice of LLM functionality for any single language model errand. LLMs are built to be powerful generalists, but most agents use only a very narrow subset of their capabilities. They typically parse commands, generate structured outputs such as JSON for tool calls, or produce summaries and answer contextualized questions. These tasks are repetitive (up to the differences in prompt payloads), predictable, and highly specialized—well within the scope of specialized SLMs. An LLM trained to handle open-domain conversations is o

How Small Language Models Are Key to Scalable Agentic AI | NVIDIA Technical Blog

How are NVIDIA and the OSS community accelerating inference for local agentic AI?

With agents running 24 hours a day, seven days a week on increasingly complex tasks, efficient local compute matters even more. NVIDIA has collaborated with the open source community to enhance the top inference backends for agents, llama.cpp and vLLM. llama.cpp now delivers 2x performance on Qwen 3.5 and 3.6 27B dense models, and 1.6x performance on Qwen 3.5 and 3.6 35B mixture-of-experts (MoE) models. The following two techniques make this possible: Multi-Token Prediction (MTP): An advanced speculative decoding technique, where a smaller draft model proposes several tokens ahead that the targ

Build Personal AI Agents on Windows PCs with New Tools from Microsoft and NVIDIA | NVIDIA Technical Blog

Introducing NVIDIA Fleet Intelligence for Real-Time GPU Fleet Visibility and Optimization | NVIDIA Technical Blog

…NVIDIA is releasing the Fleet Intelligence agent as an open source project for auditability. The agent leverages other NVIDIA open source solutions such as GPUd, NVIDIA Data Center GPU Manager (DCGM), and…

May 11, 2026 · Christian Shrauder

NVIDIA Nemotron AI Models

…NVIDIA Nemotron Datasets: Improve reasoning capabilities of large language models (LLMs) with one of the broadest commercially usable open data collections for agentic AI, spanning pre-training, post-training, personas, safety, RL…

Full-Stack Optimizations for Agentic Inference with NVIDIA Dynamo | NVIDIA Technical Blog

Coding agents are starting to write production code at scale. Stripe’s agents generate 1,300+ PRs per week. Ramp attributes 30% of merged PRs to agents. Spotify reports 650+ agent-generated…

Apr 17, 2026 · Ishan Dhanani

NVIDIA Vera Rubin POD: Seven Chips, Five Rack-Scale Systems, One AI Supercomputer | NVIDIA Technical Blog

Mar 16, 2026 · Rohil Bhargava

Enhancing Distributed Inference Performance with the NVIDIA Inference Transfer Library | NVIDIA Technical Blog

…For the case of transferring between two agents, one agent plays the role of the initiator, which creates and starts the read or write operation. The other agent plays the role of…

Mar 9, 2026 · Seonghee Lee

Metropolis for Developers

NVIDIA Metropolis is a collection of models, libraries, and blueprints that provides everything you need to build, deploy, and scale video analytics AI agents and applications, from the edge to…

Building Token‑Metered AI Services on Telco AI Factories | NVIDIA Technical Blog

…to retrieval pipelines or agentic workflows. Within an AI studio, they can choose models from a curated catalog, fine-tune them with their own enterprise data to improve accuracy and relevancy, and…

May 21, 2026 · Waleed Badr

Build Real-Time Multimodal XR Apps with NVIDIA AI Blueprint for Video Search and Summarization | NVIDIA Technical Blog

…By creating generative AI agents that offer Q&A capabilities within the XR environment, users can interact more naturally and receive immediate assistance. A multimodal AI agent processes and synthesizes multiple input…

Mar 11, 2025 · Shubham Agrawal

NVIDIA NeMo Retriever

…AI Agent for Enterprise Research: Develop AI agents that continuously process and synthesize multimodal enterprise data, reason, plan, and refine to generate comprehensive reports.

Inside the NVIDIA Vera Rubin Platform: Six New Chips, One AI Supercomputer | NVIDIA Technical Blog

…The advantage is most pronounced at the service levels required for interactive agents, where prior platforms may encounter an efficiency wall where costs rise steeply to incrementally improve responsiveness. Vera Rubin remains…

Jan 5, 2026 · Kyle Aubrey
