Search

Showing top 2 results for "agentic AI direction"

People also ask

How are NVIDIA and the OSS community accelerating inference for local agentic AI?

With agents running 24 hours a day, seven days a week on increasingly complex tasks, efficient local compute matters even more. NVIDIA has collaborated with the open source community to enhance the top inference backends for agents, llama.cpp and vLLM. llama.cpp now delivers 2x performance on Qwen 3.5 and 3.6 27B dense models, and 1.6x performance on Qwen 3.5 and 3.6 35B mixture-of-experts (MoE) models. The following two techniques make this possible:

Multi-Token Prediction (MTP): An advanced speculative decoding technique, where a smaller draft model proposes several tokens ahead that the targ…

Build Personal AI Agents on Windows PCs with New Tools from Microsoft and NVIDIA | NVIDIA Technical Blog
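
The draft-and-verify idea behind speculative decoding, as described in the snippet above, can be sketched in a few lines. This is a minimal, generic greedy variant and not llama.cpp's or vLLM's implementation: `speculative_decode`, `greedy_decode`, `toy_target`, and `toy_draft` are hypothetical names, and the "models" are toy functions that map a token list to the next token. A real backend would verify all drafted tokens in one batched forward pass of the target model; here the target is called once per token for clarity.

```python
from typing import Callable, List

# Hypothetical model interface: takes the tokens so far, returns the next token.
NextToken = Callable[[List[int]], int]


def greedy_decode(target: NextToken, prompt: List[int], max_new: int) -> List[int]:
    """Baseline: ask the target model for one token at a time."""
    out = list(prompt)
    for _ in range(max_new):
        out.append(target(out))
    return out


def speculative_decode(target: NextToken, draft: NextToken,
                       prompt: List[int], max_new: int, k: int = 4) -> List[int]:
    """Greedy draft-and-verify: the output matches greedy_decode(target, ...)."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        # 1) The cheap draft model proposes k tokens ahead.
        proposal: List[int] = []
        ctx = list(out)
        for _ in range(k):
            token = draft(ctx)
            proposal.append(token)
            ctx.append(token)
        # 2) The target checks each proposed token in order. Only the target's
        #    own token is ever appended, so the result stays exact.
        for token in proposal:
            expected = target(out)
            out.append(expected)
            if expected != token or len(out) - len(prompt) >= max_new:
                break  # first mismatch (or budget reached): drop the rest
        else:
            # Every drafted token was accepted: the target adds one more token.
            out.append(target(out))
    return out


def toy_target(ctx: List[int]) -> int:
    """Toy 'large' model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    """Toy 'small' model: agrees with the target except when the answer is 5."""
    nxt = (ctx[-1] + 1) % 10
    return 0 if nxt == 5 else nxt


if __name__ == "__main__":
    print(speculative_decode(toy_target, toy_draft, [0], max_new=12))
```

Any speedup in this scheme comes from how many drafted tokens the target accepts per verification step; a draft that disagrees often yields little benefit, but the output is unchanged either way.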

How does NVIDIA RTX Spark power personal AI agents?

Earlier this week at GTC Taipei, NVIDIA unveiled the NVIDIA RTX Spark product family, including small form factor desktops and laptops built for the age of personal assistants. These desktops and laptops deliver 1 petaflop of AI power, up to 128 GB of memory, and CUDA-accelerated AI frameworks for running large models alongside everyday work. Microsoft is creating an RTX Spark special developer edition—the Microsoft Surface NVIDIA RTX Spark Dev Box—preloaded with a modified Windows configured for developers and the top developer tools you need to get started. To learn more, see Building the n

Build Personal AI Agents on Windows PCs with New Tools from Microsoft and NVIDIA | NVIDIA Technical Blog

How does multi-GPU support scale AI performance for RTX PCs?

One popular way to run AI locally has been to use multiple GPUs to access more memory and compute. While cloud frameworks like vLLM are well optimized for multiple GPUs thanks to their use in data centers, PC frameworks like llama.cpp and the ComfyUI implementation in PyTorch are not optimized for it. To solve this challenge, NVIDIA has collaborated with both llama.cpp and ComfyUI to enhance performance for RTX PCs with two equivalent GPUs. This enables you to run larger models and use the compute of both GPUs for better performance. llama.cpp now supports tensor parallelism (TP), fully utiliz

Build Personal AI Agents on Windows PCs with New Tools from Microsoft and NVIDIA | NVIDIA Technical Blog
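
As a rough illustration of what tensor parallelism means, the sketch below splits a weight matrix by columns across two workers, lets each compute its share of the output, and concatenates the results. It assumes one common column-split scheme and uses plain Python lists in place of GPUs; the function names are hypothetical, and nothing here reflects how llama.cpp actually implements TP.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of a (m x n) and b (n x p)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def split_columns(w: Matrix, parts: int) -> List[Matrix]:
    """Split w into `parts` column shards (assumes the width divides evenly)."""
    step = len(w[0]) // parts
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(parts)]


def tensor_parallel_matmul(x: Matrix, w: Matrix, parts: int = 2) -> Matrix:
    """Compute x @ w shard by shard, as if each shard lived on its own GPU."""
    shards = split_columns(w, parts)
    # Each partial product is independent, so real hardware could run them
    # at the same time on separate devices.
    partials = [matmul(x, shard) for shard in shards]
    # Concatenate the per-shard output columns back into full rows.
    return [sum((p[r] for p in partials), []) for r in range(len(x))]


if __name__ == "__main__":
    x = [[1, 2], [3, 4]]
    w = [[1, 0, 2, 1], [0, 1, 1, 3]]
    print(tensor_parallel_matmul(x, w))
```

Each worker holds only its shard of the weights, which is why this style of split lets two GPUs hold a model that would not fit on one.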

Build Personal AI Agents on Windows PCs with New Tools from Microsoft and NVIDIA

Agentic AI / Generative AI. Turnkey agent sandboxing on native Windows is now available, plus 2x faster agentic inference, new agent apps, and more. Jun 02, 2026. By Annamalai Chockalingam and Gerardo Delgado. …

Jun 2, 2026 · Annamalai Chockalingam

NVIDIA RTX Innovations Are Powering the Next Era of Game Development | NVIDIA Technical Blog

… This post provides a detailed overview of these latest innovations, including: introducing a new system for dense, path-traced foliage in NVIDIA RTX Mega Geometry; adding path-traced indirect lighting with ReSTIR PT in the NVIDIA RTX Dynamic Illumination SDK and RTX Hair beta for strand-based accele… …

Mar 10, 2026 · Ike Nnoli
