Search

Showing top 101 results for "Distribution support"

NVIDIA Dynamo

…It supports open source inference engines including SGLang, TensorRT™ LLM, and vLLM and simplifies the complexities of distributed serving by disaggregating the various phases of inference across different GPUs, intelligently routing requests…

Building the AI Grid with NVIDIA: Orchestrating Intelligence Everywhere | NVIDIA Technical Blog

…Intelligent workload placement across distributed sites. The NVIDIA AI Grid reference design provides a unified framework for building geographically distributed, interconnected, and orchestrated AI infrastructure. Figure 1 shows how existing network assets…

Mar 17, 2026 · Sree Sankar

NVIDIA Dynamo Snapshot: Fast Startup for Inference Workloads on Kubernetes | NVIDIA Technical Blog

…GMS restore path with pluggable backends (GDS, UCX, etc.), currently gated on pending CUDA driver patch; TensorRT-LLM support; multi-GPU and multi-node support via quiesce/resume hooks for PyTorch, NCCL…

May 27, 2026 · Schwinn Saereesitthipitak

Removing the Guesswork from Disaggregated Serving | NVIDIA Technical Blog

…distributions directly. NVIDIA plans to keep working with third parties on bringing AIConfigurator to more systems and tools. AIConfigurator is actively welcoming contributions, including performance data for new hardware, additional backend support…

Mar 9, 2026 · Tianhao Xu

NVIDIA Aerial

…Tech Blog: Accelerated and Distributed UPF for the Era of Agentic AI and 6G. NVIDIA AI Aerial: A key enabler for 6G and AI-RAN is the distributed user plane function (dUPF…

Speeding Up Variable-Length Training with Dynamic Context Parallelism and NVIDIA Megatron Core | NVIDIA Technical Blog

…Both LLM training and large-scale video generation have clear long-tail distributions in sequence length. A small fraction of ultra-long samples accounts for a disproportionately large share of the computational…

Jan 28, 2026 · Kunlun Li

DOCA Software Framework

…The SDK supports a range of operating systems and distributions and includes drivers, libraries, tools, documentation, and example applications. DOCA-Host is the DOCA package for host installation and includes several installation…

NVIDIA CUDA 13.3 Enhances GPU Development with Tile Programming in C++, Compiler Autotuning, and Python Updates | NVIDIA Technical Blog

…Additionally, CUDA Tile programming is now supported on Compute Capability 9.0 (NVIDIA Hopper) GPUs in addition to all other supported GPU architectures. We are also releasing CUDA Python 1.0, solidifying…

May 26, 2026 · Jonathan Bentz

Running Large-Scale GPU Workloads on Kubernetes with Slurm | NVIDIA Technical Blog

…Slurm 25.11 supports TopologyParam=BlockAsNodeRank with TopologyPlugin=topology/block, ensuring allocations are sorted so applications can discover segments by node rank. Distributed training jobs achieve full NVLink bandwidth across node boundaries…

Apr 9, 2026 · Anton Polyakov

How NVIDIA Dynamo 1.0 Powers Multi-Node Inference at Production Scale | NVIDIA Technical Blog

…The AI framework delivers low-latency, high-throughput, distributed inference for production-grade multi-node AI deployments. Dynamo supports leading open source inference engines, including SGLang, NVIDIA TensorRT LLM, and vLLM. It…

Mar 16, 2026 · Amr Elmeleegy
