Search: integration/deployment

Deploying Disaggregated LLM Inference Workloads on Kubernetes | NVIDIA Technical Blog

…nvidia.com/gpu: "1" Router (a standard deployment—no leader-worker topology needed): apiVersion: apps/v1 kind: Deployment metadata: name: router spec: replicas: 2 selector: matchLabels: app: router template: metadata: labels: app…

Mar 23, 2026 · Anish Maddipoti

Deploy Self-Evolving Agents for Faster, More Secure Research with a Hermes Agent and NVIDIA NemoClaw | NVIDIA Technical Blog

…The more users work with the agent, the better it gets. While the integration points are specific to this use case (Slack, Outlook, and GitHub), the pattern of safely mixing public and…

Jun 2, 2026 · Sam Pastoriza

Pruning and Distilling LLMs Using NVIDIA TensorRT Model Optimizer | NVIDIA Technical Blog

…However, their deployment remains resource-intensive, motivating a growing interest in small language models (SLMs) that offer strong performance at a fraction of the cost. NVIDIA researchers and engineers have demonstrated a…

Oct 7, 2025 · Max Xu

NVIDIA JetPack SDK Downloads and Notes

…Release Information This section provides key details about the latest JetPack release, including version highlights, new features, supported Jetson platforms, and important updates for developers and production deployments. What's New NOTE…

NVIDIA JetPack SDK Downloads and Notes Archive

…Release Information This section provides key details about the latest JetPack release, including version highlights, new features, supported Jetson platforms, and important updates for developers and production deployments. What's New NOTE…

How to Build a Voice Agent with RAG and Safety Guardrails | NVIDIA Technical Blog

…Each layer has its own interface, latency constraints, and integration challenges, and you start to feel them as soon as you move beyond a simple prototype. In this tutorial, you’ll learn…

Jan 5, 2026 · Chris Alexiuk

Real-Time Performance Monitoring and Faster Debugging with NCCL Inspector and Prometheus | NVIDIA Technical Blog

…Live, time-series visualizations can now be powered directly within a user’s infrastructure dashboard by integrating NCCL Inspector with Prometheus Exporter. NCCL Inspector deployment architecture NCCL 2.30 introduces Prometheus Mode…

May 7, 2026 · Ava Arnaz

How Justt Scaled Chargeback Extraction with Nemotron Parse

…scale, enterprise deployment. System architecture To meet the performance, accuracy, and cost constraints, Justt designed a GPU-accelerated document processing architecture optimized for high-throughput inference. The system integrates Nemotron Parse, part…

NVIDIA RTX Innovations Are Powering the Next Era of Game Development | NVIDIA Technical Blog

…visionOS integration with privacy-preserving foveated streaming, which helps developers deliver better visual quality and performance without rebuilding content for standalone hardware. For developers, this means faster iteration, easier deployment of demanding…

Mar 10, 2026 · Ike Nnoli

Accelerating AI-Powered Chemistry and Materials Science Simulations with NVIDIA ALCHEMI Toolkit-Ops | NVIDIA Technical Blog

…integrators, and data structures to enable large-scale, batched simulations leveraging AI. ALCHEMI NIM microservices: A scalable layer of cloud‑ready, domain‑specific microservices for chemistry and materials science, enabling deployment and…

Dec 19, 2025 · Justin S. Smith
