Search

Showing top 114 results for "model-by-model evaluation"

People also ask

What’s the difference between evaluating an AI model and evaluating an AI agent?

While model and agent evaluation are inextricably linked, their technical benchmarks and metrics for success are fundamentally different.

Mastering Agentic Techniques: AI Agent Evaluation | NVIDIA Technical Blog

What is NVIDIA Model Optimizer?

The NVIDIA Model Optimizer (ModelOpt) library incorporates state-of-the-art model optimization techniques to compress and accelerate AI models. These techniques include quantization, distillation, pruning, speculative decoding, and sparsity. ModelOpt accepts Hugging Face, PyTorch, or ONNX format models as input and provides Python APIs for users to easily combine different optimization techniques to produce optimized checkpoints. ModelOpt supports highly performant quantization formats such as FP4, FP8, INT8, and INT4, and advanced algorithms including SmoothQuant, AWQ, SVDQuant, and Double Q

Model Quantization: Post-Training Quantization Using NVIDIA Model Optimizer | NVIDIA Technical Blog
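
As a rough illustration of the quantization idea mentioned in the snippet above, the sketch below shows symmetric per-tensor INT8 quantization in plain Python. It is not the ModelOpt API. The function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, chosen only for this example.

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero input.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the integer codes."""
    return [q * scale for q in quantized]


# Hypothetical weights, used only to exercise the functions.
weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

Each restored value differs from its original by at most half a quantization step (`scale / 2`), which is the accuracy cost traded for the smaller integer representation.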

What is CLIP?

CLIP (Contrastive Language-Image Pretraining), introduced by OpenAI in 2021, is a foundation vision language model (VLM) that learns a shared embedding space for images and text through contrastive learning on large image-text pairs. Its ability to produce semantically aligned representations has made it a core building block across modern multimodal systems. The CLIP text encoder is widely reused as a conditioning module for text-to-image (Stable Diffusion, for example) and text-to-video (AnimateDiff, for example) synthesis. Its vision encoder serves as the visual backbone in multimodal LLMs

Model Quantization: Post-Training Quantization Using NVIDIA Model Optimizer | NVIDIA Technical Blog
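
To make the "shared embedding space" idea in the snippet above concrete, here is a minimal sketch of how aligned image and text vectors can be compared by cosine similarity. The embeddings are made-up toy values, not outputs of a real CLIP model, and `cosine` and `best_caption` are hypothetical helper names.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_vec, text_vecs):
    """Return the caption whose embedding is closest to the image embedding."""
    scores = {caption: cosine(image_vec, vec) for caption, vec in text_vecs.items()}
    return max(scores, key=scores.get), scores


# Toy embeddings (assumed values, for illustration only).
text_vecs = {
    "a photo of a dog": [0.9, 0.1, 0.0],
    "a photo of a cat": [0.1, 0.9, 0.0],
}
image_vec = [0.8, 0.2, 0.1]

label, scores = best_caption(image_vec, text_vecs)
```

With real encoders, the same comparison would run on the vectors produced by the image and text towers. Contrastive training is what pushes matching pairs toward high similarity.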

How Small Language Models Are Key to Scalable Agentic AI | NVIDIA Technical Blog

Aug 29, 2025 · Peter Belcak

How to Build Deep Agents for Enterprise Search with NVIDIA AI-Q and LangChain | NVIDIA Technical Blog

…The NVIDIA AI-Q blueprint and NeMo Agent Toolkit are both part of the broader NVIDIA Agent Toolkit, a collection of tools, models and runtimes for building, evaluating and optimizing safe, long…

Mar 18, 2026 · Sean Lopp

Develop Native Multimodal Agents with Qwen3.5 VLM Using NVIDIA GPU-Accelerated Endpoints | NVIDIA Technical Blog

…by NVIDIA Blackwell GPUs. As part of the NVIDIA Developer Program, you can explore quickly in the browser, experiment with prompts, and even test the model with your own data to evaluate…

Feb 27, 2026 · Anu Srivastava

Updating Classifier Evasion for Vision Language Models | NVIDIA Technical Blog

Trustworthy AI / Cybersecurity: Transformer-based…

Jan 28, 2026 · Joseph Lucas

NVIDIA Blackwell Sets STAC-AI Record for LLM Inference in Finance | NVIDIA Technical Blog

…language models (LLMs) enhance financial trading by analyzing unstructured data such as financial news and earnings reports to predict market movements and automate strategies. The STAC-AI LANG6 benchmark evaluates LLM inference…

May 27, 2026 · Dan Blanaru

OSMO Platform

…Infrastructure Skills for Your Coding Agent: Agent-aware workflows can reason about pipelines, monitor execution, inspect capacity, and ensure traceable, auditable model deployment. Secure With Open Standards: Secure your solution with OIDC…

MiniMax M2.7 Advances Scalable Agentic Workflows on NVIDIA Platforms for Complex AI Applications | NVIDIA Technical Blog

…By Anu Srivastava and Shruti Koparkar. The release of MiniMax M2.7 adds enhancements to the popular MiniMax M2.5 model, built…

Apr 12, 2026 · Anu Srivastava

Deploy Long-Context Reasoning and Agentic Workflows with MiniMax M3 on NVIDIA Accelerated Infrastructure | NVIDIA Technical Blog

…on the latest frontier models. Get started today Developers can prototype and evaluate MiniMax M3 by using the GPU-accelerated API on build.nvidia.com or by downloading the weights from Hugging…

Jun 12, 2026 · Anu Srivastava

Unlock Exascale Performance on NVIDIA GB200 NVL72 with Slurm Topology-Aware Job Scheduling | NVIDIA Technical Blog

…The optimal segment size for a given application is determined by factors such as model type and the combination of parallelism types used for training. Generally, larger jobs (those utilizing more GPUs…

May 21, 2026 · Sachin Lakharia

Removing the Guesswork from Disaggregated Serving | NVIDIA Technical Blog

…Aichen focuses on AI inference frameworks and deep learning model optimization, and is particularly interested in large language models and multimodal models…

Mar 9, 2026 · Tianhao Xu
