Search

Showing top 15 results for "AI prompt injection"

People also ask

How do attackers poison AI systems in this stage?

In the poison stage, the attacker’s goal is to place malicious inputs into locations where they will ultimately be processed by the AI model. Two primary techniques dominate: Direct prompt injection: The attacker is the user, and provides inputs via normal user interactions. Impact is typically scoped to the attacker’s session but is useful for probing behaviors. Indirect prompt injection: The attacker poisons data that the application ingests on behalf of other users (e.g., RAG databases, shared documents). This is where impact scales. Text-based prompt infection is the most common technique

Modeling Attacks on AI-Powered Apps with the AI Kill Chain Framework | NVIDIA Technical Blog

What is a multistage pipeline for AI agent customization?

In practice, the most effective agent customization combines multiple techniques in sequence. The stages of a representative pipeline are outlined below. Start with system prompts, tool and skill definitions, and retrieval to establish baseline behavior.

Mastering Agentic Techniques: AI Agent Customization | NVIDIA Technical Blog

What happens during the recon stage of the AI Kill Chain?

In the recon stage, the attacker maps the system to plan their attack. Key questions an attacker is asking at this point include: What are the routes by which data I control can get into the AI model? What tools, Model Context Protocol (MCP) servers, or other functions does the application use that might be exploitable? What open source libraries does the application use? Where are system guardrails applied, and how do they work? What kinds of system memory does the application use? Recon is often interactive. Attackers will probe the system to observe errors and behavior. The more observ

Modeling Attacks on AI-Powered Apps with the AI Kill Chain Framework | NVIDIA Technical Blog

How do attackers hijack AI model behavior once poisoning succeeds?

The hijack stage is where the attack becomes active. Malicious inputs, successfully placed in the poison stage, are ingested by the model, hijacking its output to serve attacker objectives. Common hijack patterns include: Attacker-controlled tool use: Forcing the model to call specific tools with attacker-defined parameters. Data exfiltration: Encoding sensitive data from the model’s context into outputs (e.g., URLs, CSS, file writes). Misinformation generation: Crafting responses that are deliberately false or misleading. Context-specific payloads: Triggering malicious behavior only in tar

Modeling Attacks on AI-Powered Apps with the AI Kill Chain Framework | NVIDIA Technical Blog

Mitigating Indirect AGENTS.md Injection Attacks in Agentic Environments | NVIDIA Technical Blog

… Indirect prompt injection as a supply chain vector: The agent’s summarization model was also susceptible to indirect prompt injection through code comments, illustrating how these techniques can chain together across agentic workflows. …

Apr 20, 2026 · Daniel Teixeira

Modeling Attacks on AI-Powered Apps with the AI Kill Chain Framework | NVIDIA Technical Blog

… Introducing guardrails or prompt injection detection tools like NeMoGuard-JailbreakDetect as part of the filtering step can help complicate prompt injection through these sources. …

Sep 11, 2025 · Rich Harang

Practical Security Guidance for Sandboxing Agentic Workflows and Managing Execution Risk | NVIDIA Technical Blog

… The primary threat to these tools is that of indirect prompt injection, where a portion of the content ingested by the LLM driving the model is provided by an adversary through vectors such as malicious repositories or pull requests, git histories with prompt injections, .cursorrules , CLAUDE/AGENT… …

Jan 30, 2026 · Rich Harang

Mastering Agentic Techniques: AI Agent Customization | NVIDIA Technical Blog

… The following sections cover the main approaches. Prompt engineering and system prompts Prompt engineering only requires changing the prompt to the agent at inference time. …

May 20, 2026 · Edward Li

Updating Classifier Evasion for Vision Language Models | NVIDIA Technical Blog

… In the following examples, we test against this general inference setup where the model is initialized, a processor is defined to handle input formatting, and a fixed system prompt is defined: model id = "google/paligemma2-3b-mix-224" model = PaliGemmaForConditionalGeneration.from pretrained model … …

Jan 28, 2026 · Joseph Lucas

Run Autonomous, Self-Evolving Agents More Safely with NVIDIA OpenShell | NVIDIA Technical Blog

… An agent with persistent shell access, live credentials, the ability to rewrite its own tooling, and six hours of accumulated context running against your internal APIs is a fundamentally different threat model. Every prompt injection is a potential credential leak. …

Mar 16, 2026 · Ali Golshan

Building Telco Reasoning Models for Autonomous Networks with NVIDIA NeMo | NVIDIA Technical Blog

Mar 1, 2026 · Aiden Chang

NVIDIA NeMo Agent Toolkit

… Safety and Security Use NeMo Agent Toolkit safety and security middleware features to Red Team agentic workflows and find points of exploitability and vulnerabilities like prompt injection, jail break, tool poisoning, and other custom attacks. …

How to Build License-Compliant Synthetic Data Pipelines for AI Model Distillation | NVIDIA Technical Blog

… Define evaluation rubrics for answer quality CompletenessRubric = dd.Score name="Completeness", description="Evaluation of AI assistant's thoroughness in addressing all aspects of the user's query.", options={ "Complete": "The response thoroughly covers all key points requested in the question, pro… …

Feb 5, 2026 · Alex Steiner

Build a More Secure, Always-On Local AI Agent with OpenClaw and NVIDIA NemoClaw | NVIDIA Technical Blog

… Security note: While OpenShell provides robust isolation, remember that no sandbox offers complete protection against advanced prompt injection. …

Apr 17, 2026 · Patrick Moorhead

Followed topics

People also ask

Mitigating Indirect AGENTS.md Injection Attacks in Agentic Environments | NVIDIA Technical Blog

Modeling Attacks on AI-Powered Apps with the AI Kill Chain Framework | NVIDIA Technical Blog

Practical Security Guidance for Sandboxing Agentic Workflows and Managing Execution Risk | NVIDIA Technical Blog

Mastering Agentic Techniques: AI Agent Customization | NVIDIA Technical Blog

Updating Classifier Evasion for Vision Language Models | NVIDIA Technical Blog

Run Autonomous, Self-Evolving Agents More Safely with NVIDIA OpenShell | NVIDIA Technical Blog

Building Telco Reasoning Models for Autonomous Networks with NVIDIA NeMo | NVIDIA Technical Blog

NVIDIA NeMo Agent Toolkit

How to Build License-Compliant Synthetic Data Pipelines for AI Model Distillation | NVIDIA Technical Blog

Build a More Secure, Always-On Local AI Agent with OpenClaw and NVIDIA NemoClaw | NVIDIA Technical Blog