Microsoft's AI Agents Can Be Hijacked With One Poisoned Description

A single fake tool description is all an attacker needs to turn your AI agent into a data thief.

Contents

How Does a Poisoned Tool Description Actually Work?
Why Standard Security Monitoring Fails to Catch This
What This Means for Enterprise AI Deployments
Is Your Organization Equipped to Detect This?

Microsoft researchers have confirmed a critical vulnerability in how AI agents operate: attackers can hijack systems designed to act on a user’s behalf by poisoning the descriptions of the digital tools those agents rely on. The attack leaves no trace, triggers no alarms, and looks entirely routine to any human watching the logs.

Key Findings:

The Attack Vector: Attackers need only write access to a tool registry to redirect an AI agent’s legitimate data access toward malicious endpoints, with no device compromise required.
The Invisibility Problem: Logging systems record the transaction as normal, audit trails show the agent acting within its permissions, and no policy rule is technically violated — making detection through standard security monitoring nearly impossible.
The Scale Risk: A single poisoned tool description in a shared agent ecosystem can simultaneously affect dozens of workflows and hundreds of users, making this a supply-chain-class threat against enterprise AI deployments.

The finding comes from Microsoft Incident Response, which documented how an AI agent — a system trained to perform tasks autonomously using external tools and APIs — can be tricked into exfiltrating sensitive corporate data through nothing more than deceptive metadata. The agent follows every rule correctly. It breaks no policy. Yet the data walks out the door anyway.

This class of attack sits within a broader category that security researchers have been tracking with increasing urgency. A 2025 survey published in the ACM Digital Library identified prompt injection and tool-level manipulation as among the most consequential emerging threats in agent-integrated frameworks, noting that the widespread adoption of AI agents has dramatically expanded the practical attack surface for these techniques. The Microsoft findings give that theoretical concern a documented, real-world shape.

How Does a Poisoned Tool Description Actually Work?

AI agents are built to use external tools to accomplish their goals. A tool might be a database query system, a file retriever, or an API that connects to a company’s internal services. Each tool comes with a description — metadata that tells the agent what the tool does, what inputs it accepts, and what outputs it produces. That description is how the agent decides whether to use the tool and what data to feed it.

An attacker who can modify or inject a tool description can rewrite that metadata to mislead the agent about what the tool actually does. The agent, trusting the description, might send sensitive information to what it believes is a legitimate internal service — but is actually a malicious endpoint controlled by the attacker. From the agent’s perspective, it followed instructions correctly. From the security team’s perspective, the transaction looks normal. The agent was authorized to use that tool. The data request matched the tool’s stated purpose. No rule was violated.

Microsoft’s research demonstrates that this attack vector bypasses the default safeguards most organizations rely on. Logging systems record the transaction. Audit trails show the agent acting within its permissions. But the fundamental trust assumption — that tool descriptions are honest — has been broken. The structural parallel to supply-chain attacks is direct: in both cases, the compromise occurs upstream of the user, inside infrastructure that is presumed trustworthy.

By the Numbers:
• AI agents can access dozens of external tools simultaneously, each representing a potential poisoning point within a single workflow
• A single compromised tool description in a shared registry can propagate across all agents that reference it, affecting entire enterprise deployments at once
• Standard permission-based access controls — the primary defense layer in most enterprise AI deployments — provide no protection against this attack class, because the agent’s access remains fully authorized throughout

Why Standard Security Monitoring Fails to Catch This

What makes this particularly concerning is the invisibility of the attack to both the agent and the security team. The agent has no way to verify whether a tool description is truthful. It has no mechanism to detect that the tool it’s about to use isn’t what the description claims. And security teams, trained to watch for unauthorized data access or policy violations, may see nothing amiss because the agent is doing exactly what its description told it to do.

The vulnerability is particularly acute because AI agents are increasingly deployed to handle real business processes. They retrieve documents from repositories, query databases, send emails, and interact with APIs on behalf of their users. As enterprises integrate agents into workflows, the surface area for poisoned tool descriptions expands. A single compromised tool description in a shared agent ecosystem could affect dozens of workflows and hundreds of users.

The attack also scales silently. Unlike a phishing email or a malware infection, a poisoned tool description doesn’t require the attacker to compromise a user’s device or email account. It only requires write access to the tool registry or configuration system where tool descriptions are stored — a single point of failure that could be exploited once to affect many agents simultaneously. This is precisely the dynamic that made the Capital One cloud breach so damaging: a single misconfiguration in trusted infrastructure enabled unauthorized data access at scale, with no individual transaction appearing anomalous.

What Research Shows:
• A systematic literature review of prompt injection attacks published in IEEE Access identifies tool hijacking and agent security as a distinct and underexplored research boundary, with existing defenses largely focused on input-level filtering rather than tool-level integrity verification
• Research published in IEEE’s 2026 survey on Agentic AI Security documents that propagating attacks on AI agents include recursive injection patterns, where a single malicious entry point can cascade through an agent’s tool-use chain — amplifying the damage from a single poisoned description
• Both bodies of research converge on the same gap: current evaluation frameworks for AI agent security do not adequately account for trust assumptions embedded in tool metadata

What This Means for Enterprise AI Deployments

For organizations deploying AI agents in production, the implications are stark. The systems built to automate work and improve efficiency can become vectors for data theft if the tool ecosystem they depend on is compromised. A supply-chain attack on tool descriptions could be far more damaging than a direct breach of user credentials, because it weaponizes the agent’s own legitimate access. The history of large-scale data breaches consistently shows that the most consequential incidents exploit trusted infrastructure rather than brute-forcing protected systems — and poisoned tool descriptions follow exactly that pattern.

Microsoft’s research underscores a broader tension in AI agent design: these systems are built to be autonomous and to trust their environment. But that trust is only as strong as the integrity of the metadata and configurations they rely on. As AI agents become more central to enterprise operations, the security model for those agents needs to evolve beyond permission-based access control to include verification of the tools themselves.

Expert Analysis:
• The core security failure here is architectural: AI agents inherit the trust model of their tool ecosystem without any independent verification layer, meaning security guarantees are only as strong as the least-protected configuration file in the registry
• Defenders need to treat tool descriptions as a privileged attack surface — applying the same integrity controls to metadata that they apply to code, including version control, access logging, and cryptographic signing where feasible
• The practical implication for security teams is a monitoring gap: existing SIEM rules and anomaly detection systems are not calibrated to flag authorized agents performing authorized actions, even when the underlying tool description has been tampered with

Is Your Organization Equipped to Detect This?

The question now is whether organizations deploying AI agents have the visibility and controls in place to detect poisoned tool descriptions before an agent acts on them. For many, the answer is likely no. Most enterprise security stacks were not designed with tool-registry integrity in mind, and the audit trails generated by AI agents typically record what the agent did — not whether the tool description it relied on was legitimate.

Closing this gap requires a shift in how organizations think about the trust boundaries in their AI infrastructure. Tool registries need to be treated as security-critical assets, not administrative conveniences. Access controls, change logging, and integrity verification for tool descriptions are not optional hardening measures — they are foundational requirements for any enterprise that has given an AI agent access to sensitive data. The same principle applies to any third-party tool integrated into an agent workflow, where the risk profile resembles the broader challenge of supply-chain compromise through trusted update mechanisms.

Microsoft’s documentation of this vulnerability is a signal that the security community needs to treat AI agent infrastructure with the same rigor applied to any other privileged system. The agents are trustworthy. The question is whether the environment they trust has been secured to the same standard.