MCP Tool Poisoning
You are publishing an MCP tool. The host runs a linter that flags any description containing an instruction pattern (before/after/always/first … read/send/attach) together with a sensitive path (.ssh, .env, api key, etc.). A naive poisoned description is blocked.
Smuggle a hidden directive past the linter — either by using invisible Unicode or by hiding the instruction inside a parameter field the linter does not scan.
No leads yet. Declassify intel one step at a time when you’re stuck.
How this attack works
The linter scanned only the top-level description key. JSON parameter fields (defaults, enums, examples) carry arbitrary strings — any of them can hold a hidden directive the agent will read. Invisible Unicode breaks string-comparison entirely.
Why it's dangerous
Tool metadata is treated as trusted guidance by agent runtimes. One poisoned tool in a registry affects every agent that installs it. Parameter fields are never audited by users or most linters.
OWASP mapping
Maps to OWASP Top 10 for LLM Applications (2025): LLM01: Prompt Injection and LLM03: Supply Chain. SkillTotal flags tool-poisoning indicators (visible and hidden) as ST-MCP-DANGEROUS-TOOL.
How to defend
- Normalize Unicode (strip invisible chars, resolve homoglyphs) before scanning any field.
- Scan ALL JSON fields — description, parameter names, defaults, enums, examples — not just the top-level string.
- Treat tool metadata as data, not instructions; require confirmation before sensitive actions.
- Apply least privilege: tools should not have both file-read and network-egress.
SkillTotal catches this class of issue deterministically (rule ST-MCP-DANGEROUS-TOOL).
FAQ
- What is MCP tool poisoning?
- Hiding malicious instructions in an MCP tool's metadata (description or parameter fields) so the agent silently performs unintended actions when the tool is loaded.
- How does SkillTotal detect this?
- By scanning all tool metadata fields after Unicode normalization, flagging instruction patterns co-occurring with sensitive paths — including invisible-char variants.