SkillTotal
cd ~/labs
visitor@skilltotal:~$ cat ./labs/mcp-tool-poisoning/mission.txt

MCP Tool Poisoning

lab 04 · ST-MCP-DANGEROUS-TOOL · LLM01
mission.txt
scenario

You are publishing an MCP tool. The host runs a linter that flags any description containing an instruction pattern (before/after/always/first … read/send/attach) together with a sensitive path (.ssh, .env, api key, etc.). A naive poisoned description is blocked.

objective

Smuggle a hidden directive past the linter — either by using invisible Unicode or by hiding the instruction inside a parameter field the linter does not scan.

session — mcp-tool-poisoning
tool.json
▚ Intel

No leads yet. Declassify intel one step at a time when you’re stuck.

▰ Dossierclassified — solve to unseal

How this attack works

The linter scanned only the top-level description key. JSON parameter fields (defaults, enums, examples) carry arbitrary strings — any of them can hold a hidden directive the agent will read. Invisible Unicode breaks string-comparison entirely.

Why it's dangerous

Tool metadata is treated as trusted guidance by agent runtimes. One poisoned tool in a registry affects every agent that installs it. Parameter fields are never audited by users or most linters.

OWASP mapping

Maps to OWASP Top 10 for LLM Applications (2025): LLM01: Prompt Injection and LLM03: Supply Chain. SkillTotal flags tool-poisoning indicators (visible and hidden) as ST-MCP-DANGEROUS-TOOL.

How to defend

  • Normalize Unicode (strip invisible chars, resolve homoglyphs) before scanning any field.
  • Scan ALL JSON fields — description, parameter names, defaults, enums, examples — not just the top-level string.
  • Treat tool metadata as data, not instructions; require confirmation before sensitive actions.
  • Apply least privilege: tools should not have both file-read and network-egress.

SkillTotal catches this class of issue deterministically (rule ST-MCP-DANGEROUS-TOOL).

Scan AI component (free)

FAQ

What is MCP tool poisoning?
Hiding malicious instructions in an MCP tool's metadata (description or parameter fields) so the agent silently performs unintended actions when the tool is loaded.
How does SkillTotal detect this?
By scanning all tool metadata fields after Unicode normalization, flagging instruction patterns co-occurring with sensitive paths — including invisible-char variants.