SkillTotal
visitor@skilltotal:~$ cat ./labs/command-injection/mission.txt

RCE via Tool Arguments

lab 06 · ST-CMDI-PY · LLM06
scenario

OpsBot exposes a ping(host) tool implemented as sh -c "ping -c1 " + host. To stop injection, it applies a blocklist that strips ; and && from your argument. Everything else passes straight to the shell.
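
As a rough illustration of the scenario, the tool might look like the sketch below. The lab only states that the command is built as sh -c "ping -c1 " + host and that ; and && are stripped; the function names (strip_blocklist, build_ping_command, ping) and the use of subprocess are assumptions made for this example, not OpsBot's actual source.

```python
import subprocess

# The only two tokens the scenario says are stripped.
BLOCKED_TOKENS = (";", "&&")


def strip_blocklist(host):
    """Remove the blocked tokens; every other character is left untouched."""
    for token in BLOCKED_TOKENS:
        host = host.replace(token, "")
    return host


def build_ping_command(host):
    """Return the string that would be handed to `sh -c`."""
    return "ping -c1 " + strip_blocklist(host)


def ping(host):
    """VULNERABLE by design: the filtered argument is interpreted by a shell.

    Only run this inside a disposable lab environment.
    """
    result = subprocess.run(
        ["sh", "-c", build_ping_command(host)],
        capture_output=True,
        text=True,
    )
    return result.stdout + result.stderr
```

Calling build_ping_command (which executes nothing) is enough to see the gap: the two blocked tokens disappear, while any other metacharacter arrives in the command string unchanged.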

objective

Supply a host argument that breaks out of the ping command and runs your own command (e.g. id or cat /etc/passwd) using a metacharacter the blocklist does not strip.

▰ Dossier

How this attack works

The argument was concatenated into a shell string, and the blocklist removed only two tokens. Command substitution $(…), backticks, pipes, and newlines all reach the shell untouched and execute, and ${IFS} (which the shell expands to whitespace) can stand in for spaces wherever literal whitespace is filtered. A blocklist enumerates bad inputs; the shell grammar is far larger than any blocklist.
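
To make the gap concrete, the sketch below runs one example argument per technique through a filter that strips only ; and &&, then reports which ones come out unchanged. The filter is a reconstruction from the scenario text, and the example arguments and names (CANDIDATE_PAYLOADS, surviving_payloads) are illustrative assumptions. Nothing here is executed by a shell.

```python
# Reconstruction of the lab's filter: only these two tokens are removed.
BLOCKED_TOKENS = (";", "&&")


def strip_blocklist(arg):
    """Strip the blocked tokens and return whatever is left."""
    for token in BLOCKED_TOKENS:
        arg = arg.replace(token, "")
    return arg


# One illustrative argument per technique named in the dossier.
CANDIDATE_PAYLOADS = {
    "command substitution": "127.0.0.1 $(id)",
    "backticks": "127.0.0.1 `id`",
    "pipe": "127.0.0.1 | id",
    "newline": "127.0.0.1\nid",
    "IFS as whitespace": "127.0.0.1|cat${IFS}/etc/passwd",
}


def surviving_payloads():
    """Return the technique names whose argument passes the filter unchanged."""
    return [
        name
        for name, payload in CANDIDATE_PAYLOADS.items()
        if strip_blocklist(payload) == payload
    ]


if __name__ == "__main__":
    for name in surviving_payloads():
        print("not filtered:", name)
```

Every entry survives, which is the point of the exercise: adding more tokens to the blocklist shrinks the list but does not change the underlying problem.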

Why it's dangerous

Agents are increasingly wired to real tools (shells, HTTP, files). If a tool builds a shell command from model- or user-controlled text, one missed metacharacter is remote code execution on the host. SkillTotal flags shell-building sinks as ST-CMDI-PY.

OWASP mapping

Maps to OWASP Top 10 for LLM Applications (2025), LLM06: Excessive Agency: a tool granted more power than it can safely exercise on untrusted input. SkillTotal’s ST-CMDI-PY flags command-injection sinks statically.

How to defend

  • Never build a shell string from untrusted input — pass an argv array (no sh -c).
  • Validate against a strict allowlist (e.g. a hostname/IP regex), not a blocklist.
  • Drop the shell entirely; call the library/syscall directly.
  • Run tools with least privilege in a sandbox so a breakout has nothing to reach.
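
The first two defenses can be combined in a few lines. The following is a minimal sketch, not a hardened implementation: the hostname pattern, the 253-character limit, the 5-second timeout, and the function names (is_valid_host, build_ping_argv, ping) are assumptions chosen for the example. Sandboxing and least privilege are outside what a snippet like this can show.

```python
import ipaddress
import re
import subprocess

# Assumed allowlist: dot-separated labels of letters, digits, and hyphens,
# where no label starts or ends with a hyphen.
_LABEL = r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)"
_HOSTNAME_RE = re.compile(_LABEL + r"(?:\." + _LABEL + r")*")


def is_valid_host(host):
    """Accept only strings shaped like a hostname or an IP address."""
    if not host or len(host) > 253:
        return False
    if _HOSTNAME_RE.fullmatch(host):
        return True
    if "%" in host:
        # IPv6 zone IDs can carry arbitrary text, so this sketch refuses them.
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return False
    return True


def build_ping_argv(host):
    """Return an argv list for ping; raise ValueError on anything unexpected."""
    if not is_valid_host(host):
        raise ValueError("rejected host argument: %r" % (host,))
    # No shell is involved, so host is always a single argument to ping.
    return ["ping", "-c1", host]


def ping(host):
    """Run ping without a shell and return its combined output."""
    result = subprocess.run(
        build_ping_argv(host),
        capture_output=True,
        text=True,
        timeout=5,
    )
    return result.stdout + result.stderr
```

The two controls are independent on purpose. Even if the allowlist had a gap, the argv form would still hand the whole string to ping as one argument rather than to a shell, and the allowlist in turn rejects values such as -c5 that ping could otherwise read as an option.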

SkillTotal catches this class of issue deterministically (rule ST-CMDI-PY).

FAQ

Why is a blocklist the wrong control here?
The shell metacharacter space is large and composable. Stripping ; and && leaves $(), backticks, pipes, redirects, newlines, and ${IFS}. Allowlisting the expected shape of the input (here, a hostname or IP address) is the only robust form of input validation.
How does SkillTotal detect this?
ST-CMDI-PY statically flags code paths that pass untrusted data into a shell-executing sink (os.system, subprocess with shell=True, sh -c), so the dangerous tool is caught before it ships.
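
For a sense of what static sink detection involves, here is a toy checker built on Python's standard ast module. It is not SkillTotal's rule and makes no claim about how ST-CMDI-PY is implemented: it only pattern-matches the three sink shapes named above and does no tracking of where the data comes from. The names find_shell_sinks and DIRECT_SHELL_CALLS are hypothetical.

```python
import ast

# Calls that always go through a shell.
DIRECT_SHELL_CALLS = {("os", "system"), ("os", "popen")}


def _dotted_name(node):
    """Return ('module', 'attr') for a call target like module.attr, else None."""
    if isinstance(node, ast.Attribute) and isinstance(node.value, ast.Name):
        return (node.value.id, node.attr)
    return None


def _is_sh_dash_c(node):
    """True for a literal list/tuple that starts with 'sh', '-c'."""
    if not isinstance(node, (ast.List, ast.Tuple)) or len(node.elts) < 2:
        return False
    first, second = node.elts[0], node.elts[1]
    return (
        isinstance(first, ast.Constant)
        and first.value in ("sh", "/bin/sh")
        and isinstance(second, ast.Constant)
        and second.value == "-c"
    )


def find_shell_sinks(source):
    """Return sorted (line, description) pairs for shell-executing calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        name = _dotted_name(node.func)
        if name is None:
            continue
        label = ".".join(name)
        if name in DIRECT_SHELL_CALLS:
            findings.append((node.lineno, label))
        elif name[0] == "subprocess":
            shell_true = any(
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
                for kw in node.keywords
            )
            if shell_true:
                findings.append((node.lineno, label + " shell=True"))
            elif node.args and _is_sh_dash_c(node.args[0]):
                findings.append((node.lineno, label + " sh -c"))
    return sorted(findings)
```

A real analyzer would also need to resolve import aliases, follow data flow from untrusted sources to the sink, and handle shell=True passed through variables. This sketch deliberately skips all of that.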