Reverse-engineer any MCP server's tool surface in 30 seconds. Connects to an MCP server (stdio or HTTP), enumerates its tools, runs a schema-aware adversarial fuzzer, classifies the authority each tool exposes against OWASP LLM Top 10 and MITRE ATLAS, and emits a structured threat profile — JSON for machines, Markdown for humans.
The thesis: every team adopting MCP right now is asking "what does this server actually do?" and there's no tooling for it. The agentic ecosystem grew faster than its security tooling. mcp-recon is the recon side of that gap. capnagent is the defensive side: take a recon report, derive a tight capability caveat, deny everything outside it.
Status: v0.1.2 shipped 2026-04-30. Public dataset of every stdio TypeScript MCP server in Anthropic's `@modelcontextprotocol/*` namespace audited. See `docs/WRITEUP.md` for the headline findings (DoS surface on `everything`, missing-bounds finding on the `filesystem` example wrapper, full server-maturity ranking).
- At a glance
- What you get
- Command cheatsheet
- Sample output
- Recon → capnagent in one pipe
- Why this exists
- Installation
- How it compares
- What this is NOT
- Tests
- Companion project — capnagent
- License
| Coverage | Surface | Performance |
|---|---|---|
| 4 / 4 Anthropic reference servers scanned | 5 commands · 4 schema-tagged artefacts | scan budget=200 in <60s on 14-tool server |
| 37 tools classified across the public dataset | enumerate · fuzz · classify · report · scan | deterministic (seeded PRNG, default 0xC0FFEE) |
| 1374 fuzz calls across the dataset (1 confirmed DoS finding) | rules-based, not LLM-mediated | <256MB memory on 100-tool server |
Maps tools to OWASP LLM01 / LLM06 / LLM08 and MITRE ATLAS categories. Every output ships with a copy-pasteable capnagent caveat per tool. Reproducibility contract in capnagent's `docs/EVALUATION.md`.
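Determinism here is concrete: every random choice the fuzzer makes is drawn from a seedable PRNG, so re-running a scan with the same seed replays the same call sequence. A minimal sketch of the idea — a mulberry32-style generator, not necessarily what mcp-recon uses internally:

```ts
// Illustrative only: a mulberry32-style seeded PRNG. Same seed in,
// same number stream out — which is what makes fuzz runs replayable.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

const rng = mulberry32(0xc0ffee); // the documented default seed
console.log(rng(), rng());        // identical output on every run
```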
Run `mcp-recon scan` against any MCP server (stdio or HTTP) and get a folder of evidence: a tool inventory, a fuzz transcript, a classification, and a Markdown threat profile that a security reviewer or developer-on-call can actually read. The JSON files are the machine-parseable evidence the writeup links to. Run against any of the 4 servers in the public dataset and your output matches `examples/public-servers/server-<name>/` byte-for-byte.
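To check the byte-for-byte claim yourself, a small script along these lines (hypothetical helper, not shipped with mcp-recon) hashes both artefact folders and compares:

```ts
// Hypothetical verification helper — not part of mcp-recon itself.
// Usage: npx tsx verify.ts ./reports/filesystem examples/public-servers/server-filesystem
import { createHash } from "node:crypto";
import { readFileSync, readdirSync } from "node:fs";
import { join } from "node:path";

function digests(dir: string): Map<string, string> {
  const out = new Map<string, string>();
  for (const file of readdirSync(dir)) {
    const hash = createHash("sha256").update(readFileSync(join(dir, file))).digest("hex");
    out.set(file, hash);
  }
  return out;
}

const [mine, reference] = process.argv.slice(2).map(digests);
for (const [file, hash] of reference) {
  console.log(mine.get(file) === hash ? `ok    ${file}` : `DIFF  ${file}`);
}
```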
```sh
mcp-recon enumerate <server-spec>                                               # → inventory.json
mcp-recon fuzz <server-spec> [--budget=N] [--seed=N]                            # → fuzz.json
mcp-recon classify <inventory.json> [--fuzz=<fuzz.json>]                        # → classification.json
mcp-recon report <inventory.json> <classification.json> [--fuzz=<fuzz.json>]   # → report.md
mcp-recon scan <server-spec> --out=<dir> [--budget=N] [--seed=N]                # → 4 artefacts
```

Server-spec forms: `stdio:<command> [args...]` (spawn a process, talk over stdio) or `http://host:port` (HTTP transport).
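Dispatch on the spec prefix is all there is to it; an illustrative sketch (not mcp-recon's actual parser — the normative syntax lives in `docs/SPEC.md`):

```ts
// Illustrative server-spec dispatch — treat as a sketch of the shape,
// not the CLI's real implementation.
type ServerSpec =
  | { transport: "stdio"; command: string; args: string[] }
  | { transport: "http"; url: string };

function parseServerSpec(spec: string): ServerSpec {
  if (spec.startsWith("stdio:")) {
    // e.g. "stdio:npx -y @modelcontextprotocol/server-filesystem /tmp"
    const [command, ...args] = spec.slice("stdio:".length).trim().split(/\s+/);
    return { transport: "stdio", command, args };
  }
  if (spec.startsWith("http://")) {
    // e.g. "http://localhost:8080"
    return { transport: "http", url: spec };
  }
  throw new Error(`unrecognized server-spec: ${spec}`);
}
```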
```console
$ mcp-recon scan "stdio:npx -y @modelcontextprotocol/server-filesystem /tmp" \
    --out=./reports/filesystem --budget=200
mcp-recon: 14 tools, 4 confused-deputy candidates
mcp-recon: fuzz — ok=4 protocol_error=719 runtime_error=0
mcp-recon: wrote 4 artefacts to ./reports/filesystem/

$ ls ./reports/filesystem/
inventory.json  fuzz.json  classification.json  report.md
```

A snippet from the resulting `classification.json` — every tool gets a class, an authority level, a confused-deputy verdict, and a copy-pasteable capnagent caveat:
```json
{
  "tool": "edit_file",
  "data_class": "filesystem",
  "authority_level": "write",
  "confused_deputy_candidate": true,
  "confidence": 0.91,
  "rationale": "name match \"\\b(write[_-]?file|edit[_-]?file|create[_-]?directory|move[_-]?file)\\b\" → filesystem/write (0.70); description match → filesystem/read (0.50); schema: arg \"path\" is path-shaped → filesystem (0.40); user-controllable string arg + non-read authority → confused-deputy candidate",
  "recommended_caveat": "tool == \"edit_file\" AND caller == \"<your-caller-id>\" AND arg.path starts_with \"<your-sandbox-prefix>/\" AND now <= @<your-cap-expiry> // WRITE filesystem"
}
```

The full headline findings — including the `everything` server's DoS surface and the `filesystem` wrapper's missing-bounds finding — are in `docs/WRITEUP.md`.
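If you're consuming `classification.json` from code, the entry above maps to roughly this shape — a sketch inferred from the snippet; the normative definition is the JSON Schema in `schemas/`:

```ts
// Rough TypeScript shape of one classification.json entry, inferred
// from the example above. Defer to schemas/ on any mismatch.
interface ClassificationEntry {
  tool: string;                        // MCP tool name, e.g. "edit_file"
  data_class: string;                  // e.g. "filesystem"
  authority_level: string;             // e.g. "read" | "write"
  confused_deputy_candidate: boolean;  // user-controllable arg + non-read authority
  confidence: number;                  // 0..1, from weighted rule matches
  rationale: string;                   // human-readable trace of the rules that fired
  recommended_caveat: string;          // copy-pasteable capnagent DSL predicate
}
```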
```
┌──────────────┐   inventory.json      ┌──────────────┐
│              │   fuzz.json           │              │
│  MCP server  │──▶ classification ──▶ │  capnagent   │──▶ deny anything
│              │    .json              │  issuer      │    outside scope
│              │   report.md           │              │
└──────────────┘                       └──────────────┘
        ▲                                      │
        │                                      ▼
        └────────── scoped caller ◀────── signed capability
```
mcp-recon documents the tool surface; capnagent enforces the bound. Each project stands alone. Together they're a single security posture for any MCP-shaped agent. Run mcp-recon first, paste the suggested caveats into your capnagent issuer, ship.
`classification.json` ships a copy-pasteable caveat per tool, but manual paste is its own foot-gun. The `caveats` command produces a machine-readable issuance plan ready to feed straight into a capnagent issuer:
```console
$ mcp-recon caveats ./reports/filesystem/classification.json \
    --caller=agent:planner \
    --sandbox-prefix=/var/agent-sandbox/tenant-42 \
    --expiry=2026-12-31T23:59:59Z \
    > ./reports/filesystem/caveats.json
mcp-recon: 14 plans (14 ready, 0 flagged) — schema=mcp-recon/v0.1/caveats
```

The output document (schema `mcp-recon/v0.1/caveats`) has one entry per tool, with `caveats: string[]` already split into individual capnagent DSL predicates and operator bindings substituted. Plans get flagged with a structured reason set (`classification_unknown`, `low_confidence`, `cdc_without_arg_constraint`, `unsubstituted_placeholder`) so the review surface is machine-checkable.
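Inferred from that description, one plan entry looks roughly like this (field names are an assumption; `schemas/` holds the normative definition):

```ts
// Sketch of an mcp-recon/v0.1/caveats plan entry, inferred from the
// prose above. Field names are assumptions — check schemas/ for truth.
type FlagReason =
  | "classification_unknown"
  | "low_confidence"
  | "cdc_without_arg_constraint"
  | "unsubstituted_placeholder";

interface CaveatPlan {
  tool: string;         // which MCP tool this plan covers
  caveats: string[];    // individual capnagent DSL predicates, bindings substituted
  flags: FlagReason[];  // empty ⇒ "ready"; non-empty ⇒ needs human review
}
```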
Run with no bindings to get a "review pass" — every plan is flagged, but you can see exactly which placeholders need binding before committing values. Per-tool overrides (`per_tool_overrides` in the library API) let you tighten confused-deputy candidates the classifier didn't constrain.
For the developer adopting MCP. Before you wire a third-party MCP server into your agent, run mcp-recon against it. You get an honest threat profile in 30 seconds — what does this thing actually let an agent do, and what's the smallest cap that preserves utility?
For the security team auditing an agent stack. mcp-recon turns "we depend on N MCP servers" into "here's the consolidated tool surface, here's what each one is classified as, here's where the confused-deputy candidates are." A printable artifact you can review.
For the AI-security researcher. mcp-recon's reports are the input to round-N writeups in the capnagent purple-team corpus. Recon → capability gap → attack PoC → fix → CLOSED.
```sh
# From source (the recommended path today; npm package is post-v0.2)
git clone https://github.com/euanmcrosson-dotcom/mcp-recon
cd mcp-recon
npm install
npm run -w @mcp-recon/cli build

# Run the CLI directly via tsx (no build step needed for development)
npx tsx packages/mcp-recon-cli/src/bin/recon.ts scan \
  "stdio:npx -y @modelcontextprotocol/server-filesystem $HOME/sandbox" \
  --out=./reports/filesystem --budget=200
```

Windows / Git Bash users: prefix commands that take path-shaped flags with `MSYS_NO_PATHCONV=1` to prevent leading-slash path mangling. Example: `MSYS_NO_PATHCONV=1 mcp-recon caveats classification.json --sandbox-prefix=/var/sandbox --expiry=2026-12-31T23:59:59Z`
- `docs/SPEC.md` — v0.1 surface, server-spec syntax, output schemas
- `docs/METHODOLOGY.md` — classifier rules, fuzz axes, signals, falsifiability
- `docs/WRITEUP.md` — public-dataset findings + headline observations
- `schemas/` — formal JSON Schema files for the four wire formats
- `findings/` — corpus of documented findings (F001–F006)
- `SECURITY.md` — vulnerability reporting policy
- `CONTRIBUTING.md` — how to add classifier rules, fuzz axes, dataset entries
| | mcp-recon | NVIDIA garak | Burp / ZAP | manual review |
|---|---|---|---|---|
| Scope | MCP server tool surfaces | model-behavior testing | HTTP fuzzing | everything |
| Output | structured JSON + Markdown | reports | proxy logs | human prose |
| Determinism | yes (seeded PRNG) | partial | no | no |
| LLM in the loop | no (rules-based) | yes | no | yes |
| OWASP LLM / MITRE ATLAS mapping | yes (per-tool) | partial | no | author-dependent |
| Companion enforcement | capnagent | none | none | none |
mcp-recon is not a replacement for any of those — it's the piece nobody else is building: a deterministic, schema-aware characterization of an MCP server's tool surface, in a format that wires straight into a capability-bounded enforcement layer.
- Not a replacement for capnagent. mcp-recon documents what's there; capnagent enforces what's allowed. You want both.
- Not a vulnerability scanner for the model itself. Use NVIDIA garak for that. We test the tool surface, not model behavior.
- Not an exploitation framework. We send adversarial schemas to characterize handling, not actual exploits.
- Not a proxy / MITM tool. Out of scope. See `docs/SPEC.md` §"What v0.1 does NOT do".
The workspace has 68 unit + property-based tests passing today (`npm test`), covering schema parsing, the seeded PRNG, fuzz generators along all six adversarial axes, the classification rules, the Markdown report renderer, and end-to-end scan flow. Two additional integration test files (`enumerate.integration.test.ts`, `fuzz.integration.test.ts`) exercise live transport against a locally-spawned MCP server when the dev environment provides one.
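As a flavor of what a property-based determinism test might look like (hypothetical file — assuming fast-check alongside vitest; the repo's actual tests may differ):

```ts
// Hypothetical property test in the spirit of the suite described above.
// `createPrng` is a toy stand-in for the workspace's real seeded generator.
import { describe, it, expect } from "vitest";
import fc from "fast-check";

function createPrng(seed: number): () => number {
  let s = seed >>> 0; // toy LCG — placeholder for the actual PRNG module
  return () => ((s = (s * 1664525 + 1013904223) >>> 0) / 4294967296);
}

describe("seeded PRNG", () => {
  it("replays identically for any seed", () => {
    fc.assert(
      fc.property(fc.integer(), (seed) => {
        const a = createPrng(seed);
        const b = createPrng(seed);
        for (let i = 0; i < 100; i++) expect(a()).toBe(b());
      }),
    );
  });
});
```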
```sh
npm test            # all packages, vitest
npm run typecheck   # tsc --noEmit, strict mode
npm run lint        # biome check
```

mcp-recon is the offensive complement to capnagent, which provides capability-bounded authorization for AI agent tool calls. Together they implement the standard recon-then-bound security workflow:
```
[ mcp-recon ]  →  threat profile  →  [ capnagent ]
  "what is          "what should       "deny anything
   here?"            we allow?"         outside that"
```