# mcp-score
Score your MCP server's quality from the command line. Same engine that powers the scoreboard.
## Install

```shell
pip install mcp-score
```

Requires Python 3.12+. You can also run it without installing via `uvx mcp-score` or `pipx run mcp-score`.
## Quick Start

```shell
# Score a running HTTP server
mcp-score http://localhost:3000/mcp

# Score a server via stdio transport
mcp-score stdio -- python my_server.py

# Score via npx
mcp-score stdio -- npx -y @modelcontextprotocol/server-everything stdio

# Score a GitHub repo (static analysis only)
mcp-score github:owner/repo

# Combine: probe a live server + analyze its source
mcp-score http://localhost:3000/mcp --repo github:owner/repo
```
## Example Output

```text
┌────────────────────────────────────────────────┐
│           MCP Server Score Report              │
│           mcp-servers/everything               │
└────────────────────────────────────────────────┘

Overall Score: 78/100    Grade: B

Category                Score   Wt.
Schema Quality            —     25%
Protocol Compliance       86    20%
Reliability               —     20%
Docs & Maintenance        —     15%
Security & Permissions    71    20%

Tools Found:      13
Schema Valid:     No
Error Handling:   100/100
Fuzz Resilience:  100/100
Latency:          4ms
```
## What It Scores

The CLI runs two of the same data tiers used by the scoreboard:

- **Protocol probe**: live `http://` or `stdio` targets.
- **Static analysis**: a `github:` target or the `--repo` flag.

Combine both for the most complete CLI score:

```shell
mcp-score http://localhost:3000/mcp --repo github:owner/repo
```
## Output Formats

- `--verbose` adds probe timing, schema issues, and error-handling details.
- `--format json` prints the report as JSON.
- `--format markdown` prints the report as markdown.
- `-o report.md` saves a markdown report to a file (works alongside any `--format`).
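When scripting against `--format json`, the report can be post-processed in a few lines of Python. A minimal sketch; the field names below (`overall_score`, `grade`, `tools_found`) are hypothetical placeholders, not confirmed by the docs, so check the actual JSON emitted by your installed version:

```python
import json

# Hypothetical shape of the --format json report (field names are
# illustrative assumptions); the values mirror the example output above.
raw = '{"overall_score": 78, "grade": "B", "tools_found": 13}'
report = json.loads(raw)

# Gate on the numeric score instead of the letter grade.
passed = report["overall_score"] >= 70
print(f"score={report['overall_score']} grade={report['grade']} passed={passed}")
# → score=78 grade=B passed=True
```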
## CI/CD Integration

Use `--fail-below` to gate deployments on quality:

```shell
# Fail the pipeline if score drops below B
mcp-score --fail-below B stdio -- python my_server.py
```
### GitHub Actions

```yaml
- name: Score MCP Server
  run: |
    pip install mcp-score
    mcp-score --fail-below B -o score-report.md \
      stdio -- python my_server.py
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

- name: Upload Score Report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: mcp-score-report
    path: score-report.md
```
Exit codes: `0` = pass, `1` = below threshold, `2` = error.
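A wrapper script can branch on these exit codes to distinguish a quality failure from an infrastructure failure. A minimal sketch; the commented `subprocess` invocation shows where the real `mcp-score` call would go:

```python
# Exit code semantics from the docs: 0 = pass, 1 = below threshold, 2 = error.
MEANINGS = {0: "pass", 1: "score below threshold", 2: "probe error"}

def interpret(code: int) -> str:
    """Map an mcp-score exit code to a human-readable verdict."""
    return MEANINGS.get(code, f"unexpected exit code {code}")

# In a real pipeline you would obtain the code with something like:
#   code = subprocess.run(["mcp-score", "--fail-below", "B",
#                          "stdio", "--", "python", "my_server.py"]).returncode
code = 1  # placeholder status for illustration
print(interpret(code))
# → score below threshold
```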
## CLI vs. Scoreboard
The CLI is a single-run subset of the full scoreboard pipeline. Here's what each adds:
| Capability | CLI | Scoreboard |
|---|---|---|
| Protocol probe (schema, errors, fuzz) | ✓ | ✓ |
| Static analysis (7 GitHub metrics) | ✓ | ✓ |
| Reliability monitoring (7-day uptime & latency) | — | ✓ |
| Behavioral security (LLM source scan) | — | ✓ |
| Agent usability (multi-model LLM eval) | — | ✓ |
| Sandbox probes (Docker-based) | — | ✓ |
| Score history & regression alerts | — | ✓ |
| Score badge for your README | — | ✓ |
The CLI is ideal for local development and CI gates. For the full picture — including continuous monitoring, behavioral security scanning, and agent usability scoring — submit your server to the scoreboard.
## All Options

```text
mcp-score [OPTIONS] TARGET [EXTRA_ARGS]...

Options:
  --repo TEXT                        GitHub repo URL for static analysis
  --format [terminal|json|markdown]  Output format
  --verbose                          Show individual check details
  --fail-below [A+|A|B|C|D|F]        Exit 1 if below grade
  --timeout INTEGER                  Probe timeout in seconds (default: 30)
  -o, --output PATH                  Save markdown report to file
  -h, --help                         Show help
  --version                          Show version
```
MIT Licensed. Free and open source.