# mcp-score
Score your MCP server's quality from the command line. Same engine that powers the scoreboard.
## Install

```shell
pip install mcp-score
```

Requires Python 3.12+. You can also run it without installing via `uvx mcp-score` or `pipx run mcp-score`.
## Quick Start

```shell
# Score a running HTTP server
mcp-score http://localhost:3000/mcp

# Score a server via stdio transport
mcp-score stdio -- python my_server.py

# Score via npx
mcp-score stdio -- npx -y @modelcontextprotocol/server-everything stdio

# Score a GitHub repo (static analysis only)
mcp-score github:owner/repo

# Combine: probe a live server + analyze its source
mcp-score http://localhost:3000/mcp --repo github:owner/repo
```
## Example Output

```text
┌────────────────────────────────────────────────┐
│           MCP Server Score Report              │
│           mcp-servers/everything               │
└────────────────────────────────────────────────┘

Overall Score: 78/100    Grade: B

Category                Score   Wt.
Schema Quality            —     25%
Protocol Compliance       86    20%
Reliability               —     20%
Docs & Maintenance        —     15%
Security & Permissions    71    20%

Tools Found:      13
Schema Valid:     No
Error Handling:   100/100
Fuzz Resilience:  100/100
Latency:          4ms
```
## What It Scores

The CLI runs two of the same data tiers used by the scoreboard:

- **Protocol probe**: live `http://` or `stdio` targets.
- **Static analysis**: a `github:` target or the `--repo` flag.

Combine both for the most complete CLI score:

```shell
mcp-score http://localhost:3000/mcp --repo github:owner/repo
```
## Output Formats

- `--verbose` adds probe timing, schema issues, and error-handling details.
- `--format json` prints the report as JSON.
- `--format markdown` prints the report as markdown.
- `-o report.md` saves a markdown report to a file (works alongside any `--format`).
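When scripting against `--format json`, the report can be post-processed in a few lines of Python. A minimal sketch; the field names below (`overall_score`, `grade`, `tools_found`) are hypothetical placeholders, not confirmed by the docs, so check the actual JSON emitted by your installed version:

```python
import json

# Hypothetical shape of the --format json report (field names are
# illustrative assumptions); the values mirror the example output above.
raw = '{"overall_score": 78, "grade": "B", "tools_found": 13}'
report = json.loads(raw)

# Gate on the numeric score instead of the letter grade.
passed = report["overall_score"] >= 70
print(f"score={report['overall_score']} grade={report['grade']} passed={passed}")
# → score=78 grade=B passed=True
```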
## CI/CD Integration

Use `--fail-below` to gate deployments on quality:

```shell
# Fail the pipeline if score drops below B
mcp-score --fail-below B stdio -- python my_server.py
```
### GitHub Actions

```yaml
- name: Score MCP Server
  run: |
    pip install mcp-score
    mcp-score --fail-below B -o score-report.md \
      stdio -- python my_server.py
  env:
    GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

- name: Upload Score Report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: mcp-score-report
    path: score-report.md
```
Exit codes: `0` = pass, `1` = below threshold, `2` = error.
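A wrapper script can branch on these exit codes to distinguish a quality failure from an infrastructure failure. A minimal sketch; the commented `subprocess` invocation shows where the real `mcp-score` call would go:

```python
# Exit code semantics from the docs: 0 = pass, 1 = below threshold, 2 = error.
MEANINGS = {0: "pass", 1: "score below threshold", 2: "probe error"}

def interpret(code: int) -> str:
    """Map an mcp-score exit code to a human-readable verdict."""
    return MEANINGS.get(code, f"unexpected exit code {code}")

# In a real pipeline you would obtain the code with something like:
#   code = subprocess.run(["mcp-score", "--fail-below", "B",
#                          "stdio", "--", "python", "my_server.py"]).returncode
code = 1  # placeholder status for illustration
print(interpret(code))
# → score below threshold
```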
## CLI vs. Scoreboard
The CLI is a single-run subset of the full scoreboard pipeline. Here's what each adds:
| Capability | CLI | Scoreboard |
|---|---|---|
| Protocol probe (schema, errors, fuzz) | ✓ | ✓ |
| Static analysis (7 GitHub metrics) | ✓ | ✓ |
| Reliability monitoring (7-day uptime & latency) | — | ✓ |
| Behavioral security (LLM source scan) | — | ✓ |
| Agent usability (multi-model LLM eval) | — | ✓ |
| Sandbox probes (Docker-based) | — | ✓ |
| Score history & regression alerts | — | ✓ |
| Score badge for your README | — | ✓ |
The CLI is ideal for local development and CI gates. For the full picture — including continuous monitoring, behavioral security scanning, and agent usability scoring — submit your server to the scoreboard.
## All Options

```text
mcp-score [OPTIONS] TARGET [EXTRA_ARGS]...

Options:
  --repo TEXT                        GitHub repo URL for static analysis
  --format [terminal|json|markdown]  Output format
  --verbose                          Show individual check details
  --fail-below [A+|A|B|C|D|F]        Exit 1 if below grade
  --timeout INTEGER                  Probe timeout in seconds (default: 30)
  -o, --output PATH                  Save markdown report to file
  -h, --help                         Show help
  --version                          Show version
```
MIT Licensed. Free and open source.