Score: unscored (out of 100; 0/0)
HumanJudge
Human-evaluation infrastructure for AI quality. 25,000+ blind human reviews by 200+ verified reviewers across 58 AI models. The data can be queried via five MCP tools: get_model_scores, compare_models, get_flags, check_content, and get_latest.
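As a rough illustration of how one of these tools might be invoked, the sketch below builds a JSON-RPC 2.0 `tools/call` request body, the message shape MCP uses for tool invocation. The tool names come from the description above; the `model` argument and its value are hypothetical, since the tools' input schemas are not shown on this page, and no server URL or transport is assumed.

```python
import json

# Tool names exactly as listed in the description above.
HUMANJUDGE_TOOLS = (
    "get_model_scores",
    "compare_models",
    "get_flags",
    "check_content",
    "get_latest",
)


def build_tool_call(name, arguments=None, request_id=1):
    """Return a JSON-RPC 2.0 `tools/call` request body as a dict."""
    if name not in HUMANJUDGE_TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# ASSUMPTION: the "model" argument is hypothetical; the real input
# schema for get_model_scores is not given in this listing.
request = build_tool_call("get_model_scores", {"model": "example-model"})
print(json.dumps(request, indent=2))
```

In practice an MCP client library would send this message after the usual initialization handshake; the actual argument names should be taken from the server's `tools/list` response.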
Unscored visibility: 0/0 applicable dimensions scored
○ Schema Quality
○ Protocol
— Reliability
○ Docs & Maintenance
○ Security Hygiene
— Schema Interpretability
A remote probe is needed for Protocol and Reliability scores.
Schema Quality: not scored (25% weight)
Protocol Compliance: not scored (20% weight)
Reliability: not scored (20% weight)
Docs & Maintenance: not scored (15% weight)
Security Hygiene: not scored (20% weight)
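The page does not say how the weights above are combined into the score out of 100. One plausible reading, sketched here purely as an assumption, is a weighted mean over the dimensions that have a score, with the weights renormalised so that unscored dimensions are skipped. The function name `composite_score` and the 0 to 100 per-dimension scale are hypothetical.

```python
# Weights as listed above (they sum to 100%).
WEIGHTS = {
    "Schema Quality": 0.25,
    "Protocol Compliance": 0.20,
    "Reliability": 0.20,
    "Docs & Maintenance": 0.15,
    "Security Hygiene": 0.20,
}


def composite_score(scores):
    """Weighted mean of the scored dimensions, or None if none are scored.

    ASSUMPTION: unscored dimensions (missing or None) are dropped and the
    remaining weights are renormalised; the listing does not confirm this.
    """
    scored = {k: v for k, v in scores.items() if v is not None}
    total_weight = sum(WEIGHTS[k] for k in scored)
    if total_weight == 0:
        return None  # corresponds to "0/0 applicable dimensions scored"
    return sum(WEIGHTS[k] * v for k, v in scored.items()) / total_weight


# With no dimensions scored, as on this page, there is no composite.
print(composite_score({name: None for name in WEIGHTS}))
```

Under this reading, a server with only Schema Quality (80) and Security Hygiene (60) scored would get (0.25 × 80 + 0.20 × 60) / 0.45, roughly 71.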
30-Day Uptime (chart not shown; time axis runs from 30 days ago to today)