Experimental - An MCP server for inspecting Karenina verification results through natural language queries.
karenina-mcp provides an MCP (Model Context Protocol) interface that allows AI assistants like Claude to explore and analyze verification results stored in a Karenina SQLite database. Instead of writing SQL queries manually, you can ask questions in natural language and the assistant will translate them into appropriate queries.
The server uses a hierarchical context exposition approach to help the assistant understand your database efficiently:
First, call configure_database with the path to your SQLite results database. This connects the server and returns a list of available tables and views.
Once configured, the agent uses hierarchical schema discovery to answer your questions:
- Schema Awareness - View summaries are embedded in the `get_schema` tool description, so the agent sees all available views without any tool call
- Selective Deep-Dive - The agent calls `get_schema([view_names])` only for views relevant to your question
- Query Generation - With precise schema knowledge, it generates accurate SQL queries
- Results Interpretation - Results are returned as formatted markdown tables
This approach minimizes context usage while ensuring the assistant has the precise information needed to answer your questions accurately.
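As a rough sketch, the hierarchical exposition described above might look like the following. The view names, summaries, and column docs here are invented for illustration and are not the real Karenina schema:

```python
# Hypothetical sketch of hierarchical context exposition: one-line view
# summaries are baked into the get_schema tool description at startup,
# while full column docs are returned only on demand.

VIEW_SUMMARIES = {
    "template_results": "Per-question verification outcomes for each model run",
    "question_attributes": "Keywords and metadata attached to each question",
}

# Full docs are much larger, so they stay out of the tool description.
FULL_DOCS = {
    "template_results": "Columns: question_id (TEXT, PK), model (TEXT), passed (BOOL), ...",
    "question_attributes": "Columns: question_id (TEXT, FK), keyword (TEXT), ...",
}

def build_tool_description() -> str:
    """Embed one-line summaries so the agent sees every view without a tool call."""
    lines = ["Get detailed schema docs for specific views. Available views:"]
    for name, summary in VIEW_SUMMARIES.items():
        lines.append(f"- {name}: {summary}")
    return "\n".join(lines)

def get_schema(view_names: list[str]) -> dict[str, str]:
    """Selective deep-dive: return full docs only for the requested views."""
    return {name: FULL_DOCS[name] for name in view_names if name in FULL_DOCS}
```

The split keeps the always-visible description to one line per view, while a `get_schema` call pays the context cost only for the views a question actually needs.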
┌─────────────────────────────────────────────────────────────────┐
│ configure_database(db_path) │
│ Points the server to the SQLite results database │
│ → Returns list of available tables and views │
└─────────────────────────────────────────────────────────────────┘
│
(database now connected)
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ User Question │
│ "Which model performed best on biology questions?" │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ Agent reads tool descriptions (no call needed) │
│ get_schema description contains one-line view summaries │
│ → Agent identifies relevant views for the question │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ get_schema(["template_results", ...]) │
│ Returns full column docs, types, keys, joins, examples │
│ → Agent now knows exact column names and relationships │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ query(sql) │
│ Agent generates precise SQL with correct column names │
│ → Returns formatted markdown table with results │
└─────────────────────────────────────────────────────────────────┘
cd karenina-mcp
uv sync

uv run karenina-mcp
# or
uv run fastmcp run src/karenina_mcp/server.py

Start the MCP server as an HTTP server for remote or web-based access:

uv run fastmcp run src/karenina_mcp/server.py --transport http --port 8000

The server will be available at http://localhost:8000. You can also specify a custom host:

uv run fastmcp run src/karenina_mcp/server.py --transport http --host 0.0.0.0 --port 8000

Add to your Claude Code settings (.claude/settings.local.json or global settings):
{
"mcpServers": {
"karenina": {
"command": "uv",
"args": ["--directory", "/path/to/karenina-mcp", "run", "karenina-mcp"]
}
}
}

Replace /path/to/karenina-mcp with the absolute path to the karenina-mcp directory.
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json):
{
"mcpServers": {
"karenina": {
"command": "uv",
"args": ["--directory", "/path/to/karenina-mcp", "run", "karenina-mcp"]
}
}
}

Initialize the server with your results database.
configure_database(db_path="/path/to/karenina.db")
Returns confirmation with list of available tables and views.
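To make the behavior concrete, here is an illustrative sketch of what `configure_database` might do internally: open the SQLite file and enumerate its tables and views from `sqlite_master`. The function body is an assumption for illustration, not the actual implementation:

```python
import sqlite3

def configure_database(db_path: str) -> dict:
    """Hypothetical sketch: connect to the results database and list
    the tables and views it contains."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT name, type FROM sqlite_master "
        "WHERE type IN ('table', 'view') ORDER BY name"
    ).fetchall()
    return {
        "connection": conn,
        "tables": [name for name, kind in rows if kind == "table"],
        "views": [name for name, kind in rows if kind == "view"],
    }

# A fresh in-memory database has no tables or views yet.
info = configure_database(":memory:")
```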
Get detailed schema documentation for specific views. The tool description itself contains one-line summaries of all available views, so the agent can identify relevant views without calling the tool.
get_schema(view_names=["template_results", "question_attributes"])
Returns full column documentation, types, primary/foreign keys, join information, and example queries for the requested views.
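Once the agent knows the schema, it submits SQL and gets results back as a markdown table. A minimal sketch of that behavior, using an invented `results` table (the real Karenina views differ):

```python
import sqlite3

def query(conn: sqlite3.Connection, sql: str) -> str:
    """Run SQL and render the result set as a markdown table."""
    cur = conn.execute(sql)
    headers = [d[0] for d in cur.description]
    out = ["| " + " | ".join(headers) + " |",
           "| " + " | ".join("---" for _ in headers) + " |"]
    for row in cur.fetchall():
        out.append("| " + " | ".join(str(v) for v in row) + " |")
    return "\n".join(out)

# Toy data standing in for real verification results.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (model TEXT, passed INTEGER)")
conn.executemany("INSERT INTO results VALUES (?, ?)",
                 [("gpt-4o", 1), ("gpt-4o", 0), ("claude", 1)])
print(query(conn, "SELECT model, SUM(passed) AS passes "
                  "FROM results GROUP BY model ORDER BY model"))
```

Returning markdown rather than raw tuples lets the assistant paste results directly into its reply.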
Once the database is configured, you can ask questions like:
- "What's the overall pass rate across all models?"
- "Show me the questions where 'mcp-local' was correct but 'mcp-remote' failed"
- "Compute pass rates by question keyword and sort them in order of increasing performance"
- "Show me results for questions from the last run where more than one but not all of the replicates failed"
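For a sense of the translation step, here is how the agent might turn the last question above into SQL. The `replicate_results` schema is invented for illustration; the actual Karenina views have different names and columns:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE replicate_results "
             "(question_id TEXT, replicate INTEGER, passed INTEGER)")
conn.executemany(
    "INSERT INTO replicate_results VALUES (?, ?, ?)",
    [("q1", 1, 0), ("q1", 2, 0), ("q1", 3, 1),   # 2 of 3 failed -> match
     ("q2", 1, 0), ("q2", 2, 0), ("q2", 3, 0),   # all failed -> no match
     ("q3", 1, 1), ("q3", 2, 0), ("q3", 3, 1)],  # only 1 failed -> no match
)

# "More than one but not all of the replicates failed":
sql = """
SELECT question_id,
       COUNT(*) - SUM(passed) AS failures,
       COUNT(*) AS replicates
FROM replicate_results
GROUP BY question_id
HAVING COUNT(*) - SUM(passed) > 1
   AND COUNT(*) - SUM(passed) < COUNT(*)
"""
rows = conn.execute(sql).fetchall()
# rows -> [("q1", 2, 3)]
```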
- Karenina - Core benchmarking framework