A C++23 orchestrator that intelligently combines Claude (Anthropic API) and local models (ollama) — as an interactive CLI, pipe tool, and MCP server for Claude Code.
Anyone working with AI assistants daily quickly runs into a dilemma: simple tasks like "write a docstring" or "explain this snippet" don't really need a cloud API — but they still cost money and add latency. Complex questions about architecture, security, or large codebases, on the other hand, benefit enormously from Claude's full context window and reasoning power.
aiorch solves this with an orchestrator pattern:

- Simple task → local model (ollama, free, fast)
- Complex task → Claude API (powerful, context-rich)
A scoring-based router makes the routing decision automatically — based on token count, keywords, and context signals. The result: significantly lower API costs while maintaining full quality for complex tasks.
- Intelligent routing — scoring-based (token count, keywords, context signals, question marks, code blocks)
- Interactive REPL — with linenoise, history (`~/.aiorch_history`), ANSI colors, and REPL commands
- Pipe mode — for shell scripting and CI integration
- MCP server — direct integration as a tool in Claude Code
- Automatic fallback — ollama unreachable → transparent switch to Claude
- Session statistics — local/remote distribution, estimated API cost savings
- Self-test — checks all backends and credentials at once
- Sliding-window context — max. 50 messages, system message always preserved (see the sketch after this list)
- Timeouts — ollama 30 s, Anthropic 120 s, no process hang
- Unit tests — 12 tests with doctest (Router, Config, Context)
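
The sliding-window behaviour works roughly like this. A minimal sketch, assuming a simple message struct (the real `context_manager` also does token estimation; names here are illustrative, not the project's actual API):

```cpp
#include <cstddef>
#include <deque>
#include <string>

struct Message {
    std::string role;     // "system", "user", or "assistant"
    std::string content;
};

// Cap the history at max_messages, but never evict the system message.
void trim_context(std::deque<Message>& history, std::size_t max_messages = 50) {
    while (history.size() > max_messages) {
        // Keep index 0 if it holds the system message; drop the oldest other entry.
        std::size_t oldest = (history.front().role == "system") ? 1 : 0;
        if (oldest >= history.size()) break;
        history.erase(history.begin() + oldest);
    }
}
```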
| Component | Version |
|---|---|
| Compiler | clang++ 18+ or g++ 13+ (C++23) |
| CMake | 3.25+ |
| OpenSSL | 3.0+ |
| ollama | any recent version |
| ollama model | qwen2.5-coder:14b (recommended) or any other |
| Anthropic API key | sk-ant-api03-... (from console.anthropic.com) |
Dependencies are fetched automatically via CMake FetchContent:
- cpp-httplib — header-only HTTP/HTTPS
- nlohmann/json — header-only JSON
- linenoise — REPL/readline
- doctest — unit tests
```bash
git clone https://github.com/<your-user>/aiorch.git
cd aiorch

# Debug build (development)
cmake -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build

# Release build (production)
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build

# Smoke test
./build/aiorch --version
```

Run the unit tests:

```bash
cmake --build build --target tests
ctest --test-dir build --output-on-failure
```

Expected output:
```text
Test project /home/malo/src/aiorch/build
    Start 1: aiorch_tests
1/1 Test #1: aiorch_tests ........... Passed 0.12 sec

100% tests passed, 0 tests failed out of 1
```
```bash
# Install to ~/.local/bin/aiorch
cmake --install build --prefix ~/.local

# Make sure ~/.local/bin is in your PATH
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Verify
aiorch --version
```

Get your API key from console.anthropic.com (it starts with `sk-ant-api03-...`).
```bash
mkdir -p ~/.claude
cat > ~/.claude/aiorch.json << 'EOF'
{
  "apikey": "sk-ant-api03-YOUR-KEY-HERE"
}
EOF
chmod 600 ~/.claude/aiorch.json
```

Key loading order (first found wins):
- `~/.claude/aiorch.json` → field `"apikey"` (primary — the real API key)
- `~/.claude/.credentials.json` → field `"apiKey"` (Claude Code login, usually an OAuth token)
- Environment variable `ANTHROPIC_API_KEY`
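
A sketch of that resolution order (illustrative only; the function names are hypothetical, not the project's actual `config.cpp` API):

```cpp
#include <cstdlib>
#include <fstream>
#include <optional>
#include <string>
#include <nlohmann/json.hpp>

// Hypothetical helper: read one string field from a JSON file, or nullopt.
static std::optional<std::string> key_from(const std::string& path, const char* field) {
    std::ifstream in(path);
    if (!in) return std::nullopt;
    auto j = nlohmann::json::parse(in, nullptr, /*allow_exceptions=*/false);
    if (j.is_discarded() || !j.contains(field) || !j[field].is_string()) return std::nullopt;
    return j[field].get<std::string>();
}

// First source that yields a key wins.
std::optional<std::string> load_api_key(const std::string& home) {
    if (auto k = key_from(home + "/.claude/aiorch.json", "apikey"))       return k;
    if (auto k = key_from(home + "/.claude/.credentials.json", "apiKey")) return k;
    if (const char* env = std::getenv("ANTHROPIC_API_KEY"))               return std::string{env};
    return std::nullopt;
}
```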
aiorch works out of the box without this file — it starts with sensible defaults
(ollama at http://localhost:11434, model qwen2.5-coder:14b, etc.). Create
~/.aiorch.conf only if you want to override those defaults; it is never
generated automatically.
```bash
cat > ~/.aiorch.conf << 'EOF'
# aiorch configuration
ollama_endpoint = http://localhost:11434
ollama_model = qwen2.5-coder:14b
anthropic_model = claude-sonnet-4-5
context_threshold = 2000
history_file = ~/.aiorch_history
log_level = info # debug | info | warn | error
EOF
```

Loading order (later sources override earlier ones):

`~/.aiorch.conf` → environment variables → CLI arguments

Use `AIORCH_CONFIG` to point to a different config file:

```bash
AIORCH_CONFIG=/path/to/my.conf aiorch
```

```bash
# Install ollama (if not already installed)
curl -fsSL https://ollama.ai/install.sh | sh
# Pull the recommended model
ollama pull qwen2.5-coder:14b
# Start ollama (runs as a background service)
ollama serve
```

Start the interactive REPL:

```bash
aiorch
```

```text
aiorch> explain this snippet
[Response streams token by token]
(local: qwen2.5-coder:14b)
aiorch> What is the architecture behind this design pattern?
[Response streams token by token]
(remote: claude-sonnet-4-5)
aiorch> /exit
```
REPL commands:
| Command | Function |
|---|---|
| `/clear` | Reset context |
| `/history` | Show conversation history |
| `/backend` | Show routing decision for last input |
| `/model` | Show currently active model |
| `/exit` | Quit (also Ctrl+D, Ctrl+C) |
```bash
# Simple query
echo "Write a docstring for this function: int add(int a, int b)" | aiorch --pipe

# Analyze file contents
cat myfile.cpp | aiorch --pipe

# In shell scripts
RESULT=$(echo "explain this code" | aiorch --pipe --local)

# Exit code: 0 = success, 1 = error
```

Force a backend:

```bash
aiorch --local   # always ollama, no remote routing
aiorch --remote  # always Claude API, no local routing
```

Override the model:

```bash
aiorch --model llama3.2          # different ollama model for this session
aiorch --model llama3.2 --local  # combined with local override
```

Run the self-test:

```bash
aiorch --selftest
```

```text
aiorch selftest
───────────────────────────────────────────
Credentials ~/.claude/aiorch.json OK
ollama http://localhost:11434 OK
Anthropic API OK
───────────────────────────────────────────
All checks passed. Exit code: 0
```
The router uses a scoring system — both sides accumulate points, the higher score wins. On a tie, Remote wins (safer default).
| Criterion | Points |
|---|---|
| Token count < 500 | +2 local |
| Token count 500–2000 | +1 local |
| Token count > 2000 | +2 remote |
| Local keyword found (docstring, explain, snippet, format, rename, …) | +2 local |
| Remote keyword found (architecture, security, refactor, why, design, …) | +2 remote |
| Multiple files in context (`---` separator detected) | +1 remote |
| Question mark in prompt | +1 remote |
| Prompt ends with code block (`` ``` ``) | +1 local |
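
Expressed as code, the scoring pass looks roughly like this. A sketch of the table above with abbreviated keyword lists; the chars/4 token heuristic and all names here are assumptions, not necessarily what `router.cpp` does internally:

````cpp
#include <string>
#include <string_view>

enum class Backend { Local, Remote };

// Assumed heuristic: roughly 4 characters per token.
static int estimate_tokens(std::string_view prompt) {
    return static_cast<int>(prompt.size() / 4);
}

Backend route(const std::string& prompt) {
    int local = 0, remote = 0;

    const int tokens = estimate_tokens(prompt);
    if (tokens < 500)        local  += 2;
    else if (tokens <= 2000) local  += 1;
    else                     remote += 2;

    // Abbreviated keyword lists; the real router's lists are longer.
    for (std::string_view kw : {"docstring", "explain", "snippet", "format", "rename"})
        if (prompt.find(kw) != std::string::npos) { local += 2; break; }
    for (std::string_view kw : {"architecture", "security", "refactor", "why", "design"})
        if (prompt.find(kw) != std::string::npos) { remote += 2; break; }

    if (prompt.find("---") != std::string::npos) remote += 1;  // multi-file context
    if (prompt.find('?')   != std::string::npos) remote += 1;  // question mark
    if (prompt.ends_with("```"))                 local  += 1;  // trailing code block

    return local > remote ? Backend::Local : Backend::Remote;  // tie → remote
}
````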
Statistics are printed at the end of every session — after /exit in the REPL or at the end of a pipe run:
```text
─────────────────────────────────────────
Session Statistics
Total requests: 42
→ local (ollama): 31 (73 %)
→ remote (Claude): 11 (27 %)
Estimated API savings: ~0.09 USD
─────────────────────────────────────────
```
The cost estimate is based on the average price per request for claude-sonnet-4-5 and shows how much the locally handled requests saved in API costs.
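As a back-of-the-envelope check (the exact per-request average is internal to aiorch): at an assumed ~0.003 USD per claude-sonnet-4-5 request, the 31 locally handled requests above work out to 31 × 0.003 ≈ 0.09 USD saved.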
aiorch can be registered as an MCP server directly inside Claude Code:
```bash
# Register once
claude mcp add aiorch -- ~/.local/bin/aiorch --mcp

# Verify connection
claude mcp list
# aiorch: /home/malo/.local/bin/aiorch --mcp - ✓ Connected
```

Three tools are then available inside Claude Code:
| Tool | Function |
|---|---|
| `local_complete` | Send a prompt directly to ollama and return the response |
| `route_query` | Determine the routing decision for a given prompt |
| `clear_context` | Reset the orchestrator's context |
MCP logs are written to `~/.aiorch_mcp.log` (never to stdout — that is reserved for JSON-RPC).
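
A rough sketch of such a stdio tool dispatch, following the tool names above (illustrative only; the MCP handshake, `tools/list`, and the real tool bodies are omitted, and none of this is the project's actual code):

```cpp
#include <iostream>
#include <string>
#include <nlohmann/json.hpp>

using json = nlohmann::json;

int main() {
    std::string line;
    // One JSON-RPC message per line on stdin; replies go to stdout only.
    // Diagnostics would go to ~/.aiorch_mcp.log, never to stdout.
    while (std::getline(std::cin, line)) {
        json req = json::parse(line, nullptr, /*allow_exceptions=*/false);
        if (req.is_discarded() || req.value("method", "") != "tools/call") continue;
        if (!req.contains("params") || !req["params"].contains("name")) continue;

        const std::string tool = req["params"]["name"].get<std::string>();
        std::string text;
        if (tool == "local_complete")      text = "(ollama response here)";   // would call ollama
        else if (tool == "route_query")    text = "(routing decision here)";  // would run the router
        else if (tool == "clear_context")  text = "context cleared";
        else                               text = "unknown tool: " + tool;

        json resp;
        resp["jsonrpc"] = "2.0";
        resp["id"] = req.contains("id") ? req["id"] : json(0);
        resp["result"]["content"] = json::array({ {{"type", "text"}, {"text", text}} });
        std::cout << resp.dump() << "\n" << std::flush;
    }
}
```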
```text
aiorch/
├── CMakeLists.txt
├── docs/
│   └── cpp_orchestrator_architecture_en.svg
├── src/
│   ├── main.cpp                  — CLI entry point, REPL, pipe mode, MCP server
│   ├── config.hpp/.cpp           — configuration, key loading order
│   ├── context_manager.hpp/.cpp  — message history, sliding window, token estimation
│   ├── router.hpp/.cpp           — scoring-based task router
│   ├── anthropic_client.hpp/.cpp — Anthropic API, HTTPS, SSE streaming
│   └── ollama_client.hpp/.cpp    — ollama REST API, HTTP streaming
└── tests/
    ├── test_router.cpp           — 5 router tests (scoring, keywords, tie-break)
    ├── test_config.cpp           — 3 config tests (defaults, file, env vars)
    └── test_context.cpp          — 4 context tests (tokens, sliding window, clear)
```
Similar approaches exist — here is how aiorch differs from each of them.
| Project | Approach | Key Difference |
|---|---|---|
| ollama-prompt | A tool for Claude Code that lets Claude explicitly delegate subtasks to ollama | Not an orchestrator — no routing logic, no automatic decision. Claude decides manually via tool call. |
| MCP Server ollama-claude | An MCP server that forwards all requests to a local ollama instance | No routing — everything goes local regardless of complexity. No fallback to Claude. |
| CliGate | A local proxy that intercepts Claude Code requests and redirects them to ollama | Proxy-based, rule-configured, no scoring. Typically requires Node.js or Python runtime. |
1. Scoring-based routing — not a fixed rule
The others use static rules ("always local" or "always remote"). aiorch weighs multiple signals simultaneously — token count, keyword categories, context size, code blocks, question marks — and produces a score for each side. The higher score wins.
2. Native C++23 — no runtime dependency
ollama-prompt and CliGate are typically Node.js or Python based. aiorch compiles to a single self-contained binary with no interpreter, no npm install, no virtual environment.
3. One binary — three modes
```bash
aiorch          # interactive REPL
aiorch --pipe   # scripting / CI
aiorch --mcp    # MCP server for Claude Code
```

No other tool in this space combines all three in a single binary.
4. Automatic fallback
If ollama is unreachable, aiorch transparently falls back to Claude and prints a warning on stderr. None of the alternatives handle this gracefully — they either crash or return an error.
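
A sketch of that fallback shape using cpp-httplib (illustrative only: `ask_ollama`, `ask_claude`, and `complete` are hypothetical names, `ask_claude` is stubbed out, and only the documented 30 s ollama timeout is shown):

```cpp
#include <iostream>
#include <optional>
#include <string>

#include <httplib.h>          // cpp-httplib, fetched via CMake FetchContent
#include <nlohmann/json.hpp>

// Ask ollama; std::nullopt signals "unreachable or failed".
std::optional<std::string> ask_ollama(const std::string& prompt) {
    httplib::Client cli("http://localhost:11434");
    cli.set_connection_timeout(30);  // documented 30 s ollama timeout
    cli.set_read_timeout(30);
    nlohmann::json body = {
        {"model", "qwen2.5-coder:14b"}, {"prompt", prompt}, {"stream", false}};
    auto res = cli.Post("/api/generate", body.dump(), "application/json");
    if (!res || res->status != 200) return std::nullopt;
    return nlohmann::json::parse(res->body)["response"].get<std::string>();
}

// Stub standing in for the real Anthropic client (120 s timeout).
std::string ask_claude(const std::string&) { return "(remote answer)"; }

std::string complete(const std::string& prompt) {
    if (auto local = ask_ollama(prompt)) return *local;
    std::cerr << "warning: ollama unreachable, falling back to Claude\n";
    return ask_claude(prompt);
}
```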
5. Observability built in
Session statistics, a self-test flag (--selftest), MCP request logging to ~/.aiorch_mcp.log, and a unit test suite are part of the project from the start — not afterthoughts.
MIT License — see LICENSE.
This project was conceived, built, and shipped by Martin Lonkwitz across six development sessions — from the initial project scaffold through the scoring router, MCP server, unit test suite, and production hardening.
Architecture, implementation guidance, code review, and documentation were developed in close collaboration with Claude (Anthropic) — acting as a technical sparring partner throughout the entire build. The orchestrator pattern at the heart of aiorch is itself a reflection of how that collaboration worked: a capable local effort directed and enriched by a powerful remote intelligence.
Open source dependencies:
- cpp-httplib by Yuji Hirose
- nlohmann/json by Niels Lohmann
- linenoise by Salvatore Sanfilippo
- doctest by Viktor Kirilov
- ollama — local LLM inference
- Anthropic — Claude API