mawelo/aiorch

aiorch — AI Orchestrator

A C++23 orchestrator that intelligently combines Claude (Anthropic API) and local models (ollama) — as an interactive CLI, pipe tool, and MCP server for Claude Code.


Idea & Motivation

Anyone working with AI assistants daily quickly runs into a dilemma: simple tasks like "write a docstring" or "explain this snippet" don't really need a cloud API — but they still cost money and add latency. Complex questions about architecture, security, or large codebases, on the other hand, benefit enormously from Claude's full context window and reasoning power.

aiorch solves this with an orchestrator pattern:

Simple task   →  local model  (ollama, free, fast)
Complex task  →  Claude API   (powerful, context-rich)

A scoring-based router makes the routing decision automatically — based on token count, keywords, and context signals. The result: significantly lower API costs while maintaining full quality for complex tasks.


Architecture

Architecture diagram: see docs/cpp_orchestrator_architecture_en.svg


Features

  • Intelligent routing — scoring-based (token count, keywords, context signals, question marks, code blocks)
  • Interactive REPL — with linenoise, history (~/.aiorch_history), ANSI colors and REPL commands
  • Pipe mode — for shell scripting and CI integration
  • MCP server — direct integration as a tool in Claude Code
  • Automatic fallback — ollama unreachable → transparent switch to Claude
  • Session statistics — local/remote distribution, estimated API cost savings
  • Self-test — checks all backends and credentials at once
  • Sliding-window context — max. 50 messages, system message always preserved
  • Timeouts — ollama 30 s, Anthropic 120 s, no process hang
  • Unit tests — 12 tests with doctest (Router, Config, Context)

Prerequisites

| Component | Version |
| --- | --- |
| Compiler | clang++ 18+ or g++ 13+ (C++23) |
| CMake | 3.25+ |
| OpenSSL | 3.0+ |
| ollama | any recent version |
| ollama model | qwen2.5-coder:14b (recommended) or any other |
| Anthropic API key | sk-ant-api03-... (from console.anthropic.com) |

Dependencies such as linenoise and doctest are fetched automatically via CMake FetchContent.


Build

git clone https://github.com/<your-user>/aiorch.git
cd aiorch

# Debug build (development)
cmake -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build

# Release build (production)
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build

# Smoke test
./build/aiorch --version

Run Unit Tests

cmake --build build --target tests
ctest --test-dir build --output-on-failure

Expected output:

Test project /home/malo/src/aiorch/build
    Start 1: aiorch_tests
1/1 Test #1: aiorch_tests ...........   Passed    0.12 sec

100% tests passed, 0 tests failed out of 1

Installation

# Install to ~/.local/bin/aiorch
cmake --install build --prefix ~/.local

# Make sure ~/.local/bin is in your PATH
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc

# Verify
aiorch --version

Configuration

1. Anthropic API Key

Get your API key from console.anthropic.com (starts with sk-ant-api03-...).

mkdir -p ~/.claude
cat > ~/.claude/aiorch.json << 'EOF'
{
  "apikey": "sk-ant-api03-YOUR-KEY-HERE"
}
EOF
chmod 600 ~/.claude/aiorch.json

Key loading order (first found wins):

  1. ~/.claude/aiorch.json → field "apikey" (primary — real API key)
  2. ~/.claude/.credentials.json → field "apiKey" (Claude Code login, usually OAuth token)
  3. Environment variable ANTHROPIC_API_KEY

2. Configuration file (optional)

aiorch works out of the box without this file — it starts with sensible defaults (ollama at http://localhost:11434, model qwen2.5-coder:14b, etc.). Create ~/.aiorch.conf only if you want to override those defaults; it is never generated automatically.

cat > ~/.aiorch.conf << 'EOF'
# aiorch configuration
ollama_endpoint   = http://localhost:11434
ollama_model      = qwen2.5-coder:14b
anthropic_model   = claude-sonnet-4-5
context_threshold = 2000
history_file      = ~/.aiorch_history
log_level         = info   # debug | info | warn | error
EOF

Loading order:

~/.aiorch.conf  →  environment variables  →  CLI arguments

Use AIORCH_CONFIG to point to a different config file:

AIORCH_CONFIG=/path/to/my.conf aiorch

3. Set up ollama

# Install ollama (if not already installed)
curl -fsSL https://ollama.ai/install.sh | sh

# Pull the recommended model
ollama pull qwen2.5-coder:14b

# Start ollama (runs as a background service)
ollama serve

Usage

Interactive REPL

aiorch
aiorch> explain this snippet
[Response streams token by token]
(local: qwen2.5-coder:14b)

aiorch> What is the architecture behind this design pattern?
[Response streams token by token]
(remote: claude-sonnet-4-5)

aiorch> /exit

REPL commands:

| Command | Function |
| --- | --- |
| /clear | Reset context |
| /history | Show conversation history |
| /backend | Show routing decision for last input |
| /model | Show currently active model |
| /exit | Quit (also Ctrl+D, Ctrl+C) |

Pipe Mode (Scripting)

# Simple query
echo "Write a docstring for this function: int add(int a, int b)" | aiorch --pipe

# Analyze file contents
cat myfile.cpp | aiorch --pipe

# In shell scripts
RESULT=$(echo "explain this code" | aiorch --pipe --local)

# Exit code: 0 = success, 1 = error

Force a backend

aiorch --local    # always ollama, no remote routing
aiorch --remote   # always Claude API, no local routing

Override model

aiorch --model llama3.2           # different ollama model for this session
aiorch --model llama3.2 --local   # combined with local override

Self-test

aiorch --selftest
aiorch selftest
───────────────────────────────────────────
Credentials ~/.claude/aiorch.json   OK
ollama  http://localhost:11434      OK
Anthropic API                       OK
───────────────────────────────────────────
All checks passed. Exit code: 0

Routing Logic

The router uses a scoring system: both sides accumulate points, and the higher score wins. On a tie, Remote wins (the safer default).

| Criterion | Points |
| --- | --- |
| Token count < 500 | +2 local |
| Token count 500–2000 | +1 local |
| Token count > 2000 | +2 remote |
| Local keyword found (docstring, explain, snippet, format, rename, …) | +2 local |
| Remote keyword found (architecture, security, refactor, why, design, …) | +2 remote |
| Multiple files in context (--- separator detected) | +1 remote |
| Question mark in prompt | +1 remote |
| Prompt ends with code block (```) | +1 local |

Session Statistics

Statistics are printed at the end of every session — after /exit in the REPL or at the end of a pipe run:

─────────────────────────────────────────
Session Statistics
  Total requests:          42
  → local  (ollama):       31  (73 %)
  → remote (Claude):       11  (27 %)
  Estimated API savings:  ~0.09 USD
─────────────────────────────────────────

The cost estimate is based on the average price per request for claude-sonnet-4-5 and shows how much the locally handled requests saved in API costs.


Claude Code Integration (MCP Server)

aiorch can be registered as an MCP server directly inside Claude Code:

# Register once
claude mcp add aiorch -- ~/.local/bin/aiorch --mcp

# Verify connection
claude mcp list
# aiorch: /home/malo/.local/bin/aiorch --mcp - ✓ Connected

Three tools are then available inside Claude Code:

| Tool | Function |
| --- | --- |
| local_complete | Send a prompt directly to ollama and return the response |
| route_query | Determine the routing decision for a given prompt |
| clear_context | Reset the orchestrator's context |

MCP logs are written to ~/.aiorch_mcp.log (never to stderr — that is reserved for JSON-RPC).


Project Structure

aiorch/
├── CMakeLists.txt
├── docs/
│   └── cpp_orchestrator_architecture_en.svg
├── src/
│   ├── main.cpp                  — CLI entry point, REPL, pipe mode, MCP server
│   ├── config.hpp / .cpp         — configuration, key loading order
│   ├── context_manager.hpp/.cpp  — message history, sliding window, token estimation
│   ├── router.hpp / .cpp         — scoring-based task router
│   ├── anthropic_client.hpp/.cpp — Anthropic API, HTTPS, SSE streaming
│   └── ollama_client.hpp/.cpp    — ollama REST API, HTTP streaming
└── tests/
    ├── test_router.cpp            — 5 router tests (scoring, keywords, tie-break)
    ├── test_config.cpp            — 3 config tests (defaults, file, env vars)
    └── test_context.cpp           — 4 context tests (tokens, sliding window, clear)

Prior Art & Comparison

Similar approaches exist — here is how aiorch differs from each of them.

| Project | Approach | Key Difference |
| --- | --- | --- |
| ollama-prompt | A tool for Claude Code that lets Claude explicitly delegate subtasks to ollama | Not an orchestrator — no routing logic, no automatic decision. Claude decides manually via tool call. |
| MCP Server ollama-claude | An MCP server that forwards all requests to a local ollama instance | No routing — everything goes local regardless of complexity. No fallback to Claude. |
| CliGate | A local proxy that intercepts Claude Code requests and redirects them to ollama | Proxy-based, rule-configured, no scoring. Typically requires Node.js or Python runtime. |

What makes aiorch different

1. Scoring-based routing — not a fixed rule

The others use static rules ("always local" or "always remote"). aiorch weighs multiple signals simultaneously — token count, keyword categories, context size, code blocks, question marks — and produces a score for each side. The higher score wins.

2. Native C++23 — no runtime dependency

ollama-prompt and CliGate are typically Node.js or Python based. aiorch compiles to a single self-contained binary with no interpreter, no npm install, no virtual environment.

3. One binary — three modes

aiorch                # interactive REPL
aiorch --pipe         # scripting / CI
aiorch --mcp          # MCP server for Claude Code

No other tool in this space combines all three in a single binary.

4. Automatic fallback

If ollama is unreachable, aiorch transparently falls back to Claude and prints a warning on stderr. None of the alternatives handles this gracefully; they either crash or return an error.

5. Observability built in

Session statistics, a self-test flag (--selftest), MCP request logging to ~/.aiorch_mcp.log, and a unit test suite are part of the project from the start — not afterthoughts.


License

MIT License — see LICENSE.


Acknowledgements

This project was conceived, built, and shipped by Martin Lonkwitz across six development sessions — from the initial project scaffold through the scoring router, MCP server, unit test suite, and production hardening.

Architecture, implementation guidance, code review, and documentation were developed in close collaboration with Claude (Anthropic) — acting as a technical sparring partner throughout the entire build. The orchestrator pattern at the heart of aiorch is itself a reflection of how that collaboration worked: a capable local effort directed and enriched by a powerful remote intelligence.

Open source dependencies include linenoise (REPL line editing) and doctest (unit tests), fetched via CMake FetchContent.
