A GitHub Copilot API proxy written in Rust. It exposes standard OpenAI and Anthropic compatible HTTP endpoints so any tool (Claude Code, the Codex CLI, OpenAI/Anthropic SDKs, etc.) can talk to GitHub Copilot models.
This is a Rust backend port of the ghc-tunnel Node.js project.
📖 Documentation: https://martinforreal.github.io/ghc-proxy/
# Build
cargo build --release
# Run — on first launch in a terminal this opens the interactive setup wizard
./target/release/ghc-proxy
# Re-run the setup wizard at any time
./target/release/ghc-proxy --setup
# Generate the default config file non-interactively and exit
./target/release/ghc-proxy --config

On a first run with no config file, when launched from a terminal, the proxy
opens an interactive setup wizard that signs you in to GitHub (Device Flow),
fetches the live model catalog, and helps you configure model mappings. In
headless or piped contexts the wizard is skipped: the proxy falls back to GitHub
Device Flow auth (or a *_TOKEN environment variable) and a default config
file.
- OpenAI-compatible /v1/chat/completions and /v1/responses endpoints (with Codex adapters: apply_patch tool rewrite, X-Initiator header, context compaction trimming, service_tier nulling, unsupported-tool stripping).
- Anthropic-compatible /v1/messages endpoint (direct passthrough when the upstream model supports it, otherwise translated through chat completions).
- Gemini-compatible /v1beta/models/{model}:generateContent, :streamGenerateContent, and :countTokens endpoints (translated through chat completions).
- GitHub Models inference: requests whose model id uses the publisher/model form (e.g. openai/gpt-4o) are transparently routed to the GitHub Models API instead of Copilot, authenticated with the raw GitHub token (which must carry the models scope). Enabled by default; the catalog is merged into /v1/models.
- Optional API-key authentication on the LLM endpoints (Authorization: Bearer, x-api-key, or x-goog-api-key), disabled by default and compared in constant time.
- OpenAPI spec served at /openapi.json describing every LLM endpoint.
- Automatic model name translation via configurable exact/prefix mappings.
- Streaming support (SSE) for all endpoints.
- Retry with exponential backoff for upstream connection errors.
- Content filtering (system prompt add/remove, tool-result suffix removal).
- Copilot token management with automatic refresh.
- Orphaned tool_use_id recovery: retries with offending tool results stripped when the upstream returns the corresponding 400 error.
- Request analytics dashboard at / and a request browser at /requests.
- Interactive setup wizard (--setup, or first launch in a terminal): GitHub sign-in, live model catalog, and model-mapping configuration.
- 1M-context support: forwards the anthropic-beta: context-1m-2025-08-07 header for models whose catalog advertises an extended context window.

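To make the retry-with-exponential-backoff item in the feature list concrete, here is a minimal Python sketch of the general technique. It is not the proxy's actual Rust code: the function name, the 0.5 second base delay, and the use of ConnectionError are assumptions for illustration; only the default of 3 retries mirrors max_connection_retries in the sample config.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    max_retries: int = 3,          # mirrors max_connection_retries in config.yaml
    base_delay: float = 0.5,       # assumed value, not taken from the proxy
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying connection errors with a doubling delay."""
    attempt = 0
    while True:
        try:
            return call()
        except ConnectionError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
            attempt += 1
```

Passing sleep as a parameter keeps the sketch testable without real delays; a production version would typically also add jitter.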
ghc-proxy [options]
-s, --setup Launch the interactive setup wizard (sign in + map models)
--claudecode Configure Claude Code (~/.claude/settings.json) to use this proxy (with --setup)
--codex Configure Codex (~/.codex/config.toml) to use this proxy (with --setup)
--gemini Configure Gemini CLI (~/.gemini/.env) to use this proxy (with --setup)
-d, --default Reset config to defaults during setup
-p, --port <port> Port to listen on (default: 8314)
-a, --address <addr> Address to listen on (default: 127.0.0.1)
--debug / --no-debug Toggle debug mode
--account-type <t> Account tier: individual | business | enterprise
-c, --config Generate the default config file (non-interactive)
auth Authenticate with GitHub and exit (CI/headless)
check-usage Print Copilot quota/usage and exit
info Print diagnostics (version, paths, token) and exit
--json Emit machine-readable JSON (with info)
--show-token Log GitHub and Copilot tokens on refresh
--rate-limit <secs> Minimum seconds between forwarded requests
--wait When rate limited, wait instead of returning HTTP 429
--manual Require interactive approval before each request
--fetch-version Fetch the latest VS Code version at startup
--no-fetch-version Disable dynamic VS Code version fetching
--auto-upgrade Auto-upgrade app when a newer release is available
--no-auto-upgrade Disable app auto-upgrade
--update-config Persist migrated config/default additions back to config.yaml
-v, --version Show version
-h, --help Show help
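The interaction between --rate-limit and --wait can be pictured with a small Python sketch. This is only an illustration of the documented behavior (reject with HTTP 429 by default, delay instead when --wait is set); the class name, method, and return values are hypothetical and do not come from the proxy's source.

```python
from typing import Optional, Tuple


class RateLimiter:
    """Enforce a minimum number of seconds between forwarded requests."""

    def __init__(self, min_interval: float, wait: bool = False) -> None:
        self.min_interval = min_interval
        self.wait = wait
        self._last: Optional[float] = None  # time the last request was forwarded

    def check(self, now: float) -> Tuple[str, float]:
        """Return ("forward", 0.0), ("wait", delay) or ("reject", retry_after)."""
        if self._last is None or now - self._last >= self.min_interval:
            self._last = now
            return ("forward", 0.0)
        remaining = self.min_interval - (now - self._last)
        if self.wait:
            # The request is forwarded once the delay has elapsed.
            self._last = now + remaining
            return ("wait", remaining)
        return ("reject", remaining)  # caller would answer with HTTP 429
```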
A GitHub token is resolved in this order:

- COPILOT_GITHUB_TOKEN, then GH_TOKEN, then GITHUB_TOKEN environment variables (matching the GitHub Copilot SDK precedence).
- Saved token file at <config-dir>/github_token.txt.
- Interactive GitHub Device Flow (the resulting token is saved for reuse, with 0600 permissions on Unix).

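The resolution order above can be sketched in Python as follows. The function name is hypothetical, and treating empty or whitespace-only values as unset is an assumption; the real logic lives in the Rust source (src/auth.rs) and may differ in detail.

```python
from pathlib import Path
from typing import Mapping, Optional

# Environment variables, checked in the documented order.
TOKEN_ENV_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")


def resolve_github_token(env: Mapping[str, str], config_dir: Path) -> Optional[str]:
    """Return the first token found, or None when Device Flow would be needed."""
    for name in TOKEN_ENV_VARS:
        value = env.get(name, "").strip()
        if value:  # assumption: blank values count as unset
            return value
    token_file = config_dir / "github_token.txt"
    if token_file.is_file():
        saved = token_file.read_text(encoding="utf-8").strip()
        if saved:
            return saved
    return None  # the caller would start the interactive Device Flow here
```

In a real process the first argument would be os.environ; passing it in explicitly keeps the sketch easy to test.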
The GitHub token is exchanged for a short-lived Copilot token via
https://api.github.com/copilot_internal/v2/token, which is refreshed
automatically before it expires.
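One common way to implement "refresh before it expires" is a small cache that re-fetches once the expiry is within a safety margin. The Python sketch below illustrates that idea only: the class name, the fetch callable (assumed to return a token and its expiry as a Unix timestamp), and the 60 second margin are all hypothetical and not taken from the proxy.

```python
from typing import Callable, Optional, Tuple


class CopilotTokenCache:
    """Hand out a cached token, re-fetching shortly before it expires."""

    def __init__(self, fetch: Callable[[], Tuple[str, float]], margin: float = 60.0) -> None:
        self._fetch = fetch      # hypothetical: performs the token exchange
        self._margin = margin    # assumed safety margin in seconds
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self, now: float) -> str:
        # Refresh when there is no token yet or expiry is within the margin.
        if self._token is None or now >= self._expires_at - self._margin:
            self._token, self._expires_at = self._fetch()
        return self._token
```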
The interactive Device Flow requests the read:user copilot models scopes. The
models scope authorizes the GitHub Models inference API; if
you supply your own token instead, give it the models scope (classic/OAuth
token) or the models: read permission (fine-grained PAT) to use GitHub Models.
By default the proxy accepts all local requests. To require a key, set api_key
in config.yaml (or the GHC_PROXY_API_KEY environment variable). When set,
every request to the LLM endpoints must present a matching key, compared in
constant time:
# Anthropic / OpenAI style
curl http://127.0.0.1:8314/v1/messages -H "x-api-key: my-secret-key" ...
curl http://127.0.0.1:8314/v1/chat/completions -H "Authorization: Bearer my-secret-key" ...
# Gemini style
curl "http://127.0.0.1:8314/v1beta/models/gemini-2.5-pro:generateContent" -H "x-goog-api-key: my-secret-key" ...

The dashboard, metrics, and static pages remain open so local monitoring keeps working without a key.
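For readers unfamiliar with constant-time comparison, the key check described above might look roughly like this in Python. The function names and the lowercase-header convention are assumptions for illustration; hmac.compare_digest is the standard-library primitive for this kind of comparison, and the proxy's Rust implementation will use its own equivalent.

```python
import hmac
from typing import Mapping, Optional


def extract_key(headers: Mapping[str, str]) -> Optional[str]:
    """Pull the client key from any of the three accepted header styles.

    Assumes header names have already been lowercased.
    """
    auth = headers.get("authorization", "")
    if auth.lower().startswith("bearer "):
        return auth[len("bearer "):].strip()
    return headers.get("x-api-key") or headers.get("x-goog-api-key")


def is_authorized(headers: Mapping[str, str], configured_key: Optional[str]) -> bool:
    if not configured_key:
        return True  # no api_key configured: authentication is disabled
    presented = extract_key(headers)
    if presented is None:
        return False
    # compare_digest does not short-circuit on the first differing byte,
    # so timing does not reveal how much of the key matched.
    return hmac.compare_digest(presented.encode(), configured_key.encode())
```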
Running ghc-proxy --setup — or launching the proxy for the first time from a
terminal with no config file — opens an interactive wizard that:
- Prompts for the server settings (listen address, port, account tier).
- Signs in to GitHub via Device Flow and saves the token.
- Fetches the live model catalog and lets you map the opus/sonnet/haiku aliases to specific models (or keep the recommended defaults).
- Optionally configures Claude Code to route through the proxy.

The wizard only runs when attached to a terminal, so headless and CI launches
are unaffected (they fall back to environment/file tokens and a default config).
Pass --default to start the wizard from built-in defaults, or --claudecode
to include the Claude Code step automatically.
Config file: ~/.ghc-tunnel/config.yaml (%APPDATA%/ghc-tunnel/config.yaml
on Windows). It is generated on first run or with --config.
config_version: 2
address: 127.0.0.1
port: 8314
debug: false
account_type: individual # individual | business | enterprise
vscode_version: "1.123.0"
api_version: "2025-05-01"
copilot_version: "0.48.1"
auto_upgrade: false
model_mappings:
  exact:
    opus: claude-opus-4.8
    sonnet: claude-opus-4.8
    haiku: claude-haiku-4.5
  prefix:
    claude-sonnet-4-: claude-opus-4.8
github_models:
  enabled: true          # route publisher/model ids to GitHub Models
  # org: my-org          # attribute inference to an organization
  # token: ghp_xxx       # dedicated token (models scope / models:read)
system_prompt_remove: []
system_prompt_add: []
tool_result_suffix_remove: []
max_connection_retries: 3
# Optional: require this key on all LLM endpoints (Bearer / x-api-key /
# x-goog-api-key). Omit or leave empty to disable authentication.
# api_key: my-secret-key

Besides Copilot, GitHub offers a separate model inference service, GitHub Models, exposing OpenAI-compatible endpoints for models from OpenAI, Meta, Mistral, xAI, DeepSeek, and others. This proxy routes to it transparently.
Routing. GitHub Models identifies models by a publisher/model id (e.g.
openai/gpt-4o, meta/llama-4-maverick). When github_models.enabled is true
(the default), any request whose translated model id contains a / is sent to
GitHub Models instead of Copilot. Because Copilot model ids never contain a /,
the two never collide, and existing model mappings are unaffected. This works on
/v1/chat/completions, /v1/messages (translated), and the Gemini endpoints.
curl http://127.0.0.1:8314/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hi!"}]}'

Authentication. GitHub Models uses the raw GitHub token (not the Copilot
token) via Authorization: Bearer. The token must carry the models scope
(classic/OAuth tokens, including the one minted by the Device Flow) or the
models: read permission (fine-grained PATs). Tokens without it get an
Unauthorized response from GitHub. To use a dedicated token — e.g. a
fine-grained PAT scoped only to models: read — set github_models.token or the
GHC_PROXY_GITHUB_MODELS_TOKEN environment variable.
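As a rough illustration of the routing rule described above (translate the model name first, then check for a /), consider this Python sketch. The mapping values are copied from the sample config in this README; the function names, the "exact before prefix" ordering, and the first-match prefix rule are assumptions, not a description of src/translate.rs.

```python
from typing import Mapping

# Values taken from the sample config.yaml in this README.
EXACT = {"opus": "claude-opus-4.8", "sonnet": "claude-opus-4.8", "haiku": "claude-haiku-4.5"}
PREFIX = {"claude-sonnet-4-": "claude-opus-4.8"}


def translate_model(model: str, exact: Mapping[str, str], prefix: Mapping[str, str]) -> str:
    """Apply exact mappings first, then the first matching prefix (assumed order)."""
    if model in exact:
        return exact[model]
    for pre, target in prefix.items():
        if model.startswith(pre):
            return target
    return model  # unmapped names pass through unchanged


def pick_upstream(model: str, github_models_enabled: bool = True) -> str:
    """Route publisher/model ids to GitHub Models, everything else to Copilot."""
    translated = translate_model(model, EXACT, PREFIX)
    if github_models_enabled and "/" in translated:
        return "github-models"
    return "copilot"
```

With these mappings, openai/gpt-4o goes to GitHub Models while sonnet is first rewritten to claude-opus-4.8 and then sent to Copilot.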
Configuration.
github_models:
  enabled: true    # set false to always use Copilot
  org: my-org      # optional: attribute inference to an organization
  token: ghp_xxx   # optional: dedicated token (models scope / models:read)

| Environment variable | Effect |
|---|---|
| GHC_PROXY_GITHUB_MODELS_ENABLED | Enable/disable routing (true/1) |
| GHC_PROXY_GITHUB_MODELS_ORG | Attribute inference to an organization |
| GHC_PROXY_GITHUB_MODELS_TOKEN | Dedicated token for GitHub Models |

The GitHub Models catalog is merged into GET /v1/models so those ids show up in
the dashboard and model listings.
| Endpoint | Description |
|---|---|
| POST /v1/chat/completions | OpenAI chat completions |
| POST /v1/responses | OpenAI responses API (Codex) |
| GET /v1/models | List available models |
| POST /v1/messages | Anthropic messages API |
| POST /v1/messages/count_tokens | Anthropic token counting |
| POST /v1beta/models/{model}:generateContent | Gemini generate content |
| POST /v1beta/models/{model}:streamGenerateContent | Gemini streaming (SSE) |
| POST /v1beta/models/{model}:countTokens | Gemini token counting |
| GET /openapi.json | OpenAPI v3 specification |
| GET / | Web dashboard |
| GET /metrics/dashboard | Metrics dashboard UI |
| GET /metrics | OpenMetrics endpoint |
| GET /requests | Request browser |
| POST /api/config/reload | Reload config.yaml without restart |
| GET /api/models | All supported models (used by the dashboard) |

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8314/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

import anthropic

client = anthropic.Anthropic(base_url="http://127.0.0.1:8314", api_key="not-needed")
msg = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

curl http://127.0.0.1:8314/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

cargo build    # compile
cargo test     # run unit + integration tests
cargo clippy   # lint

| File | Responsibility |
|---|---|
| src/main.rs | CLI parsing and server startup |
| src/setup.rs | Interactive first-run setup wizard |
| src/config.rs | Config dir, YAML config, defaults, model-mapping defaults |
| src/auth.rs | GitHub token resolution (env/file/Device Flow), Copilot token exchange |
| src/state.rs | Shared state, token refresh, upstream header construction |
| src/translate.rs | Model-name translation (exact + prefix) |
| src/filters.rs | Content filtering and token estimation |
| src/anthropic.rs | Anthropic <-> OpenAI request/response/stream translation |
| src/gemini.rs | Gemini <-> OpenAI request/response/stream translation |
| src/responses.rs | Codex /v1/responses adapters |
| src/util.rs | Retry-with-backoff and orphaned tool-result handling |
| src/server.rs | Axum router and all HTTP handlers |
| src/store.rs | In-memory request store for the dashboard |

The proxy authenticates to GitHub Copilot by impersonating the official
VS Code Copilot Chat client. To do this faithfully it sends the same
identity headers that the real client sends to api.githubcopilot.com
(Editor-Version, Editor-Plugin-Version, User-Agent,
Copilot-Integration-Id, OpenAI-Intent, X-Interaction-Type,
X-GitHub-Api-Version, openai-organization, plus a persisted
vscode-machineid and a per-session vscode-sessionid, etc.). These are built
in AppState::copilot_headers / github_headers (src/state.rs) from the
version strings in src/config.rs.
For Anthropic-native /v1/messages requests, the proxy also forwards the
anthropic-beta: context-1m-2025-08-07 header for models whose catalog
advertises a context window larger than 200K tokens, unlocking the 1M-token
tier the same way the official client does.
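The header decision in the paragraph above amounts to a single threshold check. A minimal Python sketch, assuming "200K" means exactly 200,000 tokens and that the catalog exposes the context window as a plain integer (both assumptions; the function name is hypothetical):

```python
from typing import Dict

CONTEXT_1M_BETA = "context-1m-2025-08-07"
EXTENDED_CONTEXT_THRESHOLD = 200_000  # assumed exact value of "200K tokens"


def extended_context_headers(context_window_tokens: int) -> Dict[str, str]:
    """Return the extra header for models whose window exceeds the threshold."""
    if context_window_tokens > EXTENDED_CONTEXT_THRESHOLD:
        return {"anthropic-beta": CONTEXT_1M_BETA}
    return {}  # standard-context models get no extra header
```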
GitHub may reject requests that report stale client versions, so these values
occasionally need refreshing. The source of truth is the now open-source
microsoft/vscode-copilot-chat
extension and the VS Code Marketplace:
| Config value | Where to read it |
|---|---|
| copilot_version | latest GitHub.copilot-chat version on the VS Code Marketplace (or the version field in the extension's package.json) |
| vscode_version | latest VS Code stable release (https://update.code.visualstudio.com/api/releases/stable) |
| api_version | X-GitHub-Api-Version constant in src/platform/networking/common/networking.ts |

After updating the constants in src/config.rs, run the test suite and bump
the example values in this README.
This Rust port focuses on the core proxy behavior: authentication, token
management, model translation, all four API surfaces with streaming, content
filtering, retry, the CLI, and the dashboard. The following ghc-tunnel
auxiliary features are intentionally not ported: OneDrive config sync, the
ACP code agent, Codex config auto-repair, and the persistent on-disk analytics
database.

--setup launches an interactive wizard (GitHub sign-in, live model
catalog, model-mapping configuration) and writes/updates the config file; in
headless or piped contexts it instead re-renders the config non-interactively,
applying any CLI overrides or resetting to defaults with --default.

--codex patches ~/.codex/config.toml (adding a model_providers.ghc-proxy
block and selecting it), and --gemini patches ~/.gemini/.env (base URL,
model, and api-key auth selection). --claudecode patches
~/.claude/settings.json, merging env.ANTHROPIC_BASE_URL and ensuring
env.ANTHROPIC_API_KEY exists so Claude Code routes through this proxy
(existing settings are preserved, and an existing API key is left untouched).

The dashboard lists all supported models alongside the request statistics.

MIT