ghc-proxy

A GitHub Copilot API proxy written in Rust. It exposes OpenAI-, Anthropic-, and Gemini-compatible HTTP endpoints so that any tool (Claude Code, the Codex CLI, the Gemini CLI, the OpenAI and Anthropic SDKs, etc.) can talk to GitHub Copilot models.

This is a Rust backend port of the ghc-tunnel Node.js project.

📖 Documentation: https://martinforreal.github.io/ghc-proxy/

Quick Start

# Build
cargo build --release

# Run — on first launch in a terminal this opens the interactive setup wizard
./target/release/ghc-proxy

# Re-run the setup wizard at any time
./target/release/ghc-proxy --setup

# Generate the default config file non-interactively and exit
./target/release/ghc-proxy --config

On a first run with no config file, when launched from a terminal, the proxy opens an interactive setup wizard that signs you in to GitHub (Device Flow), fetches the live model catalog, and helps you configure model mappings. In headless or piped contexts the wizard is skipped: the proxy falls back to GitHub Device Flow auth (or a *_TOKEN environment variable) and a default config file.

Features

OpenAI-compatible /v1/chat/completions and /v1/responses endpoints (with Codex adapters: apply_patch tool rewrite, X-Initiator header, context compaction trimming, service_tier nulling, unsupported-tool stripping).
Anthropic-compatible /v1/messages endpoint (direct passthrough when the upstream model supports it, otherwise translated through chat completions).
Gemini-compatible /v1beta/models/{model}:generateContent, :streamGenerateContent, and :countTokens endpoints (translated through chat completions).
GitHub Models inference — requests whose model id uses the publisher/model form (e.g. openai/gpt-4o) are transparently routed to the GitHub Models API instead of Copilot, authenticated with the raw GitHub token (which must carry the models scope). Enabled by default; the catalog is merged into /v1/models.
Optional API-key authentication on the LLM endpoints (Authorization: Bearer, x-api-key, or x-goog-api-key), disabled by default and compared in constant time.
OpenAPI spec served at /openapi.json describing every LLM endpoint.
Automatic model name translation via configurable exact/prefix mappings.
Streaming support (SSE) for all endpoints.
Retry with exponential backoff for upstream connection errors.
Content filtering (system prompt add/remove, tool-result suffix removal).
Copilot token management with automatic refresh.
Orphaned tool_use_id recovery — retries with offending tool results stripped when the upstream returns the corresponding 400 error.
Request analytics dashboard at / and a request browser at /requests.
Interactive setup wizard (--setup, or first launch in a terminal): GitHub sign-in, live model catalog, and model-mapping configuration.
1M-context support — forwards the anthropic-beta: context-1m-2025-08-07 header for models whose catalog advertises an extended context window.
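
As an illustration of the retry feature listed above, here is a minimal Python sketch of retry with exponential backoff. It is not the proxy's implementation (that lives in Rust, in src/util.rs). The function name, the base delay, and the delay cap are assumptions made for this example; only the retry count corresponds to a real setting, max_connection_retries.

```python
import time


def retry_with_backoff(call, max_retries=3, base_delay=0.5, max_delay=8.0, sleep=time.sleep):
    """Run call() and retry on ConnectionError with exponentially growing delays.

    Hypothetical helper: base_delay and max_delay are illustrative values,
    not ghc-proxy settings. max_retries mirrors max_connection_retries.
    """
    attempt = 0
    while True:
        try:
            return call()
        except ConnectionError:
            if attempt >= max_retries:
                raise  # retries exhausted: surface the last error
            # Delay doubles on each attempt (base, 2*base, 4*base, ...), capped at max_delay.
            sleep(min(base_delay * (2 ** attempt), max_delay))
            attempt += 1
```

Only connection-level failures are retried in this sketch; any other exception propagates immediately, which matches the "upstream connection errors" wording above.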

CLI Options

ghc-proxy [options]

  -s, --setup             Launch the interactive setup wizard (sign in + map models)
      --claudecode        Configure Claude Code (~/.claude/settings.json) to use this proxy (with --setup)
      --codex             Configure Codex (~/.codex/config.toml) to use this proxy (with --setup)
      --gemini            Configure Gemini CLI (~/.gemini/.env) to use this proxy (with --setup)
  -d, --default           Reset config to defaults during setup
  -p, --port <port>       Port to listen on (default: 8314)
  -a, --address <addr>    Address to listen on (default: 127.0.0.1)
      --debug / --no-debug  Toggle debug mode
      --account-type <t>  Account tier: individual | business | enterprise
  -c, --config            Generate the default config file (non-interactive)
      auth                Authenticate with GitHub and exit (CI/headless)
      check-usage         Print Copilot quota/usage and exit
      info                Print diagnostics (version, paths, token) and exit
      --json              Emit machine-readable JSON (with info)
      --show-token        Log GitHub and Copilot tokens on refresh
      --rate-limit <secs> Minimum seconds between forwarded requests
      --wait              When rate limited, wait instead of returning HTTP 429
      --manual            Require interactive approval before each request
      --fetch-version     Fetch the latest VS Code version at startup
      --no-fetch-version  Disable dynamic VS Code version fetching
      --auto-upgrade      Auto-upgrade app when a newer release is available
      --no-auto-upgrade   Disable app auto-upgrade
      --update-config     Persist migrated config/default additions back to config.yaml
  -v, --version           Show version
  -h, --help              Show help
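
To make the interaction between --rate-limit and --wait concrete, the following Python sketch models a limiter that enforces a minimum interval between forwarded requests. It is an illustration only: the class name, method name, and the injectable clock are assumptions, and the real proxy may track time differently.

```python
import time


class RateLimiter:
    """Enforce a minimum interval between forwarded requests (illustrative only)."""

    def __init__(self, min_interval_secs, wait=False, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval_secs
        self.wait = wait          # corresponds to the --wait flag
        self.clock = clock
        self.sleep = sleep
        self.last = None          # time the previous request was admitted

    def admit(self):
        """Return 200 if the request may proceed, or 429 if it is rejected."""
        now = self.clock()
        if self.last is not None:
            remaining = self.min_interval - (now - self.last)
            if remaining > 0:
                if not self.wait:
                    return 429            # without --wait: reject immediately
                self.sleep(remaining)     # with --wait: block until the interval has passed
                now = self.clock()
        self.last = now
        return 200
```

A rejected request does not reset the interval in this sketch, so a client that keeps retrying is admitted as soon as the original interval elapses.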

Authentication

A GitHub token is resolved in this order:

COPILOT_GITHUB_TOKEN, then GH_TOKEN, then GITHUB_TOKEN environment variables (matching the GitHub Copilot SDK precedence).
Saved token file at <config-dir>/github_token.txt.
Interactive GitHub Device Flow (the resulting token is saved for reuse, with 0600 permissions on Unix).
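
The resolution order above can be sketched in Python as follows. This is a simplified model, not the code in src/auth.rs: the function name and return shape are hypothetical, and treating an empty variable or an empty file as "not set" is an assumption.

```python
import os
from pathlib import Path

# Environment variables in the precedence order documented above.
ENV_VARS = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")


def resolve_github_token(config_dir, env=os.environ):
    """Return (source, token), or ("device_flow", None) if sign-in is needed."""
    # 1. Environment variables, first non-empty one wins.
    for name in ENV_VARS:
        value = env.get(name, "").strip()
        if value:
            return name, value
    # 2. Token file saved by an earlier sign-in.
    token_file = Path(config_dir) / "github_token.txt"
    if token_file.is_file():
        value = token_file.read_text().strip()
        if value:
            return "file", value
    # 3. Nothing found: the caller would start the interactive Device Flow.
    return "device_flow", None
```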

The GitHub token is exchanged for a short-lived Copilot token via https://api.github.com/copilot_internal/v2/token, which is refreshed automatically before it expires.

The interactive Device Flow requests the read:user copilot models scopes. The models scope authorizes the GitHub Models inference API; if you supply your own token instead, give it the models scope (classic/OAuth token) or the models: read permission (fine-grained PAT) to use GitHub Models.

Endpoint Authentication

By default the proxy accepts all local requests. To require a key, set api_key in config.yaml (or the GHC_PROXY_API_KEY environment variable). When set, every request to the LLM endpoints must present a matching key, compared in constant time:

# Anthropic / OpenAI style
curl http://127.0.0.1:8314/v1/messages -H "x-api-key: my-secret-key" ...
curl http://127.0.0.1:8314/v1/chat/completions -H "Authorization: Bearer my-secret-key" ...
# Gemini style
curl "http://127.0.0.1:8314/v1beta/models/gemini-2.5-pro:generateContent" -H "x-goog-api-key: my-secret-key" ...

The dashboard, metrics, and static pages remain open so local monitoring keeps working without a key.
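
For readers unfamiliar with constant-time comparison, here is a small Python sketch of the check described in this section. The helper names are hypothetical, and the order in which the three headers are consulted is an assumption; the proxy itself is written in Rust and may differ in detail.

```python
import hmac


def extract_key(headers):
    """Pull the presented key from any of the three supported headers."""
    lowered = {k.lower(): v for k, v in headers.items()}
    auth = lowered.get("authorization", "")
    if auth.lower().startswith("bearer "):
        return auth[len("bearer "):].strip()
    # Assumed fallback order: x-api-key first, then x-goog-api-key.
    return lowered.get("x-api-key") or lowered.get("x-goog-api-key")


def is_authorized(headers, configured_key):
    """Allow everything when no key is configured; otherwise compare in constant time."""
    if not configured_key:
        return True  # authentication is disabled by default
    presented = extract_key(headers)
    if presented is None:
        return False
    # hmac.compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(presented.encode(), configured_key.encode())
```

A plain == comparison can return as soon as the first differing byte is found, which lets an attacker measure response times to guess a key byte by byte. hmac.compare_digest takes the same time regardless of where the mismatch occurs.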

Setup Wizard

Running ghc-proxy --setup — or launching the proxy for the first time from a terminal with no config file — opens an interactive wizard that:

Prompts for the server settings (listen address, port, account tier).
Signs in to GitHub via Device Flow and saves the token.
Fetches the live model catalog and lets you map the opus / sonnet / haiku aliases to specific models (or keep the recommended defaults).
Optionally configures Claude Code to route through the proxy.

The wizard only runs when attached to a terminal, so headless and CI launches are unaffected (they fall back to environment/file tokens and a default config). Pass --default to start the wizard from built-in defaults, or --claudecode to include the Claude Code step automatically.
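
The decision described above can be summarized by a short Python sketch. The function name and the returned labels are hypothetical, and which streams are checked for a terminal is an assumption.

```python
import sys


def startup_mode(setup_flag, config_exists, is_tty=None):
    """Decide what happens at launch: wizard, non-interactive config, or plain serve."""
    if is_tty is None:
        # Assumption: "attached to a terminal" means both stdin and stdout are TTYs.
        is_tty = sys.stdin.isatty() and sys.stdout.isatty()
    wants_setup = setup_flag or not config_exists
    if wants_setup and is_tty:
        return "wizard"
    if wants_setup:
        # Headless or piped: no prompts, a config file is written non-interactively.
        return "noninteractive-config"
    return "serve"
```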

Configuration

Config file: ~/.ghc-tunnel/config.yaml (%APPDATA%/ghc-tunnel/config.yaml on Windows). It is generated on first run or with --config.

config_version: 2
address: 127.0.0.1
port: 8314
debug: false
account_type: individual            # individual | business | enterprise
vscode_version: "1.123.0"
api_version: "2025-05-01"
copilot_version: "0.48.1"
auto_upgrade: false
model_mappings:
  exact:
    opus: claude-opus-4.8
    sonnet: claude-opus-4.8
    haiku: claude-haiku-4.5
  prefix:
    claude-sonnet-4-: claude-opus-4.8
github_models:
  enabled: true                     # route publisher/model ids to GitHub Models
  # org: my-org                     # attribute inference to an organization
  # token: ghp_xxx                  # dedicated token (models scope / models:read)
system_prompt_remove: []
system_prompt_add: []
tool_result_suffix_remove: []
max_connection_retries: 3

# Optional: require this key on all LLM endpoints (Bearer / x-api-key /
# x-goog-api-key). Omit or leave empty to disable authentication.
# api_key: my-secret-key
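
To show how the model_mappings section might be applied, here is a hedged Python sketch of exact and prefix translation. The real logic lives in src/translate.rs; this sketch assumes that exact matches take precedence over prefix matches, that the longest matching prefix wins, and that a prefix match replaces the whole model name. None of these details are confirmed by this README.

```python
def translate_model(name, mappings):
    """Map a requested model name to an upstream model id (illustrative only)."""
    exact = mappings.get("exact", {})
    if name in exact:
        return exact[name]
    prefixes = mappings.get("prefix", {})
    matches = [p for p in prefixes if name.startswith(p)]
    if matches:
        # Assumption: prefer the most specific (longest) prefix.
        return prefixes[max(matches, key=len)]
    return name  # no mapping: pass the name through unchanged


# The same mappings as in the example config above.
MAPPINGS = {
    "exact": {
        "opus": "claude-opus-4.8",
        "sonnet": "claude-opus-4.8",
        "haiku": "claude-haiku-4.5",
    },
    "prefix": {"claude-sonnet-4-": "claude-opus-4.8"},
}
```

Under these assumptions, translate_model("haiku", MAPPINGS) yields claude-haiku-4.5, while a name with no matching entry, such as gpt-4o, is returned unchanged.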

GitHub Models

Besides Copilot, GitHub offers a separate model inference service — GitHub Models — exposing OpenAI-compatible endpoints for models from OpenAI, Meta, Mistral, xAI, DeepSeek, and others. This proxy routes to it transparently.

Routing. GitHub Models identifies models by a publisher/model id (e.g. openai/gpt-4o, meta/llama-4-maverick). When github_models.enabled is true (the default), any request whose translated model id contains a / is sent to GitHub Models instead of Copilot. Because Copilot model ids never contain a /, the two never collide, and existing model mappings are unaffected. This works on /v1/chat/completions, /v1/messages (translated), and the Gemini endpoints.

curl http://127.0.0.1:8314/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hi!"}]}'
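
The routing rule is simple enough to state as code. The following Python sketch uses a hypothetical function name and return labels; it only restates the rule described above and is not taken from the proxy's source.

```python
def choose_upstream(translated_model_id, github_models_enabled=True):
    """Pick the upstream service for a model id that has already been translated."""
    # GitHub Models ids have the publisher/model form; Copilot ids never contain "/".
    if github_models_enabled and "/" in translated_model_id:
        return "github-models"
    return "copilot"
```

Note that the check applies to the translated id, so a model mapping that rewrites a name into the publisher/model form would also be routed to GitHub Models.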

Authentication. GitHub Models uses the raw GitHub token (not the Copilot token) via Authorization: Bearer. The token must carry the models scope (classic/OAuth tokens, including the one minted by the Device Flow) or the models: read permission (fine-grained PATs). Tokens without it get an Unauthorized response from GitHub. To use a dedicated token — e.g. a fine-grained PAT scoped only to models: read — set github_models.token or the GHC_PROXY_GITHUB_MODELS_TOKEN environment variable.

Configuration.

github_models:
  enabled: true          # set false to always use Copilot
  org: my-org            # optional: attribute inference to an organization
  token: ghp_xxx         # optional: dedicated token (models scope / models:read)

Environment variable	Effect
`GHC_PROXY_GITHUB_MODELS_ENABLED`	Enable/disable routing (`true`/`1`)
`GHC_PROXY_GITHUB_MODELS_ORG`	Attribute inference to an organization
`GHC_PROXY_GITHUB_MODELS_TOKEN`	Dedicated token for GitHub Models

The GitHub Models catalog is merged into GET /v1/models so those ids show up in the dashboard and model listings.

API Endpoints

Endpoint	Description
`POST /v1/chat/completions`	OpenAI chat completions
`POST /v1/responses`	OpenAI responses API (Codex)
`GET /v1/models`	List available models
`POST /v1/messages`	Anthropic messages API
`POST /v1/messages/count_tokens`	Anthropic token counting
`POST /v1beta/models/{model}:generateContent`	Gemini generate content
`POST /v1beta/models/{model}:streamGenerateContent`	Gemini streaming (SSE)
`POST /v1beta/models/{model}:countTokens`	Gemini token counting
`GET /openapi.json`	OpenAPI v3 specification
`GET /`	Web dashboard
`GET /metrics/dashboard`	Metrics dashboard UI
`GET /metrics`	OpenMetrics endpoint
`GET /requests`	Request browser
`POST /api/config/reload`	Reload config.yaml without restart
`GET /api/models`	All supported models (used by the dashboard)

Example Usage

OpenAI SDK

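from openai import OpenAI

# The api_key value is a placeholder; use your real key if api_key is set in config.yaml.
client = OpenAI(base_url="http://127.0.0.1:8314/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)  # text of the first completion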

Anthropic SDK

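import anthropic

# The api_key value is a placeholder; use your real key if api_key is set in config.yaml.
client = anthropic.Anthropic(base_url="http://127.0.0.1:8314", api_key="not-needed")
msg = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(msg.content[0].text)  # text of the first content block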

cURL

curl http://127.0.0.1:8314/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
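
Gemini endpoint (standard library)

The Gemini-compatible endpoint has no example above, so here is a minimal sketch using only the Python standard library. The function names are hypothetical. The request body follows the public Gemini generateContent format; the model name gemini-2.5-pro is the one used in the Endpoint Authentication section, and whether it is available depends on your account.

```python
import json
import urllib.request


def build_gemini_request(prompt, model="gemini-2.5-pro",
                         base_url="http://127.0.0.1:8314", api_key=None):
    """Build (but do not send) a generateContent request for the proxy."""
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when api_key is configured on the proxy.
        headers["x-goog-api-key"] = api_key
    return urllib.request.Request(
        f"{base_url}/v1beta/models/{model}:generateContent",
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request to a running proxy and return the parsed JSON response."""
    with urllib.request.urlopen(build_gemini_request(prompt, **kwargs)) as resp:
        return json.load(resp)
```

With the proxy running locally, generate("Hello!") sends the request and returns the decoded JSON body.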

Development

cargo build      # compile
cargo test       # run unit + integration tests
cargo clippy     # lint

Project Layout

File	Responsibility
`src/main.rs`	CLI parsing and server startup
`src/setup.rs`	Interactive first-run setup wizard
`src/config.rs`	Config dir, YAML config, defaults, model-mapping defaults
`src/auth.rs`	GitHub token resolution (env/file/Device Flow), Copilot token exchange
`src/state.rs`	Shared state, token refresh, upstream header construction
`src/translate.rs`	Model-name translation (exact + prefix)
`src/filters.rs`	Content filtering and token estimation
`src/anthropic.rs`	Anthropic <-> OpenAI request/response/stream translation
`src/gemini.rs`	Gemini <-> OpenAI request/response/stream translation
`src/responses.rs`	Codex `/v1/responses` adapters
`src/util.rs`	Retry-with-backoff and orphaned tool-result handling
`src/server.rs`	Axum router and all HTTP handlers
`src/store.rs`	In-memory request store for the dashboard

Mimicking the Copilot Client

The proxy authenticates to GitHub Copilot by impersonating the official VS Code Copilot Chat client. To do this faithfully, it sends the same identity headers that the real client sends to api.githubcopilot.com, including Editor-Version, Editor-Plugin-Version, User-Agent, Copilot-Integration-Id, OpenAI-Intent, X-Interaction-Type, X-GitHub-Api-Version, openai-organization, a persisted vscode-machineid, and a per-session vscode-sessionid. These are built in AppState::copilot_headers / github_headers (src/state.rs) from the version strings in src/config.rs.

For Anthropic-native /v1/messages requests, the proxy also forwards the anthropic-beta: context-1m-2025-08-07 header for models whose catalog advertises a context window larger than 200K tokens, unlocking the 1M-token tier the same way the official client does.
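
The 1M-context rule can be expressed as a small Python sketch. The constant and function names are hypothetical, and how the proxy combines this header with any anthropic-beta value sent by the client is not covered here.

```python
CONTEXT_1M_BETA = "context-1m-2025-08-07"
EXTENDED_CONTEXT_THRESHOLD = 200_000  # tokens; larger windows count as "extended"


def anthropic_beta_headers(context_window_tokens):
    """Return the extra header to forward for a model, given its catalog context window."""
    if context_window_tokens is not None and context_window_tokens > EXTENDED_CONTEXT_THRESHOLD:
        return {"anthropic-beta": CONTEXT_1M_BETA}
    return {}  # standard context window, or unknown: send nothing extra
```

A model whose catalog entry reports exactly 200,000 tokens does not qualify, since the rule requires a window larger than 200K.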

GitHub may reject requests that report stale client versions, so these values occasionally need refreshing. The source of truth is the now open-source microsoft/vscode-copilot-chat extension and the VS Code Marketplace:

Config value	Where to read it
`copilot_version`	latest `GitHub.copilot-chat` version on the VS Code Marketplace (or the `version` field in the extension's `package.json`)
`vscode_version`	latest VS Code stable release (`https://update.code.visualstudio.com/api/releases/stable`)
`api_version`	`X-GitHub-Api-Version` constant in `src/platform/networking/common/networking.ts`

After updating the constants in src/config.rs, run the test suite and bump the example values in this README.

Notes on Parity with `ghc-tunnel`

This Rust port focuses on the core proxy behavior: authentication, token management, model translation, all four API surfaces with streaming, content filtering, retry, the CLI, and the dashboard. The following ghc-tunnel auxiliary features are intentionally not ported: OneDrive config sync, the ACP code agent, Codex config auto-repair, and the persistent on-disk analytics database.

--setup launches an interactive wizard (GitHub sign-in, live model catalog, model-mapping configuration) and writes/updates the config file; in headless or piped contexts it instead re-renders the config non-interactively, applying any CLI overrides or resetting to defaults with --default.

--codex patches ~/.codex/config.toml (adding a model_providers.ghc-proxy block and selecting it), and --gemini patches ~/.gemini/.env (base URL, model, and api-key auth selection). --claudecode patches ~/.claude/settings.json, merging env.ANTHROPIC_BASE_URL and ensuring env.ANTHROPIC_API_KEY exists so Claude Code routes through this proxy (existing settings are preserved, and an existing API key is left untouched).

The dashboard lists all supported models alongside the request statistics.

License

MIT
