ghc-proxy

A GitHub Copilot API proxy written in Rust. It exposes standard OpenAI and Anthropic compatible HTTP endpoints so any tool (Claude Code, the Codex CLI, OpenAI/Anthropic SDKs, etc.) can talk to GitHub Copilot models.

This is a Rust backend port of the ghc-tunnel Node.js project.

📖 Documentation: https://martinforreal.github.io/ghc-proxy/

Quick Start

# Build
cargo build --release

# Run — on first launch in a terminal this opens the interactive setup wizard
./target/release/ghc-proxy

# Re-run the setup wizard at any time
./target/release/ghc-proxy --setup

# Generate the default config file non-interactively and exit
./target/release/ghc-proxy --config

On a first run with no config file, when launched from a terminal, the proxy opens an interactive setup wizard that signs you in to GitHub (Device Flow), fetches the live model catalog, and helps you configure model mappings. In headless or piped contexts the wizard is skipped: the proxy falls back to GitHub Device Flow auth (or a *_TOKEN environment variable) and a default config file.

Features

  • OpenAI-compatible /v1/chat/completions and /v1/responses endpoints (with Codex adapters: apply_patch tool rewrite, X-Initiator header, context compaction trimming, service_tier nulling, unsupported-tool stripping).
  • Anthropic-compatible /v1/messages endpoint (direct passthrough when the upstream model supports it, otherwise translated through chat completions).
  • Gemini-compatible /v1beta/models/{model}:generateContent, :streamGenerateContent, and :countTokens endpoints (translated through chat completions).
  • GitHub Models inference — requests whose model id uses the publisher/model form (e.g. openai/gpt-4o) are transparently routed to the GitHub Models API instead of Copilot, authenticated with the raw GitHub token (which must carry the models scope). Enabled by default; the catalog is merged into /v1/models.
  • Optional API-key authentication on the LLM endpoints (Authorization: Bearer, x-api-key, or x-goog-api-key), disabled by default and compared in constant time.
  • OpenAPI spec served at /openapi.json describing every LLM endpoint.
  • Automatic model name translation via configurable exact/prefix mappings.
  • Streaming support (SSE) for all endpoints.
  • Retry with exponential backoff for upstream connection errors.
  • Content filtering (system prompt add/remove, tool-result suffix removal).
  • Copilot token management with automatic refresh.
  • Orphaned tool_use_id recovery — retries with offending tool results stripped when the upstream returns the corresponding 400 error.
  • Request analytics dashboard at / and a request browser at /requests.
  • Interactive setup wizard (--setup, or first launch in a terminal): GitHub sign-in, live model catalog, and model-mapping configuration.
  • 1M-context support — forwards the anthropic-beta: context-1m-2025-08-07 header for models whose catalog advertises an extended context window.
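
The retry behavior listed above can be pictured with a short sketch. This is an illustration, not the proxy's code: the function name, the 0.5 s base delay, and the doubling factor are assumptions, and only the retry count mirrors the max_connection_retries setting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(call: Callable[[], T], max_retries: int = 3,
                       base_delay: float = 0.5,
                       sleep: Callable[[float], None] = time.sleep) -> T:
    """Retry `call` on ConnectionError, doubling the delay after each failure.

    Hypothetical helper: base_delay and the doubling factor are assumed values.
    """
    for attempt in range(max_retries + 1):
        try:
            return call()
        except ConnectionError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the original error
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

Only connection-level failures are retried here; an HTTP error response from the upstream would be returned to the caller unchanged.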

CLI Options

ghc-proxy [options]

  -s, --setup             Launch the interactive setup wizard (sign in + map models)
      --claudecode        Configure Claude Code (~/.claude/settings.json) to use this proxy (with --setup)
      --codex             Configure Codex (~/.codex/config.toml) to use this proxy (with --setup)
      --gemini            Configure Gemini CLI (~/.gemini/.env) to use this proxy (with --setup)
  -d, --default           Reset config to defaults during setup
  -p, --port <port>       Port to listen on (default: 8314)
  -a, --address <addr>    Address to listen on (default: 127.0.0.1)
      --debug / --no-debug  Toggle debug mode
      --account-type <t>  Account tier: individual | business | enterprise
  -c, --config            Generate the default config file (non-interactive)
      auth                Authenticate with GitHub and exit (CI/headless)
      check-usage         Print Copilot quota/usage and exit
      info                Print diagnostics (version, paths, token) and exit
      --json              Emit machine-readable JSON (with info)
      --show-token        Log GitHub and Copilot tokens on refresh
      --rate-limit <secs> Minimum seconds between forwarded requests
      --wait              When rate limited, wait instead of returning HTTP 429
      --manual            Require interactive approval before each request
      --fetch-version     Fetch the latest VS Code version at startup
      --no-fetch-version  Disable dynamic VS Code version fetching
      --auto-upgrade      Auto-upgrade app when a newer release is available
      --no-auto-upgrade   Disable app auto-upgrade
      --update-config     Persist migrated config/default additions back to config.yaml
  -v, --version           Show version
  -h, --help              Show help
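
To illustrate how --rate-limit and --wait interact, here is a minimal sketch of a limiter that enforces a minimum interval between forwarded requests. The class and method names are hypothetical, and the real implementation may differ (for example in how it handles concurrent requests).

```python
import time
from typing import Callable, Optional


class RateLimiter:
    """Hypothetical model of --rate-limit <secs> combined with --wait."""

    def __init__(self, min_interval: float, wait: bool,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self.wait = wait
        self._clock = clock
        self._sleep = sleep
        self._last: Optional[float] = None  # time of the last forwarded request

    def admit(self) -> int:
        """Return 200 if the request may be forwarded, else 429."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                if not self.wait:
                    return 429          # default: reject immediately
                self._sleep(remaining)  # --wait: block until the interval passes
                now = self._clock()
        self._last = now
        return 200
```

The clock and sleep functions are injectable so the behavior can be checked without real delays.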

Authentication

A GitHub token is resolved in this order:

  1. COPILOT_GITHUB_TOKEN, then GH_TOKEN, then GITHUB_TOKEN environment variables (matching the GitHub Copilot SDK precedence).
  2. Saved token file at <config-dir>/github_token.txt.
  3. Interactive GitHub Device Flow (the resulting token is saved for reuse, with 0600 permissions on Unix).
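
The resolution order above can be sketched as follows. The function name is hypothetical, and skipping empty or whitespace-only values is an assumption rather than documented behavior.

```python
from pathlib import Path
from typing import Mapping, Optional

# Environment variables in priority order, as listed above.
ENV_PRECEDENCE = ("COPILOT_GITHUB_TOKEN", "GH_TOKEN", "GITHUB_TOKEN")


def resolve_github_token(env: Mapping[str, str], config_dir: Path) -> Optional[str]:
    """Return the first available token, or None when Device Flow is needed."""
    for name in ENV_PRECEDENCE:
        value = env.get(name, "").strip()
        if value:  # assumption: empty values are treated as unset
            return value
    token_file = config_dir / "github_token.txt"
    if token_file.is_file():
        saved = token_file.read_text(encoding="utf-8").strip()
        if saved:
            return saved
    return None  # caller falls back to the interactive Device Flow
```

A caller would pass os.environ and the config directory; a None result corresponds to step 3.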

The GitHub token is exchanged for a short-lived Copilot token via https://api.github.com/copilot_internal/v2/token, which is refreshed automatically before it expires.

The interactive Device Flow requests three scopes: read:user, copilot, and models. The models scope authorizes the GitHub Models inference API; if you supply your own token instead, give it the models scope (classic/OAuth token) or the models: read permission (fine-grained PAT) to use GitHub Models.

Endpoint Authentication

By default the proxy accepts all local requests. To require a key, set api_key in config.yaml (or the GHC_PROXY_API_KEY environment variable). When set, every request to the LLM endpoints must present a matching key, compared in constant time:

# Anthropic / OpenAI style
curl http://127.0.0.1:8314/v1/messages -H "x-api-key: my-secret-key" ...
curl http://127.0.0.1:8314/v1/chat/completions -H "Authorization: Bearer my-secret-key" ...
# Gemini style
curl "http://127.0.0.1:8314/v1beta/models/gemini-2.5-pro:generateContent" -H "x-goog-api-key: my-secret-key" ...

The dashboard, metrics, and static pages remain open so local monitoring keeps working without a key.
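
As a rough sketch of the check described in this section, the following accepts a key from any of the three headers and compares it in constant time. The function names and the order in which headers are consulted are assumptions, not the proxy's actual code.

```python
import hmac
from typing import Mapping, Optional


def presented_key(headers: Mapping[str, str]) -> Optional[str]:
    """Extract a key from any of the three accepted headers."""
    h = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    auth = h.get("authorization", "")
    if auth[:7].lower() == "bearer ":
        return auth[7:].strip()
    return h.get("x-api-key") or h.get("x-goog-api-key")


def is_authorized(headers: Mapping[str, str], configured_key: Optional[str]) -> bool:
    if not configured_key:  # api_key unset or empty: authentication disabled
        return True
    key = presented_key(headers)
    if key is None:
        return False
    # compare_digest avoids leaking the position of the first mismatching byte
    return hmac.compare_digest(key.encode(), configured_key.encode())
```

A plain == comparison can return as soon as it finds a differing character, which is the timing signal that a constant-time comparison is meant to remove.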

Setup Wizard

Running ghc-proxy --setup — or launching the proxy for the first time from a terminal with no config file — opens an interactive wizard that:

  1. Prompts for the server settings (listen address, port, account tier).
  2. Signs in to GitHub via Device Flow and saves the token.
  3. Fetches the live model catalog and lets you map the opus / sonnet / haiku aliases to specific models (or keep the recommended defaults).
  4. Optionally configures Claude Code to route through the proxy.

The wizard only runs when attached to a terminal, so headless and CI launches are unaffected (they fall back to environment/file tokens and a default config). Pass --default to start the wizard from built-in defaults, or --claudecode to include the Claude Code step automatically.

Configuration

Config file: ~/.ghc-tunnel/config.yaml (%APPDATA%/ghc-tunnel/config.yaml on Windows). It is generated on first run or with --config.

config_version: 2
address: 127.0.0.1
port: 8314
debug: false
account_type: individual            # individual | business | enterprise
vscode_version: "1.123.0"
api_version: "2025-05-01"
copilot_version: "0.48.1"
auto_upgrade: false
model_mappings:
  exact:
    opus: claude-opus-4.8
    sonnet: claude-opus-4.8
    haiku: claude-haiku-4.5
  prefix:
    claude-sonnet-4-: claude-opus-4.8
github_models:
  enabled: true                     # route publisher/model ids to GitHub Models
  # org: my-org                     # attribute inference to an organization
  # token: ghp_xxx                  # dedicated token (models scope / models:read)
system_prompt_remove: []
system_prompt_add: []
tool_result_suffix_remove: []
max_connection_retries: 3

# Optional: require this key on all LLM endpoints (Bearer / x-api-key /
# x-goog-api-key). Omit or leave empty to disable authentication.
# api_key: my-secret-key
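
To make the model_mappings section concrete, here is a minimal sketch of exact-then-prefix translation using the values from the sample config. The function name, the longest-prefix tie-break, and the pass-through of unmapped names are assumptions; the dated model id in the usage note is only an illustrative input.

```python
from typing import Dict


def translate_model(model: str, exact: Dict[str, str], prefix: Dict[str, str]) -> str:
    """Map a requested model name to an upstream model id."""
    if model in exact:
        return exact[model]
    # Assumption: when several prefixes match, the longest one wins.
    for p in sorted(prefix, key=len, reverse=True):
        if model.startswith(p):
            return prefix[p]
    return model  # assumption: unmapped names pass through unchanged


# Values copied from the sample config.yaml above.
EXACT = {"opus": "claude-opus-4.8", "sonnet": "claude-opus-4.8", "haiku": "claude-haiku-4.5"}
PREFIX = {"claude-sonnet-4-": "claude-opus-4.8"}
```

With these tables, translate_model("haiku", EXACT, PREFIX) yields claude-haiku-4.5 and an id such as claude-sonnet-4-20250514 matches the prefix rule. The bare name claude-sonnet-4 matches neither rule, because the configured prefix ends with a hyphen.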

GitHub Models

Besides Copilot, GitHub offers a separate model inference service — GitHub Models — exposing OpenAI-compatible endpoints for models from OpenAI, Meta, Mistral, xAI, DeepSeek, and others. This proxy routes to it transparently.

Routing. GitHub Models identifies models by a publisher/model id (e.g. openai/gpt-4o, meta/llama-4-maverick). When github_models.enabled is true (the default), any request whose translated model id contains a / is sent to GitHub Models instead of Copilot. Because Copilot model ids never contain a /, the two never collide, and existing model mappings are unaffected. This works on /v1/chat/completions, /v1/messages (translated), and the Gemini endpoints.

curl http://127.0.0.1:8314/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-4o", "messages": [{"role": "user", "content": "Hi!"}]}'
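
The routing rule reduces to a single check on the translated model id. The sketch below uses hypothetical names and return values; only the slash test and the enabled flag come from the description above.

```python
def pick_upstream(translated_model: str, github_models_enabled: bool = True) -> str:
    """Choose the upstream service for an already-translated model id."""
    # publisher/model ids contain a slash; Copilot model ids never do.
    if github_models_enabled and "/" in translated_model:
        return "github_models"
    return "copilot"
```

Because the check runs on the translated id, a model mapping can also redirect a plain alias to a publisher/model id.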

Authentication. GitHub Models uses the raw GitHub token (not the Copilot token) via Authorization: Bearer. The token must carry the models scope (classic/OAuth tokens, including the one minted by the Device Flow) or the models: read permission (fine-grained PATs). Tokens without it get an Unauthorized response from GitHub. To use a dedicated token — e.g. a fine-grained PAT scoped only to models: read — set github_models.token or the GHC_PROXY_GITHUB_MODELS_TOKEN environment variable.

Configuration.

github_models:
  enabled: true          # set false to always use Copilot
  org: my-org            # optional: attribute inference to an organization
  token: ghp_xxx         # optional: dedicated token (models scope / models:read)

Environment variable             Effect
GHC_PROXY_GITHUB_MODELS_ENABLED  Enable/disable routing (true/1)
GHC_PROXY_GITHUB_MODELS_ORG      Attribute inference to an organization
GHC_PROXY_GITHUB_MODELS_TOKEN    Dedicated token for GitHub Models

The GitHub Models catalog is merged into GET /v1/models so those ids show up in the dashboard and model listings.

API Endpoints

Endpoint                                           Description
POST /v1/chat/completions                          OpenAI chat completions
POST /v1/responses                                 OpenAI responses API (Codex)
GET /v1/models                                     List available models
POST /v1/messages                                  Anthropic messages API
POST /v1/messages/count_tokens                     Anthropic token counting
POST /v1beta/models/{model}:generateContent        Gemini generate content
POST /v1beta/models/{model}:streamGenerateContent  Gemini streaming (SSE)
POST /v1beta/models/{model}:countTokens            Gemini token counting
GET /openapi.json                                  OpenAPI v3 specification
GET /                                              Web dashboard
GET /metrics/dashboard                             Metrics dashboard UI
GET /metrics                                       OpenMetrics endpoint
GET /requests                                      Request browser
POST /api/config/reload                            Reload config.yaml without restart
GET /api/models                                    All supported models (used by the dashboard)

Example Usage

OpenAI SDK

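from openai import OpenAI

# The SDK requires some api_key; any placeholder works unless api_key is set in config.yaml.
client = OpenAI(base_url="http://127.0.0.1:8314/v1", api_key="not-needed")
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(resp.choices[0].message.content)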

Anthropic SDK

import anthropic

# The SDK appends /v1/messages to base_url, so no /v1 suffix is needed here.
client = anthropic.Anthropic(base_url="http://127.0.0.1:8314", api_key="not-needed")
msg = client.messages.create(
    model="claude-sonnet-4",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(msg.content[0].text)

cURL

curl http://127.0.0.1:8314/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'

Development

cargo build      # compile
cargo test       # run unit + integration tests
cargo clippy     # lint

Project Layout

File              Responsibility
src/main.rs       CLI parsing and server startup
src/setup.rs      Interactive first-run setup wizard
src/config.rs     Config dir, YAML config, defaults, model-mapping defaults
src/auth.rs       GitHub token resolution (env/file/Device Flow), Copilot token exchange
src/state.rs      Shared state, token refresh, upstream header construction
src/translate.rs  Model-name translation (exact + prefix)
src/filters.rs    Content filtering and token estimation
src/anthropic.rs  Anthropic <-> OpenAI request/response/stream translation
src/gemini.rs     Gemini <-> OpenAI request/response/stream translation
src/responses.rs  Codex /v1/responses adapters
src/util.rs       Retry-with-backoff and orphaned tool-result handling
src/server.rs     Axum router and all HTTP handlers
src/store.rs      In-memory request store for the dashboard

Mimicking the Copilot Client

The proxy authenticates to GitHub Copilot by impersonating the official VS Code Copilot Chat client. To do this faithfully it sends the same identity headers that the real client sends to api.githubcopilot.com (Editor-Version, Editor-Plugin-Version, User-Agent, Copilot-Integration-Id, OpenAI-Intent, X-Interaction-Type, X-GitHub-Api-Version, openai-organization, plus a persisted vscode-machineid and a per-session vscode-sessionid, etc.). These are built in AppState::copilot_headers / github_headers (src/state.rs) from the version strings in src/config.rs.

For Anthropic-native /v1/messages requests, the proxy also forwards the anthropic-beta: context-1m-2025-08-07 header for models whose catalog advertises a context window larger than 200K tokens, unlocking the 1M-token tier the same way the official client does.
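
The header decision can be sketched as a small function. The names are hypothetical, and treating "200K" as exactly 200,000 tokens is an assumption; the catalog field that carries the context window is not shown here.

```python
from typing import Dict

CONTEXT_1M_BETA = "context-1m-2025-08-07"
EXTENDED_THRESHOLD = 200_000  # assumption: "200K" means exactly 200,000 tokens


def extra_anthropic_headers(context_window: int) -> Dict[str, str]:
    """Return the beta header only for models with an extended context window."""
    if context_window > EXTENDED_THRESHOLD:
        return {"anthropic-beta": CONTEXT_1M_BETA}
    return {}
```

Models at or below the threshold get no extra header, so requests to them are unchanged.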

GitHub may reject requests that report stale client versions, so these values occasionally need refreshing. The source of truth is the now open-source microsoft/vscode-copilot-chat extension and the VS Code Marketplace:

Config value     Where to read it
copilot_version  latest GitHub.copilot-chat version on the VS Code Marketplace (or the version field in the extension's package.json)
vscode_version   latest VS Code stable release (https://update.code.visualstudio.com/api/releases/stable)
api_version      X-GitHub-Api-Version constant in src/platform/networking/common/networking.ts

After updating the constants in src/config.rs, run the test suite and bump the example values in this README.

Notes on Parity with ghc-tunnel

This Rust port focuses on the core proxy behavior: authentication, token management, model translation, all four API surfaces with streaming, content filtering, retry, the CLI, and the dashboard. The following ghc-tunnel auxiliary features are intentionally not ported: OneDrive config sync, the ACP code agent, Codex config auto-repair, and the persistent on-disk analytics database.

--setup launches an interactive wizard (GitHub sign-in, live model catalog, model-mapping configuration) and writes/updates the config file; in headless or piped contexts it instead re-renders the config non-interactively, applying any CLI overrides or resetting to defaults with --default.

The client-configuration flags each patch one file:

  • --codex patches ~/.codex/config.toml (adding a model_providers.ghc-proxy block and selecting it).
  • --gemini patches ~/.gemini/.env (base URL, model, and api-key auth selection).
  • --claudecode patches ~/.claude/settings.json, merging env.ANTHROPIC_BASE_URL and ensuring env.ANTHROPIC_API_KEY exists so Claude Code routes through this proxy (existing settings are preserved, and an existing API key is left untouched).

The dashboard lists all supported models alongside the request statistics.

License

MIT
