Skip to content

feat(oauth): native OAuth 2.1 authorization for MCP clients#1870

Open
t0saki wants to merge 10 commits intovolcengine:mainfrom
t0saki:feat/oauth-mcp-sdk
Open

feat(oauth): native OAuth 2.1 authorization for MCP clients#1870
t0saki wants to merge 10 commits intovolcengine:mainfrom
t0saki:feat/oauth-mcp-sdk

Conversation

@t0saki
Copy link
Copy Markdown
Contributor

@t0saki t0saki commented May 6, 2026

Description

Native OAuth 2.1 implementation for openviking-server, enabling Claude.ai / Claude Desktop / ChatGPT / Cursor and other OAuth-only clients to connect directly — no more external MCP-Key2OAuth Cloudflare Worker proxy.

Built on the official mcp.server.auth SDK (already a dependency). OpenViking contributes a Provider adapter, a device-flow style authorization page, and a console-integrated verification form. All tokens are opaque random strings stored hashed in SQLite — zero cryptographic code on the OpenViking side.

Scope: While the immediate driver is MCP clients, the OAuth implementation is not MCP-specific. The protocol surface (DCR, authorize, token, metadata) is standards-compliant OAuth 2.1 — future browser apps, mobile clients, and third-party integrations can reuse the same authorization system. Routes and naming carry no "mcp" prefix.

Related Issue

Replaces the community MCP-Key2OAuth workaround documented in #1861.

Type of Change

  • New feature (non-breaking change that adds functionality)
  • Documentation update

Changes Made

Core OAuth

  • Provider adapterOAuthAuthorizationServerProvider Protocol (8 methods), tokens bound to (account_id, user_id, role) triple
  • Storage layer — SQLite 5 tables + atomic one-shot consumption + WAL + periodic GC + revoke_user_tokens cascade
  • Authorize page — Device-flow style HTML+JS: displays verification code → polls status → auto-redirects; detects localStorage on same origin for Quick authorize
  • Bearer routingovat_ prefix fast-path → DB lookup; fail-closed (never falls back to API key)
  • 401 discoveryWWW-Authenticate: Bearer resource_metadata="..." with 4-level origin resolution

Aggregated reverse proxy (port 1934)

  • Caddyfile — Default HTTP proxy merging API (1933) + Console (8020) onto :1934, one port for everything
  • docker-compose.yml — Caddy enabled by default; upgrading to HTTPS requires appending a domain block + uncommenting 3 lines
  • docs/guides/12-public-access.md — Full public access guide (Caddy / nginx / CDN)

Console integration

  • "Authorize an MCP client" form + collapsible "Get OTP" entry
  • Proxy routes for /ov/auth/oauth-verify and /ov/auth/otp

Documentation

  • docs/guides/11-oauth.md — OAuth user guide (recommended setup, client onboarding, curl walkthrough, troubleshooting)
  • docs/guides/12-public-access.md — Public access & reverse proxy guide
  • docs/design/mcp-oauth2-1.md — Internal design doc

Testing

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have tested this on the following platforms:
    • Linux
    • macOS
    • Windows

37 new tests covering: storage-layer atomic operations & race conditions, full device-flow happy path, bearer routing fail-closed semantics, refresh rotation + replay detection, WWW-Authenticate header injection. Existing test_auth.py regression: none.

End-to-end verified with ChatGPT Connectors on a live deployment.

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes

Key design decisions

Decision Choice Rationale
Protocol layer mcp.server.auth SDK Already a dependency; DCR / PKCE / redirect_uri validation maintained upstream
Token format Opaque + SQLite (not JWT) Zero crypto code; no signing key to manage; revocation is a single UPDATE
Authorization UX Device-flow (page shows code → user confirms in console) User confirms from an already-authenticated environment
Same-origin quick-authorize Explicit click required Security: authorization is never silent
Token permissions Phase 1: equivalent to the issuing API key Not a privilege escalation; Phase 2 adds scope to narrow it

Backward compatibility

  • oauth.enabled = false (default): OAuth routes not mounted, bearer discriminator skipped — behavior identical to before this PR
  • oauth.enabled = true: bearers without the ovat_ prefix still route through APIKeyManager — existing clients unaffected

Future work

Item Priority
OAuth scope mechanism (mcp / fs / admin) P1
Admin API exposing revoke_user_tokens P1
GitHub / Google third-party login P2
ov otp Rust CLI subcommand P3

t0saki added 8 commits May 6, 2026 15:32
…token, JWT discriminator)

Snapshot before evaluating migration to mcp.server.auth SDK provider. The
hand-rolled HS256 JWT implementation in openviking/server/oauth/jwt.py is
the main candidate for replacement: its surface area is small but it would
require careful crypto review by maintainers, while the official MCP SDK
already ships an OAuth provider wired into FastMCP.

Included so far:
- OAuthConfig + integration into OpenVikingConfig (default disabled)
- openviking/server/oauth/{jwt,storage,otp,router}.py
- POST /oauth/token (authorization_code + refresh_token, PKCE S256, RFC 6749 errors)
- JWT discriminator in resolve_identity (fail-closed; ResolvedIdentity.from_oauth)
- WWW-Authenticate Bearer hint on /mcp 401 (RFC 9728)
- 49 OAuth-specific unit/integration tests (all passing)

Not yet implemented (M3 / MVP gap):
- /oauth/register (DCR), /oauth/authorize (HTML + OTP submit), well-known metadata
- POST /api/v1/auth/otp REST endpoint
…wn JWT

Replaces the hand-rolled HS256 JWT signer / token endpoint / DCR with
the OAuth 2.1 surface shipped in mcp.server.auth. We supply a Provider
that adapts the existing OAuthStore (SQLite) to the SDK Protocol, plus
two custom routes the SDK doesn't own: an OTP-entry HTML page (the URL
provider.authorize() returns) and POST /api/v1/auth/otp for issuing
OTPs against an existing API key.

Net result: all OAuth crypto is now the SDK's responsibility (PKCE
S256, redirect_uri matching, error formatting). The OpenViking-side code
contains zero cryptography — access tokens are opaque random strings
prefixed with `ovat_` and looked up in SQLite by SHA-256 hash. Refresh
tokens, auth codes, OTPs use the same scheme.

Highlights:
- openviking/server/oauth/provider.py: OpenVikingOAuthProvider implements
  the 8-method SDK Protocol, including subclassing AuthorizationCode /
  RefreshToken / AccessToken to pin (account_id, user_id, role) per
  token. Refresh-token replay triggers per-user chain revocation.
- openviking/server/oauth/storage.py: adds oauth_access_tokens and
  oauth_pending_authorizations tables; peek_auth_code / peek_refresh
  for non-destructive lookups; revoke_user_tokens cascades all OAuth
  state for an (account, user) pair when a key is rotated.
- openviking/server/oauth/router.py: minimal authorize page (inline
  HTML with frame-ancestors 'none') + OTP endpoint authenticated via
  existing get_request_context dependency.
- openviking/server/auth.py: replaces JWT discriminator with prefix
  match + provider.load_access_token; still fail-closed.
- openviking/server/app.py: mounts SDK routes via create_auth_routes
  alongside our authorize-page + OTP routes.
- Deletes openviking/server/oauth/jwt.py and tests/server/oauth/test_jwt.py.

Tests: 32 passing, including a full DCR -> OTP -> authorize page ->
token-exchange -> /mcp lookup happy path, refresh rotation, and replay
detection. Existing test_auth.py regression unchanged.

Phase 1 still missing for full Claude.ai connectivity:
- WWW-Authenticate hint already present on /mcp 401 (from M2)
- /.well-known/oauth-protected-resource (RFC 9728) — not currently
  emitted by the SDK; small custom route still TODO.
The earlier draft described a hand-sewn HS256 JWT plan; the implementation
took a different route after discovering mcp.server.auth ships a complete
RFC 6749 / 7591 / 8414 server. Updated to reflect:

- SDK owns the protocol surface (DCR, /authorize parsing, /token, metadata,
  PKCE, redirect_uri matching, error codes).
- OpenViking only contributes a Provider implementation, the OTP-entry
  HTML page, and POST /api/v1/auth/otp.
- Tokens are opaque (ovat_ / ovrt_ / ovac_ prefixes) — no JWT, no crypto
  on our side.
- Implementation status: M1/M2/M3 done; only RFC 9728 protected-resource
  metadata + reverse-proxy issuer derivation remain for full Claude.ai
  end-to-end connectivity.
The /mcp 401 path already advertises this URL via WWW-Authenticate
Bearer resource_metadata="...", but the endpoint itself didn't exist —
clients fetched it and got a 404, which silently broke the discovery
chain even though /.well-known/oauth-authorization-server worked. Wire
up the resource metadata document so the full RFC 9728 → RFC 8414
discovery chain works end-to-end.

Uses mcp.shared.auth.ProtectedResourceMetadata pydantic model. Reads
X-Forwarded-Proto/Host so the published resource URL matches what the
client used (matches our existing WWW-Authenticate behavior).

Cache-Control: max-age=3600 — metadata is stable across requests.
Adds a "Get OTP" button under the Settings panel of the 8020 web
console. Clicking it issues an OAuth OTP via the user's existing API
key (already loaded into sessionStorage) and displays it inline with
a copy-to-clipboard button.

Replaces the previous workflow of users having to:
  curl -X POST -H "X-Api-Key: $KEY" http://1933/api/v1/auth/otp

…with a single button-click flow that the user can reach from any
machine with a browser.

Wires:
- console/app.py: new POST /console/api/v1/ov/auth/otp proxy route,
  forwarding to upstream /api/v1/auth/otp. Not gated by write_enabled
  since OTP issuance is an authentication artifact, not data mutation.
- index.html: new OAuth section in the Settings panel with otpBox
  (hidden until OTP is generated) and a Copy button.
- app.js: getOtpBtn click handler calls callConsole, otpCopyBtn copies
  to clipboard. Clear failure messages when the user has no API key
  loaded yet.

This is the lightweight half of the Console-OAuth integration. The
fuller "same-origin auto-authorize" flow (Phase 2) — where the
authorize page detects sessionStorage and submits the OTP form
automatically — is still TBD and will reuse this proxy route.
Pivots the OTP flow direction so the UX matches OAuth 2.0 Device
Authorization Grant (RFC 8628) more closely:

  Old (push):  user goes to console -> Get OTP -> copy -> paste in
               client's authorize page -> submit -> redirect.
  New (pull):  client's authorize page DISPLAYS a 6-char code -> user
               types it into the console verify form -> page polls -> redirect.

This removes one tab switch and aligns with how users mentally model
authorization ("I'm approving the request shown over there from
where I'm already signed in"). The legacy POST /api/v1/auth/otp +
"Get OTP" button are kept under a collapsed details element for any
scripted/CLI flows that still drive the older pattern.

Also wires OPENVIKING_PUBLIC_BASE_URL env var as the highest-priority
public origin override, used consistently by:
  - /.well-known/oauth-protected-resource
  - WWW-Authenticate header
  - authorize page links
  - SDK issuer at app start.

Server changes:
- storage.py: oauth_pending_authorizations gains display_code,
  verified, verified_account_id/user_id/role columns; new
  find_pending_by_display_code + mark_pending_verified.
- provider.authorize() now generates display_code at pending creation
  and returns the page URL.
- router.py:
  * GET /oauth/authorize/page — renders the code + same-origin quick-
    authorize panel (sessionStorage detection, but click still required
    so authorization is never silent).
  * GET /oauth/authorize/page/status — polled by the page until verified;
    response carries the redirect_url with auth_code on approval.
  * POST /api/v1/auth/oauth-verify — authenticated; binds caller
    identity to a pending row (decision=approve|deny).

Console changes:
- Settings panel: new "Authorize an MCP client" section with code input
  and Authorize/Deny buttons. Legacy "Get OTP" still available under
  details.
- console proxy gains POST /console/api/v1/ov/auth/oauth-verify.

Tests: 38 OAuth tests passing, including a full device-flow happy path,
deny path, idempotency (one-shot pending), unknown-code rejection,
status-410 on consumed/expired, refresh rotation, OPENVIKING_PUBLIC_BASE_URL
override, and X-Forwarded-* fallback.
… compose

Adds a top-level OAuth 2.1 guide (zh/en) covering the production path
end-to-end. Opens with a 5-step recommended setup so readers don't have
to wade through the rationale before they can deploy. Drops the "MCP"
qualifier from the doc name — OAuth 2.1 here is generic and serves any
OAuth client, not just MCP.

- docs/{en,zh}/guides/11-oauth.md: new. Recommended setup at the top,
  then background, full device flow, HTTP-local vs HTTPS-production
  deployment, Caddy + nginx templates, docker-compose with the shipped
  Caddy service, curl walkthrough, config reference, troubleshooting.
- docker-compose.yml: replace the prior PR's commented-out hint with a
  single OPENVIKING_PUBLIC_BASE_URL var (read by both the openviking
  service and an optional Caddy reverse-proxy service that's also
  shipped commented-out). Same env var drives Caddy via
  {$OPENVIKING_PUBLIC_BASE_URL}, so the public domain is configured
  once in .env.
- docs/{en,zh}/guides/06-mcp-integration.md: replace the "OAuth Proxy
  (planned, use community Cloudflare Worker)" section with a short
  pointer to the new 11-oauth guide. The community proxy is still
  mentioned as an alternative.

Same env-variable design also matches what the MCP add_resource tool
expects (it already reads OPENVIKING_PUBLIC_BASE_URL), so deployments
get a single source of truth for the public address.
The same-origin "Quick authorize" panel was reading sessionStorage,
which is per-tab. Since the OAuth authorize page opens in a different
tab from the console, the panel never showed up even when the user was
signed in.

The console persists the API key in localStorage as well (key
"ov_console_api_key" — see static/console_settings.js's
LEGACY_API_KEY_STORAGE_KEY) for cross-tab use, and that copy is what
the authorize page should consult.

Switch the page JS to localStorage first, fall back to sessionStorage
for resilience. No console-side change needed; the localStorage entry
has been written by the console all along.
Copilot AI review requested due to automatic review settings May 6, 2026 10:45
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🏅 Score: 90
🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ No major issues detected

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 6, 2026

Failed to generate code suggestions for PR

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a native OAuth 2.1 authorization flow to openviking-server for OAuth-only MCP clients (Claude.ai / Claude Desktop / ChatGPT / Cursor), built on the mcp.server.auth SDK, with OpenViking-specific storage + device-flow style verification via the console.

Changes:

  • Introduces an SQLite-backed OAuth store plus an SDK Provider adapter that issues opaque ovat_/ovrt_ tokens and supports DCR, auth codes, refresh rotation, and revocation.
  • Adds OpenViking OAuth endpoints (protected-resource metadata, authorize page + polling, OTP issuance, console-driven verification) and wires OAuth into bearer auth resolution and MCP 401 discovery (WWW-Authenticate).
  • Adds console UI + proxy routes for approving/denying device-flow verification codes, plus extensive new OAuth tests and documentation.

Reviewed changes

Copilot reviewed 25 out of 26 changed files in this pull request and generated 6 comments.

Show a summary per file
File Description
tests/server/test_mcp_endpoint.py Fixes test import to use ls tool name.
tests/server/oauth/init.py Adds oauth test package marker.
tests/server/oauth/test_storage.py Adds unit tests for OAuthStore (clients/codes/tokens/GC).
tests/server/oauth/test_router.py Adds end-to-end tests for device-flow style OAuth authorization.
tests/server/oauth/test_mcp_www_authenticate.py Adds tests for RFC 9728 discovery hints on /mcp 401.
tests/server/oauth/test_auth_integration.py Adds tests for ovat_ bearer discrimination + fail-closed behavior.
openviking/server/oauth/init.py Adds OAuth package module docstring.
openviking/server/oauth/otp.py Adds OTP generation + SHA-256 hashing helper.
openviking/server/oauth/storage.py Implements SQLite schema + CRUD for clients/codes/tokens/pending authorizations + GC.
openviking/server/oauth/provider.py Implements MCP SDK provider adapter and token issuance/refresh rotation.
openviking/server/oauth/router.py Implements PRM endpoint, authorize page + polling, OTP issuance, and oauth-verify endpoint.
openviking/server/auth.py Adds OAuth access-token resolution path and from_oauth identity handling.
openviking/server/identity.py Adds ResolvedIdentity.from_oauth flag for downstream guards.
openviking/server/mcp_endpoint.py Adds WWW-Authenticate: Bearer resource_metadata=... hint when OAuth is enabled.
openviking/server/app.py Mounts SDK OAuth routes + OpenViking OAuth router; initializes OAuth store + GC loop in lifespan.
openviking/console/app.py Proxies OTP and oauth-verify endpoints through the console service.
openviking/console/static/index.html Adds “Authorize an MCP client” device-flow form and legacy OTP UI.
openviking/console/static/app.js Adds handlers for OAuth verify/deny and legacy OTP issuance/copy UX.
openviking_cli/utils/config/open_viking_config.py Adds oauth config block to OpenVikingConfig.
openviking_cli/utils/config/oauth_config.py Introduces OAuthConfig model (enabled/issuer/TTLs/db/rate limit).
docs/en/guides/11-oauth.md Adds full English OAuth setup and troubleshooting guide.
docs/zh/guides/11-oauth.md Adds full Chinese OAuth setup and troubleshooting guide.
docs/en/guides/06-mcp-integration.md Updates MCP integration docs to point to native OAuth guide.
docs/zh/guides/06-mcp-integration.md Updates MCP integration docs to point to native OAuth guide.
docs/design/mcp-oauth2-1.md Adds internal design/architecture document for the OAuth implementation.
docker-compose.yml Documents OPENVIKING_PUBLIC_BASE_URL and includes commented Caddy reverse-proxy template.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +3 to +23
"""OAuth 2.1 (MCP-flavored) configuration.

Phase 1 supports OTP-only authorization. JWT (HS256) is used for access tokens
and opaque random strings for refresh tokens, authorization codes and OTPs;
all but access tokens are stored hashed in {workspace}/oauth.db.
"""

from typing import Optional

from pydantic import BaseModel, Field


class OAuthConfig(BaseModel):
"""OAuth 2.1 server configuration.

OAuth is layered on top of `AuthMode.API_KEY`: when enabled, the existing
`Authorization: Bearer <api_key>` path is augmented with a JWT discriminator
that resolves OAuth-issued tokens to the same `ResolvedIdentity` shape.
Disabled by default — turning it on registers `/oauth/*` routes and the
well-known metadata endpoints.
"""
default=3600,
ge=60,
le=24 * 3600,
description="Lifetime of issued JWT access tokens in seconds.",
default=10,
ge=1,
le=600,
description="Per-IP rate limit on POST /oauth/authorize (OTP submissions per minute).",
Comment thread openviking/server/oauth/storage.py Outdated
record["response_types"] = json.loads(record["response_types"])
return record

return await asyncio.to_thread(_query)
Comment on lines +232 to +245
if consumed is None:
# Reuse detection: if known but already consumed, revoke chain.
if await self._store.is_refresh_known_but_consumed(refresh_token.token):
logger.warning(
"OAuth refresh replay detected for client_id=%s account=%s user=%s",
client.client_id,
refresh_token.account_id,
refresh_token.user_id,
)
await self._store.revoke_user_tokens(
account_id=refresh_token.account_id,
user_id=refresh_token.user_id,
)
raise TokenError(error="invalid_grant", error_description="refresh token invalid")
Comment thread docker-compose.yml Outdated
#
# 1. Set OPENVIKING_PUBLIC_BASE_URL=https://ov.your-domain.com in `.env`.
# 2. Drop a Caddyfile next to this compose file (template in
# docs/en/guides/11-mcp-oauth.md). Caddy reads the URL from the
t0saki added 2 commits May 6, 2026 19:12
- Add Caddyfile with :1934 HTTP aggregated proxy (merges 1933+8020)
- Enable Caddy service by default in docker-compose.yml on port 1934
- Add docs/{en,zh}/guides/12-public-access.md with full HTTPS setup guide
- Simplify 11-oauth.md: replace inline reverse proxy config with refs to 12
- Add HTTPS requirement callout to OAuth recommended setup
- Update 03-deployment.md to mention port 1934 as recommended entry point
- Update oauth_config.py docstrings to describe opaque tokens, not JWT
  (we switched away from JWT during implementation)
- Remove unused authorize_rate_limit_per_min config field — was never
  enforced anywhere in router/storage, dead config misled operators
- Wrap all OAuthStore read paths in self._lock (matching writes); the
  shared sqlite3.Connection with check_same_thread=False is not safe
  for concurrent cursor use across threads
- Clarify provider.exchange_refresh_token comment that replay revokes
  the entire (account, user) family, not just the (client, account,
  user) chain — broader blast radius is intentional
- ruff format: 8 files reformatted to satisfy CI lint
@t0saki
Copy link
Copy Markdown
Contributor Author

t0saki commented May 6, 2026

Thanks Copilot — addressed in 1ed0db8 (just pushed):

# Comment Resolution
1 OAuthConfig module/class docstrings describe JWT ✅ Updated to opaque-token + SQLite description (we switched away from JWT during implementation)
2 access_token_ttl_seconds description says JWT ✅ Changed to "opaque access tokens"
3 authorize_rate_limit_per_min not enforced ✅ Removed — was dead config that gave a false sense of protection. Will reintroduce at the actual verification endpoint when we add real rate limiting
4 OAuthStore reads not synchronized via _lock ✅ Valid concern. Wrapped all read paths (get_client, _peek_code, peek_refresh, is_refresh_known_but_consumed, load_access, load_pending_authorization, find_pending_by_display_code) in async with self._lock:. Updated module docstring to make the invariant explicit. Phase 1 QPS makes lock contention a non-issue
5 revoke chain comment vs revoke_user_tokens call ✅ Comment was misleading. Clarified that we deliberately go broader than per-(client, account, user) chain: replay implies the family is compromised, so we revoke ALL access + refresh + auth-codes for the (account, user) pair
6 docker-compose reference to 11-mcp-oauth.md ✅ Already fixed in c5cb241 (file was renamed to 11-oauth.md and the compose comment now points to 12-public-access.md)

Also ran ruff format to fix CI lint failure.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

2 participants