feat: TCP transport, token auth, lease-aware proxy, and VM agent #5

Merged
ceejbot merged 11 commits into latest from feature/tcp-transport on Mar 29, 2026

ceejbot (Owner) commented Mar 28, 2026

Summary

Adds TCP transport with token-based authentication, a lease-aware network proxy, and a VM-side agent binary — the complete infrastructure for deploying zerolease in QEMU VM environments where AI coding agents need credential access.

TCP Transport + Token Authentication

  • PeerIdentity::Tcp with peer address + SHA-256 token hash (never raw token)
  • ClientHello.token field (backward-compatible — UDS/vsock omit it)
  • TcpListener (localhost-only) + TcpConnector
  • TokenAuthenticator: reference impl mapping pre-registered tokens to identities
  • VaultClient::connect_with_token()

Lease-Aware Proxy (security-audited)

An HTTPS CONNECT proxy that makes lease revocation mean "network access cut off":

  • Explicit mode (port 8080): Parses HTTP CONNECT, validates domain against lease state
  • Transparent mode (port 8443): Extracts domain from TLS SNI for tools ignoring HTTPS_PROXY
  • Active lease → bidirectional TCP tunnel (time-bounded to lease expiry)
  • Expired/unknown/revoked → 403 Forbidden
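The lease check above can be sketched as a small decision function. This is a minimal illustration, assuming a hypothetical `LeaseState` that maps lowercase domains to expiry times in Unix seconds; the type, method, and variant names are invented for the example and are not taken from the zerolease source.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Hypothetical in-memory lease table: lowercase domain -> expiry (Unix seconds).
pub struct LeaseState {
    leases: HashMap<String, u64>,
}

/// What the proxy should do with an incoming connection.
#[derive(Debug, PartialEq)]
pub enum Decision {
    /// Open a tunnel, closing it no later than this long from now.
    Tunnel(Duration),
    /// Answer with 403 Forbidden.
    Forbidden,
}

impl LeaseState {
    pub fn new() -> Self {
        Self { leases: HashMap::new() }
    }

    pub fn grant(&mut self, domain: &str, expires_at: u64) {
        self.leases.insert(domain.to_ascii_lowercase(), expires_at);
    }

    pub fn revoke(&mut self, domain: &str) {
        self.leases.remove(&domain.to_ascii_lowercase());
    }

    /// Active lease: tunnel bounded by the remaining lease time.
    /// Expired, unknown, or revoked: forbidden.
    pub fn decide(&self, domain: &str, now: u64) -> Decision {
        match self.leases.get(&domain.to_ascii_lowercase()) {
            Some(&expires_at) if expires_at > now => {
                Decision::Tunnel(Duration::from_secs(expires_at - now))
            }
            _ => Decision::Forbidden,
        }
    }
}
```

Returning the remaining lease time with the allow decision keeps the tunnel deadline tied to the same lookup that authorized it.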

Security hardening from adversarial audit:

  • SSRF prevention: DNS resolve + private IP blocklist (blocks 169.254.169.254, 10.x, etc.)
  • Port restriction: only 443/8443 allowed
  • DoS prevention: bounded request line (8 KiB) and header count (64)
  • Generic 502 responses (no internal error leakage)
  • Case-normalized domain matching
  • Tunnel timeout derived from lease expiry
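As a rough illustration of the SSRF and port checks listed above, the following sketch uses only the standard library. The function names are hypothetical, and the exact set of blocked ranges in the real proxy may differ; the IPv6 unique-local and link-local checks are written out by hand as an assumption about what "private" should cover.

```rust
use std::net::IpAddr;

/// Ports the proxy will tunnel to, per the list above.
const ALLOWED_PORTS: [u16; 2] = [443, 8443];

pub fn is_allowed_port(port: u16) -> bool {
    ALLOWED_PORTS.contains(&port)
}

/// Returns true when `ip` must not be reachable through the proxy.
pub fn is_blocked_ip(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback()          // 127.0.0.0/8
                || v4.is_private()    // 10/8, 172.16/12, 192.168/16
                || v4.is_link_local() // 169.254/16, includes 169.254.169.254
                || v4.is_unspecified()
                || v4.is_broadcast()
        }
        IpAddr::V6(v6) => {
            // Treat ::ffff:a.b.c.d exactly like the embedded IPv4 address.
            if let Some(mapped) = v6.to_ipv4_mapped() {
                return is_blocked_ip(IpAddr::V4(mapped));
            }
            let first = v6.segments()[0];
            v6.is_loopback()
                || v6.is_unspecified()
                || (first & 0xfe00) == 0xfc00 // unique local, fc00::/7
                || (first & 0xffc0) == 0xfe80 // link-local, fe80::/10
        }
    }
}
```

A check like this only helps if the proxy then connects to the address it validated rather than resolving the hostname a second time; otherwise a DNS answer could change between the check and the connect.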

VM Agent Binary (zerolease-agent)

Three subcommands in a single binary:

  • provision: Acquires credentials, writes env file + config files + lease state, exits. Vault token dies with this process (never enters agent env).
  • proxy: Long-running lease-aware proxy (described above)
  • credential-fill: Git credential helper with per-request domain validation

Credential injection via manifest with three mechanisms:

  • env: Environment variables (GITHUB_TOKEN, NPM_TOKEN, etc.)
  • file: Config files from templates with ${SECRET} expansion (mode 0600)
  • git_credential: Git credential helper mapping (per-request domain validation)
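To make the `file` mechanism concrete, here is a minimal sketch of `${SECRET}` expansion. The function name and error strings are hypothetical, and the real config writer also handles tilde expansion and file permissions, which this sketch leaves out.

```rust
use std::collections::HashMap;

/// Expands `${NAME}` placeholders in `template` using `secrets`.
/// Fails on an unknown name or an unterminated placeholder, so a typo in a
/// manifest cannot silently produce a config file containing a literal `${...}`.
pub fn expand_template(
    template: &str,
    secrets: &HashMap<String, String>,
) -> Result<String, String> {
    let mut out = String::with_capacity(template.len());
    let mut rest = template;
    while let Some(start) = rest.find("${") {
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| "unterminated placeholder".to_string())?;
        let name = &after[..end];
        let value = secrets
            .get(name)
            .ok_or_else(|| format!("unknown secret: {name}"))?;
        // Substituted values are appended as-is and never re-scanned.
        out.push_str(value);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(out)
}
```

For example, the template `//registry.npmjs.org/:_authToken=${NPM_TOKEN}` with `NPM_TOKEN` set to `abc123` expands to `//registry.npmjs.org/:_authToken=abc123`.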

Documentation

  • Deployment architecture doc with ASCII diagram + Mermaid sequence chart
  • CLI README with use cases: git HTTPS/SSH, gh, Fastly, AWS OIDC, npm, pip, cargo, Docker, databases
  • Updated main README and CLAUDE.md

Test plan

  • 100 tests across workspace, all passing
  • TCP transport: bind, connect, frame round-trip, token handshake accept/reject
  • TokenAuthenticator: register, authenticate, revoke, reject unknown
  • CONNECT proxy: 403 for denied/expired, 502 for private IPs, 400 for bad hostnames, 403 for disallowed ports
  • SNI extraction: synthetic ClientHello, missing SNI, truncated, non-TLS
  • Lease state: active/expired/prune, atomic write/read
  • Manifest: full format, git host map, unknown mechanism rejection
  • Config writer: template expansion, tilde expansion, permissions
  • Security audit: all Critical, High, and Medium findings addressed
  • CI green

ceejbot added 11 commits March 28, 2026 15:00
Add PeerIdentity::Tcp variant with peer address and token hash
(SHA-256, never the raw token). Add TcpListener (localhost-only)
and TcpConnector transports.

Extend ClientHello with optional token field for TCP auth. UDS/vsock
clients omit it (backward compatible via serde defaults). Add
ClientHello::with_token() constructor.

Add hash_token() helper and TokenHash type alias.
Add token parameter to Authenticator::authenticate() trait method.
Existing impls (AllowAllAdmin) ignore it.

Add TokenAuthenticator: reference implementation that maps pre-registered
tokens (stored as SHA-256 hashes) to ConnectionIdentity values. Supports
register() and revoke().

Server: extract token from ClientHello, hash into PeerIdentity::Tcp,
pass raw token to authenticator for validation.

Client: add connect_with_token() and connect_with_hello() for TCP
transports that need to present a bearer token in the handshake.

Test the full handshake: TCP client with token → server parses
ClientHello → enriches PeerIdentity with token hash → authenticator
validates → ServerHello accept/reject.

Tests both the happy path (registered token → Agent identity) and
the rejection path (wrong token → handshake rejected).

README: document workspace structure, all three storage backends,
three transport options, updated build/test commands, remove stale
feature flag table and standalone server references.

CLAUDE.md: reflect current workspace layout, transport/auth
separation, storage/audit decoupling, code quality principles.

Describes the production layout: Claw orchestrator managing QEMU VM
stacks with zerolease providing credential leasing via TCP + token
auth. Includes ASCII diagram, Mermaid sequence chart, security
properties, and component mapping.

New crate that acquires credentials from a zerolease vault via TCP +
token auth, injects them as environment variables, and execs a command
(typically claude). Designed for managed QEMU VMs where the prompt-run
token is injected at boot.

Usage: zerolease-cli --token $TOKEN -- claude code

Reads a credential manifest (JSON) describing which secrets to acquire
and which env vars to set. Default vault address is 10.0.2.2:9100
(QEMU user-mode networking gateway).

Phase 1: env injection at startup. Phase 2 (future): git credential
helper subcommand for per-request domain-validated credential injection.

The CLI is now the sole credential source inside a VM we control.
Three injection mechanisms, each serving a different tool ecosystem:

- env: set environment variables (GH_TOKEN, NPM_TOKEN, etc.)
- file: write config files with template expansion and mode 0600
  (.npmrc, pip.conf, etc.)
- git_credential: git credential helper protocol with per-request
  domain validation — the vault's core security property exercised
  on every git operation

Subcommands:
- exec: acquire credentials, inject via all mechanisms, exec command
- credential-fill: git credential helper called by git at auth time

The manifest format now supports multiple injection mechanisms per
credential. The exec subcommand sets GIT_CONFIG_* env vars to
configure itself as the credential helper, and passes ZEROLEASE_TOKEN
and ZEROLEASE_VAULT_ADDR through so credential-fill can reach the vault.
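For orientation, the credential-fill side of git's credential helper protocol can be sketched as follows. Git writes `key=value` lines to the helper's stdin and reads `username=` and `password=` lines back. The `GitCredential` type and the in-memory host map are stand-ins invented for this example; the actual helper contacts the vault for each request.

```rust
use std::collections::HashMap;

/// Parses the `key=value` lines git sends to a credential helper.
/// Input ends at the first blank line or at end of input.
pub fn parse_request(input: &str) -> HashMap<String, String> {
    let mut attrs = HashMap::new();
    for line in input.lines() {
        if line.is_empty() {
            break;
        }
        if let Some((key, value)) = line.split_once('=') {
            attrs.insert(key.to_string(), value.to_string());
        }
    }
    attrs
}

/// Hypothetical credential record; stands in for a vault lookup.
pub struct GitCredential {
    pub username: String,
    pub password: String,
}

/// Returns the helper's stdout for one request, or None when the request
/// is not HTTPS or the host is not in the allowed map.
pub fn fill(input: &str, host_map: &HashMap<String, GitCredential>) -> Option<String> {
    let attrs = parse_request(input);
    if attrs.get("protocol").map(String::as_str) != Some("https") {
        return None;
    }
    // Domain validation happens on every request, not once at startup.
    let host = attrs.get("host")?.to_ascii_lowercase();
    let cred = host_map.get(&host)?;
    Some(format!("username={}\npassword={}\n", cred.username, cred.password))
}
```

When the helper prints nothing, git falls through to its next configured helper or prompts, so a denied host never receives a token from this path.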
Covers git HTTPS (credential helper), git SSH (HTTPS rewrite or key
file), gh CLI, Fastly, AWS OIDC, npm, pip, cargo, Docker, and
database connection strings. Each use case includes a manifest example.
Explains the security model and when to prefer each injection mechanism.

…on/proxy/credential-fill

The agent is now three subcommands:

- provision: acquires credentials, writes env file + config files +
  lease state for proxy, exits. Vault token dies with this process.
- proxy: lease-aware HTTPS CONNECT proxy (stub, implementation next).
- credential-fill: git credential helper, unchanged.

Key changes from the old exec-wrapper model:
- No exec wrapping. Provisioner runs and exits.
- Env file is sourceable (/etc/zerolease/env), not inherited.
- Vault token (ZEROLEASE_TOKEN) never enters the agent's env.
- credential-fill uses a separate ZEROLEASE_CREDENTIAL_TOKEN.
- HTTPS_PROXY and git credential helper configured in env file.
- Lease state file written for proxy consumption.
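A sourceable env file needs its values quoted so that sourcing it cannot execute anything. The sketch below shows one way to render such a file; the `export NAME='value'` layout, the function names, and the example proxy address are assumptions for illustration, not the agent's documented format.

```rust
/// Quotes `value` for POSIX sh: wrap in single quotes and rewrite each
/// embedded single quote as '\'' (close quote, escaped quote, reopen).
pub fn sh_quote(value: &str) -> String {
    format!("'{}'", value.replace('\'', r"'\''"))
}

/// Renders `export NAME='value'` lines, rejecting names that are not
/// valid shell identifiers.
pub fn render_env_file(vars: &[(&str, &str)]) -> Result<String, String> {
    let mut out = String::new();
    for (name, value) in vars {
        let valid = !name.is_empty()
            && !name.starts_with(|c: char| c.is_ascii_digit())
            && name.chars().all(|c| c.is_ascii_alphanumeric() || c == '_');
        if !valid {
            return Err(format!("invalid variable name: {name}"));
        }
        out.push_str(&format!("export {}={}\n", name, sh_quote(value)));
    }
    Ok(out)
}
```

Calling `render_env_file(&[("HTTPS_PROXY", "http://127.0.0.1:8080")])` yields the single line `export HTTPS_PROXY='http://127.0.0.1:8080'`.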

New lease_state module with atomic write/read and expiry checking.
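The usual way to get an atomic write is to write a temporary file in the same directory and rename it over the target. The sketch below assumes that approach; the function name and the `.tmp` suffix are hypothetical, and the real lease_state module may differ.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Writes `contents` to `path` so that a concurrent reader sees either the
/// old file or the complete new one, never a partial write.
pub fn atomic_write(path: &Path, contents: &[u8]) -> io::Result<()> {
    // Keep the temp file beside the target so the rename stays on one filesystem.
    let mut tmp_name = path
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "path has no file name"))?
        .to_os_string();
    tmp_name.push(".tmp");
    let tmp_path = path.with_file_name(tmp_name);

    let mut tmp = File::create(&tmp_path)?;
    tmp.write_all(contents)?;
    tmp.sync_all()?; // flush data to disk before the rename publishes it
    drop(tmp);

    // On POSIX systems rename replaces the destination atomically.
    fs::rename(&tmp_path, path)
}
```

This matters here because the proxy reloads the file in the background while the provisioner may be rewriting it.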
The proxy is the load-bearing security component — it makes lease
revocation mean "network access cut off."

Explicit proxy (CONNECT mode):
- Parses HTTP CONNECT request to extract target host:port
- Checks lease state: active lease → 200 + bidirectional tunnel,
  expired/unknown → 403 Forbidden
- Integration tested: allowed domain tunnels data, denied gets 403,
  expired lease gets 403
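Extracting the target from a CONNECT request line might look like the sketch below. The function name is hypothetical, and the character check is a deliberately strict assumption: bracketed IPv6 literals and anything outside letters, digits, hyphens, and dots are rejected.

```rust
/// Parses `CONNECT host:port HTTP/1.x` and returns (lowercased host, port).
/// Returns None for any other method, a malformed target, or an invalid port.
pub fn parse_connect(line: &str) -> Option<(String, u16)> {
    let mut parts = line.split_ascii_whitespace();
    if parts.next()? != "CONNECT" {
        return None;
    }
    let target = parts.next()?;
    let version = parts.next()?;
    if !version.starts_with("HTTP/1.") || parts.next().is_some() {
        return None;
    }

    let (host, port) = target.rsplit_once(':')?;
    let valid_host = !host.is_empty()
        && host.len() <= 253
        && host
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-' || b == b'.');
    if !valid_host {
        return None;
    }
    // Lowercasing here keeps the later lease lookup case-insensitive.
    Some((host.to_ascii_lowercase(), port.parse().ok()?))
}
```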

Transparent proxy (SNI mode, defense-in-depth):
- Peeks at TLS ClientHello to extract SNI hostname
- Denies when SNI is absent (covers ECH, non-TLS, malformed)
- Same lease check as CONNECT mode
- SNI parser tested with synthetic ClientHello construction
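For readers unfamiliar with the ClientHello layout, here is a compact sketch of SNI extraction from the first TLS record. It is not the project's parser: the `Cursor` helper and function name are invented for the example, and it assumes the whole ClientHello fits in one record. Every failure returns `None`, which the proxy would treat as a denial.

```rust
/// Byte cursor that returns None instead of panicking on truncated input.
struct Cursor<'a> {
    buf: &'a [u8],
}

impl<'a> Cursor<'a> {
    fn take(&mut self, n: usize) -> Option<&'a [u8]> {
        let buf: &'a [u8] = self.buf;
        if buf.len() < n {
            return None;
        }
        let (head, tail) = buf.split_at(n);
        self.buf = tail;
        Some(head)
    }
    fn u8(&mut self) -> Option<usize> {
        self.take(1).map(|b| b[0] as usize)
    }
    fn u16(&mut self) -> Option<usize> {
        self.take(2).map(|b| ((b[0] as usize) << 8) | b[1] as usize)
    }
}

/// Returns the lowercased SNI host name from a TLS ClientHello record.
pub fn extract_sni(record: &[u8]) -> Option<String> {
    let mut c = Cursor { buf: record };
    if c.u8()? != 0x16 {
        return None; // not a TLS handshake record
    }
    c.take(2)?; // legacy record version
    let record_len = c.u16()?;
    let mut c = Cursor { buf: c.take(record_len)? };
    if c.u8()? != 0x01 {
        return None; // not a ClientHello
    }
    c.take(3)?; // 24-bit handshake length
    c.take(2 + 32)?; // client_version + random
    let n = c.u8()?;
    c.take(n)?; // session_id
    let n = c.u16()?;
    c.take(n)?; // cipher_suites
    let n = c.u8()?;
    c.take(n)?; // compression_methods
    let n = c.u16()?;
    let mut exts = Cursor { buf: c.take(n)? };
    while !exts.buf.is_empty() {
        let ext_type = exts.u16()?;
        let len = exts.u16()?;
        let mut body = Cursor { buf: exts.take(len)? };
        if ext_type != 0 {
            continue; // not server_name
        }
        body.u16()?; // server_name_list length
        if body.u8()? != 0 {
            return None; // only the host_name type is defined
        }
        let n = body.u16()?;
        let name = std::str::from_utf8(body.take(n)?).ok()?;
        return Some(name.to_ascii_lowercase());
    }
    None
}
```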

Shared infrastructure:
- SharedLeaseState (Arc<RwLock<LeaseState>>) for concurrent access
- Background refresh loop reloads lease file periodically
- Both proxy modes use the same lease checking logic
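The shared state and refresh loop can be pictured roughly as below. Only the `Arc<RwLock<LeaseState>>` shape comes from the commit message; the fields, the method names, and the `refresh` signature are hypothetical, and file loading is replaced by an `Option` argument so the sketch stays self-contained.

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};

/// Hypothetical lease table: lowercase domain -> expiry (Unix seconds).
#[derive(Default)]
pub struct LeaseState {
    leases: HashMap<String, u64>,
}

impl LeaseState {
    pub fn grant(&mut self, domain: &str, expires_at: u64) {
        self.leases.insert(domain.to_ascii_lowercase(), expires_at);
    }
    pub fn is_active(&self, domain: &str, now: u64) -> bool {
        self.leases
            .get(&domain.to_ascii_lowercase())
            .map_or(false, |&expires_at| expires_at > now)
    }
    pub fn prune(&mut self, now: u64) {
        self.leases.retain(|_, expires_at| *expires_at > now);
    }
    pub fn len(&self) -> usize {
        self.leases.len()
    }
}

/// Many connection handlers read; one refresh task writes.
pub type SharedLeaseState = Arc<RwLock<LeaseState>>;

/// One iteration of the refresh loop. `loaded` is the freshly read lease
/// file, or None when the read failed, in which case the cached state is
/// kept. Expired entries are pruned in both cases.
pub fn refresh(shared: &SharedLeaseState, loaded: Option<LeaseState>, now: u64) {
    let mut guard = shared.write().unwrap();
    if let Some(fresh) = loaded {
        *guard = fresh;
    }
    guard.prune(now);
}
```

Both proxy modes would clone the `Arc` and call `is_active` under a read lock, so lookups do not block one another and wait only briefly during a refresh.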
Fixes from security audit:

Critical:
- Validate CONNECT target: restrict to ports 443/8443, validate
  hostname characters, resolve DNS and block private/link-local IPs
  (prevents SSRF to 169.254.169.254 and internal networks)

High:
- Bound request line reads (8 KiB max) and header count (64 max)
  to prevent OOM via unbounded allocation

Medium:
- Generic 502 responses (no internal error details leaked)
- Case-normalize domains to lowercase in both CONNECT and SNI paths
- Increase SNI peek buffer to 16 KiB (max TLS record size)
- Prune expired leases from cache even when file read fails
- Tunnel timeout derived from lease expiry (max 1 hour)

Tests:
- Private IP detection (127.0.0.1, 10.x, 169.254.x, etc.)
- Hostname validation (path traversal, spaces, etc.)
- Port restriction (SSH port 22 blocked with valid domain lease)
- CONNECT to localhost gets 502 (IP validation)
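The bounded reads from the High finding can be illustrated with a small std-only reader. The limits come from the commit message; the function names and error messages are hypothetical, and the real proxy is presumably async rather than built on blocking `BufRead`.

```rust
use std::io::{self, BufRead, Read};

const MAX_LINE_BYTES: u64 = 8 * 1024;
const MAX_HEADERS: usize = 64;

/// Reads one newline-terminated line, refusing to buffer more than
/// MAX_LINE_BYTES no matter how much the peer sends.
fn read_bounded_line<R: BufRead>(reader: &mut R) -> io::Result<String> {
    let mut line = String::new();
    // `take` caps how many bytes `read_line` is allowed to consume.
    let n = reader
        .by_ref()
        .take(MAX_LINE_BYTES + 1)
        .read_line(&mut line)?;
    if n as u64 > MAX_LINE_BYTES {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "line too long"));
    }
    if !line.ends_with('\n') {
        return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "incomplete line"));
    }
    Ok(line.trim_end_matches(&['\r', '\n'][..]).to_string())
}

/// Reads the request line and then discards headers up to the blank line,
/// failing once more than MAX_HEADERS header lines have been seen.
pub fn read_connect_request<R: BufRead>(reader: &mut R) -> io::Result<String> {
    let request_line = read_bounded_line(reader)?;
    // MAX_HEADERS header lines plus the terminating blank line.
    for _ in 0..=MAX_HEADERS {
        if read_bounded_line(reader)?.is_empty() {
            return Ok(request_line);
        }
    }
    Err(io::Error::new(io::ErrorKind::InvalidData, "too many headers"))
}
```

With both limits in place, the worst case a single connection can force is roughly 65 lines of at most 8 KiB each, read one at a time.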
ceejbot changed the title from "feat: TCP transport with token-based authentication" to "feat: TCP transport, token auth, lease-aware proxy, and VM agent" on Mar 29, 2026
ceejbot merged commit d60af43 into latest on Mar 29, 2026
2 checks passed
ceejbot deleted the feature/tcp-transport branch on March 29, 2026 02:07