A production-grade reverse proxy with dynamic service discovery, per-IP rate limiting, and full observability - built in Go.
GopherProxy is a reverse proxy (Data Plane) that load-balances traffic across a dynamic pool of backends using round-robin. Its companion, Sentinel (Control Plane), continuously TCP-probes each backend and updates a Redis Set as the live source of truth. The proxy polls Redis every 5 seconds and updates its pool — no restarts required when backends come or go.
The entire stack ships with a single `docker compose up`: Prometheus scrapes metrics every 15 s, and a pre-provisioned Grafana dashboard auto-loads on first boot.
```mermaid
flowchart LR
    Clients[HTTP Clients] --> Proxy[GopherProxy :8080\nRate limit + round-robin]
    Proxy --> B1[Backend :8081]
    Proxy --> B2[Backend :8082]
    Proxy --> B3[Backend :8083]
    Sentinel[Sentinel\nTCP health checks] --> Redis[(Redis\ngopher_backends)]
    Redis --> Proxy
    B1 --> Sentinel
    B2 --> Sentinel
    B3 --> Sentinel
    Proxy --> Metrics[Prometheus metrics :2112/metrics]
    Metrics --> Prometheus[Prometheus]
    Prometheus --> Grafana[Grafana Dashboard]
    classDef data fill:#e8f4ff,stroke:#1f6feb,color:#0b1f33;
    classDef control fill:#fff4e5,stroke:#d97706,color:#4a2c0a;
    classDef storage fill:#eefbf0,stroke:#16a34a,color:#11331a;
    classDef obs fill:#f3e8ff,stroke:#9333ea,color:#2d113f;
    class Proxy,B1,B2,B3,Metrics data;
    class Sentinel control;
    class Redis storage;
    class Prometheus,Grafana obs;
```
The Mermaid diagram above shows the data plane, control plane, registry, and observability stack in one view.
| Service | Role | Ports (host) |
|---|---|---|
| GopherProxy | Reverse proxy · rate limiting · metrics | 8080 (proxy), 2112 (metrics) |
| Sentinel | TCP health prober · Redis registry writer | — (internal) |
| Redis | Live backend set (`gopher_backends`) | 16379 → 6379 |
| Prometheus | Scrapes `/metrics` · stores 15 days of data | 9090 |
| Grafana | Pre-provisioned dashboard · no manual setup | 3000 |
- Round-robin load balancing with dead-backend skip — `sync/atomic` counter + `sync.RWMutex` pool
- Per-IP rate limiting (Token Bucket) — each client IP gets its own isolated `rate.Limiter` via `sync.Map`; burst of 5, replenishes at 2 req/s
- Dynamic backend pool — polls Redis `SMembers` every 5 s, registers new backends without restart
- Local TCP health check — probes every 10 s with a context-aware `DialContext`; logs state transitions (recovered / went down)
- Structured JSON logging (`log/slog`) — every request logged with `method`, `path`, `remote_addr`, `status`, `duration_ms`
- Graceful shutdown — 30 s drain on `SIGTERM`/`SIGINT`; cancels root context, drains both servers, closes Redis client
- Dedicated metrics server — `/metrics` and `/healthz` on a separate port (`:2112`), never reachable through the proxy path
- TCP probe loop — checks each backend every 2 s with a 1 s dial timeout
- Idempotent registry — `SAdd` on UP, `SRem` on DOWN; only logs on actual state transitions
- Runtime target override — `SENTINEL_TARGETS` env var (comma-separated `host:port`) replaces hard-coded defaults without rebuilding
- 4 Prometheus metrics — `gopherproxy_processed_requests_total`, `gopherproxy_dropped_requests_total`, `gopherproxy_active_backends`, and the `gopherproxy_request_duration_seconds` histogram (per-backend label)
- 7-panel Grafana dashboard — auto-provisioned; request rate, p50/p95/p99 latency, active backends gauge + timeseries, totals, success rate %
- Non-root containers — fixed UID/GID `1001` (`gopheruser:gophergroup`); `HEALTHCHECK` on every service
- Multi-stage Docker build — `CGO_ENABLED=0`, `-trimpath`, `-s -w`; final image ≈ 13 MB
- Version stamped at build time — `Version` and `Commit` injected via `-ldflags`
- Resource limits on every service — CPU and memory capped in `docker-compose.yml`
- Redis persistence — `appendonly yes`, `appendfsync everysec`, `maxmemory 128 MB` (LRU eviction)
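Build-time version stamping works by overwriting package-level variables with `-ldflags`. A minimal sketch, with the build command shown as a comment (the defaults `"dev"`/`"none"` are assumptions):

```go
package main

import "fmt"

// Version and Commit hold placeholder defaults that the linker
// overwrites at build time, e.g.:
//
//   go build -ldflags "-X main.Version=1.0.0 -X main.Commit=$(git rev-parse --short HEAD)"
var (
	Version = "dev"
	Commit  = "none"
)

func main() {
	// Without -ldflags this prints the placeholders.
	fmt.Printf("gopherproxy %s (%s)\n", Version, Commit)
}
```

Because `-X` only patches string variables, no rebuild-time code generation is needed and the binary stays reproducible.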
| Tool | Version |
|---|---|
| Docker + Docker Compose | v2+ |
| Python 3 | any (for mock backends) |
| Go | 1.24+ (tests only) |
| `make` | optional |
```sh
cp .env.example .env
```

Edit `.env` — at minimum change `GRAFANA_PASSWORD`:

```sh
VERSION=1.0.0
GRAFANA_USER=admin
GRAFANA_PASSWORD=changeme   # ← change this
SENTINEL_TARGETS=host.docker.internal:8081,host.docker.internal:8082,host.docker.internal:8083
```

```sh
make up
# expands to: VERSION=... COMMIT=$(git rev-parse --short HEAD) docker compose up --build -d
```

Wait ~15 s for all health checks to pass:
```
gopher-redis       (healthy) ✔
gopher-proxy       (healthy) ✔
gopher-sentinel    (healthy) ✔
gopher-prometheus  up        ✔
gopher-grafana     up        ✔
```
```sh
make mock-backends
# Starts Python HTTP servers on :8081 :8082 :8083
```

Sentinel detects them within 2 s and registers them in Redis. The proxy picks them up within the next 5 s poll.
```sh
# 30 requests, 1/s — stays under the rate limit, round-robins across all three backends
for i in $(seq 1 30); do curl -s http://localhost:8080/; sleep 1; done
```

You will see each response coming from a different backend:

```html
<h1>GopherProxy Demo: Response from SERVER 1</h1>
<h1>GopherProxy Demo: Response from SERVER 2</h1>
<h1>GopherProxy Demo: Response from SERVER 3</h1>
...
```
| URL | Description |
|---|---|
| `http://localhost:8080` | Proxy — load-balanced entry point |
| `http://localhost:2112/healthz` | Liveness probe — returns `ok` |
| `http://localhost:2112/metrics` | Raw Prometheus metrics (text format) |
| `http://localhost:9090` | Prometheus expression browser |
| `http://localhost:9090/targets` | Scrape target health |
| `http://localhost:3000` | Grafana — GopherProxy Dashboard |
Grafana login: `admin` / `admin` (or whatever is set in `.env`).
The proxy uses a per-IP Token Bucket limiter:
| Parameter | Value |
|---|---|
| Burst | 5 requests |
| Replenish rate | 1 token per 500 ms (2 req/s) |
| Response when exceeded | `429 Too Many Requests` |
| Counter metric | `gopherproxy_dropped_requests_total` |
Each client IP gets its own isolated bucket stored in a `sync.Map` — a busy IP cannot affect others.
All metrics are prefixed `gopherproxy_` and scraped from `:2112/metrics`:

| Metric | Type | Description |
|---|---|---|
| `gopherproxy_processed_requests_total` | Counter | Requests successfully forwarded to a backend |
| `gopherproxy_dropped_requests_total` | Counter | Requests dropped (rate-limited or no healthy backend) |
| `gopherproxy_active_backends` | Gauge | Number of backends currently marked alive |
| `gopherproxy_request_duration_seconds` | Histogram | Latency per backend — labelled `backend="host:port"` |
```promql
# Request throughput (req/s)
rate(gopherproxy_processed_requests_total[1m])

# p50 / p95 / p99 latency per backend
histogram_quantile(0.99, rate(gopherproxy_request_duration_seconds_bucket[1m]))

# Drop rate (rate-limited req/s)
rate(gopherproxy_dropped_requests_total[1m])

# Success rate %
  rate(gopherproxy_processed_requests_total[1m])
/
  (rate(gopherproxy_processed_requests_total[1m]) + rate(gopherproxy_dropped_requests_total[1m]))
* 100

# How many backends are alive
gopherproxy_active_backends
```
The dashboard (grafana/provisioning/dashboards/gopherproxy.json) is auto-loaded on first boot — no manual import needed.
| Panel | Query |
|---|---|
| Request Rate | `rate(gopherproxy_processed_requests_total[1m])` + dropped overlay |
| Latency Percentiles | `histogram_quantile(0.50/0.95/0.99, ...)` per backend |
| Active Backends (gauge) | `gopherproxy_active_backends` |
| Backend Health Over Time | Same gauge as timeseries |
| Total Processed | `gopherproxy_processed_requests_total` |
| Total Dropped | `gopherproxy_dropped_requests_total` |
| Success Rate % | processed / (processed + dropped) × 100 |
Set the time range to Last 5 minutes and auto-refresh to 10 s while traffic flows to see all panels update live.
| Variable | Default | Description |
|---|---|---|
| `REDIS_URL` | `localhost:16379` | Redis address (`host:port`) |
| `PROXY_PORT` | `8080` | Port the proxy listens on |
| `METRICS_PORT` | `2112` | Port for `/metrics` and `/healthz` |
| Variable | Default | Description |
|---|---|---|
| `REDIS_URL` | `localhost:16379` | Redis address (`host:port`) |
| `SENTINEL_TARGETS` | `host.docker.internal:808{1,2,3}` | Comma-separated `host:port` list to monitor |
```sh
make test
# go test -race -count=1 ./...
```

22 unit tests across both packages — all run with the race detector:
| Package | Tests |
|---|---|
| `gopherproxy` | TestGetNextPeer_EmptyPool, TestGetNextPeer_AllDead, TestGetNextPeer_RoundRobin, TestGetNextPeer_SkipsDeadBackend, TestBackend_SetAndIsAlive, TestUpdateServerPool_AddNew, TestUpdateServerPool_NoDuplicates, TestUpdateServerPool_InvalidURL, TestIPRateLimiter_AllowsBurst, TestIPRateLimiter_BlocksAfterBurst, TestIPRateLimiter_IsolatesIPs, TestLoggingMiddleware_PassesThrough, TestLimitMiddleware_Allows, TestLimitMiddleware_Blocks, TestGetEnv_UsesDefault, TestGetEnv_UsesEnvVar, TestServerPool_AtomicConcurrency |
| `sentinel` | TestGetEnv_Default, TestGetEnv_FromEnvironment, TestGetTargets_Defaults, TestGetTargets_FromEnv, TestGetTargets_SingleEntry |
```sh
make help   # list all targets
make lint   # requires golangci-lint
make build  # build images only (no start)
make logs   # tail all container logs
make down   # stop stack, keep volumes
make clean  # prune dangling images + build cache
make tidy   # go mod tidy && go mod verify
```

Prometheus can reload its configuration without a restart: `curl -s -X POST http://localhost:9090/-/reload`.
```
├── main.go                  # GopherProxy — Data Plane
├── main_test.go             # 17 proxy unit tests
├── sentinel/
│   ├── main.go              # Sentinel — Control Plane
│   └── main_test.go         # 5 sentinel unit tests
├── Dockerfile               # Multi-stage proxy image
├── Dockerfile.sentinel      # Multi-stage sentinel image
├── docker-compose.yml       # Full stack (5 services)
├── prometheus.yml           # Scrape config + relabelling
├── Makefile                 # Developer targets
├── .env.example             # Environment variable template
├── .dockerignore            # Keeps build context lean
├── go.mod / go.sum          # Module: github.com/YogeshT22/gopherproxy.git
├── grafana/
│   └── provisioning/
│       ├── datasources/
│       │   └── datasource.yml   # Prometheus datasource (uid: prometheus)
│       └── dashboards/
│           ├── dashboard.yml    # Dashboard provider config
│           └── gopherproxy.json # 7-panel pre-built dashboard
└── mock_backends/
    ├── server1/index.html   # Mock on :8081
    ├── server2/index.html   # Mock on :8082
    └── server3/index.html   # Mock on :8083
```
| Package | Version | Purpose |
|---|---|---|
| `github.com/prometheus/client_golang` | v1.23.2 | Metrics instrumentation |
| `github.com/redis/go-redis/v9` | v9.17.2 | Redis client |
| `golang.org/x/time` | v0.14.0 | Token bucket rate limiter |
This project includes k6 load test scenarios to validate performance claims. k6 is a modern, scriptable load testing tool that measures throughput, latency percentiles, and error rates.
Install k6:
```sh
# macOS
brew install k6

# Windows (using chocolatey)
choco install k6

# Linux
sudo apt-get install k6
# or download from https://k6.io/docs/getting-started/installation
```

> Note: Keep the full stack running (`make up` + `make mock-backends`) in one terminal before running k6 tests.
```sh
make k6-smoke
# 10 concurrent users for 30 seconds
# ✔ Verifies proxying is working
# ✔ Baseline latency measurement
```

```sh
make k6-load
# Ramps up: 0 → 100 → 300 → 500 users
# Sustains 500 concurrent users for 3 minutes
# ✔ Measures sustained throughput (target: ~4,200 req/s)
# ✔ Captures p95/p99 latency (target: <35 ms)
# ✔ Monitors rate-limiter rejection rate (~12% under spike)
```

```sh
make k6-spike
# 10 users → sudden jump to 500 in 5 seconds
# ✔ Stress tests rapid scale-up
# ✔ Verifies no crashes or goroutine leaks
# ✔ Measures latency under immediate load
```

```sh
make k6-sustained
# Constant 200 concurrent users for 5 minutes
# ✔ Real-world scenario (not ramped, constant)
# ✔ Detects slow memory leaks
# ✔ Validates graceful shutdown after sustained load
```

On typical hardware (4-core i5, 8 GB RAM):
| Metric | Smoke Test | Load Test | Spike Test | Sustained (200 CU) |
|---|---|---|---|---|
| Throughput (req/s) | 500–800 | 4,000–4,500 | 3,500–4,200 | 2,500–3,000 |
| p50 latency | 2–5 ms | 10–15 ms | 8–12 ms | 6–10 ms |
| p95 latency | 8–15 ms | 25–35 ms | 20–30 ms | 15–25 ms |
| p99 latency | 15–25 ms | 35–50 ms | 30–45 ms | 25–40 ms |
| Error rate | 0% | ~1% | ~2–5% | <1% |
| Rate-limited (429) | 0% | ~10–12% | ~15–20% | ~8–10% |
- Throughput: Requests per second that completed (200 OK or 429)
- p95 latency: 95% of requests completed within this time
- p99 latency: 99% of requests completed within this time
- Rate-limited: Percentage of requests that hit the 429 rate-limit response
- Errors: Requests that returned 5xx or connection failures
Edit `k6-load-test.js` to adjust:

```javascript
// Change proxy URL
const PROXY_URL = __ENV.PROXY_URL || "http://localhost:8080";

// Modify scenarios — e.g., a more aggressive ramp-up
stages: [
  { duration: "30s", target: 200 }, // ← faster ramp
  { duration: "2m", target: 200 },
  { duration: "1m", target: 0 },
];
```

Or run with environment variables:
```sh
PROXY_URL=http://192.168.1.100:8080 k6 run k6-load-test.js --scenario load
```

After each test run, results are saved to `k6-results-{scenario}.json`:
```sh
ls -lh k6-results-*.json
# -rw-r--r--  k6-results-load.json  (3.2 MB)
```

View the JSON summary:

```sh
cat k6-results-load.json | jq '.metrics | keys' | head -20
```

Or use k6's output options:
```sh
# Generate HTML report (requires k6 extension)
k6 run k6-load-test.js --scenario load --out json=report.json

# Push results to Prometheus or InfluxDB (advanced)
k6 run k6-load-test.js --out prometheus=http://prometheus:9090
```

- Redis port 16379 is exposed to the host for local debugging only — remove the `ports:` mapping before any real deployment
- Grafana sign-up is disabled (`GF_USERS_ALLOW_SIGN_UP=false`) and analytics reporting is off
- Prometheus retains 15 days of data in the `prometheus_data` named volume
- All containers run as non-root (UID/GID `1001`) and have CPU + memory limits set
- Sentinel uses a `kill -0 1` health check — more reliable than `pgrep` on BusyBox/Alpine, where the full path (`/app/sentinel`) does not match an exact-name `pgrep -x` search
Real issues hit during development, recorded here as reference:
| Symptom | Root Cause | Fix |
|---|---|---|
| Metrics showed 0 traffic | `pool` variable shadowed inside handler — handler read an empty instance | Removed the `:=` re-declaration; handler closes over the outer `pool` pointer |
| `bind: forbidden` on port 6379 | Hyper-V reserved the port range on Windows | Mapped Redis to high port 16379 on the host |
| Proxy crashed on empty registry | Modulo by zero when `len(backends) == 0` | Defensive `if l == 0 { return nil }` before the `% l` operation |
| `gopher-sentinel` always unhealthy | `pgrep -x sentinel` on BusyBox matches against the full path, not the basename | Replaced with `kill -0 1` |
| Grafana panels showed "No data" | Datasource UID auto-generated by Grafana didn't match `"uid": "prometheus"` in the dashboard JSON | Added `uid: prometheus` to `datasource.yml` |
MIT — see LICENSE
Author: Yogesh T · GitHub @YogeshT22
