```
                 ┌─────────────┐
                 │   Clients   │
                 └──────┬──────┘
                        │
                 ┌──────▼──────┐
                 │    Nginx    │
                 │   (L7 LB)   │
                 └──────┬──────┘
                        │
     ┌──────────────────┼──────────────────┐
     │                  │                  │
┌────▼────┐        ┌────▼────┐        ┌────▼────┐
│  API 1  │  ...   │  API 4  │  ...   │  API 8  │
└────┬────┘        └────┬────┘        └────┬────┘
     │                  │                  │
     └──────────────────┼──────────────────┘
                        │
     ┌──────────────────┼──────────────────┐
     │                  │                  │
┌────▼────┐        ┌────▼────┐        ┌────▼────┐
│  Redis  │        │PgBouncer│        │RabbitMQ │
│ (Cache) │        │ (Pool)  │        │ (Queue) │
└─────────┘        └────┬────┘        └─────────┘
                        │
                  ┌─────▼─────┐
                  │PostgreSQL │
                  └───────────┘
```
- Docker and Docker Compose
- Python 3.11+
- Make (optional)
```bash
# Start all services
docker compose up --build

# Or with load testing profile
docker compose --profile loadtest up --build --scale locust-worker=4
```

| Service | URL | Purpose |
|---|---|---|
| API | http://localhost:8000 | Main application |
| Locust | http://localhost:8089 | Load testing UI |
| RabbitMQ | http://localhost:15672 | Message broker management |
| PgBouncer | localhost:6432 | Connection pooling |
| PostgreSQL | localhost:5432 | Database (direct) |
| Redis | localhost:6379 | Cache |
| etcd | localhost:2379 | Configuration |
Create `.env.docker` for Docker or `.env` for local development:
```env
# Database
DATABASE_URL=postgresql+asyncpg://library:library_secret@pgbouncer:6432/library_db

# etcd Configuration
ETCD_HOST=etcd
ETCD_PORT=2379

# RabbitMQ
RABBITMQ_URL=amqp://guest:guest@rabbitmq:5672/

# Redis
REDIS_URL=redis://redis:6379/0
REDIS_ENABLED=true
REDIS_CACHE_TTL=300

# SendGrid (optional)
SENDGRID_API_KEY=your-api-key
SENDGRID_FROM_EMAIL=noreply@example.com
```

```yaml
api:
  build: .
  depends_on:
    - pgbouncer
    - redis
    - rabbitmq
    - etcd
```

The API runs with 2 Uvicorn workers per container. Scale horizontally with multiple containers behind nginx.
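Note that a plain `depends_on` list only orders container start-up; it does not wait for a dependency to be ready to accept connections. If readiness matters, Compose health checks can gate start-up. A sketch (the healthcheck commands are standard for these images, but not taken from this repo's compose file):

```yaml
services:
  api:
    build: .
    depends_on:
      redis:
        condition: service_healthy
      rabbitmq:
        condition: service_healthy

  redis:
    image: redis:7
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 5s
      retries: 5

  rabbitmq:
    image: rabbitmq:3-management
    healthcheck:
      test: ["CMD", "rabbitmq-diagnostics", "-q", "ping"]
      interval: 10s
      retries: 5
```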
- Round-robin load balancing across 8 API instances
- Connection keepalive for performance
- Health check routing
Configuration: `deploy/nginx/nginx.conf`
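The round-robin and keepalive behavior can be sketched as an nginx upstream block. This is a hedged illustration, not the repo's actual config (server names and the keepalive count are assumptions; the authoritative version lives in `deploy/nginx/nginx.conf`):

```nginx
upstream api_backend {
    # Round-robin is nginx's default balancing method
    server api1:8000;
    server api2:8000;
    # ... api3 through api8
    keepalive 64;  # pool of idle upstream connections to reuse
}

server {
    listen 80;
    location / {
        proxy_pass http://api_backend;
        proxy_http_version 1.1;
        proxy_set_header Connection "";  # required for keepalive to upstreams
    }
}
```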
| Setting | Value | Purpose |
|---|---|---|
| `POOL_MODE` | transaction | Release connections after each transaction |
| `MAX_CLIENT_CONN` | 10000 | Maximum client connections |
| `MAX_DB_CONNECTIONS` | 400 | Maximum PostgreSQL connections |
| `DEFAULT_POOL_SIZE` | 300 | Connections per pool |
| `MIN_POOL_SIZE` | 50 | Minimum idle connections |
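As a back-of-envelope check that these limits nest correctly: client-side connections funnel through PgBouncer into a much smaller server-side pool. The per-worker pool size below is an assumption, borrowed from the `DB_POOL_SIZE=50` used in the Cloud Run example later in this document:

```python
# Hypothetical sizing check for the PgBouncer limits above.
api_instances = 8          # API containers behind nginx
workers_per_instance = 2   # Uvicorn workers per container
pool_per_worker = 50       # assumed SQLAlchemy pool size per worker

client_conns = api_instances * workers_per_instance * pool_per_worker
print(client_conns)  # 800

# Client connections must fit under MAX_CLIENT_CONN, and PgBouncer's
# server-side pool must fit under PostgreSQL's max_connections.
assert client_conns <= 10_000   # MAX_CLIENT_CONN
assert 300 <= 400 <= 500        # DEFAULT_POOL_SIZE <= MAX_DB_CONNECTIONS <= max_connections
```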
Tuned for high concurrency:
```yaml
command:
  - "postgres"
  - "-c"
  - "max_connections=500"
  - "-c"
  - "shared_buffers=512MB"
  - "-c"
  - "effective_cache_size=1GB"
  - "-c"
  - "work_mem=32MB"
```

- 256MB memory limit
- LRU eviction policy
- 5-minute default TTL
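The cache-aside pattern implied by this setup can be sketched with a plain dict standing in for Redis. Names here are illustrative; real code would call the Redis client's `get`/`setex` instead of touching a dict:

```python
import time

CACHE_TTL = 300  # seconds, matching REDIS_CACHE_TTL

_cache = {}  # key -> (expiry_timestamp, value); stands in for Redis

def get_cached(key, loader):
    """Cache-aside read: return a fresh cached value, else load and cache it."""
    now = time.monotonic()
    entry = _cache.get(key)
    if entry is not None and entry[0] > now:
        return entry[1]                      # cache hit
    value = loader()                         # cache miss: fall through to the DB
    _cache[key] = (now + CACHE_TTL, value)   # SETEX-style write with a TTL
    return value

# The loader runs only on a miss
calls = []
def load_book():
    calls.append(1)
    return {"id": 1, "title": "Dune"}

get_cached("book:1", load_book)
get_cached("book:1", load_book)
print(len(calls))  # 1
```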
Centralized configuration management. Keys are stored under the `/config/` prefix:

```
/config/database/url
/config/redis/enabled
/config/circuit_breakers/rabbitmq/timeout
```
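A common pattern with this key layout is to fall back to environment variables when a key is absent in etcd. The sketch below stubs the etcd client with a dict; the function names and fallback behavior are assumptions, not this service's actual code:

```python
import os

def config_key(*parts):
    """Build an etcd key under the /config/ prefix."""
    return "/config/" + "/".join(parts)

def get_setting(etcd_get, *parts, default=None):
    """Read a setting from etcd, falling back to an env var, then a default."""
    value = etcd_get(config_key(*parts))
    if value is not None:
        return value
    env_name = "_".join(p.upper() for p in parts)  # redis/enabled -> REDIS_ENABLED
    return os.environ.get(env_name, default)

# A dict stands in for the etcd client here
store = {"/config/redis/enabled": "true"}
print(get_setting(store.get, "redis", "enabled"))  # true
```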
- Enable Cloud SQL with managed PgBouncer
- Set pool mode to `transaction`
- Configure Cloud SQL Auth Proxy
```yaml
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: library-service
spec:
  template:
    metadata:
      annotations:
        run.googleapis.com/execution-environment: gen2
        autoscaling.knative.dev/maxScale: "100"
        run.googleapis.com/cloudsql-instances: PROJECT:REGION:INSTANCE
    spec:
      containers:
        - image: gcr.io/PROJECT/library-app:latest
          ports:
            - containerPort: 8000
          env:
            - name: DATABASE_URL
              value: "postgresql+asyncpg://user:pass@/db?host=/cloudsql/PROJECT:REGION:INSTANCE"
            - name: DB_POOL_SIZE
              value: "50"
          resources:
            limits:
              cpu: "1000m"
              memory: "512Mi"
```

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: library-api
spec:
  replicas: 8
  selector:
    matchLabels:
      app: library-api
  template:
    metadata:
      labels:
        app: library-api
    spec:
      containers:
        - name: api
          image: library-app:latest
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "500m"
              memory: "256Mi"
            limits:
              cpu: "1000m"
              memory: "512Mi"
          livenessProbe:
            httpGet:
              path: /health
              port: 8000
          readinessProbe:
            httpGet:
              path: /health/ready
              port: 8000
```

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: library-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: library-api
  minReplicas: 4
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

```bash
# Start with load test profile
docker compose --profile loadtest up --build --scale locust-worker=4

# Access Locust UI
open http://localhost:8089
```

| Shape | File | Description |
|---|---|---|
| Stages | `run_stages.py` | Ramps to 10k users |
| Stress | `run_stress.py` | Finds breaking point |
| Soak | `run_soak.py` | Extended duration |
| Spike | `run_spike.py` | Sudden load bursts |
At 10,000 concurrent users:
| Metric | Target | Achieved |
|---|---|---|
| Error Rate | <1% | 0% |
| P50 Latency | <2s | 1.5s |
| P95 Latency | <5s | 4.8s |
| P99 Latency | <10s | 6.6s |
| RPS | >1000 | 1219 |
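Locust reports these percentiles automatically; for reference, the math behind them is just a nearest-rank lookup over sorted latency samples (the sample data below is made up):

```python
def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    ordered = sorted(samples)
    rank = -(-len(ordered) * p // 100)   # ceil(n * p / 100)
    return ordered[max(rank - 1, 0)]

# Made-up latency samples in milliseconds
latencies_ms = [120, 340, 90, 1500, 480, 220, 4800, 310, 150, 660]
print(percentile(latencies_ms, 50))  # 310
print(percentile(latencies_ms, 95))  # 4800
```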
| Endpoint | Purpose |
|---|---|
| `GET /health` | Basic liveness check |
| `GET /health/ready` | Readiness with dependencies |
| `GET /health/circuit-breakers` | Circuit breaker status |
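A readiness endpoint like `/health/ready` typically probes each dependency concurrently and reports per-component status. A framework-agnostic sketch; the probe functions are stand-ins for real pings to Redis, PgBouncer, and RabbitMQ, and the response shape is an assumption:

```python
import asyncio

async def check(name, probe):
    """Run one dependency probe, mapping any exception to an unhealthy status."""
    try:
        await probe()
        return name, "ok"
    except Exception as exc:
        return name, f"unhealthy: {exc}"

async def readiness(probes):
    """Probe all dependencies concurrently and summarize overall readiness."""
    results = dict(await asyncio.gather(*(check(n, p) for n, p in probes.items())))
    ready = all(v == "ok" for v in results.values())
    return {"status": "ready" if ready else "degraded", "checks": results}

# Stub probes for illustration
async def up(): pass
async def down(): raise ConnectionError("connection refused")

print(asyncio.run(readiness({"redis": up, "postgres": up, "rabbitmq": down})))
```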
- Request latency (P50, P95, P99)
- Error rate
- Database connection pool utilization
- Cache hit rate
- Circuit breaker state
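Circuit breaker state (the last metric above) normally cycles closed → open → half-open. A minimal sketch of that state machine; the threshold and timeout values are illustrative, not this service's actual configuration:

```python
import time

class CircuitBreaker:
    """Opens after `threshold` consecutive failures; retries after `timeout` seconds."""

    def __init__(self, threshold=5, timeout=30.0):
        self.threshold = threshold
        self.timeout = timeout
        self.failures = 0
        self.opened_at = None  # None means closed

    @property
    def state(self):
        if self.opened_at is None:
            return "closed"
        if time.monotonic() - self.opened_at >= self.timeout:
            return "half-open"  # allow one trial call through
        return "open"

    def call(self, fn):
        if self.state == "open":
            raise RuntimeError("circuit open, failing fast")
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0       # any success resets the breaker
        self.opened_at = None
        return result

# Two consecutive failures trip a breaker with threshold=2
cb = CircuitBreaker(threshold=2, timeout=60.0)
for _ in range(2):
    try:
        cb.call(lambda: 1 / 0)
    except ZeroDivisionError:
        pass
print(cb.state)  # open
```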
**High latency:**
- Check PgBouncer pool utilization
- Verify the Redis cache is enabled and serving hits
- Review slow query logs
- Check for connection pool exhaustion
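For the slow-query step, `pg_stat_statements` is the usual starting point (assuming the extension is enabled in this deployment; column names as of PostgreSQL 13, queried directly on 5432 rather than through PgBouncer):

```sql
-- Top 10 queries by mean execution time
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;
```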
**Connection errors:**
- Increase `DEFAULT_POOL_SIZE` in PgBouncer
- Check `max_connections` in PostgreSQL
- Verify network connectivity between services
**Gateway errors from nginx:**
- Check if API instances are healthy
- Review nginx upstream configuration
- Increase nginx `proxy_read_timeout`
- Scale up API instances
**Database deadlocks:**
- Review transaction isolation levels
- Check for missing indexes
- Ensure consistent lock ordering
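Consistent lock ordering applies at the application level too: if every transaction takes its locks in one global order, no two transactions can wait on each other. A sketch with threading locks (the resource names are illustrative; the same idea applies to row locks, e.g. always `SELECT ... FOR UPDATE` ids in ascending order):

```python
import threading

# One lock per resource; deadlocks arise when two transactions take the
# same locks in different orders.
locks = {name: threading.Lock() for name in ("accounts", "books", "loans")}

def acquire_in_order(names):
    """Take locks in one global order (sorted by name), whatever order
    the caller asked for, so no two callers can wait on each other."""
    ordered = sorted(names)
    for name in ordered:
        locks[name].acquire()
    return ordered

def release_all(names):
    for name in sorted(names, reverse=True):
        locks[name].release()

order_a = acquire_in_order(["books", "loans"])
release_all(["books", "loans"])
order_b = acquire_in_order(["loans", "books"])  # reversed request...
release_all(["loans", "books"])
print(order_a == order_b)  # True -- both locked in the same order
```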