"Today I'll design an online coding practice and evaluation platform like LeetCode or HackerRank. The core challenges are executing untrusted user code safely in sandboxed environments, supporting multiple programming languages, enforcing resource limits, and scaling to handle thousands of concurrent submissions while maintaining fair and consistent evaluation."
Functional requirements:
- Problem database - Store coding problems with descriptions, test cases, solutions
- Code submission - Users submit code in multiple languages
- Code execution - Run user code against test cases in sandboxed environment
- Test case validation - Compare output with expected results
- Leaderboards - Track submissions, success rate, ranking
- User progress - Track solved problems, submissions history
- Contests - Time-limited competitive programming events
Non-functional requirements:
- Security: Sandboxed execution preventing malicious code
- Resource limits: CPU, memory, time constraints per submission
- Fairness: Consistent evaluation across all users
- Scale: 100K concurrent users, 10K submissions/minute during contests
- Latency: Results within 5 seconds for simple problems
Out of scope:
- Discussion forums
- Premium subscription management
- Interview preparation features
User base:
- 10 million registered users
- 500K daily active users
- Peak during contests: 100K concurrent
Submissions:
- Normal: 1 million submissions/day = 12 submissions/second
- Contest peak: 10K submissions/minute = 170/second
- Average code size: 2KB
- Average test cases per problem: 50
Execution:
- Average execution time: 2 seconds
- Languages: 15+ (Python, Java, C++, JavaScript, etc.)
- Concurrent executions needed: 170 * 2 = 340 at peak
Storage:
- Problems: 3,000 problems * 100KB = 300MB
- Submissions: 365M/year * 2KB = 730 GB/year
- Test cases: 3,000 * 50 * 10KB = 1.5 GB
Key insight: The bottleneck is execution capacity. We need to run untrusted code safely and at scale.
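The estimates above can be reproduced with a few lines of arithmetic. The sketch below only restates the figures already quoted (1 million submissions/day, 10K submissions/minute at contest peak, 2-second average execution, 2KB average code size); the variable names are illustrative, and the concurrency figure applies Little's law under the assumption that the 2-second average covers a whole submission.

```typescript
// Back-of-envelope capacity estimate using the figures quoted above.
const SECONDS_PER_DAY = 24 * 60 * 60;

const submissionsPerDay = 1_000_000;
const contestSubmissionsPerMinute = 10_000;
const avgExecutionSeconds = 2;
const avgCodeSizeKB = 2;

// Average and peak arrival rates (submissions per second).
const normalRate = submissionsPerDay / SECONDS_PER_DAY; // about 11.6
const peakRate = contestSubmissionsPerMinute / 60;       // about 166.7

// Little's law: concurrency = arrival rate x time spent executing.
const peakConcurrentExecutions = Math.ceil(peakRate * avgExecutionSeconds);

// Yearly submission storage in GB (decimal units: 1 GB = 1e6 KB).
const submissionStorageGBPerYear =
  (submissionsPerDay * 365 * avgCodeSizeKB) / 1e6;

console.log({ normalRate, peakRate, peakConcurrentExecutions, submissionStorageGBPerYear });
```

The unrounded peak comes out at 334 concurrent executions; the 340 quoted above comes from rounding the rate up to 170/second first, which is a reasonable planning margin.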
                              ┌────────────────────────────────────┐
                              │         Web/Mobile Clients         │
                              │    (Code Editor, Problem View)     │
                              └─────────────────┬──────────────────┘
                                                │
                                                ▼
                              ┌────────────────────────────────────┐
                              │           Load Balancer            │
                              └─────────────────┬──────────────────┘
                                                │
          ┌─────────────────────────────────────┼─────────────────────────────────────┐
          │                                     │                                     │
┌─────────▼─────────┐               ┌───────────▼───────────┐                ┌────────▼────────┐
│    Web Servers    │               │  Submission Service   │                │ Problem Service │
│                   │               │                       │                │                 │
│ - Static content  │               │ - Queue submissions   │                │ - CRUD problems │
│ - User sessions   │               │ - Track status        │                │ - Test cases    │
└───────────────────┘               └───────────┬───────────┘                └─────────────────┘
                                                │
                                                ▼
                              ┌────────────────────────────────────┐
                              │       Message Queue (Kafka)        │
                              │    (Submission Queue per lang)     │
                              └─────────────────┬──────────────────┘
                                                │
          ┌─────────────────────────────────────┼─────────────────────────────────────┐
          │                                     │                                     │
┌─────────▼─────────┐               ┌───────────▼───────────┐                ┌────────▼────────┐
│   Judge Worker    │               │     Judge Worker      │                │  Judge Worker   │
│   (Python Pool)   │               │      (Java Pool)      │                │   (C++ Pool)    │
│                   │               │                       │                │                 │
│ ┌──────────────┐  │               │   ┌──────────────┐    │                │ ┌─────────────┐ │
│ │   Sandbox    │  │               │   │   Sandbox    │    │                │ │   Sandbox   │ │
│ │  Container   │  │               │   │  Container   │    │                │ │  Container  │ │
│ └──────────────┘  │               │   └──────────────┘    │                │ └─────────────┘ │
└───────────────────┘               └───────────────────────┘                └─────────────────┘
          │                                     │                                     │
          └─────────────────────────────────────┼─────────────────────────────────────┘
                                                │
                              ┌─────────────────▼──────────────────┐
                              │           Result Handler           │
                              │   (Update DB, Notify, Rankings)    │
                              └────────────────────────────────────┘
                                                │
                              ┌─────────────────▼──────────────────┐
                              │             PostgreSQL             │
                              │   (Users, Problems, Submissions)   │
                              └────────────────────────────────────┘
- Submission Service
  - Receives code submissions
  - Validates and queues for execution
  - Tracks submission status
- Message Queue (Kafka)
  - Buffers submissions
  - Separate topics per language
  - Handles burst traffic
- Judge Workers
  - Pull submissions from queue
  - Execute code in sandboxed containers
  - Compare output with expected results
- Sandbox Environment
  - Isolated execution environment
  - Resource limits (CPU, memory, time)
  - No network access, read-only filesystem
- Result Handler
  - Processes execution results
  - Updates database
  - Triggers notifications
  - Updates leaderboards
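To make the "separate topics per language" idea concrete, here is a minimal in-memory sketch of how the Submission Service might route messages and how a judge worker might pull them. It is a stand-in for the real broker, not Kafka client code: the `SubmissionMessage` shape, the `submissions.<language>` naming scheme, and the `InMemoryQueue` class are all assumptions made for illustration, as is the choice to send IDs rather than the code itself.

```typescript
type Language = 'python' | 'java' | 'cpp' | 'javascript' | 'go';

// Hypothetical message envelope; the worker is assumed to load the code by ID.
interface SubmissionMessage {
  submissionId: string;
  userId: string;
  problemId: string;
  language: Language;
  enqueuedAt: number; // epoch milliseconds
}

// Hypothetical naming scheme: one topic per language.
function topicFor(language: Language): string {
  return `submissions.${language}`;
}

// In-memory stand-in for the broker, keyed by topic name.
class InMemoryQueue {
  private topics = new Map<string, SubmissionMessage[]>();

  // Append the message to its language topic and return the topic name.
  publish(msg: SubmissionMessage): string {
    const topic = topicFor(msg.language);
    const queue = this.topics.get(topic) ?? [];
    queue.push(msg);
    this.topics.set(topic, queue);
    return topic;
  }

  // A judge worker for one language pulls the oldest message (FIFO).
  pull(language: Language): SubmissionMessage | undefined {
    return this.topics.get(topicFor(language))?.shift();
  }
}
```

Keeping one topic per language lets each worker pool scale on its own backlog, at the cost of more topics to monitor.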
This is the most critical and complex part of the system.
User code is untrusted. We must prevent:
- System access: Reading files, executing commands
- Network access: Making external requests
- Resource exhaustion: Infinite loops, memory bombs
- Process escape: Breaking out of sandbox
┌─────────────────────────────────────────────────────────────────────┐
│                            Host Machine                             │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │                  Container Runtime (gVisor)                   │  │
│  │                                                               │  │
│  │  ┌─────────────────────────────────────────────────────────┐  │  │
│  │  │                    Sandbox Container                    │  │  │
│  │  │                                                         │  │  │
│  │  │  ┌───────────────────────────────────────────────────┐  │  │  │
│  │  │  │                   User Process                    │  │  │  │
│  │  │  │                                                   │  │  │  │
│  │  │  │  - No network                                     │  │  │  │
│  │  │  │  - Read-only filesystem                           │  │  │  │
│  │  │  │  - No fork/exec                                   │  │  │  │
│  │  │  │  - Memory limit: 256MB                            │  │  │  │
│  │  │  │  - CPU limit: 2 seconds                           │  │  │  │
│  │  │  │  - No /proc, /sys access                          │  │  │  │
│  │  │  └───────────────────────────────────────────────────┘  │  │  │
│  │  │                                                         │  │  │
│  │  │  Seccomp: Whitelist of allowed syscalls                 │  │  │
│  │  │  AppArmor/SELinux: Mandatory access control             │  │  │
│  │  └─────────────────────────────────────────────────────────┘  │  │
│  │                                                               │  │
│  └───────────────────────────────────────────────────────────────┘  │
│                                                                     │
│  cgroups: Resource limits enforced at kernel level                  │
└─────────────────────────────────────────────────────────────────────┘
Option 1: Docker with restrictions
# docker-compose service fragment for the sandbox
security_opt:
  - no-new-privileges:true
  - seccomp:./seccomp-profile.json
cap_drop:
  - ALL
network_mode: none
read_only: true
mem_limit: 256m
cpus: 0.5
pids_limit: 10

Option 2: gVisor (chosen)
- User-space kernel implementation
- Intercepts syscalls
- Stronger isolation than Docker alone
- Used by Google Cloud Run
Option 3: Firecracker microVMs
- VM-level isolation
- More overhead
- Used by AWS Lambda
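As a rough illustration of how Options 1 and 2 combine, the sketch below assembles `docker run` arguments that mirror the restrictions listed under Option 1 and select gVisor's `runsc` runtime. It assumes `runsc` has already been installed and registered as a Docker runtime on the host; `buildDockerRunArgs`, `SandboxLimits`, and the image name in the usage comment are hypothetical names introduced here, not part of the original design.

```typescript
interface SandboxLimits {
  memory: string;    // e.g. '256m'
  cpus: number;      // fraction of a core, e.g. 0.5
  pidsLimit: number; // maximum number of processes in the container
}

// Build the argument list for `docker run`; options must precede the image name.
function buildDockerRunArgs(
  image: string,
  limits: SandboxLimits,
  useGvisor: boolean = true,
): string[] {
  const args = [
    'run',
    '--rm',
    '--network=none',                   // no network access
    '--read-only',                      // read-only root filesystem
    '--cap-drop=ALL',                   // drop all Linux capabilities
    '--security-opt=no-new-privileges', // block privilege escalation
    `--memory=${limits.memory}`,
    `--cpus=${limits.cpus}`,
    `--pids-limit=${limits.pidsLimit}`,
  ];
  if (useGvisor) {
    // Assumes the gVisor runtime is registered with Docker under the name "runsc".
    args.push('--runtime=runsc');
  }
  args.push(image);
  return args;
}

// Usage (image name is a placeholder):
// buildDockerRunArgs('judge-python:latest', { memory: '256m', cpus: 0.5, pidsLimit: 10 })
```

Returning an argument array rather than a shell string avoids quoting problems when the list is handed to a process-spawning API.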
async function executeSubmission(submission: Submission): Promise<Result> {
  const sandbox = await sandboxPool.acquire(submission.language);
  try {
    // 1. Write user code to the sandbox under a language-appropriate file name.
    //    getSourcePath is a hypothetical helper (e.g. '/code/solution.py' for Python).
    const sourcePath = getSourcePath(submission.language);
    await sandbox.writeFile(sourcePath, submission.code);

    // 2. Compile if needed (getCompileCommand returns null for interpreted languages)
    const compileCommand = getCompileCommand(submission.language, [sourcePath]);
    if (compileCommand !== null) {
      const compileResult = await sandbox.exec(
        compileCommand,
        { timeout: 30000, memory: '512m' }
      );
      if (compileResult.exitCode !== 0) {
        return { status: 'COMPILE_ERROR', error: compileResult.stderr };
      }
    }

    // 3. Run against each test case
    const results: TestCaseResult[] = [];
    for (const testCase of submission.problem.testCases) {
      const result = await runTestCase(sandbox, submission, testCase);
      results.push(result);
      // Early termination on failure (for efficiency)
      if (result.status !== 'PASSED' && !submission.showAllResults) {
        break;
      }
    }
    return aggregateResults(results);
  } finally {
    // release() resets the sandbox before returning it to the pool
    await sandboxPool.release(sandbox);
  }
}
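`aggregateResults` is called above but never shown. One plausible shape, sketched here under assumptions, reports the first failing verdict, the number of passed cases, and the slowest test time. The `Result` fields below (`passed`, `total`, `maxTimeMs`) are illustrative guesses rather than the author's actual type, and treating an empty result list as a system error is a design choice made only for this sketch.

```typescript
type TestStatus =
  | 'PASSED'
  | 'WRONG_ANSWER'
  | 'TIME_LIMIT_EXCEEDED'
  | 'MEMORY_LIMIT_EXCEEDED'
  | 'RUNTIME_ERROR'
  | 'SYSTEM_ERROR';

interface TestCaseResult {
  status: TestStatus;
  time?: number; // milliseconds
  error?: string;
}

// Hypothetical aggregate verdict for a whole submission.
interface Result {
  status: 'ACCEPTED' | Exclude<TestStatus, 'PASSED'>;
  passed: number;
  total: number;
  maxTimeMs: number;
  error?: string;
}

// `total` defaults to the number of executed cases; pass the full test count
// when execution stopped early on the first failure.
function aggregateResults(results: TestCaseResult[], total: number = results.length): Result {
  if (results.length === 0) {
    return { status: 'SYSTEM_ERROR', passed: 0, total, maxTimeMs: 0, error: 'no test cases ran' };
  }
  const passed = results.filter((r) => r.status === 'PASSED').length;
  const maxTimeMs = results.reduce((max, r) => Math.max(max, r.time ?? 0), 0);
  const firstFailure = results.find((r) => r.status !== 'PASSED');

  if (firstFailure && firstFailure.status !== 'PASSED') {
    return { status: firstFailure.status, passed, total, maxTimeMs, error: firstFailure.error };
  }
  return { status: 'ACCEPTED', passed, total, maxTimeMs };
}
```

Reporting the maximum per-test time (rather than the sum) matches how per-test time limits are usually enforced, though either convention could be used.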
async function runTestCase(
  sandbox: Sandbox,
  submission: Submission,
  testCase: TestCase
): Promise<TestCaseResult> {
  const startTime = Date.now();
  try {
    const result = await sandbox.exec(
      getRunCommand(submission.language),
      {
        stdin: testCase.input,
        timeout: submission.problem.timeLimit,
        memory: submission.problem.memoryLimit
      }
    );
    // Wall-clock time as observed by the judge, in milliseconds
    const executionTime = Date.now() - startTime;
    if (result.timeout) {
      return { status: 'TIME_LIMIT_EXCEEDED', time: executionTime };
    }
    if (result.memoryExceeded) {
      return { status: 'MEMORY_LIMIT_EXCEEDED', time: executionTime };
    }
    if (result.exitCode !== 0) {
      return { status: 'RUNTIME_ERROR', error: result.stderr, time: executionTime };
    }
    // Compare output
    const passed = compareOutput(result.stdout, testCase.expectedOutput);
    return {
      status: passed ? 'PASSED' : 'WRONG_ANSWER',
      time: executionTime,
      output: result.stdout.substring(0, 1000) // Truncate for display
    };
  } catch (error) {
    // `error` is `unknown` in strict TypeScript, so narrow before reading .message
    const message = error instanceof Error ? error.message : String(error);
    return { status: 'SYSTEM_ERROR', error: message };
  }
}

function compareOutput(actual: string, expected: string): boolean {
  // Normalize line endings and strip trailing spaces/tabs on each line.
  // ([ \t]+$ is used instead of \s+$, which would also swallow blank lines.)
  const normalizeWhitespace = (s: string) =>
    s.replace(/\r\n/g, '\n').replace(/[ \t]+$/gm, '').trim();
  const actualNorm = normalizeWhitespace(actual);
  const expectedNorm = normalizeWhitespace(expected);
  if (actualNorm === expectedNorm) return true;
  // Handle floating point comparison
  if (isNumericOutput(expectedNorm)) {
    return compareNumeric(actualNorm, expectedNorm, 1e-6);
  }
  return false;
}

const resourceLimits: Record<string, ResourceLimits> = {
  python:     { time: 10000, memory: '256m', multiplier: 3 },
  java:       { time: 5000,  memory: '512m', multiplier: 2 },
  cpp:        { time: 2000,  memory: '256m', multiplier: 1 },
  javascript: { time: 8000,  memory: '256m', multiplier: 2.5 },
  go:         { time: 3000,  memory: '256m', multiplier: 1.2 },
};
// Time limit = base_limit * language_multiplier

Each supported language needs:
- Compiler/interpreter installed
- Standard library available
- Execution wrapper
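One way to keep these per-language requirements in a single place is a small registry that maps each language to its source file name, compile command, and run command. The sketch below is an assumption about how that could look: `LanguageSpec`, `LANGUAGES`, and `getLanguageSpec` are hypothetical names, the file paths are illustrative, and the Java entry assumes the submitted `Solution` class is directly runnable rather than loaded through a separate wrapper.

```typescript
// Hypothetical per-language configuration used by the judge worker.
interface LanguageSpec {
  sourceFile: string;     // where the user's code is written inside the sandbox
  compile: string | null; // null for interpreted languages
  run: string;            // command that executes the (compiled) program
}

const LANGUAGES: Record<string, LanguageSpec> = {
  python: {
    sourceFile: '/code/solution.py',
    compile: null,
    run: 'python3 /code/solution.py',
  },
  cpp: {
    sourceFile: '/code/solution.cpp',
    compile: 'g++ -O2 -std=c++17 -o /code/solution /code/solution.cpp',
    run: '/code/solution',
  },
  java: {
    sourceFile: '/code/Solution.java',
    compile: 'javac -d /code /code/Solution.java',
    run: 'java -cp /code Solution',
  },
};

// Look up a language, failing loudly for anything the judge does not support.
function getLanguageSpec(language: string): LanguageSpec {
  const spec = LANGUAGES[language];
  if (!spec) {
    throw new Error(`Unsupported language: ${language}`);
  }
  return spec;
}
```

With a registry like this, adding a language becomes a data change plus a new sandbox image, rather than edits scattered across several `switch` statements.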
# Base image (named "base" so the later stages can build on it)
FROM ubuntu:22.04 AS base
# Common setup
RUN apt-get update && apt-get install -y \
    ca-certificates \
    && rm -rf /var/lib/apt/lists/*
# Python image
FROM base AS python
RUN apt-get update && apt-get install -y python3.11 python3-pip
RUN pip3 install numpy scipy # Common libraries
# Java image
FROM base AS java
RUN apt-get update && apt-get install -y openjdk-17-jdk
ENV JAVA_HOME=/usr/lib/jvm/java-17-openjdk-amd64
# C++ image
FROM base AS cpp
RUN apt-get update && apt-get install -y g++ clang

# Python wrapper
import sys
import resource
# Set resource limits
resource.setrlimit(resource.RLIMIT_AS, (256 * 1024 * 1024,) * 2) # 256MB
resource.setrlimit(resource.RLIMIT_CPU, (10, 10)) # 10 seconds
# Execute user code
exec(open('/code/solution.py').read())

// Java wrapper
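public class Runner {
    public static void main(String[] args) {
        // Set security manager.
        // Note: SecurityManager is deprecated for removal as of Java 17 (JEP 411),
        // so this is a secondary layer; the container sandbox is the real boundary.
        System.setSecurityManager(new SandboxSecurityManager());
        // Load and run user code
        Solution solution = new Solution();
        // ...
    }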
}

function getCompileCommand(language: string, files: string[]): string | null {
  switch (language) {
    case 'cpp':
      return `g++ -O2 -std=c++17 -o /code/solution ${files.join(' ')}`;
    case 'java':
      return `javac -d /code ${files.join(' ')}`;
    case 'rust':
      return `rustc -O -o /code/solution ${files[0]}`;
    case 'go':
      return `go build -o /code/solution ${files[0]}`;
    default:
      return null; // Interpreted language: nothing to compile
  }
}

- Time-limited (2-3 hours)
- 4-6 problems
- Real-time leaderboard
- Fair queuing (no priority for repeat submissions)
┌─────────────────────────────────────────────────────────────────┐
│                          Contest Mode                           │
│                                                                 │
│  ┌─────────────────┐                  ┌─────────────────┐       │
│  │ Contest Service │                  │   Leaderboard   │       │
│  │                 │                  │     Service     │       │
│  │ - Start/end     │                  │                 │       │
│  │ - Enrollment    │                  │ - Real-time     │       │
│  │ - Time sync     │                  │ - Scoring       │       │
│  └────────┬────────┘                  └────────┬────────┘       │
│           │                                    │                │
│           │                    ┌───────────────┘                │
│           ▼                    ▼                                │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │                   Redis (Real-time State)                   │ │
│ │                                                             │ │
│ │  contest:{id}:leaderboard → Sorted Set (score, user_id)     │ │
│ │  contest:{id}:submissions → List (submission_ids)           │ │
│ │  contest:{id}:user:{uid}  → Hash (solved, penalties)        │ │
│ └─────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────┘
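A Redis sorted set orders members by a single numeric score, while contest ranking depends on two values (problems solved, then time penalty). One common workaround, sketched here as an assumption rather than the author's stated approach, is to pack both into one number so that a higher score always means a better rank. `PENALTY_CAP`, `encodeScore`, and `decodeScore` are hypothetical names; the cap assumes penalties stay below one million minutes.

```typescript
// Upper bound (exclusive) on penalty minutes; keeps the two fields from overlapping.
const PENALTY_CAP = 1_000_000;

// Pack (solved, penalty) into one score: more solved ranks higher,
// and for equal solved counts a smaller penalty yields a larger score.
function encodeScore(solved: number, penaltyMinutes: number): number {
  if (!Number.isInteger(solved) || solved < 0) {
    throw new RangeError('solved must be a non-negative integer');
  }
  if (!Number.isInteger(penaltyMinutes) || penaltyMinutes < 0 || penaltyMinutes >= PENALTY_CAP) {
    throw new RangeError(`penaltyMinutes must be an integer in [0, ${PENALTY_CAP})`);
  }
  return solved * PENALTY_CAP + (PENALTY_CAP - 1 - penaltyMinutes);
}

// Recover the original pair from a packed score.
function decodeScore(score: number): { solved: number; penaltyMinutes: number } {
  const solved = Math.floor(score / PENALTY_CAP);
  const penaltyMinutes = PENALTY_CAP - 1 - (score % PENALTY_CAP);
  return { solved, penaltyMinutes };
}

// The packed value would be written with something like:
//   ZADD contest:{id}:leaderboard <encodeScore(...)> <user_id>
// and the top of the board read in descending score order.
```

The packed values stay far below 2^53, so they are represented exactly as the double-precision scores Redis uses. Ties on both fields fall back to Redis's member ordering, which this sketch does not try to control.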
interface ContestScore {
  solved: number;   // Number of problems solved
  penalty: number;  // Time penalty in minutes
  submissions: Map<string, ProblemScore>;
}

interface ProblemScore {
  solved: boolean;
  attempts: number;
  solvedAt: number; // Minutes from contest start
}
async function calculateScore(userId: string, contestId: string): Promise<ContestScore> {
  // Submissions are assumed to be returned in chronological order.
  const submissions = await getContestSubmissions(userId, contestId);
  // getContestStartTime is a hypothetical helper; the original left contestStart undefined.
  const contestStart = await getContestStartTime(contestId);
  let solved = 0;
  let penalty = 0;
  const problems = new Map<string, ProblemScore>();
  for (const sub of submissions) {
    const problemScore = problems.get(sub.problemId) || {
      solved: false, attempts: 0, solvedAt: 0
    };
    if (!problemScore.solved) {
      if (sub.status === 'ACCEPTED') {
        problemScore.solved = true;
        problemScore.solvedAt = minutesFromStart(sub.submittedAt, contestStart);
        solved++;
        // Solve time plus 20 minutes for each earlier failed attempt
        penalty += problemScore.solvedAt + (problemScore.attempts * 20);
      } else {
        problemScore.attempts++;
      }
    }
    problems.set(sub.problemId, problemScore);
  }
  return { solved, penalty, submissions: problems };
}
// Ranking: Sort by solved DESC, then penalty ASC

- Code similarity detection: Compare submissions using algorithms like MOSS
- IP tracking: Flag multiple accounts from same IP
- Timing analysis: Detect suspicious submission patterns
- Randomized test cases: Different test order per user
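To illustrate the code-similarity idea, the sketch below normalizes identifiers and numbers, then compares two programs by the Jaccard similarity of their token k-grams. This is a deliberately simplified stand-in and not MOSS itself; the keyword list, the `k = 4` window, and all function names are assumptions chosen for the example.

```typescript
// A small, illustrative keyword set; a real system would use per-language lists.
const KEYWORDS = new Set([
  'def', 'return', 'if', 'else', 'for', 'while', 'in', 'class', 'import',
  'int', 'void', 'public', 'static', 'function', 'const', 'let', 'var',
]);

// Split code into tokens, replacing identifiers with ID and numbers with NUM
// so that renaming variables does not change the token stream.
function tokenize(code: string): string[] {
  const raw = code.match(/[A-Za-z_]\w*|\d+(?:\.\d+)?|[^\sA-Za-z_\d]/g) ?? [];
  return raw.map((tok) => {
    if (/^[A-Za-z_]/.test(tok)) return KEYWORDS.has(tok) ? tok : 'ID';
    if (/^\d/.test(tok)) return 'NUM';
    return tok;
  });
}

// Collect every run of k consecutive tokens as a single string.
function shingles(tokens: string[], k: number): Set<string> {
  const out = new Set<string>();
  for (let i = 0; i + k <= tokens.length; i++) {
    out.add(tokens.slice(i, i + k).join(' '));
  }
  return out;
}

// Jaccard similarity of the two shingle sets: |A ∩ B| / |A ∪ B|, in [0, 1].
function similarity(a: string, b: string, k: number = 4): number {
  const sa = shingles(tokenize(a), k);
  const sb = shingles(tokenize(b), k);
  let common = 0;
  sa.forEach((s) => {
    if (sb.has(s)) common++;
  });
  const union = sa.size + sb.size - common;
  return union === 0 ? 0 : common / union;
}
```

Two solutions that differ only in variable names score 1.0 under this measure, which is why flagged pairs should go to human review rather than trigger automatic penalties: short problems often have one natural solution.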
-- Note: the users table is referenced below but not shown; it is assumed to exist.
-- Problems
CREATE TABLE problems (
    id UUID PRIMARY KEY,
    title VARCHAR(255),
    slug VARCHAR(100) UNIQUE,
    description TEXT,
    difficulty VARCHAR(20), -- 'easy', 'medium', 'hard'
    time_limit_ms INTEGER DEFAULT 2000,
    memory_limit_mb INTEGER DEFAULT 256,
    created_at TIMESTAMP,
    updated_at TIMESTAMP
);
-- Test cases
CREATE TABLE test_cases (
    id UUID PRIMARY KEY,
    problem_id UUID REFERENCES problems(id),
    input TEXT,
    expected_output TEXT,
    is_sample BOOLEAN DEFAULT FALSE, -- Shown to users
    order_index INTEGER
);
-- Contests (created before submissions, which reference them)
CREATE TABLE contests (
    id UUID PRIMARY KEY,
    title VARCHAR(255),
    start_time TIMESTAMP,
    end_time TIMESTAMP,
    is_rated BOOLEAN DEFAULT TRUE
);
CREATE TABLE contest_problems (
    contest_id UUID REFERENCES contests(id),
    problem_id UUID REFERENCES problems(id),
    order_index INTEGER,
    points INTEGER,
    PRIMARY KEY (contest_id, problem_id)
);
-- Submissions
CREATE TABLE submissions (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(id),
    problem_id UUID REFERENCES problems(id),
    contest_id UUID REFERENCES contests(id), -- NULL outside contests
    language VARCHAR(20),
    code TEXT,
    status VARCHAR(30),
    runtime_ms INTEGER,
    memory_kb INTEGER,
    test_cases_passed INTEGER,
    test_cases_total INTEGER,
    created_at TIMESTAMP
);
-- User progress
CREATE TABLE user_problem_status (
    user_id UUID REFERENCES users(id),
    problem_id UUID REFERENCES problems(id),
    status VARCHAR(20), -- 'solved', 'attempted', 'unsolved'
    best_runtime_ms INTEGER,
    best_memory_kb INTEGER,
    attempts INTEGER DEFAULT 0,
    solved_at TIMESTAMP,
    PRIMARY KEY (user_id, problem_id)
);

# Problems
GET /api/v1/problems - List problems
GET /api/v1/problems/{slug} - Get problem details
GET /api/v1/problems/{slug}/submissions - User's submissions
# Submissions
POST /api/v1/submissions - Submit code
GET /api/v1/submissions/{id} - Get submission result
GET /api/v1/submissions/{id}/status - Poll for result
# Contests
GET /api/v1/contests - List contests
GET /api/v1/contests/{id} - Contest details
POST /api/v1/contests/{id}/register - Register for contest
GET /api/v1/contests/{id}/leaderboard - Real-time leaderboard
# User
GET /api/v1/users/{id}/profile - User profile + stats
GET /api/v1/users/{id}/submissions - Submission history
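For clients that use the polling endpoint rather than a push channel, a capped exponential backoff keeps request volume down while still returning quickly for fast submissions. The sketch below is illustrative only: `pollDelayMs` and `waitForResult` are hypothetical helpers, the `QUEUED` status is an assumption (only `RUNNING` and final verdicts appear above), and the status fetcher and sleep function are injected so that no particular HTTP client is implied.

```typescript
// Delay before the next poll: 250ms, 500ms, 1s, then capped at 2s.
function pollDelayMs(attempt: number, baseMs: number = 250, maxMs: number = 2000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Poll until the submission leaves the pending states or attempts run out.
// `getStatus` would wrap GET /api/v1/submissions/{id}/status in a real client.
async function waitForResult(
  submissionId: string,
  getStatus: (id: string) => Promise<string>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts: number = 20,
): Promise<string> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await getStatus(submissionId);
    if (status !== 'QUEUED' && status !== 'RUNNING') {
      return status; // final verdict, e.g. 'ACCEPTED'
    }
    await sleep(pollDelayMs(attempt));
  }
  throw new Error(`Submission ${submissionId} still pending after ${maxAttempts} polls`);
}
```

With these defaults the first three polls fall within the first second, which fits the 5-second latency target for simple problems.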
// Subscribe to submission result (WebSocket frames carry strings, so serialize to JSON)
ws.send(JSON.stringify({ type: 'subscribe', submissionId: 'xxx' }));
// Receive updates
ws.on('message', (data) => {
  const message = JSON.parse(data.toString());
  // { type: 'status', status: 'RUNNING', testCase: 5 }
  // { type: 'result', status: 'ACCEPTED', runtime: 42, memory: 12340 }
});
// Subscribe to contest leaderboard
ws.send(JSON.stringify({ type: 'subscribe', contestId: 'xxx', channel: 'leaderboard' }));

// Auto-scale based on queue depth
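async function autoScaleWorkers() {
  const queueDepth = await getQueueDepth();
  // activeWorkers and avgThroughput (submissions/second per worker) are assumed
  // to come from the metrics system.
  const processingCapacity = activeWorkers * avgThroughput;
  // Target: process queue in 30 seconds
  const targetCapacity = queueDepth / 30;
  if (targetCapacity > processingCapacity * 1.2) {
    // Scale up: add enough workers to cover the shortfall
    const newWorkers = Math.ceil(
      (targetCapacity - processingCapacity) / avgThroughput
    );
    // Pass the desired total replica count, not just the increment
    await kubernetes.scaleDeployment('judge-workers', activeWorkers + newWorkers);
  }
}

// Keep warm containers ready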
class SandboxPool {
  private warmContainers: Map<string, Sandbox[]> = new Map();
  private minWarm = 5;

  async acquire(language: string): Promise<Sandbox> {
    const pool = this.warmContainers.get(language) || [];
    if (pool.length > 0) {
      return pool.pop()!;
    }
    // Create new if pool empty
    return this.createSandbox(language);
  }

  async release(sandbox: Sandbox): Promise<void> {
    await sandbox.reset();
    let pool = this.warmContainers.get(sandbox.language);
    if (!pool) {
      // Register the pool so released sandboxes are actually retained
      pool = [];
      this.warmContainers.set(sandbox.language, pool);
    }
    if (pool.length < this.minWarm * 2) {
      pool.push(sandbox);
    } else {
      await sandbox.destroy();
    }
  }
}

- Deploy workers in multiple regions
- Route submissions to nearest region
- Replicate problem database globally
| Decision | Trade-off |
|---|---|
| gVisor sandboxing | Strong security, but 10-20% overhead |
| Pre-compiled test images | Fast startup, but storage cost |
| Sequential test execution | Fair comparison, but slower |
| Per-language workers | Efficient, but complex scaling |
- WebAssembly sandbox
  - Portable, fast
  - Limited language support
  - Could use for JavaScript
- AWS Lambda for execution
  - Scalable, managed
  - Cold start latency
  - Higher cost at scale
- Run all tests in parallel
  - Faster results
  - Higher resource usage
  - Chose sequential for fairness
"I've designed an online judge system with:
- gVisor-based sandboxing for secure code execution with syscall filtering
- Language-specific worker pools with pre-warmed containers
- Queue-based submission processing for handling traffic spikes
- Real-time contest leaderboards with ICPC-style scoring
The key insight is that security and fairness are non-negotiable. We use multiple layers of isolation (containers, gVisor, seccomp, resource limits) to run untrusted code safely. Happy to discuss any aspect further."
- How would you detect plagiarism?
  - Tokenize code, remove variable names
  - Calculate similarity using algorithms like MOSS
  - Flag pairs above 80% similarity for review
- How would you handle a fork bomb?
  - PID limit in cgroup (max 10 processes)
  - Seccomp blocking fork() syscall
  - Timeout as last resort
- How would you support custom test case input?
  - "Run code" mode with user-provided input
  - Separate pool with shorter limits
  - Rate limited per user
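The per-user rate limit for "Run code" mode could be implemented in several ways; a token bucket is one common choice, sketched below. The capacity and refill rate are placeholders, `TokenBucket` is a hypothetical class, and the clock is passed in explicitly so the behavior is deterministic. A production version would likely keep this state in a shared store such as Redis rather than in process memory.

```typescript
// Token bucket: allows short bursts up to `capacity`, then a steady
// rate of `refillPerSecond` requests.
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start full
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes a token if the request is allowed at time `nowMs`.
  tryConsume(nowMs: number): boolean {
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}

// One bucket per user; the limits here are illustrative only.
const buckets = new Map<string, TokenBucket>();

function allowRun(userId: string, nowMs: number): boolean {
  let bucket = buckets.get(userId);
  if (!bucket) {
    bucket = new TokenBucket(2, 1, nowMs); // burst of 2, then 1 run per second
    buckets.set(userId, bucket);
  }
  return bucket.tryConsume(nowMs);
}
```

Compared with a fixed-window counter, the bucket tolerates a quick edit-run-edit-run burst while still bounding sustained load on the separate "Run code" pool.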