Whiteboard exercise. Try the problem cold, then reveal the rubric to self-score.
Out of 10 points · 45 min whiteboard
01
Prompt
Users submit code solutions in 15+ languages. The system compiles each submission, runs it against hidden test cases under strict time and memory limits, and returns a verdict, all within seconds. The hard parts are sandboxed execution that prevents user code from escaping or consuming unbounded resources, a judge queue that handles 10K concurrent contest submissions without starving practice users, and plagiarism detection across millions of historical submissions. LeetCode processes ~10M submissions/day.
Time budget: 45 min whiteboard. Draw architecture, estimate numbers, discuss tradeoffs.
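For the "estimate numbers" part, a back-of-envelope calculation might look like the sketch below. Only the 10M submissions/day figure comes from the prompt; the peak factor, per-submission judging time, and target utilization are assumptions chosen for illustration, and you should swap in whatever numbers you can defend at the whiteboard.

```python
import math

SUBMISSIONS_PER_DAY = 10_000_000   # figure quoted in the prompt
SECONDS_PER_DAY = 24 * 60 * 60     # 86,400

# Assumptions for illustration only (not from the prompt).
PEAK_FACTOR = 5            # assumed peak-to-average traffic ratio
JUDGE_SECONDS = 3.0        # assumed compile + run time per submission
TARGET_UTILIZATION = 0.7   # assumed headroom so the queue stays short


def average_rate(per_day: int) -> float:
    """Average submissions per second over a day."""
    return per_day / SECONDS_PER_DAY


def workers_needed(rate_per_s: float, seconds_each: float, utilization: float) -> int:
    """Little's law: busy workers = arrival rate x service time,
    then divide by the target utilization to leave headroom."""
    return math.ceil(rate_per_s * seconds_each / utilization)


avg = average_rate(SUBMISSIONS_PER_DAY)
peak = avg * PEAK_FACTOR
print(f"average: {avg:.0f} submissions/s")
print(f"assumed peak: {peak:.0f} submissions/s")
print(f"workers at peak: {workers_needed(peak, JUDGE_SECONDS, TARGET_UTILIZATION)}")
```

Under these assumptions the average is roughly 116 submissions/s and the peak needs on the order of 2,500 single-submission workers. The point is the method (rate times service time), not the specific constants.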
02
Hints (progressive)
Hint 1
Lead with the sandbox. "We're running arbitrary untrusted code — isolation is the #1 concern." Then name gVisor or Firecracker.
Hint 2
Priority queue for contest vs practice. A single queue is wrong. Name the priority lanes explicitly.
Hint 3
Stream test cases, don't batch-load. Large inputs (10M-node graph) can't be pre-loaded into memory. Stream from S3 per case.
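One way to make the priority lanes from Hint 2 concrete is a weighted scheduler: contest submissions go first, but after a fixed number of consecutive contest dispatches the practice lane gets a turn, so it is never starved. This is a minimal in-memory sketch; the class name `LaneScheduler`, the lane names, and the 4:1 weight are hypothetical, and a real system would more likely put the lanes in a message broker.

```python
from collections import deque
from typing import Optional


class LaneScheduler:
    """Two FIFO lanes with a weighted dispatch rule (hypothetical sketch)."""

    def __init__(self, contest_weight: int = 4) -> None:
        self.lanes = {"contest": deque(), "practice": deque()}
        self.contest_weight = contest_weight  # assumed ratio, contest : practice
        self._contest_streak = 0              # consecutive contest dispatches

    def submit(self, lane: str, submission_id: str) -> None:
        self.lanes[lane].append(submission_id)

    def next(self) -> Optional[str]:
        """Return the next submission to judge, or None if both lanes are empty."""
        contest = self.lanes["contest"]
        practice = self.lanes["practice"]
        practice_turn = self._contest_streak >= self.contest_weight
        # Practice runs when it has earned its turn or when contest is idle.
        if practice and (practice_turn or not contest):
            self._contest_streak = 0
            return practice.popleft()
        if contest:
            self._contest_streak += 1
            return contest.popleft()
        return None


scheduler = LaneScheduler(contest_weight=4)
for i in range(1, 7):
    scheduler.submit("contest", f"c{i}")
scheduler.submit("practice", "p1")
scheduler.submit("practice", "p2")

order = []
while (item := scheduler.next()) is not None:
    order.append(item)
print(order)
```

With six contest and two practice submissions queued, the first practice submission is dispatched after four contest ones rather than waiting for the contest lane to drain. Strict priority (always contest first) is simpler but is exactly what starves practice users during a contest.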
03
Rubric — 10 points
+2 Lead with the sandbox. "We're running arbitrary untrusted code — isolation is the #1 concern." Then name gVisor or Firecracker.
+2 Priority queue for contest vs practice. A single queue is wrong. Name the priority lanes explicitly.
+2 Stream test cases, don't batch-load. Large inputs (10M-node graph) can't be pre-loaded into memory. Stream from S3 per case.
+2 Autoscale workers, not the API. The API is lightweight; the workers are the bottleneck. Scale workers on queue depth.
+2 Plagiarism is offline, not real-time. Too expensive to run during contest. Batch after. MOSS / AST fingerprint.
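To illustrate the AST-fingerprint idea from the last rubric item, the sketch below hashes a Python submission's syntax tree after erasing identifier names, so renaming variables or editing comments does not change the fingerprint. This is not MOSS (which uses a different, fuzzier matching scheme); it is a deliberately simple exact-match stand-in, and `fingerprint` and `_EraseNames` are hypothetical names. It would run in the offline batch after the contest, grouping submissions by fingerprint.

```python
import ast
import hashlib


class _EraseNames(ast.NodeTransformer):
    """Replace identifiers with a placeholder so renames do not matter."""

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = "_"
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = "_"
        node.annotation = None  # ignore type hints as well
        return node

    def visit_FunctionDef(self, node: ast.FunctionDef) -> ast.FunctionDef:
        self.generic_visit(node)  # normalize the body and arguments first
        node.name = "_"
        return node


def fingerprint(source: str) -> str:
    """Hash of the name-erased syntax tree. Comments and whitespace never
    reach the AST, and ast.dump omits line numbers by default."""
    tree = _EraseNames().visit(ast.parse(source))
    return hashlib.sha256(ast.dump(tree).encode()).hexdigest()


original = "def add(x, y):\n    total = x + y\n    return total\n"
renamed = "def plus(a, b):  # copied\n    s = a + b\n    return s\n"
different = "def add(x, y):\n    return x * y\n"

print(fingerprint(original) == fingerprint(renamed))
print(fingerprint(original) == fingerprint(different))
```

Erasing every name is coarse: it also hides which built-ins are called, and any structural change (reordering statements, adding dead code) defeats an exact hash. That limitation is why production tools compare many small fingerprints per submission rather than one hash of the whole tree.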
Self-score: tally the points you would have mentioned unprompted. 7+ is interview-ready on this problem.
04
Red flags (things that tank the interview)
Run user code directly on the host (no sandbox)
Load all 200 test cases into memory before running
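The alternative to loading every test case up front can be sketched as a judge loop that touches one case at a time: the input file is handed to the process as stdin, the output goes to a file, and the comparison reads both files in chunks. A local directory of `NAME.in` / `NAME.out` files stands in for S3 here, and `iter_cases`, `outputs_match`, and `judge` are hypothetical names. The sketch does no sandboxing; in a real judge the command would run inside the sandbox.

```python
import subprocess
from pathlib import Path
from typing import Iterator, List, Tuple


def iter_cases(case_dir: Path) -> Iterator[Tuple[str, Path, Path]]:
    """Yield (name, input path, expected-output path) lazily, one case at a time."""
    for input_path in sorted(case_dir.glob("*.in")):
        yield input_path.stem, input_path, input_path.with_suffix(".out")


def outputs_match(actual: Path, expected: Path, chunk_size: int = 64 * 1024) -> bool:
    """Compare two files chunk by chunk so neither is fully loaded into memory."""
    with open(actual, "rb") as a, open(expected, "rb") as e:
        while True:
            chunk_a, chunk_e = a.read(chunk_size), e.read(chunk_size)
            if chunk_a != chunk_e:
                return False
            if not chunk_a:  # both files ended at the same point
                return True


def judge(cmd: List[str], case_dir: Path, work_dir: Path, time_limit_s: float = 2.0) -> str:
    """Run cmd against each case in order and stop at the first failure."""
    for name, input_path, expected_path in iter_cases(case_dir):
        actual_path = work_dir / f"{name}.actual"
        # The child process reads the input file directly; the judge never buffers it.
        with open(input_path, "rb") as stdin, open(actual_path, "wb") as stdout:
            try:
                subprocess.run(cmd, stdin=stdin, stdout=stdout,
                               timeout=time_limit_s, check=True)
            except subprocess.TimeoutExpired:
                return f"Time Limit Exceeded on {name}"
            except subprocess.CalledProcessError:
                return f"Runtime Error on {name}"
        if not outputs_match(actual_path, expected_path):
            return f"Wrong Answer on {name}"
    return "Accepted"
```

Stopping at the first failing case also means later (often larger) inputs are never fetched for a submission that has already failed.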