(a) CRDT for canvas objects. Each shape/text/drawing is a CRDT register with fields: {id, type, position, content, z_index, style, deleted}. The key insight: each field is an independent last-writer-wins (LWW) register, timestamped with a Lamport clock.
-- CRDT object structure
{
id: "obj_abc123", -- globally unique (UUID)
type: "rect",
position: LWW{x: 100, y: 200, ts: 42, author: "user_A"},
size: LWW{w: 300, h: 150, ts: 42, author: "user_A"},
content: LWW{text: "Hello", ts: 40, author: "user_B"},
z_index: LWW{val: 5, ts: 38, author: "user_A"},
deleted: LWW{val: false, ts: 35, author: "user_A"}
}
-- Merge rule: for each field, keep the value with highest timestamp
-- Concurrent creates: both objects appear (add-wins set)
-- Concurrent moves: last-writer-wins on position (acceptable for whiteboard)
The board itself is an add-wins set (OR-Set) of objects. Creating an object adds it to the set. Deleting sets deleted: true (tombstone). Concurrent create + delete: create wins (add-wins semantics). This guarantees no data loss.
Why LWW works for whiteboards but not for text. In a text editor, concurrent inserts at the same position must both appear (interleaving). LWW would lose one insert — unacceptable. But for spatial objects, if two users move the same rectangle simultaneously, having it end up at one position or the other is fine — the "loser" just moves it again. This is why whiteboards use LWW registers while Google Docs uses sequence CRDTs (RGA, Yjs).
Lamport clocks for ordering. Each client maintains a logical clock. On every operation, increment the clock. When receiving a remote operation, set clock = max(local, remote) + 1. This ensures a total order across all operations without relying on wall-clock time (which can skew between clients).
(b) Spatial indexing with R-tree. A board with 100K objects cannot send all of them to every client on connect. Solution: server maintains an R-tree (rectangle tree) index of all object bounding boxes.
-- Viewport query
client_viewport = {x: 500, y: 200, width: 1200, height: 800}
visible_objects = rtree.query(client_viewport)
-- Returns ~200 objects that intersect the viewport
-- Client only receives and renders these
-- On pan/zoom: client sends new viewport
-- Server diffs: new_objects = query(new_vp) - query(old_vp)
-- Only send the delta — objects entering the viewport
R-tree updates on every object move/create/delete. For 100K objects, R-tree query is O(log N + k) where k is the result count. Sub-millisecond for typical viewports.
Viewport subscription model. Each WebSocket connection registers its current viewport with the server. When any object changes, the server checks which viewports intersect the object's bounding box and broadcasts only to those connections. This turns a naive O(users x objects) broadcast into O(affected_users) per operation.
-- Viewport subscription (server-side)
on_object_change(obj):
affected_viewports = rtree_viewport_index.query(obj.bounding_box)
for viewport in affected_viewports:
viewport.connection.send(obj.crdt_delta)
on_viewport_change(connection, new_viewport):
leaving = rtree.query(old_viewport) - rtree.query(new_viewport)
entering = rtree.query(new_viewport) - rtree.query(old_viewport)
connection.send({remove: leaving.ids, add: entering.full_objects})
(c) Cursor broadcasting. Each user's cursor position is sent via WebSocket at ~15 Hz (every ~67 ms). The server does not broadcast to all users on the board — only to users whose viewports overlap with the cursor position. If user A's cursor is at (5000, 3000) and user B is viewing (0, 0) to (1200, 800), user B does not receive A's cursor.
Client-side cursor interpolation. At 15 Hz, raw cursor positions appear jumpy. The client interpolates between received positions using cubic Bezier curves, making the remote cursor appear smooth at 60 fps despite only receiving 15 updates/sec. This is the same technique used in multiplayer games for entity interpolation.
-- Client-side cursor interpolation
on_cursor_update(remote_user, new_pos):
remote_user.prev_pos = remote_user.target_pos
remote_user.target_pos = new_pos
remote_user.interp_start = now()
remote_user.interp_duration = 67ms -- 1/15 Hz
on_render_frame(): -- called at 60 fps
for user in remote_users:
t = (now() - user.interp_start) / user.interp_duration
t = clamp(t, 0, 1)
draw_cursor(lerp(user.prev_pos, user.target_pos, t))
(d) Per-user undo. Global undo (undo the last operation on the board) is a disaster in multiplayer: you undo someone else's work. Per-user undo: each user has a stack of their own operations. Undoing operation A means applying A-inverse.
-- Per-user undo stack
user_A.undo_stack = [
{op: "move obj_1 from (0,0) to (100,200)", inverse: "move obj_1 to (0,0)"},
{op: "create obj_2", inverse: "delete obj_2"},
{op: "change obj_1 color to red", inverse: "change obj_1 color to blue"}
]
-- Undo: pop stack, apply inverse operation
-- Inverse is itself a valid CRDT operation — it merges normally
-- Other users see the undo as a regular edit
Real-Time Collaboration SequenceMermaid
sequenceDiagram
participant A as User A (Client)
participant WS as WebSocket Server
participant CR as CRDT Engine
participant SI as Spatial Index
participant B as User B (Client)
A->>WS: op: move obj_1 to (100,200)
WS->>CR: merge CRDT operation
CR->>SI: update R-tree for obj_1
CR-->>WS: merged state
WS->>B: broadcast op (viewport overlap)
B->>B: apply op locally
A->>WS: cursor at (150,220)
WS->>WS: check viewport overlap
WS->>B: cursor A at (150,220)
B->>WS: op: create obj_2 at (300,400)
WS->>CR: merge CRDT operation
CR->>SI: insert obj_2 into R-tree
WS->>A: broadcast op (viewport overlap)
Offline sync protocol. When a client goes offline, it continues applying operations locally to its CRDT state. Operations are buffered in an ordered queue. On reconnect:
- Client sends its last-known server timestamp (vector clock)
- Server sends all operations since that timestamp (delta sync)
- Client merges server delta into local state (CRDT merge — automatic, no conflicts)
- Client sends its buffered offline operations to server
- Server merges offline ops and broadcasts to other clients
Because CRDTs are commutative and idempotent, the order of merge does not matter. The final state is identical regardless of whether operations arrive in-order or out-of-order. This is the fundamental advantage over OT.
Interview answer
"Each canvas object is a CRDT with LWW registers per field — concurrent edits merge automatically, no central coordinator needed. The board is an add-wins set (OR-Set) so concurrent creates both appear. Spatial indexing via R-tree means clients only receive the ~200 objects in their viewport, not all 100K. Cursors broadcast at 15 Hz only to users with overlapping viewports. Undo is per-user: each operation has an inverse, and applying the inverse is itself a valid CRDT op. Offline edits buffer locally and merge on reconnect — same CRDT semantics, just delayed."