
This Google VO question combines system design thinking with concrete implementation. It comes from real scenarios in log systems, stream processing pipelines, and distributed replication modules — all contexts where UDP's lack of ordering guarantees is a real challenge.
Problem Statement
UDP does not guarantee delivery order. An upstream sender continuously sends data fragments, each as (data, seq_num). Packets may arrive out of order, be duplicated, or be lost entirely.
Implement a Sequencer component on the receiver side that:
- Outputs data strictly in ascending
seq_numorder, with no gaps. - If a middle sequence number has not arrived yet, buffer subsequent packets and wait until the gap is filled before outputting.
Approach
Core data structures: HashMap + next_seq pointer
- Use a
dictto buffer received-but-not-yet-outputtable fragments:key = seq_num,value = data. - Maintain a variable
next_seq— the sequence number we expect to output next.
On receiving each new fragment:
- Store
(seq_num, data)in the buffer (deduplicate: ignore if already seen or already output). - Check if
next_seqis in the buffer. - If yes: output it, remove from buffer, increment
next_seq, repeat step 2. - If no: wait — do not output anything.
Why this works: HashMap gives O(1) lookup. Each packet is stored and retrieved exactly once. No sorting of the buffer is needed.
Code
class Sequencer:
def __init__(self):
self.buffer = {} # seq_num -> data
self.next_seq = 0 # next sequence number to output
self.output = [] # ordered output so far
def receive(self, seq_num: int, data: str):
"""Receive a data fragment."""
# Deduplicate: ignore already-output or already-buffered
if seq_num < self.next_seq or seq_num in self.buffer:
return
self.buffer[seq_num] = data
# Flush as many in-order packets as possible
while self.next_seq in self.buffer:
self.output.append(self.buffer.pop(self.next_seq))
self.next_seq += 1
def get_output(self):
return self.output
Dry Run
Arrival order: (2, "C") → (0, "A") → (1, "B") → (3, "D")
| Call | buffer after store | next_seq | Flush action | output |
|---|---|---|---|---|
| receive(2,"C") | {2:"C"} | 0 | 0 not in buffer, stop | [] |
| receive(0,"A") | {2:"C",0:"A"} | 0 | output "A", next=1; 1 not in buffer, stop | ["A"] |
| receive(1,"B") | {2:"C",1:"B"} | 1 | output "B", next=2; output "C", next=3; 3 not in buffer | ["A","B","C"] |
| receive(3,"D") | {3:"D"} | 3 | output "D", next=4 | ["A","B","C","D"] |
Final ordered output: A → B → C → D ✓
Complexity Analysis
- Time: O(1) per
receivecall (amortized) — each packet is stored once and flushed once. - Space: O(k) where k is the number of out-of-order packets currently buffered (worst case O(n)).
Follow-up: Packet Loss Handling Strategies
The interviewer will likely ask: what if a seq_num never arrives?
Strategy 1: Timeout + Retransmission Request (ARQ)
Track how long we've been waiting for next_seq. If it exceeds a threshold, send a retransmission request (NACK) to the sender.
import time
class SequencerWithTimeout(Sequencer):
def __init__(self, timeout=5.0):
super().__init__()
self.wait_start = None
self.timeout = timeout
def receive(self, seq_num, data):
super().receive(seq_num, data)
if self.next_seq not in self.buffer:
if self.wait_start is None:
self.wait_start = time.time()
else:
self.wait_start = None
def check_timeout(self):
"""Returns seq_num to retransmit, or None."""
if self.wait_start and time.time() - self.wait_start > self.timeout:
return self.next_seq
return None
Strategy 2: Skip and Forward
For latency-sensitive applications (live streaming, real-time audio/video), waiting is unacceptable. Skip the missing sequence number, insert a placeholder, and advance next_seq to the next available buffered packet.
def skip_missing(self):
"""Skip the missing seq and continue from next available."""
if not self.buffer:
return
next_available = min(self.buffer.keys())
self.output.append(None) # placeholder for lost packet
self.next_seq = next_available
while self.next_seq in self.buffer:
self.output.append(self.buffer.pop(self.next_seq))
self.next_seq += 1
Strategy Comparison
| Strategy | Best for | Pros | Cons |
|---|---|---|---|
| Timeout + ARQ | Log systems, replication | Data completeness | Added latency |
| Skip & Forward | Live streaming, real-time | Low latency | Data loss accepted |
Extended Discussion Points
- Duplicates: handled by
if seq_num < self.next_seq or seq_num in self.buffer. - Sliding window: add a maximum buffer size to prevent unbounded memory growth — analogous to TCP's receive window.
- Thread safety: in production,
receiveandcheck_timeoutrun on different threads — protectbufferandnext_seqwith a lock.
💬 Need VO Support?
Real-time VO assistance for Google, Amazon, TikTok, Microsoft, Meta, Uber, and more.
WeChat: Coding0201
📬 Contact
Email: [email protected]
Telegram: @OAVOProxy