Meta's system design interviews follow a clear pattern: interviewers look for well-reasoned data consistency, low latency, and features that match the product experience. Whether the question is framed as a social, content-distribution, or storage system, the discussion revolves around three dimensions: scale, consistency trade-offs, and latency budget.
This article breaks down Meta's three most frequent system design questions and gives a reusable review framework.
The Right Way to Work Through a Question
- Define the core interaction: state the system's core user path in one sentence
- Estimate capacity: QPS, storage, and latency budgets
- Draw data and control flow: write path and read path, identifying boundaries between cache, database, and messaging
- Expand on trade-offs: go deep on one or two, e.g., read-fanout vs write-fanout
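The capacity-estimation step is mostly quick arithmetic. A minimal sketch, assuming purely illustrative inputs: the daily active users, writes and reads per user, payload size, and peak factor below are made-up round numbers chosen to keep the math easy, not Meta figures, and `estimate_capacity` is a hypothetical helper.

```python
SECONDS_PER_DAY = 86_400

def estimate_capacity(dau, writes_per_user, reads_per_user, bytes_per_write, peak_factor=2):
    """Back-of-envelope QPS and storage estimate from daily totals."""
    avg_write_qps = dau * writes_per_user / SECONDS_PER_DAY
    avg_read_qps = dau * reads_per_user / SECONDS_PER_DAY
    return {
        "avg_write_qps": avg_write_qps,
        "avg_read_qps": avg_read_qps,
        "peak_read_qps": avg_read_qps * peak_factor,
        "storage_bytes_per_day": dau * writes_per_user * bytes_per_write,
    }

# Hypothetical inputs, picked only so the division comes out round.
est = estimate_capacity(
    dau=86_400_000, writes_per_user=1, reads_per_user=10, bytes_per_write=1_000
)
print(est)
```

The read-to-write ratio that falls out of this estimate is what usually motivates the caching and fanout decisions in the later steps.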
Question 1: News Feed (Fanout vs Real-Time Trade-Off)
News Feed is a classic, frequently asked Meta question centered on the trade-off between fanout cost and freshness. At the product layer, a single post from a popular account can require updating millions of followers' timelines; at the system layer, that is one write triggering a massive number of downstream writes.
Push + Pull Hybrid Model
Meta's design tradition is a push + pull hybrid:
- High-follower accounts (e.g., celebrities): skip per-follower fanout on write and store the post only in the global post store; on read, the follower's newsfeed service lazily pulls and merges these posts
- Regular users: fan out directly to followers' timelines on write (write-fanout), so reads hit the precomputed timeline directly
def deliver_post(post, author):
    """Pick a delivery mode for a new post; helper names are placeholders."""
    save_to_global_store(post)  # every post lands in the global post store
    if author.follower_count > CELEBRITY_THRESHOLD:
        # celebrity: store globally only, followers pull on read
        return "pull"
    # regular user: fan out to followers' timelines on write
    for fan in author.followers:
        timeline_cache[fan].append(post.id)  # assumes a dict of lists
    return "push"
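The read path of the hybrid model can be pictured as a merge of the two sources. A minimal sketch, assuming in-memory dictionaries stand in for the timeline cache and the global post store, and that post ids increase with creation time; `read_feed` and its parameters are hypothetical names, not part of the code above.

```python
import heapq
from itertools import islice

def read_feed(user_id, timeline_cache, celebrity_posts, followed_celebrities, limit=10):
    """Merge pushed post ids with post ids pulled from followed celebrities.

    timeline_cache:  user_id -> post ids pushed on write, newest first
    celebrity_posts: celebrity_id -> that account's post ids, newest first
    Assumes post ids increase with creation time, so a larger id is newer.
    """
    sources = [timeline_cache.get(user_id, [])]
    sources += [celebrity_posts.get(c, []) for c in followed_celebrities.get(user_id, [])]
    # Each source is already sorted newest first, so a k-way merge is enough.
    merged = heapq.merge(*sources, reverse=True)
    return list(islice(merged, limit))

timeline_cache = {"alice": [105, 101]}
celebrity_posts = {"star": [104, 102], "band": [103]}
followed_celebrities = {"alice": ["star", "band"]}
feed = read_feed("alice", timeline_cache, celebrity_posts, followed_celebrities, limit=4)
print(feed)
```

The cost of this design shows up here: every read pays for one lookup per followed celebrity, which is why the threshold for switching modes is worth discussing.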
Bonus: Decoupling the Ranker Pipeline
Emphasize ranking pipeline decoupling: a fast ranking model runs at the serving layer while a complex ML model updates features and weights offline. Showing this demonstrates depth of understanding of production-scale systems.
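One way to picture the decoupling: an offline job writes per-post features and model weights to a store, and the serving path only does a cheap lookup and a weighted sum. A minimal sketch, assuming a linear scorer and hypothetical feature names (`affinity`, `recency`); the actual features and models are not described in this article.

```python
from dataclasses import dataclass, field

@dataclass
class FeatureStore:
    """Stand-in for a store refreshed by an offline pipeline."""
    weights: dict = field(default_factory=dict)        # feature name -> weight
    post_features: dict = field(default_factory=dict)  # post id -> {feature: value}

    def offline_refresh(self, weights, post_features):
        # In production this would be a batch or streaming job, not a method call.
        self.weights = dict(weights)
        self.post_features = dict(post_features)

def rank_at_serving(candidate_ids, store):
    """Fast path: score with precomputed features only, no heavy model call."""
    def score(post_id):
        feats = store.post_features.get(post_id, {})
        return sum(store.weights.get(name, 0.0) * value for name, value in feats.items())
    return sorted(candidate_ids, key=score, reverse=True)

store = FeatureStore()
store.offline_refresh(
    weights={"affinity": 2.0, "recency": 1.0},
    post_features={
        1: {"affinity": 0.1, "recency": 0.9},
        2: {"affinity": 0.8, "recency": 0.2},
        3: {"affinity": 0.5, "recency": 0.5},
    },
)
ranked = rank_at_serving([1, 2, 3], store)
print(ranked)
```

Because the serving path never waits on the heavy model, its latency budget stays independent of how expensive the offline training and feature computation become.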
Question 2: Messenger / Chat (Message Consistency and Delivery Semantics)
Chat questions focus on message consistency and delivery semantics. Meta's messaging uses a multi-device synchronization model, requiring messages to display in consistent time order across devices while keeping "delivered" and "read" states accurate.
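Consistent ordering across devices is easier to argue when the server, rather than each client's clock, assigns the order. A minimal sketch, assuming a single in-memory conversation log with server-assigned sequence numbers, one cursor per device, and "delivered" and "read" tracked as watermarks that only move forward; every class and method name here is hypothetical.

```python
class Conversation:
    """In-memory stand-in for one conversation's ordered message log."""

    def __init__(self):
        self.log = []             # (seq, sender, text) in server-assigned order
        self.device_cursor = {}   # device id -> last seq that device has synced
        self.delivered_upto = {}  # user id -> highest seq delivered to any device
        self.read_upto = {}       # user id -> highest seq the user has read

    def send(self, sender, text):
        seq = len(self.log) + 1   # the server assigns the order
        self.log.append((seq, sender, text))
        return seq

    def sync(self, user, device):
        """Return messages this device has not seen, in sequence order."""
        cursor = self.device_cursor.get(device, 0)
        new = [m for m in self.log if m[0] > cursor]
        if new:
            self.device_cursor[device] = new[-1][0]
            self.delivered_upto[user] = max(self.delivered_upto.get(user, 0), new[-1][0])
        return new

    def mark_read(self, user, seq):
        # Watermarks never move backwards, so a late receipt cannot undo a newer one.
        self.read_upto[user] = max(self.read_upto.get(user, 0), seq)

conv = Conversation()
conv.send("alice", "hi")
conv.send("alice", "are you there?")
phone = conv.sync("bob", "bob-phone")
conv.send("alice", "ping")
laptop = conv.sync("bob", "bob-laptop")
phone_again = conv.sync("bob", "bob-phone")
conv.mark_read("bob", 3)
conv.mark_read("bob", 1)  # stale receipt, ignored
```

Each device sees the same sequence regardless of when it connects, and the read state is a single number per user rather than a flag per message.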
Store-and-Forward for Offline Messages
Interviewers often probe how offline messages are handled. The core is the store-and-forward mechanism:
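- When the recipient is offline, the delivery service stashes the message in Redis or a persistent queue; the recipient's offset (the sequence number of the last message delivered) is left unchanged
- When the recipient reconnects, the client pulls every message after that offset and the offset advances, so nothing is skipped and nothing is delivered twice
def deliver_message(msg, recipient):
    if is_online(recipient):
        push_to_device(recipient, msg)
        update_offset(recipient, msg.seq)  # simplification: a push counts as delivered
    else:
        # offline: stash only; the offset stays at the last delivered seq
        offline_queue[recipient].append(msg)

def on_reconnect(recipient):
    last = get_offset(recipient)
    pending = [m for m in offline_queue[recipient] if m.seq > last]
    if pending:
        # advance so a second pull returns nothing; a production system
        # would advance only after the client acknowledges receipt
        update_offset(recipient, max(m.seq for m in pending))
    return pending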
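Because a retried push or a repeated pull can hand the client the same message twice, the no-duplication guarantee is often backed by a check on the receiving side as well. A minimal sketch, assuming messages carry a per-recipient sequence number and arrive in order (gaps and reordering would need buffering, which is omitted); `ClientInbox` is a hypothetical name.

```python
class ClientInbox:
    """Receiver that tolerates redelivery by tracking the highest applied seq."""

    def __init__(self):
        self.last_applied = 0
        self.messages = []

    def receive(self, seq, text):
        """Apply a message once; return True if it was new."""
        if seq <= self.last_applied:
            return False  # duplicate from a retry or a repeated pull
        self.messages.append((seq, text))
        self.last_applied = seq
        return True

inbox = ClientInbox()
incoming = [(1, "hi"), (2, "yo"), (2, "yo"), (3, "ok"), (1, "hi")]
results = [inbox.receive(seq, text) for seq, text in incoming]
print(results)
```

With this in place the server can afford at-least-once delivery, which is much simpler to build than exactly-once delivery.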
Finally, discuss end-to-end encryption and multi-device key sync to show attention to privacy and security.
Question 3: Flash Sale / Seat Reservation (Distributed Lock Against Double Booking)
This question often appears in Meta's infrastructure or commerce team interviews, focusing on distributed lock design to prevent double booking. Model the system as a "scarce resource allocation problem": multiple users request the same seat/room simultaneously, and the system must keep uniqueness under high concurrency.
Reservation Token + TTL-Based Lock
The standard approach is a reservation token + TTL lock:
- When the user clicks to reserve, the system generates a token with an expiry, writing it to Redis as the unique holder of seat_id
- On confirmed purchase, upgrade the lock to a persistent state in the database; otherwise the TTL expires and releases automatically
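import redis

r = redis.Redis(decode_responses=True)  # return str values rather than bytes

def try_reserve(seat_id, user_id, ttl=120):
    # SET NX EX: set only if the key is absent, with a TTL, ensuring a unique holder
    ok = r.set(f"lock:{seat_id}", user_id, nx=True, ex=ttl)
    return bool(ok)

def confirm(seat_id, user_id):
    holder = r.get(f"lock:{seat_id}")
    if holder == str(user_id):
        # The lock can still expire between this check and the write, so
        # persist_booking should also enforce uniqueness (e.g., a UNIQUE constraint)
        persist_booking(seat_id, user_id)  # upgrade to persistent state
        return True
    return False  # lock expired or held by another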
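The "upgrade to a persistent state" step is where double booking is ultimately prevented, because a TTL lock can expire between the holder check and the database write. A minimal sketch, assuming a relational store with a uniqueness constraint on the seat; it uses an in-memory SQLite database and a hypothetical `bookings` table purely for illustration, and this `persist_booking` takes an extra connection argument.

```python
import sqlite3

def make_db():
    conn = sqlite3.connect(":memory:")
    # PRIMARY KEY on seat_id means at most one booking row per seat.
    conn.execute("CREATE TABLE bookings (seat_id TEXT PRIMARY KEY, user_id TEXT NOT NULL)")
    return conn

def persist_booking(conn, seat_id, user_id):
    """Insert the booking; return False if the seat is already taken."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO bookings (seat_id, user_id) VALUES (?, ?)",
                (seat_id, user_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = make_db()
first = persist_booking(conn, "A1", "alice")
second = persist_booking(conn, "A1", "bob")  # loses even if its lock check passed
owner = conn.execute(
    "SELECT user_id FROM bookings WHERE seat_id = ?", ("A1",)
).fetchone()[0]
print(first, second, owner)
```

The Redis lock then acts as a fast filter that keeps most contention away from the database, while the constraint is the final arbiter.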
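If the interviewer probes cross-partition consistency, explain that sharding by seat_id with consistent hashing routes all contention for a given seat to a single shard. Then mention Redlock for acquiring the lock across several independent Redis nodes, and a monotonically increasing fencing token, checked by the storage layer, to reject stale writes after a lock expires.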
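The fencing-token idea can be shown without any real lock server. A minimal sketch, assuming an in-memory lock service that hands out strictly increasing tokens and a storage layer that remembers the highest token it has accepted per seat; `LockService` and `SeatStore` are hypothetical names, and TTL handling is left out.

```python
import itertools

class LockService:
    """Hands out locks with a strictly increasing fencing token."""

    def __init__(self):
        self._counter = itertools.count(1)

    def acquire(self, resource):
        # TTL handling is omitted; only the token sequence matters here.
        return next(self._counter)

class SeatStore:
    """Storage layer that rejects writes carrying an older token than it has seen."""

    def __init__(self):
        self.highest_token = {}  # seat id -> highest token accepted so far
        self.owner = {}          # seat id -> user id

    def write(self, seat_id, user_id, token):
        if token < self.highest_token.get(seat_id, 0):
            return False  # stale holder: its lock expired and a newer holder wrote
        self.highest_token[seat_id] = token
        self.owner[seat_id] = user_id
        return True

locks = LockService()
store = SeatStore()
token_a = locks.acquire("A1")  # client A gets token 1, then stalls past its TTL
token_b = locks.acquire("A1")  # the lock has expired, so client B gets token 2
b_ok = store.write("A1", "bob", token_b)
a_ok = store.write("A1", "alice", token_a)  # late write with the old token
print(b_ok, a_ok, store.owner["A1"])
```

The key point for the interview is that the check lives in the storage layer, so correctness no longer depends on clients noticing that their lock has expired.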
Three-Dimension Review Cheat Sheet
| Dimension | News Feed | Chat | Reservation |
|---|---|---|---|
| Scale | Celebrity fanout amplification | Multi-device sync | High-concurrency contention |
| Consistency | Eventual (timeline) | Ordered + no loss/dup | Strong (unique holder) |
| Latency | Millisecond read path | Real-time delivery | Low-latency lock acquisition |
FAQ
Q1: What does Meta system design value most? Three dimensions: scale, consistency (trade-offs), and latency (budget). Interviewers do not care how pretty your diagram is but whether you can go deep on one or two key trade-offs.
Q2: Should the News Feed question use push or pull? A hybrid is best: write-fanout (push) for regular users, read-fanout (pull) for celebrity accounts to avoid write amplification. Articulating the threshold and reasoning for switching scores points.
Q3: How do you handle offline messages in the Chat question? Use store-and-forward: while the recipient is offline, stash messages in Redis or a persistent queue and keep the offset of the last delivered message; on reconnect, pull everything after that offset so nothing is lost or duplicated. Adding multi-device sync and E2E encryption makes the answer more complete.
Q4: How do you prevent double booking in the flash-sale/reservation question? Reservation token + TTL lock: use Redis SET NX for a unique holder, upgrade to persistent state on confirmation, and let the TTL release automatically if unconfirmed. Cross-partition: consistent-hash sharding or fencing tokens.
Q5: How do you finish a Meta system design in 45 minutes? Follow four steps: define the core interaction in one sentence, estimate capacity, draw read/write paths, then go deep on one or two trade-offs. When short on time, cut secondary features and reserve depth for the core trade-off.