# SnowFlake 统一配额服务系统设计题——高频系统设计面试(类似 Google/Apple/Microsoft 云存储)
题目描述
Companies like Google, Apple, Microsoft, etc. offer various services such as drive,
photos, mail, and documents. Typically, they provide users with a single shared quota.
For example, if you purchase a 100GB storage plan, all these services will consume
capacity from the same quota bucket.
Under the hood, such systems usually look like this:
+--------------------------+
| QUOTA SERVICE |
+--------------------------+
/ | \
/ | \
S1 Storage S2 Storage ...
Applications and services consume from the shared quota through this centralized service.
Your task: Design a quota service that supports multiple independent services (S1, S2, …),
all sharing the same quota limit, ensuring consistency, concurrency safety, and accuracy.
问题分析
这是一道典型的统一存储配额(Quota)系统设计题,类似 Google Drive、iCloud、OneDrive 的后端模式。多个服务(例如相册、文档、邮箱等)共享同一个配额,系统需要做到:
- ✅ 跨服务的总配额一致性
- ✅ 并发写入不重复计数、不卡死
- ✅ 对每个服务的独立计数
- ✅ 面向客户端的查询与写入 API
- ✅ 高并发、高扩展
这道题考察的是你对分布式计数器、事务、锁机制、配额检查流程以及系统扩展性的理解,是高频的系统设计面试题。
核心需求
功能需求
- 配额查询:用户可以查询当前已使用和剩余配额
- 配额消费:各服务上传文件时消费配额
- 配额释放:删除文件时释放配额
- 配额限制检查:上传前检查是否超额
- 服务级别统计:每个服务的使用量统计
非功能需求
- 一致性:配额计数必须准确,不能超卖
- 并发安全:多个服务同时写入不冲突
- 高可用:服务不能单点故障
- 低延迟:配额检查要快(< 100ms)
- 可扩展:支持百万级用户
系统设计
架构图
┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ Drive │ │ Photos │ │ Mail │
│ Service │ │ Service │ │ Service │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
└───────────────────┼───────────────────┘
│
┌──────▼──────┐
│ Quota │
│ Service │
└──────┬──────┘
│
┌────────────┼────────────┐
│ │ │
┌──────▼──────┐ ┌──▼───┐ ┌─────▼─────┐
│ Quota DB │ │Cache │ │ Metrics │
│ (Primary) │ │Redis │ │ Service │
└─────────────┘ └──────┘ └───────────┘
数据模型
用户配额表(User_Quota)
CREATE TABLE user_quota (
user_id BIGINT PRIMARY KEY,
total_quota BIGINT NOT NULL, -- 总配额(字节)
used_quota BIGINT NOT NULL DEFAULT 0, -- 已使用(字节)
version INT NOT NULL DEFAULT 0, -- 乐观锁版本号
updated_at TIMESTAMP NOT NULL,
INDEX idx_user_id (user_id)
);
服务使用明细表(Service_Usage)
CREATE TABLE service_usage (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
user_id BIGINT NOT NULL,
service_name VARCHAR(50) NOT NULL, -- drive, photos, mail
used_quota BIGINT NOT NULL DEFAULT 0,
updated_at TIMESTAMP NOT NULL,
UNIQUE KEY uk_user_service (user_id, service_name),
INDEX idx_user_id (user_id)
);
操作日志表(Quota_Log)
CREATE TABLE quota_log (
id BIGINT PRIMARY KEY AUTO_INCREMENT,
user_id BIGINT NOT NULL,
service_name VARCHAR(50) NOT NULL,
operation VARCHAR(20) NOT NULL, -- ADD, RELEASE
delta BIGINT NOT NULL, -- 变化量
before_quota BIGINT NOT NULL,
after_quota BIGINT NOT NULL,
created_at TIMESTAMP NOT NULL,
INDEX idx_user_time (user_id, created_at)
);
核心算法
1. 配额消费(带乐观锁)
def consume_quota(user_id: int, service: str, size: int) -> bool:
"""
消费配额,使用乐观锁保证并发安全
"""
max_retries = 3
for attempt in range(max_retries):
# 1. 读取当前配额
quota = db.query(
"SELECT total_quota, used_quota, version "
"FROM user_quota WHERE user_id = %s",
user_id
)
# 2. 检查是否超额
if quota.used_quota + size > quota.total_quota:
return False # 配额不足
# 3. 尝试更新(乐观锁)
affected = db.execute(
"UPDATE user_quota "
"SET used_quota = used_quota + %s, "
" version = version + 1, "
" updated_at = NOW() "
"WHERE user_id = %s AND version = %s",
size, user_id, quota.version
)
if affected > 0:
# 4. 更新服务级别统计
db.execute(
"INSERT INTO service_usage (user_id, service_name, used_quota) "
"VALUES (%s, %s, %s) "
"ON DUPLICATE KEY UPDATE used_quota = used_quota + %s",
user_id, service, size, size
)
# 5. 记录日志
log_quota_change(user_id, service, 'ADD', size,
quota.used_quota, quota.used_quota + size)
# 6. 更新缓存
cache.delete(f"quota:{user_id}")
return True
# 版本冲突,重试
time.sleep(0.01 * (2 ** attempt)) # 指数退避
return False # 重试失败
2. 配额释放
def release_quota(user_id: int, service: str, size: int) -> bool:
"""
释放配额
"""
# 使用类似的乐观锁机制
max_retries = 3
for attempt in range(max_retries):
quota = db.query(
"SELECT used_quota, version FROM user_quota WHERE user_id = %s",
user_id
)
# 更新总配额
affected = db.execute(
"UPDATE user_quota "
"SET used_quota = GREATEST(used_quota - %s, 0), "
" version = version + 1, "
" updated_at = NOW() "
"WHERE user_id = %s AND version = %s",
size, user_id, quota.version
)
if affected > 0:
# 更新服务统计
db.execute(
"UPDATE service_usage "
"SET used_quota = GREATEST(used_quota - %s, 0) "
"WHERE user_id = %s AND service_name = %s",
size, user_id, service
)
log_quota_change(user_id, service, 'RELEASE', -size,
quota.used_quota, max(0, quota.used_quota - size))
cache.delete(f"quota:{user_id}")
return True
time.sleep(0.01 * (2 ** attempt))
return False
3. 配额查询(带缓存)
def get_quota(user_id: int) -> dict:
"""
查询配额,使用缓存加速
"""
# 1. 尝试从缓存获取
cache_key = f"quota:{user_id}"
cached = cache.get(cache_key)
if cached:
return json.loads(cached)
# 2. 从数据库查询
quota = db.query(
"SELECT total_quota, used_quota FROM user_quota WHERE user_id = %s",
user_id
)
# 3. 查询服务级别统计
services = db.query_all(
"SELECT service_name, used_quota FROM service_usage WHERE user_id = %s",
user_id
)
result = {
'total': quota.total_quota,
'used': quota.used_quota,
'available': quota.total_quota - quota.used_quota,
'services': {s.service_name: s.used_quota for s in services}
}
# 4. 写入缓存(TTL 60秒)
cache.setex(cache_key, 60, json.dumps(result))
return result
优化策略
1. 缓存策略
- 读缓存:配额查询结果缓存 60 秒
- 写穿透:更新配额时删除缓存
- 预热:高频用户配额预加载
2. 数据库优化
- 分库分表:按 user_id 哈希分片
- 读写分离:查询走从库,写入走主库
- 索引优化:user_id 上建立索引
3. 并发控制
- 乐观锁:使用 version 字段避免超卖
- 重试机制:冲突时指数退避重试
- 限流:单用户 QPS 限制
扩展性考虑
水平扩展
# 按 user_id 分片
def get_shard_id(user_id: int, num_shards: int) -> int:
return user_id % num_shards
# 路由到对应分片
def route_to_shard(user_id: int):
shard_id = get_shard_id(user_id, NUM_SHARDS)
return db_connections[shard_id]
异步处理
- 配额日志异步写入 Kafka
- 统计数据异步聚合
- 告警通知异步发送
总结
Snowflake 的配额系统设计题考察点:
- 并发控制:乐观锁 vs 悲观锁
- 一致性保证:事务、版本号
- 性能优化:缓存、分片、索引
- 可扩展性:水平扩展、异步处理
- 监控告警:配额使用率、异常检测
oavoservice 专注于 Snowflake / Google / Amazon 等大厂面试辅导,提供 OA 代做、VO 实时辅助等服务。如需帮助,欢迎联系我们。
需要面试真题? 立刻联系微信 Coding0201,获得真题。