⚙️ Software · ✍️ Khoa · 📅 20/04/2026 · ☕ 10 min read

Advanced System Design Trade-offs: When "It depends" Is the Right Answer

At the Senior level, you design systems. At the Staff level, you defend trade-offs and navigate constraints that a Senior engineer does not see. This is where battle-tested experience makes the difference.

System design is not about drawing the most elegant architecture. It is about choosing the architecture that best fits the specific constraints: team size, timeline, budget, traffic pattern, and consistency requirements.

1. Back-of-Envelope Estimation: Mental Math the Google Way

1.1 Why do you need it?

Interviewer: "Design a URL shortener"
You: "Use MySQL"
Interviewer: "OK, how many requests can MySQL handle?"
You: "Uh..."

Estimation helps you:
  → Choose the right technology (Redis vs MySQL vs DynamoDB)
  → Size infrastructure (how many servers? how much storage?)
  → Identify bottlenecks BEFORE you build
  → Prove feasibility to stakeholders

1.2 Magic Numbers to Remember

Latency:
  L1 cache:              1 ns
  L2 cache:              4 ns
  RAM access:            100 ns
  SSD random read:       16 µs
  HDD random read:       2 ms (~100x slower than SSD)
  Round trip same DC:    0.5 ms
  Round trip US→EU:      75 ms

Throughput:
  MySQL:     ~5K-10K QPS (single node, depends on query)
  PostgreSQL: ~5K-15K QPS
  Redis:      ~100K-500K QPS (single node)
  Kafka:      ~100K-1M msg/s (per broker, depends heavily on message size and batching)

Storage:
  1 character:   1 byte (ASCII) or 1-4 bytes (UTF-8)
  1 UUID:        36 bytes (string) or 16 bytes (binary)
  1 timestamp:   8 bytes
  1 email:       ~30 bytes average
  1 tweet:       ~300 bytes (140 chars + metadata)
  1 image:       ~300 KB (compressed JPEG)
  1 video minute: ~50 MB (720p)

Scale:
  1 million = 10^6 ≈ 1 MB of data at 1 byte per record
  1 billion = 10^9 ≈ 1 GB of data at 1 byte per record
  1 day = 86,400 seconds ≈ 100K seconds (easy to remember)

1.3 Estimation Framework

Example: URL Shortener

Step 1: Traffic estimation
  → 100M URLs created/month = ~40 URLs/second (write)
  → Read:write ratio = 100:1
  → Read: 4000 URLs/second (read)
  → Peak: 3x average = 12K QPS read

Step 2: Storage estimation
  → URL record: short_url(7B) + long_url(200B) + created(8B) ≈ 250B
  → 100M/month × 12 months × 5 years = 6B records
  → 6B × 250B = 1.5 TB
  → With indexes: ~2-3 TB total

Step 3: Bandwidth
  → Read: 4000 QPS × 250B = 1 MB/s
  → Manageable for a single server

Step 4: Technology decision
  → 12K QPS read → MySQL is OK (with a cache)
  → Or Redis as the cache layer: 12K QPS is trivial for Redis
  → 3 TB storage → a single MySQL server is enough (partition later)

Conclusion: a single MySQL node plus a Redis cache. Simple. Enough for 5 years.
Don't over-engineer.
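
The four steps above can be written down as a small calculation. This is a minimal sketch that reuses the numbers from the walkthrough; the 30-day month and the names Estimate and estimate are assumptions made for illustration.

```go
package main

import "fmt"

// Inputs taken from the URL shortener walkthrough.
const (
	urlsPerMonth    = 100_000_000 // new short URLs per month
	readWriteRatio  = 100         // reads per write
	peakFactor      = 3           // peak traffic relative to average
	recordBytes     = 250         // approximate size of one URL record
	retentionYears  = 5
	secondsPerMonth = 30 * 86_400 // assumption: 30-day month
)

// Estimate holds the derived back-of-envelope numbers.
type Estimate struct {
	WriteQPS    float64
	ReadQPS     float64
	PeakReadQPS float64
	StorageTB   float64
	ReadMBps    float64
}

// estimate derives traffic, storage, and bandwidth from the inputs above.
func estimate() Estimate {
	writeQPS := float64(urlsPerMonth) / float64(secondsPerMonth)
	readQPS := writeQPS * readWriteRatio
	records := float64(urlsPerMonth) * 12 * retentionYears
	return Estimate{
		WriteQPS:    writeQPS,
		ReadQPS:     readQPS,
		PeakReadQPS: readQPS * peakFactor,
		StorageTB:   records * recordBytes / 1e12,
		ReadMBps:    readQPS * recordBytes / 1e6,
	}
}

func main() {
	e := estimate()
	fmt.Printf("write QPS:     %.0f\n", e.WriteQPS)
	fmt.Printf("read QPS:      %.0f\n", e.ReadQPS)
	fmt.Printf("peak read QPS: %.0f\n", e.PeakReadQPS)
	fmt.Printf("storage:       %.1f TB\n", e.StorageTB)
	fmt.Printf("read traffic:  %.2f MB/s\n", e.ReadMBps)
}
```

The exact results (about 39 writes/s and 11.6K peak reads/s) are slightly below the rounded figures in the walkthrough, which is expected: the point of the exercise is the order of magnitude.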

2. Consistency Models — Deep Dive

2.1 Spectrum of Consistency

Strong ←────────────────────────────────→ Weak

Linearizable → Sequential → Causal → Eventual
    ↑              ↑           ↑          ↑
 Banking      Inventory    Social     Analytics
 Payment      Stock level  Feed       Page views

2.2 When to use which?

Linearizability (strongest):
  Every read sees the most recent write, as if there were a single copy of the data.

  Use cases:
    → Bank balance (must never go negative)
    → Distributed lock (only one holder)
    → Leader election

  Trade-off: SLOW (requires coordination), low throughput
  Tools: etcd, ZooKeeper, CockroachDB (serializable)

Causal consistency:
  Causally related operations are seen in the same order by everyone.
  Concurrent operations may be seen in different orders.

  Use cases:
    → Social media: a reply must appear AFTER the original post
    → Chat: message order within a conversation
    → Document editing (Google Docs style)

  Trade-off: stronger than eventual, cheaper than linearizable

Eventual consistency:
  All replicas WILL converge... eventually.

  Use cases:
    → View count, like count (being off for a few seconds is OK)
    → DNS propagation
    → Shopping cart (Amazon famously uses it)
    → CDN content

  Trade-off: Fast and highly available, but a client may briefly
  fail to read its own write.

Trick: Eventual + "read-your-writes" consistency
  → The client always sees ITS OWN writes
  → Implement: read from the primary right after a write, or use sticky sessions
  → Best of both worlds for many use cases
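
One way to get read-your-writes on top of replicas is to pin a user to the primary for a short window after each write. The sketch below is an assumption-laden illustration: Router, pinWindow, and the 5-second window are hypothetical, and the window is assumed to be longer than the replication lag.

```go
package main

import (
	"fmt"
	"time"
)

// Router decides whether a read may go to a replica or must hit the primary.
// It is not safe for concurrent use; a real version would guard the map.
type Router struct {
	pinWindow time.Duration        // assumed to exceed replication lag
	lastWrite map[string]time.Time // userID -> time of most recent write
}

func NewRouter(pinWindow time.Duration) *Router {
	return &Router{pinWindow: pinWindow, lastWrite: make(map[string]time.Time)}
}

// RecordWrite notes that userID wrote at time now.
func (r *Router) RecordWrite(userID string, now time.Time) {
	r.lastWrite[userID] = now
}

// Target returns "primary" or "replica" for a read by userID at time now.
func (r *Router) Target(userID string, now time.Time) string {
	if t, ok := r.lastWrite[userID]; ok && now.Sub(t) < r.pinWindow {
		return "primary"
	}
	return "replica"
}

// example walks through one write followed by three reads.
func example() [3]string {
	r := NewRouter(5 * time.Second)
	t0 := time.Date(2026, 1, 1, 0, 0, 0, 0, time.UTC)
	r.RecordWrite("alice", t0)
	return [3]string{
		r.Target("alice", t0.Add(1*time.Second)),  // inside the pin window
		r.Target("alice", t0.Add(10*time.Second)), // window has expired
		r.Target("bob", t0.Add(1*time.Second)),    // bob never wrote
	}
}

func main() {
	fmt.Println(example()) // [primary replica replica]
}
```

Only the writer pays the cost of a primary read; everyone else keeps reading from replicas.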

3. Multi-Region Architecture

3.1 Patterns

Active-Passive (simple):
  ┌─────────────┐                 ┌─────────────┐
  │  Region A   │   async repl    │  Region B   │
  │  (active)   │ ───────────────→│  (standby)  │
  │ All traffic │                 │ No traffic  │
  └─────────────┘                 └─────────────┘
  
  Pros: Simple, strong consistency within region
  Cons: Failover = downtime (minutes), wasted standby capacity
  Use when: DR requirement, RPO minutes OK

Active-Active (complex):
  ┌──────────┐                ┌──────────┐
  │ Region A │  ←── sync ───→ │ Region B │
  │ Serves   │     or         │ Serves   │
  │ traffic  │    async       │ traffic  │
  └──────────┘                └──────────┘
  
  Pros: No failover downtime, utilize all capacity, low latency
  Cons: Conflict resolution, complex, eventual consistency
  Use when: Global users, ultra-high availability (99.99%+)

3.2 Data Strategies for Multi-Region

Strategy 1: Shard by geography
  → Users in VN → data in the Singapore region
  → Users in the US → data in the US region
  → Each user belongs to one "home region"
  → Cross-region reads for social features
  → Simplest option, avoids conflicts

Strategy 2: Conflict-free Replicated Data Types (CRDTs)
  → Data structures that merge concurrent updates automatically
  → Counter: increments in regions A and B → merge = sum
  → Set: add in A, add in B → merge = union
  → Shopping cart (Amazon-style)
  → Use when: every region needs to write the same data
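
The counter case can be illustrated with a grow-only counter (G-Counter), one of the simplest CRDTs. In this sketch each region owns one slot; the counter's value is the sum of the slots, while merging two replicas takes the per-slot maximum. The type and method names are illustrative, not taken from any particular library.

```go
package main

import "fmt"

// GCounter is a grow-only counter: one slot per region, and each region
// increments only its own slot.
type GCounter map[string]uint64

// Inc adds n to the slot owned by region.
func (c GCounter) Inc(region string, n uint64) { c[region] += n }

// Value is the sum over all regions.
func (c GCounter) Value() uint64 {
	var total uint64
	for _, v := range c {
		total += v
	}
	return total
}

// Merge takes the per-region maximum. That makes it commutative,
// associative, and idempotent, so replicas converge regardless of how
// often or in which order they exchange state.
func (c GCounter) Merge(other GCounter) {
	for region, v := range other {
		if v > c[region] {
			c[region] = v
		}
	}
}

func main() {
	a, b := GCounter{}, GCounter{}
	a.Inc("A", 3) // three likes recorded in region A
	b.Inc("B", 2) // two likes recorded in region B

	a.Merge(b)
	b.Merge(a)
	a.Merge(b) // merging again changes nothing

	fmt.Println(a.Value(), b.Value()) // 5 5
}
```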

Strategy 3: Last-Write-Wins (LWW)
  → Every write carries a timestamp; the latest one wins
  → Simple but can LOSE DATA
  → Use only when data loss is acceptable
  → Profile updates, preferences
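
To make the data-loss risk concrete, here is a minimal last-write-wins register. The Region tie-breaker and the field names are assumptions added for the sketch; real systems also have to deal with clock skew between writers, which this example ignores.

```go
package main

import "fmt"

// LWWRegister keeps one value plus the timestamp of the write that set it.
type LWWRegister struct {
	Value     string
	Timestamp int64  // e.g. Unix milliseconds from the writer's clock
	Region    string // tie-breaker when timestamps are equal
}

// Merge returns the winning state of two replicas. Because the rule is
// deterministic and symmetric, both regions pick the same winner.
func Merge(a, b LWWRegister) LWWRegister {
	if b.Timestamp > a.Timestamp || (b.Timestamp == a.Timestamp && b.Region > a.Region) {
		return b
	}
	return a
}

func main() {
	// Two concurrent updates to the same preferences record.
	sg := LWWRegister{Value: "theme=dark", Timestamp: 1000, Region: "sg"}
	us := LWWRegister{Value: "lang=vi", Timestamp: 1001, Region: "us"}

	merged := Merge(sg, us)
	fmt.Println(merged.Value) // lang=vi (the theme change is silently dropped)
}
```

The write from "sg" is not rejected or reported; it simply disappears, which is why LWW fits only data where that is acceptable.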

3.3 Geo-Routing

           ┌─── VN users ──→ SG Region
           │
  DNS/LB ──┼─── EU users ──→ EU Region  
           │
           └─── US users ──→ US Region

Options:
  → DNS-based (Route 53 Geo): Simple, failover is delayed by roughly the DNS TTL
  → Anycast (Cloudflare): Fastest, IP-level routing
  → Application-level: a token/cookie carries a region hint

4. Large-Scale Migration Strategies

4.1 Strangler Fig Pattern

Instead of rewriting everything at once (Big Bang = Big Risk):

Phase 1:  [Old System] ← 100% traffic

Phase 2:  [Old System] ← 90% traffic
          [New System] ← 10% traffic (1 feature)

Phase 3:  [Old System] ← 50%
          [New System] ← 50%

Phase 4:  [New System] ← 100% traffic
          [Old System] ← shutdown 🎉

The name comes from the strangler fig: a vine that wraps around its host tree
and gradually replaces it entirely. There is no need to cut down the old tree.

4.2 Zero-Downtime Database Migration

Step 1: Dual-write
  App writes → Old DB + New DB (async or via CDC)
  App reads  → Old DB only

Step 2: Shadow read
  App writes → Old DB + New DB
  App reads  → Old DB + New DB (compare results, log diff)
  Fix inconsistencies

Step 3: Switch read
  App writes → Old DB + New DB  
  App reads  → New DB (fall back to Old DB on error)

Step 4: Stop dual-write
  App writes → New DB only
  App reads  → New DB only
  Keep Old DB read-only (rollback safety, 1-2 weeks)

Step 5: Decommission
  Drop Old DB connection, shutdown

Every step has a rollback plan. If step 3 fails, go back to step 2.
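
The shadow-read step (step 2) might look like the sketch below. Store, MapStore, and ShadowReader are hypothetical names, and in-memory maps stand in for the two databases. A production version would likely run the second read asynchronously and emit a metric instead of bumping a counter.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical minimal read interface implemented by both databases.
type Store interface {
	Get(key string) (string, error)
}

var ErrNotFound = errors.New("not found")

// MapStore is an in-memory stand-in for a real database client.
type MapStore map[string]string

func (m MapStore) Get(key string) (string, error) {
	v, ok := m[key]
	if !ok {
		return "", ErrNotFound
	}
	return v, nil
}

// ShadowReader serves reads from the old store and compares them with the new one.
type ShadowReader struct {
	Old, New   Store
	Mismatches int // stand-in for a metric plus a diff log
}

// Get always returns the old store's answer; the new store is only observed.
func (s *ShadowReader) Get(key string) (string, error) {
	oldVal, oldErr := s.Old.Get(key)
	newVal, newErr := s.New.Get(key)
	if oldVal != newVal || (oldErr == nil) != (newErr == nil) {
		s.Mismatches++
	}
	return oldVal, oldErr
}

func main() {
	r := &ShadowReader{
		Old: MapStore{"u1": "alice", "u2": "bob"},
		New: MapStore{"u1": "alice", "u2": "bobby"}, // u2 drifted during dual-write
	}
	for _, key := range []string{"u1", "u2", "u3"} {
		v, err := r.Get(key)
		fmt.Println(key, v, err)
	}
	fmt.Println("mismatches:", r.Mismatches) // mismatches: 1
}
```

Callers never see the new database's answer during this step, so a bug in the new path costs nothing but a logged diff.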

4.3 Feature Flags for Migration

// A feature flag controls the migration.
// featureflag, metrics, and the two profile services are assumed to be defined elsewhere.
func GetUserProfile(ctx context.Context, userID string) (*Profile, error) {
    if featureflag.IsEnabled("use-new-profile-service", userID) {
        // New service: users are migrated gradually
        profile, err := newProfileService.Get(ctx, userID)
        if err != nil {
            // Fall back to the old service if the new one fails
            metrics.Inc("new_profile_fallback")
            return oldProfileService.Get(ctx, userID)
        }
        return profile, nil
    }
    return oldProfileService.Get(ctx, userID)
}
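
The snippet above does not show how a per-user flag decides who is enabled. A common approach is a percentage rollout based on a stable hash of the user ID. The sketch below is one possible implementation; inRollout and its signature are assumptions, not the API of any particular feature-flag library.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// inRollout reports whether userID falls inside the first percent of 100
// buckets for the given flag. Hashing flag+userID gives every user a stable
// bucket per flag, so a user does not flip between the old and new service
// from one request to the next.
func inRollout(flag, userID string, percent uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(flag + ":" + userID)) // Write on a hash.Hash never returns an error
	return h.Sum32()%100 < percent
}

func main() {
	const flag = "use-new-profile-service"
	enabled := 0
	for i := 0; i < 10000; i++ {
		if inRollout(flag, fmt.Sprintf("user-%d", i), 10) {
			enabled++
		}
	}
	fmt.Printf("%d of 10000 users enabled at 10%%\n", enabled)
}
```

Raising the percentage only adds users: anyone enabled at 10% stays enabled at 50%, which keeps the migration monotonic and makes rollback a matter of lowering one number.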

5. Data Modeling Decisions

5.1 Decision Matrix

Relational (PostgreSQL, MySQL):
  ✅ Complex queries, JOINs, transactions
  ✅ ACID compliance
  ✅ Well-understood, mature tooling
  ❌ Horizontal scaling is hard (sharding complexity)
  → Use for: Core business data, financial transactions

Document (MongoDB, DynamoDB):
  ✅ Flexible schema, easy horizontal scaling
  ✅ Good for hierarchical data
  ❌ No JOINs (denormalize or join in the application)
  ❌ Reads may be eventually consistent by default (e.g. DynamoDB)
  → Use for: Product catalogs, user profiles, content

Columnar (ClickHouse, BigQuery):
  ✅ Blazing fast aggregation queries
  ✅ Excellent compression
  ❌ Slow single-row operations
  ❌ Not for OLTP
  → Use for: Analytics, time-series, logs

Graph (Neo4j, Neptune):
  ✅ Relationship-heavy queries (friends of friends)
  ✅ Pattern matching
  ❌ Niche, smaller ecosystem
  → Use for: Social networks, recommendation, fraud detection

Time-series (TimescaleDB, InfluxDB):
  ✅ Optimized for time-ordered data
  ✅ Built-in downsampling, retention policies
  → Use for: Metrics, IoT sensor data, stock prices

6. API Design at Scale

6.1 Backward Compatibility — Hyrum's Law

Hyrum's Law: "With enough users, every observable
behavior of an API will be depended on by somebody."

Consequences:
  → An error message changes → clients that parse error messages break
  → Response field order changes → someone is comparing raw JSON strings
  → A new field is added → clients with strict deserialization fail

Principles:
  → Additive changes are usually OK: add a field, add an endpoint
  → Breaking changes REQUIRE versioning
  → Document the contract, not just the behavior
  → Breaking change = new version + deprecation notice + sunset date

6.2 Pagination at Scale

Offset-based (simple, flawed):
  GET /items?offset=1000&limit=20
  
  ❌ Large offset → slow query (MySQL: OFFSET 1000000 reads and discards 1M rows)
  ❌ Data changes between pages → missed/duplicated items

Cursor-based (scalable):
  GET /items?cursor=eyJpZCI6MTAwMH0&limit=20
  
  Response:
  {
    "items": [...],
    "next_cursor": "eyJpZCI6MTAyMH0",
    "has_more": true
  }
  
  ✅ Query time stays flat regardless of page depth (index seek instead of scanning the offset)
  ✅ No duplicate/missing items when rows change between pages
  ❌ Cannot jump to page N (but most UIs don't need to)

Keyset-based (best for sorted data):
  GET /items?created_after=2024-03-15T10:00:00Z&limit=20
  WHERE created_at > '2024-03-15T10:00:00Z'
  ORDER BY created_at ASC LIMIT 20
  
  ✅ Uses the index efficiently
  ✅ Consistent performance
  ⚠️ Add a unique tiebreaker (e.g. id) to the sort key if created_at values can collide
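
The cursor values in the cursor-based example decode to small JSON objects: eyJpZCI6MTAwMH0 is unpadded base64 for {"id":1000}. A minimal sketch of encoding and decoding such a cursor follows; the cursor struct, the function names, and the SQL in pageQuery (table and column names included) are assumptions for illustration.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// cursor is the opaque position handed to clients: the id of the last item seen.
type cursor struct {
	ID int64 `json:"id"`
}

// encodeCursor serializes the position as unpadded URL-safe base64 JSON.
func encodeCursor(lastID int64) (string, error) {
	raw, err := json.Marshal(cursor{ID: lastID})
	if err != nil {
		return "", err
	}
	return base64.RawURLEncoding.EncodeToString(raw), nil
}

// decodeCursor reverses encodeCursor and rejects malformed input.
func decodeCursor(s string) (int64, error) {
	raw, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return 0, fmt.Errorf("bad cursor encoding: %w", err)
	}
	var c cursor
	if err := json.Unmarshal(raw, &c); err != nil {
		return 0, fmt.Errorf("bad cursor payload: %w", err)
	}
	return c.ID, nil
}

// pageQuery is the kind of SQL a handler might run with the decoded id and the limit.
const pageQuery = `SELECT id, name FROM items WHERE id > ? ORDER BY id ASC LIMIT ?`

func main() {
	c, _ := encodeCursor(1000)
	fmt.Println(c) // eyJpZCI6MTAwMH0

	id, err := decodeCursor("eyJpZCI6MTAyMH0")
	fmt.Println(id, err) // 1020 <nil>

	fmt.Println(pageQuery)
}
```

Keeping the cursor opaque lets the server change what it encodes (for example adding a timestamp) without breaking clients, as long as old cursors are still accepted for a while.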

6.3 Deprecation Strategy

Lifecycle:
  ACTIVE → DEPRECATED → SUNSET → REMOVED

  DEPRECATED (6 months):
    → Add header: Deprecation: true (RFC 9745 later standardized a timestamp value in place of true)
    → Add header: Sunset: Sat, 01 Mar 2025 00:00:00 GMT
    → Log usage to track migration progress
    → Email/notify consumers

  SUNSET (3 months):
    → Return a warning in the response body
    → Throttle the deprecated endpoint (lower its rate limit)
    → Contact the remaining consumers directly

  REMOVED:
    → Return 410 Gone with a message pointing to the migration guide
    → After 1 month → 404
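
The DEPRECATED phase maps naturally onto HTTP middleware. The sketch below uses only the Go standard library; the deprecated and probe functions and the /v1/items path are hypothetical. It sends Deprecation: true as written in the lifecycle above; swap in a timestamp value to follow RFC 9745.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// deprecated wraps a handler and announces its retirement on every response.
func deprecated(sunset time.Time, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Deprecation", "true")
		// Sunset carries an HTTP-date; http.TimeFormat produces that layout.
		w.Header().Set("Sunset", sunset.UTC().Format(http.TimeFormat))
		next.ServeHTTP(w, r)
	})
}

// probe sends one request through the wrapped handler and returns both headers.
func probe() (string, string) {
	sunset := time.Date(2025, time.March, 1, 0, 0, 0, 0, time.UTC)
	legacy := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "legacy response")
	})
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/v1/items", nil)
	deprecated(sunset, legacy).ServeHTTP(rec, req)
	return rec.Header().Get("Deprecation"), rec.Header().Get("Sunset")
}

func main() {
	dep, sunset := probe()
	fmt.Println("Deprecation:", dep)
	fmt.Println("Sunset:", sunset)
}
```

The same wrapper is a convenient place to log the caller's identity, which covers the "log usage to track migration progress" step.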

7. Capacity Planning

7.1 Framework

Step 1: Current baseline
  → Current QPS, storage, compute usage
  → Growth rate (monthly/quarterly)

Step 2: Project forward
  → "If we grow 3x in 12 months, what do we need?"
  → Compute: CPU cores, memory
  → Storage: disk, backup
  → Network: bandwidth, connections

Step 3: Identify cliff
  → "At what level does the system break?"
  → Database: max connections, disk IOPS
  → Application: memory limit, CPU saturation
  → Network: bandwidth limit

Step 4: Plan ahead
  → Buffer: plan for 2x expected peak
  → Lead time: ordering hardware/instances
  → Scaling strategy: horizontal vs vertical
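
Steps 2 to 4 can be turned into a rough "months of runway" calculation. In this sketch the growth target comes from the text (3x in 12 months), while the current peak of 4,000 QPS and the cliff at 9,000 QPS are made-up inputs; growth is assumed to compound monthly.

```go
package main

import (
	"fmt"
	"math"
)

// monthlyGrowth converts "grow by factor over n months" into a compound monthly rate.
func monthlyGrowth(factor float64, months int) float64 {
	return math.Pow(factor, 1/float64(months))
}

// monthsUntil returns how many whole months pass before load, multiplied by
// rate each month, first exceeds limit. It returns 0 if load is already over
// the limit and -1 if load never gets there.
func monthsUntil(load, limit, rate float64) int {
	if load > limit {
		return 0
	}
	if rate <= 1 || load <= 0 {
		return -1
	}
	months := 0
	for load <= limit {
		load *= rate
		months++
	}
	return months
}

func main() {
	rate := monthlyGrowth(3, 12)          // "3x in 12 months"
	const current, cliff = 4000.0, 9000.0 // hypothetical peak QPS and breaking point

	fmt.Printf("monthly growth: %.1f%%\n", (rate-1)*100)
	fmt.Println("months until the cliff:", monthsUntil(current, cliff, rate))
	// With a 2x buffer, action is due once load passes half the cliff.
	fmt.Println("months until the 2x buffer is gone:", monthsUntil(current, cliff/2, rate))
}
```

With these inputs the cliff is 9 months away but the 2x buffer is used up in 2, which is the number to compare against hardware or migration lead time.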

7.2 Cost-Aware Architecture

At the Staff level, you have to think about $$ too:

  "Solution A: 99.99% availability, $50K/month"
  "Solution B: 99.9% availability, $8K/month"

  Difference: 0.09% = ~39 more minutes of downtime per month
  Savings: $42K/month = $504K/year

  Business question: does 39 minutes of downtime per month
  cost more than $504K/year?

→ E-commerce at $1M revenue/day: ~$27K of direct sales lost per month (~$324K/year) at an even spread, and more once peak-hour outages, churn, and SLA penalties count. Likely YES, choose A
→ Internal tool: NO, choose B
→ This is how a Staff Engineer thinks.
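
The comparison can be checked with a few lines of arithmetic. The sketch assumes a 30-day month and revenue spread evenly over the day, which is a simplification: under it, the direct loss for a $1M/day business comes out below the $504K savings, so the argument for Solution A rests on factors the code does not model (outages at peak hours, churn, SLA penalties).

```go
package main

import "fmt"

const minutesPerMonth = 30 * 24 * 60 // assumption: 30-day month

// downtimeMinutes returns the monthly downtime allowed by an availability target.
func downtimeMinutes(availabilityPct float64) float64 {
	return (100 - availabilityPct) / 100 * minutesPerMonth
}

// yearlyLoss estimates direct revenue lost to extra downtime, assuming
// revenue is spread evenly across every minute of the day.
func yearlyLoss(revenuePerDay, extraMinutesPerMonth float64) float64 {
	perMinute := revenuePerDay / (24 * 60)
	return perMinute * extraMinutesPerMonth * 12
}

func main() {
	extra := downtimeMinutes(99.9) - downtimeMinutes(99.99)
	savings := (50_000.0 - 8_000.0) * 12
	loss := yearlyLoss(1_000_000, extra)

	fmt.Printf("extra downtime with B: %.1f min/month\n", extra)
	fmt.Printf("savings with B:        $%.0f/year\n", savings)
	fmt.Printf("direct revenue loss:   $%.0f/year\n", loss)
}
```

Plugging in a peak-hour multiplier or a per-incident churn estimate is the natural next step; the decision flips as soon as those push the loss past the savings.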

8. System Design Interview — Staff Level Framework

Framework (35-40 minutes):

  1. Requirements (5 min): DRIVE the conversation
     → Functional: "What does the system need to do?"
     → Non-functional: Scale? Latency? Consistency?
     → Constraints: "Existing infra? Team size?"
     → Do NOT assume. ASK

  2. High-level Design (5 min)
     → Sketch components on the whiteboard
     → "There are 2 approaches; I'll explain the trade-offs..."
     → Pick an approach, explain WHY

  3. Deep Dive (15-20 min)
     → The interviewer picks the area to focus on
     → Data model, API design, scaling strategy
     → Talk TRADE-OFFS, not just "I use X"

  4. Scale & Edge Cases (5 min)
     → "If traffic grows 100x, where is the bottleneck?"
     → "Failure mode: what if the DB goes down?"
     → Monitoring & alerting strategy

  Staff signals interviewers look for:
    ✅ Clarify before designing (no assumptions)
    ✅ State the trade-offs of EVERY decision
    ✅ Acknowledge uncertainty: "I'd need to benchmark, but..."
    ✅ Think about cost, team, timeline, not just tech
    ✅ Listen & adjust when the interviewer pushes back
    ✅ Drive the conversation instead of waiting for questions

💡 Remember: Staff-level system design = "I chose X over Y because Z, and here's when we'd reconsider." Not "Use Kafka because it's good." 🎯