⚙️ Software · ✍️ Khoa · 📅 20/04/2026 · ☕ 10-minute read

Advanced System Design Trade-offs: When "It depends" Is the Right Answer

At the Senior level, you design systems. At the Staff level, you defend trade-offs and navigate constraints that a Senior engineer does not see. This is where battle-tested experience makes the difference.

System design is not about drawing the most elegant architecture; it is about choosing the architecture that best fits the specific constraints: team size, timeline, budget, traffic pattern, consistency requirements.


1. Back-of-Envelope Estimation: Mental Math the Google Way

1.1 Why do you need it?

Interviewer: "Design a URL shortener"
You: "Use MySQL"
Interviewer: "OK, how many requests can MySQL handle?"
You: "Uh..."

Estimation helps you:
  → Choose the right technology (Redis vs MySQL vs DynamoDB)
  → Size the infrastructure (how many servers? how much storage?)
  → Identify the bottleneck BEFORE you build
  → Prove feasibility to stakeholders

1.2 Magic numbers worth memorizing

Latency:
  L1 cache:              1 ns
  L2 cache:              4 ns
  RAM access:            100 ns
  SSD random read:       16 µs
  HDD random read:       2 ms (~100x slower than SSD)
  Round trip same DC:    0.5 ms
  Round trip US→EU:      75 ms

Throughput:
  MySQL:     ~5K-10K QPS (single node, depends on query)
  PostgreSQL: ~5K-15K QPS
  Redis:      ~100K-500K QPS (single node)
  Kafka:      ~100K-1M msg/s (per partition)

Storage:
  1 character:   1 byte (ASCII) or 1-4 bytes (UTF-8)
  1 UUID:        36 bytes (string) or 16 bytes (binary)
  1 timestamp:   8 bytes
  1 email:       ~30 bytes average
  1 tweet:       ~300 bytes (140 chars + metadata)
  1 image:       ~300 KB (compressed JPEG)
  1 video minute: ~50 MB (720p)

Scale:
  1 million = 10^6 ≈ 1 MB of data at 1 byte per record
  1 billion = 10^9 ≈ 1 GB at 1 byte per record
  1 day = 86,400 seconds ≈ 100K seconds (easy to remember)

1.3 Estimation Framework

Example: URL shortener

Step 1: Traffic estimation
  → 100M URLs created/month ≈ 40 URLs/second (write)
  → Read:write ratio = 100:1
  → Read: ~4,000 URLs/second
  → Peak: 3x average = 12K QPS read

Step 2: Storage estimation
  → URL record: short_url(7B) + long_url(200B) + created(8B) = 215B, rounded up to ~250B for overhead
  → 100M/month × 12 months × 5 years = 6B records
  → 6B × 250B = 1.5 TB
  → With indexes: ~2-3 TB total

Step 3: Bandwidth
  → Read: 4,000 QPS × 250B = 1 MB/s
  → Manageable for a single server

Step 4: Technology decision
  → 12K QPS read → MySQL is OK (with a cache)
  → Or Redis as the cache layer: 12K QPS is trivial for Redis
  → 3 TB storage → a single MySQL server is enough (partition later)

Conclusion: a single MySQL instance plus a Redis cache. Simple. Good enough for 5 years.
Do not over-engineer.
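
The four steps above can be reproduced as a short calculation. This is a minimal sketch: the `Estimate` type, the function name and the 30-day month are assumptions made for illustration, and the exact results (about 39 writes/s, 3,858 reads/s) are what the text rounds to 40 and 4,000.

```go
package main

import "fmt"

const secondsPerMonth = 30 * 86_400 // 30-day month, an approximation

// Estimate holds the back-of-envelope results for the URL shortener example.
type Estimate struct {
	WriteQPS    float64 // average writes per second
	ReadQPS     float64 // average reads per second
	PeakReadQPS float64 // reads per second at peak
	StorageTB   float64 // raw storage without indexes, decimal TB
	ReadMBps    float64 // average read bandwidth, decimal MB/s
}

// EstimateURLShortener repeats the arithmetic of steps 1-3.
func EstimateURLShortener(urlsPerMonth, readWriteRatio, peakFactor, recordBytes float64, years int) Estimate {
	writeQPS := urlsPerMonth / secondsPerMonth
	readQPS := writeQPS * readWriteRatio
	records := urlsPerMonth * 12 * float64(years)
	return Estimate{
		WriteQPS:    writeQPS,
		ReadQPS:     readQPS,
		PeakReadQPS: readQPS * peakFactor,
		StorageTB:   records * recordBytes / 1e12,
		ReadMBps:    readQPS * recordBytes / 1e6,
	}
}

func main() {
	e := EstimateURLShortener(100e6, 100, 3, 250, 5)
	fmt.Printf("write %.0f QPS, read %.0f QPS, peak %.0f QPS\n", e.WriteQPS, e.ReadQPS, e.PeakReadQPS)
	fmt.Printf("storage %.2f TB, read bandwidth %.2f MB/s\n", e.StorageTB, e.ReadMBps)
}
```

Keeping the inputs as parameters makes it cheap to re-run the estimate when an assumption changes, for example a 10:1 read:write ratio or a 10x peak.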

2. Consistency Models — Deep Dive

2.1 Spectrum of Consistency

Strong ←────────────────────────────────→ Weak

Linearizable → Sequential → Causal → Eventual
    ↑              ↑           ↑          ↑
 Banking      Inventory    Social     Analytics
 Payment      Stock level  Feed       Page views

2.2 When to use which?

Linearizability (strongest):
  Every read sees the latest write, as if there were only one copy of the data.
  
  Use cases:
    → Bank balance (must never go negative)
    → Distributed lock (only one holder)
    → Leader election
  
  Trade-off: SLOW (requires coordination), low throughput
  Tools: etcd, ZooKeeper, CockroachDB (serializable)

Causal consistency:
  Causally related operations are seen in the same order by everyone.
  Concurrent operations may be seen in different orders.
  
  Use cases:
    → Social media: a reply must appear AFTER the original post
    → Chat: message order within a conversation
    → Document editing (Google Docs style)
  
  Trade-off: stronger than eventual, cheaper than linearizable

Eventual consistency:
  All replicas WILL converge... eventually.
  
  Use cases:
    → View count, like count (being off for a few seconds is OK)
    → DNS propagation
    → Shopping cart (Amazon famously does this)
    → CDN content
  
  Trade-off: fast and highly available, but a client may briefly
  fail to read its own write.

Trick: eventual + "read-your-writes" consistency
  → A client always sees ITS OWN writes
  → Implementation: read from the primary right after a write, or use sticky sessions
  → Best of both worlds for many use cases
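
One way to get read-your-writes on top of eventually consistent replicas is to pin a user's reads to the primary for a short window after that user writes. The sketch below assumes a hypothetical `Router` type and a fixed lag window in milliseconds; it keeps state in memory and is not safe for concurrent use, so treat it as an illustration of the idea rather than a production component.

```go
package main

import "fmt"

// Router decides whether a read goes to the primary or to a replica.
// lagMillis is an assumption: it should be at least the worst replica lag
// you observe in production.
type Router struct {
	lagMillis int64
	lastWrite map[string]int64 // userID -> time of the user's last write (ms)
}

// NewRouter creates a router with the given lag window.
func NewRouter(lagMillis int64) *Router {
	return &Router{lagMillis: lagMillis, lastWrite: make(map[string]int64)}
}

// RecordWrite remembers when a user last wrote.
func (r *Router) RecordWrite(userID string, nowMillis int64) {
	r.lastWrite[userID] = nowMillis
}

// Target returns "primary" while the user's last write may not have reached
// the replicas yet, and "replica" otherwise.
func (r *Router) Target(userID string, nowMillis int64) string {
	if t, ok := r.lastWrite[userID]; ok && nowMillis-t < r.lagMillis {
		return "primary"
	}
	return "replica"
}

func main() {
	r := NewRouter(5000) // 5-second window
	r.RecordWrite("alice", 1000)
	fmt.Println(r.Target("alice", 2000))  // inside the window
	fmt.Println(r.Target("alice", 11000)) // window has passed
	fmt.Println(r.Target("bob", 2000))    // bob never wrote
}
```

Only users who just wrote pay the cost of a primary read; everyone else keeps reading from replicas.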

3. Multi-Region Architecture

3.1 Patterns

Active-Passive (simple):
  ┌─────────────┐                ┌─────────────┐
  │  Region A   │   async repl   │  Region B   │
  │  (active)   │ ──────────────→│  (standby)  │
  │ All traffic │                │ No traffic  │
  └─────────────┘                └─────────────┘
  
  Pros: Simple, strong consistency within region
  Cons: Failover = downtime (minutes), wasted standby capacity
  Use when: DR requirement, RPO minutes OK

Active-Active (complex):
  ┌──────────┐                ┌──────────┐
  │ Region A │  ←── sync ───→ │ Region B │
  │ Serves   │    or          │ Serves   │
  │ traffic  │    async       │ traffic  │
  └──────────┘                └──────────┘
  
  Pros: No failover downtime, utilize all capacity, low latency
  Cons: Conflict resolution, complex, eventual consistency
  Use when: Global users, ultra-high availability (99.99%+)

3.2 Data strategies for multi-region

Strategy 1: Shard by geography
  → Users in VN → data in the Singapore region
  → Users in the US → data in the US region
  → Each user belongs to one "home region"
  → Cross-region reads for social features
  → The simplest option; avoids write conflicts

Strategy 2: Conflict-free Replicated Data Types (CRDTs)
  → Data structures that merge concurrent updates automatically
  → Counter: increments in regions A and B → merged value = sum of both
  → Set: add in A, add in B → merge = union
  → Shopping cart (Amazon-style)
  → Use when: every region needs to write the same data

Strategy 3: Last-Write-Wins (LWW)
  → Every write carries a timestamp; the newest one wins
  → Simple, but can LOSE DATA (the older concurrent write is silently dropped)
  → Use only when data loss is acceptable
  → Profile updates, preferences
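
The difference between strategies 2 and 3 is easiest to see side by side. The sketch below implements a grow-only counter (one of the simplest CRDTs) and a last-write-wins register; the type names and the region labels are assumptions chosen for this example.

```go
package main

import "fmt"

// GCounter is a grow-only counter: one slot per region, and each region
// only increments its own slot.
type GCounter map[string]uint64

// Inc records n increments made in the given region.
func (c GCounter) Inc(region string, n uint64) { c[region] += n }

// Merge takes the per-region maximum, so merging is commutative,
// associative and idempotent (merging twice changes nothing).
func (c GCounter) Merge(other GCounter) {
	for region, n := range other {
		if n > c[region] {
			c[region] = n
		}
	}
}

// Value is the sum over all regions.
func (c GCounter) Value() uint64 {
	var total uint64
	for _, n := range c {
		total += n
	}
	return total
}

// LWW is a last-write-wins register.
type LWW struct {
	Value     string
	Timestamp int64 // for example Unix milliseconds from the writer's clock
}

// MergeLWW keeps the write with the larger timestamp; the other is dropped.
func MergeLWW(a, b LWW) LWW {
	if b.Timestamp > a.Timestamp {
		return b
	}
	return a
}

func main() {
	a, b := GCounter{}, GCounter{}
	a.Inc("A", 3)
	b.Inc("B", 2)
	a.Merge(b)
	a.Merge(b) // replaying the same merge is harmless
	fmt.Println("likes:", a.Value())

	x := LWW{Value: "bio from region A", Timestamp: 100}
	y := LWW{Value: "bio from region B", Timestamp: 101}
	fmt.Println("kept:", MergeLWW(x, y).Value)
}
```

The counter keeps both regions' increments, while the register discards region A's update entirely. With LWW, clock skew between regions can also make an older write win.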

3.3 Geo-Routing

           ┌─── VN users ──→ SG Region
           │
  DNS/LB ──┼─── EU users ──→ EU Region  
           │
           └─── US users ──→ US Region

Options:
  → DNS-based (Route 53 Geo): simple, but failover is delayed by roughly the DNS TTL
  → Anycast (Cloudflare): fastest, IP-level routing
  → Application-level: a token/cookie carries a region hint
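
For the application-level option, the routing decision can be a small pure function. This sketch is illustrative only: the country table, the region names and the `RouteRegion` signature are assumptions, not part of any routing product.

```go
package main

import "fmt"

// homeRegion maps an ISO country code to the region that serves it.
// The table and the region names are illustrative assumptions.
var homeRegion = map[string]string{
	"VN": "sg", "SG": "sg", "TH": "sg",
	"DE": "eu", "FR": "eu",
	"US": "us", "CA": "us",
}

// RouteRegion picks a region in priority order: an explicit hint from a
// token/cookie, then the geo lookup, then a default. Unhealthy regions
// are skipped.
func RouteRegion(hint, country string, healthy map[string]bool, fallback string) string {
	if hint != "" && healthy[hint] {
		return hint
	}
	if region, ok := homeRegion[country]; ok && healthy[region] {
		return region
	}
	return fallback
}

func main() {
	healthy := map[string]bool{"sg": true, "eu": true, "us": true}
	fmt.Println(RouteRegion("", "VN", healthy, "us"))   // geo lookup
	fmt.Println(RouteRegion("eu", "VN", healthy, "us")) // explicit hint wins
	healthy["sg"] = false
	fmt.Println(RouteRegion("", "VN", healthy, "us")) // home region down, use the default
}
```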

4. Large-Scale Migration Strategies

4.1 Strangler Fig Pattern

Instead of rewriting everything at once (Big Bang = Big Risk):

Phase 1:  [Old System] ← 100% traffic

Phase 2:  [Old System] ← 90% traffic
          [New System] ← 10% traffic (1 feature)

Phase 3:  [Old System] ← 50%
          [New System] ← 50%

Phase 4:  [New System] ← 100% traffic
          [Old System] ← shutdown 🎉

The name comes from the strangler fig: a vine that wraps around its host tree
and gradually replaces it completely. There is no need to chop down the old tree.
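
In practice the strangler usually lives in a routing layer in front of both systems and moves one feature at a time. The sketch below shows only the routing decision; the path prefixes and the `Backend` function are hypothetical, and a real deployment would hand the request to something like `httputil.ReverseProxy` afterwards.

```go
package main

import (
	"fmt"
	"strings"
)

// migrated lists the path prefixes already served by the new system.
// The prefixes are hypothetical; in phase 2 there would be a single entry.
var migrated = []string{"/profile", "/search"}

// Backend returns which system should handle the request path.
func Backend(path string) string {
	for _, prefix := range migrated {
		// Match the prefix itself or anything below it, but not "/searching".
		if path == prefix || strings.HasPrefix(path, prefix+"/") {
			return "new"
		}
	}
	return "old"
}

func main() {
	for _, p := range []string{"/profile/42", "/checkout", "/searching"} {
		fmt.Println(p, "->", Backend(p))
	}
}
```

Moving a feature then means adding one prefix to the list, and rolling back means removing it.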

4.2 Zero-Downtime Database Migration

Step 1: Dual-write
  App writes → Old DB + New DB (async or via CDC)
  App reads  → Old DB only

Step 2: Shadow read
  App writes → Old DB + New DB
  App reads  → Old DB + New DB (compare results, log diff)
  Fix inconsistencies

Step 3: Switch read
  App writes → Old DB + New DB  
  App reads  → New DB (fall back to Old DB on error)

Step 4: Stop dual-write
  App writes → New DB only
  App reads  → New DB only
  Keep Old DB read-only (rollback safety, 1-2 weeks)

Step 5: Decommission
  Drop Old DB connection, shutdown

Every step has a rollback plan. If step 3 fails → go back to step 2.
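
Steps 1 to 4 can be modeled as a phase switch in the data access layer. The following is a minimal sketch with in-memory stores: `Store`, `MemStore`, `Migrator` and the phase names are assumptions for this example, and the dual write is synchronous here even though the text also allows async writes or CDC.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a minimal key-value interface standing in for a real database.
type Store interface {
	Get(key string) (string, error)
	Put(key, value string) error
}

// MemStore is an in-memory Store used only for this example.
type MemStore map[string]string

func (m MemStore) Get(k string) (string, error) {
	v, ok := m[k]
	if !ok {
		return "", errors.New("not found")
	}
	return v, nil
}

func (m MemStore) Put(k, v string) error { m[k] = v; return nil }

// Phase mirrors steps 1-4 of the migration.
type Phase int

const (
	DualWrite Phase = iota + 1
	ShadowRead
	SwitchRead
	NewOnly
)

// Migrator routes reads and writes according to the current phase.
type Migrator struct {
	Old, New   Store
	Phase      Phase
	Mismatches int // shadow-read differences seen so far
}

func (m *Migrator) Put(k, v string) error {
	if m.Phase == NewOnly {
		return m.New.Put(k, v)
	}
	if err := m.Old.Put(k, v); err != nil {
		return err
	}
	return m.New.Put(k, v)
}

func (m *Migrator) Get(k string) (string, error) {
	switch m.Phase {
	case DualWrite:
		return m.Old.Get(k)
	case ShadowRead:
		oldV, err := m.Old.Get(k)
		newV, newErr := m.New.Get(k)
		if (err == nil) != (newErr == nil) || oldV != newV {
			m.Mismatches++ // a real system would log the diff
		}
		return oldV, err // the old DB stays the source of truth
	case SwitchRead:
		if v, err := m.New.Get(k); err == nil {
			return v, nil
		}
		return m.Old.Get(k) // fall back to the old DB on error
	default:
		return m.New.Get(k)
	}
}

func main() {
	m := &Migrator{Old: MemStore{"a": "1"}, New: MemStore{}, Phase: ShadowRead}
	v, _ := m.Get("a") // "a" was never backfilled into the new store
	fmt.Println(v, "mismatches:", m.Mismatches)

	_ = m.Put("a", "2") // dual write brings both stores in line
	m.Phase = SwitchRead
	v, _ = m.Get("a")
	fmt.Println(v)
}
```

Because the phase is a single value, rolling back from step 3 to step 2 is a configuration change rather than a deploy.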

4.3 Feature flags for migration

// Feature flag controlling the migration.
// featureflag, metrics, newProfileService and oldProfileService are assumed
// to be defined elsewhere in the application.
func GetUserProfile(ctx context.Context, userID string) (*Profile, error) {
    if featureflag.IsEnabled("use-new-profile-service", userID) {
        // New service: users are migrated gradually.
        profile, err := newProfileService.Get(ctx, userID)
        if err != nil {
            // Fall back to the old service if the new one fails.
            metrics.Inc("new_profile_fallback")
            return oldProfileService.Get(ctx, userID)
        }
        return profile, nil
    }
    return oldProfileService.Get(ctx, userID)
}
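
The snippet above does not show how a per-user flag check might decide who is in the rollout. One common approach is deterministic bucketing by hash, sketched below. `InRollout` is a hypothetical helper, not the API of any particular feature-flag library.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// InRollout reports whether userID falls inside the first `percent` of 100
// buckets for the given flag. Hashing flag and user together keeps each
// user's answer stable across calls and independent between flags.
func InRollout(flag, userID string, percent uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(flag))
	h.Write([]byte{0}) // separator so ("ab","c") and ("a","bc") hash differently
	h.Write([]byte(userID))
	return h.Sum32()%100 < percent
}

func main() {
	enabled := 0
	for i := 0; i < 10_000; i++ {
		if InRollout("use-new-profile-service", fmt.Sprintf("user-%d", i), 10) {
			enabled++
		}
	}
	fmt.Printf("%d of 10000 users see the new service\n", enabled)
}
```

Because a user's bucket never changes, raising the percentage from 10 to 50 only adds users; nobody who already saw the new service is moved back.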

5. Data Modeling Decisions

5.1 Decision Matrix

Relational (PostgreSQL, MySQL):
  ✅ Complex queries, JOINs, transactions
  ✅ ACID compliance
  ✅ Well-understood, mature tooling
  ❌ Horizontal scaling is hard (sharding complexity)
  → Use for: Core business data, financial transactions

Document (MongoDB, DynamoDB):
  ✅ Flexible schema, easy horizontal scaling
  ✅ Good for hierarchical data
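  ❌ Limited or no JOINs (denormalize or join in the application; MongoDB has $lookup, DynamoDB has none)
  ❌ Weaker default guarantees in some setups (e.g. DynamoDB reads are eventually consistent unless you request strongly consistent reads)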
  → Use for: Product catalogs, user profiles, content

Columnar (ClickHouse, BigQuery):
  ✅ Blazing fast aggregation queries
  ✅ Excellent compression
  ❌ Slow single-row operations
  ❌ Not for OLTP
  → Use for: Analytics, time-series, logs

Graph (Neo4j, Neptune):
  ✅ Relationship-heavy queries (friends of friends)
  ✅ Pattern matching
  ❌ Niche, smaller ecosystem
  → Use for: Social networks, recommendation, fraud detection

Time-series (TimescaleDB, InfluxDB):
  ✅ Optimized for time-ordered data
  ✅ Built-in downsampling, retention policies
  → Use for: Metrics, IoT sensor data, stock prices

6. API Design at Scale

6.1 Backward Compatibility — Hyrum's Law

Hyrum's Law: "With a sufficient number of users, every observable
behavior of your API will be depended on by somebody."

Consequences:
  → An error message changes → clients that parse the error message break
  → Response field order changes → someone is comparing JSON strings
  → A new field is added → clients with strict deserialization fail

Principles:
  → Additive changes are generally OK: adding a field, adding an endpoint (strict clients are the exception noted above)
  → Breaking changes NEED versioning
  → Document the contract, not just the behavior
  → Breaking change = new version + deprecation notice + sunset date
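
The strict-deserialization consequence is easy to reproduce with Go's standard `encoding/json`. In this sketch the `User` type and the `avatar_url` field are made up for illustration; `DisallowUnknownFields` is the standard-library switch that turns a tolerant client into a strict one.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// User is the client's view of the response as of the first API version.
type User struct {
	ID   int    `json:"id"`
	Name string `json:"name"`
}

// DecodeTolerant ignores fields it does not know (the encoding/json default).
func DecodeTolerant(body string) (User, error) {
	var u User
	err := json.Unmarshal([]byte(body), &u)
	return u, err
}

// DecodeStrict rejects unknown fields, so an additive server change breaks it.
func DecodeStrict(body string) (User, error) {
	var u User
	dec := json.NewDecoder(strings.NewReader(body))
	dec.DisallowUnknownFields()
	err := dec.Decode(&u)
	return u, err
}

func main() {
	// The server later adds "avatar_url", an additive change.
	body := `{"id": 7, "name": "Khoa", "avatar_url": "https://example.com/a.png"}`

	_, err := DecodeTolerant(body)
	fmt.Println("tolerant client error:", err)

	_, err = DecodeStrict(body)
	fmt.Println("strict client error:", err)
}
```

The tolerant client keeps working, while the strict one fails on a change the server team considered safe. That is Hyrum's Law in miniature.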

6.2 Pagination at Scale

Offset-based (simple, flawed):
  GET /items?offset=1000&limit=20
  
  ❌ Large offset → slow query (MySQL: OFFSET 1000000 scans and discards 1M rows)
  ❌ Data changes between pages → missed/duplicated items

Cursor-based (scalable):
  GET /items?cursor=eyJpZCI6MTAwMH0&limit=20
  
  Response:
  {
    "items": [...],
    "next_cursor": "eyJpZCI6MTAyMH0",
    "has_more": true
  }
  
  ✅ Query time stays flat regardless of page depth (index seek instead of scanning skipped rows)
  ✅ No duplicate/missing items
  ❌ Cannot jump to page N (but most UIs do not need that)

Keyset-based (best for sorted data):
  GET /items?created_after=2024-03-15T10:00:00Z&limit=20
  WHERE created_at > '2024-03-15T10:00:00Z'
  ORDER BY created_at ASC LIMIT 20
  
  ✅ Uses the index efficiently
  ✅ Consistent performance
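
The cursors in the example above are unpadded URL-safe base64 of a small JSON object: `eyJpZCI6MTAwMH0` decodes to `{"id":1000}`. The sketch below encodes and decodes such a cursor and builds the matching keyset query; the `items` table, its columns and the function names are assumptions.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
)

// Cursor is the opaque position handed to clients: the id of the last item
// on the previous page.
type Cursor struct {
	ID int64 `json:"id"`
}

// EncodeCursor serializes the cursor as unpadded URL-safe base64 JSON.
func EncodeCursor(c Cursor) string {
	raw, _ := json.Marshal(c) // cannot fail for this struct
	return base64.RawURLEncoding.EncodeToString(raw)
}

// DecodeCursor parses a cursor received from a client.
func DecodeCursor(s string) (Cursor, error) {
	var c Cursor
	raw, err := base64.RawURLEncoding.DecodeString(s)
	if err != nil {
		return c, err
	}
	err = json.Unmarshal(raw, &c)
	return c, err
}

// PageQuery builds the query for the page after the cursor.
// The table and column names are assumptions.
func PageQuery(c Cursor, limit int) (string, []any) {
	return "SELECT id, name FROM items WHERE id > ? ORDER BY id ASC LIMIT ?", []any{c.ID, limit}
}

func main() {
	token := EncodeCursor(Cursor{ID: 1000})
	fmt.Println(token)

	c, err := DecodeCursor("eyJpZCI6MTAyMH0")
	fmt.Println(c.ID, err)

	query, args := PageQuery(c, 20)
	fmt.Println(query, args)
}
```

Keeping the cursor opaque lets the server change what it contains later, for example adding a `created_at` tie-breaker, without changing the API contract.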

6.3 Deprecation Strategy

Lifecycle:
  ACTIVE → DEPRECATED → SUNSET → REMOVED

  DEPRECATED (6 months):
    → Add header: Deprecation: true
    → Add header: Sunset: Sat, 01 Mar 2025 00:00:00 GMT
    → Log usage to track migration progress
    → Email/notify consumers

  SUNSET (3 months):
    → Return a warning in the response body
    → Throttle the deprecated endpoint (lower its rate limit)
    → Contact the remaining consumers directly

  REMOVED:
    → Return 410 Gone with a message that points to the migration guide
    → After 1 month → 404
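
The DEPRECATED and REMOVED stages map naturally onto HTTP middleware. This is a sketch using Go's standard `net/http`: the handler names, the example URL and the `probe` helper are assumptions. The `Deprecation: true` value follows the lifecycle described here; newer versions of the header specification use a date value instead, so check what your clients expect.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"time"
)

// exampleSunset is the sunset date used in the lifecycle description.
var exampleSunset = time.Date(2025, time.March, 1, 0, 0, 0, 0, time.UTC)

// okHandler stands in for the real endpoint.
var okHandler = http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
	fmt.Fprintln(w, "ok")
})

// Deprecated wraps a handler and adds the two headers of the DEPRECATED stage.
func Deprecated(sunset time.Time, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Deprecation", "true")
		w.Header().Set("Sunset", sunset.UTC().Format(http.TimeFormat))
		next.ServeHTTP(w, r)
	})
}

// Gone is the handler for the REMOVED stage.
func Gone(migrationURL string) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		http.Error(w, "This endpoint was removed. See "+migrationURL, http.StatusGone)
	})
}

// probe sends one in-memory request and returns the status and Sunset header.
func probe(h http.Handler) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/v1/items", nil))
	return rec.Code, rec.Header().Get("Sunset")
}

func main() {
	code, sunset := probe(Deprecated(exampleSunset, okHandler))
	fmt.Println(code, sunset)

	code, _ = probe(Gone("https://example.com/migrate"))
	fmt.Println(code)
}
```

Logging usage and throttling would be further middleware layers in the same chain.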

7. Capacity Planning

7.1 Framework

Step 1: Current baseline
  → Current QPS, storage, compute usage
  → Growth rate (monthly/quarterly)

Step 2: Project forward
  → "If we grow 3x in 12 months, what do we need?"
  → Compute: CPU cores, memory
  → Storage: disk, backup
  → Network: bandwidth, connections

Step 3: Identify cliff
  → "At what load will the system break?"
  → Database: max connections, disk IOPS
  → Application: memory limit, CPU saturation
  → Network: bandwidth limit

Step 4: Plan ahead
  → Buffer: plan for 2x the expected peak
  → Lead time: ordering hardware/instances
  → Scaling strategy: horizontal vs vertical
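
Steps 2 to 4 boil down to one question: how many months until load reaches a known limit? Assuming compound monthly growth, that is a logarithm. The load figures in `main` are hypothetical and the function names are made up for this sketch.

```go
package main

import (
	"fmt"
	"math"
)

// MonthsUntil returns how many months of compound growth it takes for the
// current load to reach limit, given a monthly growth rate (0.10 = 10%).
// It returns 0 if the limit is already reached and +Inf without growth.
func MonthsUntil(current, limit, monthlyGrowth float64) float64 {
	if current >= limit {
		return 0
	}
	if monthlyGrowth <= 0 {
		return math.Inf(1)
	}
	return math.Log(limit/current) / math.Log(1+monthlyGrowth)
}

// MonthlyRate converts "grow by factor in n months" into a monthly rate.
func MonthlyRate(factor, months float64) float64 {
	return math.Pow(factor, 1/months) - 1
}

func main() {
	rate := MonthlyRate(3, 12) // "3x in 12 months"
	fmt.Printf("monthly growth: %.1f%%\n", rate*100)

	// Hypothetical numbers: 4K QPS peak today, the database saturates at
	// 10K QPS, and we want to act once load passes half of that (2x buffer).
	fmt.Printf("months until cliff:  %.1f\n", MonthsUntil(4000, 10000, rate))
	fmt.Printf("months until buffer: %.1f\n", MonthsUntil(4000, 5000, rate))
}
```

The second number is the one that matters for planning: it has to be longer than the lead time for adding capacity.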

7.2 Cost-Aware Architecture

At the Staff level, you also have to think about $$:

  "Solution A: 99.99% availability, $50K/month"
  "Solution B: 99.9% availability, $8K/month"

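  Difference: 0.09% = ~39 extra minutes of downtime per month
  Savings: $42K/month = $504K/year

  Business question: does 39 minutes of downtime per month
  cost more than $504K/year?

→ E-commerce at $1M revenue/day: at average traffic, 39 minutes is about $27K/month (~$325K/year) in direct sales, which alone is below $504K. Choose A if outages tend to hit peak hours (3x traffic is ~$975K/year) or if churn and SLA penalties add up on top
→ Internal tool: NO, choose B
→ This is how a Staff Engineer thinks.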
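
The comparison can be turned into a break-even figure: how much must one minute of downtime cost before the more expensive option pays for itself? This is a minimal sketch assuming a 30-day month; the function names are made up for this example.

```go
package main

import "fmt"

const minutesPerMonth = 30 * 24 * 60 // 43,200, assuming a 30-day month

// DowntimeMinutes returns the allowed downtime per month for an
// availability target given as a percentage, for example 99.9.
func DowntimeMinutes(availabilityPct float64) float64 {
	return minutesPerMonth * (100 - availabilityPct) / 100
}

// BreakEvenPerMinute returns the cost of one minute of downtime at which
// the more expensive option starts to pay for itself.
func BreakEvenPerMinute(extraCostPerMonth, extraDowntimeMinutes float64) float64 {
	return extraCostPerMonth / extraDowntimeMinutes
}

func main() {
	a, b := DowntimeMinutes(99.99), DowntimeMinutes(99.9)
	extra := b - a
	fmt.Printf("A: %.1f min/month, B: %.1f min/month, extra: %.1f\n", a, b, extra)

	breakEven := BreakEvenPerMinute(50_000-8_000, extra)
	fmt.Printf("A pays off above $%.0f per minute of downtime\n", breakEven)

	// For comparison: $1M/day spread evenly over the day.
	fmt.Printf("average revenue per minute at $1M/day: $%.0f\n", 1_000_000.0/(24*60))
}
```

The break-even comes out near $1,080 per minute, while $1M/day averages about $694 per minute. Whether A is worth it therefore depends on when outages happen and what they cost beyond lost sales.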

8. System Design Interview — Staff Level Framework

Framework (35-40 minutes):

  1. Requirements (5 min): DRIVE the conversation
     → Functional: "What does the system need to do?"
     → Non-functional: Scale? Latency? Consistency?
     → Constraints: "Existing infra? Team size?"
     → Do NOT assume. ASK

  2. High-level Design (5 min)
     → Sketch the components on the whiteboard
     → "There are two approaches; let me explain the trade-offs..."
     → Pick an approach and explain WHY

  3. Deep Dive (15-20 min)
     → The interviewer will pick the area to focus on
     → Data model, API design, scaling strategy
     → Talk about TRADE-OFFS, not just "I would use X"

  4. Scale & Edge Cases (5 min)
     → "If traffic grows 100x, where is the bottleneck?"
     → "Failure mode: what happens when the DB is down?"
     → Monitoring & alerting strategy

  Staff signals interviewers look for:
    ✅ Clarify before designing (do not assume)
    ✅ State the trade-offs for EVERY decision
    ✅ Acknowledge uncertainty: "I would need to benchmark this, but..."
    ✅ Think about cost, team, timeline, not just tech
    ✅ Listen and adjust when the interviewer pushes back
    ✅ Drive the conversation; do not wait to be asked


💡 Remember: Staff-level system design = "I chose X over Y because Z, and here's when we'd reconsider." Not "Use Kafka because it's good." 🎯