gh-dotclaude-marketplace-pl…/commands/system-design.md
2025-11-29 18:24:07 +08:00

model: claude-opus-4-1
allowed-tools: Task, Read, Bash, Grep, Glob, Write
argument-hint: <system-name> [--depth=standard|deep] [--focus=architecture|scalability|trade-offs] [--generate-diagram=true|false]
description: Design complete systems with WHY, WHAT, HOW, CONSIDERATIONS, and DEEP-DIVE framework

System Design Interview Coach

A complete framework for designing systems from problem to implementation. It includes the WHY/WHAT/HOW structure, trade-off analysis, Mermaid diagrams, and deep-dive optimizations.

Interview Flow (about 60 minutes)

Phase 1: Requirements & Context (5 minutes)

Your goal: Understand the problem deeply before designing

Ask clarifying questions:

  • Scale: users, requests per second, data volume?
  • Availability: SLA requirements (99.9%, 99.99%)?
  • Latency: response time targets?
  • Consistency: strong or eventual?
  • Features: read-heavy, write-heavy, or balanced?
  • Growth: expected growth rate?

Interviewer's watching:

  • Do you ask the right questions?
  • Do you understand the constraints?
  • Can you estimate numbers?
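
The estimation step above can be sketched as a short back-of-envelope calculation. This is a minimal illustration, not a prescribed method: the `Workload` fields, the 2x peak factor, and the example numbers are all assumptions chosen to keep the arithmetic easy to follow.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class Workload:
    """Inputs gathered from the clarifying questions (all values are assumptions)."""
    daily_active_users: int
    requests_per_user_per_day: float
    writes_per_user_per_day: float
    bytes_per_write: int
    peak_to_average: float = 2.0  # assumed ratio of peak to average traffic


def average_qps(w: Workload) -> float:
    """Average requests per second across a full day."""
    return w.daily_active_users * w.requests_per_user_per_day / SECONDS_PER_DAY


def peak_qps(w: Workload) -> float:
    """Average QPS scaled by the assumed peak factor."""
    return average_qps(w) * w.peak_to_average


def storage_bytes_per_year(w: Workload) -> float:
    """Raw bytes written per year, before replication or compression."""
    return w.daily_active_users * w.writes_per_user_per_day * w.bytes_per_write * 365


# Hypothetical numbers, used only to demonstrate the calculation.
example = Workload(
    daily_active_users=10_000_000,
    requests_per_user_per_day=20,
    writes_per_user_per_day=2,
    bytes_per_write=1_000,
)
print(f"average QPS: {average_qps(example):,.0f}")
print(f"peak QPS:    {peak_qps(example):,.0f}")
print(f"storage/yr:  {storage_bytes_per_year(example) / 1e12:.1f} TB")
```

Stating the assumptions out loud (peak factor, bytes per write) matters more than the exact result; the interviewer can then correct an input rather than the whole estimate.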

Phase 2: High-Level Architecture (10 minutes)

Your goal: Outline the system at 30,000 feet

Cover:

  • Major components (load balancer, services, databases, caches)
  • Communication patterns (sync/async, protocols)
  • Data flow from user request to response
  • Rough scalability approach

Draw a simple diagram showing component interactions.

Interviewer's watching:

  • Do you think in systems?
  • Can you structure complexity?
  • Do you know when to keep it simple?

Phase 3: Detailed Component Design (20 minutes)

Your goal: Explain key components with confidence

Pick 2-3 components to discuss:

  • How does this component work?
  • Why this technology choice?
  • What are the constraints it handles?
  • How does it scale?

Interviewer's watching:

  • Do you have technical depth?
  • Can you justify decisions?
  • Do you know trade-offs?

Phase 4: Scalability & Trade-Offs (15 minutes)

Your goal: Show senior-level thinking

Discuss:

  • Bottlenecks: What breaks first at 10x growth?
  • Consistency: Strong vs eventual? Why?
  • Reliability: Failure modes and recovery?
  • Cost: What drives operational expense?
  • Complexity: Is this operationally feasible?

Interviewer's watching:

  • Do you think like a Staff engineer?
  • Can you make principled trade-offs?
  • Do you understand operational reality?
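
One way to reason about "what breaks first at 10x growth" is to compare each component's capacity with its current load. The sketch below assumes load grows by the same factor on every component, which is a simplification; the component names and numbers are hypothetical.

```python
from typing import Dict, List, Tuple

# component name -> (capacity in requests/sec, current load in requests/sec)
Components = Dict[str, Tuple[float, float]]


def growth_headroom(capacity_rps: float, load_rps: float) -> float:
    """How many times the current load can grow before capacity is reached."""
    return capacity_rps / load_rps


def first_bottleneck(components: Components) -> Tuple[str, float]:
    """Return the component with the least headroom, and that headroom."""
    name = min(components, key=lambda n: growth_headroom(*components[n]))
    return name, growth_headroom(*components[name])


def saturated_at(components: Components, growth: float) -> List[str]:
    """Components whose load would exceed capacity at the given growth factor."""
    return sorted(
        name for name, (capacity, load) in components.items()
        if load * growth > capacity
    )


# Hypothetical capacities and loads for illustration only.
system: Components = {
    "web": (50_000, 5_000),
    "cache": (200_000, 4_500),
    "db": (8_000, 2_000),
}
print(first_bottleneck(system))
print(saturated_at(system, growth=10))
```

In this made-up example the database runs out of headroom first, so the scaling discussion would start there.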

Phase 5: Extensions & Deep-Dives (8 minutes)

Your goal: Demonstrate mastery

Address follow-up questions:

  • "How would you handle [new requirement]?"
  • "What's the hardest part of operating this?"
  • "What would you optimize for [metric]?"
  • "How would you debug this in production?"

Interviewer's watching:

  • Are you thinking ahead?
  • Can you handle surprises?
  • Do you know what you don't know?

System Design Framework

WHY: Problem & Context

What to cover:

  • Problem statement (1-2 sentences)
  • Primary use cases (top 3-5)
  • User base and growth expectations
  • Non-functional requirements (scale, latency, availability)
  • Business context (why does this matter?)

For the interviewer's benefit:

  • Shows you understand the problem before solving it
  • Demonstrates customer empathy
  • Proves you can estimate and scope

WHAT: Components & Data Model

What to cover:

  • Core entities (Users, Posts, Comments, etc.)
  • Entity relationships
  • Storage requirements (how much data?)
  • Major services (Authentication, Feed, Search, etc.)
  • API contracts (what endpoints do we need?)

For the interviewer's benefit:

  • Shows you think about data structure
  • Demonstrates you can decompose systems
  • Proves you understand component boundaries

HOW: Architecture & Patterns

What to cover:

  • Request flow (from user → response)
  • Service architecture (monolith vs microservices decision)
  • Communication patterns (synchronous, asynchronous, pub-sub)
  • Storage topology (where does data live?)
  • Caching strategy (where, what, how long?)
  • Replication and failover
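
The "where, what, how long" questions for caching can be made concrete with a cache-aside read path and a time-to-live. This is a minimal in-memory sketch: `TTLCache`, `get_with_cache`, and the `load_from_db` callback are hypothetical names, and a real deployment would typically use an external cache rather than a Python dict.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """In-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Tuple[bool, Any]:
        """Return (hit, value); expired entries count as misses and are evicted."""
        entry = self._entries.get(key)
        if entry is None:
            return False, None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return False, None
        return True, value

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def get_with_cache(cache: TTLCache, key: Hashable,
                   load_from_db: Callable[[Hashable], Any]) -> Any:
    """Cache-aside read: try the cache, fall back to the database, then populate."""
    hit, value = cache.get(key)
    if hit:
        return value
    value = load_from_db(key)
    cache.put(key, value)
    return value
```

The TTL is the "how long" knob: a longer TTL cuts database load but widens the window in which readers may see stale data.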

For the interviewer's benefit:

  • Shows you know architectural patterns
  • Demonstrates systems thinking
  • Proves you can make principled decisions

CONSIDERATIONS: Trade-Offs & Reality

What to analyze:

Consistency

  • Strong: Always get the latest data (higher latency, reduced availability during failures or partitions)
  • Eventual: Might get stale data (lower latency, higher availability)
  • Your choice: "For [reason], we accept [consistency model]"
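
One common way to make this choice concrete is quorum replication, which is brought in here as an illustration and is not something the framework itself prescribes. With N replicas, a write quorum W, and a read quorum R, every read set overlaps every write set exactly when R + W > N. The sketch below checks that rule and verifies it by brute force for small N.

```python
from itertools import combinations


def quorums_overlap(n: int, r: int, w: int) -> bool:
    """True if every read quorum of size r shares a replica with every write quorum of size w."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return r + w > n


def quorums_overlap_brute_force(n: int, r: int, w: int) -> bool:
    """Enumerate all read/write replica sets and check that each pair intersects."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(replicas, r)
        for write_set in combinations(replicas, w)
    )


print(quorums_overlap(3, 2, 2))  # overlapping quorums: reads reach a replica with the latest write
print(quorums_overlap(3, 1, 1))  # fast, but a read may miss the latest write
```

Lowering R or W reduces latency and tolerates more failed replicas, at the price of possibly stale reads, which is the same trade-off as the bullets above.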

Scalability

  • Vertical: Bigger machines (simpler, but capped by the largest available machine)
  • Horizontal: More machines (more complex, but a much higher ceiling)
  • Your choice: "We scale [direction] because [reason]"

Reliability

  • Single point of failure? (bad)
  • Replication strategy? (multiple copies)
  • Disaster recovery? (backup and restore procedure)
  • Your choice: "We replicate [this way] to handle [failure]"
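
The effect of replication on availability can be estimated with simple probability. The sketch below assumes replicas fail independently, which real systems often violate (shared racks, shared deploys), so treat the results as an optimistic upper bound. The function names are illustrative.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability: float) -> float:
    """Expected downtime per year for a given availability (for example 0.999)."""
    return (1 - availability) * MINUTES_PER_YEAR


def replicated(availability: float, replicas: int) -> float:
    """Availability when any one of several independent replicas can serve traffic."""
    return 1 - (1 - availability) ** replicas


def in_series(*availabilities: float) -> float:
    """Availability of a request path that needs every component to be up."""
    return math.prod(availabilities)


print(f"99.9%  -> {downtime_minutes_per_year(0.999):.1f} min/year")
print(f"99.99% -> {downtime_minutes_per_year(0.9999):.1f} min/year")
print(f"two 99% replicas -> {replicated(0.99, 2):.4f}")
print(f"two 99% components in series -> {in_series(0.99, 0.99):.4f}")
```

The series case is a reminder that every component added to the critical path lowers end-to-end availability unless it is itself replicated.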

Cost

  • Storage: What's the cost per GB?
  • Compute: What's the cost of this many servers?
  • Bandwidth: What's the egress cost?
  • Your choice: "This costs [X] but solves [Y]"
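
A rough cost model helps answer "what drives operational expense". The sketch below is only a template: the `UnitPrices` values in the example are placeholders, not real provider prices, and should be replaced with figures from an actual price list.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class UnitPrices:
    """Unit prices in one currency; the values used below are placeholders."""
    storage_per_gb_month: float
    server_per_month: float
    egress_per_gb: float


def monthly_cost(prices: UnitPrices, stored_gb: float, servers: int,
                 egress_gb: float) -> Dict[str, float]:
    """Break monthly cost into storage, compute, and bandwidth, plus a total."""
    breakdown = {
        "storage": prices.storage_per_gb_month * stored_gb,
        "compute": prices.server_per_month * servers,
        "bandwidth": prices.egress_per_gb * egress_gb,
    }
    breakdown["total"] = sum(breakdown.values())
    return breakdown


def biggest_driver(breakdown: Dict[str, float]) -> str:
    """Name of the largest cost category, ignoring the total."""
    return max((k for k in breakdown if k != "total"), key=breakdown.get)


# Placeholder prices and volumes, for illustration only.
placeholder_prices = UnitPrices(storage_per_gb_month=0.02, server_per_month=100.0,
                                egress_per_gb=0.05)
costs = monthly_cost(placeholder_prices, stored_gb=1_000, servers=10, egress_gb=2_000)
print(costs, "largest:", biggest_driver(costs))
```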

Operational Complexity

  • How many different technologies?
  • How hard is debugging?
  • What's the on-call pain?
  • Your choice: "We keep it simple: [reason]"

DEEP-DIVE: Component Optimization

For each major component, be prepared to discuss:

  1. Bottleneck Analysis

    • What's the scaling limit?
    • Where would we hit the wall first?
    • How do we know?
  2. Optimization Opportunities

    • What could we do to handle more load?
    • What are the trade-offs?
    • When is this optimization worth doing?
  3. Failure Modes

    • What if [component] fails?
    • How do we detect it?
    • How do we recover?
  4. Operational Concerns

    • How do we monitor this?
    • What metrics matter?
    • How do we debug issues?
  5. Alternative Approaches

    • What's another way to design this?
    • When would you choose it?
    • What problems does it have?

Mermaid Diagram Strategy

Create diagrams that show:

  1. Architecture Diagram: Components and communication
  2. Data Flow: Request path through the system
  3. Database Schema: Key entities and relationships

Tips:

  • Keep diagrams simple initially
  • Add detail when asked
  • Label important decisions
  • Annotate bottlenecks
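
An architecture diagram can be produced as Mermaid flowchart text from a simple component list. The helper below is a hypothetical sketch (the name `to_mermaid` and its parameters are assumptions); it emits the standard `flowchart` syntax with labeled nodes and optionally labeled edges.

```python
from typing import Dict, List, Tuple


def to_mermaid(nodes: Dict[str, str],
               edges: List[Tuple[str, str, str]],
               direction: str = "LR") -> str:
    """Render nodes (id -> label) and edges (source, target, label) as a Mermaid flowchart."""
    lines = [f"flowchart {direction}"]
    for node_id, label in nodes.items():
        lines.append(f'    {node_id}["{label}"]')
    for source, target, label in edges:
        arrow = f"-->|{label}|" if label else "-->"
        lines.append(f"    {source} {arrow} {target}")
    return "\n".join(lines)


diagram = to_mermaid(
    nodes={"lb": "Load Balancer", "web": "Web Servers", "db": "Database"},
    edges=[("lb", "web", "HTTP"), ("web", "db", "")],
)
print(diagram)
```

Edge labels are a convenient place to annotate decisions or bottlenecks, in line with the tips above.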

Example: Design Facebook Feed

WHY

  • Problem: Show users their friends' posts in a personalized, real-time feed
  • Use Cases:
    1. User opens app → see recent posts from friends
    2. Friend posts → appears in followers' feeds quickly
    3. Massive scale: billions of posts; minutes of propagation delay are acceptable
  • Requirements:
    • Read-heavy (100:1 read-to-write ratio)
    • Latency: Feed load < 200ms
    • Availability: 99.99%
    • Consistency: Eventual OK (a few minutes of lag is acceptable)

WHAT

  • Entities: User, Post, Friendship, Like, Comment
  • Relationships: User → Post (1:many), User ↔ User via Friendship (many:many)
  • Storage: Posts: hundreds of billions; user records: billions
  • Services: Auth, Post Creation, Feed Service, Search
  • APIs:
    • POST /posts (create)
    • GET /feed (get user's feed)
    • POST /posts/{id}/like (like post)
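
The entities and two of the endpoints above can be sketched as an in-memory model. This is an illustration under stated assumptions: `InMemoryStore` and its method names are hypothetical, likes are stored as a set on the post for brevity, and the Comment entity and GET /feed are left out.

```python
import itertools
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass(frozen=True)
class User:
    user_id: int
    name: str


@dataclass
class Post:
    post_id: int
    author_id: int
    text: str
    liked_by: Set[int] = field(default_factory=set)  # Like: (user, post) pairs


class InMemoryStore:
    """Stand-in for the database; maps roughly onto the listed APIs."""

    def __init__(self) -> None:
        self.users: Dict[int, User] = {}
        self.posts: Dict[int, Post] = {}
        self.friends: Dict[int, Set[int]] = {}  # Friendship: symmetric, many-to-many
        self._post_ids = itertools.count(1)

    def add_user(self, user_id: int, name: str) -> User:
        user = User(user_id, name)
        self.users[user_id] = user
        self.friends.setdefault(user_id, set())
        return user

    def add_friendship(self, a: int, b: int) -> None:
        self.friends[a].add(b)
        self.friends[b].add(a)

    def create_post(self, author_id: int, text: str) -> Post:
        """Roughly POST /posts."""
        if author_id not in self.users:
            raise KeyError(f"unknown user {author_id}")
        post = Post(next(self._post_ids), author_id, text)
        self.posts[post.post_id] = post
        return post

    def like_post(self, post_id: int, user_id: int) -> int:
        """Roughly POST /posts/{id}/like; idempotent, returns the like count."""
        self.posts[post_id].liked_by.add(user_id)
        return len(self.posts[post_id].liked_by)
```

Making the like operation idempotent means a retried request cannot double-count, which is worth calling out when discussing API contracts.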

HOW

  • Load balancers distribute requests
  • Stateless web servers handle auth and routing
  • Post service writes posts to database
  • Feed service reads from cache first, database second
  • Cache layer (Redis) stores hot posts
  • Fanout on write: When user posts, push to all followers' feeds
  • Asynchronous: Queue for fanout, workers process
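
The fanout-on-write path with an asynchronous queue can be sketched as follows. This is a single-process illustration, not the production design: `FanoutOnWriteFeed` is a hypothetical class, the standard-library `queue.Queue` stands in for a message broker, and `drain_queue` plays the role of the worker pool.

```python
import queue
from collections import defaultdict, deque
from typing import Deque, Dict, List, Set


class FanoutOnWriteFeed:
    """Pushes each new post into every follower's precomputed feed."""

    def __init__(self, feed_limit: int = 100) -> None:
        self._followers: Dict[str, Set[str]] = defaultdict(set)
        # Bounded per-user feeds: once full, the oldest entry is dropped.
        self._feeds: Dict[str, Deque[str]] = defaultdict(
            lambda: deque(maxlen=feed_limit)
        )
        self._fanout_queue: queue.Queue = queue.Queue()

    def follow(self, follower: str, author: str) -> None:
        self._followers[author].add(follower)

    def publish(self, author: str, post_id: str) -> None:
        """Write path: enqueue the fanout job and return immediately."""
        self._fanout_queue.put((author, post_id))

    def drain_queue(self) -> int:
        """Worker: process all queued jobs; returns how many posts were fanned out."""
        processed = 0
        while True:
            try:
                author, post_id = self._fanout_queue.get_nowait()
            except queue.Empty:
                return processed
            for follower in self._followers[author]:
                self._feeds[follower].appendleft(post_id)  # newest first
            processed += 1

    def get_feed(self, user: str, limit: int = 20) -> List[str]:
        """Read path: a cheap lookup of the precomputed feed."""
        return list(self._feeds[user])[:limit]
```

Until the worker runs, followers do not see the new post. That gap is the eventual-consistency lag the requirements allow.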

CONSIDERATIONS

  • Consistency: Eventual consistency (a few minutes of lag is OK)
  • Scalability: Horizontal, adding servers as needed
  • Reliability: Multi-region replication for availability
  • Cost: Balance storage vs computation
  • Complexity: Fanout-on-write is complex but enables fast reads

DEEP-DIVE

  1. Fanout Bottleneck: Celebrity posts with 100M followers?
    • Solution: Hybrid fanout: fanout-on-write for normal users; for celebrities, skip the push and pull their cached posts at read time
  2. Feed Personalization: How do we rank posts?
    • Solution: ML model, but start with recency + engagement
  3. Real-time Updates: How do we push new posts?
    • Solution: Long-polling, WebSockets, or event stream
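
The hybrid fanout idea from the first deep-dive item can be sketched like this. It is a simplified illustration: `HybridFeed` and `celebrity_threshold` are hypothetical names, timestamps are passed in explicitly, and ranking is by recency only (an engagement signal or ML model would replace the sort).

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Post = Tuple[int, str, str]  # (timestamp, author, post_id)


class HybridFeed:
    """Push posts from normal users at write time; pull celebrity posts at read time."""

    def __init__(self, celebrity_threshold: int) -> None:
        self._threshold = celebrity_threshold
        self._followers: Dict[str, Set[str]] = defaultdict(set)
        self._following: Dict[str, Set[str]] = defaultdict(set)
        self._pushed: Dict[str, List[Post]] = defaultdict(list)
        self._celebrity_posts: Dict[str, List[Post]] = defaultdict(list)

    def follow(self, follower: str, author: str) -> None:
        self._followers[author].add(follower)
        self._following[follower].add(author)

    def is_celebrity(self, author: str) -> bool:
        return len(self._followers[author]) >= self._threshold

    def publish(self, author: str, post_id: str, timestamp: int) -> None:
        post = (timestamp, author, post_id)
        if self.is_celebrity(author):
            # One write, no matter how many followers the author has.
            self._celebrity_posts[author].append(post)
        else:
            for follower in self._followers[author]:
                self._pushed[follower].append(post)

    def get_feed(self, user: str, limit: int = 20) -> List[str]:
        """Merge the pushed feed with posts pulled from followed celebrities."""
        candidates = list(self._pushed[user])
        for author in self._following[user]:
            candidates.extend(self._celebrity_posts[author])
        candidates.sort(reverse=True)  # newest first
        return [post_id for _, _, post_id in candidates[:limit]]
```

The read path does slightly more work per request, but a celebrity post no longer triggers millions of feed writes.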

Talking Points During Interview

When introducing your design:

  • "Let me outline the system at a high level..."
  • "The key insight here is [insight]"
  • "This design makes [requirement] easy"

When defending a choice:

  • "We chose [option] because, given [constraint], it works better than [alternative]"
  • "The trade-off is [cost] for [benefit]"
  • "This would change if [different constraint]"

When asked about scaling:

  • "Currently [component] is the bottleneck"
  • "We'd scale [direction] because [reason]"
  • "This approach works until [limit], then we'd [next evolution]"

When asked about failure:

  • "If [component] fails, [other component] takes over"
  • "We'd detect it via [monitoring], then [recovery action]"
  • "This is why we replicate [data/component]"

Red Flags to Avoid

  • Diving into implementation details too early
  • Not asking clarifying questions
  • Designing for scale you don't need
  • Making technology choices without justification
  • Ignoring operational reality
  • Treating consistency/availability as separate concerns
  • Not discussing trade-offs

  • ✓ Start broad, add detail on request
  • ✓ Ask clarifying questions upfront
  • ✓ Design for the specified scale
  • ✓ Justify technology choices
  • ✓ Consider how humans operate it
  • ✓ Explicitly discuss trade-offs
  • ✓ Show you understand what you don't know

Success Criteria

You're ready when you can:

  • ✓ Clarify ambiguous requirements with good questions
  • ✓ Outline architecture clearly on a whiteboard
  • ✓ Explain each component's role
  • ✓ Justify your technology choices
  • ✓ Discuss trade-offs explicitly
  • ✓ Handle "what if" questions with confidence
  • ✓ Show understanding of operational reality
  • ✓ Demonstrate Staff+ systems thinking