Deliberation, Debate & Red Teaming: Advanced Methodology
Workflow
Copy this checklist for advanced red team scenarios:
Advanced Red Teaming Progress:
- [ ] Step 1: Select appropriate red team technique
- [ ] Step 2: Design adversarial simulation or exercise
- [ ] Step 3: Facilitate session and capture critiques
- [ ] Step 4: Synthesize findings with structured argumentation
- [ ] Step 5: Build consensus on mitigations
Step 1: Select appropriate red team technique - Match technique to proposal complexity and stakes. See Technique Selection.
Step 2: Design adversarial simulation - Structure attack trees, pre-mortem, wargaming, or tabletop exercise. See techniques below.
Step 3: Facilitate session - Manage group dynamics, overcome defensiveness, calibrate intensity. See Facilitation Techniques.
Step 4: Synthesize findings - Use structured argumentation to evaluate critique validity. See Argumentation Framework.
Step 5: Build consensus - Align stakeholders on risk prioritization and mitigations. See Consensus Building.
Technique Selection
Match technique to proposal characteristics:
| Proposal Type | Complexity | Stakes | Group Size | Best Technique |
|---|---|---|---|---|
| Security/Architecture | High | High | 3-5 | Attack Trees |
| Strategy/Product | Medium | High | 5-10 | Pre-mortem |
| Policy/Process | Medium | Medium | 8-15 | Tabletop Exercise |
| Crisis Response | High | Critical | 4-8 | Wargaming |
| Feature/Design | Low | Medium | 3-5 | Structured Critique (template.md) |
Time availability:
- 1-2 hours: Structured critique (template.md), Pre-mortem
- Half-day: Tabletop exercise, Attack trees
- Full-day: Wargaming, Multi-round simulation
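If technique selection is tracked in tooling rather than by eye, the table above can be encoded as data. A minimal sketch in Python, assuming a hypothetical `suggest_technique` helper; the rows mirror the table and are not an exhaustive rule set:

```python
from typing import Optional

# Illustrative encoding of the technique-selection table above.
TECHNIQUES = [
    # (proposal type, complexity, stakes, group size range, technique)
    ("Security/Architecture", "High",   "High",     (3, 5),  "Attack Trees"),
    ("Strategy/Product",      "Medium", "High",     (5, 10), "Pre-mortem"),
    ("Policy/Process",        "Medium", "Medium",   (8, 15), "Tabletop Exercise"),
    ("Crisis Response",       "High",   "Critical", (4, 8),  "Wargaming"),
    ("Feature/Design",        "Low",    "Medium",   (3, 5),  "Structured Critique"),
]

def suggest_technique(proposal_type: str, group_size: int) -> Optional[str]:
    """Return the recommended technique for a proposal type and group size."""
    for ptype, _complexity, _stakes, (low, high), technique in TECHNIQUES:
        if ptype == proposal_type and low <= group_size <= high:
            return technique
    return None  # fall back to facilitator judgment

print(suggest_technique("Strategy/Product", 7))  # -> Pre-mortem
```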
1. Attack Trees
What Are Attack Trees?
Systematic enumeration of attack vectors against a system. Start with the attacker's goal as the root node, then decompose it into sub-goals using AND/OR logic.
Use case: Security architecture, product launches with abuse potential
Building Attack Trees
Process:
- Define attacker goal (root node): "Compromise user data"
- Decompose with AND/OR gates:
- OR gate: Attacker succeeds if ANY child succeeds
- AND gate: Must achieve ALL children
- Assign properties to each path: Feasibility (1-5), Cost (L/M/H), Detection (1-5)
- Identify critical paths: High feasibility + low detection + low cost
- Design mitigations: Prevent (remove vulnerability), Detect (monitoring), Respond (incident plan)
Example tree:
[Compromise user data]
OR
├─ [Exploit API] → SQL injection / Auth bypass / Rate limit bypass
├─ [Social engineer] → Phish credentials AND Access admin panel
└─ [Physical access] → Breach datacenter AND Extract disk
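A minimal sketch of this tree as code, assuming hypothetical `Leaf`/`Gate` classes and illustrative scores. It flags individual attack steps that meet the critical-path criteria from the template (feasibility ≥4, detection ≤2); treat AND branches with care, since every child must also be achievable.

```python
from dataclasses import dataclass, field

@dataclass
class Leaf:
    name: str
    feasibility: int  # 1-5, higher = easier for the attacker
    cost: str         # "L" / "M" / "H"
    detection: int    # 1-5, higher = more likely to be detected

@dataclass
class Gate:
    name: str
    kind: str                 # "OR": any child suffices; "AND": all children needed
    children: list = field(default_factory=list)

def critical_leaves(node, min_feasibility=4, max_detection=2):
    """Flag attack steps that are easy to execute and hard to detect."""
    if isinstance(node, Leaf):
        if node.feasibility >= min_feasibility and node.detection <= max_detection:
            return [node.name]
        return []
    found = []
    for child in node.children:
        found.extend(critical_leaves(child, min_feasibility, max_detection))
    return found

# Hypothetical scores for the example tree above.
tree = Gate("Compromise user data", "OR", [
    Gate("Exploit API", "OR", [
        Leaf("SQL injection", feasibility=4, cost="L", detection=2),
        Leaf("Auth bypass", feasibility=2, cost="M", detection=3),
    ]),
    Gate("Social engineer", "AND", [
        Leaf("Phish credentials", feasibility=4, cost="L", detection=3),
        Leaf("Access admin panel", feasibility=2, cost="M", detection=4),
    ]),
])

print(critical_leaves(tree))  # -> ['SQL injection']
```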
Template:
## Attack Tree: [Goal]
**Attacker profile:** [Script kiddie / Insider / Nation-state]
**Attack paths:**
1. **[Attack vector]** - Feasibility: [1-5] | Cost: [L/M/H] | Detection: [1-5] | Critical: [Y/N] | Mitigation: [Defense]
2. **[Attack vector]** [Same structure]
**Critical paths:** [Feasibility ≥4, detection ≤2]
**Recommended defenses:** [Prioritized mitigations]
2. Pre-mortem
What Is a Pre-mortem?
Assume the proposal has failed at some future date, then work backwards to identify the causes. This exploits prospective hindsight: it is easier to imagine the causes of a known failure than to predict unknown risks.
Use case: Product launches, strategic decisions, high-stakes initiatives
Pre-mortem Process (100 min total)
- Set the stage (5 min): "It's [date]. Our proposal failed spectacularly. [Describe worst outcome]"
- Individual brainstorming (10 min): Each person writes 5-10 failure reasons independently
- Round-robin sharing (20 min): Go around room, each shares one reason until all surfaced
- Cluster and prioritize (15 min): Group similar, vote (3 votes/person), identify top 5-7
- Risk assessment (20 min): For each: Severity (1-5), Likelihood (1-5), Early warning signs
- Design mitigations (30 min): Preventative actions for highest-risk modes
Template:
## Pre-mortem: [Proposal]
**Scenario:** It's [date]. Failed. [Vivid worst outcome]
**Failure modes (by votes):**
1. **[Mode]** (Votes: [X]) - Why: [Root cause] | S: [1-5] L: [1-5] Score: [S×L] | Warnings: [Indicators] | Mitigation: [Action]
2. [Same structure]
**Showstoppers (≥15):** [Must-address]
**Revised plan:** [Changes based on pre-mortem]
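A minimal sketch of the scoring used in the template, assuming hypothetical failure modes and vote counts: risk score = Severity × Likelihood (each 1-5), and anything scoring 15 or above is treated as a showstopper.

```python
# Hypothetical pre-mortem output: failure modes with votes and S/L ratings.
failure_modes = [
    {"mode": "Key dependency slips",      "votes": 6, "severity": 5, "likelihood": 4},
    {"mode": "Onboarding confuses users", "votes": 4, "severity": 3, "likelihood": 4},
    {"mode": "Vendor raises prices",      "votes": 2, "severity": 2, "likelihood": 3},
]

SHOWSTOPPER_THRESHOLD = 15  # per the template: score >= 15 must be addressed

for fm in sorted(failure_modes, key=lambda f: f["votes"], reverse=True):
    score = fm["severity"] * fm["likelihood"]
    flag = "SHOWSTOPPER" if score >= SHOWSTOPPER_THRESHOLD else ""
    print(f'{fm["mode"]}: S={fm["severity"]} L={fm["likelihood"]} score={score} {flag}')
```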
Facilitator tips: Make failure vivid, encourage wild ideas, avoid blame, time-box ruthlessly
3. Wargaming
What Is Wargaming?
Multi-party simulation where teams play adversarial roles over multiple rounds. Reveals dynamic effects (competitor responses, escalation, unintended consequences).
Use case: Competitive strategy, crisis response, market entry
Wargaming Structure
Roles: Proposer team, Adversary team(s) (competitors, regulators), Control team (adjudicates outcomes)
Turn sequence per round (35 min):
- Planning (15 min): Teams plan moves in secret
- Execution (5 min): Reveal simultaneously
- Adjudication (10 min): Control determines outcomes, updates game state
- Debrief (5 min): Reflect on consequences
Process:
- Define scenario (30 min): Scenario, victory conditions per team, constraints
- Brief teams (15 min): Role sheets with incentives, capabilities, constraints
- Run 3-5 rounds (45 min each): Control introduces events to stress-test
- Post-game debrief (45 min): Strategies emerged, vulnerabilities exposed, contingencies needed
Template:
## Wargame: [Proposal]
**Scenario:** [Environment] | **Teams:** Proposer: [Us] | Adversary 1: [Competitor] | Adversary 2: [Regulator] | Control: [Facilitator]
**Victory conditions:** Proposer: [Goal] | Adversary 1: [Goal] | Adversary 2: [Goal]
**Round 1:** Proposer: [Move] | Adv1: [Response] | Adv2: [Response] | Outcome: [New state]
**Rounds 2-5:** [Same structure]
**Key insights:** [Unexpected dynamics, blind spots, countermoves]
**Recommendations:** [Mitigations, contingencies]
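If rounds are captured in a shared doc or tool rather than slides, the template above maps onto a simple record per round. An illustrative sketch; the `Wargame`/`Round` classes, team names, and moves are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Round:
    number: int
    moves: dict            # team name -> planned move (revealed simultaneously)
    adjudication: str      # control team's ruling on the combined outcome
    new_state: str         # game state carried into the next round

@dataclass
class Wargame:
    scenario: str
    victory_conditions: dict        # team name -> goal
    rounds: list = field(default_factory=list)

    def play_round(self, moves: dict, adjudication: str, new_state: str):
        self.rounds.append(Round(len(self.rounds) + 1, moves, adjudication, new_state))

game = Wargame(
    scenario="Entering market X",
    victory_conditions={"Proposer": "20% share in 12 months", "Competitor": "Defend share"},
)
game.play_round(
    moves={"Proposer": "Undercut pricing", "Competitor": "Bundle with existing product"},
    adjudication="Price war erodes both margins; regulator takes notice",
    new_state="Margins down 10%, regulatory inquiry opened",
)
print(len(game.rounds), game.rounds[0].adjudication)
```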
4. Tabletop Exercises
What Are Tabletop Exercises?
Structured walkthrough where participants discuss how they would respond to a scenario. Focuses on coordination, process gaps, and decision-making under stress.
Use case: Incident response, crisis management, operational readiness
Tabletop Process
- Design scenario (1 hr prep): Realistic incident with injects (new info at intervals), decision points
- Brief participants (10 min): Set scene, define roles, clarify it's simulation
- Run scenario (90 min): Present 5-7 injects, discuss responses (10-15 min each)
- Debrief (30 min): What went well? Gaps exposed? Changes needed?
Example inject sequence:
- T+0: "Alert fires: unusual DB access" → Who's notified? First action?
- T+15: "10K records accessed" → Who notify (legal, PR)? Communication?
- T+30: "CEO wants briefing, reporter called" → CEO message? PR statement?
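A minimal sketch of that inject sequence as a timed script a facilitator could print and run from; the times and questions mirror the example above, and the discussion window is an assumption taken from the 10-15 min guidance.

```python
# Hypothetical data-breach injects: (minutes from start, event, discussion prompts).
injects = [
    (0,  "Alert fires: unusual DB access",     "Who is notified? First action?"),
    (15, "10K records accessed",               "Who do we notify (legal, PR)? Communication plan?"),
    (30, "CEO wants briefing, reporter called", "CEO message? PR statement?"),
]

DISCUSSION_MINUTES = 15  # within the 10-15 min window per inject

for t, event, questions in injects:
    print(f"T+{t:02d}  INJECT: {event}")
    print(f"       Discuss ({DISCUSSION_MINUTES} min): {questions}")
```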
Template:
## Tabletop: [Scenario]
**Objective:** Test [plan/procedure] | **Participants:** [Roles] | **Scenario:** [Incident description]
**Injects:**
**T+0 - [Event]** | Q: [Who responsible? What action?] | Decisions: [Responses] | Gaps: [Unclear/missing]
**T+15 - [Escalation]** [Same structure]
**Debrief:** Strengths: [Worked well] | Gaps: [Process/tool/authority] | Recommendations: [Changes]
Facilitation Techniques
Managing Defensive Responses
| Pattern | Response | Goal |
|---|---|---|
| "We already thought of that" | "Great. Walk me through the analysis and mitigation?" | Verify claim, check adequacy |
| "That's not realistic" | "What makes this unlikely?" (Socratic) | Challenge without confrontation |
| "You don't understand context" | "You're right, help me. Can you explain [X]? How does that address [critique]?" | Acknowledge expertise, stay focused |
| Dismissive tone/eye-rolling | "I'm sensing resistance. The goal is to improve, not attack. What would help?" | Reset tone, reaffirm purpose |
Calibrating Adversarial Intensity
Too aggressive: Team shuts down, hostile | Too soft: Superficial critiques, groupthink
Escalation approach:
- Round 1: Curious questions ("What if X?")
- Round 2: Direct challenges ("Assumes Y, but what if false?")
- Round 3: Aggressive probing ("How does this survive Z?")
Adjust to culture:
- High-trust teams: Aggressive critique immediately
- Defensive teams: Start curious, frame as "helping improve"
"Yes, and..." technique: "Yes, solves X, AND creates Y for users Z" (acknowledges value + raises concern)
Facilitator Tactics
- Parking lot: "Important but out-of-scope. Capture for later."
- Redirect attacks: "Critique proposal, not people. Rephrase?"
- Balance airtime: "Let's hear from [quiet person]."
- Synthesize: "Here's what I heard: [3-5 themes]. Accurate?"
- Strategic silence: Wait 10+ sec after tough question. Forces deeper thinking.
Argumentation Framework
Toulmin Model for Evaluating Critiques
Use case: Determine if critique is valid or strawman
Components: Claim (assertion) + Data (evidence) + Warrant (logical link) + Backing (support for warrant) + Qualifier (certainty) + Rebuttal (conditions where claim fails)
Example:
- Claim: "Feature will fail, users won't adopt"
- Data: "5% beta adoption"
- Warrant: "Beta users = target audience, beta predicts production"
- Backing: "Past 3 features: beta adoption r=0.89 correlation"
- Qualifier: "Likely"
- Rebuttal: "Unless we improve onboarding (not in beta)"
Evaluating Critique Validity
Strong: Specific data, logical warrant, backing exists, acknowledges rebuttals | Weak (strawman): Vague hypotheticals, illogical warrant, no backing, ignores rebuttals
Example evaluation: "API is slow because of complex DB queries" | Data: "5+ table joins" ✓ | Warrant: "Multi-table joins are slow" ✓ | Backing: "Prior 5+ join queries took 2s" ✓ | Rebuttal acknowledged? No (caching, indexes) | Verdict: Moderate strength, address the rebuttal
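A minimal sketch of the Toulmin components as a record with a rough completeness check mirroring the strong/weak criteria above. The `Critique` class and the strength heuristic are illustrative assumptions, not a formal scoring method:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Critique:
    claim: str
    data: Optional[str] = None        # evidence offered
    warrant: Optional[str] = None     # logical link from data to claim
    backing: Optional[str] = None     # support for the warrant
    qualifier: Optional[str] = None   # e.g. "likely", "in most cases"
    rebuttal: Optional[str] = None    # conditions under which the claim fails

    def strength(self) -> str:
        """Rough heuristic: how many of the load-bearing components are present."""
        present = sum(bool(x) for x in (self.data, self.warrant, self.backing, self.rebuttal))
        if present == 4:
            return "strong"
        if present >= 2:
            return "moderate - ask for the missing pieces"
        return "weak (possible strawman)"

c = Critique(
    claim="API will be slow",
    data="Queries join 5+ tables",
    warrant="Multi-table joins are slow at our data volume",
    backing="Previous 5+ join queries took ~2s",
    rebuttal=None,  # caching/indexes not addressed
)
print(c.strength())  # -> moderate - ask for the missing pieces
```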
Structured Rebuttal
Proposer response:
- Accept: Valid, will address → Add to mitigation
- Refine: Partially valid → Clarify conditions
- Reject: Invalid → Provide counter-data + counter-warrant (substantive, not dismissive)
Consensus Building
Multi-Stakeholder Alignment (65 min)
Challenge: Different stakeholders prioritize different risks
Process:
- Acknowledge perspectives (15 min): Each states top concern, facilitator captures
- Identify shared goals (10 min): What do all agree on?
- Negotiate showstoppers (30 min): For risks ≥15, discuss: Is this truly showstopper? Minimum mitigation? Vote if needed (stakeholder-weighted scoring)
- Accept disagreements (10 min): Decision-maker breaks tie on non-showstoppers. Document dissent.
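Where a vote is needed, stakeholder-weighted scoring can be as simple as the sketch below; the stakeholders, weights, and simple-majority rule are hypothetical and should be agreed before the session.

```python
# Hypothetical weights agreed up front; higher weight = more exposure to the risk.
weights = {"Security": 3, "Product": 2, "Legal": 3, "Engineering": 2}

# 1 = treat the risk as a showstopper, 0 = accept with mitigation.
votes = {"Security": 1, "Product": 0, "Legal": 1, "Engineering": 0}

weighted_yes = sum(weights[s] for s, v in votes.items() if v == 1)
total_weight = sum(weights.values())

print(f"Showstopper support: {weighted_yes}/{total_weight}")
if weighted_yes > total_weight / 2:
    print("Treat as showstopper: require mitigation before proceeding")
else:
    print("Not a showstopper: document dissent, decision-maker proceeds")
```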
Delphi Method (Asynchronous)
Use case: Distributed team, avoid group pressure
Process: Round 1 (independent assessments) → Round 2 (share anonymized results, experts revise) → Round 3 (share aggregate, final assessments) → Convergence or decision-maker adjudicates
Advantage: Reduces groupthink and the HiPPO (highest-paid person's opinion) effect | Disadvantage: Slower (days/weeks)
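A minimal sketch of the aggregation step, assuming risk estimates on a 1-5 scale and a hypothetical convergence threshold on the spread of responses:

```python
import statistics

def summarize(estimates):
    """Anonymized summary shared back to experts between rounds."""
    return {"median": statistics.median(estimates), "stdev": statistics.pstdev(estimates)}

round_1 = [2, 5, 4, 3, 5]          # independent assessments
print("Round 1:", summarize(round_1))

round_2 = [3, 4, 4, 4, 5]          # revised after seeing the anonymized summary
summary = summarize(round_2)
print("Round 2:", summary)

CONVERGENCE_STDEV = 0.75  # hypothetical threshold for "close enough"
if summary["stdev"] <= CONVERGENCE_STDEV:
    print("Converged: adopt the median as the group estimate")
else:
    print("No convergence: run another round or escalate to the decision-maker")
```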
Advanced Critique Patterns
Second-Order Effects
Identify ripple effects: "If we change this, what happens next? Then what?" (3-5 iterations)
Example: Launch referral program → Users invite friends → Invited users show lower engagement (didn't choose the product organically) → Churn ↑, LTV ↓ → Unit economics worsen → Budget cuts
Inversion
Ask "How do we guarantee failure?" then check if proposal avoids those modes
Example: New market entry
- Inversion: Wrong product-market fit, underestimate competition, violate regulations, misunderstand culture
- Check: Market research? Regulatory review? Localization?
Assumption Surfacing
For each claim: "What must be true for this to work?"
Example: "Feature increases engagement 20%"
- Assumptions: Users want it (validated?), will discover it (discoverability?), works reliably (load tested?), 20% credible (source?)
- Test each. If questionable, critique valid.
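A minimal sketch of tracking surfaced assumptions against their validation status, using the engagement-feature example above; the status values and evidence fields are hypothetical.

```python
# Hypothetical assumption log for "Feature increases engagement 20%".
assumptions = [
    {"assumption": "Users want the feature",       "validated": True,  "evidence": "User interviews"},
    {"assumption": "Users will discover it",        "validated": False, "evidence": None},
    {"assumption": "Works reliably under load",     "validated": False, "evidence": None},
    {"assumption": "20% uplift estimate credible",  "validated": False, "evidence": None},
]

untested = [a["assumption"] for a in assumptions if not a["validated"]]
if untested:
    print("Critique stands until these are tested:", ", ".join(untested))
```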
Common Pitfalls & Mitigations
| Pitfall | Detection | Mitigation |
|---|---|---|
| Analysis paralysis | Red team drags on for weeks, no decision | Time-box exercise (half-day max). Focus on showstoppers only. |
| Strawman arguments | Critiques are unrealistic or extreme | Use Toulmin model to evaluate. Require data and backing. |
| Groupthink persists | All critiques are minor, no real challenges | Use adversarial roles explicitly. Pre-mortem or attack trees force critical thinking. |
| Defensive shutdown | Team rejects all critiques, hostility | Recalibrate tone. Use "Yes, and..." framing. Reaffirm red team purpose. |
| HiPPO effect | Highest-paid person's opinion dominates | Anonymous brainstorming (pre-mortem). Delphi method. |
| No follow-through | Great critiques, no mitigations implemented | Assign owners and deadlines to each mitigation. Track in project plan. |
| Red team as rubber stamp | Critique is superficial, confirms bias | Choose truly adversarial roles. Bring in external red team if internal team too aligned. |
| Over-optimization of low-risk items | Spending time on low-impact risks | Use risk matrix. Only address showstoppers and high-priority. |