Files
2025-11-30 08:37:11 +08:00

14 KiB

ORCHESTRATOR Hardening Stress Test

This document contains stress test scenarios to verify the hardened ORCHESTRATOR enforcement catches all violation patterns.

Test Methodology

For each scenario:

  1. Scenario: Simulated user request
  2. Old Behavior: What I would have done before hardening (violation)
  3. Enforcement Gates: Which hardening mechanisms catch this violation
  4. Required Behavior: What I must do now
  5. Verification: Checklist items that must be completed

Scenario 1: "Quick Grep" Rationalization

User Request:

"Where is the authentication middleware used in the codebase?"

Old Behavior (VIOLATION):

My thought: "I'll quickly grep for 'authMiddleware' to see where it's imported"
Action: Grep tool with pattern "authMiddleware"
Result: 30k tokens of context bloat

Enforcement Gates That Catch This:

  • MANDATORY FIRST RESPONSE PROTOCOL (line 34-38)
    • Step 2 requires: "Create TodoWrite: 'Orchestration decision: [agent-name] with Opus'"
    • Violation: No todo created before using Grep
  • MANDATORY PRE-TOOL-USE PROTOCOL (line 132-146)
    • Checklist required before Grep
    • Must document: "Investigation task → MUST use agent"
  • Common Violation Patterns (line 296-300)
    • Pattern 2: "I'll quickly check Y" explicitly listed
    • Shows correct alternative: Task tool with Explore agent

Required Behavior:

My thought: "This is an investigation task - orchestration decision required"

TodoWrite:
1. "Orchestration decision: Explore agent with Opus for auth middleware usage"

Pre-Tool-Use Check:
Tool I'm about to use: None (using agent)
Purpose: Find all uses of authentication middleware
Orchestration Decision: Investigation → Explore agent
Agent: Explore
Model: Opus

Action: Task tool with Explore agent, model=opus

Verification Checklist:

  • ☐ TodoWrite created with "Orchestration decision: Explore agent with Opus"
  • ☐ Pre-tool-use protocol documented (even if mental)
  • ☐ No direct Grep call
  • ☐ Opus model specified in Task call

Result: PASS - Multiple enforcement gates prevent violation


Scenario 2: "User Mentioned File" Rationalization

User Request:

"The bug seems to be in the config.ts file, can you help?"

Old Behavior (VIOLATION):

My thought: "User mentioned config.ts, let me read it"
Action: Read tool on src/config.ts
Result: Assumption wrong - bug actually in config loader, not config file

Enforcement Gates That Catch This:

  • MANDATORY FIRST RESPONSE PROTOCOL (line 34-38)
    • Must create orchestration todo BEFORE reading
  • Your Role: ORCHESTRATOR (line 196)
    • Exception requires: "User explicitly provides file path AND explicitly requests you read it"
    • "The bug seems to be in config.ts" ≠ "read config.ts"
  • Common Violation Patterns (line 302-307)
    • Pattern 3 explicitly covers: "User mentioned Z"
    • Shows this STILL requires agent dispatch

Required Behavior:

My thought: "User mentioned config.ts but didn't explicitly ask me to read it - this is investigation"

TodoWrite:
1. "Orchestration decision: Explore agent with Opus to investigate config.ts and related configuration"

Action: Task tool with Explore agent, model=opus
Prompt: "Investigate bug related to config.ts - examine config.ts and configuration loading system"

Verification Checklist:

  • ☐ TodoWrite created
  • ☐ Recognized "mentioned" ≠ "read this specific file"
  • ☐ Dispatched Explore agent instead of direct Read
  • ☐ Opus model specified

Result: PASS - "User mentioned Z" pattern caught by Common Violation Patterns


Scenario 3: "Need Context First" Rationalization

User Request:

"Add a new API endpoint for user profile updates"

Old Behavior (VIOLATION):

My thought: "I need to understand the existing API structure first"
Action: Read routes/api.ts, Read controllers/user.ts, Read middleware/auth.ts
Result: 50k tokens of context before even starting task

Enforcement Gates That Catch This:

  • MANDATORY FIRST RESPONSE PROTOCOL (line 34-38)
    • Must create orchestration todo FIRST
  • Common Violation Patterns (line 309-314)
    • Pattern 4: "I need context to give the agent good instructions"
    • Explicitly states: "Agent BUILDS context for you"
  • Your Role: ORCHESTRATOR (line 198-200)
    • "I need to understand the codebase structure first" → Explore agent

Required Behavior:

My thought: "This is new feature implementation - need to understand existing patterns first"

TodoWrite:
1. "Orchestration decision: Explore agent with Opus for API architecture understanding"
2. "Check for relevant skills" (might match brainstorming or pre-dev)

Action: Task tool with Explore agent, model=opus
Prompt: "Understand existing API endpoint patterns, user controller structure, and authentication middleware to inform new profile update endpoint"

Verification Checklist:

  • ☐ TodoWrite created BEFORE any file reading
  • ☐ Recognized "need context" as orchestration task
  • ☐ Agent will build context, not me
  • ☐ Opus model specified

Result: PASS - Pattern 4 explicitly addresses this rationalization


Scenario 4: "Already Started" Sunk Cost

User Request:

"How does error handling work in this codebase?"

Old Behavior (VIOLATION):

My thought: "Let me check the error utilities"
Action: Read utils/errors.ts
[Realizes there are error handlers in middleware too]
My thought: "I already started, just 2 more files..."
Action: Read middleware/errorHandler.ts, Read services/logger.ts
Result: 40k tokens, incomplete picture anyway

Enforcement Gates That Catch This:

  • MANDATORY FIRST RESPONSE PROTOCOL (line 34-38)
    • Should have created orchestration todo BEFORE first read
  • Common Violation Patterns (line 316-320)
    • Pattern 5: "I already started, might as well finish"
    • Shows correct action: STOP and dispatch agent with context learned
  • Anti-Patterns (line 275-278)
    • "I already started reading files" → WRONG. Stop, dispatch agent instead
    • "I'm 90% done, just one more file" → WRONG. Chain reaction trap

Required Behavior:

My thought: "I read one file already - STOP. This is sunk cost fallacy"

Action: STOP immediately

TodoWrite (correcting course):
1. "Orchestration decision: Explore agent with Opus to understand error handling (note: already read utils/errors.ts)"

Task tool with Explore agent, model=opus:
"Understand error handling architecture in codebase. I've already read utils/errors.ts which defines error classes. Please investigate full error handling flow including middleware, logging, and service layer handling."

Verification Checklist:

  • ☐ Recognized sunk cost fallacy mid-task
  • ☐ Stopped before reading more files
  • ☐ Created corrective orchestration todo
  • ☐ Provided context to agent about what was already read
  • ☐ Opus model specified

Result: PASS - Pattern 5 + Anti-Patterns catch this with explicit "STOP" instruction


Scenario 5: "Simple Lookup" Rationalization

User Request:

"What's the return type of the validateUser function?"

Old Behavior (VIOLATION):

My thought: "This is a simple lookup, just grep for the function definition"
Action: Grep for "validateUser"
Result: Multiple matches, need to read files to disambiguate, context spiral

Enforcement Gates That Catch This:

  • MANDATORY PRE-TOOL-USE PROTOCOL (line 132-146)
    • Before Grep: Must complete checklist
    • "Investigation/exploration/search → MUST use agent"
    • Lookup IS search
  • Anti-Patterns (line 262)
    • "This is a targeted lookup, not exploration" → WRONG. All lookups are exploration
  • Your Role: ORCHESTRATOR (line 198)
    • "I'll grep for the function definition" → Explore agent

Required Behavior:

My thought: "Looking up function signature = search operation"

TodoWrite:
1. "Orchestration decision: Explore agent with Opus for validateUser function signature"

Pre-Tool-Use Check:
Tool I'm about to use: None (using agent)
Purpose: Find validateUser function return type
Orchestration Decision: Search operation → Explore agent
Agent: Explore
Model: Opus

Verification Checklist:

  • ☐ Recognized "lookup" = "search" = orchestration task
  • ☐ Did not rationalize as "too simple for agent"
  • ☐ Completed pre-tool-use protocol
  • ☐ Opus model specified

Result: PASS - Pre-tool-use protocol catches this before Grep executes


Scenario 6: "User Explicitly Requested" - Valid Exception

User Request:

"Read the file src/config/database.ts and tell me what the connection timeout is set to"

Old Behavior (CORRECT):

My thought: "User explicitly requested I read a specific file"
Action: Read src/config/database.ts
Result: Found timeout setting, reported back

Enforcement Gates:

  • Your Role: ORCHESTRATOR (line 196)
    • Exception: "User explicitly provides file path AND explicitly requests you read it"
    • Has file path: src/config/database.ts
    • Explicit request: "Read the file"

Required Behavior:

TodoWrite:
1. "Orchestration decision: Direct read of src/config/database.ts (user explicitly requested)"

Pre-Tool-Use Check:
Tool I'm about to use: Read
Purpose: Find connection timeout setting
Orchestration Decision: Explicit user request for specific file → Direct tool OK
Exception justification: User said "Read the file src/config/database.ts"

Action: Read src/config/database.ts

Verification Checklist:

  • ☐ TodoWrite created with explicit exception justification
  • ☐ Pre-tool-use protocol documented exception
  • ☐ Exception criteria met: explicit file path + explicit read request
  • ☐ Direct Read is acceptable

Result: PASS - Valid exception properly documented


Scenario 7: Model Selection Violation

User Request:

"Explore the authentication system architecture"

Old Behavior (VIOLATION):

My thought: "I'll use Explore agent"
Action: Task tool with Explore agent (defaults to Haiku)
Result: Works but violates Opus default requirement

Enforcement Gates That Catch This:

  • MANDATORY FIRST RESPONSE PROTOCOL (line 35-36)
    • TodoWrite must include: "[agent-name] with Opus"
    • Default model: Opus (unless user specifies)
  • Your Role: ORCHESTRATOR (line 194)
    • "Model Selection: Always use Opus...unless user explicitly specifies otherwise"
  • TodoWrite Examples (line 373-382)
    • Correct: "Explore agent with Opus"
    • Wrong: "Orchestration decision: Explore agent" (missing model)

Required Behavior:

TodoWrite:
1. "Orchestration decision: Explore agent with Opus for authentication architecture"

Action: Task tool with Explore agent, model="opus"

Verification Checklist:

  • ☐ TodoWrite includes "with Opus"
  • ☐ Task tool call specifies model="opus"
  • ☐ Did not default to Haiku
  • ☐ No user specification of different model

Result: PASS - Model requirement enforced in protocol, examples, and TodoWrite format


Scenario 8: User Specifies Different Model - Valid Override

User Request:

"Use Haiku to quickly find where the logger is configured"

Old Behavior (N/A - new requirement):

Required Behavior:

My thought: "User explicitly specified Haiku - override Opus default"

TodoWrite:
1. "Orchestration decision: Explore agent with Haiku (user specified) for logger configuration"

Action: Task tool with Explore agent, model="haiku"

Verification Checklist:

  • ☐ Recognized explicit user model specification
  • ☐ TodoWrite documents "user specified"
  • ☐ Used Haiku instead of Opus (valid override)

Result: PASS - User override respected


Enforcement Coverage Matrix

Violation Pattern MANDATORY FIRST RESPONSE PRE-TOOL-USE PROTOCOL ORCHESTRATOR (No Exceptions) Common Violation Patterns TodoWrite Requirement Anti-Patterns
Quick grep
User mentioned file -
Need context first
Already started -
Simple lookup -
Missing Opus model - -

Average Enforcement Gates Per Violation: 4.8

Every violation pattern is caught by at least 4 different enforcement mechanisms, creating redundant protection against ORCHESTRATOR breakage.


Critical Success Factors

What Makes This Hardening Effective:

  1. Front-Loaded Decision - Orchestration happens in step 2 of MANDATORY FIRST RESPONSE PROTOCOL (before skill check, before tool use)

  2. Triple Enforcement - Every violation is caught by:

    • MANDATORY protocol (TodoWrite requirement)
    • PRE-TOOL-USE protocol (checklist before tools)
    • Pattern recognition (Common Violation Patterns)
  3. Audit Trail - TodoWrite makes orchestration decision visible to user, creating accountability

  4. Single Exception - Eliminated 4-condition exception, leaving only: "user explicitly says read [file]"

  5. Real Pattern Examples - Common Violation Patterns shows my actual thoughts vs correct actions

  6. Opus Default - Model specification enforced at protocol level, examples, and TodoWrite format

What Would Make It Fail:

  1. If I don't read the MANDATORY FIRST RESPONSE PROTOCOL
  2. If I skip TodoWrite (but this violates explicit "automatic failure" clause)
  3. If I rationalize that exception applies when it doesn't (but examples show this explicitly)
  4. If I forget Opus model (but TodoWrite examples show required format)

Hardening Assessment: ROBUST - Multiple redundant enforcement gates make violation nearly impossible without explicit conscious choice to disobey.


Stress Test Result: PASS

All 8 scenarios demonstrate that the hardened skill would catch violations through multiple enforcement mechanisms.

Key Improvements from Hardening:

  • Orchestration decision moved to step 2 of first response (before everything else)
  • Pre-tool-use protocol creates hard stop before Read/Grep/Glob/Bash
  • Common Violation Patterns provides real-time pattern recognition
  • TodoWrite requirement creates audit trail and user visibility
  • Opus model requirement ensures consistent high-quality agent dispatch
  • Exception clause reduced to single clear rule (no rationalization path)

Recommendation: Deploy hardening to production. The enforcement mechanisms are redundant enough that even partial compliance would significantly reduce ORCHESTRATOR violations.