gh-martybonacci-specswarm/commands/orchestrate.md
2025-11-30 08:39:24 +08:00

description
Run automated workflow orchestration with agent execution and validation

User Input

$ARGUMENTS

You MUST consider the user input before proceeding (if not empty).

Goal

Execute autonomous workflow orchestration:

  1. Parse Workflow - Read workflow specification
  2. Generate Comprehensive Prompt - Create detailed agent instructions
  3. Launch Agent - Use Task tool to execute in target project
  4. Validate Results - Run validation suite
  5. Report Outcome - Summarize results and next steps

Usage: /speclabs:orchestrate <workflow-file> <project-path>

Example:

/speclabs:orchestrate features/001-fix-bug/workflow.md /home/marty/code-projects/tweeter-spectest

Pre-Orchestration Hook

echo "🎯 Project Orchestrator - Fully Autonomous Execution"
echo ""
echo "Phase 1b: Full Automation (Zero Manual Steps)"
echo "Components: State Manager + Decision Maker + Prompt Refiner + Validation + Metrics"
echo "Automation: Agent Launch + Validation + Retry Loop"
echo ""

# Record start time
ORCHESTRATION_START_TIME=$(date +%s)

# Source Phase 1a/1b components
PLUGIN_DIR="/home/marty/code-projects/specswarm/plugins/speclabs"
source "${PLUGIN_DIR}/lib/state-manager.sh"
source "${PLUGIN_DIR}/lib/decision-maker.sh"
source "${PLUGIN_DIR}/lib/prompt-refiner.sh"
source "${PLUGIN_DIR}/lib/metrics-tracker.sh"

echo "✅ All components loaded"
echo "✅ Full automation enabled"
echo ""

Execution Steps

Step 1: Parse Arguments

# Get arguments
ARGS="$ARGUMENTS"

# Parse test workflow file (required)
TEST_WORKFLOW=$(echo "$ARGS" | awk '{print $1}')

# Parse project path (required)
PROJECT_PATH=$(echo "$ARGS" | awk '{print $2}')

# Validate arguments
if [ -z "$TEST_WORKFLOW" ] || [ -z "$PROJECT_PATH" ]; then
  echo "❌ Error: Missing required arguments"
  echo ""
  echo "Usage: /speclabs:orchestrate <workflow-file> <project-path>"
  echo ""
  echo "Example:"
  echo "/speclabs:orchestrate features/001-fix-bug/workflow.md /home/marty/code-projects/tweeter-spectest"
  exit 1
fi

# Check files exist
if [ ! -f "$TEST_WORKFLOW" ]; then
  echo "❌ Error: Test workflow file not found: $TEST_WORKFLOW"
  exit 1
fi

if [ ! -d "$PROJECT_PATH" ]; then
  echo "❌ Error: Project path does not exist: $PROJECT_PATH"
  exit 1
fi

echo "📋 Test Workflow: $TEST_WORKFLOW"
echo "📁 Project: $PROJECT_PATH"
echo ""

Step 2: Parse Test Workflow

Read the test workflow file and extract:

  • Task description
  • Files to modify
  • Expected outcome
  • Validation criteria

Test Workflow Format (Markdown):

# Test: [Task Name]

## Description
[What needs to be done]

## Files to Modify
- path/to/file1.ts
- path/to/file2.ts

## Changes Required
[Detailed description of changes]

## Expected Outcome
[What should happen after changes]

## Validation
- [ ] Criterion 1
- [ ] Criterion 2
- [ ] Criterion 3

## Test URL
http://localhost:5173/[path]

echo "📖 Parsing Test Workflow..."
echo ""

# Read workflow file
WORKFLOW_CONTENT=$(cat "$TEST_WORKFLOW")

# Extract task name (first heading)
TASK_NAME=$(echo "$WORKFLOW_CONTENT" | grep "^# Test:" | head -1 | sed 's/^# Test: //')

if [ -z "$TASK_NAME" ]; then
  echo "❌ Error: Invalid workflow format (missing '# Test:' heading)"
  exit 1
fi

echo "✅ Task: $TASK_NAME"
echo ""
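The command only extracts the task name, but the other workflow sections (files to modify, changes required, validation criteria) can be pulled out the same way. A sketch using standard awk; `extract_section` is a hypothetical helper, not part of the plugin:

```shell
# Hypothetical helper (not in lib/*.sh): extract the body of one markdown
# section from a workflow file, e.g. the "## Files to Modify" list.
extract_section() {
  # $1 = section heading text, $2 = workflow file path
  awk -v h="## $1" '
    $0 == h         { found = 1; next }  # start after the matching heading
    found && /^## / { exit }             # stop at the next section heading
    found           { print }
  ' "$2"
}
```

Usage would look like `FILES_LIST=$(extract_section "Files to Modify" "$TEST_WORKFLOW")`.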

# Create orchestration session
echo "📝 Creating orchestration session..."
SESSION_ID=$(state_create_session "$TEST_WORKFLOW" "$PROJECT_PATH" "$TASK_NAME")
echo "✅ Session: $SESSION_ID"
echo ""
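The real `state_create_session` lives in lib/state-manager.sh and is not shown here. As a rough sketch of what it might do, assuming the `.specswarm/orchestrator/` layout this command documents (the JSON fields below are illustrative):

```shell
# Sketch only: the actual implementation and state schema belong to
# lib/state-manager.sh and may differ.
state_create_session_sketch() {
  workflow="$1"; project="$2"; task="$3"
  id="session-$(date +%Y%m%d-%H%M%S)-$$"
  dir=".specswarm/orchestrator/$id"
  mkdir -p "$dir"
  printf '{"workflow":"%s","project":"%s","task":"%s","status":"active","retries":{"count":0}}\n' \
    "$workflow" "$project" "$task" > "$dir/state.json"
  echo "$id"   # caller captures the session ID
}
```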

Step 3: Generate Comprehensive Prompt

This is the CORE of Project Orchestrator - generating effective prompts for agents.

Prompt Template:

# Task: [TASK_NAME]

## Context
You are working on project: [PROJECT_PATH]

This is an autonomous development task executed by Project Orchestrator (Phase 1b).

## Your Mission
[TASK_DESCRIPTION from workflow]

## Files to Modify
[FILES_LIST from workflow]

## Detailed Requirements
[CHANGES_REQUIRED from workflow]

## Success Criteria
[VALIDATION_CRITERIA from workflow]

## Technical Guidelines
- Make ALL changes in [PROJECT_PATH] directory
- Follow existing code patterns and conventions
- Ensure all changes are complete and functional
- Run tests if applicable
- Report any blockers or issues

## Validation
After completing your work, the orchestrator will:
1. Run browser validation with Playwright
2. Check console for errors
3. Analyze UI with Claude Vision API
4. Verify success criteria

## Execution Instructions
1. Read the workflow file to understand requirements
2. Analyze existing code in target files
3. Make necessary changes
4. Verify changes work correctly
5. Report completion with summary

Work autonomously. Only escalate if you encounter blockers that prevent completion.

---

**Test Workflow File**: [WORKFLOW_FILE_PATH]

Please read this file first to understand the full requirements, then proceed with implementation.

Generate the actual prompt:

echo "✍️  Generating comprehensive prompt..."
echo ""

# Create prompt (simplified: reference the workflow file and let the agent read the details)
AGENT_PROMPT="# Task: ${TASK_NAME}

## Context
You are working on project: ${PROJECT_PATH}

This is an autonomous development task executed by Project Orchestrator (Phase 1b).

## Your Mission
Read and execute the test workflow specified in: ${TEST_WORKFLOW}

## Technical Guidelines
- Make ALL changes in ${PROJECT_PATH} directory
- Follow existing code patterns and conventions
- Ensure all changes are complete and functional
- Run tests if applicable
- Report any blockers or issues

## Execution Instructions
1. Read the workflow file: ${TEST_WORKFLOW}
2. Understand all requirements and validation criteria
3. Analyze existing code in target files
4. Make necessary changes
5. Verify changes work correctly
6. Report completion with summary

Work autonomously. Only escalate if you encounter blockers that prevent completion.

---

**IMPORTANT**: All file operations must be in the ${PROJECT_PATH} directory.
"

echo "✅ Prompt generated ($(echo "$AGENT_PROMPT" | wc -w) words)"
echo ""

Step 4: Automated Orchestration with Intelligent Retry

Phase 1b Full Automation: Agent launch + Validation + Retry loop - all automatic!

echo "🔄 Starting Automated Orchestration (max 3 attempts)..."
echo ""

# Configuration
MAX_RETRIES=3

I will now execute the orchestration with automatic retry logic:

For each attempt (up to 3):

  1. Check current retry count from state
  2. Select prompt (original on first attempt, refined on retries)
  3. Launch agent automatically using Task tool
  4. Run validation automatically using orchestrate-validate
  5. Parse validation results and update state
  6. Make decision (complete/retry/escalate)
  7. If retry needed: Resume state and loop back to step 1
  8. If complete or escalate: Exit loop and proceed to report
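The control flow above can be sketched as a plain shell loop. The decision step is stubbed out below; in the real command it comes from `decision_make` in lib/decision-maker.sh, and agent launch and validation are omitted entirely:

```shell
# decision_make_stub stands in for the real decision_make; here it pretends
# validation passes on the 2nd attempt.
decision_make_stub() {
  [ "$1" -ge 2 ] && echo complete || echo retry
}

run_orchestration_sketch() {
  max_retries=3 attempt=1
  while [ "$attempt" -le "$max_retries" ]; do
    # ...launch agent, run validation, update state (omitted)...
    decision=$(decision_make_stub "$attempt")
    case "$decision" in
      complete) echo "complete on attempt $attempt"; return 0 ;;
      escalate) echo "escalated on attempt $attempt"; return 1 ;;
      retry)    attempt=$((attempt + 1)) ;;
    esac
  done
  echo "escalated: max retries reached"
  return 1
}
```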

Automatic Loop Implementation:

Let me execute the retry loop automatically...

# This will be executed by Claude Code in the loop
echo "Orchestration loop will execute automatically..."
echo ""

Orchestration Loop Execution

I'll now begin the automatic orchestration loop. For each attempt:

Attempt Execution:

  1. Check State & Select Prompt
# Get current retry count
RETRY_COUNT=$(state_get "$SESSION_ID" "retries.count")
ATTEMPT_NUM=$((RETRY_COUNT + 1))

echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo "📍 Attempt $ATTEMPT_NUM of $MAX_RETRIES"
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""

if [ "$RETRY_COUNT" -eq 0 ]; then
  # First attempt - use original prompt
  CURRENT_PROMPT="$AGENT_PROMPT"
  echo "📝 Using original prompt ($(echo "$CURRENT_PROMPT" | wc -w) words)"
else
  # Retry attempt - use refined prompt with failure context
  echo "🔧 Generating refined prompt with failure analysis..."
  CURRENT_PROMPT=$(prompt_refine "$SESSION_ID" "$AGENT_PROMPT")
  prompt_save "$SESSION_ID" "$CURRENT_PROMPT"
  echo "✅ Refined prompt generated ($(echo "$CURRENT_PROMPT" | wc -w) words)"

  # Show what changed
  prompt_diff "$SESSION_ID" "$AGENT_PROMPT" "$CURRENT_PROMPT" | head -20
fi

echo ""

# Update state for this attempt
state_update "$SESSION_ID" "agent.status" '"running"'
state_update "$SESSION_ID" "agent.started_at" "\"$(date -Iseconds)\""

# Save prompt for this attempt
PROMPT_FILE="/tmp/orchestrate-prompt-${SESSION_ID}-attempt-${ATTEMPT_NUM}.md"
echo "$CURRENT_PROMPT" > "$PROMPT_FILE"

echo "🚀 Launching agent for ${PROJECT_PATH}..."
echo ""
  2. Launch Agent - I'll use the Task tool to launch a general-purpose agent with the current prompt:
# Agent will be launched by Claude Code using Task tool
echo "Agent launch: Task tool with prompt from $PROMPT_FILE"
echo "Working directory: $PROJECT_PATH"
echo ""
  3. Agent Execution - Wait for the agent to complete the task...

  4. Update Agent Status

echo "✅ Agent completed execution"
echo ""

# Update state
state_update "$SESSION_ID" "agent.status" '"completed"'
state_update "$SESSION_ID" "agent.completed_at" "\"$(date -Iseconds)\""
  5. Run Validation - I'll execute the validation suite automatically:
echo "🔍 Running Automated Validation Suite..."
echo ""

# Update validation status
state_update "$SESSION_ID" "validation.status" '"running"'

Execute validation: /speclabs:orchestrate-validate ${PROJECT_PATH}

  6. Parse Validation Results
# Parse validation output and update state
# (Validation results will be captured from orchestrate-validate output)

echo ""
echo "📊 Validation Results:"
echo "- Playwright: [from validation output]"
echo "- Console Errors: [from validation output]"
echo "- Network Errors: [from validation output]"
echo "- Vision API: [from validation output]"
echo ""
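This command does not pin down a machine-readable shape for the validation output. One hedged possibility, assuming orchestrate-validate could be made to emit one `name: pass` or `name: fail` line per check (an assumed format, not what it actually produces):

```shell
# Assumed input format: one "name: pass" / "name: fail" line per check.
summarize_validation() {
  # grep -c prints 0 and exits nonzero when nothing matches, so guard it.
  fails=$(grep -c ': fail$' "$1") || true
  if [ "${fails:-0}" -eq 0 ]; then
    echo "pass"
  else
    echo "fail:$fails"
  fi
}
```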
  7. Make Decision
echo "🧠 Decision Maker Analysis..."
echo ""

# Make decision based on state
DECISION=$(decision_make "$SESSION_ID")
RETRY_COUNT=$(state_get "$SESSION_ID" "retries.count")

echo "Decision: $DECISION (after attempt $ATTEMPT_NUM)"
echo ""
  8. Process Decision
case "$DECISION" in
  complete)
    echo "✅ COMPLETE - Task successful!"

    # Record decision
    decision_record "$SESSION_ID" "complete" "Agent completed successfully and validation passed"

    # Mark session complete
    state_complete "$SESSION_ID" "success"

    # Track metrics
    metrics_track_completion "$SESSION_ID"

    echo ""
    echo "🎉 Orchestration Complete (success on attempt $ATTEMPT_NUM)"

    # Exit loop
    SHOULD_EXIT_LOOP=true
    ;;

  retry)
    echo "🔄 RETRY - Will attempt again with refined prompt"

    # Get detailed decision with reasoning
    DETAILED_DECISION=$(decision_make_detailed "$SESSION_ID")
    REASON=$(echo "$DETAILED_DECISION" | jq -r '.reason')

    echo "Reason: $REASON"
    echo ""

    # Analyze failure
    FAILURE_ANALYSIS=$(decision_analyze_failure "$SESSION_ID")
    FAILURE_TYPE=$(echo "$FAILURE_ANALYSIS" | jq -r '.failure_type')

    echo "Failure Type: $FAILURE_TYPE"

    # Generate retry strategy
    RETRY_STRATEGY=$(decision_generate_retry_strategy "$SESSION_ID")
    BASE_STRATEGY=$(echo "$RETRY_STRATEGY" | jq -r '.base_strategy')

    echo "Retry Strategy: $BASE_STRATEGY"
    echo ""

    # Record decision
    decision_record "$SESSION_ID" "retry" "$REASON"

    # Resume session for retry
    state_resume "$SESSION_ID"

    # Track retry
    metrics_track_retry "$SESSION_ID" "$FAILURE_TYPE"

    # Check if we should continue
    NEW_RETRY_COUNT=$(state_get "$SESSION_ID" "retries.count")

    if [ "$NEW_RETRY_COUNT" -ge "$MAX_RETRIES" ]; then
      echo "⚠️  Maximum retries reached. Escalating..."
      SHOULD_EXIT_LOOP=true
    else
      echo "↻ Continuing to next attempt..."
      echo ""
      # Loop will continue automatically
      SHOULD_EXIT_LOOP=false
    fi
    ;;

  escalate)
    echo "⚠️  ESCALATE - Human intervention required"

    # Get escalation message
    ESCALATION_MSG=$(decision_get_escalation_message "$SESSION_ID")

    echo ""
    echo "$ESCALATION_MSG"
    echo ""

    # Record decision
    decision_record "$SESSION_ID" "escalate" "Max retries exceeded or unrecoverable error"

    # Mark session escalated
    state_complete "$SESSION_ID" "escalated"

    # Track escalation
    metrics_track_escalation "$SESSION_ID"

    echo "🎯 Escalation Actions Required:"
    echo "1. Review session state: state_print_summary $SESSION_ID"
    echo "2. Examine failure history"
    echo "3. Manually fix issues or adjust workflow"
    echo "4. Re-run orchestration if needed"
    echo ""

    # Exit loop
    SHOULD_EXIT_LOOP=true
    ;;
esac

echo ""

Loop Control: After each attempt, I'll check if we should continue:

  • If decision is "complete" or "escalate" → Exit loop, proceed to report
  • If decision is "retry" and retries < max → Continue to next attempt
  • If decision is "retry" but retries >= max → Escalate and exit

I'll now execute this loop automatically, making up to 3 attempts with intelligent retry logic.


Step 5: Generate Orchestration Report

echo "📊 Orchestration Report"
echo "======================="
echo ""

# Print session summary using State Manager
state_print_summary "$SESSION_ID"

echo ""
echo "━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━"
echo ""

# Get final state
FINAL_STATUS=$(state_get "$SESSION_ID" "status")
RETRY_COUNT=$(state_get "$SESSION_ID" "retries.count")
FINAL_DECISION=$(state_get "$SESSION_ID" "decision.action")

echo "Final Status: $FINAL_STATUS"
echo "Decision: $FINAL_DECISION"
echo "Retry Count: $RETRY_COUNT"
echo ""

# Calculate duration
ORCHESTRATION_END_TIME=$(date +%s)
ORCHESTRATION_DURATION=$((ORCHESTRATION_END_TIME - ORCHESTRATION_START_TIME))
ORCHESTRATION_MINUTES=$((ORCHESTRATION_DURATION / 60))
ORCHESTRATION_SECONDS=$((ORCHESTRATION_DURATION % 60))

echo "⏱️  Total Duration: ${ORCHESTRATION_MINUTES}m ${ORCHESTRATION_SECONDS}s"

# Update metrics in state
state_update "$SESSION_ID" "metrics.total_time_seconds" "$ORCHESTRATION_DURATION"

echo ""
echo "📁 Session Directory: $(state_get_session_dir "$SESSION_ID")"
echo ""
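The duration arithmetic above (integer division for minutes, modulo for the remaining seconds) can be factored into a small helper:

```shell
# Same integer arithmetic as the report: seconds -> "Xm Ys".
format_duration() {
  echo "$(( $1 / 60 ))m $(( $1 % 60 ))s"
}
```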

Post-Orchestration Hook

echo "🎣 Post-Orchestration Hook"
echo ""

echo "✅ Orchestration Complete"
echo ""

# Show next steps based on final decision
FINAL_DECISION=$(state_get "$SESSION_ID" "decision.action")

case "$FINAL_DECISION" in
  complete)
    echo "📈 Next Steps (Success):"
    echo "1. Review agent's changes in: ${PROJECT_PATH}"
    echo "2. Test manually in browser"
    echo "3. Commit changes if satisfied"
    echo "4. Review metrics: metrics_get_summary"
    ;;

  retry)
    echo "📈 Next Steps (Retry Needed):"
    echo "1. Review failure analysis in session state"
    echo "2. Optionally adjust workflow or requirements"
    echo "3. Re-run orchestration to retry automatically"
    echo "4. Or manually fix and test"
    ;;

  escalate)
    echo "📈 Next Steps (Escalated):"
    echo "1. Review escalation message above"
    echo "2. Examine session state: state_print_summary $SESSION_ID"
    echo "3. Manually investigate and fix issues"
    echo "4. Update workflow if needed"
    echo "5. Re-run orchestration after fixes"
    ;;

  *)
    echo "📈 Next Steps:"
    echo "1. Review session state: state_print_summary $SESSION_ID"
    echo "2. Check agent output in: ${PROJECT_PATH}"
    echo "3. Review metrics: metrics_get_summary"
    ;;
esac

echo ""
echo "📊 Phase 1a Metrics:"
echo "- Session ID: $SESSION_ID"
echo "- Session Dir: $(state_get_session_dir "$SESSION_ID")"
echo "- View all sessions: state_list_sessions"
echo "- View metrics: metrics_get_summary"
echo ""

Error Handling

If workflow file is invalid:

  • Display error message with format requirements
  • Exit with error

If agent fails to complete:

  • Capture agent's error/blocker message
  • Display summary
  • Suggest manual intervention

If validation fails:

  • Display validation errors
  • Suggest fixes or retry

Success Criteria

Phase 1a Implementation:

  • Test workflow parsed successfully
  • Orchestration session created and tracked
  • Phase 1a components loaded (State Manager, Decision Maker, Prompt Refiner, Metrics)
  • Comprehensive prompt generated
  • Agent launched with appropriate prompt (original or refined)
  • Agent status tracked in state
  • Validation executed (manual for now, automated in Phase 1b)
  • Decision made (complete/retry/escalate)
  • Retry logic with refined prompt (up to 3 attempts)
  • Failure analysis performed
  • Escalation handling when needed
  • Metrics tracked and reported
  • Orchestration report generated with Phase 1a data
  • Session state persisted to .specswarm/orchestrator/


Learning Objectives

Phase 1a Questions:

  1. Does intelligent retry logic improve success rate?
  2. How effective is failure analysis categorization?
  3. Do refined prompts reduce repeated failures?
  4. What failure types are most common?
  5. How many retries before escalation is typical?
  6. Is decision maker accurate (correct complete/retry/escalate)?
  7. Do agents improve on retry attempts?
  8. What prompt refinements are most effective?

Phase 1a Metrics Tracked:

  • Session status (success/failure/escalated)
  • Retry count per session
  • Failure type distribution
  • Time per attempt (agent + validation)
  • Total orchestration time
  • Success rate by attempt number
  • Escalation rate
  • Decision accuracy
  • Prompt refinement effectiveness
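"Success rate by attempt number" can be derived from the persisted session records. A sketch, assuming (hypothetically) that each finished session is reduced to an "attempt_number outcome" line; the actual metrics-tracker.sh storage format is not shown in this command:

```shell
# Hypothetical input: one "attempt_number outcome" line per finished session.
success_rate_by_attempt() {
  awk '{ total[$1]++; if ($2 == "success") ok[$1]++ }
       END { for (a in total) printf "attempt %s: %d/%d\n", a, ok[a]+0, total[a] }' "$1"
}
```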

Phase 1b Status: IMPLEMENTED

Implemented Features (Phase 1b - Full Automation):

  • State Manager - Session persistence and tracking
  • Decision Maker - Intelligent complete/retry/escalate logic
  • Prompt Refiner - Context-injected prompts on retry
  • Automatic Agent Launch - No manual Task tool usage required
  • Automatic Validation - Orchestrate-validate runs automatically
  • True Retry Loop - Automatic retry with refined prompts (up to 3 attempts)
  • Failure Analysis - 9 failure types categorized
  • Escalation Handling - Automatic human escalation when needed
  • Metrics Tracking - Session analytics and metrics
  • State Persistence - All state saved to .specswarm/orchestrator/
  • Full Automation - No manual steps during orchestration
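The failure taxonomy itself lives in lib/decision-maker.sh and is not reproduced in this command. Purely to illustrate the pattern, a categorizer might look like the following; the match strings and type names here are assumptions, not the actual 9 types:

```shell
# Illustrative failure categorization; the real 9-type taxonomy in
# lib/decision-maker.sh may differ entirely.
categorize_failure_sketch() {
  case "$1" in
    *SyntaxError*|*ParseError*)       echo "syntax" ;;
    *404*|*ECONNREFUSED*)             echo "network" ;;
    *TypeError*|*ReferenceError*)     echo "runtime" ;;
    *)                                echo "unknown" ;;
  esac
}
```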

Key Phase 1b Improvements:

  • 🚀 10x Faster Testing - Automated execution vs manual steps
  • 🔄 True Retry Loop - Automatic continuation without user intervention
  • 📊 Real Validation Data - Actual Playwright/console/network results
  • Rapid Iteration - Test cases run in 2-5 minutes vs 15-30 minutes

Phase 1c Enhancements (Planned):

  • Real Vision API integration (vs mock)
  • Multiple workflow parallel execution
  • Advanced metrics dashboards
  • Cross-session learning

Phase 2 Enhancements (Future):

  • Multi-agent coordination
  • Dynamic prompt optimization with ML
  • Self-healing workflows
  • Predictive failure prevention
  • Autonomous workflow generation