Initial commit
This commit is contained in:
556
commands/batch.md
Normal file
556
commands/batch.md
Normal file
@@ -0,0 +1,556 @@
|
||||
---
|
||||
description: Batch process multiple repos with StackShift analysis running in parallel. Analyzes 5 repos at a time, tracks progress, and aggregates results. Perfect for analyzing monorepo services or multiple related projects.
|
||||
---
|
||||
|
||||
# StackShift Batch Processing
|
||||
|
||||
**Analyze multiple repositories in parallel**
|
||||
|
||||
Run StackShift on 10, 50, or 100+ repos simultaneously with progress tracking and result aggregation.
|
||||
|
||||
---
|
||||
|
||||
## Quick Start
|
||||
|
||||
**Analyze all services in a monorepo:**
|
||||
|
||||
```bash
|
||||
# From monorepo services directory
|
||||
cd ~/git/my-monorepo/services
|
||||
|
||||
# Let me analyze all service-* directories in batches of 5
|
||||
```
|
||||
|
||||
I'll:
|
||||
1. ✅ Find all service-* directories
|
||||
2. ✅ Filter to valid repos (has package.json)
|
||||
3. ✅ Process in batches of 5 (configurable)
|
||||
4. ✅ Track progress in `batch-results/`
|
||||
5. ✅ Aggregate results when complete
|
||||
|
||||
---
|
||||
|
||||
## What I'll Do
|
||||
|
||||
### Step 1: Discovery
|
||||
|
||||
```bash
|
||||
echo "=== Discovering repositories in ~/git/my-monorepo/services ==="
|
||||
|
||||
# Find all service directories
|
||||
find ~/git/my-monorepo/services -maxdepth 1 -type d -name "service-*" | sort > /tmp/services-to-analyze.txt
|
||||
|
||||
# Count
|
||||
SERVICE_COUNT=$(wc -l < /tmp/services-to-analyze.txt)
|
||||
echo "Found $SERVICE_COUNT services"
|
||||
|
||||
# Show first 10
|
||||
head -10 /tmp/services-to-analyze.txt
|
||||
```
|
||||
|
||||
### Step 2: Batch Configuration
|
||||
|
||||
**IMPORTANT:** I'll ask ALL configuration questions upfront, ONCE. Your answers will be saved to a batch session file and automatically applied to ALL repos in all batches. You won't need to answer these questions again during this batch run!
|
||||
|
||||
I'll ask you:
|
||||
|
||||
**Question 1: How many to process?**
|
||||
- A) All services ($WIDGET_COUNT total)
|
||||
- B) First 10 (test run)
|
||||
- C) First 25 (small batch)
|
||||
- D) Custom number
|
||||
|
||||
**Question 2: Parallel batch size?**
|
||||
- A) 3 at a time (conservative)
|
||||
- B) 5 at a time (recommended)
|
||||
- C) 10 at a time (aggressive, may slow down)
|
||||
- D) Sequential (1 at a time, safest)
|
||||
|
||||
**Question 3: What route?**
|
||||
- A) Auto-detect (auto-detect (monorepo for service-*), ask for others)
|
||||
- B) Force monorepo-service for all
|
||||
- C) Force greenfield for all
|
||||
- D) Force brownfield for all
|
||||
|
||||
**Question 4: Brownfield mode?** _(If route = brownfield)_
|
||||
- A) Standard - Just create specs for current state
|
||||
- B) Upgrade - Create specs + upgrade all dependencies
|
||||
|
||||
**Question 5: Transmission?**
|
||||
- A) Manual - Review each gear before proceeding
|
||||
- B) Cruise Control - Shift through all gears automatically
|
||||
|
||||
**Question 6: Clarifications strategy?** _(If transmission = cruise control)_
|
||||
- A) Defer - Mark them, continue around them
|
||||
- B) Prompt - Stop and ask questions
|
||||
- C) Skip - Only implement fully-specified features
|
||||
|
||||
**Question 7: Implementation scope?** _(If transmission = cruise control)_
|
||||
- A) None - Stop after specs are ready
|
||||
- B) P0 only - Critical features only
|
||||
- C) P0 + P1 - Critical + high-value features
|
||||
- D) All - Every feature
|
||||
|
||||
**Question 8: Spec output location?** _(If route = greenfield)_
|
||||
- A) Current repository (default)
|
||||
- B) New application repository
|
||||
- C) Separate documentation repository
|
||||
- D) Custom location
|
||||
|
||||
**Question 9: Target stack?** _(If greenfield + implementation scope != none)_
|
||||
- Examples:
|
||||
- Next.js 15 + TypeScript + Prisma + PostgreSQL
|
||||
- Python/FastAPI + SQLAlchemy + PostgreSQL
|
||||
- Your choice: [specify]
|
||||
|
||||
**Question 10: Build location?** _(If greenfield + implementation scope != none)_
|
||||
- A) Subfolder (recommended) - e.g., greenfield/, v2/
|
||||
- B) Separate directory - e.g., ~/git/my-new-app
|
||||
- C) Replace in place (destructive)
|
||||
|
||||
**Then I'll:**
|
||||
1. ✅ Save all answers to `.stackshift-batch-session.json` (in current directory)
|
||||
2. ✅ Show batch session summary
|
||||
3. ✅ Start processing batches with auto-applied configuration
|
||||
4. ✅ Clear batch session when complete (or keep if you want)
|
||||
|
||||
**Why directory-scoped?**
|
||||
- Multiple batch sessions can run simultaneously in different directories
|
||||
- Each batch (monorepo services, etc.) has its own isolated configuration
|
||||
- No conflicts between parallel batch runs
|
||||
- Session file is co-located with the repos being processed
|
||||
|
||||
### Step 3: Create Batch Session & Spawn Agents
|
||||
|
||||
**First: Create batch session with all answers**
|
||||
|
||||
```bash
|
||||
# After collecting all configuration answers, create batch session
|
||||
# Stored in current directory for isolation from other batch runs
|
||||
cat > .stackshift-batch-session.json <<EOF
|
||||
{
|
||||
"sessionId": "batch-$(date +%s)",
|
||||
"startedAt": "$(date -u +%Y-%m-%dT%H:%M:%SZ)",
|
||||
"batchRootDirectory": "$(pwd)",
|
||||
"totalRepos": ${TOTAL_REPOS},
|
||||
"batchSize": ${BATCH_SIZE},
|
||||
"answers": {
|
||||
"route": "${ROUTE}",
|
||||
"transmission": "${TRANSMISSION}",
|
||||
"spec_output_location": "${SPEC_OUTPUT}",
|
||||
"target_stack": "${TARGET_STACK}",
|
||||
"build_location": "${BUILD_LOCATION}",
|
||||
"clarifications_strategy": "${CLARIFICATIONS}",
|
||||
"implementation_scope": "${SCOPE}"
|
||||
},
|
||||
"processedRepos": []
|
||||
}
|
||||
EOF
|
||||
|
||||
echo "✅ Batch session created: $(pwd)/.stackshift-batch-session.json"
|
||||
echo "📦 Configuration will be auto-applied to all ${TOTAL_REPOS} repos"
|
||||
```
|
||||
|
||||
**Then: Spawn parallel agents (they'll auto-use batch session)**
|
||||
|
||||
```typescript
|
||||
// Use Task tool to spawn parallel agents
|
||||
const batch1 = [
|
||||
'service-user-api',
|
||||
'service-inventory',
|
||||
'service-contact',
|
||||
'service-search',
|
||||
'service-pricing'
|
||||
];
|
||||
|
||||
// Spawn 5 agents in parallel
|
||||
const agents = batch1.map(service => ({
|
||||
task: `Analyze ${service} service with StackShift`,
|
||||
description: `StackShift analysis: ${service}`,
|
||||
subagent_type: 'general-purpose',
|
||||
prompt: `
|
||||
cd ~/git/my-monorepo/services/${service}
|
||||
|
||||
IMPORTANT: Batch session is active (will be auto-detected by walking up to parent)
|
||||
Parent directory has: .stackshift-batch-session.json
|
||||
All configuration will be auto-applied. DO NOT ask configuration questions.
|
||||
|
||||
Run StackShift Gear 1: Analyze
|
||||
- Will auto-detect route (batch session: ${ROUTE})
|
||||
- Will use spec output location: ${SPEC_OUTPUT}
|
||||
- Analyze service + shared packages
|
||||
- Generate analysis-report.md
|
||||
|
||||
Then run Gear 2: Reverse Engineer
|
||||
- Extract business logic
|
||||
- Document all shared package dependencies
|
||||
- Create comprehensive documentation
|
||||
|
||||
Then run Gear 3: Create Specifications
|
||||
- Generate .specify/ structure
|
||||
- Create constitution
|
||||
- Generate feature specs
|
||||
|
||||
Save all results to:
|
||||
${SPEC_OUTPUT}/${service}/
|
||||
|
||||
When complete, create completion marker:
|
||||
${SPEC_OUTPUT}/${service}/.complete
|
||||
`
|
||||
}));
|
||||
|
||||
// Launch all 5 in parallel
|
||||
agents.forEach(agent => spawnAgent(agent));
|
||||
```
|
||||
|
||||
### Step 4: Progress Tracking
|
||||
|
||||
```bash
|
||||
# Create tracking directory
|
||||
mkdir -p ~/git/stackshift-batch-results
|
||||
|
||||
# Monitor progress
|
||||
while true; do
|
||||
COMPLETE=$(find ~/git/stackshift-batch-results -name ".complete" | wc -l)
|
||||
echo "Completed: $COMPLETE / $WIDGET_COUNT"
|
||||
|
||||
# Check if batch done
|
||||
if [ $COMPLETE -ge 5 ]; then
|
||||
echo "✅ Batch 1 complete"
|
||||
break
|
||||
fi
|
||||
|
||||
sleep 30
|
||||
done
|
||||
|
||||
# Start next batch...
|
||||
```
|
||||
|
||||
### Step 5: Result Aggregation
|
||||
|
||||
```bash
|
||||
# After all batches complete
|
||||
echo "=== Aggregating Results ==="
|
||||
|
||||
# Create master report
|
||||
cat > ~/git/stackshift-batch-results/BATCH_SUMMARY.md <<EOF
|
||||
# StackShift Batch Analysis Results
|
||||
|
||||
**Date:** $(date)
|
||||
**Widgets Analyzed:** $WIDGET_COUNT
|
||||
**Batches:** $(($WIDGET_COUNT / 5))
|
||||
**Total Time:** [calculated]
|
||||
|
||||
## Completion Status
|
||||
|
||||
$(for service in $(cat /tmp/services-to-analyze.txt); do
|
||||
service_name=$(basename $service)
|
||||
if [ -f ~/git/stackshift-batch-results/$service_name/.complete ]; then
|
||||
echo "- ✅ $service_name - Complete"
|
||||
else
|
||||
echo "- ❌ $service_name - Failed or incomplete"
|
||||
fi
|
||||
done)
|
||||
|
||||
## Results by Widget
|
||||
|
||||
$(for service in $(cat /tmp/services-to-analyze.txt); do
|
||||
service_name=$(basename $service)
|
||||
if [ -f ~/git/stackshift-batch-results/$service_name/.complete ]; then
|
||||
echo "### $service_name"
|
||||
echo ""
|
||||
echo "**Specs created:** $(find ~/git/stackshift-batch-results/$service_name/.specify/memory/specifications -name "*.md" 2>/dev/null | wc -l)"
|
||||
echo "**Modules analyzed:** $(cat ~/git/stackshift-batch-results/$service_name/.stackshift-state.json 2>/dev/null | jq -r '.metadata.modulesAnalyzed // 0')"
|
||||
echo ""
|
||||
fi
|
||||
done)
|
||||
|
||||
## Next Steps
|
||||
|
||||
All specifications are ready for review:
|
||||
- Review specs in each service's batch-results directory
|
||||
- Merge specs to actual repos if satisfied
|
||||
- Run Gears 4-6 as needed
|
||||
EOF
|
||||
|
||||
cat ~/git/stackshift-batch-results/BATCH_SUMMARY.md
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Result Structure
|
||||
|
||||
```
|
||||
~/git/stackshift-batch-results/
|
||||
├── BATCH_SUMMARY.md # Master summary
|
||||
├── batch-progress.json # Real-time tracking
|
||||
│
|
||||
├── service-user-api/
|
||||
│ ├── .complete # Marker file
|
||||
│ ├── .stackshift-state.json # State
|
||||
│ ├── analysis-report.md # Gear 1 output
|
||||
│ ├── docs/reverse-engineering/ # Gear 2 output
|
||||
│ │ ├── functional-specification.md
|
||||
│ │ ├── service-logic.md
|
||||
│ │ ├── modules/
|
||||
│ │ │ ├── shared-pricing-utils.md
|
||||
│ │ │ └── shared-discount-utils.md
|
||||
│ │ └── [7 more docs]
|
||||
│ └── .specify/ # Gear 3 output
|
||||
│ └── memory/
|
||||
│ ├── constitution.md
|
||||
│ └── specifications/
|
||||
│ ├── pricing-display.md
|
||||
│ ├── incentive-logic.md
|
||||
│ └── [more specs]
|
||||
│
|
||||
├── service-inventory/
|
||||
│ └── [same structure]
|
||||
│
|
||||
└── [88 more services...]
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Monitoring Progress
|
||||
|
||||
**Real-time status:**
|
||||
|
||||
```bash
|
||||
# I'll show you periodic updates
|
||||
echo "=== Batch Progress ==="
|
||||
echo "Batch 1 (5 services): 3/5 complete"
|
||||
echo " ✅ service-user-api - Complete (12 min)"
|
||||
echo " ✅ service-inventory - Complete (8 min)"
|
||||
echo " ✅ service-contact - Complete (15 min)"
|
||||
echo " 🔄 service-search - Running (7 min elapsed)"
|
||||
echo " ⏳ service-pricing - Queued"
|
||||
echo ""
|
||||
echo "Estimated time remaining: 25 minutes"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Error Handling
|
||||
|
||||
**If a service fails:**
|
||||
```bash
|
||||
# Retry failed services
|
||||
failed_services=(service-search service-pricing)
|
||||
|
||||
for service in "${failed_services[@]}"; do
|
||||
echo "Retrying: $service"
|
||||
# Spawn new agent for retry
|
||||
done
|
||||
```
|
||||
|
||||
**Common failures:**
|
||||
- Missing package.json
|
||||
- Tests failing (can continue anyway)
|
||||
- Module source not found (prompt for location)
|
||||
|
||||
---
|
||||
|
||||
## Use Cases
|
||||
|
||||
**1. Entire monorepo migration:**
|
||||
```
|
||||
Analyze all 90+ ws-* services for migration planning
|
||||
↓
|
||||
Result: Complete business logic extracted from entire platform
|
||||
↓
|
||||
Use specs to plan Next.js migration strategy
|
||||
```
|
||||
|
||||
**2. Selective analysis:**
|
||||
```
|
||||
Analyze just the 10 high-priority services first
|
||||
↓
|
||||
Review results
|
||||
↓
|
||||
Then batch process remaining 80
|
||||
```
|
||||
|
||||
**3. Module analysis:**
|
||||
```
|
||||
cd ~/git/my-monorepo/services
|
||||
Analyze all shared packages (not services)
|
||||
↓
|
||||
Result: Shared module documentation
|
||||
↓
|
||||
Understand dependencies before service migration
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Configuration Options
|
||||
|
||||
I'll ask you to configure:
|
||||
|
||||
- **Repository list:** All in folder, or custom list?
|
||||
- **Batch size:** How many parallel (3/5/10)?
|
||||
- **Gears to run:** 1-3 only or full 1-6?
|
||||
- **Route:** Auto-detect or force specific route?
|
||||
- **Output location:** Central results dir or per-repo?
|
||||
- **Error handling:** Stop on failure or continue?
|
||||
|
||||
---
|
||||
|
||||
## Comparison with thoth-cli
|
||||
|
||||
**thoth-cli (Upgrades):**
|
||||
- Orchestrates 90+ service upgrades
|
||||
- 3 phases: coverage → discovery → implementation
|
||||
- Tracks in .upgrade-state.json
|
||||
- Parallel processing (2-5 at a time)
|
||||
|
||||
**StackShift Batch (Analysis):**
|
||||
- Orchestrates 90+ service analyses
|
||||
- 6 gears: analyze → reverse-engineer → create-specs → gap → clarify → implement
|
||||
- Tracks in .stackshift-state.json
|
||||
- Parallel processing (3-10 at a time)
|
||||
- Can output to central location
|
||||
|
||||
---
|
||||
|
||||
## Example Session
|
||||
|
||||
```
|
||||
You: "I want to analyze all Osiris services in ~/git/my-monorepo/services"
|
||||
|
||||
Me: "Found 92 services! Let me configure batch processing..."
|
||||
|
||||
[Asks questions via AskUserQuestion]
|
||||
- Process all 92? ✅
|
||||
- Batch size: 5
|
||||
- Gears: 1-3 (just analyze and spec, no implementation)
|
||||
- Output: Central results directory
|
||||
|
||||
Me: "Starting batch analysis..."
|
||||
|
||||
Batch 1 (5 services): service-user-api, service-inventory, service-contact, ws-inventory, service-pricing
|
||||
[Spawns 5 parallel agents using Task tool]
|
||||
|
||||
[15 minutes later]
|
||||
"Batch 1 complete! Starting batch 2..."
|
||||
|
||||
[3 hours later]
|
||||
"✅ All 92 services analyzed!
|
||||
|
||||
Results: ~/git/stackshift-batch-results/
|
||||
- 92 analysis reports
|
||||
- 92 sets of specifications
|
||||
- 890 total specs extracted
|
||||
- Multiple shared packages documented
|
||||
|
||||
Next: Review specs and begin migration planning"
|
||||
```
|
||||
|
||||
---
|
||||
|
||||
## Managing Batch Sessions
|
||||
|
||||
### View Current Batch Session
|
||||
|
||||
```bash
|
||||
# Check if batch session exists in current directory and view configuration
|
||||
if [ -f .stackshift-batch-session.json ]; then
|
||||
echo "📦 Active Batch Session in $(pwd)"
|
||||
cat .stackshift-batch-session.json | jq '.'
|
||||
else
|
||||
echo "No active batch session in current directory"
|
||||
fi
|
||||
```
|
||||
|
||||
### View All Batch Sessions
|
||||
|
||||
```bash
|
||||
# Find all active batch sessions
|
||||
echo "🔍 Finding all active batch sessions..."
|
||||
find ~/git -name ".stackshift-batch-session.json" -type f 2>/dev/null | while read session; do
|
||||
echo ""
|
||||
echo "📦 $(dirname $session)"
|
||||
cat "$session" | jq -r '" Route: \(.answers.route) | Repos: \(.processedRepos | length)/\(.totalRepos)"'
|
||||
done
|
||||
```
|
||||
|
||||
### Clear Batch Session
|
||||
|
||||
**After batch completes:**
|
||||
```bash
|
||||
# I'll ask you:
|
||||
# "Batch processing complete! Clear batch session? (Y/n)"
|
||||
|
||||
# If yes:
|
||||
rm .stackshift-batch-session.json
|
||||
echo "✅ Batch session cleared"
|
||||
|
||||
# If no:
|
||||
echo "✅ Batch session kept (will be used for next batch run in this directory)"
|
||||
```
|
||||
|
||||
**Manual clear (current directory):**
|
||||
```bash
|
||||
# Clear batch session in current directory
|
||||
rm .stackshift-batch-session.json
|
||||
```
|
||||
|
||||
**Manual clear (specific directory):**
|
||||
```bash
|
||||
# Clear batch session in specific directory
|
||||
rm ~/git/my-monorepo/services/.stackshift-batch-session.json
|
||||
```
|
||||
|
||||
**Why keep batch session?**
|
||||
- Run another batch with same configuration
|
||||
- Process more repos later in same directory
|
||||
- Continue interrupted batch
|
||||
- Consistent settings for related batches
|
||||
|
||||
**Why clear batch session?**
|
||||
- Done with current migration
|
||||
- Want different configuration for next batch
|
||||
- Starting fresh analysis
|
||||
- Free up directory for different batch type
|
||||
|
||||
---
|
||||
|
||||
## Batch Session Benefits
|
||||
|
||||
**Without batch session (old way):**
|
||||
```
|
||||
Batch 1: Answer 10 questions ⏱️ 2 min
|
||||
↓ Process 3 repos (15 min)
|
||||
|
||||
Batch 2: Answer 10 questions AGAIN ⏱️ 2 min
|
||||
↓ Process 3 repos (15 min)
|
||||
|
||||
Batch 3: Answer 10 questions AGAIN ⏱️ 2 min
|
||||
↓ Process 3 repos (15 min)
|
||||
|
||||
Total: 30 questions answered, 6 min wasted
|
||||
```
|
||||
|
||||
**With batch session (new way):**
|
||||
```
|
||||
Setup: Answer 10 questions ONCE ⏱️ 2 min
|
||||
↓ Batch 1: Process 3 repos (15 min)
|
||||
↓ Batch 2: Process 3 repos (15 min)
|
||||
↓ Batch 3: Process 3 repos (15 min)
|
||||
|
||||
Total: 10 questions answered, 0 min wasted
|
||||
Saved: 4 minutes per 9 repos processed
|
||||
```
|
||||
|
||||
**For 90 repos in batches of 3:**
|
||||
- Old way: 300 questions answered (60 min of clicking)
|
||||
- New way: 10 questions answered (2 min of clicking)
|
||||
- **Time saved: 58 minutes!** ⚡
|
||||
|
||||
---
|
||||
|
||||
**This batch processing system is perfect for:**
|
||||
- Monorepo migration (90+ services)
|
||||
- Multi-repo monorepo analysis
|
||||
- Department-wide code audits
|
||||
- Portfolio modernization projects
|
||||
Reference in New Issue
Block a user