9.4 KiB
Component Extraction
This resource supports Step 3 of the Skill Creator workflow.
Input files: $SESSION_DIR/global-context.md, $SESSION_DIR/step-2-output.md, $SOURCE_DOC (section-by-section reading)
Output files: $SESSION_DIR/step-3-output.md, $SESSION_DIR/step-3-extraction-workspace.md (intermediate), updates global-context.md
Stage goal: Extract all actionable components systematically using the interpretive reading approach.
Why Reading Strategy
WHY Strategy Selection Matters
Different document sizes and structures need different reading approaches:
- Too large to read at once: Context overflow, can't hold everything in memory
- Too structured for linear reading: Miss relationships between non-adjacent sections
- Too dense for single pass: Need multiple focused extractions
Mental model: You wouldn't read a 500-page book the same way you read a 10-page article. Match your reading strategy to the document characteristics.
Without a deliberate strategy: context overflow, missed content, inefficient extraction, fatigue.
WHAT Strategies Exist
Choose based on document characteristics identified in Steps 1-2:
Three strategies:
- Section-Based: Document has clear sections < 50 pages. Read one section, extract, write notes, clear, repeat.
- Windowing: Long document > 50 pages without clear breaks. Read 200-line chunks with 20-line overlap.
- Targeted: Hybrid content. Based on Step 2, read only high-value sections identified.
WHAT to Decide
Present options to user:
## Reading Strategy Options
Based on document analysis:
- Size: [X pages/lines]
- Structure: [Clear sections / Continuous flow]
- Relevant parts: [All / Specific sections]
**Recommended strategy:** [Section-based / Windowing / Targeted]
**Rationale:** [Why this strategy fits]
**Alternative approach:** [If applicable]
Which approach would you like to use?
Section-Based Extraction
WHY Programmatic Approach
Reading entire document into context:
- Floods context with potentially thousands of lines
- Makes it hard to focus on one section at a time
- Risk of losing extracted content before writing it down
Solution: Read one section, extract components, write to intermediate file, clear from context, repeat.
WHAT to Do
Step 1: Read Global Context and Previous Output
# Read what we know so far
Read("$SESSION_DIR/global-context.md")
Read("$SESSION_DIR/step-2-output.md")
From step-2-output, you'll have the list of major parts/sections.
Step 2: Initialize Extraction File
# Create extraction workspace
cat > "$SESSION_DIR/step-3-extraction-workspace.md" << 'EOF'
# Component Extraction Workspace
## Sections Processed
[Mark sections as you complete them]
## Extracted Components
[Append after each section]
EOF
Step 3: Process Each Section
For each section from step-2-output:
# Example for Section 1
# Read just that section from source document
Read("$SOURCE_DOC", offset=[start_line], limit=[section_length])
# Extract components from this section following the guides below:
# - Extract Terms (see Why Extract Terms section)
# - Extract Propositions (see Why Extract Propositions section)
# - Extract Arguments (see Why Extract Arguments section)
# - Extract Solutions (see Why Extract Solutions section)
# Write extraction notes to workspace
cat >> "$SESSION_DIR/step-3-extraction-workspace.md" << 'EOF'
### Section [X]: [Section Name]
**Terms extracted:**
- [Term 1]: [Definition]
- [Term 2]: [Definition]
**Propositions extracted:**
- [Proposition 1]
- [Proposition 2]
**Arguments extracted:**
- [Argument 1]
**Solutions/Examples:**
- [Example 1]
EOF
# Mark section as complete
echo "- [x] Section [X]: [Name]" >> "$SESSION_DIR/step-3-extraction-workspace.md"
# Clear this section from context (it's now in the file)
# Move to next section
Repeat for all sections.
Step 4: Synthesize Extraction Notes
After all sections processed:
# Read the extraction workspace
Read("$SESSION_DIR/step-3-extraction-workspace.md")
# Synthesize into final step-3-output
# Combine all terms, remove duplicates
# Combine all propositions, identify core ones
# Combine all arguments, identify workflow sequences
# Combine all solutions/examples
Write final output: (see end of this file for output template)
Why Extract Terms
WHY Key Terms Matter
Terms are the building blocks of understanding:
- They're the specialized vocabulary of the methodology
- Define the conceptual framework
- Must be understood to apply the skill
- Become the "Key Concepts" section of your skill
Adler's rule: "Come to terms with the author by interpreting key words."
Mental model: Terms are like the variables in an equation. You can't solve the equation without knowing what the variables mean.
Without term extraction: users misunderstand the skill, can't apply it correctly, confusion about core concepts.
WHAT to Extract
Look for: Defined terms, repeated concepts, technical vocabulary, emphasized terms. Skip: Common words, one-off mentions.
Format per term: Name, Definition, Context (why it matters), Usage (how used in practice).
How many: 5-15 terms. Test: Would users be confused without this term?
Why Extract Propositions
WHY Propositions Matter
Propositions are the key assertions or principles:
- They're the "truths" the author claims
- Form the theoretical foundation
- Often become guidelines or principles in your skill
- Different from process steps (those come from arguments)
Adler's rule: "Grasp the author's leading propositions by dealing with important sentences."
Mental model: Propositions are like theorems in mathematics - fundamental truths that everything else builds on.
Without proposition extraction: shallow skill that misses underlying principles, can't explain WHY the method works.
WHAT to Extract
Look for: Declarative/principle/causal statements. Signal phrases: "key insight", "research shows", "fundamental principle".
Format per proposition: Short title, Statement (one sentence), Evidence (why true), Implication (for practice).
How many: 5-10 core propositions that explain why methodology works.
Why Extract Arguments
WHY Arguments Matter
Arguments are the logical sequences that connect premises to conclusions:
- They explain HOW the methodology works
- Often become the step-by-step workflow
- Show dependencies and order
- Reveal decision points
Adler's rule: "Know the author's arguments by finding them in sequences of sentences."
Mental model: Arguments are like algorithms - step-by-step logic from inputs to outputs.
Without argument extraction: missing the procedural flow, unclear how to apply the methodology, no decision logic.
WHAT to Extract
Look for: If-then sequences, step sequences, causal chains, decision trees. Signal phrases: "the process is", "follow these steps", "this leads to".
Format per argument: Name, Premise, Sequence (numbered steps), Conclusion, Decision points.
Map to workflow: Sequential → linear steps; Decision-tree → branching; Parallel → optional/modular.
Why Extract Solutions
WHY Solutions Matter
Solutions are the practical applications and outcomes:
- They show what success looks like
- Often become examples in your skill
- Reveal edge cases and variations
- Demonstrate application in different contexts
Adler's rule: "Determine which of the problems the author has solved, and which they have not."
Mental model: Solutions are the "proof" that the methodology works - concrete instances of application.
Without solution extraction: theoretical skill without practical grounding, unclear what success looks like, no examples.
WHAT to Extract
Look for: Examples, case studies, templates, before/after, success criteria. Types: worked examples, templates, checklists, success indicators.
Format per solution: Name, Problem, Application (how methodology applied), Outcome, Key factors, Transferability.
For templates: Name, Purpose, Structure outline, How to use (steps), Example if available.
Write Step 3 Output
After completing extraction from all sections and getting user validation, write to output file:
cat > "$SESSION_DIR/step-3-output.md" << 'EOF'
# Step 3: Component Extraction Output
## Key Terms (5-15)
### [Term 1]
**Definition:** [Clear definition]
**Context:** [Why it matters]
### [Term 2]
...
## Core Propositions (5-10)
1. **[Proposition title]:** [Statement]
- Evidence: [Support]
- Implication: [For practice]
2. **[Proposition 2]:** ...
## Arguments & Sequences
### Argument 1: [Name]
**Premise:** [Starting condition]
**Sequence:**
1. [Step 1]
2. [Step 2]
**Conclusion:** [Result]
**Decision points:** [Choices]
### Argument 2: ...
## Solutions & Examples
### Example 1: [Name]
**Problem:** [Context]
**Application:** [How applied]
**Outcome:** [Result]
### Example 2: ...
## Gaps Identified
- [Gap 1 to address in synthesis]
- [Gap 2]
## User Validation
**Status:** [Approved / Needs revision]
**User notes:** [Feedback]
EOF
Update global context:
cat >> "$SESSION_DIR/global-context.md" << 'EOF'
## Step 3 Complete
**Components extracted:** [X terms, Y propositions, Z arguments, W examples]
**Gaps identified:** [List if any]
EOF
Next step: Step 4 (Synthesis) will read global-context.md + step-3-output.md.