Files
gh-lyndonkl-claude/skills/skill-creator/resources/component-extraction.md
2025-11-30 08:38:26 +08:00

341 lines
9.4 KiB
Markdown

# Component Extraction
This resource supports **Step 3** of the Skill Creator workflow.
**Input files:** `$SESSION_DIR/global-context.md`, `$SESSION_DIR/step-2-output.md`, `$SOURCE_DOC` (section-by-section reading)
**Output files:** `$SESSION_DIR/step-3-output.md`, `$SESSION_DIR/step-3-extraction-workspace.md` (intermediate), updates `global-context.md`
**Stage goal:** Extract all actionable components systematically using the interpretive reading approach.
---
## Why Reading Strategy
### WHY Strategy Selection Matters
Different document sizes and structures need different reading approaches:
- **Too large to read at once:** Context overflow, can't hold everything in memory
- **Too structured for linear reading:** Miss relationships between non-adjacent sections
- **Too dense for single pass:** Need multiple focused extractions
**Mental model:** You wouldn't read a 500-page book the same way you read a 10-page article. Match your reading strategy to the document characteristics.
Without a deliberate strategy: context overflow, missed content, inefficient extraction, fatigue.
### WHAT Strategies Exist
Choose based on document characteristics identified in Steps 1-2:
**Three strategies:**
1. **Section-Based:** Document has clear sections < 50 pages. Read one section, extract, write notes, clear, repeat.
2. **Windowing:** Long document > 50 pages without clear breaks. Read 200-line chunks with 20-line overlap.
3. **Targeted:** Hybrid content. Based on Step 2, read only high-value sections identified.
---
### WHAT to Decide
**Present options to user:**
```markdown
## Reading Strategy Options
Based on document analysis:
- Size: [X pages/lines]
- Structure: [Clear sections / Continuous flow]
- Relevant parts: [All / Specific sections]
**Recommended strategy:** [Section-based / Windowing / Targeted]
**Rationale:** [Why this strategy fits]
**Alternative approach:** [If applicable]
Which approach would you like to use?
```
---
## Section-Based Extraction
### WHY Programmatic Approach
Reading entire document into context:
- Floods context with potentially thousands of lines
- Makes it hard to focus on one section at a time
- Risk of losing extracted content before writing it down
**Solution:** Read one section, extract components, write to intermediate file, clear from context, repeat.
### WHAT to Do
#### Step 1: Read Global Context and Previous Output
```bash
# Read what we know so far
Read("$SESSION_DIR/global-context.md")
Read("$SESSION_DIR/step-2-output.md")
```
From step-2-output, you'll have the list of major parts/sections.
#### Step 2: Initialize Extraction File
```bash
# Create extraction workspace
cat > "$SESSION_DIR/step-3-extraction-workspace.md" << 'EOF'
# Component Extraction Workspace
## Sections Processed
[Mark sections as you complete them]
## Extracted Components
[Append after each section]
EOF
```
#### Step 3: Process Each Section
**For each section from step-2-output:**
```bash
# Example for Section 1
# Read just that section from source document
Read("$SOURCE_DOC", offset=[start_line], limit=[section_length])
# Extract components from this section following the guides below:
# - Extract Terms (see Why Extract Terms section)
# - Extract Propositions (see Why Extract Propositions section)
# - Extract Arguments (see Why Extract Arguments section)
# - Extract Solutions (see Why Extract Solutions section)
# Write extraction notes to workspace
cat >> "$SESSION_DIR/step-3-extraction-workspace.md" << 'EOF'
### Section [X]: [Section Name]
**Terms extracted:**
- [Term 1]: [Definition]
- [Term 2]: [Definition]
**Propositions extracted:**
- [Proposition 1]
- [Proposition 2]
**Arguments extracted:**
- [Argument 1]
**Solutions/Examples:**
- [Example 1]
EOF
# Mark section as complete
echo "- [x] Section [X]: [Name]" >> "$SESSION_DIR/step-3-extraction-workspace.md"
# Clear this section from context (it's now in the file)
# Move to next section
```
**Repeat for all sections.**
#### Step 4: Synthesize Extraction Notes
After all sections processed:
```bash
# Read the extraction workspace
Read("$SESSION_DIR/step-3-extraction-workspace.md")
# Synthesize into final step-3-output
# Combine all terms, remove duplicates
# Combine all propositions, identify core ones
# Combine all arguments, identify workflow sequences
# Combine all solutions/examples
```
**Write final output:** (see end of this file for output template)
---
## Why Extract Terms
### WHY Key Terms Matter
Terms are the building blocks of understanding:
- They're the **specialized vocabulary** of the methodology
- Define the conceptual framework
- Must be understood to apply the skill
- Become the "Key Concepts" section of your skill
**Adler's rule:** "Come to terms with the author by interpreting key words."
**Mental model:** Terms are like the variables in an equation. You can't solve the equation without knowing what the variables mean.
Without term extraction: users misunderstand the skill, can't apply it correctly, confusion about core concepts.
### WHAT to Extract
**Look for:** Defined terms, repeated concepts, technical vocabulary, emphasized terms. **Skip:** Common words, one-off mentions.
**Format per term:** Name, Definition, Context (why it matters), Usage (how used in practice).
**How many:** 5-15 terms. Test: Would users be confused without this term?
---
## Why Extract Propositions
### WHY Propositions Matter
Propositions are the **key assertions** or principles:
- They're the "truths" the author claims
- Form the theoretical foundation
- Often become guidelines or principles in your skill
- Different from process steps (those come from arguments)
**Adler's rule:** "Grasp the author's leading propositions by dealing with important sentences."
**Mental model:** Propositions are like theorems in mathematics - fundamental truths that everything else builds on.
Without proposition extraction: shallow skill that misses underlying principles, can't explain WHY the method works.
### WHAT to Extract
**Look for:** Declarative/principle/causal statements. Signal phrases: "key insight", "research shows", "fundamental principle".
**Format per proposition:** Short title, Statement (one sentence), Evidence (why true), Implication (for practice).
**How many:** 5-10 core propositions that explain why methodology works.
---
## Why Extract Arguments
### WHY Arguments Matter
Arguments are the **logical sequences** that connect premises to conclusions:
- They explain HOW the methodology works
- Often become the step-by-step workflow
- Show dependencies and order
- Reveal decision points
**Adler's rule:** "Know the author's arguments by finding them in sequences of sentences."
**Mental model:** Arguments are like algorithms - step-by-step logic from inputs to outputs.
Without argument extraction: missing the procedural flow, unclear how to apply the methodology, no decision logic.
### WHAT to Extract
**Look for:** If-then sequences, step sequences, causal chains, decision trees. Signal phrases: "the process is", "follow these steps", "this leads to".
**Format per argument:** Name, Premise, Sequence (numbered steps), Conclusion, Decision points.
**Map to workflow:** Sequential → linear steps; Decision-tree → branching; Parallel → optional/modular.
---
## Why Extract Solutions
### WHY Solutions Matter
Solutions are the **practical applications** and outcomes:
- They show what success looks like
- Often become examples in your skill
- Reveal edge cases and variations
- Demonstrate application in different contexts
**Adler's rule:** "Determine which of the problems the author has solved, and which they have not."
**Mental model:** Solutions are the "proof" that the methodology works - concrete instances of application.
Without solution extraction: theoretical skill without practical grounding, unclear what success looks like, no examples.
### WHAT to Extract
**Look for:** Examples, case studies, templates, before/after, success criteria. Types: worked examples, templates, checklists, success indicators.
**Format per solution:** Name, Problem, Application (how methodology applied), Outcome, Key factors, Transferability.
**For templates:** Name, Purpose, Structure outline, How to use (steps), Example if available.
---
## Write Step 3 Output
After completing extraction from all sections and getting user validation, write to output file:
```bash
cat > "$SESSION_DIR/step-3-output.md" << 'EOF'
# Step 3: Component Extraction Output
## Key Terms (5-15)
### [Term 1]
**Definition:** [Clear definition]
**Context:** [Why it matters]
### [Term 2]
...
## Core Propositions (5-10)
1. **[Proposition title]:** [Statement]
- Evidence: [Support]
- Implication: [For practice]
2. **[Proposition 2]:** ...
## Arguments & Sequences
### Argument 1: [Name]
**Premise:** [Starting condition]
**Sequence:**
1. [Step 1]
2. [Step 2]
**Conclusion:** [Result]
**Decision points:** [Choices]
### Argument 2: ...
## Solutions & Examples
### Example 1: [Name]
**Problem:** [Context]
**Application:** [How applied]
**Outcome:** [Result]
### Example 2: ...
## Gaps Identified
- [Gap 1 to address in synthesis]
- [Gap 2]
## User Validation
**Status:** [Approved / Needs revision]
**User notes:** [Feedback]
EOF
```
**Update global context:**
```bash
cat >> "$SESSION_DIR/global-context.md" << 'EOF'
## Step 3 Complete
**Components extracted:** [X terms, Y propositions, Z arguments, W examples]
**Gaps identified:** [List if any]
EOF
```
**Next step:** Step 4 (Synthesis) will read `global-context.md` + `step-3-output.md`.