Files

Zhongwei Li 41d9f6b189 Initial commit

2025-11-30 08:38:26 +08:00

9.4 KiB

Raw Permalink Blame History

Component Extraction

This resource supports Step 3 of the Skill Creator workflow.

Input files: $SESSION_DIR/global-context.md, $SESSION_DIR/step-2-output.md, $SOURCE_DOC (section-by-section reading) Output files: $SESSION_DIR/step-3-output.md, $SESSION_DIR/step-3-extraction-workspace.md (intermediate), updates global-context.md

Stage goal: Extract all actionable components systematically using the interpretive reading approach.

Why Reading Strategy

WHY Strategy Selection Matters

Different document sizes and structures need different reading approaches:

Too large to read at once: Context overflow, can't hold everything in memory
Too structured for linear reading: Miss relationships between non-adjacent sections
Too dense for single pass: Need multiple focused extractions

Mental model: You wouldn't read a 500-page book the same way you read a 10-page article. Match your reading strategy to the document characteristics.

Without a deliberate strategy: context overflow, missed content, inefficient extraction, fatigue.

WHAT Strategies Exist

Choose based on document characteristics identified in Steps 1-2:

Three strategies:

Section-Based: Document has clear sections < 50 pages. Read one section, extract, write notes, clear, repeat.
Windowing: Long document > 50 pages without clear breaks. Read 200-line chunks with 20-line overlap.
Targeted: Hybrid content. Based on Step 2, read only high-value sections identified.

WHAT to Decide

Present options to user:

## Reading Strategy Options

Based on document analysis:
- Size: [X pages/lines]
- Structure: [Clear sections / Continuous flow]
- Relevant parts: [All / Specific sections]

**Recommended strategy:** [Section-based / Windowing / Targeted]

**Rationale:** [Why this strategy fits]

**Alternative approach:** [If applicable]

Which approach would you like to use?

Section-Based Extraction

WHY Programmatic Approach

Reading entire document into context:

Floods context with potentially thousands of lines
Makes it hard to focus on one section at a time
Risk of losing extracted content before writing it down

Solution: Read one section, extract components, write to intermediate file, clear from context, repeat.

WHAT to Do

Step 1: Read Global Context and Previous Output

# Read what we know so far
Read("$SESSION_DIR/global-context.md")
Read("$SESSION_DIR/step-2-output.md")

From step-2-output, you'll have the list of major parts/sections.

Step 2: Initialize Extraction File

# Create extraction workspace
cat > "$SESSION_DIR/step-3-extraction-workspace.md" << 'EOF'
# Component Extraction Workspace

## Sections Processed

[Mark sections as you complete them]

## Extracted Components

[Append after each section]

EOF

Step 3: Process Each Section

For each section from step-2-output:

# Example for Section 1
# Read just that section from source document
Read("$SOURCE_DOC", offset=[start_line], limit=[section_length])

# Extract components from this section following the guides below:
# - Extract Terms (see Why Extract Terms section)
# - Extract Propositions (see Why Extract Propositions section)
# - Extract Arguments (see Why Extract Arguments section)
# - Extract Solutions (see Why Extract Solutions section)

# Write extraction notes to workspace
cat >> "$SESSION_DIR/step-3-extraction-workspace.md" << 'EOF'

### Section [X]: [Section Name]

**Terms extracted:**
- [Term 1]: [Definition]
- [Term 2]: [Definition]

**Propositions extracted:**
- [Proposition 1]
- [Proposition 2]

**Arguments extracted:**
- [Argument 1]

**Solutions/Examples:**
- [Example 1]

EOF

# Mark section as complete
echo "- [x] Section [X]: [Name]" >> "$SESSION_DIR/step-3-extraction-workspace.md"

# Clear this section from context (it's now in the file)
# Move to next section

Repeat for all sections.

Step 4: Synthesize Extraction Notes

After all sections processed:

# Read the extraction workspace
Read("$SESSION_DIR/step-3-extraction-workspace.md")

# Synthesize into final step-3-output
# Combine all terms, remove duplicates
# Combine all propositions, identify core ones
# Combine all arguments, identify workflow sequences
# Combine all solutions/examples

Write final output: (see end of this file for output template)

Why Extract Terms

WHY Key Terms Matter

Terms are the building blocks of understanding:

They're the specialized vocabulary of the methodology
Define the conceptual framework
Must be understood to apply the skill
Become the "Key Concepts" section of your skill

Adler's rule: "Come to terms with the author by interpreting key words."

Mental model: Terms are like the variables in an equation. You can't solve the equation without knowing what the variables mean.

Without term extraction: users misunderstand the skill, can't apply it correctly, confusion about core concepts.

WHAT to Extract

Look for: Defined terms, repeated concepts, technical vocabulary, emphasized terms. Skip: Common words, one-off mentions.

Format per term: Name, Definition, Context (why it matters), Usage (how used in practice).

How many: 5-15 terms. Test: Would users be confused without this term?

Why Extract Propositions

WHY Propositions Matter

Propositions are the key assertions or principles:

They're the "truths" the author claims
Form the theoretical foundation
Often become guidelines or principles in your skill
Different from process steps (those come from arguments)

Adler's rule: "Grasp the author's leading propositions by dealing with important sentences."

Mental model: Propositions are like theorems in mathematics - fundamental truths that everything else builds on.

Without proposition extraction: shallow skill that misses underlying principles, can't explain WHY the method works.

WHAT to Extract

Look for: Declarative/principle/causal statements. Signal phrases: "key insight", "research shows", "fundamental principle".

Format per proposition: Short title, Statement (one sentence), Evidence (why true), Implication (for practice).

How many: 5-10 core propositions that explain why methodology works.

Why Extract Arguments

WHY Arguments Matter

Arguments are the logical sequences that connect premises to conclusions:

They explain HOW the methodology works
Often become the step-by-step workflow
Show dependencies and order
Reveal decision points

Adler's rule: "Know the author's arguments by finding them in sequences of sentences."

Mental model: Arguments are like algorithms - step-by-step logic from inputs to outputs.

Without argument extraction: missing the procedural flow, unclear how to apply the methodology, no decision logic.

WHAT to Extract

Look for: If-then sequences, step sequences, causal chains, decision trees. Signal phrases: "the process is", "follow these steps", "this leads to".

Format per argument: Name, Premise, Sequence (numbered steps), Conclusion, Decision points.

Map to workflow: Sequential → linear steps; Decision-tree → branching; Parallel → optional/modular.

Why Extract Solutions

WHY Solutions Matter

Solutions are the practical applications and outcomes:

They show what success looks like
Often become examples in your skill
Reveal edge cases and variations
Demonstrate application in different contexts

Adler's rule: "Determine which of the problems the author has solved, and which they have not."

Mental model: Solutions are the "proof" that the methodology works - concrete instances of application.

Without solution extraction: theoretical skill without practical grounding, unclear what success looks like, no examples.

WHAT to Extract

Look for: Examples, case studies, templates, before/after, success criteria. Types: worked examples, templates, checklists, success indicators.

Format per solution: Name, Problem, Application (how methodology applied), Outcome, Key factors, Transferability.

For templates: Name, Purpose, Structure outline, How to use (steps), Example if available.

Write Step 3 Output

After completing extraction from all sections and getting user validation, write to output file:

cat > "$SESSION_DIR/step-3-output.md" << 'EOF'
# Step 3: Component Extraction Output

## Key Terms (5-15)

### [Term 1]
**Definition:** [Clear definition]
**Context:** [Why it matters]

### [Term 2]
...

## Core Propositions (5-10)

1. **[Proposition title]:** [Statement]
   - Evidence: [Support]
   - Implication: [For practice]

2. **[Proposition 2]:** ...

## Arguments & Sequences

### Argument 1: [Name]
**Premise:** [Starting condition]
**Sequence:**
1. [Step 1]
2. [Step 2]
**Conclusion:** [Result]
**Decision points:** [Choices]

### Argument 2: ...

## Solutions & Examples

### Example 1: [Name]
**Problem:** [Context]
**Application:** [How applied]
**Outcome:** [Result]

### Example 2: ...

## Gaps Identified

- [Gap 1 to address in synthesis]
- [Gap 2]

## User Validation

**Status:** [Approved / Needs revision]
**User notes:** [Feedback]

EOF

Update global context:

cat >> "$SESSION_DIR/global-context.md" << 'EOF'

## Step 3 Complete

**Components extracted:** [X terms, Y propositions, Z arguments, W examples]
**Gaps identified:** [List if any]

EOF

Next step: Step 4 (Synthesis) will read global-context.md + step-3-output.md.

9.4 KiB Raw Permalink Blame History

Component Extraction

Why Reading Strategy

WHY Strategy Selection Matters

WHAT Strategies Exist

WHAT to Decide

Section-Based Extraction

WHY Programmatic Approach

WHAT to Do

Step 1: Read Global Context and Previous Output

Step 2: Initialize Extraction File

Step 3: Process Each Section

Step 4: Synthesize Extraction Notes

Why Extract Terms

WHY Key Terms Matter

WHAT to Extract

Why Extract Propositions

WHY Propositions Matter

WHAT to Extract

Why Extract Arguments

WHY Arguments Matter

WHAT to Extract

Why Extract Solutions

WHY Solutions Matter

WHAT to Extract

Write Step 3 Output

9.4 KiB

Raw Permalink Blame History