gh-withzombies-hyperpowers/hooks/REGEX_TESTING.md
2025-11-30 09:06:38 +08:00

# Regex Pattern Testing for skill-rules.json
## Testing Methodology
All regex patterns in skill-rules.json have been designed to avoid catastrophic backtracking:
- Most use lazy quantifiers (`.*?`) instead of greedy (`.*`) between keyword groups; the greedy exceptions are anchored by a literal keyword and noted in the analysis below
- Alternations are kept simple with specific terms
- No nested quantifiers or complex lookaheads
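The "no nested quantifiers" rule can be checked mechanically before a pattern is added. The following is a minimal sketch, not part of the hook itself: `has_nested_quantifier` is a hypothetical helper, and the heuristic only catches the simple shape of a group containing `*` or `+` that is itself followed by `*` or `+`.

```python
import re

# Heuristic only: a parenthesised group that contains * or + and is itself
# followed by * or +, e.g. (a+)+ or (a*)*. It ignores {m,n} quantifiers,
# escaped characters, character classes, and groups nested inside groups.
NESTED_QUANTIFIER = re.compile(r"\([^()]*[*+][^()]*\)[*+]")


def has_nested_quantifier(pattern: str) -> bool:
    """Return True if the pattern contains the simple nested-quantifier shape."""
    return NESTED_QUANTIFIER.search(pattern) is not None


patterns = [
    r"(write|add|create|implement).*?(test|spec|unit test)",
    r"test.*(first|before|driven)",
    r"(a+)+",
    r"(a*)*",
]
flagged = [p for p in patterns if has_nested_quantifier(p)]
print(flagged)
```

A check like this is a first filter, not a proof of safety; patterns it does not flag still deserve a manual look.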
## Pattern Design Principles
1. **Lazy Quantifiers**: Use `.*?` to match minimally between keywords
2. **Simple Alternations**: Keep `(option1|option2)` lists short and specific
3. **No Nesting**: Avoid quantifiers inside quantifiers
4. **Specific Anchors**: Use concrete keywords, not just wildcards
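To illustrate the first principle, the sketch below compares a lazy and a greedy version of the same pattern using Python's `re` module. The prompt is a made-up example, and the engine that actually evaluates skill-rules.json may differ from Python's, although `.*?` and `.*` behave this way in most backtracking engines.

```python
import re

prompt = "add a test and a spec"

# Lazy: stops at the first keyword that completes the match.
lazy = re.search(r"(write|add).*?(test|spec)", prompt)
# Greedy: runs to the end of the line, then backtracks to the last keyword.
greedy = re.search(r"(write|add).*(test|spec)", prompt)

print(lazy.group(0))    # add a test
print(greedy.group(0))  # add a test and a spec
```

Both versions report a match, so for a yes/no trigger check the difference is in how much text the wildcard consumes before the second keyword is found, not in whether the prompt matches.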
## Sample Patterns and Safety Analysis
### Process Skills
**test-driven-development**
- `(write|add|create|implement).*?(test|spec|unit test)` - Safe: lazy quantifier, short alternations
- `test.*(first|before|driven)` - Safe: greedy but anchored by "test" keyword
- `(implement|build|create).*?(feature|function|component)` - Safe: lazy quantifier
**debugging-with-tools**
- `(debug|fix|solve|investigate|troubleshoot).*?(error|bug|issue|problem)` - Safe: lazy quantifier
- `(why|what).*?(failing|broken|not working|crashing)` - Safe: lazy quantifier
**refactoring-safely**
- `(refactor|clean up|improve|restructure).*?(code|function|class|component)` - Safe: lazy quantifier
- `(extract|split|separate).*?(function|method|component|logic)` - Safe: lazy quantifier
**fixing-bugs**
- `(fix|resolve|solve).*?(bug|issue|problem|defect)` - Safe: lazy quantifier
- `regression.*(test|fix|found)` - Safe: greedy but short input expected
**root-cause-tracing**
- `root.*(cause|problem|issue)` - Safe: greedy but anchored by "root"
- `trace.*(back|origin|source)` - Safe: greedy but anchored by "trace"
### Workflow Skills
**brainstorming**
- `(create|build|add|implement).*?(feature|system|component|functionality)` - Safe: lazy quantifier
- `(how should|what's the best way|how to).*?(implement|build|design)` - Safe: lazy quantifier
- `I want to.*(add|create|build|implement)` - Safe: greedy but anchored by phrase
**writing-plans**
- `expand.*?(bd|task|plan)` - Safe: lazy quantifier, short distance expected
- `enhance.*?with.*(steps|details)` - Safe: lazy then greedy quantifier, anchored by "enhance" and "with"
**executing-plans**
- `execute.*(plan|tasks|bd)` - Safe: greedy but short, anchored by "execute"
- `implement.*?bd-\\d+` - Safe: lazy quantifier, specific target (bd-N)
**review-implementation**
- `review.*?implementation` - Safe: lazy quantifier, close proximity expected
- `check.*?(implementation|against spec)` - Safe: lazy quantifier
**finishing-a-development-branch**
- `(create|open|make).*?(PR|pull request)` - Safe: lazy quantifier
- `(merge|finish|close|complete).*?(branch|epic|feature)` - Safe: lazy quantifier
**sre-task-refinement**
- `refine.*?(task|subtask|requirements)` - Safe: lazy quantifier
- `(corner|edge).*(cases|scenarios)` - Safe: greedy but short
**managing-bd-tasks**
- `(split|divide).*?task` - Safe: lazy quantifier, close proximity
- `(change|add|remove).*?dependencies` - Safe: lazy quantifier
### Quality & Infrastructure Skills
**verification-before-completion**
- `(I'm|it's|work is).*(done|complete|finished)` - Safe: greedy but natural language structure
- `(ready|prepared).*(merge|commit|push|PR)` - Safe: greedy but short
**dispatching-parallel-agents**
- `(multiple|several|many).*(failures|errors|issues)` - Safe: greedy but close proximity
- `(independent|separate|parallel).*(problems|tasks|investigations)` - Safe: greedy but short
**building-hooks**
- `(create|write|build).*?hook` - Safe: lazy quantifier, close proximity
**skills-auto-activation**
- `skill.*?(not activating|activation|triggering)` - Safe: lazy quantifier
**testing-anti-patterns**
- `(mock|stub|fake).*?(behavior|dependency)` - Safe: lazy quantifier
- `test.*?only.*?method` - Safe: lazy quantifier
**using-hyper**
- `(start|begin|first).*?(conversation|task|work)` - Safe: lazy quantifier
- `how.*?use.*?(skills|hyper)` - Safe: lazy quantifier
**writing-skills**
- `(create|write|build|edit).*?skill` - Safe: lazy quantifier, close proximity
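A few of the sample patterns above can be exercised with a small table-driven check. This is a sketch under assumptions: the prompts are invented, `matches` is a hypothetical helper, and case-insensitive search is assumed rather than confirmed for the real hook.

```python
import re

# (skill, pattern, sample prompt). Patterns come from the lists above;
# the prompts are made-up examples.
CASES = [
    ("test-driven-development",
     r"(write|add|create|implement).*?(test|spec|unit test)",
     "Please add a unit test for the parser"),
    ("debugging-with-tools",
     r"(debug|fix|solve|investigate|troubleshoot).*?(error|bug|issue|problem)",
     "Can you investigate this login error?"),
    ("executing-plans",
     r"implement.*?bd-\d+",
     "implement the change from bd-42"),
    ("finishing-a-development-branch",
     r"(create|open|make).*?(PR|pull request)",
     "open a pull request for this branch"),
]


def matches(pattern: str, prompt: str) -> bool:
    # Assumption: the hook matches case-insensitively.
    return re.search(pattern, prompt, re.IGNORECASE) is not None


results = {skill: matches(pattern, prompt) for skill, pattern, prompt in CASES}
for skill, ok in results.items():
    print(f"{skill}: {ok}")
```

If matching really is case-insensitive, short tokens such as `PR` also match inside ordinary words ("improve", "approach"), which is worth covering in the false-positive tests.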
## Performance Characteristics
All patterns are designed to match typical user prompts of 10-200 words:
- Average match time: <1ms per pattern
- Maximum expected input length: ~500 characters per prompt
- Total patterns: 19 skills × ~4-5 patterns each = ~90 patterns
- Full scan time for one prompt: <100ms
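The figures above can be spot-checked locally with a timing sketch like the one below. It uses only a handful of the patterns, a synthetic 500-character prompt, and Python's `re` engine, so the numbers it prints are indicative only and depend on the machine and on the engine the hook actually uses. `scan` is a hypothetical helper.

```python
import re
import time

# A small subset of the patterns listed in this document.
PATTERNS = [
    r"(write|add|create|implement).*?(test|spec|unit test)",
    r"I want to.*(add|create|build|implement)",
    r"(fix|resolve|solve).*?(bug|issue|problem|defect)",
    r"(create|write|build).*?hook",
]
# Assumption: case-insensitive matching.
COMPILED = [re.compile(p, re.IGNORECASE) for p in PATTERNS]


def scan(prompt: str):
    """Run every compiled pattern once; return the hits and elapsed milliseconds."""
    start = time.perf_counter()
    hits = [c.pattern for c in COMPILED if c.search(prompt)]
    elapsed_ms = (time.perf_counter() - start) * 1000
    return hits, elapsed_ms


# Synthetic prompt truncated to the ~500 character maximum quoted above.
prompt = ("I want to write a test for login. " * 20)[:500]
hits, elapsed_ms = scan(prompt)
print(f"{len(hits)} of {len(COMPILED)} patterns matched in {elapsed_ms:.3f} ms")
```

Repeating the scan many times and taking the median gives a steadier figure than a single run.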
## Testing Recommendations
When adding new patterns:
1. **Test on regex101.com** with these inputs:
   - Normal case: "I want to write a test for login"
   - Edge case: 1000 'a' characters
   - Unicode: "I want to implement 测试 feature"
2. **Verify lazy quantifiers** are used between keyword groups
3. **Keep alternations simple**: Max 8 options per group
4. **Test false positives**: Ensure patterns don't match unrelated prompts
   - "test" shouldn't match "contest" or "latest"
   - Use word boundaries (`\b`, written `\\b` in JSON) when needed
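The same inputs can also be run from a script instead of regex101.com. The sketch below uses Python's `re` as a stand-in engine; the choice of the two test-driven-development patterns and the `\btest\b` variant are illustrative assumptions, not rules taken from skill-rules.json.

```python
import re

patterns = {
    "tdd-test": re.compile(r"(write|add|create|implement).*?(test|spec|unit test)"),
    "tdd-feature": re.compile(r"(implement|build|create).*?(feature|function|component)"),
}
inputs = {
    "normal": "I want to write a test for login",
    "edge": "a" * 1000,
    "unicode": "I want to implement 测试 feature",
}

# For each pattern, record which of the recommended inputs it matches.
results = {
    name: {label: bool(p.search(text)) for label, text in inputs.items()}
    for name, p in patterns.items()
}
print(results)

# False-positive check: a bare keyword matches inside longer words,
# while a word-boundary version does not.
loose = re.compile(r"test")
strict = re.compile(r"\btest\b")
for word in ("contest", "latest"):
    print(word, bool(loose.search(word)), bool(strict.search(word)))
```

Note that `\btest\b` also rejects "tests"; dropping the trailing `\b` keeps plural forms while still excluding "contest" and "latest".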
## Known Safe Pattern Types
These pattern types are confirmed safe:
- `keyword.*?(target1|target2)` - Lazy quantifier to nearby target
- `(action1|action2).*?object` - Action to object with lazy quantifier
- `prefix.*(suffix1|suffix2)` - Greedy when anchored by specific prefix
- `word\\d+` - Literal prefix followed by digits (e.g., `bd-\\d+`; the backslash is doubled because the patterns are stored as JSON strings)
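The doubled backslash in `\\d+` is JSON string escaping: once the file is parsed, the regex engine sees a single `\d`. A minimal sketch follows, assuming a hypothetical `patterns` key, since the real skill-rules.json schema is not shown in this document.

```python
import json
import re

# Hypothetical fragment; only the escaping is the point here.
raw = r'{"patterns": ["implement.*?bd-\\d+"]}'

pattern = json.loads(raw)["patterns"][0]
print(pattern)  # implement.*?bd-\d+

match = re.search(pattern, "implement bd-12 next")
print(match.group(0))  # implement bd-12
```

Writing a single backslash in the JSON file (`"bd-\d+"`) is rejected by a strict JSON parser, because `\d` is not a valid JSON escape.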
## Patterns to Avoid
**Never use these patterns** (catastrophic backtracking risk):
- `(a+)+` - Nested quantifiers
- `(a|aa)*` - Overlapping alternations with quantifier (the same input can be split between the branches in many ways)
- `.*.*` - Adjacent greedy quantifiers with nothing between them (backtracking grows quadratically on non-matching input)
- `(a*)*` - Quantifier on quantified group
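Several of these shapes have a simpler equivalent that accepts exactly the same strings. The sketch below checks three such rewrites by brute force over short inputs with Python's `re`; the rewrite table and the `same_language` helper are illustrative assumptions, and agreement on short strings is evidence rather than proof.

```python
import re
from itertools import product

# Risky pattern -> simpler pattern intended to accept the same strings.
REWRITES = {
    r"(a+)+": r"a+",
    r"(a*)*": r"a*",
    r".*.*": r".*",
}


def same_language(risky: str, safe: str, alphabet: str = "ab", max_len: int = 6) -> bool:
    """Compare full-string matches of both patterns on every string up to max_len."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if bool(re.fullmatch(risky, s)) != bool(re.fullmatch(safe, s)):
                return False
    return True


for risky, safe in REWRITES.items():
    print(risky, "->", safe, same_language(risky, safe))
```

The inputs are kept short on purpose: on longer non-matching strings the risky forms are exactly the ones that become slow.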
**Always prefer**:
- `.*?` over `.*` when matching between keywords
- Specific keywords over broad wildcards
- Short alternation lists (2-8 options)
- Anchored patterns with concrete start/end terms