---
description: Reflect on the previous response and output, based on a self-refinement framework for iterative improvement with complexity triage and verification
argument-hint: None required - automatically reviews recent work output
---
# Self-Refinement and Iterative Improvement Framework
Reflect on the previous response and output.
## TASK COMPLEXITY TRIAGE
First, categorize the task to apply appropriate reflection depth:
### Quick Path (5-second check)
For simple tasks like:
- Single file edits
- Documentation updates
- Simple queries or explanations
- Straightforward bug fixes
**Skip to "Final Verification" section**
### Standard Path (Full reflection)
For tasks involving:
- Multiple file changes
- New feature implementation
- Architecture decisions
- Complex problem solving
**Follow complete framework + require confidence >70%**
### Deep Reflection Path
For critical tasks:
- Core system changes
- Security-related code
- Performance-critical sections
- API design decisions
**Follow framework + require confidence >90%**
## IMMEDIATE REFLECTION PROTOCOL
### Step 1: Initial Assessment
Before proceeding, evaluate your most recent output against these criteria:
1. **Completeness Check**
- [ ] Does the solution fully address the user's request?
- [ ] Are all requirements explicitly mentioned by the user covered?
- [ ] Are there any implicit requirements that should be addressed?
2. **Quality Assessment**
- [ ] Is the solution at the appropriate level of complexity?
- [ ] Could the approach be simplified without losing functionality?
- [ ] Are there obvious improvements that could be made?
3. **Correctness Verification**
- [ ] Have you verified the logical correctness of your solution?
- [ ] Are there edge cases that haven't been considered?
- [ ] Could there be unintended side effects?
4. **Fact-Checking Required**
- [ ] Have you made any claims about performance? (needs verification)
- [ ] Have you stated any technical facts? (needs source/verification)
- [ ] Have you referenced best practices? (needs validation)
- [ ] Have you made security assertions? (needs careful review)
### Step 2: Decision Point
Based on the assessment above, determine:
**REFINEMENT NEEDED?** [YES/NO]
If YES, proceed to Step 3. If NO, skip to Final Verification.
### Step 3: Refinement Planning
If improvement is needed, generate a specific plan:
1. **Identify Issues** (List specific problems found)
- Issue 1: [Describe]
- Issue 2: [Describe]
- ...
2. **Propose Solutions** (For each issue)
- Solution 1: [Specific improvement]
- Solution 2: [Specific improvement]
- ...
3. **Priority Order**
- Critical fixes first
- Performance improvements second
- Style/readability improvements last
### Concrete Example
**Issue Identified**: Function has 6 levels of nesting
**Solution**: Extract nested logic into separate functions
**Implementation**:
```javascript
// Before: three levels of nested conditionals
if (a) { if (b) { if (c) { /* ... */ } } }

// After: a guard clause plus extracted helpers
if (!shouldProcess(a, b, c)) return;
processData();
```
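A minimal sketch of what the extracted helpers could look like; `shouldProcess`, `processData`, and `handle` are hypothetical names used only for illustration:
```javascript
// Hypothetical extraction of the nested logic; names and bodies are placeholders.
function shouldProcess(a, b, c) {
  // The three nested conditions collapse into one readable predicate.
  return Boolean(a) && Boolean(b) && Boolean(c);
}

function processData() {
  // The previously nested body now lives at a single level of nesting.
}

function handle(a, b, c) {
  if (!shouldProcess(a, b, c)) return; // guard clause replaces the nested ifs
  processData();
}
```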
## CODE-SPECIFIC REFLECTION CRITERIA
When the output involves code, additionally evaluate:
### STOP: Library & Existing Solution Check
**BEFORE PROCEEDING WITH CUSTOM CODE:**
1. **Search for Existing Libraries**
- [ ] Have you searched npm/PyPI/Maven for existing solutions?
- [ ] Is this a common problem that others have already solved?
- [ ] Are you reinventing the wheel for utility functions?
**Common areas to check:**
- Date/time manipulation → moment.js, date-fns, dayjs
- Form validation → joi, yup, zod
- HTTP requests → axios, fetch, got
- State management → Redux, MobX, Zustand
- Utility functions → lodash, ramda, underscore
2. **Existing Service/Solution Evaluation**
- [ ] Could this be handled by an existing service/SaaS?
- [ ] Is there an open-source solution that fits?
- [ ] Would a third-party API be more maintainable?
**Examples:**
- Authentication → Auth0, Supabase, Firebase Auth
- Email sending → SendGrid, Mailgun, AWS SES
- File storage → S3, Cloudinary, Firebase Storage
- Search → Elasticsearch, Algolia, MeiliSearch
- Queue/Jobs → Bull, RabbitMQ, AWS SQS
3. **Decision Framework**
```
IF common utility function → Use established library
ELSE IF complex domain-specific → Check for specialized libraries
ELSE IF infrastructure concern → Look for managed services
ELSE → Consider custom implementation
```
4. **When Custom Code IS Justified**
- Specific business logic unique to your domain
- Performance-critical paths with special requirements
- When external dependencies would be overkill (e.g., lodash for one function)
- Security-sensitive code requiring full control
- When existing solutions don't meet requirements after evaluation
### Real Examples of Library-First Approach
**❌ BAD: Custom Implementation**
```javascript
// utils/dateFormatter.js
function formatDate(date) {
  const d = new Date(date);
  return `${d.getMonth() + 1}/${d.getDate()}/${d.getFullYear()}`;
}
```
**✅ GOOD: Use Existing Library**
```javascript
import { format } from 'date-fns';
const formatted = format(new Date(), 'MM/dd/yyyy');
```
**❌ BAD: Generic Utilities Folder**
```
/src/utils/
  - helpers.js
  - common.js
  - shared.js
```
**✅ GOOD: Domain-Driven Structure**
```
/src/order/
  - domain/OrderCalculator.js
  - infrastructure/OrderRepository.js
/src/user/
  - domain/UserValidator.js
  - application/UserRegistrationService.js
```
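As a rough illustration of this layout, a sketch of what a pure domain module might contain (file path, class name, and tax logic are assumptions, not prescriptions):
```javascript
// src/order/domain/OrderCalculator.js
// Pure domain logic: no framework, database, or HTTP imports.
export class OrderCalculator {
  constructor(taxRate) {
    this.taxRate = taxRate;
  }

  // Total = sum of line items plus tax, rounded to cents.
  total(lineItems) {
    const subtotal = lineItems.reduce(
      (sum, item) => sum + item.price * item.quantity,
      0
    );
    return Math.round(subtotal * (1 + this.taxRate) * 100) / 100;
  }
}
```
Infrastructure concerns such as persistence would live in `infrastructure/OrderRepository.js`, keeping the calculator testable in isolation.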
### Common Anti-Patterns to Avoid
1. **NIH (Not Invented Here) Syndrome**
- Building custom auth when Auth0/Supabase exists
- Writing custom state management instead of using Redux/Zustand
- Creating custom form validation instead of using Formik/React Hook Form
2. **Poor Architectural Choices**
- Mixing business logic with UI components
- Database queries in controllers
- No clear separation of concerns
3. **Generic Naming Anti-Patterns**
- `utils.js` with 50 unrelated functions
- `helpers/misc.js` as a dumping ground
- `common/shared.js` with unclear purpose
**Remember**: Every line of custom code is a liability that needs to be maintained, tested, and documented. Use existing solutions whenever possible.
### Architecture and Design
1. **Clean Architecture & DDD Alignment** (see the sketch after this list)
- [ ] Does naming follow the ubiquitous language of the domain?
- [ ] Are domain entities separated from infrastructure?
- [ ] Is business logic independent of frameworks?
- [ ] Are use cases clearly defined and isolated?
**Naming Convention Check:**
- Avoid generic names: `utils`, `helpers`, `common`, `shared`
- Use domain-specific names: `OrderCalculator`, `UserAuthenticator`
- Follow bounded context naming: `Billing.InvoiceGenerator`
2. **Design Patterns**
- Is the current design pattern appropriate?
- Could a different pattern simplify the solution?
- Are SOLID principles being followed?
3. **Modularity**
- Can the code be broken into smaller, reusable functions?
- Are responsibilities properly separated?
- Is there unnecessary coupling between components?
- Does each module have a single, clear purpose?
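To make these checks concrete, here is a minimal sketch of an isolated use case that depends on an abstraction rather than on infrastructure; `UserRegistrationService` matches the structure example above, while the repository shape is an assumption:
```javascript
// src/user/application/UserRegistrationService.js
// The use case receives its dependency (a repository "port") via the constructor,
// so the business rule stays independent of any framework or database.
export class UserRegistrationService {
  constructor(userRepository) {
    this.userRepository = userRepository; // any object with findByEmail/save
  }

  async register(email, passwordHash) {
    const existing = await this.userRepository.findByEmail(email);
    if (existing) {
      throw new Error(`User with email ${email} already exists`);
    }
    return this.userRepository.save({ email, passwordHash });
  }
}
```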
### Code Quality
1. **Simplification Opportunities**
- Can any complex logic be simplified?
- Are there redundant operations?
- Can loops be replaced with more elegant solutions?
2. **Performance Considerations**
- Are there obvious performance bottlenecks?
- Could algorithmic complexity be improved?
- Are resources being used efficiently?
- **IMPORTANT**: Any performance claims in comments must be verified
3. **Error Handling** (see the sketch after this list)
- Are all potential errors properly handled?
- Is error handling consistent throughout?
- Are error messages informative?
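For example, a small sketch of consistent, informative error handling; the `ValidationError` class and the port example are illustrative assumptions:
```javascript
// Distinguish expected input problems from unexpected failures, and make
// messages specific enough to act on.
class ValidationError extends Error {}

function parsePort(value) {
  const port = Number(value);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new ValidationError(
      `Invalid port "${value}": expected an integer between 1 and 65535`
    );
  }
  return port;
}

try {
  const port = parsePort(process.env.PORT);
  console.log(`Starting on port ${port}`);
} catch (err) {
  if (err instanceof ValidationError) {
    console.error(err.message); // expected, user-fixable problem
    process.exit(1);
  }
  throw err; // unexpected errors keep their stack trace
}
```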
### Testing and Validation
1. **Test Coverage**
- Are all critical paths tested?
- Missing edge cases to test:
- Boundary conditions
- Null/empty inputs
- Large/extreme values
- Concurrent access scenarios
- Are tests meaningful and not just for coverage?
2. **Test Quality** (see the AAA sketch after this list)
- Are tests independent and isolated?
- Do tests follow AAA pattern (Arrange, Act, Assert)?
- Are test names descriptive?
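For instance, a minimal AAA-style test, assuming a Jest-like runner and reusing the hypothetical `OrderCalculator` from the earlier sketch (the import path is illustrative):
```javascript
import { OrderCalculator } from '../src/order/domain/OrderCalculator.js';

test('total applies the tax rate to the sum of line items', () => {
  // Arrange: a calculator with a 10% tax rate and two line items
  const calculator = new OrderCalculator(0.1);
  const lineItems = [
    { price: 10, quantity: 2 },
    { price: 5, quantity: 1 },
  ];

  // Act: compute the total
  const total = calculator.total(lineItems);

  // Assert: (10 * 2 + 5 * 1) * 1.1 = 27.5
  expect(total).toBe(27.5);
});
```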
## FACT-CHECKING AND CLAIM VERIFICATION
### Claims Requiring Immediate Verification
1. **Performance Claims**
- "This is X% faster" → Requires benchmarking
- "This has O(n) complexity" → Requires analysis proof
- "This reduces memory usage" → Requires profiling
**Verification Method**: Run actual benchmarks where possible, or provide algorithmic analysis
2. **Technical Facts**
- "This API supports..." → Check official documentation
- "The framework requires..." → Verify with current docs
- "This library version..." → Confirm version compatibility
**Verification Method**: Cross-reference with official documentation
3. **Security Assertions**
- "This is secure against..." → Requires security analysis
- "This prevents injection..." → Needs proof/testing
- "This follows OWASP..." → Verify against standards
**Verification Method**: Reference security standards and test
4. **Best Practice Claims**
- "It's best practice to..." → Cite authoritative source
- "Industry standard is..." → Provide reference
- "Most developers prefer..." → Need data/surveys
**Verification Method**: Cite specific sources or standards
### Fact-Checking Checklist
- [ ] All performance claims have benchmarks or Big-O analysis
- [ ] Technical specifications match current documentation
- [ ] Security claims are backed by standards or testing
- [ ] Best practices are cited from authoritative sources
- [ ] Version numbers and compatibility are verified
- [ ] Statistical claims have sources or data
### Red Flags Requiring Double-Check
- Absolute statements ("always", "never", "only")
- Superlatives ("best", "fastest", "most secure")
- Specific numbers without context (percentages, metrics)
- Claims about third-party tools/libraries
- Historical or temporal claims ("recently", "nowadays")
### Concrete Example of Fact-Checking
**Claim Made**: "Using Map is 50% faster than using Object for this use case"
**Verification Process**:
1. Search for benchmark or documentation comparing both approaches
2. Provide algorithmic analysis
**Corrected Statement**: "Map performs better for large collections (10K+ items), while Object is more efficient for small sets (<100 items)"
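As an example of the evidence step 1 calls for, a rough micro-benchmark sketch; numbers vary by engine and workload, so treat the output as input to the analysis rather than proof of a blanket claim:
```javascript
// Compare insertion into a plain object vs. a Map for a large key set.
// A real verification would use a dedicated benchmarking tool and multiple runs.
const N = 100_000;

console.time('object insert');
const obj = {};
for (let i = 0; i < N; i++) obj[`key${i}`] = i;
console.timeEnd('object insert');

console.time('map insert');
const map = new Map();
for (let i = 0; i < N; i++) map.set(`key${i}`, i);
console.timeEnd('map insert');
```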
## NON-CODE OUTPUT REFLECTION
For documentation, explanations, and analysis outputs:
### Content Quality
1. **Clarity and Structure**
- Is the information well-organized?
- Are complex concepts explained simply?
- Is there a logical flow of ideas?
2. **Completeness**
- Are all aspects of the question addressed?
- Are examples provided where helpful?
- Are limitations or caveats mentioned?
3. **Accuracy**
- Are technical details correct?
- Are claims verifiable?
- Are sources or reasoning provided?
### Improvement Triggers for Non-Code
- Ambiguous explanations
- Missing context or background
- Overly complex language for the audience
- Lack of concrete examples
- Unsubstantiated claims
## ITERATIVE REFINEMENT WORKFLOW
### Chain of Verification (CoV)
1. **Generate**: Create initial solution
2. **Verify**: Check each component/claim
3. **Question**: What could go wrong?
4. **Re-answer**: Address identified issues
### Tree of Thoughts (ToT)
For complex problems, consider multiple approaches:
1. **Branch 1**: Current approach
- Pros: [List advantages]
- Cons: [List disadvantages]
2. **Branch 2**: Alternative approach
- Pros: [List advantages]
- Cons: [List disadvantages]
3. **Decision**: Choose best path based on:
- Simplicity
- Maintainability
- Performance
- Extensibility
## REFINEMENT TRIGGERS
Automatically trigger refinement if any of these conditions are met:
1. **Complexity Threshold** (enforceable with a linter; see the sketch after this list)
- Cyclomatic complexity > 10
- Nested depth > 3 levels
- Function length > 50 lines
2. **Code Smells**
- Duplicate code blocks
- Long parameter lists (>4)
- God classes/functions
- Magic numbers/strings
- Generic utility folders (`utils/`, `helpers/`, `common/`)
- NIH syndrome indicators (custom implementations of standard solutions)
3. **Missing Elements**
- No error handling
- No input validation
- No documentation for complex logic
- No tests for critical functionality
- No library search for common problems
- No consideration of existing services
4. **Architecture Violations**
- Business logic in controllers/views
- Domain logic depending on infrastructure
- Unclear boundaries between contexts
- Generic naming instead of domain terms
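Many of these thresholds can be checked mechanically. A minimal `.eslintrc.js` sketch, assuming an ESLint setup, with rule values mirroring the thresholds above:
```javascript
// .eslintrc.js (legacy config format): warn when refinement triggers are hit.
module.exports = {
  rules: {
    complexity: ['warn', 10],               // cyclomatic complexity > 10
    'max-depth': ['warn', 3],               // nesting deeper than 3 levels
    'max-lines-per-function': ['warn', 50], // functions longer than 50 lines
    'max-params': ['warn', 4],              // long parameter lists (>4)
    'no-magic-numbers': ['warn', { ignore: [0, 1] }], // magic numbers
  },
};
```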
## FINAL VERIFICATION
Before finalizing any output:
### Self-Refine Checklist
- [ ] Have I considered at least one alternative approach?
- [ ] Have I verified my assumptions?
- [ ] Is this the simplest correct solution?
- [ ] Would another developer easily understand this?
- [ ] Have I anticipated likely future requirements?
- [ ] Have all factual claims been verified or sourced?
- [ ] Are performance/security assertions backed by evidence?
- [ ] Did I search for existing libraries before writing custom code?
- [ ] Is the architecture aligned with Clean Architecture/DDD principles?
- [ ] Are names domain-specific rather than generic (utils/helpers)?
### Reflexion Questions
1. **What worked well in this solution?**
2. **What could be improved?**
3. **What would I do differently next time?**
4. **Are there patterns here that could be reused?**
## IMPROVEMENT DIRECTIVE
If after reflection you identify improvements:
1. **STOP** current implementation
2. **SEARCH** for existing solutions before continuing
- Check package registries (npm, PyPI, etc.)
- Research existing services/APIs
- Review architectural patterns and libraries
3. **DOCUMENT** the improvements needed
- Why custom vs library?
- What architectural pattern fits?
- How does it align with Clean Architecture/DDD?
4. **IMPLEMENT** the refined solution
5. **RE-EVALUATE** using this framework again
## CONFIDENCE ASSESSMENT
Rate your confidence in the current solution:
- [ ] High (>90%) - Solution is robust and well-tested
- [ ] Medium (70-90%) - Solution works but could be improved
- [ ] Low (<70%) - Significant improvements needed
If confidence does not meet the threshold set by the TASK COMPLEXITY TRIAGE, iterate again.
## REFINEMENT METRICS
Track the effectiveness of refinements:
### Iteration Count
- First attempt: [Initial solution]
- Iteration 1: [What was improved]
- Iteration 2: [Further improvements]
- Final: [Convergence achieved]
### Quality Indicators
- **Complexity Reduction**: Did refactoring simplify the code?
- **Bug Prevention**: Were potential issues identified and fixed?
- **Performance Gain**: Was efficiency improved?
- **Readability Score**: Is the final version clearer?
### Learning Points
Document patterns for future use:
- What type of issue was this?
- What solution pattern worked?
- Can this be reused elsewhere?
---
**REMEMBER**: The goal is not perfection on the first try, but continuous improvement through structured reflection. Each iteration should bring the solution closer to optimal.