| description | argument-hint |
|---|---|
| Reflect on previous response and output, based on the Self-Refinement framework for iterative improvement with complexity triage and verification | None required - automatically reviews recent work output |
Self-Refinement and Iterative Improvement Framework
Reflect on the previous response and output.
TASK COMPLEXITY TRIAGE
First, categorize the task to apply appropriate reflection depth:
Quick Path (5-second check)
For simple tasks like:
- Single file edits
- Documentation updates
- Simple queries or explanations
- Straightforward bug fixes
→ Skip to "Final Verification" section
Standard Path (Full reflection)
For tasks involving:
- Multiple file changes
- New feature implementation
- Architecture decisions
- Complex problem solving
→ Follow complete framework + require confidence >70%
Deep Reflection Path
For critical tasks:
- Core system changes
- Security-related code
- Performance-critical sections
- API design decisions
→ Follow framework + require confidence >90%
IMMEDIATE REFLECTION PROTOCOL
Step 1: Initial Assessment
Before proceeding, evaluate your most recent output against these criteria:
- Completeness Check
  - Does the solution fully address the user's request?
  - Are all requirements explicitly mentioned by the user covered?
  - Are there any implicit requirements that should be addressed?
- Quality Assessment
  - Is the solution at the appropriate level of complexity?
  - Could the approach be simplified without losing functionality?
  - Are there obvious improvements that could be made?
- Correctness Verification
  - Have you verified the logical correctness of your solution?
  - Are there edge cases that haven't been considered?
  - Could there be unintended side effects?
- Fact-Checking Required
  - Have you made any claims about performance? (needs verification)
  - Have you stated any technical facts? (needs source/verification)
  - Have you referenced best practices? (needs validation)
  - Have you made security assertions? (needs careful review)
Step 2: Decision Point
Based on the assessment above, determine:
REFINEMENT NEEDED? [YES/NO]
If YES, proceed to Step 3. If NO, skip to Final Verification.
Step 3: Refinement Planning
If improvement is needed, generate a specific plan:
- Identify Issues (list specific problems found)
  - Issue 1: [Describe]
  - Issue 2: [Describe]
  - ...
- Propose Solutions (for each issue)
  - Solution 1: [Specific improvement]
  - Solution 2: [Specific improvement]
  - ...
- Priority Order
  - Critical fixes first
  - Performance improvements second
  - Style/readability improvements last
Concrete Example
Issue Identified: Function has 6 levels of nesting
Solution: Extract nested logic into separate functions
Implementation:

```js
// Before
if (a) { if (b) { if (c) { /* ... */ } } }

// After
if (!shouldProcess(a, b, c)) return;
processData();
```
CODE-SPECIFIC REFLECTION CRITERIA
When the output involves code, additionally evaluate:
STOP: Library & Existing Solution Check
BEFORE PROCEEDING WITH CUSTOM CODE:
- Search for Existing Libraries
  - Have you searched npm/PyPI/Maven for existing solutions?
  - Is this a common problem that others have already solved?
  - Are you reinventing the wheel for utility functions?

  Common areas to check:
  - Date/time manipulation → moment.js, date-fns, dayjs
  - Form validation → joi, yup, zod
  - HTTP requests → axios, fetch, got
  - State management → Redux, MobX, Zustand
  - Utility functions → lodash, ramda, underscore

- Existing Service/Solution Evaluation
  - Could this be handled by an existing service/SaaS?
  - Is there an open-source solution that fits?
  - Would a third-party API be more maintainable?

  Examples:
  - Authentication → Auth0, Supabase, Firebase Auth
  - Email sending → SendGrid, Mailgun, AWS SES
  - File storage → S3, Cloudinary, Firebase Storage
  - Search → Elasticsearch, Algolia, MeiliSearch
  - Queue/Jobs → Bull, RabbitMQ, AWS SQS

- Decision Framework

  ```
  IF common utility function → Use established library
  ELSE IF complex domain-specific → Check for specialized libraries
  ELSE IF infrastructure concern → Look for managed services
  ELSE → Consider custom implementation
  ```

- When Custom Code IS Justified
  - Specific business logic unique to your domain
  - Performance-critical paths with special requirements
  - When external dependencies would be overkill (e.g., lodash for one function)
  - Security-sensitive code requiring full control
  - When existing solutions don't meet requirements after evaluation
Real Examples of Library-First Approach
❌ BAD: Custom Implementation

```js
// utils/dateFormatter.js
function formatDate(date) {
  const d = new Date(date);
  return `${d.getMonth() + 1}/${d.getDate()}/${d.getFullYear()}`;
}
```

✅ GOOD: Use Existing Library

```js
import { format } from 'date-fns';

const formatted = format(new Date(), 'MM/dd/yyyy');
```

❌ BAD: Generic Utilities Folder

```
/src/utils/
  - helpers.js
  - common.js
  - shared.js
```

✅ GOOD: Domain-Driven Structure

```
/src/order/
  - domain/OrderCalculator.js
  - infrastructure/OrderRepository.js
/src/user/
  - domain/UserValidator.js
  - application/UserRegistrationService.js
```
Common Anti-Patterns to Avoid
- NIH (Not Invented Here) Syndrome (see the sketch after this list)
  - Building custom auth when Auth0/Supabase exists
  - Writing custom state management instead of using Redux/Zustand
  - Creating custom form validation instead of using Formik/React Hook Form
- Poor Architectural Choices
  - Mixing business logic with UI components
  - Database queries in controllers
  - No clear separation of concerns
- Generic Naming Anti-Patterns
  - `utils.js` with 50 unrelated functions
  - `helpers/misc.js` as a dumping ground
  - `common/shared.js` with unclear purpose
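To make the validation point concrete, here is a minimal, hypothetical sketch of the library-first approach, assuming the zod package from the list above; the schema fields and rules are illustrative only.

```js
// Minimal sketch (assumes zod is installed); field names and rules are hypothetical.
import { z } from 'zod';

const registrationSchema = z.object({
  email: z.string().email(),    // format rule comes from the library
  password: z.string().min(12), // length rule is declared, not hand-rolled
});

// safeParse reports failures instead of throwing
const result = registrationSchema.safeParse({ email: 'a@b.co', password: 'short' });
if (!result.success) {
  console.log(result.error.issues.map((issue) => issue.message));
}
```

The same checks written by hand would need their own regexes, edge-case handling, and tests, which is exactly the liability described below.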
Remember: Every line of custom code is a liability that needs to be maintained, tested, and documented. Use existing solutions whenever possible.
Architecture and Design
- Clean Architecture & DDD Alignment
  - Does naming follow the ubiquitous language of the domain?
  - Are domain entities separated from infrastructure?
  - Is business logic independent of frameworks?
  - Are use cases clearly defined and isolated?

  Naming Convention Check:
  - Avoid generic names: `utils`, `helpers`, `common`, `shared`
  - Use domain-specific names: `OrderCalculator`, `UserAuthenticator`
  - Follow bounded context naming: `Billing.InvoiceGenerator`

- Design Patterns (see the sketch after this list)
  - Is the current design pattern appropriate?
  - Could a different pattern simplify the solution?
  - Are SOLID principles being followed?

- Modularity
  - Can the code be broken into smaller, reusable functions?
  - Are responsibilities properly separated?
  - Is there unnecessary coupling between components?
  - Does each module have a single, clear purpose?
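As one illustration of the SOLID point above (dependency inversion), here is a minimal sketch; OrderCalculator and the pricing-rules object are hypothetical names, not part of any existing codebase.

```js
// Hypothetical domain service: it depends on an abstraction passed in,
// not on a concrete database or HTTP client (dependency inversion).
class OrderCalculator {
  constructor(pricingRules) {
    this.pricingRules = pricingRules; // any object exposing discountFor(customer)
  }

  total(order, customer) {
    const subtotal = order.items.reduce((sum, item) => sum + item.price * item.qty, 0);
    return subtotal * (1 - this.pricingRules.discountFor(customer));
  }
}

// The infrastructure detail is injected at the edge of the application.
const calculator = new OrderCalculator({ discountFor: () => 0.1 });
console.log(calculator.total({ items: [{ price: 100, qty: 2 }] }, { id: 42 })); // 180
```

Because the calculator only sees the `discountFor` abstraction, the pricing source can change (config file, database, remote service) without touching the domain logic.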
Code Quality
- Simplification Opportunities
  - Can any complex logic be simplified?
  - Are there redundant operations?
  - Can loops be replaced with more elegant solutions?
- Performance Considerations
  - Are there obvious performance bottlenecks?
  - Could algorithmic complexity be improved?
  - Are resources being used efficiently?
  - IMPORTANT: Any performance claims in comments must be verified
- Error Handling (see the sketch after this list)
  - Are all potential errors properly handled?
  - Is error handling consistent throughout?
  - Are error messages informative?
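To show what "consistent and informative" can look like, here is a minimal sketch; the error class, repository, and function names are hypothetical.

```js
// Hypothetical example: a domain-specific error type that carries context,
// thrown instead of silently returning undefined.
class OrderNotFoundError extends Error {
  constructor(orderId) {
    super(`Order ${orderId} was not found`);
    this.name = 'OrderNotFoundError';
    this.orderId = orderId;
  }
}

async function loadOrder(repository, orderId) {
  const order = await repository.findById(orderId);
  if (!order) {
    throw new OrderNotFoundError(orderId); // informative message, typed error
  }
  return order;
}
```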
Testing and Validation
- Test Coverage
  - Are all critical paths tested?
  - Missing edge cases to test:
    - Boundary conditions
    - Null/empty inputs
    - Large/extreme values
    - Concurrent access scenarios
  - Are tests meaningful and not just for coverage?
- Test Quality (see the sketch after this list)
  - Are tests independent and isolated?
  - Do tests follow the AAA pattern (Arrange, Act, Assert)?
  - Are test names descriptive?
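For reference, a minimal sketch of the AAA pattern, assuming a Jest-style runner; the function under test is hypothetical and would normally be imported from the module it lives in.

```js
// Hypothetical unit under test; in a real project this would be imported.
function applyDiscount(subtotal, discountRate) {
  return subtotal * (1 - discountRate);
}

test('applies the discount rate to the subtotal', () => {
  // Arrange: set up inputs
  const subtotal = 200;
  const discountRate = 0.25;

  // Act: run the behaviour under test
  const total = applyDiscount(subtotal, discountRate);

  // Assert: verify the observable result
  expect(total).toBe(150);
});
```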
FACT-CHECKING AND CLAIM VERIFICATION
Claims Requiring Immediate Verification
- Performance Claims
  - "This is X% faster" → Requires benchmarking
  - "This has O(n) complexity" → Requires analysis proof
  - "This reduces memory usage" → Requires profiling

  Verification Method: Run actual benchmarks where possible, or provide algorithmic analysis

- Technical Facts
  - "This API supports..." → Check official documentation
  - "The framework requires..." → Verify with current docs
  - "This library version..." → Confirm version compatibility

  Verification Method: Cross-reference with official documentation

- Security Assertions
  - "This is secure against..." → Requires security analysis
  - "This prevents injection..." → Needs proof/testing
  - "This follows OWASP..." → Verify against standards

  Verification Method: Reference security standards and test

- Best Practice Claims
  - "It's best practice to..." → Cite an authoritative source
  - "Industry standard is..." → Provide a reference
  - "Most developers prefer..." → Needs data/surveys

  Verification Method: Cite specific sources or standards
Fact-Checking Checklist
- All performance claims have benchmarks or Big-O analysis
- Technical specifications match current documentation
- Security claims are backed by standards or testing
- Best practices are cited from authoritative sources
- Version numbers and compatibility are verified
- Statistical claims have sources or data
Red Flags Requiring Double-Check
- Absolute statements ("always", "never", "only")
- Superlatives ("best", "fastest", "most secure")
- Specific numbers without context (percentages, metrics)
- Claims about third-party tools/libraries
- Historical or temporal claims ("recently", "nowadays")
Concrete Example of Fact-Checking
Claim Made: "Using Map is 50% faster than using Object for this use case"

Verification Process:
- Search for a benchmark or documentation comparing both approaches
- Provide algorithmic analysis

Corrected Statement: "Map performs better for large collections (10K+ items), while Object is more efficient for small sets (<100 items)"
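A rough sketch of how such a claim could be checked before it is stated; the collection size is hypothetical, and a single run on one machine proves very little, so treat the output as a hint rather than evidence.

```js
// Illustrative micro-benchmark only: results vary with runtime, data shape,
// and JIT warm-up. `performance` is available as a global in modern Node and browsers.
function timeIt(label, fn) {
  const start = performance.now();
  fn();
  console.log(`${label}: ${(performance.now() - start).toFixed(2)} ms`);
}

const N = 100_000; // hypothetical collection size under discussion

timeIt('Object insert + lookup', () => {
  const obj = {};
  let sum = 0;
  for (let i = 0; i < N; i++) obj[`key${i}`] = i;
  for (let i = 0; i < N; i++) sum += obj[`key${i}`];
  return sum;
});

timeIt('Map insert + lookup', () => {
  const map = new Map();
  let sum = 0;
  for (let i = 0; i < N; i++) map.set(`key${i}`, i);
  for (let i = 0; i < N; i++) sum += map.get(`key${i}`);
  return sum;
});
```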
NON-CODE OUTPUT REFLECTION
For documentation, explanations, and analysis outputs:
Content Quality
- Clarity and Structure
  - Is the information well-organized?
  - Are complex concepts explained simply?
  - Is there a logical flow of ideas?
- Completeness
  - Are all aspects of the question addressed?
  - Are examples provided where helpful?
  - Are limitations or caveats mentioned?
- Accuracy
  - Are technical details correct?
  - Are claims verifiable?
  - Are sources or reasoning provided?
Improvement Triggers for Non-Code
- Ambiguous explanations
- Missing context or background
- Overly complex language for the audience
- Lack of concrete examples
- Unsubstantiated claims
ITERATIVE REFINEMENT WORKFLOW
Chain of Verification (CoV)
- Generate: Create initial solution
- Verify: Check each component/claim
- Question: What could go wrong?
- Re-answer: Address identified issues
Tree of Thoughts (ToT)
For complex problems, consider multiple approaches:
- Branch 1: Current approach
  - Pros: [List advantages]
  - Cons: [List disadvantages]
- Branch 2: Alternative approach
  - Pros: [List advantages]
  - Cons: [List disadvantages]
- Decision: Choose best path based on:
  - Simplicity
  - Maintainability
  - Performance
  - Extensibility
REFINEMENT TRIGGERS
Automatically trigger refinement if any of these conditions are met:
- Complexity Threshold
  - Cyclomatic complexity > 10
  - Nested depth > 3 levels
  - Function length > 50 lines
- Code Smells (see the sketch after this list)
  - Duplicate code blocks
  - Long parameter lists (>4)
  - God classes/functions
  - Magic numbers/strings
  - Generic utility folders (`utils/`, `helpers/`, `common/`)
  - NIH syndrome indicators (custom implementations of standard solutions)
- Missing Elements
  - No error handling
  - No input validation
  - No documentation for complex logic
  - No tests for critical functionality
  - No library search for common problems
  - No consideration of existing services
- Architecture Violations
  - Business logic in controllers/views
  - Domain logic depending on infrastructure
  - Unclear boundaries between contexts
  - Generic naming instead of domain terms
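As a small illustration of the magic-number smell listed under Code Smells above; the constants and the rule they encode are hypothetical.

```js
// Before: magic numbers leave the intent implicit
// if (user.age >= 18 && retries < 3) { ... }

// After: named constants document the rule and its origin
const MINIMUM_SIGNUP_AGE = 18; // hypothetical business rule
const MAX_LOGIN_RETRIES = 3;   // hypothetical operational limit

function canAttemptLogin(user, retries) {
  return user.age >= MINIMUM_SIGNUP_AGE && retries < MAX_LOGIN_RETRIES;
}
```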
FINAL VERIFICATION
Before finalizing any output:
Self-Refine Checklist
- Have I considered at least one alternative approach?
- Have I verified my assumptions?
- Is this the simplest correct solution?
- Would another developer easily understand this?
- Have I anticipated likely future requirements?
- Have all factual claims been verified or sourced?
- Are performance/security assertions backed by evidence?
- Did I search for existing libraries before writing custom code?
- Is the architecture aligned with Clean Architecture/DDD principles?
- Are names domain-specific rather than generic (utils/helpers)?
Reflexion Questions
- What worked well in this solution?
- What could be improved?
- What would I do differently next time?
- Are there patterns here that could be reused?
IMPROVEMENT DIRECTIVE
If after reflection you identify improvements:
- STOP current implementation
- SEARCH for existing solutions before continuing
  - Check package registries (npm, PyPI, etc.)
  - Research existing services/APIs
  - Review architectural patterns and libraries
- DOCUMENT the improvements needed
  - Why custom vs library?
  - What architectural pattern fits?
  - How does it align with Clean Architecture/DDD?
- IMPLEMENT the refined solution
- RE-EVALUATE using this framework again
CONFIDENCE ASSESSMENT
Rate your confidence in the current solution:
- High (>90%) - Solution is robust and well-tested
- Medium (70-90%) - Solution works but could be improved
- Low (<70%) - Significant improvements needed
If confidence does not meet the threshold set in the TASK COMPLEXITY TRIAGE, iterate again.
REFINEMENT METRICS
Track the effectiveness of refinements:
Iteration Count
- First attempt: [Initial solution]
- Iteration 1: [What was improved]
- Iteration 2: [Further improvements]
- Final: [Convergence achieved]
Quality Indicators
- Complexity Reduction: Did refactoring simplify the code?
- Bug Prevention: Were potential issues identified and fixed?
- Performance Gain: Was efficiency improved?
- Readability Score: Is the final version clearer?
Learning Points
Document patterns for future use:
- What type of issue was this?
- What solution pattern worked?
- Can this be reused elsewhere?
REMEMBER: The goal is not perfection on the first try, but continuous improvement through structured reflection. Each iteration should bring the solution closer to optimal.