description: Reflect on the previous response and output, based on a self-refinement framework for iterative improvement with complexity triage and verification
argument-hint: None required - automatically reviews recent work output

Self-Refinement and Iterative Improvement Framework

Reflect on the previous response and output.

TASK COMPLEXITY TRIAGE

First, categorize the task to apply appropriate reflection depth:

Quick Path (5-second check)

For simple tasks like:

  • Single file edits
  • Documentation updates
  • Simple queries or explanations
  • Straightforward bug fixes

Skip to the "Final Verification" section

Standard Path (Full reflection)

For tasks involving:

  • Multiple file changes
  • New feature implementation
  • Architecture decisions
  • Complex problem solving

Follow complete framework + require confidence >70%

Deep Reflection Path

For critical tasks:

  • Core system changes
  • Security-related code
  • Performance-critical sections
  • API design decisions

Follow framework + require confidence >90%

IMMEDIATE REFLECTION PROTOCOL

Step 1: Initial Assessment

Before proceeding, evaluate your most recent output against these criteria:

  1. Completeness Check

    • Does the solution fully address the user's request?
    • Are all requirements explicitly mentioned by the user covered?
    • Are there any implicit requirements that should be addressed?
  2. Quality Assessment

    • Is the solution at the appropriate level of complexity?
    • Could the approach be simplified without losing functionality?
    • Are there obvious improvements that could be made?
  3. Correctness Verification

    • Have you verified the logical correctness of your solution?
    • Are there edge cases that haven't been considered?
    • Could there be unintended side effects?
  4. Fact-Checking Required

    • Have you made any claims about performance? (needs verification)
    • Have you stated any technical facts? (needs source/verification)
    • Have you referenced best practices? (needs validation)
    • Have you made security assertions? (needs careful review)

Step 2: Decision Point

Based on the assessment above, determine:

REFINEMENT NEEDED? [YES/NO]

If YES, proceed to Step 3. If NO, skip to Final Verification.

Step 3: Refinement Planning

If improvement is needed, generate a specific plan:

  1. Identify Issues (List specific problems found)

    • Issue 1: [Describe]
    • Issue 2: [Describe]
    • ...
  2. Propose Solutions (For each issue)

    • Solution 1: [Specific improvement]
    • Solution 2: [Specific improvement]
    • ...
  3. Priority Order

    • Critical fixes first
    • Performance improvements second
    • Style/readability improvements last

Concrete Example

Issue Identified: Function has 6 levels of nesting
Solution: Extract nested logic into separate functions
Implementation:

Before: if (a) { if (b) { if (c) { ... } } }
After: if (!shouldProcess(a, b, c)) return;
       processData();
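
A slightly fuller sketch of the same refactor; shouldProcess and processData are illustrative names, not taken from any particular codebase:

// Guard clause replaces the nested conditionals
function handleRequest(a, b, c) {
  if (!shouldProcess(a, b, c)) return;
  processData();
}

// The extracted predicate keeps the intent readable in one place
function shouldProcess(a, b, c) {
  return Boolean(a) && Boolean(b) && Boolean(c);
}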

CODE-SPECIFIC REFLECTION CRITERIA

When the output involves code, additionally evaluate:

STOP: Library & Existing Solution Check

BEFORE PROCEEDING WITH CUSTOM CODE:

  1. Search for Existing Libraries

    • Have you searched npm/PyPI/Maven for existing solutions?
    • Is this a common problem that others have already solved?
    • Are you reinventing the wheel for utility functions?

    Common areas to check:

    • Date/time manipulation → moment.js, date-fns, dayjs
    • Form validation → joi, yup, zod
    • HTTP requests → axios, fetch, got
    • State management → Redux, MobX, Zustand
    • Utility functions → lodash, ramda, underscore
  2. Existing Service/Solution Evaluation

    • Could this be handled by an existing service/SaaS?
    • Is there an open-source solution that fits?
    • Would a third-party API be more maintainable?

    Examples:

    • Authentication → Auth0, Supabase, Firebase Auth
    • Email sending → SendGrid, Mailgun, AWS SES
    • File storage → S3, Cloudinary, Firebase Storage
    • Search → Elasticsearch, Algolia, MeiliSearch
    • Queue/Jobs → Bull, RabbitMQ, AWS SQS
  3. Decision Framework

    IF common utility function → Use established library
    ELSE IF complex domain-specific → Check for specialized libraries
    ELSE IF infrastructure concern → Look for managed services
    ELSE → Consider custom implementation
    
  4. When Custom Code IS Justified

    • Specific business logic unique to your domain
    • Performance-critical paths with special requirements
    • When external dependencies would be overkill (e.g., lodash for one function)
    • Security-sensitive code requiring full control
    • When existing solutions don't meet requirements after evaluation
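
As a small illustration of the "overkill" case above, pulling in a utility library for a single call is rarely worth the dependency when a native equivalent exists (the values array is illustrative):

// With a dependency: import uniq from 'lodash/uniq'; const unique = uniq(values);
// Native equivalent, no dependency needed:
const unique = [...new Set(values)];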

Real Examples of Library-First Approach

BAD: Custom Implementation

// utils/dateFormatter.js
function formatDate(date) {
  const d = new Date(date);
  return `${d.getMonth()+1}/${d.getDate()}/${d.getFullYear()}`;
}

GOOD: Use Existing Library

import { format } from 'date-fns';
const formatted = format(new Date(), 'MM/dd/yyyy');

BAD: Generic Utilities Folder

/src/utils/
  - helpers.js
  - common.js
  - shared.js

GOOD: Domain-Driven Structure

/src/order/
  - domain/OrderCalculator.js
  - infrastructure/OrderRepository.js
/src/user/
  - domain/UserValidator.js
  - application/UserRegistrationService.js
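
A minimal sketch of what the domain side of that layout might contain; the fields and pricing rule are purely illustrative:

// src/order/domain/OrderCalculator.js
// Pure domain logic: no framework, database, or HTTP imports
export class OrderCalculator {
  // items: [{ unitPrice, quantity }], discountRate between 0 and 1
  calculateTotal(items, discountRate = 0) {
    const subtotal = items.reduce(
      (sum, item) => sum + item.unitPrice * item.quantity,
      0
    );
    return subtotal * (1 - discountRate);
  }
}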

Common Anti-Patterns to Avoid

  1. NIH (Not Invented Here) Syndrome

    • Building custom auth when Auth0/Supabase exists
    • Writing custom state management instead of using Redux/Zustand
    • Creating custom form validation instead of using Formik/React Hook Form
  2. Poor Architectural Choices

    • Mixing business logic with UI components
    • Database queries in controllers
    • No clear separation of concerns
  3. Generic Naming Anti-Patterns

    • utils.js with 50 unrelated functions
    • helpers/misc.js as a dumping ground
    • common/shared.js with unclear purpose

Remember: Every line of custom code is a liability that needs to be maintained, tested, and documented. Use existing solutions whenever possible.

Architecture and Design

  1. Clean Architecture & DDD Alignment

    • Does naming follow ubiquitous language of the domain?
    • Are domain entities separated from infrastructure?
    • Is business logic independent of frameworks?
    • Are use cases clearly defined and isolated?

    Naming Convention Check:

    • Avoid generic names: utils, helpers, common, shared
    • Use domain-specific names: OrderCalculator, UserAuthenticator
    • Follow bounded context naming: Billing.InvoiceGenerator
  2. Design Patterns

    • Is the current design pattern appropriate?
    • Could a different pattern simplify the solution?
    • Are SOLID principles being followed?
  3. Modularity

    • Can the code be broken into smaller, reusable functions?
    • Are responsibilities properly separated?
    • Is there unnecessary coupling between components?
    • Does each module have a single, clear purpose?
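
To make that separation concrete, here is a minimal sketch of a use case that depends on an abstraction rather than on infrastructure directly; the class and method names are illustrative:

// src/order/application/PlaceOrderService.js (use case, framework-agnostic)
export class PlaceOrderService {
  // Any object exposing save(order) will do; the concrete repository is injected
  constructor(orderRepository) {
    this.orderRepository = orderRepository;
  }

  async execute(order) {
    // Business rules live here, not in a controller or in the repository
    if (order.items.length === 0) {
      throw new Error('Cannot place an empty order');
    }
    return this.orderRepository.save(order);
  }
}

// src/order/infrastructure/OrderRepository.js would implement save() against the database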

Code Quality

  1. Simplification Opportunities

    • Can any complex logic be simplified?
    • Are there redundant operations?
    • Can loops be replaced with more elegant solutions?
  2. Performance Considerations

    • Are there obvious performance bottlenecks?
    • Could algorithmic complexity be improved?
    • Are resources being used efficiently?
    • IMPORTANT: Any performance claims in comments must be verified
  3. Error Handling

    • Are all potential errors properly handled?
    • Is error handling consistent throughout?
    • Are error messages informative?
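
A brief sketch of the difference between swallowing an error and handling it informatively; loadConfig and the path variable are illustrative, and the { cause } option requires a reasonably recent JavaScript runtime:

// Uninformative: the original failure and its context are lost
try {
  loadConfig(path);
} catch (e) {
  console.log('error');
}

// Informative: adds context and preserves the underlying cause
try {
  loadConfig(path);
} catch (cause) {
  throw new Error(`Failed to load configuration from ${path}`, { cause });
}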

Testing and Validation

  1. Test Coverage

    • Are all critical paths tested?
    • Missing edge cases to test:
      • Boundary conditions
      • Null/empty inputs
      • Large/extreme values
      • Concurrent access scenarios
    • Are tests meaningful and not just for coverage?
  2. Test Quality

    • Are tests independent and isolated?
    • Do tests follow AAA pattern (Arrange, Act, Assert)?
    • Are test names descriptive?
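
A short sketch of the AAA pattern with a descriptive test name, written in Jest-style syntax and reusing the illustrative OrderCalculator from earlier:

test('calculateTotal applies the discount to the item subtotal', () => {
  // Arrange
  const calculator = new OrderCalculator();
  const items = [{ unitPrice: 10, quantity: 2 }];

  // Act
  const total = calculator.calculateTotal(items, 0.5);

  // Assert
  expect(total).toBe(10);
});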

FACT-CHECKING AND CLAIM VERIFICATION

Claims Requiring Immediate Verification

  1. Performance Claims

    • "This is X% faster" → Requires benchmarking
    • "This has O(n) complexity" → Requires analysis proof
    • "This reduces memory usage" → Requires profiling

    Verification Method: Run actual benchmarks where possible, or provide algorithmic analysis

  2. Technical Facts

    • "This API supports..." → Check official documentation
    • "The framework requires..." → Verify with current docs
    • "This library version..." → Confirm version compatibility

    Verification Method: Cross-reference with official documentation

  3. Security Assertions

    • "This is secure against..." → Requires security analysis
    • "This prevents injection..." → Needs proof/testing
    • "This follows OWASP..." → Verify against standards

    Verification Method: Reference security standards and test

  4. Best Practice Claims

    • "It's best practice to..." → Cite authoritative source
    • "Industry standard is..." → Provide reference
    • "Most developers prefer..." → Need data/surveys

    Verification Method: Cite specific sources or standards

Fact-Checking Checklist

  • All performance claims have benchmarks or Big-O analysis
  • Technical specifications match current documentation
  • Security claims are backed by standards or testing
  • Best practices are cited from authoritative sources
  • Version numbers and compatibility are verified
  • Statistical claims have sources or data

Red Flags Requiring Double-Check

  • Absolute statements ("always", "never", "only")
  • Superlatives ("best", "fastest", "most secure")
  • Specific numbers without context (percentages, metrics)
  • Claims about third-party tools/libraries
  • Historical or temporal claims ("recently", "nowadays")

Concrete Example of Fact-Checking

Claim Made: "Using Map is 50% faster than using Object for this use case"

Verification Process:

  1. Search for benchmarks or documentation comparing both approaches
  2. Provide algorithmic analysis

Corrected Statement: "Map performs better for large collections (10K+ items), while Object is more efficient for small sets (<100 items)"
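
If a benchmark is warranted, even a rough micro-benchmark beats an unverified number; results vary by runtime, data shape, and size, so treat the following only as a sketch:

const N = 100_000;
const keys = Array.from({ length: N }, (_, i) => `key${i}`);

// Object lookups
const obj = Object.fromEntries(keys.map((k) => [k, true]));
let start = performance.now();
for (const k of keys) obj[k];
console.log(`Object: ${(performance.now() - start).toFixed(2)} ms`);

// Map lookups
const map = new Map(keys.map((k) => [k, true]));
start = performance.now();
for (const k of keys) map.get(k);
console.log(`Map: ${(performance.now() - start).toFixed(2)} ms`);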

NON-CODE OUTPUT REFLECTION

For documentation, explanations, and analysis outputs:

Content Quality

  1. Clarity and Structure

    • Is the information well-organized?
    • Are complex concepts explained simply?
    • Is there a logical flow of ideas?
  2. Completeness

    • Are all aspects of the question addressed?
    • Are examples provided where helpful?
    • Are limitations or caveats mentioned?
  3. Accuracy

    • Are technical details correct?
    • Are claims verifiable?
    • Are sources or reasoning provided?

Improvement Triggers for Non-Code

  • Ambiguous explanations
  • Missing context or background
  • Overly complex language for the audience
  • Lack of concrete examples
  • Unsubstantiated claims

ITERATIVE REFINEMENT WORKFLOW

Chain of Verification (CoV)

  1. Generate: Create initial solution
  2. Verify: Check each component/claim
  3. Question: What could go wrong?
  4. Re-answer: Address identified issues

Tree of Thoughts (ToT)

For complex problems, consider multiple approaches:

  1. Branch 1: Current approach

    • Pros: [List advantages]
    • Cons: [List disadvantages]
  2. Branch 2: Alternative approach

    • Pros: [List advantages]
    • Cons: [List disadvantages]
  3. Decision: Choose best path based on:

    • Simplicity
    • Maintainability
    • Performance
    • Extensibility

REFINEMENT TRIGGERS

Automatically trigger refinement if any of these conditions are met:

  1. Complexity Threshold

    • Cyclomatic complexity > 10
    • Nested depth > 3 levels
    • Function length > 50 lines
  2. Code Smells

    • Duplicate code blocks
    • Long parameter lists (>4)
    • God classes/functions
    • Magic numbers/strings
    • Generic utility folders (utils/, helpers/, common/)
    • NIH syndrome indicators (custom implementations of standard solutions)
  3. Missing Elements

    • No error handling
    • No input validation
    • No documentation for complex logic
    • No tests for critical functionality
    • No library search for common problems
    • No consideration of existing services
  4. Architecture Violations

    • Business logic in controllers/views
    • Domain logic depending on infrastructure
    • Unclear boundaries between contexts
    • Generic naming instead of domain terms
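
Two of these triggers often appear together; here is a before/after sketch of a long parameter list and a magic number being refactored (function and constant names are illustrative):

// Before: six positional parameters and a magic number
function createUser(name, email, age, country, newsletter, referrer) {
  if (age < 13) throw new Error('User is too young to register');
}

// After: options object and a named constant
const MINIMUM_USER_AGE = 13;

function createUser({ name, email, age, country, newsletter, referrer }) {
  if (age < MINIMUM_USER_AGE) throw new Error('User is too young to register');
}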

FINAL VERIFICATION

Before finalizing any output:

Self-Refine Checklist

  • Have I considered at least one alternative approach?
  • Have I verified my assumptions?
  • Is this the simplest correct solution?
  • Would another developer easily understand this?
  • Have I anticipated likely future requirements?
  • Have all factual claims been verified or sourced?
  • Are performance/security assertions backed by evidence?
  • Did I search for existing libraries before writing custom code?
  • Is the architecture aligned with Clean Architecture/DDD principles?
  • Are names domain-specific rather than generic (utils/helpers)?

Reflexion Questions

  1. What worked well in this solution?
  2. What could be improved?
  3. What would I do differently next time?
  4. Are there patterns here that could be reused?

IMPROVEMENT DIRECTIVE

If after reflection you identify improvements:

  1. STOP current implementation
  2. SEARCH for existing solutions before continuing
    • Check package registries (npm, PyPI, etc.)
    • Research existing services/APIs
    • Review architectural patterns and libraries
  3. DOCUMENT the improvements needed
    • Why custom vs library?
    • What architectural pattern fits?
    • How does it align with Clean Architecture/DDD?
  4. IMPLEMENT the refined solution
  5. RE-EVALUATE using this framework again

CONFIDENCE ASSESSMENT

Rate your confidence in the current solution:

  • High (>90%) - Solution is robust and well-tested
  • Medium (70-90%) - Solution works but could be improved
  • Low (<70%) - Significant improvements needed

If confidence does not meet the threshold required by the TASK COMPLEXITY TRIAGE, iterate again.

REFINEMENT METRICS

Track the effectiveness of refinements:

Iteration Count

  • First attempt: [Initial solution]
  • Iteration 1: [What was improved]
  • Iteration 2: [Further improvements]
  • Final: [Convergence achieved]

Quality Indicators

  • Complexity Reduction: Did refactoring simplify the code?
  • Bug Prevention: Were potential issues identified and fixed?
  • Performance Gain: Was efficiency improved?
  • Readability Score: Is the final version clearer?

Learning Points

Document patterns for future use:

  • What type of issue was this?
  • What solution pattern worked?
  • Can this be reused elsewhere?

REMEMBER: The goal is not perfection on the first try, but continuous improvement through structured reflection. Each iteration should bring the solution closer to optimal.