--- name: reflect-appworld-failure description: Analyze AppWorld task failures to extract specific API patterns and generate actionable playbook bullets with concrete code examples allowed-tools: Read --- # Reflect on AppWorld Failure Analyze failed AppWorld tasks to extract specific, actionable learnings that can be added to the playbook. ## Purpose When an AppWorld task fails, the Reflector calls this Skill with error details and failed code. You analyze the failure semantically and generate a high-quality bullet with: 1. Specific title describing the pattern 2. Detailed content with working code examples 3. Relevant tags for retrieval 4. Appropriate confidence level ## Input Format The input will be a text description with sections: ``` # Task ## Apps ## Error Type ## Error Messages ## Failed Code Snippet ## Missing Patterns (from heuristics) ## Suggested Fixes (from heuristics) ``` ## Your Analysis Process 1. **Identify Root Cause**: What was the fundamental mistake? - Wrong API method name? - Missing authentication? - Incorrect data structure access? - Logic error? 2. **Extract Pattern**: What general pattern does this represent? - Is this specific to one app or applies to multiple? - Is this about API order (login first)? - Is this about method naming conventions? - Is this about data validation? 3. **Generate Concrete Example**: Create working code that demonstrates the CORRECT pattern 4. **Write Actionable Bullet**: Make it specific enough that the Generator can apply it ## Output Format Return a JSON object with this structure: ```json { "bullet": { "id": "bullet-YYYY-MM-DD-HHMMSS", "title": "", "content": "", "tags": ["app.", "", ""], "evidence": [ { "type": "execution", "ref": "", "note": "" } ], "confidence": "high|medium|low", "scope": "app|global" } } ``` ## Bullet Quality Guidelines ### GOOD Bullets (Specific and Actionable) **Title**: "Spotify: Use show_playlist_songs() not get_tracks()" **Content**: "Spotify API uses show_playlist_songs(access_token, playlist_id) to retrieve tracks. The method get_tracks() does not exist. Example: `songs = apis.spotify.show_playlist_songs(access_token=token, playlist_id=playlist['id'])`" **Tags**: ["app.spotify", "api_misuse", "method_names", "playlists"] **Title**: "Venmo: Call login() before search_transactions()" **Content**: "Venmo API requires authentication token for all operations. Always call venmo.login() first to get access_token, then pass it to other methods. Example: `response = apis.venmo.login(username='user', password='pass'); token = response['access_token']; results = apis.venmo.search_transactions(access_token=token, query={'friend': 'Alice'})`" **Tags**: ["app.venmo", "authentication", "api_order", "search"] ### BAD Bullets (Too Generic) **Title**: "Verify venmo API logic and requirements" **Content**: "When implementing venmo operations: Check task logic and requirements; Missing login() call for venmo" **Tags**: ["logic", "debugging", "api", "app.venmo"] **Why Bad**: No concrete code example, vague guidance, doesn't teach the specific pattern ## Example Analysis ### Input: ``` # Task What is the title of the most-liked song in my Spotify playlists ## Apps spotify ## Error Type api_misuse ## Error Messages AttributeError: 'Spotify' object has no attribute 'get_tracks' ## Failed Code Snippet songs = spotify.get_tracks(playlist_id=pid) ## Missing Patterns - Use correct Spotify API methods ## Suggested Fixes - Check Spotify API documentation for available methods ``` ### Your Analysis: 1. **Root Cause**: Code used non-existent method `get_tracks()` instead of correct `show_playlist_songs()` 2. **Pattern**: Spotify uses `show_*` naming convention for retrieval methods 3. **Scope**: App-specific (Spotify) ### Output: ```json { "bullet": { "id": "bullet-2025-10-27-123456", "title": "Spotify: Use show_playlist_songs() to get tracks from playlist", "content": "To retrieve songs from a Spotify playlist, use show_playlist_songs(access_token, playlist_id). Don't use get_tracks() - it doesn't exist. Example: `token = apis.spotify.login()['access_token']; playlists = apis.spotify.show_playlist_library(access_token=token); songs = apis.spotify.show_playlist_songs(access_token=token, playlist_id=playlists[0]['id']); most_liked = max(songs, key=lambda s: s['likes'])`", "tags": ["app.spotify", "api_misuse", "method_names", "playlists", "retrieval"], "evidence": [ { "type": "execution", "ref": "spotify_task_001", "note": "AttributeError: 'Spotify' object has no attribute 'get_tracks'" } ], "confidence": "high", "scope": "app" } } ``` ## Common AppWorld Patterns to Look For ### Authentication Order - Most apps require login() first to get access_token - Token must be passed to subsequent API calls ### Method Naming Conventions - Spotify: `show_*` for retrieval (show_playlist_songs, show_album_library) - Venmo: `show_friends`, `send_payment`, `search_transactions` - Gmail: `fetch_emails`, `send_email` - Contacts: `show_contacts`, `add_contact` - Calendar: `show_events`, `create_event` ### Data Structure Access - API responses may have nested structures - Always check if keys exist before accessing - Use `.get()` with defaults for safety ### Aggregation Patterns - To find "most-liked song in playlists": Get all playlists → Get songs from each → Find max by likes - To find "most expensive transaction": Get all transactions → Find max by amount ### Task Completion - ALWAYS call `apis.supervisor.complete_task()` at the end - This signals successful completion to test framework ## Important Rules 1. **Be Specific**: Include actual method names, parameter names, and code examples 2. **Be Actionable**: The Generator should know exactly what to do after reading your bullet 3. **Include Working Code**: Show a complete example that demonstrates the correct pattern 4. **Tag Appropriately**: Use `app.` for app-specific bullets, plus semantic tags 5. **Set Confidence**: "high" for clear patterns, "medium" for uncertain, "low" for speculative 6. **Return ONLY JSON**: No explanations, no markdown formatting outside the JSON ## Response Format Return the JSON object as plain text. Make sure it's valid JSON that can be parsed directly.