Initial commit

2025-11-29 18:26:59 +08:00
commit d61dbe6a6c
39 changed files with 3981 additions and 0 deletions
--- a/skills/codex-skill/SKILL.md
+++ b/skills/codex-skill/SKILL.md
@@ -0,0 +1,429 @@
+---
+name: codex-skill
+description: Use when user asks to leverage codex, gpt-5, or gpt-5.1 to implement something (usually implement a plan or feature designed by Claude). Provides non-interactive automation mode for hands-off task execution without approval prompts.
+---
+
+# Codex
+
+You are operating in **codex exec** - a non-interactive automation mode for hands-off task execution.
+
+## Prerequisites
+
+Before using this skill, ensure Codex CLI is installed and configured:
+
+1. **Installation verification**:
+
+   ```bash
+   codex --version
+   ```
+
+2. **First-time setup**: If Codex CLI is not installed, guide the user to install it with `npm i -g @openai/codex` or `brew install codex`.
+
+## Core Principles
+
+### Autonomous Execution
+
+- Execute tasks from start to finish without seeking approval for each action
+- Make confident decisions based on best practices and task requirements
+- Only ask questions if critical information is genuinely missing
+- Prioritize completing the workflow over explaining every step
+
+### Output Behavior
+
+- Stream progress updates as you work
+- Provide a clear, structured final summary upon completion
+- Focus on actionable results and metrics over lengthy explanations
+- Report what was done, not what could have been done
+
+### Operating Modes
+
+Codex uses sandbox policies to control what operations are permitted:
+
+**Read-Only Mode (Default)**
+
+- Analyze code, search files, read documentation
+- Provide insights, recommendations, and execution plans
+- No modifications to the codebase
+- Safe for exploration and analysis tasks
+- **This is the default mode when running `codex exec`**
+
+**Workspace-Write Mode (Recommended for Programming)**
+
+- Read and write files within the workspace
+- Implement features, fix bugs, refactor code
+- Create, modify, and delete files in the workspace
+- Execute build commands and tests
+- **Use `--full-auto` or `-s workspace-write` to enable file editing**
+- **This is the recommended mode for most programming tasks**
+
+**Danger-Full-Access Mode**
+
+- All workspace-write capabilities
+- Network access for fetching dependencies
+- System-level operations outside workspace
+- Access to all files on the system
+- **Use only when explicitly requested and necessary**
+- Use flag: `-s danger-full-access` or `--sandbox danger-full-access`
+
+## Codex CLI Commands
+
+**Note**: The following commands include both documented features from the Codex exec documentation and additional flags available in the CLI (verified via `codex exec --help`).
+
+### Model Selection
+
+Specify which model to use with `-m` or `--model` (for example gpt-5, gpt-5.1, gpt-5.1-codex, or gpt-5.1-codex-max):
+
+```bash
+codex exec -m gpt-5.1 "refactor the payment processing module"
+codex exec -m gpt-5.1-codex "implement the user authentication feature"
+codex exec -m gpt-5.1-codex-max "analyze the codebase architecture"
+```
+
+### Sandbox Modes
+
+Control execution permissions with `-s` or `--sandbox` (possible values: read-only, workspace-write, danger-full-access):
+
+#### Read-Only Mode
+
+```bash
+codex exec -s read-only "analyze the codebase structure and count lines of code"
+codex exec --sandbox read-only "review code quality and suggest improvements"
+```
+
+Analyze code without making any modifications.
+
+#### Workspace-Write Mode (Recommended for Programming)
+
+```bash
+codex exec -s workspace-write "implement the user authentication feature"
+codex exec --sandbox workspace-write "fix the bug in login flow"
+```
+
+Read and write files within the workspace. **Must be explicitly enabled (not the default). Use this for most programming tasks.**
+
+#### Danger-Full-Access Mode
+
+```bash
+codex exec -s danger-full-access "install dependencies and update the API integration"
+codex exec --sandbox danger-full-access "setup development environment with npm packages"
+```
+
+Network access and system-level operations. Use only when necessary.
+
+### Full-Auto Mode (Convenience Alias)
+
+```bash
+codex exec --full-auto "implement the user authentication feature"
+```
+
+**Convenience alias for**: `-s workspace-write` (enables file editing).
+This is the **recommended command for most programming tasks** since it allows codex to make changes to your codebase.
+
+### Configuration Profiles
+
+Use saved profiles from `~/.codex/config.toml` with `-p` or `--profile` (if supported in your version):
+
+```bash
+codex exec -p production "deploy the latest changes"
+codex exec --profile development "run integration tests"
+```
+
+Profiles can specify default model, sandbox mode, and other options.
+*Verify availability with `codex exec --help`*
+
+### Working Directory
+
+Specify a different working directory with `-C` or `--cd` (if supported in your version):
+
+```bash
+codex exec -C /path/to/project "implement the feature"
+codex exec --cd ~/projects/myapp "run tests and fix failures"
+```
+
+*Verify availability with `codex exec --help`*
+
+### Additional Writable Directories
+
+Allow writing to additional directories outside the main workspace with `--add-dir` (if supported in your version):
+
+```bash
+codex exec --add-dir /tmp/output --add-dir ~/shared "generate reports in multiple locations"
+```
+
+Useful when the task needs to write to specific external directories.
+*Verify availability with `codex exec --help`*
+
+### JSON Output
+
+```bash
+codex exec --json "run tests and report results"
+codex exec --json -s read-only "analyze security vulnerabilities"
+```
+
+Emits output as JSON Lines (one JSON object per line) covering reasoning, commands, file changes, and metrics.
+
+### Save Output to File
+
+```bash
+codex exec -o report.txt "generate a security audit report"
+codex exec -o results.json --json "run performance benchmarks"
+```
+
+Writes the final message to a file instead of stdout.
+
+### Skip Git Repository Check
+
+```bash
+codex exec --skip-git-repo-check "analyze this non-git directory"
+```
+
+Bypasses the requirement for the directory to be a git repository.
+
+### Resume Previous Session
+
+```bash
+codex exec resume --last "now implement the next feature"
+```
+
+Resumes the last session and continues with a new task.
+
+### Bypass Approvals and Sandbox (If Available)
+
+**⚠️ WARNING: Verify this flag exists before using ⚠️**
+
+Some versions of Codex may support `--dangerously-bypass-approvals-and-sandbox`:
+
+```bash
+codex exec --dangerously-bypass-approvals-and-sandbox "perform the task"
+```
+
+**If this flag is available**:
+- Skips ALL confirmation prompts
+- Executes commands WITHOUT sandboxing
+- Should ONLY be used in externally sandboxed environments (containers, VMs)
+- **EXTREMELY DANGEROUS - NEVER use on your development machine**
+
+**Verify availability first**: Run `codex exec --help` to check if this flag is supported in your version.
+
+### Combined Examples
+
+Combine multiple flags for complex scenarios:
+
+```bash
+# Use specific model with workspace write and JSON output
+codex exec -m gpt-5.1-codex -s workspace-write --json "implement authentication and output results"
+
+# Use profile with custom working directory
+codex exec -p production -C /var/www/app "deploy updates"
+
+# Full-auto with additional directories and output file
+codex exec --full-auto --add-dir /tmp/logs -o summary.txt "refactor and log changes"
+
+# Skip git check with specific model in different directory
+codex exec -m gpt-5.1-codex -C ~/non-git-project --skip-git-repo-check "analyze and improve code"
+```
+
+## Execution Workflow
+
+1. **Parse the Request**: Understand the complete objective and scope
+2. **Plan Efficiently**: Create a minimal, focused execution plan
+3. **Execute Autonomously**: Implement the solution with confidence
+4. **Verify Results**: Run tests, checks, or validations as appropriate
+5. **Report Clearly**: Provide a structured summary of accomplishments
+
+## Best Practices
+
+### Speed and Efficiency
+
+- Make reasonable assumptions when minor details are ambiguous
+- Use parallel operations whenever possible (read multiple files, run multiple commands)
+- Avoid verbose explanations during execution - focus on doing
+- Don't seek confirmation for standard operations
+
+### Scope Management
+
+- Focus strictly on the requested task
+- Don't add unrequested features or improvements
+- Avoid refactoring code that isn't part of the task
+- Keep solutions minimal and direct
+
+### Quality Standards
+
+- Follow existing code patterns and conventions
+- Run relevant tests after making changes
+- Verify the solution actually works
+- Report any errors or limitations encountered
+
+## When to Interrupt Execution
+
+Only pause for user input when encountering:
+
+- **Destructive operations**: Deleting databases, force pushing to main, dropping tables
+- **Security decisions**: Exposing credentials, changing authentication, opening ports
+- **Ambiguous requirements**: Multiple valid approaches with significant trade-offs
+- **Missing critical information**: Cannot proceed without user-specific data
+
+For all other decisions, proceed autonomously using best judgment.
+
+## Final Output Format
+
+Always conclude with a structured summary:
+
+```
+✓ Task completed successfully
+
+Changes made:
+- [List of files modified/created]
+- [Key code changes]
+
+Results:
+- [Metrics: lines changed, files affected, tests run]
+- [What now works that didn't before]
+
+Verification:
+- [Tests run, checks performed]
+
+Next steps (if applicable):
+- [Suggestions for follow-up tasks]
+```
+
+## Example Usage Scenarios
+
+### Code Analysis (Read-Only)
+
+**User**: "Count the lines of code in this project by language"
+**Mode**: Read-only
+**Command**:
+
+```bash
+codex exec -s read-only "count the total number of lines of code in this project, broken down by language"
+```
+
+**Action**: Search all files, categorize by extension, count lines, report totals
+
+### Bug Fixing (Workspace-Write)
+
+**User**: "Use gpt-5 to fix the authentication bug in the login flow"
+**Mode**: Workspace-write
+**Command**:
+
+```bash
+codex exec -m gpt-5 --full-auto "fix the authentication bug in the login flow"
+```
+
+**Action**: Find the bug, implement fix, run tests, commit changes
+
+### Feature Implementation (Workspace-Write)
+
+**User**: "Let codex implement dark mode support for the UI"
+**Mode**: Workspace-write
+**Command**:
+
+```bash
+codex exec --full-auto "add dark mode support to the UI with theme context and style updates"
+```
+
+**Action**: Identify components, add theme context, update styles, test in both modes
+
+### Batch Operations (Workspace-Write)
+
+**User**: "Have gpt-5.1 update all imports from old-lib to new-lib"
+**Mode**: Workspace-write
+**Command**:
+
+```bash
+codex exec -m gpt-5.1 -s workspace-write "update all imports from old-lib to new-lib across the entire codebase"
+```
+
+**Action**: Find all imports, perform replacements, verify syntax, run tests
+
+### Generate Report with JSON Output (Read-Only)
+
+**User**: "Analyze security vulnerabilities and output as JSON"
+**Mode**: Read-only
+**Command**:
+
+```bash
+codex exec -s read-only --json "analyze the codebase for security vulnerabilities and provide a detailed report"
+```
+
+**Action**: Scan code, identify issues, output structured JSON with findings
+
+### Install Dependencies and Integrate API (Danger-Full-Access)
+
+**User**: "Install the new payment SDK and integrate it"
+**Mode**: Danger-Full-Access
+**Command**:
+
+```bash
+codex exec -s danger-full-access "install the payment SDK dependencies and integrate the API"
+```
+
+**Action**: Install packages, update code, add integration points, test functionality
+
+### Multi-Project Work (Custom Directory)
+
+**User**: "Use codex to implement the API in the backend project"
+**Mode**: Workspace-write
+**Command**:
+
+```bash
+codex exec -C ~/projects/backend --full-auto "implement the REST API endpoints for user management"
+```
+
+**Action**: Switch to backend directory, implement API endpoints, write tests
+
+### Refactoring with Logging (Additional Directories)
+
+**User**: "Refactor the database layer and log changes"
+**Mode**: Workspace-write
+**Command**:
+
+```bash
+codex exec --full-auto --add-dir /tmp/refactor-logs "refactor the database layer for better performance and log all changes"
+```
+
+**Action**: Refactor code, write logs to external directory, run tests
+
+### Production Deployment (Using Profile)
+
+**User**: "Deploy using the production profile"
+**Mode**: Profile-based
+**Command**:
+
+```bash
+codex exec -p production "deploy the latest changes to production environment"
+```
+
+**Action**: Use production config, deploy code, verify deployment
+
+### Non-Git Project Analysis
+
+**User**: "Analyze this legacy codebase that's not in git"
+**Mode**: Read-only
+**Command**:
+
+```bash
+codex exec -s read-only --skip-git-repo-check "analyze the architecture and suggest modernization approach"
+```
+
+**Action**: Analyze code structure, provide modernization recommendations
+
+## Error Handling
+
+When errors occur:
+
+1. Attempt automatic recovery if possible
+2. Log the error clearly in the output
+3. Continue with remaining tasks if error is non-blocking
+4. Report all errors in the final summary
+5. Only stop if the error makes continuation impossible
+
+## Resumable Execution
+
+If execution is interrupted:
+
+- Clearly state what was completed
+- Provide exact commands/steps to resume
+- List any state that needs to be preserved
+- Explain what remains to be done
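The flags documented in `skills/codex-skill/SKILL.md` above (`-m`, `-s`, `-C`, `--json`) can also be assembled programmatically. The following is a minimal sketch, not part of the commit: the helper names `build_codex_command`, `parse_json_lines`, and `run_codex` are hypothetical, and the parser makes no assumption about which fields the JSON Lines events contain. It only decodes each line and skips anything that is not valid JSON.

```python
import json
import shutil
import subprocess

# Sandbox values as listed in the Sandbox Modes section of SKILL.md.
SANDBOX_MODES = ("read-only", "workspace-write", "danger-full-access")


def build_codex_command(prompt, model=None, sandbox="read-only",
                        json_output=False, cwd=None):
    """Assemble an argv list for `codex exec` (hypothetical helper)."""
    if sandbox not in SANDBOX_MODES:
        raise ValueError(f"unknown sandbox mode: {sandbox!r}")
    cmd = ["codex", "exec", "-s", sandbox]
    if model:
        cmd += ["-m", model]
    if cwd:
        # SKILL.md notes that -C may not exist in every version.
        cmd += ["-C", cwd]
    if json_output:
        cmd.append("--json")
    cmd.append(prompt)
    return cmd


def parse_json_lines(text):
    """Decode JSON Lines output, skipping blank and non-JSON lines."""
    events = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            events.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON output
    return events


def run_codex(prompt, **options):
    """Run `codex exec --json` and return the decoded events."""
    if shutil.which("codex") is None:
        raise RuntimeError("codex CLI not found on PATH")
    options = {**options, "json_output": True}
    cmd = build_codex_command(prompt, **options)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return parse_json_lines(result.stdout)
```

Passing the command as a list rather than a shell string avoids quoting problems when the prompt contains spaces or quotes.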
--- a/skills/nanobanana-skill/SKILL.md
+++ b/skills/nanobanana-skill/SKILL.md
@@ -0,0 +1,136 @@
+---
+name: nanobanana-skill
+description: Generate or edit images using Google Gemini API via nanobanana. Use when the user asks to create, generate, edit images with nanobanana, or mentions image generation/editing tasks.
+allowed-tools: Bash
+---
+
+# Nanobanana Image Generation Skill
+
+Generate or edit images using Google Gemini API through the nanobanana tool.
+
+## Requirements
+
+1. **GEMINI_API_KEY**: Must be set in `~/.nanobanana.env` or exported in the shell with `export GEMINI_API_KEY=<your-api-key>`
+2. **Python 3 with the required packages installed**: google-genai, Pillow, python-dotenv. If they are missing, install them with `python3 -m pip install -r ${CLAUDE_PLUGIN_ROOT}/skills/nanobanana-skill/requirements.txt`.
+3. **Executable**: `${CLAUDE_PLUGIN_ROOT}/skills/nanobanana-skill/nanobanana.py`
+
+## Instructions
+
+### For image generation
+
+1. Ask the user for:
+   - What they want to create (the prompt)
+   - Desired aspect ratio/size (optional, defaults to 9:16 portrait)
+   - Output filename (optional, auto-generates UUID if not specified)
+   - Model preference (optional, defaults to gemini-3-pro-image-preview)
+   - Resolution (optional, defaults to 1K)
+
+2. Run the nanobanana script with appropriate parameters:
+
+   ```bash
+   python3 ${CLAUDE_PLUGIN_ROOT}/skills/nanobanana-skill/nanobanana.py --prompt "description of image" --output "filename.png"
+   ```
+
+3. Show the user the saved image path when complete
+
+### For image editing
+
+1. Ask the user for:
+   - Input image file(s) to edit
+   - What changes they want (the prompt)
+   - Output filename (optional)
+
+2. Run with input images:
+
+   ```bash
+   python3 ${CLAUDE_PLUGIN_ROOT}/skills/nanobanana-skill/nanobanana.py --prompt "editing instructions" --input image1.png image2.png --output "edited.png"
+   ```
+
+## Available Options
+
+### Aspect Ratios (--size)
+
+- `1024x1024` (1:1) - Square
+- `832x1248` (2:3) - Portrait
+- `1248x832` (3:2) - Landscape
+- `864x1184` (3:4) - Portrait
+- `1184x864` (4:3) - Landscape
+- `896x1152` (4:5) - Portrait
+- `1152x896` (5:4) - Landscape
+- `768x1344` (9:16) - Portrait (default)
+- `1344x768` (16:9) - Landscape
+- `1536x672` (21:9) - Ultra-wide
+
+### Models (--model)
+
+- `gemini-3-pro-image-preview` (default) - Higher quality
+- `gemini-2.5-flash-image` - Faster generation
+
+### Resolution (--resolution)
+
+- `1K` (default)
+- `2K`
+- `4K`
+
+## Examples
+
+### Generate a simple image
+
+```bash
+python3 ${CLAUDE_PLUGIN_ROOT}/skills/nanobanana-skill/nanobanana.py --prompt "A serene mountain landscape at sunset with a lake"
+```
+
+### Generate with specific size and output
+
+```bash
+python3 ${CLAUDE_PLUGIN_ROOT}/skills/nanobanana-skill/nanobanana.py \
+  --prompt "Modern minimalist logo for a tech startup" \
+  --size 1024x1024 \
+  --output "logo.png"
+```
+
+### Generate landscape image with high resolution
+
+```bash
+python3 ${CLAUDE_PLUGIN_ROOT}/skills/nanobanana-skill/nanobanana.py \
+  --prompt "Futuristic cityscape with flying cars" \
+  --size 1344x768 \
+  --resolution 2K \
+  --output "cityscape.png"
+```
+
+### Edit existing images
+
+```bash
+python3 ${CLAUDE_PLUGIN_ROOT}/skills/nanobanana-skill/nanobanana.py \
+  --prompt "Add a rainbow in the sky" \
+  --input photo.png \
+  --output "photo-with-rainbow.png"
+```
+
+### Use faster model
+
+```bash
+python3 ${CLAUDE_PLUGIN_ROOT}/skills/nanobanana-skill/nanobanana.py \
+  --prompt "Quick sketch of a cat" \
+  --model gemini-2.5-flash-image \
+  --output "cat-sketch.png"
+```
+
+## Error Handling
+
+If the script fails:
+
+- Check that `GEMINI_API_KEY` is exported or set in `~/.nanobanana.env`
+- Verify input image files exist and are readable
+- Ensure the output directory is writable
+- If no image is generated, try making the prompt more specific about wanting an image
+
+## Best Practices
+
+1. Be descriptive in prompts - include style, mood, colors, composition
+2. For logos/graphics, use square aspect ratio (1024x1024)
+3. For social media posts, use 9:16 for stories or 1:1 for posts
+4. For wallpapers, use 16:9 or 21:9
+5. Start with 1K resolution for testing, upgrade to 2K/4K for final output
+6. Use gemini-3-pro-image-preview for best quality, gemini-2.5-flash-image for speed
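The Error Handling checklist in `skills/nanobanana-skill/SKILL.md` above can be run as a preflight step before calling the script. This is a minimal sketch under stated assumptions: `preflight` is a hypothetical helper that is not part of the commit, and it inspects only the process environment, so a key that lives solely in `~/.nanobanana.env` would need to be loaded first.

```python
import os
from pathlib import Path


def preflight(input_paths, output_path, env=None):
    """Return a list of problems; an empty list means none were found."""
    env = os.environ if env is None else env
    problems = []

    # Checklist item: the API key must be available.
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")

    # Checklist item: input images must exist and be readable.
    for p in input_paths:
        path = Path(p)
        if not path.is_file() or not os.access(path, os.R_OK):
            problems.append(f"input image not readable: {p}")

    # Checklist item: the output directory must be writable.
    out_dir = Path(output_path).resolve().parent
    if not out_dir.is_dir() or not os.access(out_dir, os.W_OK):
        problems.append(f"output directory not writable: {out_dir}")

    return problems
```

Returning a list instead of raising on the first failure lets the caller report every problem in one pass.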
--- a/skills/nanobanana-skill/nanobanana.py
+++ b/skills/nanobanana-skill/nanobanana.py
@@ -0,0 +1,147 @@
+#!/usr/bin/env python3
+# Generate or edit images using Google Gemini API
+import os
+import argparse
+import uuid
+from pathlib import Path
+from dotenv import load_dotenv
+from google import genai
+from google.genai import types
+from PIL import Image
+from io import BytesIO
+
+# Load environment variables
+load_dotenv(Path.home() / ".nanobanana.env")
+
+# Google API configuration from environment variables
+api_key = os.getenv("GEMINI_API_KEY") or ""
+
+if not api_key:
+    raise ValueError(
+        "Missing GEMINI_API_KEY. Export it or set it in ~/.nanobanana.env."
+    )
+
+# Initialize Gemini client
+client = genai.Client(api_key=api_key)
+
+# Map each supported --size value (WIDTHxHEIGHT) to its aspect ratio
+ASPECT_RATIO_MAP = {
+    "1024x1024": "1:1",  # 1:1
+    "832x1248": "2:3",  # 2:3
+    "1248x832": "3:2",  # 3:2
+    "864x1184": "3:4",  # 3:4
+    "1184x864": "4:3",  # 4:3
+    "896x1152": "4:5",  # 4:5
+    "1152x896": "5:4",  # 5:4
+    "768x1344": "9:16",  # 9:16
+    "1344x768": "16:9",  # 16:9
+    "1536x672": "21:9",  # 21:9
+}
+
+
+def main():
+    # Parse command-line arguments
+    parser = argparse.ArgumentParser(
+        description="Generate or edit images using Google Gemini API"
+    )
+    parser.add_argument(
+        "--prompt",
+        type=str,
+        required=True,
+        help="Prompt for image generation or editing",
+    )
+    parser.add_argument(
+        "--output",
+        type=str,
+        default=f"nanobanana-{uuid.uuid4()}.png",
+        help="Output image filename (default: nanobanana-<UUID>.png)",
+    )
+    parser.add_argument(
+        "--input", type=str, nargs="*", help="Input image files for editing (optional)"
+    )
+    parser.add_argument(
+        "--size",
+        type=str,
+        default="768x1344",
+        choices=list(ASPECT_RATIO_MAP.keys()),
+        help="Size/aspect ratio of the generated image (default: 768x1344 / 9:16)",
+    )
+    parser.add_argument(
+        "--model",
+        type=str,
+        default="gemini-3-pro-image-preview",
+        choices=["gemini-3-pro-image-preview", "gemini-2.5-flash-image"],
+        help="Model to use for image generation (default: gemini-3-pro-image-preview)",
+    )
+    parser.add_argument(
+        "--resolution",
+        type=str,
+        default="1K",
+        choices=["1K", "2K", "4K"],
+        help="Resolution of the generated image (default: 1K)",
+    )
+
+    args = parser.parse_args()
+
+    # Look up the aspect ratio; argparse `choices` guarantees the key exists
+    aspect_ratio = ASPECT_RATIO_MAP[args.size]
+
+    # Build contents list for the API call
+    contents = []
+
+    # Check if input images are provided
+    if args.input and len(args.input) > 0:
+        # Editing mode: send the prompt together with the input images
+        print(f"Editing images with prompt: {args.prompt}")
+        print(f"Input images: {args.input}")
+        print(f"Aspect ratio: {aspect_ratio} ({args.size})")
+
+        # Add prompt first
+        contents.append(args.prompt)
+
+        # Add all input images
+        for img_path in args.input:
+            image = Image.open(img_path)
+            contents.append(image)
+    else:
+        print(f"Generating image (size: {args.size}) with prompt: {args.prompt}")
+        contents.append(args.prompt)
+
+    # Generate or edit image with config
+    response = client.models.generate_content(
+        model=args.model,
+        contents=contents,
+        config=types.GenerateContentConfig(
+            response_modalities=['TEXT', 'IMAGE'],
+            tools=[types.Tool(google_search=types.GoogleSearch())],
+            image_config=types.ImageConfig(
+                aspect_ratio=aspect_ratio,
+                image_size=args.resolution,
+            ),
+        ),
+    )
+
+    if (response.candidates is None
+        or len(response.candidates) == 0
+        or response.candidates[0].content is None
+        or response.candidates[0].content.parts is None):
+        raise ValueError("No data received from the API.")
+
+    # Extract image from response
+    image_saved = False
+    for part in response.candidates[0].content.parts:
+        if part.text is not None:
+            print(part.text, end="")
+        elif part.inline_data is not None and part.inline_data.data is not None:
+            image = Image.open(BytesIO(part.inline_data.data))
+
+            image.save(args.output)
+            image_saved = True
+            print(f"\n\nImage saved to: {args.output}")
+
+    if not image_saved:
+        print("\n\nWarning: No image data found in the API response. This usually means the model returned only text. Try again with a prompt that asks for an image more explicitly.")
+
+
+if __name__ == "__main__":
+    main()
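In `nanobanana.py` above, `ASPECT_RATIO_MAP` turns each `--size` value into the aspect ratio that is passed to the API. Several of the pixel sizes only approximate their nominal ratio (for example, 768x1344 reduces to 4:7 rather than 9:16), so the size acts as a label rather than an exact dimension. The sketch below copies the mapping from the script and measures that gap; `ratio_error` and `nearest_ratio` are hypothetical helpers added for illustration.

```python
# Mapping copied from nanobanana.py.
ASPECT_RATIO_MAP = {
    "1024x1024": "1:1",
    "832x1248": "2:3",
    "1248x832": "3:2",
    "864x1184": "3:4",
    "1184x864": "4:3",
    "896x1152": "4:5",
    "1152x896": "5:4",
    "768x1344": "9:16",
    "1344x768": "16:9",
    "1536x672": "21:9",
}


def ratio_error(size, ratio):
    """Relative difference between a WxH size and a nominal N:D ratio."""
    width, height = (int(v) for v in size.split("x"))
    num, den = (int(v) for v in ratio.split(":"))
    return abs((width / height) / (num / den) - 1)


def nearest_ratio(size):
    """Pick the nominal ratio closest to the given pixel size."""
    return min(ASPECT_RATIO_MAP.values(), key=lambda r: ratio_error(size, r))


if __name__ == "__main__":
    for size, ratio in ASPECT_RATIO_MAP.items():
        print(f"{size} -> {ratio} (relative error {ratio_error(size, ratio):.3f})")
```

Every size in the table is within a few percent of its nominal ratio, and the nearest nominal ratio is always the one the script maps it to.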
--- a/skills/nanobanana-skill/requirements.txt
+++ b/skills/nanobanana-skill/requirements.txt
@@ -0,0 +1,4 @@
+python-dotenv
+httpx[socks]
+google-genai
+Pillow