---
name: image-gen
description: Generate images using Google's Nano Banana Pro (Gemini 3 Pro Image) with workflow-based prompting
triggers:
  - "create image"
  - "generate image"
  - "make infographic"
  - "create infographic"
  - "generate diagram"
  - "make diagram"
  - "design visual"
  - "create visual"
allowed-tools: Read, Write, Bash
version: 0.1.0
---

# Image Generation Skill

Generate professional images, infographics, and diagrams using Google's Nano Banana Pro model (gemini-3-pro-image-preview).

## Model Capabilities

**Nano Banana Pro** (released November 20, 2025):
- **Text rendering** - Accurate, legible text in images
- **Google Search grounding** - Real-time data (weather, stocks, etc.)
- **Multi-turn conversation** - Iterative refinement
- **Up to 14 reference images** - For composition and style transfer
- **Resolutions**: 1K, 2K, 4K
- **Aspect ratios**: 1:1, 2:3, 3:2, 4:3, 16:9, 21:9

## Scripts

All scripts use Python via `uv run` with inline dependencies.

### generate.py - Text to Image

```bash
uv run scripts/generate.py "prompt" output.png [aspect_ratio] [size]
```

**Examples:**
```bash
# Basic image
uv run scripts/generate.py "A cozy coffee shop in autumn" coffee.png

# Infographic with specific aspect ratio
uv run scripts/generate.py "Infographic explaining how neural networks work" nn.png 16:9 2K

# 4K professional image
uv run scripts/generate.py "Professional headshot, studio lighting" headshot.png 3:2 4K
```

### edit.py - Image Editing

```bash
uv run scripts/edit.py input.png "edit instructions" output.png
```

**Examples:**
```bash
# Edit existing image
uv run scripts/edit.py photo.png "Change the background to a beach sunset" edited.png
```

### compose.py - Multi-Image Composition

```bash
uv run scripts/compose.py "prompt" output.png --refs image1.png image2.png
```

**Examples:**
```bash
# Combine styles from multiple images
uv run scripts/compose.py "Combine these styles into a logo" logo.png --refs style1.png style2.png
```

## Workflows

Workflows provide
structured approaches for specific visual types. Each workflow follows the PAI 6-step editorial process:

1. **Extract narrative** - Understand the complete story/concept
2. **Derive visual concept** - Single metaphor with 2-3 physical objects
3. **Apply aesthetic** - Define style, colors, mood
4. **Construct prompt** - Build detailed generation instructions
5. **Generate** - Execute via script
6. **Validate** - Check against criteria, regenerate if needed

### Available Workflows

- **infographic.md** - Data visualization, statistics, explainers
- **diagram.md** - Technical diagrams, flowcharts, architecture

## Workflow Usage

When generating images, follow the appropriate workflow:

### For Infographics

```markdown
1. What data/concept needs visualization?
2. What's the key insight or takeaway?
3. Aspect ratio: 16:9 (landscape) recommended
4. Include: clear hierarchy, minimal text, supporting icons
5. Generate at 2K minimum for text clarity
```

### For Diagrams

```markdown
1. What system/process is being illustrated?
2. What are the key components and relationships?
3. Style: flat colors, clean lines, minimal detail
4. Generate at 2K for label clarity
```

## Environment Setup

Requires `GEMINI_API_KEY` environment variable. This should be set from Geoffrey's secrets:

```bash
source ~/Library/Mobile\ Documents/com~apple~CloudDocs/Geoffrey/secrets/.env
```

## Best Practices

### Infographics

- Use simple, direct prompts: "Infographic explaining how X works"
- Model auto-includes relevant icons/logos
- 16:9 aspect ratio works best
- Generate at 2K+ for readable text

### General

- Multi-turn refinement: generate, then ask for specific changes
- Reference images improve consistency
- Be specific about style, mood, lighting
- SynthID watermark is automatic (Google provenance)

## Output Location

By default, save images to `/tmp/` or user-specified paths.
For persistent storage, use:
```
~/Library/Mobile Documents/com~apple~CloudDocs/Geoffrey/images/
```

## Limitations

- No photorealistic humans (safety filter)
- No copyrighted characters
- Maximum 14 reference images for composition
- 4K only available with Nano Banana Pro

## Pricing

| Size | Cost per Image |
|------|----------------|
| 1K | Free tier / $0.04 |
| 2K | $0.134 |
| 4K | $0.24 |
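
The per-image prices above can be turned into a quick budget check before a batch run. The following is a minimal sketch, not part of the skill's scripts: the names `PRICE_PER_IMAGE_USD` and `estimate_cost` are hypothetical, and it assumes the paid rates from the table (the 1K free tier is ignored, so 1K is counted at $0.04).

```python
from decimal import Decimal

# Paid per-image prices copied from the pricing table (USD).
# Assumption: the free tier is ignored, so 1K is billed at $0.04.
PRICE_PER_IMAGE_USD = {
    "1K": Decimal("0.04"),
    "2K": Decimal("0.134"),
    "4K": Decimal("0.24"),
}


def estimate_cost(counts: dict[str, int]) -> Decimal:
    """Return the estimated total in USD for a mapping of size -> image count."""
    total = Decimal("0")
    for size, count in counts.items():
        if size not in PRICE_PER_IMAGE_USD:
            raise ValueError(
                f"Unknown size {size!r}; expected one of {sorted(PRICE_PER_IMAGE_USD)}"
            )
        if count < 0:
            raise ValueError("Image count must be non-negative")
        total += PRICE_PER_IMAGE_USD[size] * count
    return total


if __name__ == "__main__":
    # Example: ten 2K infographics plus two 4K images.
    print(f"Estimated cost: ${estimate_cost({'2K': 10, '4K': 2})}")
```

`Decimal` is used instead of `float` so that sums of prices such as $0.134 stay exact. Regenerations during the Validate step count as additional images, so it may be worth padding the counts accordingly.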