| name | description | triggers | allowed-tools | version |
|---|---|---|---|---|
| image-gen | Generate images using Google's Nano Banana Pro (Gemini 3 Pro Image) with workflow-based prompting | | Read, Write, Bash | 0.1.0 |
Image Generation Skill
Generate professional images, infographics, and diagrams using Google's Nano Banana Pro model (gemini-3-pro-image-preview).
Model Capabilities
Nano Banana Pro (released November 20, 2025):
- Text rendering - Accurate, legible text in images
- Google Search grounding - Real-time data (weather, stocks, etc.)
- Multi-turn conversation - Iterative refinement
- Up to 14 reference images - For composition and style transfer
- Resolutions: 1K, 2K, 4K
- Aspect ratios: 1:1, 2:3, 3:2, 4:3, 16:9, 21:9
Scripts
All scripts are written in Python and run with uv run, which installs the dependencies each script declares inline.
generate.py - Text to Image
uv run scripts/generate.py "prompt" output.png [aspect_ratio] [size]
Examples:
# Basic image
uv run scripts/generate.py "A cozy coffee shop in autumn" coffee.png
# Infographic with specific aspect ratio
uv run scripts/generate.py "Infographic explaining how neural networks work" nn.png 16:9 2K
# 4K professional image
uv run scripts/generate.py "Professional headshot, studio lighting" headshot.png 3:2 4K
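As a rough illustration of how a script in this style could declare inline dependencies and accept the arguments shown above, here is a minimal sketch. It is not the actual generate.py: the empty dependency list, the parse_args helper, and the defaults (1:1 and 1K) are assumptions, while the allowed values are copied from the Model Capabilities list.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = []
# ///

# The block above is PEP 723 inline metadata, which `uv run` reads.
# A real script would list its SDK there; it is left empty in this sketch.
import argparse

# Values taken from the Model Capabilities list.
ASPECT_RATIOS = ("1:1", "2:3", "3:2", "4:3", "16:9", "21:9")
SIZES = ("1K", "2K", "4K")


def parse_args(argv=None):
    """Parse `prompt output [aspect_ratio] [size]`."""
    parser = argparse.ArgumentParser(description="Text-to-image CLI sketch")
    parser.add_argument("prompt", help="text description of the image")
    parser.add_argument("output", help="path of the image file to write")
    # Defaults are assumptions; the source does not state them.
    parser.add_argument("aspect_ratio", nargs="?", default="1:1", choices=ASPECT_RATIOS)
    parser.add_argument("size", nargs="?", default="1K", choices=SIZES)
    return parser.parse_args(argv)


args = parse_args(["Infographic explaining how neural networks work", "nn.png", "16:9", "2K"])
print(args.output, args.aspect_ratio, args.size)  # nn.png 16:9 2K
```

Using `choices` means an unsupported ratio or size is rejected before any API call is made.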
edit.py - Image Editing
uv run scripts/edit.py input.png "edit instructions" output.png
Examples:
# Edit existing image
uv run scripts/edit.py photo.png "Change the background to a beach sunset" edited.png
compose.py - Multi-Image Composition
uv run scripts/compose.py "prompt" output.png --refs image1.png image2.png
Examples:
# Combine styles from multiple images
uv run scripts/compose.py "Combine these styles into a logo" logo.png --refs style1.png style2.png
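Before calling the model, a composition script would plausibly check the reference list against the 14-image limit. The following is a sketch under that assumption; check_references is a hypothetical helper, not a function confirmed to exist in compose.py.

```python
from pathlib import Path

MAX_REFERENCE_IMAGES = 14  # limit stated for Nano Banana Pro in this document


def check_references(paths):
    """Return the reference paths as Path objects, or raise if unusable."""
    refs = [Path(p) for p in paths]
    if not refs:
        raise ValueError("at least one reference image is required")
    if len(refs) > MAX_REFERENCE_IMAGES:
        raise ValueError(
            f"{len(refs)} reference images given; the maximum is {MAX_REFERENCE_IMAGES}"
        )
    # Fail early on typos in file names rather than after an API round trip.
    missing = [str(p) for p in refs if not p.is_file()]
    if missing:
        raise FileNotFoundError("missing reference images: " + ", ".join(missing))
    return refs
```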
Workflows
Workflows provide structured approaches for specific visual types. Each workflow follows the PAI 6-step editorial process:
1. Extract narrative - Understand the complete story/concept
2. Derive visual concept - Single metaphor with 2-3 physical objects
3. Apply aesthetic - Define style, colors, mood
4. Construct prompt - Build detailed generation instructions
5. Generate - Execute via script
6. Validate - Check against criteria, regenerate if needed
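The six steps can be read as a small pipeline ending in a generate-and-validate loop. The sketch below is one possible shape, not the workflow files' actual logic: Brief, construct_prompt, run_workflow, the prompt layout, and the retry limit are all hypothetical, and generate and validate are supplied by the caller.

```python
from dataclasses import dataclass


@dataclass
class Brief:
    narrative: str  # step 1: the story or concept
    concept: str    # step 2: single visual metaphor
    aesthetic: str  # step 3: style, colors, mood


def construct_prompt(brief):
    """Step 4: combine the brief into one generation prompt."""
    return f"{brief.concept}. Style: {brief.aesthetic}. Context: {brief.narrative}"


def run_workflow(brief, generate, validate, max_attempts=3):
    """Steps 5-6: generate, validate, and regenerate with feedback if needed."""
    prompt = construct_prompt(brief)
    for attempt in range(1, max_attempts + 1):
        image = generate(prompt)
        problems = validate(image)  # list of failed criteria, empty if fine
        if not problems:
            return image, attempt
        # Feed the failed criteria back into the next attempt.
        prompt = construct_prompt(brief) + " Fix: " + "; ".join(problems)
    raise RuntimeError(f"no valid image after {max_attempts} attempts")
```

In practice generate would shell out to one of the scripts, and validate could be a manual review that returns the issues found.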
Available Workflows
- infographic.md - Data visualization, statistics, explainers
- diagram.md - Technical diagrams, flowcharts, architecture
Workflow Usage
When generating images, follow the appropriate workflow:
For Infographics
1. What data/concept needs visualization?
2. What's the key insight or takeaway?
3. Aspect ratio: 16:9 (landscape) recommended
4. Include: clear hierarchy, minimal text, supporting icons
5. Generate at 2K minimum for text clarity
For Diagrams
1. What system/process is being illustrated?
2. What are the key components and relationships?
3. Style: flat colors, clean lines, minimal detail
4. Generate at 2K for label clarity
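The two checklists can be captured as presets that expand into a generate.py command line. This is a sketch only: PRESETS and build_command are hypothetical names, the prompt templates paraphrase the checklists, and the 16:9 ratio for diagrams is an assumption because the diagram checklist does not specify one.

```python
PRESETS = {
    "infographic": {
        "template": "Infographic explaining {topic}. Clear hierarchy, minimal text, supporting icons.",
        "aspect_ratio": "16:9",
        "size": "2K",
    },
    "diagram": {
        "template": "Technical diagram of {topic}. Flat colors, clean lines, minimal detail.",
        "aspect_ratio": "16:9",  # assumption: the diagram workflow does not specify a ratio
        "size": "2K",
    },
}


def build_command(kind, topic, output):
    """Return the argument list for `uv run scripts/generate.py ...`."""
    preset = PRESETS[kind]
    prompt = preset["template"].format(topic=topic)
    return [
        "uv", "run", "scripts/generate.py",
        prompt, output, preset["aspect_ratio"], preset["size"],
    ]


print(build_command("diagram", "a three-tier web architecture", "arch.png"))
```

The returned list can be passed directly to subprocess.run, which avoids shell quoting problems in long prompts.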
Environment Setup
Requires the GEMINI_API_KEY environment variable. Load it from Geoffrey's secrets file:
source ~/Library/Mobile\ Documents/com~apple~CloudDocs/Geoffrey/secrets/.env
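A script can fail fast with a clear message when the key is absent. The helper below is a minimal sketch; require_api_key is a hypothetical name. Note that sourcing a file only makes a variable visible to child processes such as uv run if the file exports it.

```python
import os


def require_api_key(env=None):
    """Return GEMINI_API_KEY, or raise with a hint if it is unset or blank."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; source the secrets .env file first "
            "and make sure the variable is exported"
        )
    return key
```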
Best Practices
Infographics
- Use simple, direct prompts: "Infographic explaining how X works"
- The model adds relevant icons and logos automatically
- 16:9 aspect ratio works best
- Generate at 2K+ for readable text
General
- Multi-turn refinement: generate, then ask for specific changes
- Reference images improve consistency
- Be specific about style, mood, lighting
- A SynthID watermark is embedded automatically (Google's provenance marker)
Output Location
By default, save images to /tmp/ or user-specified paths. For persistent storage, use:
~/Library/Mobile Documents/com~apple~CloudDocs/Geoffrey/images/
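The output rule above could be expressed as a small path resolver. This is a sketch with a hypothetical resolve_output helper and persistent flag; it only computes a path and does not create directories.

```python
from pathlib import Path

TEMP_DIR = Path("/tmp")
PERSISTENT_DIR = (
    Path.home() / "Library/Mobile Documents/com~apple~CloudDocs/Geoffrey/images"
)


def resolve_output(name, persistent=False):
    """Pick where an image should be written."""
    path = Path(name).expanduser()
    # A path with a directory component is treated as user-specified and kept.
    if path.is_absolute() or path.parent != Path("."):
        return path
    # A bare file name goes to /tmp, or to the iCloud folder when requested.
    return (PERSISTENT_DIR if persistent else TEMP_DIR) / path
```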
Limitations
- Photorealistic humans may be blocked (safety filter)
- No copyrighted characters
- Maximum 14 reference images for composition
- 4K output is available only with Nano Banana Pro
Pricing
| Size | Cost per Image |
|---|---|
| 1K | Free tier / $0.04 |
| 2K | $0.134 |
| 4K | $0.24 |
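For budgeting a batch, the table can be turned into a quick estimate. The sketch below uses the paid prices as listed above, treating 1K as $0.04 and ignoring the free tier; estimate_cost is a hypothetical helper, and the prices may change.

```python
from decimal import Decimal

# Prices in USD per image, copied from the table above.
PRICE_USD = {"1K": Decimal("0.04"), "2K": Decimal("0.134"), "4K": Decimal("0.24")}


def estimate_cost(counts):
    """Sum the cost of a batch given a mapping of size to image count."""
    unknown = set(counts) - set(PRICE_USD)
    if unknown:
        raise ValueError(f"unknown sizes: {sorted(unknown)}")
    return sum((PRICE_USD[size] * n for size, n in counts.items()), Decimal("0"))


print(estimate_cost({"2K": 3, "4K": 1}))  # 0.642
```

Decimal is used instead of float so that the totals match the listed prices exactly.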