zhongwei/gh-hirefrank-hirefrank-marketplace-plugins-edge-stack

Zhongwei Li bd85f56f7c Initial commit

2025-11-29 18:45:50 +08:00

Gemini ImageGen Skill

AI-powered image generation, editing, and composition using Google's Gemini API.

Quick Start

Install dependencies:
```
npm install
```
Set your API key:
```
export GEMINI_API_KEY="your-api-key-here"
```
Get your key from: https://makersuite.google.com/app/apikey

Generate an image:

```
npm run generate "a sunset over mountains" output.png
```

Features

Generate: Create images from text descriptions
Edit: Modify existing images with natural language prompts
Compose: Combine multiple images with flexible layouts

Usage Examples

Generate Images

```
# Basic generation
npm run generate "futuristic city skyline" city.png

# Custom size
npm run generate "modern office" office.png -- --width 1920 --height 1080
```
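
As a rough illustration of how the `generate` arguments shown above could be parsed, here is a minimal sketch. The `GenerateArgs` interface and the `parseGenerateArgs` function are hypothetical names, not taken from the actual scripts; only the `--width` and `--height` flags come from this README, and no default size is assumed.

```typescript
// Hypothetical shape for the parsed arguments; the real script may differ.
interface GenerateArgs {
  prompt: string;
  output: string;
  width?: number;
  height?: number;
}

// Parses the arguments the script receives after npm strips the "--" separator,
// e.g. ["modern office", "office.png", "--width", "1920", "--height", "1080"].
function parseGenerateArgs(argv: string[]): GenerateArgs {
  const positional: string[] = [];
  const sizes: { width?: number; height?: number } = {};

  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    if (arg === "--width" || arg === "--height") {
      const raw = argv[i + 1];
      const value = raw === undefined ? NaN : Number(raw);
      // Reject missing, non-numeric, non-positive, and fractional values.
      if (!isFinite(value) || value <= 0 || Math.floor(value) !== value) {
        throw new Error(`${arg} expects a positive integer`);
      }
      if (arg === "--width") {
        sizes.width = value;
      } else {
        sizes.height = value;
      }
      i++; // skip the value that was just consumed
    } else {
      positional.push(arg);
    }
  }

  if (positional.length !== 2) {
    throw new Error("Usage: generate <prompt> <output> [--width N] [--height N]");
  }
  return { prompt: positional[0], output: positional[1], ...sizes };
}
```

In a Node script this could be called as `parseGenerateArgs(process.argv.slice(2))`. Validating the flags before any API call keeps a mistyped size from costing a request.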

Edit Images

```
# Style transformation
npm run edit photo.jpg "make it look like a watercolor painting" artistic.png

# Object modification
npm run edit landscape.png "add a rainbow in the sky" enhanced.png
```

Compose Images

```
# Grid layout (default)
npm run compose collage.png img1.jpg img2.jpg img3.jpg img4.jpg

# Horizontal banner
npm run compose banner.png left.png right.png -- --layout horizontal

# Custom composition
npm run compose result.png a.jpg b.jpg -- --prompt "blend seamlessly"
```
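
The README does not say how the grid layout arranges its inputs. One plausible approach, shown here purely as an assumption, is a near-square grid whose column count is the ceiling of the square root of the image count. `layoutShape`, `Layout`, and `GridShape` are hypothetical names; only the `grid` and `horizontal` layout values come from the examples above.

```typescript
// Layout names taken from the README examples; other values are not assumed.
type Layout = "grid" | "horizontal";

interface GridShape {
  columns: number;
  rows: number;
}

// Returns how many columns and rows are needed to place `imageCount` images.
// Assumption: "grid" means a near-square grid; "horizontal" means a single row.
function layoutShape(imageCount: number, layout: Layout = "grid"): GridShape {
  if (imageCount < 1) {
    throw new Error("at least one image is required");
  }
  if (layout === "horizontal") {
    return { columns: imageCount, rows: 1 };
  }
  const columns = Math.ceil(Math.sqrt(imageCount));
  const rows = Math.ceil(imageCount / columns);
  return { columns, rows };
}
```

Under this assumption, four images fill a 2 by 2 grid, and five images use three columns and two rows with one cell left empty.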

Scripts

npm run generate <prompt> <output> - Generate image from text
npm run edit <source> <prompt> <output> - Edit existing image
npm run compose <output> <images...> - Compose multiple images

Configuration

Environment Variables

GEMINI_API_KEY (required) - Your Google Gemini API key
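
Because the key is required, a script can check for it before doing any work. The sketch below shows one way to do that; `requireApiKey` is a hypothetical helper, not a function from this repository, and it takes the environment as a parameter so it can be exercised without touching the real process environment.

```typescript
// Hypothetical helper: returns the trimmed API key or throws a clear error.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env["GEMINI_API_KEY"];
  if (key === undefined || key.trim() === "") {
    throw new Error("GEMINI_API_KEY is not set. Export it before running the scripts.");
  }
  return key.trim();
}
```

In a Node script this would typically be called as `requireApiKey(process.env)` at startup, so a missing key fails fast with a readable message instead of surfacing later as an authentication error.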

Options

See SKILL.md for detailed documentation on all available options and parameters.

Development Notes

This is a local development skill that runs on your machine, not on Cloudflare Workers. It's designed for:

Design workflows and asset creation
Visual content generation
Image manipulation and prototyping
Creating test images for development

Implementation Status

Note: The current implementation includes:

Complete TypeScript structure
Argument parsing and validation
Gemini API integration for image analysis
Comprehensive error handling

Actual image generation and editing are not yet wired up. For production use, you'll need to:

Use the Imagen model (imagen-3.0-generate-001)
Implement proper image data handling
Add output file writing with actual image data

Refer to the Gemini Imagen documentation for implementation details.
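
As part of proper image data handling, it can help to confirm that the bytes returned by the API really are an image before writing them to the output file. The following is a minimal sketch that checks the standard PNG and JPEG file signatures; `detectImageKind` and `startsWithBytes` are hypothetical helpers, and how the bytes are obtained from the API response is left out because it depends on the client library in use.

```typescript
type ImageKind = "png" | "jpeg" | "unknown";

// Standard file signatures (magic bytes) for PNG and JPEG.
const PNG_SIGNATURE = [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a];
const JPEG_SIGNATURE = [0xff, 0xd8, 0xff];

// True when `data` begins with every byte of `signature`, in order.
function startsWithBytes(data: Uint8Array, signature: number[]): boolean {
  if (data.length < signature.length) {
    return false;
  }
  for (let i = 0; i < signature.length; i++) {
    if (data[i] !== signature[i]) {
      return false;
    }
  }
  return true;
}

// Classifies raw bytes by their leading signature.
function detectImageKind(data: Uint8Array): ImageKind {
  if (startsWithBytes(data, PNG_SIGNATURE)) {
    return "png";
  }
  if (startsWithBytes(data, JPEG_SIGNATURE)) {
    return "jpeg";
  }
  return "unknown";
}
```

A script could refuse to write the output file when the result is `"unknown"`, which guards against saving an error message or text response under an image file name.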

License

MIT