Initial commit
281
skills/vertex-media-master/SKILL.md
Normal file
@@ -0,0 +1,281 @@
---
name: Vertex AI Media Master
description: |
  Automatic activation for ALL Google Vertex AI multimodal operations - video processing, audio generation, image creation, and marketing campaigns.
  **TRIGGER PHRASES:**
  - "vertex ai", "gemini multimodal", "process video", "generate audio", "create images", "marketing campaign"
  - "imagen", "video understanding", "multimodal", "content generation", "media assets"
  **AUTO-INVOKES FOR:**
  - Video processing and understanding (up to 6 hours)
  - Audio generation and transcription
  - Image generation with Imagen 4
  - Marketing campaign automation
  - Social media content creation
  - Ad creative generation
  - Multimodal content workflows
allowed-tools: Read, Write, Edit, Grep, Glob, Bash
version: 1.0.0
---

# Vertex AI Media Master - Comprehensive Multimodal AI Operations

This Agent Skill covers Google Vertex AI multimodal capabilities for video, audio, image, and text processing, with a focus on marketing applications.

## Core Capabilities

### 🎥 Video Processing (Gemini 2.0/2.5)
- **Video Understanding**: Process videos up to 6 hours at low resolution or 2 hours at default resolution; actual limits depend on the model's context window, so check the model page for the model you use
- **Long Context Window**: Gemini 2.5 Pro accepts roughly 1M input tokens per request, enough for long-form video
- **Audio Track Processing**: Automatic audio transcription from video
- **Multi-video Analysis**: Process multiple videos in a single request
- **Video Summarization**: Extract key moments, scenes, and insights
- **Marketing Use Cases**:
  - Analyze competitor video ads
  - Extract highlights from long-form content
  - Generate video summaries for social media
  - Transcribe and caption video content
  - Identify brand mentions and product placements
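
The resolution trade-off above can be expressed as a small pre-flight check. This is a minimal sketch: `pick_resolution` and the `MAX_DURATION` table are hypothetical names, and the limits are the 6-hour and 2-hour figures quoted in this section, not values read from the API.

```python
from datetime import timedelta

# Limits quoted in this section; treat them as assumptions and confirm
# them against the model page for the model you actually call.
MAX_DURATION = {
    "default": timedelta(hours=2),
    "low": timedelta(hours=6),
}


def pick_resolution(duration: timedelta) -> str:
    """Return the highest resolution setting whose limit covers `duration`."""
    for resolution in ("default", "low"):
        if duration <= MAX_DURATION[resolution]:
            return resolution
    raise ValueError(f"video of {duration} exceeds the {MAX_DURATION['low']} limit")
```

Running this before upload lets a pipeline fall back to low resolution, or split the video, instead of failing mid-request.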

### 🎵 Audio Generation & Processing
- **Lyria Model (2025)**: Native audio and music generation
- **Speech-to-Text**: Transcribe audio with speaker diarization
- **Text-to-Speech**: Generate natural voiceovers
- **Music Composition**: Background music for campaigns
- **Audio Enhancement**: Noise reduction and quality improvement
- **Marketing Use Cases**:
  - Generate podcast scripts and voiceovers
  - Create audio ads and radio spots
  - Produce background music for video campaigns
  - Transcribe customer interviews
  - Generate multilingual voiceovers

### 🖼️ Image Generation (Imagen 4 & Gemini 2.5 Flash Image)
- **Imagen 4**: Highest quality text-to-image generation
- **Gemini 2.5 Flash Image**: Interleaved image generation with text
- **Style Transfer**: Apply brand styles to generated images
- **Product Visualization**: Generate product mockups
- **Campaign Assets**: Create ad creatives and social media graphics
- **Marketing Use Cases**:
  - Generate personalized ad images (Adios solution)
  - Create social media graphics at scale
  - Produce product lifestyle images
  - Generate A/B test variations
  - Create branded campaign visuals

### 📢 Marketing Campaign Automation
- **ViGenAiR**: Convert long-form video ads to short formats automatically
- **Adios**: Generate personalized ad images tailored to audience context
- **Campaign Asset Generation**: Photos, soundtracks, voiceovers from prompts
- **Content Pipeline**: Email copy, blog posts, social media, PMax assets
- **Catalog Enrichment**: Multi-agent workflow for product onboarding
- **Marketing Use Cases**:
  - Automated campaign asset production
  - Personalized content at scale
  - Multi-channel content distribution
  - Product catalog enhancement
  - Visual merchandising automation

### 🔧 Technical Implementation

**API Integration:**
```python
import vertexai
from vertexai.generative_models import GenerativeModel, Part
from vertexai.preview.vision_models import ImageGenerationModel

# Initialize Vertex AI
vertexai.init(project="your-project", location="us-central1")

# Gemini 2.5 Pro for video
model = GenerativeModel("gemini-2.5-pro")

# Reference a video in Cloud Storage (placeholder URI)
video_file = Part.from_uri("gs://your-bucket/video.mp4", mime_type="video/mp4")

# Process video with audio
response = model.generate_content([
    "Analyze this video and extract key marketing insights",
    video_file,  # the audio track is analyzed along with the frames
])
print(response.text)

# Imagen 4 for image generation
# The model ID is an assumption; confirm the current Imagen 4 ID in Model Garden
imagen = ImageGenerationModel.from_pretrained("imagen-4.0-generate-001")
images = imagen.generate_images(
    prompt="Professional product photo, studio lighting, white background",
    number_of_images=4,
)
```

**Gemini 2.5 Flash Image (Interleaved Generation):**
```python
# Generate images within text responses
model = GenerativeModel("gemini-2.5-flash-image")
response = model.generate_content([
    "Create a 5-step recipe with images for each step"
])
# Returns text + images interleaved
```

**Audio Generation (Lyria):**
```python
# Illustrative only: `vertexai.preview.audio_models` and `AudioGenerationModel`
# are not confirmed SDK names. Check the Lyria documentation on Vertex AI
# for the current request format before relying on this.
from vertexai.preview.audio_models import AudioGenerationModel
lyria = AudioGenerationModel.from_pretrained("lyria")
audio = lyria.generate_audio(
    prompt="Upbeat background music for product launch video, 30 seconds",
    duration=30,
)
```

### 📊 Marketing Workflow Automation

**1. Multi-Channel Campaign Creation:**
```python
# Single prompt generates all assets
campaign = model.generate_content([
    """Create a product launch campaign for [product]:
    - Hero image (1920x1080)
    - 3 social media graphics (1080x1080)
    - 30-second video script
    - Background music description
    - Email marketing copy
    - Instagram caption"""
])
```

**2. Video Repurposing Pipeline:**
```python
# Long-form to short-form conversion (ViGenAiR approach)
from vertexai.generative_models import Part

# Pass the video as a Part; a bare "gs://" string would be read as prompt text
long_video = Part.from_uri("gs://bucket/original-ad-60s.mp4", mime_type="video/mp4")
response = model.generate_content([
    "Extract 3 engaging 15-second clips from this video for TikTok/Reels",
    long_video,
])
# The response describes the suggested clips; cutting and reformatting
# the video for each platform happens in a downstream step
```
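
The clip suggestions come back as text, so a pipeline needs to turn them into something it can cut. The sketch below assumes the prompt asks the model to answer with a JSON list of objects with `start_s`, `end_s`, and `reason` fields; that response shape, the `Clip` class, and `parse_clips` are all hypothetical and not part of the Vertex AI API.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Clip:
    start_s: float
    end_s: float
    reason: str

    @property
    def duration_s(self) -> float:
        return self.end_s - self.start_s


def parse_clips(response_text: str, max_len_s: float = 15.0) -> list[Clip]:
    """Parse a JSON list of clip suggestions, dropping empty or over-long ones."""
    clips = []
    for item in json.loads(response_text):
        clip = Clip(float(item["start_s"]), float(item["end_s"]), item.get("reason", ""))
        if 0 < clip.duration_s <= max_len_s:
            clips.append(clip)
    return clips
```

Validating durations here keeps a malformed suggestion from reaching the video-cutting step.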

**3. Personalized Ad Generation:**
```python
# Context-aware image generation (Adios approach)
# `audiences` and `product` are placeholders defined for your campaign
ad_images = []
for audience in audiences:
    ad_image = imagen.generate_images(
        prompt=f"Product ad for {product}, targeting {audience.demographics}, {audience.style_preference}",
        aspect_ratio="16:9",
    )
    ad_images.append(ad_image)
```
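
The loop above assumes each audience object exposes `demographics` and `style_preference`. One way to model that, sketched here with a hypothetical `Audience` dataclass and `build_ad_prompts` helper, is to build the prompts separately from the image calls so they can be reviewed before any generation cost is incurred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Audience:
    name: str
    demographics: str
    style_preference: str


def build_ad_prompts(product: str, audiences: list[Audience]) -> dict[str, str]:
    """Map each audience name to the prompt text used for its ad image."""
    return {
        a.name: f"Product ad for {product}, targeting {a.demographics}, {a.style_preference}"
        for a in audiences
    }
```

Each value in the returned dict can then be passed as `prompt=` to the image model.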

### 🎯 Best Practices for Jeremy

**1. Project Setup:**
```bash
# Set environment variables
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account.json"

# Install the Vertex AI SDK (provides the `vertexai` package used above)
pip install --upgrade google-cloud-aiplatform
```
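
A script can confirm these variables are set before it initializes the SDK. This is a minimal sketch; `missing_settings` is a hypothetical helper, and it only checks that the values are present, not that the credentials are valid.

```python
import os
from typing import Mapping, Optional

REQUIRED_VARS = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_APPLICATION_CREDENTIALS")


def missing_settings(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_settings()` with no argument checks the real environment; passing a dict makes the check easy to test.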

**2. Rate Limits & Quotas:**
- Gemini 2.5 Pro: 2M tokens/min (video processing); quotas vary by project and region
- Imagen 4: 100 images/min; also project-dependent
- Monitor actual limits and usage on the Quotas page in Cloud Console
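
Client-side pacing helps a batch job stay under a per-minute quota. The sketch below spaces calls evenly; `MinIntervalLimiter` is a hypothetical class, and the 100-per-minute value in the usage note is the Imagen figure listed above, which may differ for your project.

```python
import time
from typing import Callable


class MinIntervalLimiter:
    """Space calls at least 60 / per_minute seconds apart."""

    def __init__(self, per_minute: int,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.interval = 60.0 / per_minute
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = clock()

    def wait(self) -> float:
        """Block until the next call may start; return the seconds slept."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        # The call starts at max(now, next_allowed); schedule the one after it.
        self._next_allowed = max(now, self._next_allowed) + self.interval
        return delay
```

Create one `MinIntervalLimiter(100)` per quota and call `wait()` immediately before each image request. The injectable `clock` and `sleep` exist so the pacing logic can be tested without real delays.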

**3. Cost Optimization:**
- Use Gemini 2.5 Flash for faster, cheaper operations when the task does not need Pro
- Batch image generation requests
- Cache results (or use context caching) when the same video is analyzed repeatedly
- Use the low-resolution video setting when appropriate
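
Batching can be as simple as grouping prompts before sending them. This is a generic sketch; `chunked` is a hypothetical helper, and the batch size of 4 in the usage note is an assumption taken from the `number_of_images=4` call earlier in this skill, not a documented limit.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]
```

For example, `for batch in chunked(prompts, 4): ...` processes four prompts per iteration, with a shorter final batch if the count does not divide evenly.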

**4. Security & Compliance:**
- Keep API keys and service-account credentials in Secret Manager, never in code
- Use service accounts with minimal permissions
- Enable VPC Service Controls to limit data exfiltration, and pin `location` to a specific region for data residency
- Enable Cloud Audit Logs so API calls leave an audit trail

### 🚀 Advanced Marketing Use Cases

**1. Campaign Performance Analysis:**
```python
# Analyze competitor campaigns
from vertexai.generative_models import Part

competitor_videos = ["gs://bucket/competitor1.mp4", "gs://bucket/competitor2.mp4"]
# Wrap each URI in a Part so the model receives video, not a text string
video_parts = [Part.from_uri(uri, mime_type="video/mp4") for uri in competitor_videos]
analysis = model.generate_content([
    "Compare these competitor videos: themes, messaging, CTAs, production quality",
    *video_parts,
])
```

**2. Content Localization:**
```python
# Generate multilingual campaigns
# `campaign_brief` (text) and `hero_image` (an image Part) are defined by the caller
localized_content = {}
for lang in ["en", "es", "fr", "de", "ja"]:
    localized_content[lang] = model.generate_content([
        f"Translate and culturally adapt this campaign for {lang} market:",
        campaign_brief,
        hero_image,
    ])
```

**3. A/B Test Generation:**
```python
# Generate variations automatically
# `brand_guidelines` is a text placeholder defined by the caller
variations = []
for style in ["minimalist", "bold", "luxury", "playful"]:
    variation = imagen.generate_images(
        prompt=f"Product ad, {style} style, {brand_guidelines}",
        number_of_images=1,
    )
    variations.append(variation)
```
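
For an A/B test, each generated image needs a stable label so results can be attributed to it later. The sketch below builds a variant manifest before any image is generated; `build_variants`, the ID scheme, and the choice of aspect ratios are assumptions for illustration.

```python
from itertools import product


def build_variants(brand_guidelines: str, styles: list[str], ratios: list[str]) -> list[dict]:
    """Return one record per style/aspect-ratio pair, with a stable ID for reporting."""
    return [
        {
            "id": f"{style}-{ratio.replace(':', 'x')}",
            "prompt": f"Product ad, {style} style, {brand_guidelines}",
            "aspect_ratio": ratio,
        }
        for style, ratio in product(styles, ratios)
    ]
```

Each record's `prompt` and `aspect_ratio` can be passed to the image model, and the `id` stored alongside the output file and the campaign metrics.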

### 📚 Reference Documentation

**Official Documentation:**
- Vertex AI Multimodal: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/overview
- Gemini 2.5 Pro: https://cloud.google.com/vertex-ai/generative-ai/docs/models
- Imagen 4: https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview
- Video Understanding: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/video-understanding

**Marketing Solutions:**
- GenAI for Marketing: https://github.com/GoogleCloudPlatform/genai-for-marketing
- ViGenAiR (video repurposing)
- Adios (personalized ad images)

**Pricing:**
- Gemini 2.5 Pro: $1.25/1M input tokens and $10/1M output tokens for prompts up to 200K tokens; $2.50 and $15 above that
- Imagen 4: $0.04/image
- Video processing: Included in Gemini token pricing
- Prices change; confirm current rates on the Vertex AI pricing page
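
A rough campaign budget follows directly from these unit prices. The sketch below is an estimate only: `estimate_cost` is a hypothetical helper, the per-image price is the Imagen figure listed above, and the token prices are passed in by the caller because they change over time.

```python
from decimal import Decimal

IMAGEN_PRICE_PER_IMAGE = Decimal("0.04")  # per-image figure listed above


def estimate_cost(images: int, input_tokens: int, output_tokens: int,
                  input_price_per_m: Decimal, output_price_per_m: Decimal) -> Decimal:
    """Estimate USD cost for a batch of images plus Gemini token usage."""
    million = Decimal(1_000_000)
    token_cost = (input_tokens * input_price_per_m
                  + output_tokens * output_price_per_m) / million
    total = images * IMAGEN_PRICE_PER_IMAGE + token_cost
    return total.quantize(Decimal("0.01"))
```

`Decimal` is used instead of `float` so that small per-unit prices add up without rounding drift.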

## When This Skill Activates

This skill automatically activates when you mention:
- Video processing, analysis, or understanding
- Audio generation, music composition, or voiceovers
- Image generation, ad creatives, or visual content
- Marketing campaigns, content automation, or asset production
- Gemini multimodal capabilities
- Vertex AI media operations
- Social media content, email marketing, or PMax campaigns

## Integration with Other Tools

**Google Cloud Services:**
- Cloud Storage for media asset management
- BigQuery for campaign analytics
- Cloud Functions for automation triggers
- Vertex AI Pipelines for content workflows

**Third-Party Integrations:**
- Social media APIs (LinkedIn, Twitter, Instagram)
- Marketing automation platforms (HubSpot, Marketo)
- CMS integrations (WordPress, Contentful)
- DAM systems (Bynder, Cloudinary)

## Success Metrics

**Track These KPIs:**
- Asset generation speed (baseline: 5 images/min)
- Content approval rate (target: >80%)
- Campaign personalization scale (target: 1000+ variants)
- Cost per asset (target: <$0.10/image)
- Time saved vs manual production (target: 90% reduction)
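
Two of these targets can be checked mechanically from campaign totals. This is a minimal sketch; `CampaignStats` and `kpi_report` are hypothetical names, and the thresholds are the approval-rate and cost-per-asset targets listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CampaignStats:
    images_generated: int
    images_approved: int
    total_cost_usd: float


def kpi_report(stats: CampaignStats) -> dict[str, bool]:
    """Check the approval-rate (>80%) and cost-per-asset (<$0.10) targets."""
    if stats.images_generated <= 0:
        raise ValueError("images_generated must be positive")
    approval_rate = stats.images_approved / stats.images_generated
    cost_per_asset = stats.total_cost_usd / stats.images_generated
    return {
        "approval_rate_ok": approval_rate > 0.80,
        "cost_per_asset_ok": cost_per_asset < 0.10,
    }
```

The other KPIs (generation speed, personalization scale, time saved) need timing and baseline data that this sketch does not model.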

---

**This skill makes Jeremy a Vertex AI multimodal expert with instant access to video processing, audio generation, image creation, and marketing automation capabilities.**
26
skills/vertex-media-master/assets/README.md
Normal file
@@ -0,0 +1,26 @@
# Skill Assets

This directory contains static assets used by this skill.

## Purpose

Assets can include:
- Configuration files (JSON, YAML)
- Data files
- Templates
- Schemas
- Test fixtures

## Guidelines

- Keep assets small and focused
- Document asset purpose and format
- Use standard file formats
- Include schema validation where applicable

## Common Asset Types

- **config.json** - Configuration templates
- **schema.json** - JSON schemas
- **template.yaml** - YAML templates
- **test-data.json** - Test fixtures
26
skills/vertex-media-master/references/README.md
Normal file
@@ -0,0 +1,26 @@
# Skill References

This directory contains reference materials that enhance this skill's capabilities.

## Purpose

References can include:
- Code examples
- Style guides
- Best practices documentation
- Template files
- Configuration examples

## Guidelines

- Keep references concise and actionable
- Use markdown for documentation
- Include clear examples
- Link to external resources when appropriate

## Types of References

- **examples.md** - Usage examples
- **style-guide.md** - Coding standards
- **templates/** - Reusable templates
- **patterns.md** - Design patterns
24
skills/vertex-media-master/scripts/README.md
Normal file
@@ -0,0 +1,24 @@
# Skill Scripts

This directory contains optional helper scripts that support this skill's functionality.

## Purpose

Scripts here can be:
- Referenced by the skill for automation
- Used as examples for users
- Executed during skill activation

## Guidelines

- All scripts should be well-documented
- Include usage examples in comments
- Make scripts executable (`chmod +x`)
- Use `#!/bin/bash` or `#!/usr/bin/env python3` shebangs

## Adding Scripts

1. Create script file (e.g., `analyze.sh`, `process.py`)
2. Add documentation header
3. Make executable: `chmod +x script-name.sh`
4. Test thoroughly before committing