Initial commit

2025-11-30 08:21:41 +08:00
commit 0b3ddcd76a
8 changed files with 603 additions and 0 deletions
--- a/skills/vertex-media-master/SKILL.md
+++ b/skills/vertex-media-master/SKILL.md
@@ -0,0 +1,281 @@
+---
+name: Vertex AI Media Master
+description: |
+  Automatic activation for ALL Google Vertex AI multimodal operations - video processing, audio generation, image creation, and marketing campaigns.
+  **TRIGGER PHRASES:**
+  - "vertex ai", "gemini multimodal", "process video", "generate audio", "create images", "marketing campaign"
+  - "imagen", "video understanding", "multimodal", "content generation", "media assets"
+  **AUTO-INVOKES FOR:**
+  - Video processing and understanding (long-form video; length limits depend on the model)
+  - Audio generation and transcription
+  - Image generation with Imagen 4
+  - Marketing campaign automation
+  - Social media content creation
+  - Ad creative generation
+  - Multimodal content workflows
+allowed-tools: Read, Write, Edit, Grep, Glob, Bash
+version: 1.0.0
+---
+
+# Vertex AI Media Master - Comprehensive Multimodal AI Operations
+
+This Agent Skill covers Google Vertex AI multimodal capabilities for video, audio, image, and text processing, with a focus on marketing applications.
+
+## Core Capabilities
+
+### 🎥 Video Processing (Gemini 2.0/2.5)
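+- **Video Understanding**: Process long videos; the 6-hour (low resolution) and 2-hour (default resolution) limits apply to models with a 2M-token context window
+- **Long Context Window**: Gemini 2.5 Pro accepts roughly 1M input tokens, so its video limits are about half of those figures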
+- **Audio Track Processing**: Automatic audio transcription from video
+- **Multi-video Analysis**: Process multiple videos in single request
+- **Video Summarization**: Extract key moments, scenes, and insights
+- **Marketing Use Cases**:
+  - Analyze competitor video ads
+  - Extract highlights from long-form content
+  - Generate video summaries for social media
+  - Transcribe and caption video content
+  - Identify brand mentions and product placements
+
+### 🎵 Audio Generation & Processing
+- **Lyria Model (2025)**: Native audio and music generation
+- **Speech-to-Text**: Transcribe audio with speaker diarization
+- **Text-to-Speech**: Generate natural voiceovers
+- **Music Composition**: Background music for campaigns
+- **Audio Enhancement**: Noise reduction and quality improvement
+- **Marketing Use Cases**:
+  - Generate podcast scripts and voiceovers
+  - Create audio ads and radio spots
+  - Produce background music for video campaigns
+  - Transcribe customer interviews
+  - Generate multilingual voiceovers
+
+### 🖼️ Image Generation (Imagen 4 & Gemini 2.5 Flash Image)
+- **Imagen 4**: Highest quality text-to-image generation
+- **Gemini 2.5 Flash Image**: Interleaved image generation with text
+- **Style Transfer**: Apply brand styles to generated images
+- **Product Visualization**: Generate product mockups
+- **Campaign Assets**: Create ad creatives and social media graphics
+- **Marketing Use Cases**:
+  - Generate personalized ad images (Adios solution)
+  - Create social media graphics at scale
+  - Produce product lifestyle images
+  - Generate A/B test variations
+  - Create branded campaign visuals
+
+### 📢 Marketing Campaign Automation
+- **ViGenAiR**: Convert long-form video ads to short formats automatically
+- **Adios**: Generate personalized ad images tailored to audience context
+- **Campaign Asset Generation**: Photos, soundtracks, voiceovers from prompts
+- **Content Pipeline**: Email copy, blog posts, social media, PMax assets
+- **Catalog Enrichment**: Multi-agent workflow for product onboarding
+- **Marketing Use Cases**:
+  - Automated campaign asset production
+  - Personalized content at scale
+  - Multi-channel content distribution
+  - Product catalog enhancement
+  - Visual merchandising automation
+
+### 🔧 Technical Implementation
+
+**API Integration:**
+```python
+import vertexai
+from vertexai.preview.generative_models import GenerativeModel, Part
+
+# Initialize Vertex AI
+vertexai.init(project="your-project", location="us-central1")
+
+# Gemini 2.5 Pro for video
+model = GenerativeModel("gemini-2.5-pro")
+
+# Process video with audio; the bucket path is a placeholder
+response = model.generate_content([
+    "Analyze this video and extract key marketing insights",
+    Part.from_uri("gs://your-bucket/video.mp4", mime_type="video/mp4"),
+])
+
+# Imagen 4 for image generation (model ID is an assumption; confirm it in the Imagen model list)
+from vertexai.preview.vision_models import ImageGenerationModel
+imagen = ImageGenerationModel.from_pretrained("imagen-4.0-generate-001")
+images = imagen.generate_images(
+    prompt="Professional product photo, studio lighting, white background",
+    number_of_images=4
+)
+```
+
+**Gemini 2.5 Flash Image (Interleaved Generation):**
+```python
+# Generate images within text responses
+model = GenerativeModel("gemini-2.5-flash-image")
+response = model.generate_content([
+    "Create a 5-step recipe with images for each step"
+])
+# Returns text + images interleaved
+```
+
+**Audio Generation (Lyria):**
+```python
+# Illustrative sketch: `vertexai.preview.audio_models` is an unverified interface; check the Lyria docs
+from vertexai.preview.audio_models import AudioGenerationModel
+lyria = AudioGenerationModel.from_pretrained("lyria")
+audio = lyria.generate_audio(
+    prompt="Upbeat background music for product launch video, 30 seconds", duration=30
+)
+```
+
+### 📊 Marketing Workflow Automation
+
+**1. Multi-Channel Campaign Creation:**
+```python
+# Single prompt drafts text specs for every asset; render the images and audio with Imagen and Lyria
+campaign = model.generate_content([
+    """Create a product launch campaign for [product]:
+    - Hero image (1920x1080)
+    - 3 social media graphics (1080x1080)
+    - 30-second video script
+    - Background music description
+    - Email marketing copy
+    - Instagram caption"""
+])
+```
+
+**2. Video Repurposing Pipeline:**
+```python
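+# Long-form to short-form conversion (ViGenAiR approach)
+from vertexai.preview.generative_models import Part
+long_video = Part.from_uri("gs://bucket/original-ad-60s.mp4", mime_type="video/mp4")
+prompt = "Suggest 3 engaging 15-second clips, with timestamps, from this video for TikTok/Reels"
+response = model.generate_content([prompt, long_video])
+# The response is text (clip suggestions); cutting and reformatting
+# the video itself is a separate step outside this call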
+```
+
+**3. Personalized Ad Generation:**
+```python
+# Context-aware image generation (Adios approach)
+for audience in audiences:
+    ad_image = imagen.generate_images(
+        prompt=f"Product ad for {product}, targeting {audience.demographics}, {audience.style_preference}",
+        aspect_ratio="16:9"
+    )
+```
+
+### 🎯 Best Practices for Jeremy
+
+**1. Project Setup:**
+```bash
+# Set environment variables
+export GOOGLE_CLOUD_PROJECT="your-project-id"
+export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account.json"
+
+# Install SDK
+pip install --upgrade google-cloud-aiplatform
+```
+
+**2. Rate Limits & Quotas:**
+- Gemini 2.5 Pro: 2M tokens/min (video processing)
+- Imagen 4: 100 images/min
+- Quotas vary by project, region, and model; confirm current limits and monitor usage in Cloud Console
+
+**3. Cost Optimization:**
+- Use Gemini 2.5 Flash for faster, cheaper operations
+- Batch image generation requests
+- Use context caching for videos that are analyzed repeatedly
+- Use low-resolution video setting when appropriate
+
+**4. Security & Compliance:**
+- Keep API keys in Secret Manager, never in code
+- Use service accounts with minimal permissions
+- Enable VPC Service Controls to limit data exfiltration, and use regional endpoints for data residency
+- Log all API calls for audit trails
+
+### 🚀 Advanced Marketing Use Cases
+
+**1. Campaign Performance Analysis:**
+```python
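+# Analyze competitor campaigns; GCS URIs must be wrapped in Part objects, not passed as strings
+from vertexai.preview.generative_models import Part
+uris = ["gs://bucket/competitor1.mp4", "gs://bucket/competitor2.mp4"]
+competitor_videos = [Part.from_uri(uri, mime_type="video/mp4") for uri in uris]
+prompt = "Compare these competitor videos: themes, messaging, CTAs, production quality"
+analysis = model.generate_content([prompt, *competitor_videos])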
+```
+
+**2. Content Localization:**
+```python
+# Generate multilingual campaigns
+for lang in ["en", "es", "fr", "de", "ja"]:
+    localized_content = model.generate_content([
+        f"Translate and culturally adapt this campaign for {lang} market:",
+        campaign_brief,
+        hero_image
+    ])
+```
+
+**3. A/B Test Generation:**
+```python
+# Generate variations automatically
+variations = []
+for style in ["minimalist", "bold", "luxury", "playful"]:
+    variation = imagen.generate_images(
+        prompt=f"Product ad, {style} style, {brand_guidelines}",
+        number_of_images=1
+    )
+    variations.append(variation)
+```
+
+### 📚 Reference Documentation
+
+**Official Documentation:**
+- Vertex AI Multimodal: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/overview
+- Gemini 2.5 Pro: https://cloud.google.com/vertex-ai/generative-ai/docs/models
+- Imagen 4: https://cloud.google.com/vertex-ai/generative-ai/docs/image/overview
+- Video Understanding: https://cloud.google.com/vertex-ai/generative-ai/docs/multimodal/video-understanding
+
+**Marketing Solutions:**
+- GenAI for Marketing: https://github.com/GoogleCloudPlatform/genai-for-marketing
+- ViGenAiR (video repurposing)
+- Adios (personalized ad images)
+
+**Pricing:**
+- Gemini 2.5 Pro: about $1.25/1M input tokens and $10/1M output tokens for prompts up to 200K tokens, higher beyond that; confirm on the Vertex AI pricing page
+- Imagen 4: $0.04/image
+- Video processing: Included in Gemini token pricing
+
+## When This Skill Activates
+
+This skill automatically activates when you mention:
+- Video processing, analysis, or understanding
+- Audio generation, music composition, or voiceovers
+- Image generation, ad creatives, or visual content
+- Marketing campaigns, content automation, or asset production
+- Gemini multimodal capabilities
+- Vertex AI media operations
+- Social media content, email marketing, or PMax campaigns
+
+## Integration with Other Tools
+
+**Google Cloud Services:**
+- Cloud Storage for media asset management
+- BigQuery for campaign analytics
+- Cloud Functions for automation triggers
+- Vertex AI Pipelines for content workflows
+
+**Third-Party Integrations:**
+- Social media APIs (LinkedIn, Twitter, Instagram)
+- Marketing automation platforms (HubSpot, Marketo)
+- CMS integrations (WordPress, Contentful)
+- DAM systems (Bynder, Cloudinary)
+
+## Success Metrics
+
+**Track These KPIs:**
+- Asset generation speed (baseline: 5 images/min)
+- Content approval rate (target: >80%)
+- Campaign personalization scale (target: 1000+ variants)
+- Cost per asset (target: <$0.10/image)
+- Time saved vs manual production (target: 90% reduction)
+
+---
+
+**This skill makes Jeremy a Vertex AI multimodal expert with instant access to video processing, audio generation, image creation, and marketing automation capabilities.**
--- a/skills/vertex-media-master/assets/README.md
+++ b/skills/vertex-media-master/assets/README.md
@@ -0,0 +1,26 @@
+# Skill Assets
+
+This directory contains static assets used by this skill.
+
+## Purpose
+
+Assets can include:
+- Configuration files (JSON, YAML)
+- Data files
+- Templates
+- Schemas
+- Test fixtures
+
+## Guidelines
+
+- Keep assets small and focused
+- Document asset purpose and format
+- Use standard file formats
+- Include schema validation where applicable
+
+## Common Asset Types
+
+- **config.json** - Configuration templates
+- **schema.json** - JSON schemas
+- **template.yaml** - YAML templates
+- **test-data.json** - Test fixtures
--- a/skills/vertex-media-master/references/README.md
+++ b/skills/vertex-media-master/references/README.md
@@ -0,0 +1,26 @@
+# Skill References
+
+This directory contains reference materials that enhance this skill's capabilities.
+
+## Purpose
+
+References can include:
+- Code examples
+- Style guides
+- Best practices documentation
+- Template files
+- Configuration examples
+
+## Guidelines
+
+- Keep references concise and actionable
+- Use markdown for documentation
+- Include clear examples
+- Link to external resources when appropriate
+
+## Types of References
+
+- **examples.md** - Usage examples
+- **style-guide.md** - Coding standards
+- **templates/** - Reusable templates
+- **patterns.md** - Design patterns
--- a/skills/vertex-media-master/scripts/README.md
+++ b/skills/vertex-media-master/scripts/README.md
@@ -0,0 +1,24 @@
+# Skill Scripts
+
+This directory contains optional helper scripts that support this skill's functionality.
+
+## Purpose
+
+Scripts here can be:
+- Referenced by the skill for automation
+- Used as examples for users
+- Executed during skill activation
+
+## Guidelines
+
+- All scripts should be well-documented
+- Include usage examples in comments
+- Make scripts executable (`chmod +x`)
+- Use `#!/bin/bash` or `#!/usr/bin/env python3` shebangs
+
+## Adding Scripts
+
+1. Create script file (e.g., `analyze.sh`, `process.py`)
+2. Add documentation header
+3. Make executable: `chmod +x script-name.sh`
+4. Test thoroughly before committing