Initial commit

2025-11-30 08:30:59 +08:00
commit 4efdca7e88
18 changed files with 1843 additions and 0 deletions
--- a/skills/youtube-thumbnail/SKILL.md
+++ b/skills/youtube-thumbnail/SKILL.md
@@ -0,0 +1,138 @@
+---
+name: youtube-thumbnail
+description: "Skill for creating and editing YouTube thumbnails that are optimized for click-through rate. Use when the user asks to create a thumbnail from scratch or edit an existing thumbnail."
+---
+
+# YouTube Thumbnail Skill
+
+This skill enables generation of high-performing YouTube thumbnails optimized for click-through rate (CTR). Thumbnails are designed to spark curiosity, complement titles, and compel viewers to click.
+
+## Thumbkit
+
+This skill uses Thumbkit, a CLI tool for generating and editing high-performing YouTube thumbnails. Thumbkit is built on top of the Gemini 2.5 Flash (NanoBanana) image generation model.
+
+Thumbkit is **required** for this skill. Assume Thumbkit has been installed as a uv tool and is available globally on the user's system. If Thumbkit is not installed, please install it before proceeding.
+
+### Testing Installation
+
+To test if Thumbkit is installed, run the following command:
+
+```bash
+thumbkit
+```
+
+If you see the help menu, Thumbkit is installed.
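+
+For scripted checks, the same test can be approximated without launching the help menu. This is a minimal sketch that only looks for an executable on `PATH`; the helper name `tool_installed` is hypothetical and not part of Thumbkit.
+
+```bash
+# Return success if the named executable is found on PATH.
+tool_installed() {
+  command -v "$1" >/dev/null 2>&1
+}
+
+if tool_installed thumbkit; then
+  echo "thumbkit found at: $(command -v thumbkit)"
+else
+  echo "thumbkit not found on PATH; install it before proceeding"
+fi
+```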
+
+### Installation
+
+```bash
+uv tool install git+https://github.com/kenneth-liao/thumbkit.git
+```
+
+### Upgrading
+
+```bash
+uv tool upgrade thumbkit
+```
+
+### Thumbkit Documentation
+
+To access the full CLI reference documentation, run the following command:
+
+```bash
+thumbkit docs
+```
+
+If not accessible through the CLI, the full documentation for Thumbkit can be found at `https://github.com/kenneth-liao/thumbkit/blob/main/thumbkit/CLI_REFERENCE.md`.
+
+**CRITICAL**: You **MUST** read the full documentation before using Thumbkit to generate thumbnails.
+
+### Thumbnail Output Directory
+
+By default, thumbnails generated by Thumbkit are saved to `./youtube/thumbnails/` in the user's current working directory. You should always save newly generated thumbnails to this directory unless otherwise specified by the user. To specify a different directory, use the `--output-dir` flag and pass the absolute path to the desired directory.
+
+## 🚨 REQUIRED READING 🚨
+
+The following documents are **MANDATORY READING**. You **MUST** read all three before generating ANY thumbnail.
+
+1. You **MUST** read the complete Thumbkit CLI reference documentation by running `thumbkit docs`.
+2. `references/design-requirements.md` - The design requirements are what enable you to generate high click-through-rate thumbnails through proven strategies.
+3. `references/prompting-guidelines.md` - Thumbnails are generated using NanoBanana, an image generation model. The prompting guidelines will enable you to get more predictable and consistent results from NanoBanana.
+
+It's a **MANDATORY REQUIREMENT** that you follow both the design requirements and prompting guidelines in order to generate high converting thumbnails. Failure to do so will result in a failed task.
+
+## Reference Images
+
+When generating or editing thumbnails, you can include reference images. Examples include, but are not limited to, base thumbnails to edit, thumbnail templates, the user's headshots, icons, logos, and images for style transfer.
+
+All reference images **MUST** be passed using absolute paths.
+
+### Using Official Logos
+
+If using company logos, use actual images by passing the absolute path to the image files instead of simply describing them. NanoBanana does not know what common company logos look like.
+
+If a company logo is not locally available, you can search for it online and download it using curl, then pass the absolute path to the downloaded image in the prompt. Save all downloaded images to `./youtube/downloads/`, creating the directory if it doesn't exist.
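+
+The download step can be sketched as a small shell helper. This is an illustrative sketch, not part of Thumbkit: the function name `download_logo` is hypothetical, and the URL in the usage comment is a placeholder that must be replaced with the real source of the logo.
+
+```bash
+# Download a logo into ./youtube/downloads/ and print its absolute path.
+# Arguments: $1 = source URL, $2 = file name to save as.
+download_logo() {
+  local url="$1" name="$2"
+  local dir="./youtube/downloads"
+  # Create the download directory if it doesn't exist
+  mkdir -p "$dir" || return 1
+  # -f: fail on HTTP errors, -sS: quiet but still report errors, -L: follow redirects
+  curl -fsSL "$url" -o "$dir/$name" || return 1
+  # Reference images must be passed as absolute paths
+  printf '%s/%s\n' "$(cd "$dir" && pwd)" "$name"
+}
+
+# Usage (placeholder URL):
+# LOGO_PATH="$(download_logo "https://example.com/logo.png" "logo.png")"
+```
+
+The printed path can then be passed as a reference image.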
+
+### Common Mistakes to Avoid
+
+- ❌ **WRONG**: "create the Claude AI logo (an orange C shape)"
+- ✓ **CORRECT**: Pass the actual logo file as a reference image
+
+- ❌ **WRONG**: "add the Python logo"
+- ✓ **CORRECT**: Use `/absolute/path/to/python-logo.png` as a reference image
+
+## Workflows
+
+### Generating Thumbnail Concepts
+
+Once you have generated an initial thumbnail concept or prompt, you **MUST** use the `Thumbnail Reviewer` agent to review the concept and provide feedback. The reviewer will provide a critique and suggest improvements. Refine the prompt before proceeding to generate the thumbnail.
+
+### Generating Thumbnails from Scratch
+
+In most cases, you will edit a base image, such as a template or a headshot, so that all or most of the original image is preserved. When the goal is a new thumbnail and preserving the original reference images is not important, you can generate one from scratch.
+
+### Editing Base Images
+
+In most cases, you will edit a base image, such as a template, headshot, or example thumbnail, so that all or most of the original image is preserved. If a user has provided a headshot but no base image to edit, use the headshot as the base image. This ensures the original headshot is used without modification. When headshots are passed as reference images rather than as the base image, they are loosely replicated, not exactly copied, so the person in the final image can look different from the original headshot.
+
+### Face Swapping / Person Replacement Best Practices
+
+Face swapping with AI image generation is unreliable. The model tends to generate new faces rather than accurately preserving reference faces.
+
+#### Recommended Approach
+
+1. Use Headshot as Base Image (BEST)
+
+When to use: When you need the person's face to be accurate.
+
+Instead of generating a thumbnail and trying to swap faces, use the headshot as the base image and build the thumbnail around it.
+
+```bash
+thumbkit edit \
+  --prompt "Create a YouTube thumbnail using the person from this headshot. Place them on the [left/right] side with [pose/gesture]. Add [background elements, text, graphics]. The person should maintain their exact facial features from the headshot." \
+  --base "/path/to/headshot.png" \
+  --ref "/path/to/style-reference.jpg" \
+  --out-dir "/path/to/output"
+```
+
+Why it works: The model preserves the base image's face more accurately than when trying to swap faces onto a different person.
+
+#### What Doesn't Work
+
+- Using headshot as reference only: When the headshot is just a reference (not base), the model loosely interprets facial features rather than copying them exactly.
+- Simple face swap prompts: Prompts like "replace the face with Kenny's face" produce inconsistent results.
+- Multiple generation attempts: Regenerating rarely improves face accuracy.
+
+#### Example: Building Thumbnail Around Headshot
+
+Good - headshot as base
+```bash
+thumbkit edit \
+  --prompt "Create a YouTube thumbnail using the person from this headshot. Place them on the left side with a thumbs up gesture. Add a blueprint-style diagram on the right showing a workflow. Add text 'this plans my videos' at the top. The person should maintain their exact facial features." \
+  --base "/Users/name/headshots/excited-face.png" \
+  --ref "/Users/name/examples/style-reference.jpg" \
+  --out-dir "/Users/name/youtube/thumbnails"
+```
+
+### Optimizing Thumbnails
+
+Because you can edit a base image with Thumbkit, you can iteratively modify and improve a previously generated thumbnail. For example, if you've generated a thumbnail but want to change the color scheme, you can pass the generated thumbnail's absolute path as the base image and ask NanoBanana to make the necessary updates.
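+
+As a rough sketch of one iteration, the command below passes an earlier thumbnail as the base image and asks for a color change only. The flags mirror the `thumbkit edit` example shown earlier in this skill and should be confirmed with `thumbkit docs`; the file paths and prompt wording are placeholders, and whether `--ref` can be omitted is an assumption.
+
+```bash
+# Placeholder paths: point these at a real generated thumbnail and output directory.
+PREVIOUS="/Users/name/youtube/thumbnails/previous-thumbnail.png"
+OUT_DIR="/Users/name/youtube/thumbnails"
+
+# Only run the edit when thumbkit and the earlier thumbnail are both available.
+if command -v thumbkit >/dev/null 2>&1 && [ -f "$PREVIOUS" ]; then
+  thumbkit edit \
+    --prompt "Using the provided image, change only the color scheme to a dark blue background with bright yellow accents. Keep everything else in the image exactly the same, preserving the original composition, text, and facial features." \
+    --base "$PREVIOUS" \
+    --out-dir "$OUT_DIR"
+else
+  echo "Skipping: thumbkit or the previous thumbnail is not available"
+fi
+```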
+
+Always review generated thumbnails to ensure they meet the complete design requirements and the original intent. If they do not, suggest improvements to the user and ask if they want you to iterate.
+
+## User Assets
+
+If the user has specified any local assets (e.g. thumbnail templates, headshots, icons, logos, etc.) in their local context, bias towards incorporating them into the thumbnail when relevant.
--- a/skills/youtube-thumbnail/references/design-requirements.md
+++ b/skills/youtube-thumbnail/references/design-requirements.md
@@ -0,0 +1,109 @@
+# YouTube Thumbnail Design Requirements
+
+## Critical Requirements (**MUST ALWAYS** Follow)
+
+### 1. **Pass The Glance Test** ⚡
+**The viewer must understand the thumbnail in 1 second or less.**
+
+- The full image must be comprehensible at a glance
+- No mental effort required to figure out what's going on
+- **Test criterion**: Would this be immediately clear when viewed at mobile size?
+- If the viewer's eye has to search or study the image, it **FAILS**
+
+### 2. **Spark Curiosity** 🎯
+**This is the #1 most important principle for clickable thumbnails.**
+
+- Create intrigue and tension in the viewer's mind
+- Make viewers feel compelled to click to resolve the curiosity
+- The thumbnail should make viewers want to know more
+- Without curiosity, other principles won't matter as much
+
+### 3. **Single Clear Focal Point** 👁️
+**The viewer's eye must be drawn to ONE point, not multiple competing elements.**
+
+- **NEVER** create thumbnails with multiple focal points
+- As soon as the eye needs to search for what to notice, it fails The Glance Test
+- One dominant element should immediately grab attention
+
+### 4. **Mobile-First Design** 📱
+**Most viewers see thumbnails small - design must work at small sizes.**
+
+- Always preview thumbnails at mobile/small size during design
+- Important details **MUST** remain visible when thumbnail is small
+- What looks good on a big monitor may fail on mobile
+- **Critical**: Don't let important details get lost at small sizes
+
+---
+
+## Text Guidelines
+
+### **NEVER:**
+- ❌ Repeat the video title in the thumbnail text (viewer already has that information)
+- ❌ Use too much text (breaks The Glance Test)
+- ❌ Use text that's too small to read on mobile devices
+
+### **ALWAYS:**
+- ✅ Use text that **complements** (not repeats) the video title
+- ✅ Ensure text is large enough to read at mobile thumbnail size
+- ✅ Keep text minimal and impactful
+- ✅ Test text readability at small sizes
+
+### **Best Practice - Short, Punchy Text:**
+- Use brief, impactful phrases that describe the video
+- Example: "10x Your Creative Production" (with visual emphasis like neon background highlights)
+- **Exception**: Slightly longer text is acceptable when there are minimal other elements and text takes up most of the space
+- Text should be descriptive and add value beyond the title
+
+---
+
+## Visual Composition
+
+### **AVOID:**
+- ❌ Clutter (multiple competing elements)
+- ❌ Images where nothing stands out
+- ❌ Complex compositions that require study to understand
+- ❌ Designs that take mental work to process
+
+### **PRIORITIZE:**
+- ✅ Clear, simple compositions
+- ✅ High contrast elements
+- ✅ Single dominant subject or element
+- ✅ Immediate visual clarity
+
+### **Performance Boosters:**
+
+#### 1. **Eye-Catching Graphics and Colors**
+- Use bold, vibrant colors that stand out
+- High contrast between elements
+- Graphics should be visually striking and attention-grabbing
+
+#### 2. **People (Especially Faces)**
+- **Faces perform exceptionally well** in thumbnails
+- Ideally feature someone from the video
+- Human faces create connection and draw attention
+- Facial expressions can convey emotion and intrigue
+
+---
+
+## Hierarchy of Importance
+
+1. **Spark Curiosity** - Without this, nothing else matters
+2. **Pass The Glance Test** - Just as important; all other principles serve this goal
+3. Single focal point, mobile optimization, and text guidelines - All support the above two
+
+---
+
+## Evaluation Checklist
+
+When evaluating or creating a thumbnail, ask:
+
+1. ✓ Can I understand this in 1 second? (Glance Test)
+2. ✓ Does this make me curious to learn more? (Curiosity)
+3. ✓ Is there ONE clear focal point? (Not multiple)
+4. ✓ Does this work at mobile size? (Mobile-first)
+5. ✓ If text is used: Does it complement (not repeat) the title?
+6. ✓ If text is used: Is it short, punchy, and readable at small sizes?
+7. ✓ Does it use eye-catching graphics and colors?
+8. ✓ Does it feature people (ideally faces from the video)?
+9. ✓ Is the composition simple and uncluttered?
+
--- a/skills/youtube-thumbnail/references/prompting-guidelines.md
+++ b/skills/youtube-thumbnail/references/prompting-guidelines.md
@@ -0,0 +1,168 @@
+# Prompting Guide and Strategies
+
+Mastering Gemini 2.5 Flash (NanoBanana) Image Generation starts with one fundamental principle:
+
+> **Describe the scene, don't just list keywords.** The model's core strength is its deep language understanding. A narrative, descriptive paragraph will almost always produce a better, more coherent image than a list of disconnected words.
+
+---
+
+## Prompts for Generating Images
+
+The following strategies will help you create effective prompts to generate exactly the images you're looking for.
+
+### 1. Photorealistic Scenes
+
+For realistic images, use photography terms. Mention camera angles, lens types, lighting, and fine details to guide the model toward a photorealistic result.
+
+**Template:**
+```
+A photorealistic [shot type] of [subject], [action or expression], set in
+[environment]. The scene is illuminated by [lighting description], creating
+a [mood] atmosphere. Captured with a [camera/lens details], emphasizing
+[key textures and details]. The image should be in a [aspect ratio] format.
+```
+
+### 2. Stylized Illustrations & Stickers
+
+To create stickers, icons, or assets, be explicit about the style and request a transparent background.
+
+**Template:**
+```
+A [style] sticker of a [subject], featuring [key characteristics] and a
+[color palette]. The design should have [line style] and [shading style].
+The background must be transparent.
+```
+
+### 3. Accurate Text in Images
+
+Gemini excels at rendering text. Be clear about the text, the font style (descriptively), and the overall design.
+
+**Template:**
+```
+Create a [image type] for [brand/concept] with the text "[text to render]"
+in a [font style]. The design should be [style description], with a
+[color scheme].
+```
+
+### 4. Product Mockups & Commercial Photography
+
+Perfect for creating clean, professional product shots for e-commerce, advertising, or branding.
+
+**Template:**
+```
+A high-resolution, studio-lit product photograph of a [product description]
+on a [background surface/description]. The lighting is a [lighting setup,
+e.g., three-point softbox setup] to [lighting purpose]. The camera angle is
+a [angle type] to showcase [specific feature]. Ultra-realistic, with sharp
+focus on [key detail]. [Aspect ratio].
+```
+
+### 5. Minimalist & Negative Space Design
+
+Excellent for creating backgrounds for websites, presentations, or marketing materials where text will be overlaid.
+
+**Template:**
+```
+A minimalist composition featuring a single [subject] positioned in the
+[bottom-right/top-left/etc.] of the frame. The background is a vast, empty
+[color] canvas, creating significant negative space. Soft, subtle lighting.
+[Aspect ratio].
+```
+
+### 6. Sequential Art (Comic Panel / Storyboard)
+
+Builds on character consistency and scene description to create panels for visual storytelling.
+
+**Template:**
+```
+A single comic book panel in a [art style] style. In the foreground,
+[character description and action]. In the background, [setting details].
+The panel has a [dialogue/caption box] with the text "[Text]". The lighting
+creates a [mood] mood. [Aspect ratio].
+```
+
+---
+
+## Prompts for Editing Images
+
+These examples show how to provide images alongside your text prompts for editing, composition, and style transfer.
+
+### 1. Adding and Removing Elements
+
+Provide an image and describe your change. The model will match the original image's style, lighting, and perspective.
+
+**Template:**
+```
+Using the provided image of [subject], please [add/remove/modify] [element]
+to/from the scene. Ensure the change is [description of how the change should
+integrate].
+```
+
+### 2. Inpainting (Semantic Masking)
+
+Conversationally define a "mask" to edit a specific part of an image while leaving the rest untouched.
+
+**Template:**
+```
+Using the provided image, change only the [specific element] to [new
+element/description]. Keep everything else in the image exactly the same,
+preserving the original style, lighting, and composition.
+```
+
+### 3. Style Transfer
+
+Provide an image and ask the model to recreate its content in a different artistic style.
+
+**Template:**
+```
+Transform the provided photograph of [subject] into the artistic style of
+[artist/art style]. Preserve the original composition but render it with
+[description of stylistic elements].
+```
+
+### 4. Advanced Composition: Combining Multiple Images
+
+Provide multiple images as context to create a new, composite scene. This is perfect for product mockups or creative collages.
+
+**Template:**
+```
+Create a new image by combining the elements from the provided images. Take
+the [element from image 1] and place it with/on the [element from image 2].
+The final image should be a [description of the final scene].
+```
+
+### 5. High-Fidelity Detail Preservation
+
+To ensure critical details (like a face or logo) are preserved during an edit, describe them in great detail along with your edit request.
+
+**Template:**
+```
+Using the provided images, place [element from image 2] onto [element from
+image 1]. Ensure that the features of [element from image 1] remain
+completely unchanged. The added element should [description of how the
+element should integrate].
+```
+
+---
+
+## Best Practices
+
+To elevate your results from good to great, incorporate these professional strategies into your workflow.
+
+### Be Hyper-Specific
+The more detail you provide, the more control you have. Instead of "fantasy armor," describe it: "ornate elven plate armor, etched with silver leaf patterns, with a high collar and pauldrons shaped like falcon wings."
+
+### Provide Context and Intent
+Explain the purpose of the image. The model's understanding of context will influence the final output. For example, "Create a logo for a high-end, minimalist skincare brand" will yield better results than just "Create a logo."
+
+### Iterate and Refine
+Don't expect a perfect image on the first try. Use the conversational nature of the model to make small changes. Follow up with prompts like, "That's great, but can you make the lighting a bit warmer?" or "Keep everything the same, but change the character's expression to be more serious."
+
+### Use Step-by-Step Instructions
+For complex scenes with many elements, break your prompt into steps. "First, create a background of a serene, misty forest at dawn. Then, in the foreground, add a moss-covered ancient stone altar. Finally, place a single, glowing sword on top of the altar."
+
+### Use "Semantic Negative Prompts"
+Instead of saying "no cars," describe the desired scene positively: "an empty, deserted street with no signs of traffic."
+
+### Control the Camera
+Use photographic and cinematic language, such as `wide-angle shot`, `macro shot`, or `low-angle perspective`, to control the composition.