Initial commit
134
skills/paper-2-web/references/installation.md
Normal file
@@ -0,0 +1,134 @@
# Installation and Configuration

## System Requirements

### Hardware Requirements
- **GPU**: NVIDIA A6000 (48GB minimum) required for video generation with talking-head features
- **CPU**: Multi-core processor recommended for PDF processing and document conversion
- **RAM**: 16GB minimum, 32GB recommended for large papers

### Software Requirements
- **Python**: 3.11 or higher
- **LibreOffice**: Required for document format conversion (PDF to PPTX, etc.)
- **Poppler utilities**: Required for PDF processing and manipulation

## Installation Steps

### 1. Clone the Repository
```bash
git clone https://github.com/YuhangChen1/Paper2All.git
cd Paper2All
```

### 2. Install Dependencies
```bash
uv pip install -r requirements.txt
```

### 3. Install System Dependencies

**Ubuntu/Debian:**
```bash
sudo apt-get install libreoffice poppler-utils
```

**macOS:**
```bash
brew install libreoffice poppler
```

**Windows:**
- Download and install LibreOffice from https://www.libreoffice.org/
- Download and install Poppler from https://github.com/oschwartz10612/poppler-windows

## API Configuration

Create a `.env` file in the project root with the following credentials:

### Required API Keys

**Option 1: OpenAI API**
```
OPENAI_API_KEY=your_openai_api_key_here
```

**Option 2: OpenRouter API** (alternative to OpenAI)
```
OPENROUTER_API_KEY=your_openrouter_api_key_here
```

### Optional API Keys

**Google Search API** (for automatic logo discovery)
```
GOOGLE_API_KEY=your_google_api_key_here
GOOGLE_CSE_ID=your_custom_search_engine_id_here
```
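
The key requirements above can be checked before a run. The following is a minimal sketch and not part of Paper2All: `parse_env` and `check_keys` are hypothetical helper names, and the rule that a single LLM provider key is sufficient is an assumption drawn from the "Option 1 / Option 2" wording.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value  # value is kept raw so stray quotes can be detected
    return values


def check_keys(env: dict[str, str]) -> list[str]:
    """Return a list of problems; an empty list means the file looks usable."""
    problems = []
    # Assumption: one LLM provider key is enough (Option 1 or Option 2).
    if not (env.get("OPENAI_API_KEY") or env.get("OPENROUTER_API_KEY")):
        problems.append("set OPENAI_API_KEY or OPENROUTER_API_KEY")
    # Logo discovery needs both Google values, or neither.
    google = [bool(env.get("GOOGLE_API_KEY")), bool(env.get("GOOGLE_CSE_ID"))]
    if any(google) and not all(google):
        problems.append("GOOGLE_API_KEY and GOOGLE_CSE_ID must be set together")
    # Spaces or quotes around a value are a common cause of key errors.
    for key, value in env.items():
        if value != value.strip() or value != value.strip("\"'"):
            problems.append(f"{key} has surrounding spaces or quotes")
    return problems
```

A typical call would be `check_keys(parse_env(open(".env").read()))`, printing each returned problem before starting the pipeline.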

## Model Configuration

The system supports multiple LLM backends:

### Supported Models
- GPT-4 (recommended for best quality)
- GPT-4.1 (latest version)
- GPT-3.5-turbo (faster, lower cost)
- Claude models via OpenRouter
- Other OpenRouter-supported models

### Model Selection

Specify models with the `--model-choice` preset, or set `--model_name_t` (text) and `--model_name_v` (visual) individually:
- Model choice 1: GPT-4 for all components
- Model choice 2: GPT-4.1 for all components
- Custom: Specify separate models for text and visual processing
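
The selection rules above can be pictured as a small lookup. This is an illustrative sketch only: `resolve_models` is a hypothetical function, the identifier string `gpt-4` is an assumption, and treating a custom pair as taking precedence over a preset is also an assumption rather than documented behavior.

```python
# Presets from the list above; the exact identifier strings are assumptions.
PRESETS = {1: "gpt-4", 2: "gpt-4.1"}


def resolve_models(model_choice=None, model_name_t=None, model_name_v=None):
    """Return (text_model, visual_model) for a preset or a custom pair."""
    if model_name_t or model_name_v:
        # Custom selection: both the text and the visual model must be named.
        if not (model_name_t and model_name_v):
            raise ValueError("custom selection needs both text and visual models")
        return model_name_t, model_name_v
    if model_choice not in PRESETS:
        raise ValueError(f"unknown model choice: {model_choice!r}")
    # A preset uses the same model for all components.
    return PRESETS[model_choice], PRESETS[model_choice]
```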

## Verification

Test the installation:

```bash
python pipeline_all.py --help
```

If successful, you should see the help menu with all available options.

## Troubleshooting

### Common Issues

**1. LibreOffice not found**
- Ensure LibreOffice is installed and in your system PATH
- Try running `libreoffice --version` to verify (on some platforms the executable is named `soffice`)

**2. Poppler utilities not found**
- Verify installation with `pdftoppm -v`
- Add Poppler bin directory to PATH if needed

**3. GPU/CUDA errors for video generation**
- Ensure NVIDIA drivers are up to date
- Verify CUDA toolkit is installed
- Check GPU memory with `nvidia-smi`

**4. API key errors**
- Verify `.env` file is in the project root
- Check that API keys are valid and have sufficient credits
- Ensure no extra spaces or quotes around keys in `.env`
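
The PATH checks in the list above can be scripted with the standard library. This is a minimal sketch, not a Paper2All utility: `find_tool` and `missing_tools` are hypothetical helpers, and the set of executable names is an assumption based on the commands mentioned in this section.

```python
import shutil

# Executable names to look for; alternatives are tried in order.
# `soffice` is listed because LibreOffice is often exposed under that name.
REQUIRED_TOOLS = {
    "LibreOffice": ("libreoffice", "soffice"),
    "Poppler": ("pdftoppm",),
}
# Only needed for talking-head video generation.
OPTIONAL_TOOLS = {"NVIDIA driver": ("nvidia-smi",)}


def find_tool(candidates, which=shutil.which):
    """Return the path of the first candidate found on PATH, or None."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def missing_tools(tools, which=shutil.which):
    """Return the labels of tools for which no candidate is on PATH."""
    return [label for label, names in tools.items() if find_tool(names, which) is None]
```

Calling `missing_tools(REQUIRED_TOOLS)` returns an empty list when both LibreOffice and Poppler are reachable; the `which` parameter exists only so the lookup can be replaced in tests.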

## Directory Structure

After installation, organize your workspace:

```
Paper2All/
├── .env                  # API credentials
├── input/                # Place your paper files here
│   └── paper_name/       # Each paper in its own directory
│       └── main.tex      # LaTeX source or PDF
├── output/               # Generated outputs
│   └── paper_name/
│       ├── website/      # Generated website files
│       ├── video/        # Generated video files
│       └── poster/       # Generated poster files
└── ...
```
346
skills/paper-2-web/references/paper2poster.md
Normal file
@@ -0,0 +1,346 @@
# Paper2Poster: Academic Poster Generation

## Overview

Paper2Poster automatically generates professional academic posters from research papers. The system extracts key content, designs visually appealing layouts, and creates print-ready posters suitable for conferences, symposiums, and academic presentations.

## Core Capabilities

### 1. Content Extraction
- Identifies key findings and contributions
- Extracts important figures and tables
- Summarizes methodology
- Highlights results and conclusions
- Preserves citations and references

### 2. Layout Design
- Creates balanced, professional layouts
- Optimizes content density and white space
- Establishes clear visual hierarchy
- Supports multiple poster sizes
- Adapts to different content types

### 3. Visual Design
- Applies color schemes and branding
- Optimizes typography for readability
- Ensures figure quality and sizing
- Creates cohesive visual identity
- Maintains academic presentation standards

## Usage

### Basic Poster Generation

```bash
python pipeline_all.py \
  --input-dir "path/to/papers" \
  --output-dir "path/to/output" \
  --model-choice 1 \
  --generate-poster
```

### Custom Poster Dimensions

```bash
python pipeline_all.py \
  --input-dir "path/to/papers" \
  --output-dir "path/to/output" \
  --model-choice 2 \
  --generate-poster \
  --poster-width-inches 60 \
  --poster-height-inches 40
```

### Parameters

**Basic Configuration:**
- `--input-dir`: Directory containing paper files
- `--output-dir`: Directory for generated posters
- `--model-choice`: LLM model selection (1=GPT-4, 2=GPT-4.1)
- `--generate-poster`: Enable poster generation

**Poster Dimensions:**
- `--poster-width-inches`: Width in inches (default: 48)
- `--poster-height-inches`: Height in inches (default: 36)
- `--poster-orientation`: Portrait or landscape (default: landscape)
- `--poster-dpi`: Resolution in DPI (default: 300)

**Design Options:**
- `--poster-template`: Template style (default: modern)
- `--color-scheme`: Color palette selection
- `--institution-branding`: Include institution colors and logos
- `--font-family`: Typography selection

## Standard Poster Sizes

### Conference Standard Sizes
- **4' × 3'** (48" × 36"): Most common conference poster
- **5' × 4'** (60" × 48"): Large format for major conferences
- **3' × 4'** (36" × 48"): Portrait orientation for narrow spaces
- **A0** (841mm × 1189mm): International standard
- **A1** (594mm × 841mm): Compact conference poster

### Custom Sizes
The system supports any custom dimensions. Specify using:
```bash
--poster-width-inches [width] --poster-height-inches [height]
```
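
Because the dimension flags take inches while A-series sizes are defined in millimetres, a small conversion helps when targeting A0 or A1. The sketch below is illustrative arithmetic only; `mm_to_inches` and `pixel_size` are hypothetical helpers, and whether the flags accept fractional inches is not confirmed here.

```python
MM_PER_INCH = 25.4


def mm_to_inches(mm: float) -> float:
    """Convert millimetres to inches."""
    return mm / MM_PER_INCH


def pixel_size(width_in: float, height_in: float, dpi: int = 300) -> tuple[int, int]:
    """Pixel dimensions of a poster rendered at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


# A0 is 841 mm x 1189 mm, which is roughly 33.11 x 46.81 inches.
a0_width, a0_height = mm_to_inches(841), mm_to_inches(1189)
print(f"A0: {a0_width:.2f} x {a0_height:.2f} in")

# The default 48 x 36 inch poster at 300 DPI is 14400 x 10800 pixels.
print(pixel_size(48, 36))
```

The pixel count is a quick way to judge whether source figures have enough resolution for the chosen poster size.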

## Input Requirements

### Supported Input Formats
1. **LaTeX source** (preferred)
   - Main `.tex` file with complete paper
   - All figures and tables referenced
   - Compiles without errors

2. **PDF**
   - High-quality PDF with embedded fonts
   - Selectable text (not scanned)
   - High-resolution figures

### Required Content Elements
- Title and authors
- Abstract or summary
- Methodology description
- Key results
- Conclusions
- References (optional but recommended)

### Recommended Assets
- High-resolution figures (300 DPI minimum)
- Vector graphics (PDF, SVG, EPS)
- Institution logo
- Author photos (optional)
- QR codes for website/repo links

## Output Structure

```
output/paper_name/poster/
├── poster_final.pdf          # Print-ready poster
├── poster_final.png          # High-res PNG version
├── poster_preview.pdf        # Low-res preview
├── poster_source/            # Source files
│   ├── layout.pptx           # Editable PowerPoint
│   ├── layout.svg            # Vector graphics
│   └── layout.json           # Layout specification
├── assets/                   # Extracted assets
│   ├── figures/              # Poster figures
│   ├── logos/                # Institution logos
│   └── qrcodes/              # Generated QR codes
└── metadata/
    ├── design_spec.json      # Design specifications
    └── content_map.json      # Content organization
```

## Poster Layout Sections

### Standard Sections
1. **Header**
   - Title (large, prominent)
   - Authors and affiliations
   - Institution logos
   - Conference information

2. **Introduction/Background**
   - Problem statement
   - Research motivation
   - Brief literature context

3. **Methods**
   - Experimental design
   - Key procedures
   - Important parameters
   - Visual workflow diagram

4. **Results**
   - Key findings (largest section)
   - Primary figures and tables
   - Statistical summaries
   - Visual data representations

5. **Conclusions**
   - Main takeaways
   - Implications
   - Future work

6. **References & Contact**
   - Selected key references
   - Author contact information
   - QR codes for paper/website
   - Acknowledgments

## Design Templates

### Modern Template (Default)
- Clean, minimalist design
- Bold colors for headers
- Ample white space
- Modern typography
- Focus on visual hierarchy

### Academic Template
- Traditional academic styling
- Conservative color palette
- Dense information layout
- Classic serif typography
- Standard section organization

### Visual Template
- Image-focused layout
- Large figure displays
- Minimal text density
- Infographic elements
- Story-driven flow

### Technical Template
- Equation-friendly layout
- Code snippet support
- Detailed methodology sections
- Technical figure emphasis
- Engineering/CS aesthetic

## Color Schemes

### Predefined Schemes
- **Institutional**: Uses institution branding colors
- **Professional**: Navy blue and gray palette
- **Vibrant**: Bold, eye-catching colors
- **Nature**: Green and earth tones
- **Tech**: Modern blue and cyan
- **Warm**: Orange and red accents
- **Cool**: Blue and purple tones

### Custom Color Schemes
Specify custom colors in configuration:
```json
{
  "primary": "#1E3A8A",
  "secondary": "#3B82F6",
  "accent": "#F59E0B",
  "background": "#FFFFFF",
  "text": "#1F2937"
}
```
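
Before committing to a custom scheme, it can be worth checking that the text color is readable against the background. The sketch below applies the WCAG 2.x contrast-ratio formula to the `text` and `background` entries of the example scheme; it is an independent check written for illustration, not the tool's own color-contrast verification, and the function names are hypothetical.

```python
def _channel(value: int) -> float:
    """Linearize one 8-bit sRGB channel (WCAG 2.x definition)."""
    c = value / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a #RRGGBB color."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _channel(r) + 0.7152 * _channel(g) + 0.0722 * _channel(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical colors) to 21.0 (black on white)."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


scheme = {"background": "#FFFFFF", "text": "#1F2937"}
ratio = contrast_ratio(scheme["text"], scheme["background"])
# WCAG AA asks for at least 4.5:1 for normal-size text.
print(f"text on background: {ratio:.2f}:1, AA pass: {ratio >= 4.5}")
```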

## Typography Options

### Font Families
- **Sans-serif** (default): Clean, modern, highly readable
- **Serif**: Traditional academic appearance
- **Mixed**: Serif for body, sans-serif for headers
- **Monospace**: For code and technical content

### Size Hierarchy
- **Title**: 72-96pt
- **Section headers**: 48-60pt
- **Subsection headers**: 36-48pt
- **Body text**: 24-32pt
- **Captions**: 18-24pt
- **References**: 16-20pt

## Quality Assurance

### Automated Checks
- **Text readability**: Minimum font size verification
- **Color contrast**: Accessibility compliance
- **Figure quality**: Resolution and clarity checks
- **Layout balance**: Content distribution analysis
- **Branding consistency**: Logo and color verification

### Manual Review Checklist
1. ☐ All figures are high resolution and clear
2. ☐ Text is readable from 3-6 feet away
3. ☐ Color scheme is professional and consistent
4. ☐ No text overlaps or layout issues
5. ☐ Institution logos are correct and high quality
6. ☐ QR codes work and link to correct URLs
7. ☐ Author information is accurate
8. ☐ Key findings are prominently displayed
9. ☐ References are properly formatted
10. ☐ File is correct size and resolution for printing

## Print Preparation

### File Specifications
- **Format**: PDF/X-1a or PDF/X-4 for professional printing
- **Resolution**: 300 DPI minimum, 600 DPI for fine details
- **Color mode**: CMYK for print (system auto-converts from RGB)
- **Bleed**: 0.125" bleed on all sides (automatically added)
- **Fonts**: All fonts embedded in PDF

### Printing Recommendations
1. **Print shop**: Use professional poster printing service
2. **Paper type**: Matte or satin finish for academic posters
3. **Backing**: Foam core or rigid backing for stability
4. **Protection**: Lamination optional but recommended
5. **Test print**: Print A4/Letter size preview first

### Budget Options
- **Standard**: $50-100 for 4'×3' poster at professional shop
- **Economy**: $20-40 for print-only (no mounting)
- **Premium**: $150-300 for high-end materials and mounting
- **DIY**: <$10 for multiple pages tiled and assembled

## Advanced Features

### QR Code Generation
Automatically generates QR codes for:
- Paper PDF or DOI
- Project website
- GitHub repository
- Data repository
- Author profiles (ORCID, Google Scholar)

### Institution Branding
When enabled:
- Extracts institution from author affiliations
- Searches for official logos (requires Google Search API)
- Applies institution color schemes
- Matches brand guidelines

### Interactive Elements (Digital Posters)
For digital display or virtual conferences:
- Clickable links and references
- Embedded videos in figures
- Interactive data visualizations
- Animated transitions

## Best Practices

### Content Optimization
1. **Focus on key findings**: The poster should tell its story at a glance
2. **Limit text**: Use bullet points, avoid paragraphs
3. **Prioritize visuals**: Figures should dominate the space
4. **Clear flow**: Guide the viewer through a logical progression
5. **Highlight contributions**: Make novelty obvious

### Design Optimization
1. **Use contrast**: Ensure text is easily readable
2. **Maintain hierarchy**: Size indicates importance
3. **Balance content**: Avoid crowding any section
4. **Consistent styling**: Same fonts and colors throughout
5. **White space**: Don't fill every inch

### Figure Optimization
1. **Large enough**: Minimum 6" width for main figures
2. **High resolution**: 300 DPI minimum
3. **Clear labels**: Axis labels and legends must be readable
4. **Remove clutter**: Simplify for poster format
5. **Use captions**: Brief, informative descriptions

## Limitations

- Complex equations may need manual adjustment for readability
- Very long papers may require content prioritization
- Custom branding requires manual specification or API access
- Multi-language support limited to common languages
- 3D visualizations may lose quality in 2D poster format

## Integration with Other Components

Combine Paper2Poster with:
- **Paper2Web**: Use matching visual design and color scheme
- **Paper2Video**: Create poster walk-through video
- **AutoPR**: Generate social media graphics from poster
305
skills/paper-2-web/references/paper2video.md
Normal file
@@ -0,0 +1,305 @@
# Paper2Video: Presentation Video Generation

## Overview

Paper2Video generates presentation videos from LaTeX sources, transforming academic papers into engaging video presentations. The system processes papers through multiple specialized modules to create professional presentation videos complete with slides, narration, and optional talking-head video.

## Core Components

### 1. Slide Generation Module
- Extracts key content from paper structure
- Creates visually appealing presentation slides
- Organizes content in logical flow
- Includes figures, tables, and equations
- Optimizes text density for readability

### 2. Subtitle Generation Module
- Generates natural presentation script
- Synchronizes text with slide transitions
- Creates speaker notes and timing
- Supports multiple languages
- Optimizes for speech synthesis

### 3. Speech Synthesis Module
- Converts subtitles to natural-sounding speech
- Supports multiple voices and accents
- Controls pacing and emphasis
- Generates audio track for video
- Handles technical terminology

### 4. Cursor Movement Module
- Simulates presenter cursor movements
- Highlights key points on slides
- Guides viewer attention
- Creates natural presentation flow
- Synchronizes with narration

### 5. Talking-Head Video Generation (Optional)
- Uses Hallo2 for realistic presenter video
- Lip-syncs with generated audio
- Requires reference image or video
- GPU-intensive (NVIDIA A6000 48GB minimum)
- Creates engaging presenter presence

## Usage

### Basic Video Generation (Without Talking-Head)

```bash
python pipeline_light.py \
  --model_name_t gpt-4.1 \
  --model_name_v gpt-4.1 \
  --result_dir /path/to/output \
  --paper_latex_root /path/to/paper
```

### Full Video Generation (With Talking-Head)

```bash
python pipeline_all.py \
  --input-dir "path/to/papers" \
  --output-dir "path/to/output" \
  --model-choice 1 \
  --enable-talking-head
```

### Parameters

**Model Configuration:**
- `--model_name_t`: Model for text/subtitle generation (default: gpt-4.1)
- `--model_name_v`: Model for visual/slide generation (default: gpt-4.1)
- `--model-choice`: Preset model configuration (1=GPT-4, 2=GPT-4.1)

**Input/Output:**
- `--paper_latex_root`: Root directory of LaTeX paper source
- `--result_dir` or `--output-dir`: Output directory for generated videos
- `--input-dir`: Directory containing multiple papers to process

**Video Options:**
- `--enable-talking-head`: Enable talking-head video generation (requires GPU)
- `--video-duration`: Target video duration in seconds (default: auto-calculated)
- `--slides-per-minute`: Control presentation pacing (default: 2-3)
- `--voice`: Voice selection for speech synthesis

**Quality Settings:**
- `--video-resolution`: Output resolution (default: 1920x1080)
- `--video-fps`: Frame rate (default: 30)
- `--audio-quality`: Audio bitrate (default: 192kbps)

## Input Requirements

### LaTeX Source Structure
```
paper_directory/
├── main.tex              # Main paper file
├── sections/             # Section files (if split)
│   ├── introduction.tex
│   ├── methods.tex
│   └── results.tex
├── figures/              # Figure files
│   ├── fig1.pdf
│   ├── fig2.png
│   └── ...
├── tables/               # Table files
└── bibliography.bib      # References
```

### Required Elements
- Valid LaTeX source that compiles
- Proper section structure (abstract, introduction, methods, results, conclusion)
- High-quality figures (vector formats preferred)
- Complete bibliography
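
A quick way to catch missing inputs before a long run is to scan `main.tex` for the files it references. The sketch below is a rough, hedged check rather than a LaTeX parser: `missing_references` is a hypothetical helper, it ignores commented-out lines and `\graphicspath`, does not follow nested `\input` files, and the list of fallback extensions is an assumption.

```python
import re
from pathlib import Path

# Matches \input{...}, \include{...}, \includegraphics[opts]{...}, \bibliography{...}
REF_PATTERN = re.compile(
    r"\\(input|include|includegraphics|bibliography)\s*(?:\[[^\]]*\])?\s*\{([^}]+)\}"
)
# Extensions tried when the LaTeX source omits one.
DEFAULT_EXTS = {
    "input": [".tex"],
    "include": [".tex"],
    "includegraphics": [".pdf", ".png", ".jpg", ".jpeg", ".eps"],
    "bibliography": [".bib"],
}


def missing_references(main_tex: Path) -> list[str]:
    """List files referenced from main_tex that cannot be found relative to it."""
    root = main_tex.parent
    missing = []
    for command, target in REF_PATTERN.findall(main_tex.read_text(encoding="utf-8")):
        # \bibliography may name several comma-separated files.
        for name in target.split(","):
            name = name.strip()
            candidates = [root / name] + [root / (name + ext) for ext in DEFAULT_EXTS[command]]
            if not any(path.is_file() for path in candidates):
                missing.append(name)
    return missing
```

An empty result from `missing_references(Path("paper_directory/main.tex"))` does not prove the paper compiles, but a non-empty one points at files to add before running the pipeline.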

### Optional Elements
- Author photos for talking-head generation
- Custom slide templates
- Background music or sound effects
- Institution branding assets

## Output Structure

```
output/paper_name/video/
├── final_video.mp4           # Complete presentation video
├── slides/                   # Generated slide images
│   ├── slide_001.png
│   ├── slide_002.png
│   └── ...
├── audio/                    # Audio components
│   ├── narration.mp3         # Speech synthesis output
│   └── background.mp3        # Optional background audio
├── subtitles/                # Subtitle files
│   ├── subtitles.srt         # Standard subtitle format
│   └── subtitles.vtt         # WebVTT format
├── script/                   # Presentation script
│   ├── full_script.txt       # Complete narration text
│   └── slide_notes.json      # Slide-by-slide notes
└── metadata/                 # Video metadata
    ├── timings.json          # Slide timing information
    └── video_info.json       # Video properties
```
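
For readers who want to post-process or regenerate the subtitle file, the SRT format is simple: a running index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, and the text, with blocks separated by a blank line. The sketch below shows that format only; the cue list is made up, and the schema of `timings.json` is not documented here, so no attempt is made to read it.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(cues: list[tuple[float, float, str]]) -> str:
    """Render (start, end, text) cues as numbered SRT blocks."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    # Blocks are separated by a blank line.
    return "\n".join(blocks)


# Hypothetical cues for illustration only.
print(to_srt([
    (0.0, 4.5, "Welcome to this presentation."),
    (4.5, 9.25, "We begin with the motivation."),
]))
```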

## Video Generation Process

### Phase 1: Content Analysis
1. Parse LaTeX source structure
2. Extract key concepts and findings
3. Identify important figures and equations
4. Determine logical presentation flow

### Phase 2: Slide Creation
1. Design slide layouts based on content
2. Allocate content across appropriate number of slides
3. Incorporate figures and visual elements
4. Apply consistent styling and branding

### Phase 3: Script Generation
1. Write natural presentation narration
2. Time script sections to slides
3. Add transitions and emphasis
4. Optimize for speech synthesis

### Phase 4: Audio Production
1. Generate speech from script
2. Add emphasis and pacing
3. Include pauses for slide transitions
4. Mix with optional background audio

### Phase 5: Video Assembly
1. Combine slides with timing information
2. Synchronize audio track
3. Add cursor movements and highlights
4. Generate talking-head video (if enabled)
5. Render final video file

## Customization Options

### Presentation Style
- **Academic**: Formal, detailed, comprehensive
- **Conference**: Focused on key findings, faster pace
- **Public**: Simplified language, engaging storytelling
- **Tutorial**: Step-by-step explanation, educational focus

### Voice Configuration
Available voice options (via speech synthesis):
- Multiple languages and accents
- Male/female voice selection
- Speaking rate adjustment
- Pitch and tone customization

### Visual Themes
- Institution branding colors
- Conference template matching
- Custom backgrounds and fonts
- Dark mode presentations

## Quality Assessment

### Content Quality Metrics
- **Completeness**: Coverage of paper content
- **Clarity**: Explanation quality and coherence
- **Flow**: Logical progression of ideas
- **Engagement**: Visual appeal and pacing

### Technical Quality Metrics
- **Audio quality**: Speech clarity and naturalness
- **Video quality**: Resolution and encoding
- **Synchronization**: Audio-visual alignment
- **Timing**: Appropriate slide duration

## Advanced Features

### Multi-Language Support
- Generate presentations in multiple languages
- Automatic translation of script
- Language-appropriate voice selection
- Cultural adaptation of presentation style

### Talking-Head Generation with Hallo2
Requires:
- NVIDIA A6000 GPU (48GB minimum)
- Reference image or short video of presenter
- Additional processing time (2-3x longer)

Benefits:
- More engaging presentation
- Professional presenter appearance
- Natural gestures and expressions
- Lip-sync accuracy

### Interactive Elements
- Embedded clickable links
- Navigation menu
- Chapter markers
- Supplementary material links

## Best Practices

### Input Preparation
1. **Clean LaTeX source**: Remove unnecessary comments and artifacts
2. **High-quality figures**: Use vector formats when possible
3. **Clear structure**: Well-organized sections and subsections
4. **Complete content**: Include all necessary files and references

### Model Selection
- **Text generation (model_name_t)**: GPT-4.1 for best script quality
- **Visual generation (model_name_v)**: GPT-4.1 for optimal slide design
- For faster processing with acceptable quality: GPT-3.5-turbo

### Video Optimization
1. **Target duration**: 10-15 minutes for conference talks, 30-45 minutes for detailed presentations
2. **Pacing**: 2-3 slides per minute for technical content
3. **Resolution**: 1920x1080 for standard, 3840x2160 for high-quality
4. **Audio**: 192kbps minimum for clear speech
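
The duration and pacing guidelines above imply a slide budget, which is easy to work out ahead of time. The helper below is plain arithmetic written for illustration; `slide_range` is a hypothetical name and is not one of the pipeline's options.

```python
def slide_range(duration_minutes: float, slides_per_minute=(2, 3)) -> tuple[int, int]:
    """Return the (low, high) slide count for a talk of the given length."""
    low, high = slides_per_minute
    return round(duration_minutes * low), round(duration_minutes * high)


# A 10-15 minute conference talk at 2-3 slides per minute:
print(slide_range(10))  # (20, 30)
print(slide_range(15))  # (30, 45)
```

If the paper cannot fill that many slides, lowering the pacing or the target duration is likely a better fit than padding the deck.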

### Quality Review
Before finalizing:
1. Watch entire video for content accuracy
2. Check audio synchronization with slides
3. Verify figure quality and readability
4. Test subtitle accuracy and timing
5. Review cursor movements for natural flow

## Performance Considerations

### Processing Time
- **Without talking-head**: 10-30 minutes per paper (depending on length)
- **With talking-head**: 30-120 minutes per paper
- **Factors**: Paper length, figure count, model speed, GPU availability

### Resource Requirements
- **CPU**: Multi-core recommended for parallel processing
- **RAM**: 16GB minimum, 32GB for large papers
- **GPU**: Optional for standard videos, required for talking-head generation (A6000 48GB)
- **Storage**: 1-5GB per video depending on length and quality

## Troubleshooting

### Common Issues

**1. LaTeX parsing errors**
- Ensure LaTeX source compiles successfully
- Check for special packages or custom commands
- Verify all referenced files are present

**2. Speech synthesis problems**
- Check audio quality settings
- Verify text is properly formatted
- Test with different voice options

**3. Video rendering failures**
- Check available disk space
- Verify all dependencies are installed
- Review error logs for specific issues

**4. Talking-head generation errors**
- Confirm GPU memory (48GB required)
- Check CUDA drivers are up to date
- Verify reference image quality and format
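
The GPU memory requirement can be confirmed programmatically by querying `nvidia-smi`. This is a minimal sketch under stated assumptions: the helper names are hypothetical, and the 48,000 MiB threshold is an assumption chosen because cards sold as 48 GB typically report slightly less than 49,152 MiB of usable memory.

```python
import subprocess

# Assumed threshold in MiB; leaves margin below the nominal 48 * 1024.
REQUIRED_MIB = 48_000


def parse_memory_mib(output: str) -> list[int]:
    """Parse one total-memory value (MiB) per GPU from nvidia-smi CSV output."""
    return [int(line.strip()) for line in output.splitlines() if line.strip()]


def query_gpu_memory() -> list[int]:
    """Ask nvidia-smi for the total memory of each GPU, in MiB."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_memory_mib(result.stdout)


def has_enough_memory(memory_mib: list[int], required: int = REQUIRED_MIB) -> bool:
    """True if at least one GPU reports the required amount of memory."""
    return any(mib >= required for mib in memory_mib)
```

On a machine with an NVIDIA driver installed, `has_enough_memory(query_gpu_memory())` gives a quick yes/no answer before enabling `--enable-talking-head`; without a driver, `query_gpu_memory` raises an exception, which is itself a useful signal.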

## Integration with Other Components

Combine Paper2Video with:
- **Paper2Web**: Embed video in generated website
- **Paper2Poster**: Use matching visual style
- **AutoPR**: Create promotional clips from full video
187
skills/paper-2-web/references/paper2web.md
Normal file
@@ -0,0 +1,187 @@
# Paper2Web: Academic Homepage Generation

## Overview

Paper2Web converts academic papers into interactive, explorable academic homepages. Unlike traditional approaches (direct generation, template-based, or HTML conversion), Paper2Web creates layout-aware, interactive websites through an iterative refinement process.

## Core Capabilities

### 1. Layout-Aware Generation
- Analyzes paper structure and content organization
- Creates responsive, multi-section layouts
- Adapts design based on paper type (research article, review, preprint, etc.)

### 2. Interactive Elements
- Expandable sections for detailed content
- Interactive figures and tables
- Embedded citations and references
- Navigation menu for easy browsing
- Mobile-responsive design

### 3. Content Refinement
The system uses an iterative pipeline:
1. Initial content extraction and structuring
2. Layout generation with visual hierarchy
3. Interactive element integration
4. Aesthetic refinement
5. Quality assessment and validation

## Usage

### Basic Website Generation

```bash
python pipeline_all.py \
  --input-dir "path/to/papers" \
  --output-dir "path/to/output" \
  --model-choice 1
```

### Parameters

- `--input-dir`: Directory containing paper files (PDF or LaTeX)
- `--output-dir`: Directory for generated website files
- `--model-choice`: LLM model selection (1=GPT-4, 2=GPT-4.1)
- `--enable-logo-search`: Use Google Search API to find institution logos (optional)

### Input Format Requirements

**Supported Input Formats:**
1. **LaTeX source** (preferred for best results)
   - Main file: `main.tex`
   - Include all referenced figures, tables, and bibliography files
   - Organize in a single directory per paper

2. **PDF files**
   - High-quality PDF with selectable text
   - Embedded figures should be high resolution
   - Proper section headers and structure

**Directory Structure:**
```
input/
└── paper_name/
    ├── main.tex              # LaTeX source
    ├── bibliography.bib      # References
    ├── figures/              # Figure files
    │   ├── fig1.png
    │   └── fig2.pdf
    └── tables/               # Table files
```
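
Given the one-directory-per-paper layout above, a short script can list what the pipeline would find under `--input-dir`. This is a sketch for illustration, not the pipeline's own discovery logic: `classify_paper` and `discover_papers` are hypothetical helpers, and preferring `main.tex` over a PDF when both exist is an assumption based on LaTeX being the preferred input.

```python
from pathlib import Path


def classify_paper(paper_dir: Path) -> str:
    """Return 'latex', 'pdf', or 'unknown' for one paper directory."""
    if (paper_dir / "main.tex").is_file():
        return "latex"  # assumed to win when both LaTeX and PDF are present
    if any(paper_dir.glob("*.pdf")):
        return "pdf"  # only top-level PDFs count; figures/*.pdf are ignored
    return "unknown"


def discover_papers(input_dir: Path) -> dict[str, str]:
    """Map each sub-directory of input_dir to its detected input format."""
    return {
        child.name: classify_paper(child)
        for child in sorted(input_dir.iterdir())
        if child.is_dir()
    }
```

For example, `discover_papers(Path("input"))` returns a mapping such as `{"paper_name": "latex"}`; directories reported as `unknown` are worth fixing before a batch run.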

## Output Structure

Generated websites include:

```
output/paper_name/website/
├── index.html            # Main webpage
├── styles.css            # Styling
├── script.js             # Interactive features
├── assets/               # Images and media
│   ├── figures/
│   └── logos/
└── data/                 # Structured data (optional)
```

## Customization Options

### Visual Design
The generated websites automatically include:
- Professional color schemes based on paper content
- Typography optimized for readability
- Consistent spacing and visual hierarchy
- Dark mode support (optional)

### Content Sections
Standard sections include:
- Abstract
- Key findings/contributions
- Methodology overview
- Results and visualizations
- Discussion and implications
- References and citations
- Author information and affiliations

Additional sections are automatically added based on paper content:
- Code repositories
- Dataset links
- Supplementary materials
- Related publications
|
||||
|
||||
## Quality Assessment

Paper2Web includes built-in evaluation:

### Aesthetic Metrics
- Layout balance and spacing
- Color harmony
- Typography consistency
- Visual hierarchy effectiveness

### Informativeness Metrics
- Content completeness
- Key finding clarity
- Method explanation adequacy
- Results presentation quality

### Technical Metrics
- Page load time
- Mobile responsiveness
- Browser compatibility
- Accessibility compliance

## Advanced Features

### Logo Discovery
When enabled with the Google Search API:
- Automatically finds institution logos
- Matches author affiliations
- Downloads and optimizes logo images
- Integrates into website header

### Citation Integration
- Interactive reference list
- Hover previews for citations
- Links to DOI and external sources
- Citation count tracking (if available)

### Figure Enhancement
- High-resolution figure rendering
- Zoom and pan functionality
- Caption and description integration
- Multi-panel figure navigation

## Best Practices

### Input Preparation
1. **Use LaTeX when possible**: It provides the best structure extraction
2. **Include all assets**: Figures, tables, and bibliography files
3. **Clean formatting**: Remove compilation artifacts and temporary files
4. **High-quality figures**: Use vector formats (PDF, SVG) when available

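The "clean formatting" step can be partly automated. The sketch below only lists candidate files so they can be reviewed before deletion; `find_latex_artifacts` is a hypothetical helper, and the suffix set is an assumption about which files count as compilation artifacts (`.bbl` is deliberately left out because some papers need it for their references).

```python
from pathlib import Path

# Assumption: typical LaTeX build by-products; adjust for your toolchain.
ARTIFACT_SUFFIXES = {".aux", ".log", ".out", ".toc", ".blg", ".fls", ".fdb_latexmk"}


def find_latex_artifacts(paper_dir: str) -> list[Path]:
    """Return build by-products under paper_dir, sorted by path."""
    root = Path(paper_dir)
    return sorted(
        p for p in root.rglob("*")
        if p.is_file() and p.suffix in ARTIFACT_SUFFIXES
    )
```

Calling `p.unlink()` on each returned path removes the file once the list has been checked.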
### Model Selection
- **GPT-4**: Best balance of quality and cost
- **GPT-4.1**: Latest features, higher cost
- **GPT-3.5-turbo**: Faster processing, acceptable for simple papers

### Output Optimization
1. Review generated content for accuracy
2. Check that all figures render correctly
3. Test that interactive elements work
4. Verify mobile responsiveness
5. Validate external links

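Two of these checks, figure rendering and link validation, can be started from a static scan of the generated page. The sketch below uses only the Python standard library; `audit_page` is a hypothetical helper, and it assumes figures are referenced with `<img src=...>` and links with `<a href=...>`, which may not match the generated markup exactly.

```python
from html.parser import HTMLParser
from pathlib import Path

REMOTE_PREFIXES = ("http://", "https://", "//", "data:")


class _RefCollector(HTMLParser):
    """Collect image sources and link targets from an HTML document."""

    def __init__(self):
        super().__init__()
        self.images = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "img" and attr.get("src"):
            self.images.append(attr["src"])
        elif tag == "a" and attr.get("href"):
            self.links.append(attr["href"])


def audit_page(index_html: str) -> tuple[list[str], list[str]]:
    """Return (local images missing on disk, external links to check by hand)."""
    page = Path(index_html)
    collector = _RefCollector()
    collector.feed(page.read_text(encoding="utf-8"))
    missing = [
        src for src in collector.images
        if not src.startswith(REMOTE_PREFIXES) and not (page.parent / src).is_file()
    ]
    external = [h for h in collector.links if h.startswith(("http://", "https://"))]
    return missing, external
```

The external links are only listed, not fetched, so the scan works offline; each one still needs to be opened or requested to confirm it resolves.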
## Limitations

- Complex mathematical equations may require manual review
- Multi-column layouts in PDF may affect extraction quality
- Large papers (>50 pages) may require extended processing time
- Some specialized figure types may need manual adjustment

## Integration with Other Components

Paper2Web can be combined with:
- **Paper2Video**: Generate a companion video for the website
- **Paper2Poster**: Create a matching poster design
- **AutoPR**: Generate promotional content linking to the website
436
skills/paper-2-web/references/usage_examples.md
Normal file
@@ -0,0 +1,436 @@
# Usage Examples and Workflows

## Complete Workflow Examples

### Example 1: Conference Presentation Package

**Scenario**: Preparing for a major conference presentation with website, poster, and video.

**User Request**: "I need to create a complete presentation package for my NeurIPS paper submission. Generate a website, poster, and video presentation."

**Workflow**:

```bash
# Step 1: Organize paper files
mkdir -p input/neurips2025_paper
cp main.tex input/neurips2025_paper/
cp -r figures/ input/neurips2025_paper/
cp -r tables/ input/neurips2025_paper/
cp bibliography.bib input/neurips2025_paper/

# Step 2: Generate all components
python pipeline_all.py \
  --input-dir input/neurips2025_paper \
  --output-dir output/ \
  --model-choice 1 \
  --generate-website \
  --generate-poster \
  --generate-video \
  --poster-width-inches 48 \
  --poster-height-inches 36 \
  --enable-logo-search

# Step 3: Review outputs
ls -R output/neurips2025_paper/
# - website/index.html
# - poster/poster_final.pdf
# - video/final_video.mp4
```

**Output**:
- Interactive website showcasing research
- 4'×3' conference poster (print-ready)
- 12-minute presentation video
- Processing time: ~45 minutes (without talking-head)

---

### Example 2: Quick Website for Preprint

**Scenario**: Creating an explorable homepage for a bioRxiv preprint.

**User Request**: "Convert my genomics preprint to an interactive website to accompany the bioRxiv submission."

**Workflow**:

```bash
# Using PDF input (LaTeX not available)
python pipeline_all.py \
  --input-dir papers/genomics_preprint/ \
  --output-dir output/genomics_web/ \
  --model-choice 1 \
  --generate-website

# Deploy to GitHub Pages or personal server
cd output/genomics_web/website/
# Add link to bioRxiv paper, data repositories, code
# Upload to hosting service
```

**Tips**:
- Include links to bioRxiv DOI
- Add GitHub repository links
- Include data availability section
- Embed interactive visualizations if possible

---

### Example 3: Video Abstract for Journal Submission

**Scenario**: Creating a video abstract for a journal that encourages multimedia submissions.

**User Request**: "Generate a 5-minute video abstract for my Nature Communications submission."

**Workflow**:

```bash
# Generate concise video focusing on key findings
python pipeline_light.py \
  --model_name_t gpt-4.1 \
  --model_name_v gpt-4.1 \
  --result_dir output/video_abstract/ \
  --paper_latex_root papers/nature_comms/ \
  --video-duration 300 \
  --slides-per-minute 3

# Optional: Add custom intro/outro slides
# Optional: Include talking-head for introduction
```

**Output**:
- 5-minute video abstract
- Focus on visual results
- Clear, accessible narration
- Journal-ready format

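As a rough planning aid, the two pacing options in the command above imply a slide count and a per-slide narration budget. The sketch below assumes that `--slides-per-minute` is applied uniformly across `--video-duration` seconds, which this document does not confirm; `slide_budget` is a hypothetical helper, not part of the pipeline.

```python
def slide_budget(duration_s: float, slides_per_minute: float) -> tuple[int, float]:
    """Return (slide count, seconds per slide) for a target video length."""
    # Assumption: slides are spread evenly over the whole duration.
    slides = max(1, round(duration_s / 60 * slides_per_minute))
    return slides, duration_s / slides


print(slide_budget(300, 3))  # (15, 20.0)
```

Under that assumption a 5-minute abstract has room for about 15 slides at 20 seconds each, which helps when deciding which figures to keep.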
---

### Example 4: Multi-Paper Website Generation

**Scenario**: Creating websites for multiple papers from a research group.

**User Request**: "Generate websites for all 5 papers our lab published this year."

**Workflow**:

```bash
# Organize papers
mkdir -p batch_input/
# Create subdirectories: paper1/, paper2/, paper3/, paper4/, paper5/
# Each with their LaTeX sources

# Batch process
python pipeline_all.py \
  --input-dir batch_input/ \
  --output-dir batch_output/ \
  --model-choice 1 \
  --generate-website \
  --enable-logo-search

# Creates:
# batch_output/paper1/website/
# batch_output/paper2/website/
# batch_output/paper3/website/
# batch_output/paper4/website/
# batch_output/paper5/website/
```

**Best Practice**:
- Use consistent naming conventions
- Process overnight for large batches
- Review each website for accuracy
- Deploy to a unified lab website

---

### Example 5: Poster for Virtual Conference

**Scenario**: Creating a digital poster for a virtual conference with interactive elements.

**User Request**: "Create a poster for the virtual ISMB conference with clickable links to code and data."

**Workflow**:

```bash
# Generate poster with QR codes and links
python pipeline_all.py \
  --input-dir papers/ismb_submission/ \
  --output-dir output/ismb_poster/ \
  --model-choice 1 \
  --generate-poster \
  --poster-width-inches 48 \
  --poster-height-inches 36 \
  --enable-qr-codes

# Manually add QR codes to:
# - GitHub repository
# - Interactive results dashboard
# - Supplementary data
# - Video presentation
```

**Digital Enhancements**:
- PDF with embedded hyperlinks
- High-resolution PNG for virtual platform
- Separate PDF with video links for download

---

### Example 6: Promotional Video Clip

**Scenario**: Creating a short promotional video for social media.

**User Request**: "Generate a 2-minute highlight video of our Cell paper for Twitter."

**Workflow**:

```bash
# Generate short, engaging video
python pipeline_light.py \
  --model_name_t gpt-4.1 \
  --model_name_v gpt-4.1 \
  --result_dir output/promo_video/ \
  --paper_latex_root papers/cell_paper/ \
  --video-duration 120 \
  --presentation-style public

# Post-process:
# - Extract key 30-second clip for Twitter
# - Add captions for sound-off viewing
# - Optimize file size for social media
```

**Social Media Optimization**:
- Square format (1:1) for Instagram
- Horizontal format (16:9) for Twitter/LinkedIn
- Vertical format (9:16) for TikTok/Stories
- Add text overlays for key findings

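Re-framing one rendered video for the aspect ratios listed above comes down to a centered crop. The sketch below computes the crop rectangle; `center_crop` is a hypothetical helper, the 1920×1080 source size is an assumption about the rendered video, and the even-number rounding reflects a common encoder constraint.

```python
def center_crop(width: int, height: int, ratio_w: int, ratio_h: int) -> tuple[int, int, int, int]:
    """Return (crop_w, crop_h, x, y) for the largest centered crop with the given ratio."""
    if width * ratio_h > height * ratio_w:
        # Source is wider than the target ratio: keep full height, trim the sides.
        crop_w, crop_h = height * ratio_w // ratio_h, height
    else:
        # Source is taller or already matches: keep full width, trim top and bottom.
        crop_w, crop_h = width, width * ratio_h // ratio_w
    # Round down to even numbers, which many video encoders expect.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    return crop_w, crop_h, (width - crop_w) // 2, (height - crop_h) // 2


print(center_crop(1920, 1080, 1, 1))   # (1080, 1080, 420, 0)
print(center_crop(1920, 1080, 9, 16))  # (606, 1080, 657, 0)
print(center_crop(1920, 1080, 16, 9))  # (1920, 1080, 0, 0)
```

The four values map directly onto a width, height, x-offset, y-offset crop specification such as the one taken by ffmpeg's `crop` filter.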
---

## Common Use Case Patterns

### Pattern 1: LaTeX Paper → Full Package

**Input**: LaTeX source with all assets
**Output**: Website + Poster + Video
**Time**: 45-90 minutes
**Best for**: Major publications, conference presentations

```bash
python pipeline_all.py \
  --input-dir [latex_dir] \
  --output-dir [output_dir] \
  --model-choice 1 \
  --generate-website \
  --generate-poster \
  --generate-video
```

---

### Pattern 2: PDF → Interactive Website

**Input**: Published PDF paper
**Output**: Explorable website
**Time**: 15-30 minutes
**Best for**: Post-publication promotion, preprint enhancement

```bash
python pipeline_all.py \
  --input-dir [pdf_dir] \
  --output-dir [output_dir] \
  --model-choice 1 \
  --generate-website
```

---

### Pattern 3: LaTeX → Conference Poster

**Input**: LaTeX paper
**Output**: Print-ready poster (custom size)
**Time**: 10-20 minutes
**Best for**: Conference poster sessions

```bash
python pipeline_all.py \
  --input-dir [latex_dir] \
  --output-dir [output_dir] \
  --model-choice 1 \
  --generate-poster \
  --poster-width-inches [width] \
  --poster-height-inches [height]
```

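Patterns 1 to 3 differ only in which flags are passed to `pipeline_all.py`, so the command line can be assembled programmatically. This is a sketch under that observation: `build_pipeline_command` is a hypothetical wrapper, and it uses only flags that appear in this document without verifying them against the script itself.

```python
import shlex
from typing import Optional


def build_pipeline_command(
    input_dir: str,
    output_dir: str,
    *,
    model_choice: int = 1,
    website: bool = False,
    poster: bool = False,
    video: bool = False,
    poster_size_inches: Optional[tuple[int, int]] = None,
    logo_search: bool = False,
) -> list[str]:
    """Build an argv list for pipeline_all.py from the flags shown in this document."""
    cmd = [
        "python", "pipeline_all.py",
        "--input-dir", input_dir,
        "--output-dir", output_dir,
        "--model-choice", str(model_choice),
    ]
    if website:
        cmd.append("--generate-website")
    if poster:
        cmd.append("--generate-poster")
        if poster_size_inches is not None:
            width, height = poster_size_inches
            cmd += ["--poster-width-inches", str(width),
                    "--poster-height-inches", str(height)]
    if video:
        cmd.append("--generate-video")
    if logo_search:
        cmd.append("--enable-logo-search")
    return cmd


# Pattern 3 with a 48 x 36 inch poster, printed as a copy-pasteable shell line.
print(shlex.join(build_pipeline_command(
    "input/paper", "output", poster=True, poster_size_inches=(48, 36))))
```

The returned list can be passed to `subprocess.run` as is, which avoids shell quoting problems with paths that contain spaces.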
---

### Pattern 4: LaTeX → Presentation Video

**Input**: LaTeX paper
**Output**: Narrated presentation video
**Time**: 20-60 minutes (without talking-head)
**Best for**: Video abstracts, online presentations, course materials

```bash
python pipeline_light.py \
  --model_name_t gpt-4.1 \
  --model_name_v gpt-4.1 \
  --result_dir [output_dir] \
  --paper_latex_root [latex_dir]
```

---

## Platform-Specific Outputs

### Twitter/X Promotional Content

The system auto-detects Twitter targeting for numeric folder names:

```bash
# Create Twitter-optimized content
# Folder name consists only of digits
mkdir -p input/001/
# System generates English promotional content
```

**Generated Output**:
- Short, engaging summary
- Key figure highlights
- Hashtag recommendations
- Thread-ready format

---

### Xiaohongshu (小红书) Content

For Chinese social media, use alphanumeric folder names:

```bash
# Create Xiaohongshu-optimized content
mkdir -p input/xhs_genomics/
# System generates Chinese promotional content
```

**Generated Output**:
- Chinese language content
- Platform-appropriate formatting
- Visual-first presentation
- Engagement optimizations

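The folder-name convention described in the two subsections above can be expressed as a small function. This is an illustration of the stated rule, not the tool's actual detection code: `detect_platform` is a hypothetical name, and reading "numeric" as "digits only" is an assumption.

```python
def detect_platform(folder_name: str) -> tuple[str, str]:
    """Return (platform, language) implied by an input folder name."""
    # Assumption: "numeric" means the name consists only of digits.
    if folder_name.isdigit():
        return ("twitter", "en")
    return ("xiaohongshu", "zh")


print(detect_platform("2510"))          # ('twitter', 'en')
print(detect_platform("xhs_genomics"))  # ('xiaohongshu', 'zh')
```

Under this reading, any name that mixes digits with letters or underscores is treated as alphanumeric, so it is worth checking folder names before a run.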
---

## Troubleshooting Common Scenarios

### Scenario: Large Paper (>50 pages)

**Challenge**: Processing time and content selection
**Solution**:
```bash
# Option 1: Focus on key sections
# Edit LaTeX to comment out less critical sections

# Option 2: Process in parts
# Generate website for overview
# Generate separate detailed videos for methods/results

# Option 3: Use faster model for initial pass
# Review and regenerate critical components with better model
```

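To find out whether a PDF falls into the large-paper case before starting a run, the page count can be read with `pdfinfo` from the Poppler utilities. The sketch below assumes `pdfinfo` is on the `PATH`; `pdf_page_count` and `parse_page_count` are hypothetical helpers, and the 50-page threshold is taken from the heading above.

```python
import re
import subprocess
from typing import Optional


def parse_page_count(pdfinfo_output: str) -> Optional[int]:
    """Extract the value of the 'Pages:' line from pdfinfo output."""
    match = re.search(r"^Pages:\s+(\d+)\s*$", pdfinfo_output, flags=re.MULTILINE)
    return int(match.group(1)) if match else None


def pdf_page_count(pdf_path: str) -> Optional[int]:
    """Run pdfinfo on a PDF and return its page count, if reported."""
    result = subprocess.run(
        ["pdfinfo", pdf_path], capture_output=True, text=True, check=True
    )
    return parse_page_count(result.stdout)


def is_large_paper(pages: int, threshold: int = 50) -> bool:
    """True when the paper exceeds the page threshold."""
    return pages > threshold
```

A paper flagged as large is a candidate for one of the options listed in the block above.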
---

### Scenario: Complex Mathematical Content

**Challenge**: Equations may not render perfectly
**Solution**:
- Use LaTeX input (not PDF) for best equation handling
- Review generated content for equation accuracy
- Manually adjust complex equations if needed
- Consider using figure screenshots for critical equations

---

### Scenario: Non-Standard Paper Structure

**Challenge**: Paper doesn't follow standard IMRAD format
**Solution**:
- Provide custom section guidance in paper metadata
- Review generated structure and adjust
- Use a more powerful model (GPT-4.1) for better adaptation
- Consider manual section annotation in LaTeX comments

---

### Scenario: Limited API Budget

**Challenge**: Reducing costs while maintaining quality
**Solution**:
```bash
# Use GPT-3.5-turbo for simple papers
# (assumes --model-choice 3 selects GPT-3.5-turbo; only 1=GPT-4 and 2=GPT-4.1 are documented)
python pipeline_all.py \
  --input-dir [paper_dir] \
  --output-dir [output_dir] \
  --model-choice 3

# Generate only needed components
# Website-only (cheapest)
# Poster-only (moderate)
# Video without talking-head (moderate)
```

---

### Scenario: Tight Deadline

**Challenge**: Need outputs quickly
**Solution**:
```bash
# Parallel processing if multiple papers
# Use faster models (GPT-3.5-turbo)
# Generate only the essential component first
# Skip optional features (logo search, talking-head)

python pipeline_light.py \
  --model_name_t gpt-3.5-turbo \
  --model_name_v gpt-3.5-turbo \
  --result_dir [output_dir] \
  --paper_latex_root [latex_dir]
```

**Priority Order**:
1. Website (fastest, most versatile)
2. Poster (moderate speed, print deadline)
3. Video (slowest, can be generated later)

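The "parallel processing" suggestion for multiple papers can be sketched with the standard library. `run_all` is a hypothetical helper; the demo commands are stand-ins rather than real pipeline calls, and whether concurrent runs are safe with respect to API rate limits is not covered by this document.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_all(commands: list[list[str]], max_workers: int = 2) -> list[int]:
    """Run each command as a subprocess and return exit codes in input order."""

    def run(cmd: list[str]) -> int:
        return subprocess.run(cmd).returncode

    # Threads are sufficient here because each worker only waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run, commands))


# Stand-in commands; replace each with a real pipeline invocation per paper.
demo = [[sys.executable, "-c", f"print('paper {i}')"] for i in range(3)]
print(run_all(demo))  # [0, 0, 0]
```

A non-zero entry in the result marks the paper whose run failed and needs to be repeated.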
---

## Quality Optimization Tips

### For Best Website Results
1. Use LaTeX input with all assets
2. Include high-resolution figures
3. Ensure paper has clear section structure
4. Enable logo search for professional appearance
5. Review and test all interactive elements

### For Best Poster Results
1. Provide high-resolution figures (300+ DPI)
2. Specify exact poster dimensions needed
3. Include institution branding information
4. Use a professional color scheme
5. Print a small test preview before printing the full poster

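The 300 DPI guideline translates into concrete pixel counts. The sketch below shows the arithmetic for a full poster and for a single figure; both function names are hypothetical and the 300 DPI default comes from the list above.

```python
def poster_pixels(width_in: float, height_in: float, dpi: int = 300) -> tuple[int, int]:
    """Pixel dimensions needed to print a poster of the given size at the given DPI."""
    return round(width_in * dpi), round(height_in * dpi)


def effective_dpi(pixel_width: int, printed_width_in: float) -> float:
    """Resolution a raster figure reaches when printed at the given width."""
    return pixel_width / printed_width_in


print(poster_pixels(48, 36))    # (14400, 10800)
print(effective_dpi(2400, 12))  # 200.0
```

In the second call, a 2400-pixel-wide figure printed 12 inches wide reaches only 200 DPI, so it would need a higher-resolution export or a smaller placement to meet the guideline.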
### For Best Video Results
1. Use LaTeX for clearest content extraction
2. Specify a target duration that suits the venue
3. Review script before video generation
4. Choose appropriate presentation style
5. Test audio quality and pacing

### For Best Overall Results
1. Start with clean, well-organized LaTeX source
2. Use GPT-4 or GPT-4.1 for highest quality
3. Review all outputs before finalizing
4. Iterate on any component that needs adjustment
5. Combine components into a cohesive presentation package