
# MarkItDown Installation Guide

## Prerequisites

- Python 3.10 or higher
- pip package manager
- Virtual environment (recommended)
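
As a quick way to confirm the interpreter requirement before installing, the following minimal sketch compares the running Python against the 3.10 minimum listed above. The helper name `meets_minimum` is hypothetical and not part of MarkItDown.

```python
import sys

MIN_PYTHON = (3, 10)  # minimum stated in the prerequisites above


def meets_minimum(version_info=None, minimum=MIN_PYTHON):
    """Return True if a (major, minor, ...) tuple satisfies the minimum."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old for MarkItDown"
    print(f"Python {current}: {status}")
```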

## Basic Installation

### Install All Features (Recommended)

```bash
pip install 'markitdown[all]'
```

This installs support for all file formats and features.

### Install Specific Features

If you only need certain file formats, you can install specific dependencies:

```bash
# PDF support only
pip install 'markitdown[pdf]'

# Office documents
pip install 'markitdown[docx,pptx,xlsx]'

# Multiple formats
pip install 'markitdown[pdf,docx,pptx,xlsx,audio-transcription]'
```

### Install from Source

```bash
git clone https://github.com/microsoft/markitdown.git
cd markitdown
pip install -e 'packages/markitdown[all]'
```

## Optional Dependencies

| Feature | Installation | Use Case |
|---------|--------------|----------|
| All formats | `pip install 'markitdown[all]'` | Everything |
| PDF | `pip install 'markitdown[pdf]'` | PDF documents |
| Word | `pip install 'markitdown[docx]'` | DOCX files |
| PowerPoint | `pip install 'markitdown[pptx]'` | PPTX files |
| Excel (new) | `pip install 'markitdown[xlsx]'` | XLSX files |
| Excel (old) | `pip install 'markitdown[xls]'` | XLS files |
| Outlook | `pip install 'markitdown[outlook]'` | MSG files |
| Azure DI | `pip install 'markitdown[az-doc-intel]'` | Enhanced PDF |
| Audio | `pip install 'markitdown[audio-transcription]'` | WAV/MP3 |
| YouTube | `pip install 'markitdown[youtube-transcription]'` | YouTube videos |
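
To see which of the extras above are present in the current environment, one option is to probe for a module that each extra is expected to pull in. The mapping below is an assumption about which package backs each extra (for example `pdfminer` for `pdf`); check it against the upstream project metadata before relying on it. The helper name `available_extras` is hypothetical.

```python
from importlib.util import find_spec

# Assumed mapping from extra name to one top-level module it installs.
EXTRA_MODULES = {
    "pdf": "pdfminer",
    "docx": "mammoth",
    "pptx": "pptx",
    "xlsx": "openpyxl",
    "xls": "xlrd",
    "outlook": "olefile",
}


def available_extras(mapping=EXTRA_MODULES):
    """Map each extra name to True if its marker module can be located."""
    return {
        extra: find_spec(module) is not None
        for extra, module in mapping.items()
    }


if __name__ == "__main__":
    for extra, present in available_extras().items():
        print(f"{extra:8} {'installed' if present else 'missing'}")
```

A `missing` entry suggests installing the matching extra from the table.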

## System Dependencies

### OCR Support (for scanned documents and images)

#### macOS
```bash
brew install tesseract
```

#### Ubuntu/Debian
```bash
sudo apt-get update
sudo apt-get install tesseract-ocr
```

#### Windows
Download from: https://github.com/UB-Mannheim/tesseract/wiki

### Poppler Utils (for advanced PDF operations)

#### macOS
```bash
brew install poppler
```

#### Ubuntu/Debian
```bash
sudo apt-get install poppler-utils
```
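
After installing the system packages, a small check like the one below can confirm that the executables are on `PATH`. This is a sketch: the executable names are assumptions (`tesseract` for Tesseract OCR, and `pdftoppm` as one representative Poppler tool), and `find_tools` is a hypothetical helper.

```python
import shutil

# Assumed executable names for the system dependencies described above.
SYSTEM_TOOLS = ("tesseract", "pdftoppm")


def find_tools(names=SYSTEM_TOOLS):
    """Return {name: absolute path, or None if not found on PATH}."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```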

## Verification

Test your installation:

```bash
# Check that the package imports
python -c "import markitdown; print('MarkItDown installed successfully')"

# Test basic conversion
echo "Test" > test.txt
markitdown test.txt
rm test.txt
```
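
The import check above does not report a version number. If you want the installed version from Python, a minimal sketch using the standard library's `importlib.metadata` could look like this (the helper name `installed_version` is hypothetical):

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="markitdown"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"markitdown {found}" if found else "markitdown is not installed")
```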

## Virtual Environment Setup

### Using venv

```bash
# Create virtual environment
python -m venv markitdown-env

# Activate (macOS/Linux)
source markitdown-env/bin/activate

# Activate (Windows)
markitdown-env\Scripts\activate

# Install
pip install 'markitdown[all]'
```

### Using conda

```bash
# Create environment
conda create -n markitdown python=3.12

# Activate
conda activate markitdown

# Install
pip install 'markitdown[all]'
```

### Using uv

```bash
# Create virtual environment
uv venv --python=3.12 .venv

# Activate
source .venv/bin/activate

# Install
uv pip install 'markitdown[all]'
```
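
If you are unsure whether an environment is actually active, the interpreter can tell you. The sketch below assumes the usual conventions: `venv`-style environments (including those created by `uv venv`) set `sys.prefix` to a different location than `sys.base_prefix`, and `conda activate` exports `CONDA_DEFAULT_ENV`. Both helper names are hypothetical.

```python
import os
import sys


def in_virtual_env():
    """True when running inside a venv-style environment."""
    return sys.prefix != sys.base_prefix


def in_conda_env(env=os.environ):
    """True when a conda environment appears to be active."""
    return bool(env.get("CONDA_DEFAULT_ENV"))


if __name__ == "__main__":
    print(f"interpreter: {sys.executable}")
    print(f"venv active: {in_virtual_env()}")
    print(f"conda active: {in_conda_env()}")
```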

## AI Enhancement Setup (Optional)

For AI-powered image descriptions using OpenRouter:

### OpenRouter API

OpenRouter provides unified access to multiple AI models (GPT-4, Claude, Gemini, etc.) through a single API.

```bash
# Install the OpenAI SDK if it is not already present in your environment
pip install openai

# Get API key from https://openrouter.ai/keys

# Set API key
export OPENROUTER_API_KEY="sk-or-v1-..."

# Add to your shell profile for persistence (run only the line for your shell)
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.bashrc  # bash (common on Linux)
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc   # zsh (default on macOS)
```

**Why OpenRouter?**
- Access to 100+ AI models through one API
- Choose between GPT-4, Claude, Gemini, and more
- Competitive pricing
- No vendor lock-in
- Simple OpenAI-compatible interface

**Popular Models for Image Description:**
- `anthropic/claude-sonnet-4.5` - **Recommended** - Best for scientific vision
- `anthropic/claude-3.5-sonnet` - Excellent technical analysis
- `openai/gpt-4o` - Good vision understanding
- `google/gemini-pro-vision` - Cost-effective option

See https://openrouter.ai/models for complete model list and pricing.
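
A minimal sketch of how the pieces above might fit together in Python: an OpenAI-compatible client pointed at OpenRouter is handed to `MarkItDown` through its `llm_client` and `llm_model` arguments. The helper names, the default model choice, and the input file name are illustrative assumptions; the base URL is OpenRouter's OpenAI-compatible endpoint.

```python
import os

OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def openrouter_client_kwargs(env=os.environ):
    """Build keyword arguments for an OpenAI-compatible client."""
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return {"base_url": OPENROUTER_BASE_URL, "api_key": api_key}


def build_converter(model="anthropic/claude-sonnet-4.5"):
    """Return a MarkItDown instance that uses OpenRouter for image descriptions."""
    # Imported here so the helper above works without the optional packages.
    from markitdown import MarkItDown
    from openai import OpenAI

    client = OpenAI(**openrouter_client_kwargs())
    return MarkItDown(llm_client=client, llm_model=model)
```

With the key exported, `build_converter().convert("figure.png").text_content` would return the Markdown for a hypothetical image file named `figure.png`.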

## Azure Document Intelligence Setup (Optional)

For enhanced PDF conversion:

1. Create an Azure Document Intelligence resource in the Azure Portal
2. Copy the resource's endpoint and key
3. Set environment variables:

```bash
export AZURE_DOCUMENT_INTELLIGENCE_KEY="your-key"
export AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT="https://your-endpoint.cognitiveservices.azure.com/"
```
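
As a sketch of how these variables might be consumed from Python, the helper below validates that both are set and passes the endpoint to `MarkItDown` via its `docintel_endpoint` argument. How the key is picked up depends on the installed MarkItDown version, so this example only passes the endpoint; treat the credential handling as an assumption to verify against the upstream documentation. The helper names are hypothetical.

```python
import os

ENDPOINT_VAR = "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT"
KEY_VAR = "AZURE_DOCUMENT_INTELLIGENCE_KEY"


def docintel_endpoint(env=os.environ):
    """Return the endpoint after checking that both variables are set."""
    missing = [name for name in (ENDPOINT_VAR, KEY_VAR) if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return env[ENDPOINT_VAR]


def build_docintel_converter():
    """Return a MarkItDown instance configured for Document Intelligence."""
    # Imported here so the check above works without markitdown installed.
    from markitdown import MarkItDown

    return MarkItDown(docintel_endpoint=docintel_endpoint())
```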

## Docker Installation (Alternative)

```bash
# Clone repository
git clone https://github.com/microsoft/markitdown.git
cd markitdown

# Build image
docker build -t markitdown:latest .

# Run
docker run --rm -i markitdown:latest < input.pdf > output.md
```

## Troubleshooting

### Import Error
```
ModuleNotFoundError: No module named 'markitdown'
```

**Solution**: Ensure you're in the correct virtual environment and markitdown is installed:
```bash
pip install 'markitdown[all]'
```

### Missing Feature
```
Error: PDF conversion not supported
```

**Solution**: Install the specific feature:
```bash
pip install 'markitdown[pdf]'
```

### OCR Not Working

**Solution**: Install Tesseract OCR (see System Dependencies above)

### Permission Errors

**Solution**: Use a virtual environment or install with the `--user` flag:
```bash
pip install --user 'markitdown[all]'
```

## Upgrading

```bash
# Upgrade to latest version
pip install --upgrade 'markitdown[all]'

# Check version
pip show markitdown
```

## Uninstallation

```bash
pip uninstall markitdown
```

## Next Steps

After installation:
1. Read `QUICK_REFERENCE.md` for basic usage
2. See `SKILL.md` for a comprehensive guide
3. Try the example scripts in the `scripts/` directory
4. Check `assets/example_usage.md` for practical examples

## Skill Scripts Setup

To use the skill scripts:

```bash
# Navigate to the scripts directory (relative to the repository root; adjust to your checkout)
cd .claude/skills/markitdown/scripts

# Run the scripts with Python
python batch_convert.py --help
python convert_with_ai.py --help
python convert_literature.py --help
```

## Testing Installation

Create a test file to verify everything works:

```python
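# test_markitdown.py
import os

from markitdown import MarkItDown


def test_basic():
    """Convert a small text file and check that its content survives."""
    md = MarkItDown()

    # Create a simple test file
    with open("test.txt", "w", encoding="utf-8") as f:
        f.write("Hello MarkItDown!")

    try:
        # Convert it and confirm the text comes back
        result = md.convert("test.txt")
        assert "Hello MarkItDown!" in result.text_content
        print("✓ Basic conversion works")
        print(result.text_content)
    finally:
        # Clean up even if the conversion fails
        os.remove("test.txt")


if __name__ == "__main__":
    test_basic()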
```

Run it:
```bash
python test_markitdown.py
```

## Getting Help

- **Documentation**: See `SKILL.md` and `README.md`
- **GitHub Issues**: https://github.com/microsoft/markitdown/issues
- **Examples**: `assets/example_usage.md`
- **API Reference**: `references/api_reference.md`