2025-11-30 08:30:14 +08:00

MarkItDown Installation Guide

Prerequisites

  • Python 3.10 or higher
  • pip package manager
  • Virtual environment (recommended)

Basic Installation

pip install 'markitdown[all]'

This installs support for all file formats and features.

Install Specific Features

If you only need certain file formats, install just the matching extras:

# PDF support only
pip install 'markitdown[pdf]'

# Office documents
pip install 'markitdown[docx,pptx,xlsx]'

# Multiple formats
pip install 'markitdown[pdf,docx,pptx,xlsx,audio-transcription]'

Install from Source

git clone https://github.com/microsoft/markitdown.git
cd markitdown
pip install -e 'packages/markitdown[all]'

Optional Dependencies

| Feature | Installation | Use Case |
|---|---|---|
| All formats | pip install 'markitdown[all]' | Everything |
| PDF | pip install 'markitdown[pdf]' | PDF documents |
| Word | pip install 'markitdown[docx]' | DOCX files |
| PowerPoint | pip install 'markitdown[pptx]' | PPTX files |
| Excel (new) | pip install 'markitdown[xlsx]' | XLSX files |
| Excel (old) | pip install 'markitdown[xls]' | XLS files |
| Outlook | pip install 'markitdown[outlook]' | MSG files |
| Azure DI | pip install 'markitdown[az-doc-intel]' | Enhanced PDF |
| Audio | pip install 'markitdown[audio-transcription]' | WAV/MP3 |
| YouTube | pip install 'markitdown[youtube-transcription]' | YouTube videos |
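
To see which of the extras listed above are usable in the current environment, a small check like the following may help. It is only a sketch: the mapping from each extra to an import name is an assumption based on commonly used backing libraries, and the real dependency set may differ between markitdown releases.

```python
from importlib.util import find_spec

# ASSUMPTION: one representative import name per extra. These are the usual
# import names of the backing libraries, not a list taken from markitdown.
EXTRA_MODULES = {
    "pdf": "pdfminer",
    "docx": "mammoth",
    "pptx": "pptx",
    "xlsx": "openpyxl",
    "xls": "xlrd",
    "outlook": "olefile",
    "audio-transcription": "speech_recognition",
    "youtube-transcription": "youtube_transcript_api",
}


def available_extras(mapping=EXTRA_MODULES):
    """Return {extra: bool}, True when the extra's module can be imported."""
    return {
        extra: find_spec(module) is not None
        for extra, module in mapping.items()
    }


if __name__ == "__main__":
    for extra, ok in available_extras().items():
        print(f"{extra:24s} {'installed' if ok else 'missing'}")
```

An extra reported as missing can then be installed with the matching command from the table.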

System Dependencies

OCR Support (for scanned documents and images)

macOS

brew install tesseract

Ubuntu/Debian

sudo apt-get update
sudo apt-get install tesseract-ocr

Windows

Download from: https://github.com/UB-Mannheim/tesseract/wiki

Poppler Utils (for advanced PDF operations)

macOS

brew install poppler

Ubuntu/Debian

sudo apt-get install poppler-utils
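
After installing the system packages, it can be useful to confirm that the executables are actually on PATH. The sketch below assumes the binaries are named `tesseract` and `pdftoppm` (one of the Poppler utilities); adjust the names if your platform packages them differently.

```python
import shutil

# ASSUMPTION: executable names. `pdftoppm` stands in for the Poppler suite.
TOOLS = {"Tesseract OCR": "tesseract", "Poppler": "pdftoppm"}


def find_tools(tools=TOOLS):
    """Map each tool label to its resolved path, or None if not on PATH."""
    return {label: shutil.which(exe) for label, exe in tools.items()}


if __name__ == "__main__":
    for label, path in find_tools().items():
        print(f"{label}: {path or 'not found on PATH'}")
```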

Verification

Test your installation:

# Confirm the package imports
python -c "import markitdown; print('MarkItDown installed successfully')"

# Test basic conversion
echo "Test" > test.txt
markitdown test.txt
rm test.txt
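
The import check above confirms that the package loads but does not report which version is installed. One way to read the version from Python is the standard-library `importlib.metadata` module; the helper name `installed_version` below is illustrative only.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="markitdown"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    print(f"markitdown {found}" if found else "markitdown is not installed")
```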

Virtual Environment Setup

Using venv

# Create virtual environment
python -m venv markitdown-env

# Activate (macOS/Linux)
source markitdown-env/bin/activate

# Activate (Windows)
markitdown-env\Scripts\activate

# Install
pip install 'markitdown[all]'

Using conda

# Create environment
conda create -n markitdown python=3.12

# Activate
conda activate markitdown

# Install
pip install 'markitdown[all]'

Using uv

# Create virtual environment
uv venv --python=3.12 .venv

# Activate (macOS/Linux)
source .venv/bin/activate

# Install
uv pip install 'markitdown[all]'

AI Enhancement Setup (Optional)

For AI-powered image descriptions using OpenRouter:

OpenRouter API

OpenRouter provides unified access to multiple AI models (GPT-4, Claude, Gemini, etc.) through a single API.

# Install the OpenAI SDK (needed for AI descriptions; harmless if it is already installed)
pip install openai

# Get API key from https://openrouter.ai/keys

# Set API key
export OPENROUTER_API_KEY="sk-or-v1-..."

# Add to shell profile for persistence (replace the placeholder with your real key)
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.bashrc  # bash (default on most Linux distributions)
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc   # zsh (default on macOS)

Why OpenRouter?

  • Access to 100+ AI models through one API
  • Choose between GPT-4, Claude, Gemini, and more
  • Competitive pricing
  • No vendor lock-in
  • Simple OpenAI-compatible interface

Popular Models for Image Description:

  • anthropic/claude-sonnet-4.5 - Recommended - Best for scientific vision
  • anthropic/claude-3.5-sonnet - Excellent technical analysis
  • openai/gpt-4o - Good vision understanding
  • google/gemini-pro-vision - Cost-effective option

See https://openrouter.ai/models for the complete model list and pricing.
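
With the key exported, the OpenAI SDK can be pointed at OpenRouter and handed to MarkItDown. The following is a minimal sketch, not the skill's own implementation: the base URL `https://openrouter.ai/api/v1` is an assumption not stated in this guide, the helper names are illustrative, and it relies on MarkItDown's `llm_client` and `llm_model` constructor arguments.

```python
import os

# ASSUMPTION: OpenRouter's OpenAI-compatible endpoint; check their docs.
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def openrouter_settings(env=os.environ):
    """Build OpenAI client settings from OPENROUTER_API_KEY."""
    key = env.get("OPENROUTER_API_KEY")
    if not key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return {"base_url": OPENROUTER_BASE_URL, "api_key": key}


def build_converter(model="anthropic/claude-sonnet-4.5"):
    """Return a MarkItDown instance that describes images through OpenRouter."""
    # Imported lazily so openrouter_settings() works without these packages.
    from markitdown import MarkItDown
    from openai import OpenAI

    client = OpenAI(**openrouter_settings())
    return MarkItDown(llm_client=client, llm_model=model)


def describe_image(path, model="anthropic/claude-sonnet-4.5"):
    """Convert an image file and return the resulting Markdown text."""
    return build_converter(model).convert(path).text_content
```

Calling `describe_image("figure.png")` (a hypothetical file name) would send the image to the chosen model, so it incurs API usage.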

Azure Document Intelligence Setup (Optional)

For enhanced PDF conversion:

  1. Create an Azure Document Intelligence resource in the Azure Portal
  2. Copy the resource's endpoint and key
  3. Set environment variables:
export AZURE_DOCUMENT_INTELLIGENCE_KEY="your-key"
export AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT="https://your-endpoint.cognitiveservices.azure.com/"
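
Once the variables are set, the endpoint can be passed to MarkItDown through its `docintel_endpoint` argument. This sketch only wires up the endpoint; how the key is picked up depends on the markitdown version and its Azure credential handling, so treat that part as something to verify against the release you have installed. The helper names are illustrative.

```python
import os


def docintel_kwargs(env=os.environ):
    """Build MarkItDown keyword arguments from the endpoint variable."""
    endpoint = env.get("AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT")
    if not endpoint:
        raise RuntimeError("AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT is not set")
    return {"docintel_endpoint": endpoint}


def build_docintel_converter():
    """Return a MarkItDown instance configured for Document Intelligence."""
    # Imported lazily so docintel_kwargs() works without markitdown installed.
    from markitdown import MarkItDown

    return MarkItDown(**docintel_kwargs())
```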

Docker Installation (Alternative)

# Clone repository
git clone https://github.com/microsoft/markitdown.git
cd markitdown

# Build image
docker build -t markitdown:latest .

# Run
docker run --rm -i markitdown:latest < input.pdf > output.md

Troubleshooting

Import Error

ModuleNotFoundError: No module named 'markitdown'

Solution: Ensure you're in the correct virtual environment and markitdown is installed:

pip install 'markitdown[all]'
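
If the error persists, the interpreter being run is often not the one the package was installed into. A short diagnostic along these lines can show which Python is active and whether it can see the module; the function name `environment_report` is illustrative.

```python
import sys
from importlib.util import find_spec


def environment_report(module="markitdown"):
    """Summarize the active interpreter and whether it can find the module."""
    return {
        "python": sys.executable,
        # True inside venv/virtualenv/uv environments; conda is not detected.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "module_found": find_spec(module) is not None,
    }


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

If `module_found` is False, run `python -m pip install 'markitdown[all]'` with that same interpreter so pip and Python agree on the environment.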

Missing Feature

Error: PDF conversion not supported

Solution: Install the specific feature:

pip install 'markitdown[pdf]'

OCR Not Working

Solution: Install Tesseract OCR (see System Dependencies above)

Permission Errors

Solution: Use a virtual environment or install with the --user flag:

pip install --user 'markitdown[all]'

Upgrading

# Upgrade to latest version
pip install --upgrade 'markitdown[all]'

# Check version
pip show markitdown

Uninstallation

pip uninstall markitdown

Next Steps

After installation:

  1. Read QUICK_REFERENCE.md for basic usage
  2. See SKILL.md for the comprehensive guide
  3. Try the example scripts in the scripts/ directory
  4. Check assets/example_usage.md for practical examples

Skill Scripts Setup

To use the skill scripts:

# Navigate to the scripts directory (this path is from the author's machine; adjust it to your checkout)
cd /Users/vinayak/Documents/claude-scientific-writer/.claude/skills/markitdown/scripts

# Run the scripts with Python
python batch_convert.py --help
python convert_with_ai.py --help
python convert_literature.py --help

Testing Installation

Create a test file to verify everything works:

# test_markitdown.py
import os

from markitdown import MarkItDown


def test_basic():
    md = MarkItDown()
    # Create a simple test file
    with open("test.txt", "w", encoding="utf-8") as f:
        f.write("Hello MarkItDown!")

    try:
        # Convert it and check that the text survived the conversion
        result = md.convert("test.txt")
        assert "Hello MarkItDown!" in result.text_content
        print("✓ Basic conversion works")
        print(result.text_content)
    finally:
        # Clean up even if the conversion fails
        os.remove("test.txt")


if __name__ == "__main__":
    test_basic()

Run it:

python test_markitdown.py

Getting Help