zhongwei/gh-k-dense-ai-claude-scientific-writer

Zhongwei Li 1dd5bee3b4 Initial commit

2025-11-30 08:30:14 +08:00

MarkItDown Installation Guide

Prerequisites

Python 3.10 or higher
pip package manager
Virtual environment (recommended)

Basic Installation

Install All Features (Recommended)

pip install 'markitdown[all]'

This installs support for all file formats and features.

Install Specific Features

If you only need certain file formats, you can install specific dependencies:

# PDF support only
pip install 'markitdown[pdf]'

# Office documents
pip install 'markitdown[docx,pptx,xlsx]'

# Multiple formats
pip install 'markitdown[pdf,docx,pptx,xlsx,audio-transcription]'

Install from Source

git clone https://github.com/microsoft/markitdown.git
cd markitdown
pip install -e 'packages/markitdown[all]'

Optional Dependencies

Feature	Installation	Use Case
All formats	`pip install 'markitdown[all]'`	Everything
PDF	`pip install 'markitdown[pdf]'`	PDF documents
Word	`pip install 'markitdown[docx]'`	DOCX files
PowerPoint	`pip install 'markitdown[pptx]'`	PPTX files
Excel (new)	`pip install 'markitdown[xlsx]'`	XLSX files
Excel (old)	`pip install 'markitdown[xls]'`	XLS files
Outlook	`pip install 'markitdown[outlook]'`	MSG files
Azure DI	`pip install 'markitdown[az-doc-intel]'`	Enhanced PDF
Audio	`pip install 'markitdown[audio-transcription]'`	WAV/MP3
YouTube	`pip install 'markitdown[youtube-transcription]'`	YouTube videos
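
To see which of the extras in the table are usable in the current environment, a small probe along these lines may help. It is only a sketch: the mapping from extra name to importable module is an assumption about what each extra installs and may differ between markitdown releases, and `available_extras` is a hypothetical helper, not part of markitdown.

```python
from importlib.util import find_spec

# Assumed mapping from each markitdown extra to a top-level module it is
# expected to install. Adjust it if your release bundles different packages.
EXTRA_MODULES = {
    "pdf": "pdfminer",
    "docx": "mammoth",
    "pptx": "pptx",
    "xlsx": "openpyxl",
    "xls": "xlrd",
    "outlook": "olefile",
    "audio-transcription": "speech_recognition",
    "youtube-transcription": "youtube_transcript_api",
}


def available_extras(modules=EXTRA_MODULES):
    """Return {extra: bool}, True when the extra's module can be imported."""
    return {extra: find_spec(name) is not None for extra, name in modules.items()}


# Print a short report, one extra per line.
for extra, ok in available_extras().items():
    print(f"{extra:24} {'installed' if ok else 'missing'}")
```

Any extra reported as missing can be installed with the matching command from the table.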

System Dependencies

OCR Support (for scanned documents and images)

macOS

brew install tesseract

Ubuntu/Debian

sudo apt-get update
sudo apt-get install tesseract-ocr

Windows

Download from: https://github.com/UB-Mannheim/tesseract/wiki

Poppler Utils (for advanced PDF operations)

macOS

brew install poppler

Ubuntu/Debian

sudo apt-get install poppler-utils
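
After installing the system packages, it can be useful to confirm that the binaries are actually on `PATH`. The following is a minimal sketch using only the standard library. `find_tools` is a hypothetical helper, and `pdftoppm` is used here as a representative Poppler binary.

```python
import shutil


def find_tools(tools=("tesseract", "pdftoppm")):
    """Return {tool: absolute path, or None when the tool is not on PATH}."""
    return {tool: shutil.which(tool) for tool in tools}


# Report where each tool was found, if anywhere.
for tool, path in find_tools().items():
    print(f"{tool}: {path or 'not found on PATH'}")
```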

Verification

Test your installation:

# Check that the package imports
python -c "import markitdown; print('MarkItDown installed successfully')"

# Test basic conversion
echo "Test" > test.txt
markitdown test.txt
rm test.txt
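
The import check above confirms the package loads but does not report which release is installed. As a sketch, the standard library's `importlib.metadata` can look that up without importing markitdown itself. `installed_version` is a hypothetical helper name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="markitdown"):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


found = installed_version()
print(f"markitdown {found}" if found else "markitdown is not installed")
```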

Virtual Environment Setup

Using venv

# Create virtual environment
python -m venv markitdown-env

# Activate (macOS/Linux)
source markitdown-env/bin/activate

# Activate (Windows)
markitdown-env\Scripts\activate

# Install
pip install 'markitdown[all]'

Using conda

# Create environment
conda create -n markitdown python=3.12

# Activate
conda activate markitdown

# Install
pip install 'markitdown[all]'

Using uv

# Create virtual environment
uv venv --python=3.12 .venv

# Activate
source .venv/bin/activate

# Install
uv pip install 'markitdown[all]'
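
Whichever tool created the environment, it helps to confirm that the interpreter you are running actually belongs to it before installing. The sketch below assumes that venv and uv environments can be recognised by `sys.prefix` differing from `sys.base_prefix`, and that conda exposes the active environment name through `CONDA_DEFAULT_ENV`. `active_environment` is a hypothetical helper.

```python
import os
import sys


def active_environment(env=None):
    """Describe the active environment, or return None if none is detected."""
    env = os.environ if env is None else env
    # venv and uv environments change sys.prefix but leave sys.base_prefix alone.
    if sys.prefix != sys.base_prefix:
        return f"venv: {sys.prefix}"
    # conda sets this variable on activation (assumption; also set for "base").
    conda_name = env.get("CONDA_DEFAULT_ENV")
    if conda_name:
        return f"conda: {conda_name}"
    return None


print(active_environment() or "no virtual environment detected")
```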

AI Enhancement Setup (Optional)

For AI-powered image descriptions using OpenRouter:

OpenRouter API

OpenRouter provides unified access to multiple AI models (GPT-4, Claude, Gemini, etc.) through a single API.

# Install the OpenAI SDK (required; a no-op if your markitdown release already pulled it in)
pip install openai

# Get API key from https://openrouter.ai/keys

# Set API key
export OPENROUTER_API_KEY="sk-or-v1-..."

# Add to shell profile for persistence
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.bashrc  # Linux
echo 'export OPENROUTER_API_KEY="sk-or-v1-..."' >> ~/.zshrc   # macOS

Why OpenRouter?

Access to 100+ AI models through one API
Choose between GPT-4, Claude, Gemini, and more
Competitive pricing
No vendor lock-in
Simple OpenAI-compatible interface

Popular Models for Image Description:

anthropic/claude-sonnet-4.5 - Recommended - Best for scientific vision
anthropic/claude-3.5-sonnet - Excellent technical analysis
openai/gpt-4o - Good vision understanding
google/gemini-pro-vision - Cost-effective option

See https://openrouter.ai/models for complete model list and pricing.
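
With the key exported, the OpenAI SDK can be pointed at OpenRouter and handed to MarkItDown. The sketch below makes several assumptions: the base URL `https://openrouter.ai/api/v1`, the `llm_client` and `llm_model` keywords as shown in the upstream MarkItDown README, and the helper names `openrouter_settings` and `build_converter`, which are hypothetical. The third-party imports are deferred so the settings helper can be checked on its own.

```python
import os

# Assumed OpenAI-compatible endpoint for OpenRouter.
OPENROUTER_BASE_URL = "https://openrouter.ai/api/v1"


def openrouter_settings(model="anthropic/claude-sonnet-4.5", env=None):
    """Collect the client settings, failing early if the API key is missing."""
    env = os.environ if env is None else env
    api_key = env.get("OPENROUTER_API_KEY")
    if not api_key:
        raise RuntimeError("OPENROUTER_API_KEY is not set")
    return {"base_url": OPENROUTER_BASE_URL, "api_key": api_key, "model": model}


def build_converter(model="anthropic/claude-sonnet-4.5"):
    """Return a MarkItDown instance that uses OpenRouter for image descriptions."""
    # Imported lazily so this module loads even without the packages installed.
    from markitdown import MarkItDown
    from openai import OpenAI

    settings = openrouter_settings(model)
    client = OpenAI(base_url=settings["base_url"], api_key=settings["api_key"])
    return MarkItDown(llm_client=client, llm_model=settings["model"])
```

Calling `build_converter().convert("figure.png")` would then request a description from the chosen model. `figure.png` is a placeholder file name.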

Azure Document Intelligence Setup (Optional)

For enhanced PDF conversion:

Create an Azure Document Intelligence resource in the Azure Portal
Get the endpoint and key
Set the environment variables:

export AZURE_DOCUMENT_INTELLIGENCE_KEY="your-key"
export AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT="https://your-endpoint.cognitiveservices.azure.com/"
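
Before converting, it may be worth checking that both variables are visible to Python. The sketch below validates them and then passes the endpoint through the `docintel_endpoint` keyword shown in the upstream MarkItDown README. How the key reaches the service depends on the markitdown release, so the sketch only confirms that it is set. `missing_azure_vars` and `build_docintel_converter` are hypothetical helper names.

```python
import os

REQUIRED_VARS = (
    "AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT",
    "AZURE_DOCUMENT_INTELLIGENCE_KEY",
)


def missing_azure_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


def build_docintel_converter(env=None):
    """Return a MarkItDown instance configured with the Azure endpoint."""
    env = os.environ if env is None else env
    missing = missing_azure_vars(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Imported lazily so the check above works without markitdown installed.
    from markitdown import MarkItDown

    # Only the endpoint is passed here; credential handling is release-specific.
    return MarkItDown(docintel_endpoint=env["AZURE_DOCUMENT_INTELLIGENCE_ENDPOINT"])
```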

Docker Installation (Alternative)

# Clone repository
git clone https://github.com/microsoft/markitdown.git
cd markitdown

# Build image
docker build -t markitdown:latest .

# Run
docker run --rm -i markitdown:latest < input.pdf > output.md
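
The same container call can be driven from Python when conversions are part of a larger script. This is a sketch that mirrors the shell redirection above by wiring the input and output files to the container's stdin and stdout. `docker_command` and `convert_with_docker` are hypothetical helper names, and the image tag is the one built above.

```python
import subprocess


def docker_command(image="markitdown:latest"):
    """Build the argument list equivalent to `docker run --rm -i <image>`."""
    return ["docker", "run", "--rm", "-i", image]


def convert_with_docker(src, dst, image="markitdown:latest"):
    """Pipe the file at `src` through the container and write Markdown to `dst`."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # check=True raises CalledProcessError if the container exits non-zero.
        subprocess.run(docker_command(image), stdin=fin, stdout=fout, check=True)
```

For example, `convert_with_docker("input.pdf", "output.md")` corresponds to the shell command shown above.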

Troubleshooting

Import Error

ModuleNotFoundError: No module named 'markitdown'

Solution: Ensure you're in the correct virtual environment and markitdown is installed:

pip install 'markitdown[all]'
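
If the error persists after installing, the usual cause is that `pip` and `python` belong to different environments. A quick diagnostic, sketched here with the standard library only, shows which interpreter is running and where it would load the module from. `locate_module` is a hypothetical helper name.

```python
import sys
from importlib.util import find_spec


def locate_module(name="markitdown"):
    """Return the running interpreter and the file the module would load from."""
    spec = find_spec(name)
    return {
        "interpreter": sys.executable,
        "module_path": spec.origin if spec else None,
    }


info = locate_module()
print("interpreter:", info["interpreter"])
print("markitdown: ", info["module_path"] or "not importable from this interpreter")
```

Running `python -m pip install 'markitdown[all]'` with the interpreter printed above ensures the package lands in the environment that interpreter uses.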

Missing Feature

Error: PDF conversion not supported

Solution: Install the specific feature:

pip install 'markitdown[pdf]'

OCR Not Working

Solution: Install Tesseract OCR (see System Dependencies above)

Permission Errors

Solution: Use a virtual environment, or install with the --user flag:

pip install --user 'markitdown[all]'

Upgrading

# Upgrade to latest version
pip install --upgrade 'markitdown[all]'

# Check version
pip show markitdown

Uninstallation

pip uninstall markitdown

Next Steps

After installation:

Read QUICK_REFERENCE.md for basic usage
See SKILL.md for the comprehensive guide
Try the example scripts in the scripts/ directory
Check assets/example_usage.md for practical examples

Skill Scripts Setup

To use the skill scripts:

# Navigate to the scripts directory (replace /path/to with wherever you cloned the repository)
cd /path/to/claude-scientific-writer/.claude/skills/markitdown/scripts

# Run the scripts with Python
python batch_convert.py --help
python convert_with_ai.py --help
python convert_literature.py --help

Testing Installation

Create a test file to verify everything works:

# test_markitdown.py
import os

from markitdown import MarkItDown


def test_basic():
    md = MarkItDown()
    # Create a simple test file
    with open("test.txt", "w") as f:
        f.write("Hello MarkItDown!")

    try:
        # Convert it and confirm the text comes back in the output
        result = md.convert("test.txt")
        assert "Hello MarkItDown!" in result.text_content
        print("✓ Basic conversion works")
        print(result.text_content)
    finally:
        # Clean up even if the conversion fails
        os.remove("test.txt")


if __name__ == "__main__":
    test_basic()

Run it:

python test_markitdown.py

Getting Help

Documentation: See SKILL.md and README.md
GitHub Issues: https://github.com/microsoft/markitdown/issues
Examples: assets/example_usage.md
API Reference: references/api_reference.md
