2025-11-30 08:30:14 +08:00

MarkItDown Skill - Creation Summary

Overview

A comprehensive skill for using Microsoft's MarkItDown tool has been created for the Claude Scientific Writer. This skill enables conversion of 15+ file formats to Markdown, optimized for LLM processing and scientific workflows.

What Was Created

Core Documentation

  1. SKILL.md (Main skill file)

    • Complete guide to MarkItDown
    • Quick start examples
    • All supported formats
    • Advanced features (AI, Azure DI)
    • Best practices
    • Use cases and examples
  2. README.md

    • Skill overview
    • Key features
    • Quick reference
    • Integration guide
  3. QUICK_REFERENCE.md

    • Cheat sheet for common tasks
    • Quick syntax reference
    • Common commands
    • Troubleshooting tips
  4. INSTALLATION_GUIDE.md

    • Step-by-step installation
    • System dependencies
    • Virtual environment setup
    • Optional features
    • Troubleshooting

Reference Documentation

Located in references/:

  1. api_reference.md

    • Complete API documentation
    • Class and method references
    • Custom converter development
    • Plugin system
    • Error handling
    • Breaking changes guide
  2. file_formats.md

    • Detailed format-specific guides
    • 15+ supported formats
    • Format capabilities and limitations
    • Best practices per format
    • Example outputs

Utility Scripts

Located in scripts/:

  1. batch_convert.py

    • Parallel batch conversion
    • Multi-format support
    • Recursive directory search
    • Progress tracking
    • Error reporting
    • Command-line interface
  2. convert_with_ai.py

    • AI-enhanced conversions
    • Predefined prompt types (scientific, medical, data viz, etc.)
    • Custom prompt support
    • Multiple model support
    • OpenRouter integration (Claude Sonnet 4.5 default)
  3. convert_literature.py

    • Scientific literature conversion
    • Metadata extraction from filenames
    • Year-based organization
    • Automatic index generation
    • JSON catalog creation
    • Front matter support
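
The front matter and JSON catalog features listed for convert_literature.py can be illustrated with a short sketch. The script's actual output format is not shown in this summary, so the helper names `add_front_matter` and `build_catalog` and the metadata keys below are hypothetical.

```python
import json


def add_front_matter(markdown_text, metadata):
    """Prepend a simple YAML-style front matter block built from a flat dict."""
    lines = ["---"]
    for key, value in metadata.items():
        # json.dumps quotes strings and leaves numbers bare; both are valid
        # YAML scalars for flat values.
        lines.append(f"{key}: {json.dumps(value)}")
    lines.append("---")
    return "\n".join(lines) + "\n\n" + markdown_text


def build_catalog(entries):
    """Serialize a list of metadata dicts to JSON, sorted by year, then title."""
    ordered = sorted(entries, key=lambda e: (e.get("year", 0), e.get("title", "")))
    return json.dumps(ordered, indent=2)
```

Keeping the metadata as a flat dict means the same record can feed both the front matter of each converted file and the catalog entry for it.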

Assets

Located in assets/:

  1. example_usage.md
    • 20+ practical examples
    • Basic conversions
    • Scientific workflows
    • AI-enhanced processing
    • Batch operations
    • Error handling patterns
    • Integration examples

License

  • LICENSE.txt - MIT License from Microsoft

Skill Structure

.claude/skills/markitdown/
├── SKILL.md                    # Main skill documentation
├── README.md                   # Skill overview
├── QUICK_REFERENCE.md          # Quick reference guide
├── INSTALLATION_GUIDE.md       # Installation instructions
├── SKILL_SUMMARY.md           # This file
├── LICENSE.txt                 # MIT License
├── references/
│   ├── api_reference.md       # Complete API docs
│   └── file_formats.md        # Format-specific guides
├── scripts/
│   ├── batch_convert.py       # Batch conversion utility
│   ├── convert_with_ai.py     # AI-enhanced conversion
│   └── convert_literature.py  # Literature conversion
└── assets/
    └── example_usage.md       # Practical examples

Capabilities

File Format Support

  • Documents: PDF, DOCX, PPTX, XLSX, XLS, EPUB
  • Images: JPEG, PNG, GIF, WebP (with OCR)
  • Audio: WAV, MP3 (with transcription)
  • Web: HTML, YouTube URLs
  • Data: CSV, JSON, XML
  • Archives: ZIP files
  • Email: Outlook MSG files
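
When routing files by type, the categories above can be treated as a lookup table. The following is a minimal sketch, not part of MarkItDown: `FORMAT_CATEGORIES` and `categorize` are hypothetical names, and `.jpg` is included as an assumed alias for JPEG.

```python
from pathlib import Path

# Hypothetical lookup table mirroring the format list above.
FORMAT_CATEGORIES = {
    "Documents": {".pdf", ".docx", ".pptx", ".xlsx", ".xls", ".epub"},
    "Images": {".jpeg", ".jpg", ".png", ".gif", ".webp"},
    "Audio": {".wav", ".mp3"},
    "Web": {".html"},
    "Data": {".csv", ".json", ".xml"},
    "Archives": {".zip"},
    "Email": {".msg"},
}


def categorize(path):
    """Return the category name for a file path, or None if the extension is not listed."""
    suffix = Path(path).suffix.lower()
    for category, extensions in FORMAT_CATEGORIES.items():
        if suffix in extensions:
            return category
    return None
```

YouTube URLs are not files, so they are left out of the table; a caller would need to detect them separately.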

Advanced Features

  1. AI Enhancement via OpenRouter

    • Access to 100+ AI models through OpenRouter
    • Multiple preset prompts (scientific, medical, data viz)
    • Custom prompt support
    • Default: Claude Sonnet 4.5 (best for scientific vision)
    • Choose best model for each task
  2. Azure Integration

    • Azure Document Intelligence for complex PDFs
    • Enhanced layout understanding
    • Better table extraction
  3. Batch Processing

    • Parallel conversion with configurable workers
    • Recursive directory processing
    • Progress tracking and error reporting
    • Format-specific organization
  4. Scientific Workflows

    • Literature conversion with metadata
    • Automatic index generation
    • Year-based organization
    • Citation-friendly output

Integration with Scientific Writer

The skill has been added to the Scientific Writer's skill catalog:

  • Location: .claude/skills/markitdown/
  • Skill Number: #5 in Document Manipulation Skills
  • SKILLS.md: Updated with complete skill description

Usage Examples

> Convert all PDFs in the literature folder to Markdown
> Convert this PowerPoint presentation to Markdown with AI-generated descriptions
> Extract tables from this Excel file
> Transcribe this lecture recording

Scripts Usage

Batch Convert

python scripts/batch_convert.py input_dir/ output_dir/ --extensions .pdf .docx --workers 4
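
The internals of batch_convert.py are not shown in this summary. As a rough sketch of what recursive, parallel conversion with error reporting could look like, the example below uses a thread pool and takes the converter as a parameter. The function names `find_files` and `batch_convert` and the `<name>.<ext>.md` output naming are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path


def find_files(input_dir, extensions):
    """Recursively collect files whose suffix matches one of the given extensions."""
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        p for p in Path(input_dir).rglob("*")
        if p.is_file() and p.suffix.lower() in wanted
    )


def batch_convert(input_dir, output_dir, extensions, convert, workers=4):
    """Convert matching files in parallel.

    `convert` is any callable mapping a file path (str) to Markdown text.
    Returns (succeeded, failed): a list of source paths and a list of
    (source path, error message) pairs.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    succeeded, failed = [], []

    def convert_one(src):
        # Mirror the input layout; append ".md" to the full name so that
        # a.pdf and a.docx do not collide on a.md.
        dest = output_dir / src.relative_to(input_dir)
        dest = dest.with_name(dest.name + ".md")
        dest.parent.mkdir(parents=True, exist_ok=True)
        dest.write_text(convert(str(src)), encoding="utf-8")
        return dest

    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            pool.submit(convert_one, src): src
            for src in find_files(input_dir, extensions)
        }
        for future in as_completed(futures):
            src = futures[future]
            try:
                future.result()
                succeeded.append(src)
            except Exception as exc:  # record the failure and keep going
                failed.append((src, str(exc)))
    return succeeded, failed


# Possible usage with MarkItDown (requires `pip install 'markitdown[all]'`):
#   from markitdown import MarkItDown
#   ok, bad = batch_convert("input_dir", "output_dir", [".pdf", ".docx"],
#                           lambda p: MarkItDown().convert(p).text_content)
```

Creating a converter per call, as in the usage comment, is a conservative choice: whether one MarkItDown instance can be shared safely across threads is not stated here.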

AI-Enhanced Convert

export OPENROUTER_API_KEY="sk-or-v1-..."
python scripts/convert_with_ai.py paper.pdf output.md \
  --model anthropic/claude-sonnet-4.5 \
  --prompt-type scientific
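
The `--prompt-type` flag selects a predefined prompt, and a custom prompt can override it. The sketch below shows one way such a lookup might work. The preset texts, the `data_viz` key, and the helper names are all hypothetical, since the wording used by convert_with_ai.py is not reproduced in this summary.

```python
import os

# Hypothetical preset texts, for illustration only.
PROMPT_PRESETS = {
    "scientific": "Describe this figure for a scientific reader: axes, units, trends, and key values.",
    "medical": "Describe this medical image, noting modality and visible anatomical structures.",
    "data_viz": "Describe this chart: chart type, variables, and the main pattern in the data.",
}


def resolve_prompt(prompt_type="scientific", custom_prompt=None):
    """Return the custom prompt if given, otherwise the preset for prompt_type."""
    if custom_prompt:
        return custom_prompt
    try:
        return PROMPT_PRESETS[prompt_type]
    except KeyError:
        raise ValueError(
            f"Unknown prompt type {prompt_type!r}; choose from {sorted(PROMPT_PRESETS)}"
        ) from None


def require_api_key(env=None):
    """Read OPENROUTER_API_KEY from the environment, failing early with a clear message."""
    env = os.environ if env is None else env
    key = env.get("OPENROUTER_API_KEY")
    if not key:
        raise RuntimeError("Set OPENROUTER_API_KEY before running AI-enhanced conversion")
    return key
```

Checking for the key before any file is opened avoids a long conversion run that fails on its first model call.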

Literature Convert

python scripts/convert_literature.py papers/ markdown/ --organize-by-year --create-index
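
To make the `--organize-by-year` and `--create-index` options concrete, here is a minimal sketch of filename-based metadata extraction. It assumes a hypothetical `<Author>_<Year>_<Title>.pdf` naming convention. The pattern recognised by convert_literature.py is not documented in this summary, and the function names below are illustrative.

```python
import re
from pathlib import Path

# Assumed filename convention: "<Author>_<Year>_<Title>", e.g. "Smith_2023_Deep_Learning".
FILENAME_PATTERN = re.compile(
    r"^(?P<author>[^_]+)_(?P<year>(?:19|20)\d{2})_(?P<title>.+)$"
)


def parse_filename(path):
    """Extract author, year, and title from a filename; return None if it does not match."""
    match = FILENAME_PATTERN.match(Path(path).stem)
    if match is None:
        return None
    return {
        "author": match["author"],
        "year": int(match["year"]),
        "title": match["title"].replace("_", " "),
    }


def output_path(path, output_dir, organize_by_year=True):
    """Place the Markdown file under output_dir/<year>/ when the year is known."""
    meta = parse_filename(path)
    name = Path(path).stem + ".md"
    if organize_by_year and meta is not None:
        return Path(output_dir) / str(meta["year"]) / name
    return Path(output_dir) / name


def build_index(paths):
    """Return a Markdown index of the parsed papers, newest year first."""
    metas = [m for m in map(parse_filename, paths) if m is not None]
    metas.sort(key=lambda m: (-m["year"], m["author"]))
    lines = ["# Literature Index", ""]
    lines += [f"- {m['author']} ({m['year']}). {m['title']}" for m in metas]
    return "\n".join(lines)
```

Files that do not match the convention fall back to the top level of the output directory and are omitted from the index rather than raising an error.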

Key Features

  1. Token-Efficient Output: Markdown optimized for LLM processing
  2. Comprehensive Format Support: 15+ file types
  3. AI Enhancement: Detailed image descriptions via OpenAI-compatible models (OpenRouter)
  4. OCR Support: Extract text from scanned documents
  5. Audio Transcription: Speech-to-text for audio files
  6. YouTube Support: Video transcript extraction
  7. Plugin System: Extensible architecture
  8. Batch Processing: Efficient parallel conversion
  9. Error Handling: Robust error management
  10. Scientific Focus: Optimized for research workflows

Installation

# Full installation
pip install 'markitdown[all]'

# Selective installation
pip install 'markitdown[pdf,docx,pptx,xlsx]'
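
After installing, it can be useful to confirm which version is present before running the scripts. This is a small sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is illustrative.

```python
from importlib import metadata


def installed_version(package="markitdown"):
    """Return the installed version string of a distribution, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Calling `installed_version()` returns `None` when the package is missing, so a script can print an install hint instead of failing with an `ImportError`.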

Quick Start

from markitdown import MarkItDown

# Basic usage
md = MarkItDown()
result = md.convert("document.pdf")
print(result.text_content)

# With AI via OpenRouter (reads the key from OPENROUTER_API_KEY)
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["OPENROUTER_API_KEY"],
    base_url="https://openrouter.ai/api/v1"
)
md = MarkItDown(
    llm_client=client,
    llm_model="anthropic/claude-sonnet-4.5"  # or openai/gpt-4o
)
result = md.convert("presentation.pptx")
print(result.text_content)

Documentation Files

File                     Purpose                  Lines
SKILL.md                 Main documentation       400+
api_reference.md         API documentation        500+
file_formats.md          Format guides            600+
example_usage.md         Practical examples       500+
batch_convert.py         Batch conversion         200+
convert_with_ai.py       AI conversion            200+
convert_literature.py    Literature conversion    250+
QUICK_REFERENCE.md       Quick reference          300+
INSTALLATION_GUIDE.md    Installation guide       300+

Total: 3,000+ lines of documentation and code

Use Cases

  1. Literature Review: Convert research papers to Markdown for analysis
  2. Data Extraction: Extract tables from Excel/PDF for processing
  3. Presentation Processing: Convert slides with AI descriptions
  4. Document Analysis: Prepare documents for LLM consumption
  5. Lecture Transcription: Convert audio recordings to text
  6. YouTube Analysis: Extract video transcripts
  7. Archive Processing: Batch convert document collections

Next Steps

  1. Install MarkItDown: pip install 'markitdown[all]'
  2. Read QUICK_REFERENCE.md for common tasks
  3. Try example scripts in scripts/ directory
  4. Explore SKILL.md for comprehensive guide
  5. Check example_usage.md for practical examples

Success Criteria

  • Comprehensive skill documentation created
  • Complete API reference provided
  • Format-specific guides included
  • Utility scripts implemented
  • Practical examples documented
  • Installation guide created
  • Quick reference guide added
  • Integration with Scientific Writer complete
  • SKILLS.md updated
  • Scripts made executable
  • MIT License included

Skill Status

Status: Complete and Ready to Use

The MarkItDown skill is fully integrated into the Claude Scientific Writer and ready for use. All documentation, scripts, and examples are in place.