308 lines
8.6 KiB
Markdown
308 lines
8.6 KiB
Markdown
# MarkItDown Skill - Creation Summary
|
|
|
|
## Overview
|
|
|
|
A comprehensive skill for using Microsoft's MarkItDown tool has been created for the Claude Scientific Writer. This skill enables conversion of 15+ file formats to Markdown, optimized for LLM processing and scientific workflows.
|
|
|
|
## What Was Created
|
|
|
|
### Core Documentation
|
|
|
|
1. **SKILL.md** (Main skill file)
|
|
- Complete guide to MarkItDown
|
|
- Quick start examples
|
|
- All supported formats
|
|
- Advanced features (AI, Azure DI)
|
|
- Best practices
|
|
- Use cases and examples
|
|
|
|
2. **README.md**
|
|
- Skill overview
|
|
- Key features
|
|
- Quick reference
|
|
- Integration guide
|
|
|
|
3. **QUICK_REFERENCE.md**
|
|
- Cheat sheet for common tasks
|
|
- Quick syntax reference
|
|
- Common commands
|
|
- Troubleshooting tips
|
|
|
|
4. **INSTALLATION_GUIDE.md**
|
|
- Step-by-step installation
|
|
- System dependencies
|
|
- Virtual environment setup
|
|
- Optional features
|
|
- Troubleshooting
|
|
|
|
### Reference Documentation
|
|
|
|
Located in `references/`:
|
|
|
|
1. **api_reference.md**
|
|
- Complete API documentation
|
|
- Class and method references
|
|
- Custom converter development
|
|
- Plugin system
|
|
- Error handling
|
|
- Breaking changes guide
|
|
|
|
2. **file_formats.md**
|
|
- Detailed format-specific guides
|
|
- 15+ supported formats
|
|
- Format capabilities and limitations
|
|
- Best practices per format
|
|
- Example outputs
|
|
|
|
### Utility Scripts
|
|
|
|
Located in `scripts/`:
|
|
|
|
1. **batch_convert.py**
|
|
- Parallel batch conversion
|
|
- Multi-format support
|
|
- Recursive directory search
|
|
- Progress tracking
|
|
- Error reporting
|
|
- Command-line interface
|
|
|
|
2. **convert_with_ai.py**
|
|
- AI-enhanced conversions
|
|
- Predefined prompt types (scientific, medical, data viz, etc.)
|
|
- Custom prompt support
|
|
- Multiple model support
|
|
- OpenRouter integration (Claude Sonnet 4.5 default)
|
|
|
|
3. **convert_literature.py**
|
|
- Scientific literature conversion
|
|
- Metadata extraction from filenames
|
|
- Year-based organization
|
|
- Automatic index generation
|
|
- JSON catalog creation
|
|
- Front matter support
|
|
|
|
### Assets
|
|
|
|
Located in `assets/`:
|
|
|
|
1. **example_usage.md**
|
|
- 20+ practical examples
|
|
- Basic conversions
|
|
- Scientific workflows
|
|
- AI-enhanced processing
|
|
- Batch operations
|
|
- Error handling patterns
|
|
- Integration examples
|
|
|
|
### License
|
|
|
|
- **LICENSE.txt** - MIT License from Microsoft
|
|
|
|
## Skill Structure
|
|
|
|
```
|
|
.claude/skills/markitdown/
|
|
├── SKILL.md # Main skill documentation
|
|
├── README.md # Skill overview
|
|
├── QUICK_REFERENCE.md # Quick reference guide
|
|
├── INSTALLATION_GUIDE.md # Installation instructions
|
|
├── SKILL_SUMMARY.md # This file
|
|
├── LICENSE.txt # MIT License
|
|
├── references/
|
|
│ ├── api_reference.md # Complete API docs
|
|
│ └── file_formats.md # Format-specific guides
|
|
├── scripts/
|
|
│ ├── batch_convert.py # Batch conversion utility
|
|
│ ├── convert_with_ai.py # AI-enhanced conversion
|
|
│ └── convert_literature.py # Literature conversion
|
|
└── assets/
|
|
└── example_usage.md # Practical examples
|
|
```
|
|
|
|
## Capabilities
|
|
|
|
### File Format Support
|
|
|
|
- **Documents**: PDF, DOCX, PPTX, XLSX, XLS, EPUB
|
|
- **Images**: JPEG, PNG, GIF, WebP (with OCR)
|
|
- **Audio**: WAV, MP3 (with transcription)
|
|
- **Web**: HTML, YouTube URLs
|
|
- **Data**: CSV, JSON, XML
|
|
- **Archives**: ZIP files
|
|
- **Email**: Outlook MSG files
|
|
|
|
### Advanced Features
|
|
|
|
1. **AI Enhancement via OpenRouter**
|
|
- Access to 100+ AI models through OpenRouter
|
|
- Multiple preset prompts (scientific, medical, data viz)
|
|
- Custom prompt support
|
|
- Default: Claude Sonnet 4.5 (best for scientific vision)
|
|
- Choose best model for each task
|
|
|
|
2. **Azure Integration**
|
|
- Azure Document Intelligence for complex PDFs
|
|
- Enhanced layout understanding
|
|
- Better table extraction
|
|
|
|
3. **Batch Processing**
|
|
- Parallel conversion with configurable workers
|
|
- Recursive directory processing
|
|
- Progress tracking and error reporting
|
|
- Format-specific organization
|
|
|
|
4. **Scientific Workflows**
|
|
- Literature conversion with metadata
|
|
- Automatic index generation
|
|
- Year-based organization
|
|
- Citation-friendly output
|
|
|
|
## Integration with Scientific Writer
|
|
|
|
The skill has been added to the Scientific Writer's skill catalog:
|
|
|
|
- **Location**: `.claude/skills/markitdown/`
|
|
- **Skill Number**: #5 in Document Manipulation Skills
|
|
- **SKILLS.md**: Updated with complete skill description
|
|
|
|
### Usage Examples
|
|
|
|
```
|
|
> Convert all PDFs in the literature folder to Markdown
|
|
> Convert this PowerPoint presentation to Markdown with AI-generated descriptions
|
|
> Extract tables from this Excel file
|
|
> Transcribe this lecture recording
|
|
```
|
|
|
|
## Scripts Usage
|
|
|
|
### Batch Convert
|
|
```bash
|
|
python scripts/batch_convert.py input_dir/ output_dir/ --extensions .pdf .docx --workers 4
|
|
```
|
|
|
|
### AI-Enhanced Convert
|
|
```bash
|
|
export OPENROUTER_API_KEY="sk-or-v1-..."
|
|
python scripts/convert_with_ai.py paper.pdf output.md \
|
|
--model anthropic/claude-sonnet-4.5 \
|
|
--prompt-type scientific
|
|
```
|
|
|
|
### Literature Convert
|
|
```bash
|
|
python scripts/convert_literature.py papers/ markdown/ --organize-by-year --create-index
|
|
```
|
|
|
|
## Key Features
|
|
|
|
1. **Token-Efficient Output**: Markdown optimized for LLM processing
|
|
2. **Comprehensive Format Support**: 15+ file types
|
|
3. **AI Enhancement**: Detailed image descriptions via OpenAI
|
|
4. **OCR Support**: Extract text from scanned documents
|
|
5. **Audio Transcription**: Speech-to-text for audio files
|
|
6. **YouTube Support**: Video transcript extraction
|
|
7. **Plugin System**: Extensible architecture
|
|
8. **Batch Processing**: Efficient parallel conversion
|
|
9. **Error Handling**: Robust error management
|
|
10. **Scientific Focus**: Optimized for research workflows
|
|
|
|
## Installation
|
|
|
|
```bash
|
|
# Full installation
|
|
pip install 'markitdown[all]'
|
|
|
|
# Selective installation
|
|
pip install 'markitdown[pdf,docx,pptx,xlsx]'
|
|
```
|
|
|
|
## Quick Start
|
|
|
|
```python
|
|
from markitdown import MarkItDown
|
|
|
|
# Basic usage
|
|
md = MarkItDown()
|
|
result = md.convert("document.pdf")
|
|
print(result.text_content)
|
|
|
|
# With AI via OpenRouter
|
|
from openai import OpenAI
|
|
client = OpenAI(
|
|
api_key="your-openrouter-api-key",
|
|
base_url="https://openrouter.ai/api/v1"
|
|
)
|
|
md = MarkItDown(
|
|
llm_client=client,
|
|
llm_model="anthropic/claude-sonnet-4.5" # or openai/gpt-4o
|
|
)
|
|
result = md.convert("presentation.pptx")
|
|
```
|
|
|
|
## Documentation Files
|
|
|
|
| File | Purpose | Lines |
|
|
|------|---------|-------|
|
|
| SKILL.md | Main documentation | 400+ |
|
|
| api_reference.md | API documentation | 500+ |
|
|
| file_formats.md | Format guides | 600+ |
|
|
| example_usage.md | Practical examples | 500+ |
|
|
| batch_convert.py | Batch conversion | 200+ |
|
|
| convert_with_ai.py | AI conversion | 200+ |
|
|
| convert_literature.py | Literature conversion | 250+ |
|
|
| QUICK_REFERENCE.md | Quick reference | 300+ |
|
|
| INSTALLATION_GUIDE.md | Installation guide | 300+ |
|
|
|
|
**Total**: ~3,000+ lines of documentation and code
|
|
|
|
## Use Cases
|
|
|
|
1. **Literature Review**: Convert research papers to Markdown for analysis
|
|
2. **Data Extraction**: Extract tables from Excel/PDF for processing
|
|
3. **Presentation Processing**: Convert slides with AI descriptions
|
|
4. **Document Analysis**: Prepare documents for LLM consumption
|
|
5. **Lecture Transcription**: Convert audio recordings to text
|
|
6. **YouTube Analysis**: Extract video transcripts
|
|
7. **Archive Processing**: Batch convert document collections
|
|
|
|
## Next Steps
|
|
|
|
1. Install MarkItDown: `pip install 'markitdown[all]'`
|
|
2. Read `QUICK_REFERENCE.md` for common tasks
|
|
3. Try example scripts in `scripts/` directory
|
|
4. Explore `SKILL.md` for comprehensive guide
|
|
5. Check `example_usage.md` for practical examples
|
|
|
|
## Resources
|
|
|
|
- **MarkItDown GitHub**: https://github.com/microsoft/markitdown
|
|
- **PyPI**: https://pypi.org/project/markitdown/
|
|
- **OpenRouter**: https://openrouter.ai (AI model access)
|
|
- **OpenRouter API Keys**: https://openrouter.ai/keys
|
|
- **OpenRouter Models**: https://openrouter.ai/models
|
|
- **License**: MIT (Microsoft Corporation)
|
|
- **Python**: 3.10+ required
|
|
- **Skill Location**: `.claude/skills/markitdown/`
|
|
|
|
## Success Criteria
|
|
|
|
✅ Comprehensive skill documentation created
|
|
✅ Complete API reference provided
|
|
✅ Format-specific guides included
|
|
✅ Utility scripts implemented
|
|
✅ Practical examples documented
|
|
✅ Installation guide created
|
|
✅ Quick reference guide added
|
|
✅ Integration with Scientific Writer complete
|
|
✅ SKILLS.md updated
|
|
✅ Scripts made executable
|
|
✅ MIT License included
|
|
|
|
## Skill Status
|
|
|
|
**Status**: ✅ Complete and Ready to Use
|
|
|
|
The MarkItDown skill is fully integrated into the Claude Scientific Writer and ready for use. All documentation, scripts, and examples are in place.
|
|
|