gh-cskiro-claudex/skills/benchmarking/markdown-to-pdf-converter/SKILL.md
2025-11-29 18:16:40 +08:00

name: markdown-to-pdf-converter
description: Use PROACTIVELY when converting markdown documents to professional PDFs. Automates the pandoc + weasyprint pipeline with academic-style CSS, proper page breaks, and HTML diagram capture via Playwright. Supports reports, papers, and technical documentation. Not for slides or complex layouts requiring InDesign.
version: 0.1.0
author: Connor Skiro

Markdown to PDF Converter

Converts markdown documents to professional, print-ready PDFs using pandoc and weasyprint with academic styling.

Overview

This skill provides a complete pipeline for converting markdown to publication-quality PDFs:

  1. Markdown → HTML: pandoc with standalone CSS
  2. HTML → PDF: weasyprint with academic styling
  3. HTML → PNG: Playwright for diagram capture (optional)

Key features: academic table borders, proper page breaks, figure centering, retina-quality diagram export.

When to Use

Trigger Phrases:

  • "convert this markdown to PDF"
  • "generate a PDF from this document"
  • "create a professional PDF report"
  • "export markdown as PDF"

Use Cases:

  • Technical reports and whitepapers
  • Research papers and academic documents
  • Project documentation
  • Experiment analysis reports

NOT for:

  • Presentation slides (use Marp or reveal.js)
  • Complex multi-column layouts
  • Documents requiring precise InDesign-level control

Quick Start

# Prerequisites
brew install pandoc              # macOS (Homebrew)
pip install weasyprint
npm install playwright           # For diagram capture
npx playwright install chromium  # Browser binary used for capture

# Verify installation
which pandoc weasyprint  # Both should return paths

# Basic conversion (two-step)
pandoc document.md -o document.html --standalone --css=pdf-style.css
weasyprint document.html document.pdf

# One-liner (pipe pandoc to weasyprint)
pandoc document.md --standalone --css=pdf-style.css -t html | weasyprint - document.pdf
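The two-step conversion above can also be driven from Node, which may be convenient when the diagram capture step already runs there. The following is a minimal sketch, not part of the skill's templates: the function names (`pandocArgs`, `weasyprintArgs`, `outputPaths`, `convert`) are hypothetical, and it assumes `pandoc` and `weasyprint` are on the `PATH`.

```javascript
const { execFileSync } = require('child_process');

// Build the pandoc arguments for the Markdown -> HTML step.
function pandocArgs(mdPath, htmlPath, cssPath) {
  return [mdPath, '-o', htmlPath, '--standalone', `--css=${cssPath}`];
}

// Build the weasyprint arguments for the HTML -> PDF step.
function weasyprintArgs(htmlPath, pdfPath) {
  return [htmlPath, pdfPath];
}

// Derive sibling .html and .pdf paths from the markdown path.
function outputPaths(mdPath) {
  const base = mdPath.replace(/\.md$/i, '');
  return { html: `${base}.html`, pdf: `${base}.pdf` };
}

// Run both steps in order; execFileSync throws if either tool exits non-zero.
function convert(mdPath, cssPath = 'pdf-style.css') {
  const { html, pdf } = outputPaths(mdPath);
  execFileSync('pandoc', pandocArgs(mdPath, html, cssPath), { stdio: 'inherit' });
  execFileSync('weasyprint', weasyprintArgs(html, pdf), { stdio: 'inherit' });
  return pdf;
}
```

Calling `convert('document.md')` would run the same two commands as the basic conversion and return `document.pdf`. Passing arguments as an array avoids shell quoting problems with file names that contain spaces.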

Workflow Modes

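| Mode            | Use Case            | Process                                    |
|-----------------|---------------------|--------------------------------------------|
| Quick Convert   | Simple docs         | Markdown → HTML → PDF                      |
| Academic Report | Papers with figures | + CSS styling + diagram capture            |
| Iterative       | Complex layout      | Review PDF, adjust page breaks, regenerate |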

Academic PDF Style Standards

Typography

body {
  font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif;
  line-height: 1.6;
  max-width: 800px;
  margin: 0 auto;
  padding: 2em;
}

Tables (Academic Style)

  • Top border: 2px solid on header
  • Bottom border: 2px solid on header AND last row
  • Cell padding: 0.5em 0.75em
  • Page break avoidance: page-break-inside: avoid

Page Control

| Element        | Rule                     |
|----------------|--------------------------|
| Page margins   | 2cm                      |
| Headings       | page-break-after: avoid  |
| Figures        | page-break-inside: avoid |
| Tables         | page-break-inside: avoid |
| Orphans/widows | 3 lines minimum          |
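To make the table and page-control rules concrete, here is a small sketch that emits them as CSS text from Node. It is not the contents of templates/pdf-style.css, which is not shown here. The function name `pageControlCss` and the exact selectors are assumptions chosen for illustration.

```javascript
// Hypothetical helper: returns print CSS implementing the rules listed above.
function pageControlCss({ margin = '2cm', minLines = 3 } = {}) {
  return [
    // Page margins
    `@page { margin: ${margin}; }`,
    // Keep headings attached to the content that follows them
    'h1, h2, h3, h4, h5, h6 { page-break-after: avoid; }',
    // Do not split figures or tables across pages
    'figure, table { page-break-inside: avoid; }',
    // Minimum number of lines left at the bottom / carried to the top of a page
    `p { orphans: ${minLines}; widows: ${minLines}; }`,
    // Academic table style: heavy rules on the header and under the last row
    'table { border-collapse: collapse; }',
    'th, td { padding: 0.5em 0.75em; }',
    'thead th { border-top: 2px solid; border-bottom: 2px solid; }',
    'tbody tr:last-child td { border-bottom: 2px solid; }',
  ].join('\n');
}
```

The returned string could be written to a stylesheet with `fs.writeFileSync` and passed to pandoc through `--css`.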

Figure Centering (Critical)

<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">
  <img src="diagram.png" alt="Description" style="max-width: 100%; height: auto; display: inline-block;">
  <figcaption style="text-align: center; font-style: italic; margin-top: 1em;">
    Figure 1: Caption text
  </figcaption>
</figure>
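When a report contains many figures, the markup above can be generated rather than hand-written. The sketch below assumes hypothetical helper names (`escapeHtml`, `figureHtml`) and reuses the inline styles from the example. The caption is placed on one line to keep the output compact.

```javascript
// Escape characters that are unsafe inside HTML text and attribute values.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build a centered, unbreakable figure block with the same styles as above.
function figureHtml(src, alt, caption) {
  return [
    '<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">',
    `  <img src="${escapeHtml(src)}" alt="${escapeHtml(alt)}" style="max-width: 100%; height: auto; display: inline-block;">`,
    `  <figcaption style="text-align: center; font-style: italic; margin-top: 1em;">${escapeHtml(caption)}</figcaption>`,
    '</figure>',
  ].join('\n');
}
```

For example, `figureHtml('diagram.png', 'Description', 'Figure 1: Caption text')` returns a block equivalent to the one shown above.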

Manual Page Breaks

<div style="page-break-before: always;"></div>

Diagram Capture with Playwright

For HTML diagrams that need PNG export:

const { chromium } = require('playwright');

// htmlPath must be absolute, because it is interpolated into a file:// URL.
async function captureDiagram(htmlPath, pngPath) {
  const browser = await chromium.launch();
  try {
    const context = await browser.newContext({ deviceScaleFactor: 2 }); // Retina quality
    const page = await context.newPage();

    await page.goto(`file://${htmlPath}`);
    // locator() is synchronous; screenshot() waits for the element to appear.
    const element = page.locator('.diagram-container');
    await element.screenshot({ path: pngPath, type: 'png' });
  } finally {
    await browser.close(); // Always release the browser, even if capture fails
  }
}

Key settings:

  • deviceScaleFactor: 2 for retina-quality PNGs
  • Target .diagram-container selector for clean capture
  • Use max-width: 100% in CSS, NOT min-width
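The same settings extend to capturing several diagrams in one browser session. This is a hedged sketch rather than the contents of templates/capture-diagrams.js: `toFileUrl`, `pngPathFor`, and `captureAll` are hypothetical names, and it assumes every HTML file contains one `.diagram-container` element.

```javascript
const path = require('path');
const { pathToFileURL } = require('url');

// file:// URLs need an absolute path; pathToFileURL also escapes spaces.
function toFileUrl(htmlPath) {
  return pathToFileURL(path.resolve(htmlPath)).href;
}

// diagrams/arch.html -> diagrams/arch.png, next to the source file.
function pngPathFor(htmlPath) {
  const { dir, name } = path.parse(htmlPath);
  return path.join(dir, `${name}.png`);
}

// Capture every diagram with one browser instance.
async function captureAll(htmlPaths, selector = '.diagram-container') {
  // Loaded lazily so the path helpers work without playwright installed.
  const { chromium } = require('playwright');
  const browser = await chromium.launch();
  try {
    const context = await browser.newContext({ deviceScaleFactor: 2 }); // Retina quality
    const page = await context.newPage();
    for (const htmlPath of htmlPaths) {
      await page.goto(toFileUrl(htmlPath));
      await page.locator(selector).screenshot({ path: pngPathFor(htmlPath), type: 'png' });
    }
  } finally {
    await browser.close(); // Release the browser even if a capture fails
  }
}
```

Reusing one browser and page avoids paying the launch cost for each diagram.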

CSS Template Location

See templates/pdf-style.css for full academic stylesheet.

Markdown Structure for Reports

# Title

## Subtitle (optional)

**Metadata** (date, author, etc.)

---

## Abstract

Summary paragraph...

---

## 1. Section Title

### 1.1 Subsection

Content with tables, figures...

---

## Appendix A: Title

Supporting materials...

Success Criteria

  • PDF renders without weasyprint errors
  • All images display correctly
  • Tables don't split across pages
  • Figures are centered with captions
  • No orphaned headings at page bottoms
  • Manual page breaks work as expected
  • Text is readable (not cut off)

Common Issues

| Issue                    | Solution                                                     |
|--------------------------|--------------------------------------------------------------|
| Image cut off            | Remove min-width, use max-width: 100%                        |
| Image off-center         | Add margin: auto; text-align: center to the figure           |
| Table split across pages | Add page-break-inside: avoid                                 |
| Heading orphaned         | Already handled by the stylesheet (page-break-after: avoid)  |
| Too much whitespace      | Remove unnecessary <div style="page-break-before: always;">  |

Weasyprint CSS Compatibility

Weasyprint does not support all CSS properties. The following will generate warnings (safe to ignore, but can be removed for cleaner output):

| Unsupported Property | Alternative                  |
|----------------------|------------------------------|
| gap                  | Use margin on child elements |
| overflow-x           | Not needed for print         |
| user-select          | Not needed for print         |
| flex-gap             | Use margin instead           |
| backdrop-filter      | Not supported in print       |
| scroll-behavior      | Not needed for print         |
Clean CSS template tip: Remove these properties from your stylesheet to avoid warning messages during conversion.
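Removing these declarations can be scripted. The sketch below is a regex-based filter, not a CSS parser, so it assumes conventionally written stylesheets and may misbehave on unusual input such as declarations inside comments. The names `UNSUPPORTED` and `stripUnsupported` are hypothetical, and the optional `-webkit-`/`-moz-` prefix handling is an added assumption.

```javascript
// Property names copied from the compatibility table above.
const UNSUPPORTED = [
  'gap', 'overflow-x', 'user-select', 'flex-gap', 'backdrop-filter', 'scroll-behavior',
];

// Remove whole declarations ("prop: value;") for the listed properties.
// The name must follow "{", ";", whitespace, or the start of the input,
// so longer names such as "row-gap" and selectors such as ".gap:hover"
// are left untouched.
function stripUnsupported(css, properties = UNSUPPORTED) {
  const names = properties.join('|');
  const pattern = new RegExp(
    `(?<=^|[{;\\s])(?:-webkit-|-moz-)?(?:${names})\\s*:[^;}]*;?`,
    'g'
  );
  return css.replace(pattern, '');
}
```

Running the stylesheet through `stripUnsupported` before conversion should silence the corresponding warnings while leaving other declarations in place.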

Reference Files

  • templates/pdf-style.css - Full CSS stylesheet
  • templates/capture-diagrams.js - Playwright capture script
  • examples/report-template.md - Example markdown structure
  • workflow/iterative-refinement.md - Page break tuning process
  • html-diagram-creator: Create publication-quality HTML diagrams
  • html-to-png-converter: Convert HTML diagrams to PNG for embedding

Documentation Pipeline: Create diagrams (html-diagram-creator) → Convert to PNG (html-to-png-converter) → Embed in markdown → Export to PDF (this skill)


Based on: paralleLLM empathy-experiment-v1.0.pdf