Initial commit
18
skills/benchmarking/markdown-to-pdf-converter/CHANGELOG.md
Normal file
@@ -0,0 +1,18 @@
# Changelog

All notable changes to this skill will be documented in this file.

## [0.1.0] - 2025-11-27

### Added
- Initial release of markdown-to-pdf-converter skill
- Academic-style CSS template (pdf-style.css)
- Playwright diagram capture script
- Figure centering patterns that work with weasyprint
- Manual page break documentation
- Common issues troubleshooting guide

### Based On
- paralleLLM empathy-experiment-v1.0.pdf formatting standards
- pandoc + weasyprint toolchain
- Playwright for HTML → PNG capture
71
skills/benchmarking/markdown-to-pdf-converter/README.md
Normal file
@@ -0,0 +1,71 @@
# Markdown to PDF Converter

A Claude Code skill for converting markdown documents to professional, print-ready PDFs using pandoc and weasyprint with academic styling.

## Overview

This skill automates the markdown-to-PDF pipeline with:
- Academic-style CSS (system fonts, proper tables, page breaks)
- HTML diagram capture via Playwright at retina quality
- Iterative refinement workflow for complex documents

## Prerequisites

```bash
# Required
brew install pandoc
pip install weasyprint

# Optional (for diagram capture)
npm install playwright
npx playwright install chromium
```
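
One way to confirm the required tools are installed is to look for their executables on `PATH`. The sketch below is illustrative only: `findOnPath` and `checkPrerequisites` are hypothetical names, not part of this skill, and the lookup assumes a POSIX-style `PATH` without executable extensions.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: return the first executable file named `command` on PATH, or null.
function findOnPath(command, pathEnv = process.env.PATH || '') {
  for (const dir of pathEnv.split(path.delimiter)) {
    if (!dir) continue; // skip empty PATH entries
    const candidate = path.join(dir, command);
    try {
      fs.accessSync(candidate, fs.constants.X_OK);
      if (fs.statSync(candidate).isFile()) return candidate;
    } catch (_) {
      // not present in this directory, or not executable
    }
  }
  return null;
}

// Return the subset of `commands` that could not be found.
function checkPrerequisites(commands = ['pandoc', 'weasyprint'], pathEnv) {
  return commands.filter((cmd) => findOnPath(cmd, pathEnv) === null);
}
```

Calling `checkPrerequisites()` with no arguments checks the two required tools against the current environment; an empty array means both were found.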

## Usage

Trigger the skill with phrases like:
- "convert this markdown to PDF"
- "generate a PDF from this document"
- "create a professional PDF report"

## Key Features

### Academic Table Styling
Tables use traditional academic formatting with top/bottom borders on headers and clean cell spacing.

### Smart Page Breaks
- Headings stay with following content
- Tables and figures don't split across pages
- Manual page breaks via `<div style="page-break-before: always;"></div>`

### Figure Centering
Proper figure centering that works in weasyprint (not all CSS properties are supported).

### Retina-Quality Diagrams
Playwright captures HTML diagrams at 2x resolution for crisp print output.

## File Structure

```
markdown-to-pdf-converter/
├── SKILL.md                    # Main skill instructions
├── README.md                   # This file
├── CHANGELOG.md                # Version history
├── templates/
│   ├── pdf-style.css           # Academic CSS stylesheet
│   └── capture-diagrams.js     # Playwright screenshot script
├── examples/
│   └── report-template.md      # Example markdown structure
├── reference/
│   └── weasyprint-notes.md     # CSS compatibility notes
└── workflow/
    └── iterative-refinement.md # Page break tuning process
```
## Version

0.1.0 - Initial release based on paralleLLM empathy-experiment-v1.0.pdf

## Author

Connor Skiro
221
skills/benchmarking/markdown-to-pdf-converter/SKILL.md
Normal file
@@ -0,0 +1,221 @@
---
name: markdown-to-pdf-converter
description: "Use PROACTIVELY when converting markdown documents to professional PDFs. Automates the pandoc + weasyprint pipeline with academic-style CSS, proper page breaks, and HTML diagram capture via Playwright. Supports reports, papers, and technical documentation. Not for slides or complex layouts requiring InDesign."
version: "0.1.0"
author: "Connor Skiro"
---

# Markdown to PDF Converter

Converts markdown documents to professional, print-ready PDFs using pandoc and weasyprint with academic styling.

## Overview

This skill provides a complete pipeline for converting markdown to publication-quality PDFs:

1. **Markdown → HTML**: pandoc with standalone CSS
2. **HTML → PDF**: weasyprint with academic styling
3. **HTML → PNG**: Playwright for diagram capture (optional)

Key features: academic table borders, proper page breaks, figure centering, retina-quality diagram export.

## When to Use

**Trigger Phrases**:
- "convert this markdown to PDF"
- "generate a PDF from this document"
- "create a professional PDF report"
- "export markdown as PDF"

**Use Cases**:
- Technical reports and whitepapers
- Research papers and academic documents
- Project documentation
- Experiment analysis reports

**NOT for**:
- Presentation slides (use Marp or reveal.js)
- Complex multi-column layouts
- Documents requiring precise InDesign-level control

## Quick Start

```bash
# Prerequisites
brew install pandoc
pip install weasyprint
npm install playwright  # For diagram capture

# Verify installation
which pandoc weasyprint  # Both should return paths

# Basic conversion (two-step)
pandoc document.md -o document.html --standalone --css=pdf-style.css
weasyprint document.html document.pdf

# One-liner (pipe pandoc to weasyprint)
pandoc document.md --standalone --css=pdf-style.css -t html | weasyprint - document.pdf
```
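
The two-step conversion can also be scripted from Node. This is a minimal sketch rather than part of the skill: `buildCommands` and `convert` are hypothetical names, and the flags are the same ones used in the shell commands above.

```javascript
const { execFileSync } = require('child_process');
const path = require('path');

// Hypothetical helper: build the pandoc and weasyprint invocations for one document.
function buildCommands(markdownPath, cssPath = 'pdf-style.css') {
  const { dir, name } = path.parse(markdownPath);
  const htmlPath = path.join(dir, `${name}.html`);
  const pdfPath = path.join(dir, `${name}.pdf`);
  return [
    ['pandoc', [markdownPath, '-o', htmlPath, '--standalone', `--css=${cssPath}`]],
    ['weasyprint', [htmlPath, pdfPath]],
  ];
}

// Run both steps in order; execFileSync throws if either tool exits non-zero.
function convert(markdownPath, cssPath) {
  for (const [command, args] of buildCommands(markdownPath, cssPath)) {
    execFileSync(command, args, { stdio: 'inherit' });
  }
}
```

Keeping command construction separate from execution makes it easy to log or inspect the exact invocations before running them.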

## Workflow Modes

| Mode | Use Case | Process |
|------|----------|---------|
| Quick Convert | Simple docs | Markdown → HTML → PDF |
| Academic Report | Papers with figures | + CSS styling + diagram capture |
| Iterative | Complex layout | Review PDF, adjust page breaks, regenerate |

## Academic PDF Style Standards

### Typography
```css
body {
  font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif;
  line-height: 1.6;
  max-width: 800px;
  margin: 0 auto;
  padding: 2em;
}
```

### Tables (Academic Style)
- Top border: 2px solid on header
- Bottom border: 2px solid on header AND last row
- Cell padding: 0.5em 0.75em
- Page break avoidance: `page-break-inside: avoid`

### Page Control
| Element | Rule |
|---------|------|
| Page margins | 2cm |
| Headings | `page-break-after: avoid` |
| Figures | `page-break-inside: avoid` |
| Tables | `page-break-inside: avoid` |
| Orphans/widows | 3 lines minimum |

### Figure Centering (Critical)
```html
<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">
<img src="diagram.png" alt="Description" style="max-width: 100%; height: auto; display: inline-block;">
<figcaption style="text-align: center; font-style: italic; margin-top: 1em;">
Figure 1: Caption text
</figcaption>
</figure>
```
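
When a report has many figures, the markup above can be generated rather than hand-written. The following is a sketch under that assumption; `escapeHtml` and `figureHtml` are hypothetical helpers, not part of the skill's templates.

```javascript
// Escape text for use in HTML attribute values and element content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: emit the centered-figure markup shown above.
function figureHtml(src, alt, caption) {
  return [
    '<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">',
    `<img src="${escapeHtml(src)}" alt="${escapeHtml(alt)}" style="max-width: 100%; height: auto; display: inline-block;">`,
    '<figcaption style="text-align: center; font-style: italic; margin-top: 1em;">',
    escapeHtml(caption),
    '</figcaption>',
    '</figure>',
  ].join('\n');
}
```

For example, `figureHtml('diagram.png', 'Description', 'Figure 1: Caption text')` reproduces the block above line for line.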

### Manual Page Breaks
```html
<div style="page-break-before: always;"></div>
```

## Diagram Capture with Playwright

For HTML diagrams that need PNG export:

```javascript
const path = require('path');
const { pathToFileURL } = require('url');
const { chromium } = require('playwright');

async function captureDiagram(htmlPath, pngPath) {
  const browser = await chromium.launch();
  try {
    const context = await browser.newContext({ deviceScaleFactor: 2 }); // Retina quality
    const page = await context.newPage();

    // Build a file:// URL that also works for relative paths and paths with spaces
    await page.goto(pathToFileURL(path.resolve(htmlPath)).href);
    await page.locator('.diagram-container').screenshot({ path: pngPath, type: 'png' });
  } finally {
    await browser.close(); // Always release the browser, even on failure
  }
}
```

**Key settings**:
- `deviceScaleFactor: 2` for retina-quality PNGs
- Target `.diagram-container` selector for clean capture
- Use `max-width: 100%` in CSS, NOT `min-width`
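
To see what `deviceScaleFactor: 2` means for print, it helps to work out the pixel size of the captured PNG and the resolution it yields at a given printed width. The sketch below is plain arithmetic; the function names, the 800×600 container and the printed width are illustrative assumptions.

```javascript
// Pixel size of a screenshot for an element measured in CSS pixels.
function pngPixelSize(cssWidth, cssHeight, deviceScaleFactor = 2) {
  return {
    width: Math.round(cssWidth * deviceScaleFactor),
    height: Math.round(cssHeight * deviceScaleFactor),
  };
}

// Effective resolution when the PNG is printed at a given physical width.
function effectiveDpi(pixelWidth, printedWidthInches) {
  return pixelWidth / printedWidthInches;
}

// An 800x600 CSS-pixel container captured at 2x, printed 8 inches wide.
const size = pngPixelSize(800, 600);
const dpi = effectiveDpi(size.width, 8);
```

Here `size` is 1600×1200 pixels and `dpi` is 200, twice what the same container would give at a scale factor of 1.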

## CSS Template Location

See `templates/pdf-style.css` for full academic stylesheet.

## Markdown Structure for Reports

```markdown
# Title

## Subtitle (optional)

**Metadata** (date, author, etc.)

---

## Abstract

Summary paragraph...

---

## 1. Section Title

### 1.1 Subsection

Content with tables, figures...

---

## Appendix A: Title

Supporting materials...
```

## Success Criteria

- [ ] PDF renders without weasyprint errors
- [ ] All images display correctly
- [ ] Tables don't split across pages
- [ ] Figures are centered with captions
- [ ] No orphaned headings at page bottoms
- [ ] Manual page breaks work as expected
- [ ] Text is readable (not cut off)

## Common Issues

| Issue | Solution |
|-------|----------|
| Image cut off | Remove `min-width`, use `max-width: 100%` |
| Image off-center | Add `margin: auto; text-align: center` to figure |
| Table split across pages | Add `page-break-inside: avoid` |
| Heading orphaned | CSS already handles with `page-break-after: avoid` |
| Too much whitespace | Remove unnecessary `<div style="page-break-before: always;">` |
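
Two of the issues in the table can be spotted in the source before generating a PDF. The sketch below is a hypothetical lint pass (`lintSource` is not part of the skill) and assumes each `<img>` tag sits on a single line.

```javascript
// Hypothetical lint pass over markdown/HTML source for two of the issues listed above.
function lintSource(source) {
  const findings = [];
  source.split('\n').forEach((line, index) => {
    // min-width on an <img> is the usual cause of cut-off images
    if (/<img\b[^>]*\bmin-width\s*:/i.test(line)) {
      findings.push({ line: index + 1, issue: 'min-width on image' });
    }
    // every manual page break is a candidate source of excess whitespace
    if (/page-break-before\s*:\s*always/i.test(line)) {
      findings.push({ line: index + 1, issue: 'manual page break' });
    }
  });
  return findings;
}
```

The manual page break finding is only a pointer for review; whether a given break is unnecessary still has to be judged from the rendered PDF.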

## Weasyprint CSS Compatibility

Weasyprint does not support all CSS properties. The following will generate warnings (safe to ignore, but can be removed for cleaner output):

| Unsupported Property | Alternative |
|---------------------|-------------|
| `gap` | Use `margin` on child elements |
| `overflow-x` | Not needed for print |
| `user-select` | Not needed for print |
| `flex-gap` | Use `margin` instead |
| `backdrop-filter` | Not supported in print |
| `scroll-behavior` | Not needed for print |

**Clean CSS template tip**: Remove these properties from your stylesheet to avoid warning messages during conversion.
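
Removing these declarations can be automated for simple stylesheets. The sketch below is a naive regex filter, not a CSS parser: `stripUnsupported` is a hypothetical name, the property list is copied from the table above, and matching text inside comments or strings would be removed as well.

```javascript
// Property names taken from the table above.
const UNSUPPORTED = [
  'gap', 'overflow-x', 'user-select', 'flex-gap', 'backdrop-filter', 'scroll-behavior',
];

// Hypothetical helper: drop declarations for the listed properties from a CSS string.
function stripUnsupported(css, props = UNSUPPORTED) {
  const names = props
    .map((p) => p.replace(/[.*+?^${}()|[\]\\]/g, '\\$&')) // escape regex metacharacters
    .join('|');
  // The lookbehind keeps longer names such as `row-gap` from matching `gap`.
  const declaration = new RegExp(`(?<![\\w-])(?:${names})\\s*:[^;{}]*;?`, 'g');
  return css.replace(declaration, '');
}
```

The function leaves whitespace where a declaration was removed, so the output is best treated as input to the converter rather than as a tidy stylesheet.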

## Reference Files

- `templates/pdf-style.css` - Full CSS stylesheet
- `templates/capture-diagrams.js` - Playwright capture script
- `examples/report-template.md` - Example markdown structure
- `workflow/iterative-refinement.md` - Page break tuning process

## Related Skills

- **html-diagram-creator**: Create publication-quality HTML diagrams
- **html-to-png-converter**: Convert HTML diagrams to PNG for embedding

**Documentation Pipeline**: Create diagrams (html-diagram-creator) → Convert to PNG (html-to-png-converter) → Embed in markdown → Export to PDF (this skill)

---

**Based on**: paralleLLM empathy-experiment-v1.0.pdf
@@ -0,0 +1,154 @@
# Report Title

## Subtitle or Description

**Date:** YYYY-MM-DD

**Author:** Name

**Version:** 1.0

---

## Abstract

Brief summary of the document (1-2 paragraphs). State the key findings or purpose upfront.

---

## Executive Summary

| Metric | Result |
|--------|--------|
| Key Finding 1 | Brief description |
| Key Finding 2 | Brief description |
| Sample Size | n = X |

---

## 1. Introduction

### 1.1 Background

Context and motivation for this work...

### 1.2 Objectives

1. **Objective 1**: Description
2. **Objective 2**: Description
3. **Objective 3**: Description

---

## 2. Methodology

### 2.1 Approach

Description of the approach taken...

### 2.2 Variables

**Independent Variable**: Description

| Level | Description | Example |
|-------|-------------|---------|
| Level 1 | Description | Example |
| Level 2 | Description | Example |

<div style="page-break-before: always;"></div>

**Dependent Variables**:

| Variable | Type | Measurement |
|----------|------|-------------|
| Variable 1 | Type | How measured |
| Variable 2 | Type | How measured |

### 2.3 Infrastructure

<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">
<img src="architecture_diagram.png" alt="System Architecture" style="max-width: 100%; height: auto; display: inline-block;">
<figcaption style="text-align: center; font-style: italic; margin-top: 1em;">
Figure 1: System architecture diagram
</figcaption>
</figure>

---

## 3. Results

### 3.1 Summary Statistics

| Category | N | Mean | Std Dev | Key Metric |
|----------|---|------|---------|------------|
| Category A | 100 | 0.5 | 0.1 | 50% |
| Category B | 100 | 0.6 | 0.2 | 60% |

### 3.2 Key Findings

**Finding 1: Title**

Description of the finding with supporting data...

**Finding 2: Title**

Description of the finding with supporting data...

---

## 4. Discussion

### 4.1 Interpretation

Analysis of what the results mean...

### 4.2 Implications

| Scenario | Risk Level | Recommendation |
|----------|------------|----------------|
| Scenario A | **Low** | Safe to proceed |
| Scenario B | **High** | Exercise caution |

---

## 5. Limitations

1. **Limitation 1**: Description and impact

2. **Limitation 2**: Description and impact

---

## 6. Future Work

1. **Direction 1**: Description
2. **Direction 2**: Description

---

## 7. Conclusion

Summary of key findings and their significance...

**Bottom line**: One-sentence takeaway.

---

## Appendix A: Supporting Materials

### A.1 Sample Data

> Example content or data samples shown in blockquote format

### A.2 Additional Figures

<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">
<img src="appendix_figure.png" alt="Additional Figure" style="max-width: 100%; height: auto; display: inline-block;">
<figcaption style="text-align: center; font-style: italic; margin-top: 1em;">
Figure A.1: Additional supporting figure
</figcaption>
</figure>

---

*Report generated by [Author Name]*
@@ -0,0 +1,82 @@
# WeasyPrint CSS Compatibility Notes

WeasyPrint doesn't support all CSS properties. This reference documents what works and what doesn't.

## Supported (Works)

### Layout
- `max-width`, `min-width` (but avoid min-width on images)
- `margin`, `padding`
- `display: block`, `display: inline-block`
- `text-align`
- `width`, `height` (with units)

### Typography
- `font-family`, `font-size`, `font-weight`, `font-style`
- `line-height`
- `color`

### Tables
- `border-collapse`
- `border` properties
- `padding` on cells

### Print/Page
- `@page { margin: ... }`
- `page-break-before`, `page-break-after`, `page-break-inside`
- `orphans`, `widows`

### Backgrounds
- `background-color`
- `background` (simple)

## NOT Supported (Ignored)

### Modern CSS
- `gap` (use margin instead)
- `overflow-x`, `overflow-y`
- CSS Grid layout
- Flexbox (limited support)
- CSS variables (`--custom-property`)
- `min()`, `max()`, `clamp()` functions

### Advanced Selectors
- `:has()` (limited)
- Complex pseudo-selectors

## Common Warnings

```
WARNING: Ignored `gap: min(4vw, 1.5em)` at X:Y, invalid value.
WARNING: Ignored `overflow-x: auto` at X:Y, unknown property.
```
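
These warnings are informational: WeasyPrint skips each ignored declaration and applies the rest of the stylesheet, so the conversion still completes. Spacing that relied on a skipped property falls back to whatever the remaining rules (for example `margin`) provide.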
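
For a long conversion log it can help to group the ignored declarations by property before deciding what to remove from the stylesheet. The sketch below assumes the log lines have exactly the shape shown above, which may differ between WeasyPrint versions; `summarizeWarnings` is a hypothetical helper.

```javascript
// Matches lines in the format shown above:
// WARNING: Ignored `<declaration>` at <location>, <reason>.
const IGNORED = /^WARNING: Ignored `([^`]+)` at (\S+), (.+?)\.?$/;

// Hypothetical helper: group ignored declarations by property name.
function summarizeWarnings(logText) {
  const byProperty = {};
  for (const line of logText.split('\n')) {
    const match = IGNORED.exec(line.trim());
    if (!match) continue; // not an "Ignored" warning
    const [, declaration, location, reason] = match;
    const property = declaration.split(':')[0].trim();
    (byProperty[property] = byProperty[property] || []).push({ location, reason });
  }
  return byProperty;
}
```

The keys of the returned object are the property names to look for in the stylesheet.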

## Image Centering Pattern

WeasyPrint-compatible centering:

```html
<!-- This works -->
<figure style="margin: 2em auto; text-align: center;">
<img style="max-width: 100%; display: inline-block;">
</figure>

<!-- This does NOT work reliably -->
<figure style="display: flex; justify-content: center;">
<img>
</figure>
```

## Page Break Pattern

```html
<!-- Explicit page break -->
<div style="page-break-before: always;"></div>

<!-- Keep together -->
<div style="page-break-inside: avoid;">
Content that should stay together
</div>
```
@@ -0,0 +1,105 @@
/**
 * Diagram Capture Script
 * Converts HTML diagrams to high-resolution PNGs using Playwright
 *
 * Usage:
 *   node capture-diagrams.js [html-file] [output-png]
 *   node capture-diagrams.js [directory]  # Captures all diagrams in the given directory
 *   node capture-diagrams.js              # Captures all diagrams in current directory
 *
 * Prerequisites:
 *   npm install playwright
 *   npx playwright install chromium
 */

const { chromium } = require('playwright');
const path = require('path');
const fs = require('fs');
const { pathToFileURL } = require('url');

// Configuration
const CONFIG = {
  deviceScaleFactor: 2,           // 2x for retina quality
  selector: '.diagram-container', // Default container selector
};

/**
 * Capture a single HTML file to PNG
 */
async function captureScreenshot(htmlPath, pngPath, selector = CONFIG.selector) {
  const browser = await chromium.launch();
  try {
    const context = await browser.newContext({
      deviceScaleFactor: CONFIG.deviceScaleFactor,
    });
    const page = await context.newPage();

    console.log(`Capturing ${htmlPath}...`);

    // pathToFileURL handles spaces and platform-specific separators
    await page.goto(pathToFileURL(path.resolve(htmlPath)).href);

    // locator() is synchronous; only the screenshot call needs awaiting
    await page.locator(selector).screenshot({
      path: pngPath,
      type: 'png',
    });

    console.log(`  → Saved to ${pngPath}`);
  } finally {
    // Close the browser even if navigation or capture fails
    await browser.close();
  }
}

/**
 * Capture all HTML diagrams in a directory
 */
async function captureAllDiagrams(directory = '.') {
  // Find all HTML files whose name contains "diagram"
  const files = fs.readdirSync(directory)
    .filter(f => f.endsWith('.html') && f.includes('diagram'));

  if (files.length === 0) {
    console.log('No diagram HTML files found in directory');
    return;
  }

  const browser = await chromium.launch();
  let failures = 0;
  try {
    const context = await browser.newContext({
      deviceScaleFactor: CONFIG.deviceScaleFactor,
    });
    const page = await context.newPage();

    for (const htmlFile of files) {
      const htmlPath = path.join(directory, htmlFile);
      // Replace only the trailing extension
      const pngPath = htmlPath.replace(/\.html$/, '.png');

      console.log(`Capturing ${htmlFile}...`);
      try {
        await page.goto(pathToFileURL(path.resolve(htmlPath)).href);
        await page.locator(CONFIG.selector).screenshot({ path: pngPath, type: 'png' });
        console.log(`  → Saved to ${path.basename(pngPath)}`);
      } catch (error) {
        failures += 1;
        console.log(`  ✗ Failed: ${error.message}`);
      }
    }
  } finally {
    await browser.close();
  }

  // Report success only when every capture succeeded
  if (failures === 0) {
    console.log('\nAll diagrams captured successfully!');
  } else {
    console.log(`\n${failures} of ${files.length} diagrams failed to capture.`);
  }
}

// Main execution
async function main() {
  const args = process.argv.slice(2);

  if (args.length === 2) {
    // Single file mode: node capture-diagrams.js input.html output.png
    await captureScreenshot(args[0], args[1]);
  } else if (args.length === 1) {
    // Directory mode: node capture-diagrams.js ./docs
    await captureAllDiagrams(args[0]);
  } else {
    // Default: capture all diagrams in current directory
    await captureAllDiagrams('.');
  }
}

main().catch((error) => {
  console.error(error);
  process.exitCode = 1; // Signal failure to the calling shell
});
@@ -0,0 +1,164 @@
/* Academic PDF Style Template
 * For use with pandoc + weasyprint
 * Based on: paralleLLM empathy-experiment-v1.0.pdf
 */

/* ==========================================================================
   Base Typography
   ========================================================================== */

body {
  font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Helvetica, Arial, sans-serif;
  line-height: 1.6;
  max-width: 800px;
  margin: 0 auto;
  padding: 2em;
}

h1, h2, h3, h4 {
  margin-top: 1.5em;
  margin-bottom: 0.5em;
}

h2 {
  margin-top: 2em;
}

h3 {
  margin-top: 1.5em;
}

/* ==========================================================================
   Tables (Academic Style)
   ========================================================================== */

table {
  width: 100%;
  border-collapse: collapse;
  margin: 1em 0;
  page-break-inside: avoid;
}

table th, table td {
  padding: 0.5em 0.75em;
  text-align: left;
  vertical-align: top;
}

/* Academic-style borders: top/bottom on header, bottom on last row */
table thead th {
  border-top: 2px solid #000;
  border-bottom: 2px solid #000;
  font-weight: bold;
}

table tbody td {
  border-bottom: 1px solid #ddd;
}

table tbody tr:last-child td {
  border-bottom: 2px solid #000;
}

/* ==========================================================================
   Block Elements
   ========================================================================== */

blockquote {
  border-left: 4px solid #ddd;
  margin: 1em 0;
  padding-left: 1em;
  color: #555;
  page-break-inside: avoid;
}

code {
  background: #f5f5f5;
  padding: 0.2em 0.4em;
  border-radius: 3px;
  font-size: 0.9em;
}

pre {
  background: #f5f5f5;
  padding: 1em;
  border-radius: 5px;
  page-break-inside: avoid;
}

pre code {
  background: none;
  padding: 0;
}

hr {
  border: none;
  border-top: 1px solid #ddd;
  margin: 2em 0;
}

/* ==========================================================================
   Figures and Images
   ========================================================================== */

figure {
  page-break-inside: avoid;
  margin: 1.5em 0;
}

figure img {
  max-width: 100%;
  height: auto;
  display: block;
}

figcaption {
  text-align: center;
  font-style: italic;
  margin-top: 0.5em;
  font-size: 0.9em;
}

/* ==========================================================================
   Page Control (Print/PDF)
   ========================================================================== */

@page {
  margin: 2cm;
}

/* Keep headings with following content */
h2, h3, h4 {
  page-break-after: avoid;
}

/* Prevent orphan paragraphs */
p {
  orphans: 3;
  widows: 3;
}

/* Keep lists together when possible */
ul, ol {
  page-break-inside: avoid;
}

/* ==========================================================================
   Utility Classes
   ========================================================================== */

/* For centered figures in weasyprint */
.figure-centered {
  margin: 2em auto;
  text-align: center;
}

.figure-centered img {
  display: inline-block;
  max-width: 100%;
}

/* Small text for appendix tables */
.small-text {
  font-size: 0.85em;
}
@@ -0,0 +1,83 @@
# Iterative PDF Refinement Workflow

Step-by-step process for tuning page breaks and layout in complex documents.

## Phase 1: Initial Conversion

```bash
# Generate HTML with CSS
pandoc document.md -o document.html --standalone --css=pdf-style.css

# Generate PDF
weasyprint document.html document.pdf
```

## Phase 2: Review PDF

Open the PDF and check for:

1. **Orphaned headings** - Heading at bottom of page, content on next
2. **Split tables** - Table breaks across pages
3. **Cut-off images** - Image doesn't fit, gets cropped
4. **Excessive whitespace** - Large gaps from unnecessary page breaks
5. **Off-center figures** - Images aligned left instead of center

## Phase 3: Fix Issues

### Orphaned Headings
CSS already handles this with `page-break-after: avoid`. If still occurring, add manual page break BEFORE the heading:

```html
<div style="page-break-before: always;"></div>

### Section Title
```

### Split Tables
Add to the table's container:
```html
<div style="page-break-inside: avoid;">

| Column 1 | Column 2 |
|----------|----------|
| data     | data     |

</div>
```

### Cut-off Images
Remove any `min-width` constraints. Use only:
```html
<img style="max-width: 100%; height: auto;">
```

### Excessive Whitespace
Remove unnecessary `<div style="page-break-before: always;">` tags. Let content flow naturally and only add page breaks where truly needed.

### Off-center Figures
Use the full figure pattern:
```html
<figure style="margin: 2em auto; page-break-inside: avoid; text-align: center;">
<img src="image.png" style="max-width: 100%; height: auto; display: inline-block;">
<figcaption style="text-align: center; font-style: italic; margin-top: 1em;">
Figure N: Caption
</figcaption>
</figure>
```

## Phase 4: Regenerate and Verify

```bash
# Regenerate after each fix
pandoc document.md -o document.html --standalone --css=pdf-style.css
weasyprint document.html document.pdf
```

Repeat until all issues resolved.

## Tips

1. **Work section by section** - Don't try to fix everything at once
2. **Check page count** - Unnecessary page breaks inflate page count
3. **Test at actual print size** - View at 100% zoom
4. **Version your PDFs** - Keep v1.0, v1.1, etc. during refinement
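
The versioning tip can be automated with a small naming helper. This is only a sketch: `nextVersionName` is a hypothetical function, and the `-vMAJOR.MINOR.pdf` suffix is an assumed naming scheme modelled on the v1.0, v1.1 examples above.

```javascript
// Hypothetical helper: bump the minor part of a "-vMAJOR.MINOR.pdf" file name.
function nextVersionName(fileName) {
  const match = /^(.*-v)(\d+)\.(\d+)(\.pdf)$/.exec(fileName);
  if (!match) {
    // No version suffix yet: start at v1.0
    return fileName.replace(/\.pdf$/, '-v1.0.pdf');
  }
  const [, prefix, major, minor, ext] = match;
  return `${prefix}${major}.${Number(minor) + 1}${ext}`;
}
```

Passing the result as the output path of each `weasyprint` run keeps earlier renders available for side-by-side comparison.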