Wikipedia Pager for Physical Printing
Evolutionary Development of Print-Optimized Rendering
This document describes an evolutionary software development process applied to the problem of rendering Wikipedia articles for print. The process has iterated through four generations, selecting on three fitness criteria: pagination quality (absence of orphaned headings), mathematical equation rendering, and two-column layout balance. The current population (Generation 4) contains two candidates that satisfy all three criteria on the test corpus.
Wikipedia articles rendered through browser print functions exhibit two defects:
Candidates were developed using an evolutionary model with three stages per generation:
Five candidates spanning three invocation methods: CLI (Node.js), web application, and bookmarklet. The CLI candidates (002, 007) went extinct due to environment constraints. Survivors: 005 (web), 006 (bookmarklet).
The invocation dimension was eliminated; each candidate became a (web app, bookmarklet) pair. Five candidates tested against the Directed Acyclic Graph article. Finding: CSS-only pagination (104) outperformed Paged.js (103) on orphan prevention, but 104 broke equation rendering.
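One common way to prevent orphaned headings without a pagination library is to wrap each heading together with the block that follows it, then style the wrapper with the CSS property `break-inside: avoid`. The sketch below illustrates only the grouping step. The block model (`type`, `text`) and the function name `groupHeadings` are hypothetical, and this is not necessarily how candidate 104 was implemented.

```javascript
// Hypothetical block model: { type: 'heading' | 'block', text: string }.
// Each returned group would be wrapped in one element styled with
// `break-inside: avoid`, so a heading is never the last item on a page.
function groupHeadings(blocks) {
  const groups = [];
  let pending = []; // consecutive headings still waiting for body content

  for (const block of blocks) {
    if (block.type === 'heading') {
      pending.push(block);
    } else if (pending.length > 0) {
      // First body block after one or more headings: keep them together.
      groups.push([...pending, block]);
      pending = [];
    } else {
      // Ordinary body block: it forms its own group and may break freely.
      groups.push([block]);
    }
  }

  // Trailing headings with no body content still form a group.
  if (pending.length > 0) groups.push(pending);
  return groups;
}

const groups = groupHeadings([
  { type: 'heading', text: 'Proofs' },
  { type: 'heading', text: 'Algebraic proof' },
  { type: 'block', text: 'Consider a right triangle...' },
  { type: 'block', text: 'Rearranging gives...' },
]);
```

In a browser, each group would map to a wrapper element inserted around the corresponding DOM nodes. Only the first body block is bound to its headings, so long sections can still break across pages.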
Objective: combine 104's pagination with 103's equation preservation. Root cause identified: canvas-based image conversion does not handle SVG correctly. Solution: selective image conversion, which preserves SVG/equation images as external URLs and converts raster images to data URIs.
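The selective conversion rule can be sketched as a small classifier over image URLs. This is a minimal illustration under stated assumptions: it treats an image as SVG or equation content when its path ends in `.svg` or falls under Wikimedia's math rendering path (`/media/math/render/svg/`). The function names and the `keep-url` / `to-data-uri` labels are hypothetical, and the hash in the example URL is a placeholder.

```javascript
// Assumption: equation and vector images can be recognized from the URL path.
// Relative and protocol-relative sources are resolved against en.wikipedia.org.
function isVectorOrEquation(src) {
  const url = new URL(src, 'https://en.wikipedia.org');
  const path = url.pathname;
  return path.toLowerCase().endsWith('.svg') ||
         path.includes('/media/math/render/svg/');
}

// Decide, per image, whether to leave the external URL in place or to
// convert the image to a data URI (for example via a canvas).
function conversionPlan(sources) {
  return sources.map((src) => ({
    src,
    action: isVectorOrEquation(src) ? 'keep-url' : 'to-data-uri',
  }));
}

const plan = conversionPlan([
  'https://wikimedia.org/api/rest_v1/media/math/render/svg/abc123', // placeholder hash
  '//upload.wikimedia.org/wikipedia/commons/a/a1/Photo.jpg',
  '/static/diagram.SVG',
]);
```

A rasterized thumbnail of an SVG file has a path ending in `.png`, so this rule would route it to data URI conversion like any other raster image.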
Web app only (bookmarklets deprecated). Two-column layout with image differentiation: equations flow inline, illustrations float right. PDF page verification via pdftoppm introduced.
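A verification step of this kind could rasterize each PDF page to an image for inspection. The sketch below only builds the argument list for `pdftoppm`, using its `-png` and `-r` (resolution in DPI) options. The helper name `pdftoppmArgs`, the file name `article.pdf`, the `page` output prefix, and the 72 DPI default are assumptions for illustration.

```javascript
// Build the argument list for: pdftoppm -png -r <dpi> <pdf> <output-prefix>
// pdftoppm writes one image per page, named from the output prefix.
function pdftoppmArgs(pdfPath, outPrefix, dpi = 72) {
  if (!Number.isInteger(dpi) || dpi <= 0) {
    throw new RangeError('dpi must be a positive integer');
  }
  return ['-png', '-r', String(dpi), pdfPath, outPrefix];
}

// Hypothetical input file and output prefix.
const args = pdftoppmArgs('article.pdf', 'gen_4/pdf_output/page');
```

In Node.js the list can be passed to `execFileSync('pdftoppm', args)` from the `child_process` module, which avoids shell quoting issues with file names.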
Promoted elite candidates from Generation 3. Verified two-column layout with readable equations and text flow around illustrations.
| Candidate | Pagination Method | Image Handling | Orphaned Headings |
|---|---|---|---|
| 401 (current) | CSS + DOM wrapping | Equations inline, illustrations 40% | 0 |
| 402 (current) | CSS + flexbox | Equations inline, illustrations 40% | 0 |
| 204 | CSS + DOM wrapping | Selective (SVG preserved) | 0 |
Test corpus: Pythagorean theorem, Alan Turing, Voigt notation (Wikipedia). Includes equation-heavy content with matrices and tensors.
| Artifact | Location |
|---|---|
| Web application (401) | candidate_401/index.html |
| Web application (402) | candidate_402/index.html |
| Specification | spec.md |
| PDF verification output | gen_4/pdf_output/ |
| All generations | generations.html |