@talers/html-pages@0.5.5
Html Pages
A high-performance utility for splitting HTML content into multiple pages. Perfect for building paginated content viewers, e-readers, or any application that needs to break down large HTML content into manageable page-sized chunks.
This library is internally used and developed by Talers, the modern writing application for creative writers, to preview their work before publishing it.
Features
- 📄 Splits HTML content into multiple pages while preserving semantic structure
- 🧩 Handles complex elements:
  - Lists (bullet, ordered, nested)
  - Tables
  - ... and any other kind of HTML element
- ⚡ Extremely fast performance:
  - Async page generation that does not block the main thread
  - Can handle thousands of pages in a few seconds (~500 pages per second)
  - Renders the first page as soon as it's available (in ~4ms) for instant user feedback
  - Progressive rendering (start displaying while still processing)
Todo
- 🚧 Handle words longer than the page width
- ⏳ Add support for page numbering
- ⏳ Add support for footnotes
- ⏳ Use an array of HTML strings as input with mixed options
Installation
    # Install from JSR
    npx jsr add @talers/html-pages
Requirements
⚠️ This library only works in browser environments as it relies on DOM APIs.
Usage
    import { HtmlPageSplitter } from '@talers/html-pages';

    // Create a new instance with optional default options
    const htmlPageSplitter = new HtmlPageSplitter({
      // The CSS classes to add to the page container
      classes: ['page'],
      // Optional: Maximum number of pages to generate
      maxNumberOfPages: 100
    });

    // Example HTML content to split
    const htmlContent = `
      <h1>My Document</h1>
      <p>This is a paragraph that might span across multiple pages.</p>
      <ul>
        <li>List item 1</li>
        <li>List item 2</li>
        <li>List item 3</li>
      </ul>
      <table>
        <tr><td>Cell 1</td><td>Cell 2</td></tr>
        <tr><td>Cell 3</td><td>Cell 4</td></tr>
      </table>
    `;

    // Start processing and rendering pages
    async function renderPages() {
      const pages: Array<string> = [];

      for await (const pageHtml of htmlPageSplitter.split(htmlContent)) {
        pages.push(pageHtml);

        // You can buffer pages for better performance instead of updating the UI after each page
        if (pages.length % 20 === 0) {
          // TODO: update UI with the new pages (depends on your frontend framework)
        }
      }

      // TODO: update UI with the final pages
      console.log(`Split ${pages.length} pages`);
    }

    // If you need to cancel the split process (e.g., user navigates away)
    function cancelSplitting() {
      htmlPageSplitter.abort();
    }
Styling the resulting pages
Page container classes
Each page container in the resulting HTML must have the same classes you passed to the HtmlPageSplitter constructor. This allows you to apply consistent styling to all your pages.
    /* Example styling for pages with 'page' class */
    .page {
      /* These are the dimensions of an A4 sheet in millimeters */
      width: 210mm;
      height: 297mm;
      /* Common margins for A4 paper */
      margin: 0;
      padding: 25mm 20mm;
      box-shadow: 0 0 5px rgba(0, 0, 0, 0.2);
      background: white;
      overflow: hidden;
    }

    /* When printing, we don't want the page shadow */
    @media print {
      .page {
        box-shadow: unset;
      }
    }
Styling continued list items
When a list spans across multiple pages, the library automatically adds the html-pages__follow-up-list-item class to list items that continue from the previous page. To hide the bullet points or numbers for these continued list items, add the following CSS to your stylesheet:
.html-pages__follow-up-list-item::marker { color: transparent; }
This ensures a clean visual appearance when reading paginated content with lists that break across pages, while maintaining the semantic structure of your HTML.
API
HtmlPageSplitter
The main class for splitting HTML content into pages.
Constructor Options
    new HtmlPageSplitter(options?: {
      // Container element to use for the split (default: creates a hidden div)
      container?: HTMLElement;
      // CSS classes to add to each page container
      classes?: string[];
      // Maximum number of pages to generate (default: Infinity)
      maxNumberOfPages?: number;
    });
Methods
- async *split(html: string, options?: HtmlPageSplitterOptions): Async generator that yields one page at a time
- abort(): Cancels an ongoing split operation
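As a rough illustration of how the two methods can be combined, the sketch below collects pages until the generator finishes or a standard AbortSignal fires. The PageSplitterLike interface and the collectPages helper are hypothetical names introduced here, not part of the library; the interface only mirrors the two methods listed above. How abort() affects an in-flight split() is not documented here, so the loop also checks the signal itself.

```typescript
// Hypothetical structural type: only the two documented methods are assumed.
interface PageSplitterLike {
  split(html: string): AsyncIterable<string>;
  abort(): void;
}

// Hypothetical helper: collects pages until the source ends or `signal` aborts.
async function collectPages(
  splitter: PageSplitterLike,
  html: string,
  signal: AbortSignal,
): Promise<string[]> {
  const pages: string[] = [];
  // Nothing to do if cancellation was requested before we started.
  if (signal.aborted) return pages;

  // Forward the cancellation request to the splitter.
  const onAbort = () => splitter.abort();
  signal.addEventListener("abort", onAbort, { once: true });

  try {
    for await (const page of splitter.split(html)) {
      // Stop consuming as soon as cancellation has been requested.
      if (signal.aborted) break;
      pages.push(page);
    }
  } finally {
    signal.removeEventListener("abort", onAbort);
  }
  return pages;
}
```

In a browser, one option is to create an AbortController, call controller.abort() from a navigation or unmount handler, and pass controller.signal together with an HtmlPageSplitter instance to this helper.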
How It Works
The library works by creating a container element with the dimensions of a page, then progressively filling it with content from the source HTML. When the container overflows, the library intelligently splits the content at an appropriate breaking point (like spaces or hyphens) and starts a new page.
For complex elements like lists, it preserves semantic structure across page breaks by adding appropriate classes and attributes to maintain continuity.
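The fill-until-overflow idea described above can be sketched in a simplified, text-only form. This is not the library's implementation: paginateWords and the FitsPage predicate are hypothetical names, the sketch only breaks at whitespace, and it ignores HTML structure entirely. In a browser the predicate could measure a real container (for example by comparing scrollHeight with clientHeight); here a character budget stands in for that measurement so the example runs anywhere.

```typescript
// Hypothetical predicate: returns true while `text` still fits on one page.
type FitsPage = (text: string) => boolean;

// Greedily fills a page word by word and starts a new page on overflow.
function paginateWords(text: string, fits: FitsPage): string[] {
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const pages: string[] = [];
  let current = "";

  for (const word of words) {
    const candidate = current === "" ? word : `${current} ${word}`;
    if (current === "" || fits(candidate)) {
      // An empty page always accepts the word, even an oversized one,
      // so the loop is guaranteed to make progress.
      current = candidate;
    } else {
      // Overflow: close the current page and start the next one with this word.
      pages.push(current);
      current = word;
    }
  }

  if (current !== "") pages.push(current);
  return pages;
}

// Stand-in for a DOM measurement: a page "fits" at most 20 characters.
const fitsTwentyChars: FitsPage = (text) => text.length <= 20;

const demoPages = paginateWords(
  "The quick brown fox jumps over the lazy dog",
  fitsTwentyChars,
);
// demoPages: ["The quick brown fox", "jumps over the lazy", "dog"]
```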
Performance Tips
- Buffer pages in batches (e.g., every 20 pages) before updating the UI
- Use the abort() method when the user navigates away from the page to free up resources
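The batching tip can be sketched as a small generic helper that groups any async iterable into arrays. The names inBatches, fakePages, and collectBatchSizes are hypothetical and not part of the library; fakePages merely stands in for the output of split() so the sketch runs without a DOM.

```typescript
// Hypothetical helper: groups items from an async iterable into arrays of up to `size`.
async function* inBatches<T>(
  source: AsyncIterable<T>,
  size: number,
): AsyncGenerator<T[]> {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  let batch: T[] = [];
  for await (const item of source) {
    batch.push(item);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  // Flush the last, possibly smaller, batch.
  if (batch.length > 0) yield batch;
}

// Stand-in for the pages yielded by split(); the markup is illustrative only.
async function* fakePages(count: number): AsyncGenerator<string> {
  for (let i = 1; i <= count; i++) {
    yield `<div class="page">Page ${i}</div>`;
  }
}

// Records the size of each batch, where a real app would update the UI instead.
async function collectBatchSizes(total: number, size: number): Promise<number[]> {
  const sizes: number[] = [];
  for await (const batch of inBatches(fakePages(total), size)) {
    sizes.push(batch.length);
  }
  return sizes;
}
```

With the real splitter, the same helper could wrap the generator returned by split(), so the UI is updated once per batch of 20 pages rather than once per page.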