Published 7 months ago (0.5.5)

Html Pages

A high-performance utility for splitting HTML content into multiple pages. Perfect for building paginated content viewers, e-readers, or any application that needs to break down large HTML content into manageable page-sized chunks.

This library is internally used and developed by Talers, the modern writing application for creative writers, to preview their work before publishing it.

Features

  • 📄 Splits HTML content into multiple pages while preserving semantic structure
  • 🧩 Handles complex elements:
    • Lists (bullet, ordered, nested)
    • Tables
    • ... and any other kind of HTML element
  • ⚡ Extremely fast performance:
    • Async page generation that does not block the main thread
    • Can handle thousands of pages in a few seconds (~500 pages per second)
    • Renders the first page as soon as it's available (in ~4ms) for instant user feedback
    • Progressive rendering (start displaying while still processing)

Todo

  • 🚧 Handle words longer than the page width
  • ⏳ Add support for page numbering
  • ⏳ Add support for footnotes
  • ⏳ Use an array of HTML strings as input with mixed options

Installation

# Install from JSR
npx jsr add @talers/html-pages

Requirements

⚠️ This library only works in browser environments as it relies on DOM APIs.

Usage

import { HtmlPageSplitter } from '@talers/html-pages';

// Create a new instance with optional default options
const splitter = new HtmlPageSplitter({
  // The CSS classes to add to the page container
  classes: ['page'],
  
  // Optional: Maximum number of pages to generate
  maxNumberOfPages: 100
});

// Example HTML content to split
const htmlContent = `
  <h1>My Document</h1>
  <p>This is a paragraph that might span across multiple pages.</p>
  <ul>
    <li>List item 1</li>
    <li>List item 2</li>
    <li>List item 3</li>
  </ul>
  <table>
    <tr><td>Cell 1</td><td>Cell 2</td></tr>
    <tr><td>Cell 3</td><td>Cell 4</td></tr>
  </table>
`;

// Start processing and rendering pages
async function renderPages() {
  const pages: Array<string> = [];
  
  for await (const pageHtml of splitter.split(htmlContent)) {
    pages.push(pageHtml);
    
    // You can buffer pages for better performance instead of updating the UI after each page
    if (pages.length % 20 === 0) {
      // TODO: update UI with the new pages (depends on your frontend framework)
    }
  }

  // TODO: update UI with the final pages
  
  console.log(`Split into ${pages.length} pages`);
}

// If you need to cancel the split process (e.g., user navigates away)
function cancelSplitting() {
  splitter.abort();
}

Styling the resulting pages

Page container classes

Each page container in the resulting HTML must have the same classes you passed to the HtmlPageSplitter constructor. This allows you to apply consistent styling to all your pages.

/* Example styling for pages with 'page' class */
.page {
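  /* These are the dimensions of an A4 sheet in millimeters */
  width: 210mm;
  height: 297mm;

  /* Keep the padding inside the A4 dimensions (needed unless box-sizing is already reset globally) */
  box-sizing: border-box;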

  /* Common margins for A4 paper */
  margin: 0;
  padding: 25mm 20mm;

  box-shadow: 0 0 5px rgba(0, 0, 0, 0.2);
  background: white;
  overflow: hidden;
}

/* When printing, we don't want the page shadow */
@media print {
  .page {
    box-shadow: unset;
  }
}

Styling continued list items

When a list spans across multiple pages, the library automatically adds the html-pages__follow-up-list-item class to list items that continue from the previous page. To hide the bullet points or numbers for these continued list items, add the following CSS to your stylesheet:

.html-pages__follow-up-list-item::marker {
  color: transparent;
}

This ensures a clean visual appearance when reading paginated content with lists that break across pages, while maintaining the semantic structure of your HTML.

API

HtmlPageSplitter

The main class for splitting HTML content into pages.

Constructor Options

new HtmlPageSplitter(options?: {
  // Container element to use for the split (default: creates a hidden div)
  container?: HTMLElement;
  
  // CSS classes to add to each page container
  classes?: string[];
  
  // Maximum number of pages to generate (default: Infinity)
  maxNumberOfPages?: number;
});

Methods

  • async *split(html: string, options?: HtmlPageSplitterOptions): Async generator that yields one page at a time
  • abort(): Cancels an ongoing split operation
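
To illustrate the shape of these two methods without a browser, here is a minimal sketch of an abortable async generator. DemoSplitter, collectUntil, and the fixed-size chunking are hypothetical stand-ins invented for this example; they are not the library's implementation, and only the split()/abort() pattern mirrors the methods listed above.

```typescript
// Hypothetical stand-in for illustration only: it chunks a string by length
// instead of measuring DOM overflow, and is NOT the library's implementation.
class DemoSplitter {
  private aborted = false;
  private readonly charsPerPage: number;

  constructor(charsPerPage: number) {
    this.charsPerPage = charsPerPage;
  }

  // Yields one "page" at a time and hands control back to the event loop between pages.
  async *split(text: string): AsyncGenerator<string> {
    this.aborted = false;
    for (let i = 0; i < text.length; i += this.charsPerPage) {
      if (this.aborted) return; // stop quietly once abort() has been called
      yield text.slice(i, i + this.charsPerPage);
      // Let other tasks (rendering, input handling) run before the next page.
      await new Promise<void>((resolve) => setTimeout(resolve, 0));
    }
  }

  abort(): void {
    this.aborted = true;
  }
}

// Collects pages and calls abort() once `limit` pages have been received.
async function collectUntil(
  splitter: DemoSplitter,
  text: string,
  limit: number,
): Promise<string[]> {
  const pages: string[] = [];
  for await (const page of splitter.split(text)) {
    pages.push(page);
    if (pages.length === limit) splitter.abort();
  }
  return pages;
}
```

In this sketch, cancellation is cooperative: abort() only sets a flag, and the generator checks it before producing the next page, so pages already yielded stay valid.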

How It Works

The library works by creating a container element with the dimensions of a page, then progressively filling it with content from the source HTML. When the container overflows, the library intelligently splits the content at an appropriate breaking point (like spaces or hyphens) and starts a new page.
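
The fill-until-overflow idea can be sketched without the DOM. In the simplified model below, the fits callback is a hypothetical stand-in for the overflow check performed on the page container, and text is broken only after whitespace or a hyphen. The names tokenize and paginate are invented for this illustration; this is a sketch of the general approach under those assumptions, not the library's actual code.

```typescript
// Simplified model: `fits` stands in for "the page container does not overflow".
type Fits = (pageText: string) => boolean;

// Split text into tokens that each end at a breaking point (whitespace or hyphen).
// Joining the tokens reproduces the original text exactly.
function tokenize(text: string): string[] {
  return text.match(/[^\s-]+(?:-|\s+)?|\s+|-/g) ?? [];
}

// Greedily fill a page token by token; on overflow, close the page at the
// last breaking point and start a new page with the token that did not fit.
function paginate(text: string, fits: Fits): string[] {
  const pages: string[] = [];
  let current = "";
  for (const token of tokenize(text)) {
    const candidate = (current + token).trimEnd();
    if (current !== "" && !fits(candidate)) {
      pages.push(current.trimEnd());
      current = token.trimStart();
    } else {
      // A single token that is too long still gets a page of its own;
      // this sketch does not split inside words.
      current += token;
    }
  }
  if (current.trim() !== "") pages.push(current.trimEnd());
  return pages;
}

// Usage: pretend a page holds at most 12 characters.
const demoPages = paginate(
  "The quick brown fox jumps over the well-known lazy dog",
  (pageText) => pageText.length <= 12,
);
// demoPages: ["The quick", "brown fox", "jumps over", "the well-", "known lazy", "dog"]
```

A real implementation has to measure rendered layout rather than character counts, which is why the overflow check is passed in as a callback here.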

For complex elements like lists, it preserves semantic structure across page breaks by adding classes and attributes that maintain continuity, such as the html-pages__follow-up-list-item class on list items that continue from the previous page.

Performance Tips

  • Buffer pages in batches (e.g., every 20 pages) before updating the UI
  • Use the abort() method when the user navigates away from the page to free up resources
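
The batching tip can be packaged as a small reusable helper. The sketch below groups items from any async iterable into arrays of a fixed size; inBatches, fakePages, and demo are hypothetical names introduced for this example, and fakePages merely simulates a page source so the code runs outside a browser.

```typescript
// Groups items from any async iterable into arrays of up to `size` items.
async function* inBatches<T>(
  source: AsyncIterable<T>,
  size: number,
): AsyncGenerator<T[]> {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  let batch: T[] = [];
  for await (const item of source) {
    batch.push(item);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final, possibly smaller batch
}

// Hypothetical page source used only to make the sketch runnable.
async function* fakePages(count: number): AsyncGenerator<string> {
  for (let i = 1; i <= count; i++) yield `<p>Page ${i}</p>`;
}

// Records the size of each batch; 45 pages in batches of 20 gives [20, 20, 5].
async function demo(): Promise<number[]> {
  const batchSizes: number[] = [];
  for await (const batch of inBatches(fakePages(45), 20)) {
    // In a real app, this is where the UI would be updated once per batch.
    batchSizes.push(batch.length);
  }
  return batchSizes;
}
```

With the library, the source argument would presumably be the async iterable returned by split(), as in the usage example; that wiring is an assumption and is not shown here because it requires a browser.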
