Generating Accessible PDFs: How to Create WCAG-Compliant Documents

Milena Szymanowska
PDFBolt Co-Founder

[Image: Accessible PDF documents that comply with WCAG standards]

Creating accessible PDF documents isn't just a best practice – it's increasingly a legal requirement across industries. While PDFs remain the standard format for sharing official documents, they often create significant barriers for users with disabilities. This comprehensive guide will walk you through the process of creating WCAG-compliant PDF documents that work for everyone, regardless of ability.

Why PDF Accessibility Matters

PDFs serve as the backbone of digital documentation – used for everything from financial reports and legal contracts to educational materials and technical manuals. However, without proper accessibility features, these documents can be completely unusable for people who rely on assistive technologies like screen readers.

Consider these important facts:

  • Over 1 billion people worldwide live with some form of disability (WHO).
  • Approximately 1 in 4 adults in the United States has a disability (CDC, 2024).
  • Around 2.2 billion people globally are affected by visual impairments (WHO, 2019).
  • The WebAIM Million Report (2025) found an average of 51 distinct accessibility errors per homepage.

In response to these challenges, governments worldwide have implemented accessibility regulations:

  • Americans with Disabilities Act (ADA) and Section 508 of the Rehabilitation Act in the United States (ADA.gov, Section508.gov).
  • European Accessibility Act in the EU (European Commission).
  • Similar legislation in Canada, Australia, and many other countries.

Creating PDFs that comply with Web Content Accessibility Guidelines (WCAG) not only fulfills legal obligations but also ensures equal access to information, improves user experience for everyone, and demonstrates a commitment to digital inclusivity.

Understanding WCAG Standards for PDFs

The Four Principles of Accessibility (POUR)

The Web Content Accessibility Guidelines (WCAG) developed by the W3C (World Wide Web Consortium) provide the foundation for accessible digital content. These guidelines are organized around four core principles, commonly referred to as POUR:

  1. Perceivable: Information must be presentable in ways users can perceive.
  2. Operable: Interface components must be operable by all users.
  3. Understandable: Information and operation must be understandable.
  4. Robust: Content must work with current and future technologies.

For PDFs specifically, the PDF/UA standard (PDF/Universal Accessibility, published as ISO 14289) provides technical specifications that align with WCAG requirements. Understanding both standards is essential for creating truly accessible documents.

Essential Components of an Accessible PDF

An accessible PDF incorporates several critical elements that work together to support assistive technologies:

1. Document Structure and Navigation

A properly structured PDF enables users to navigate efficiently and understand the document's organization:

  • Logical reading order: Content follows a natural sequence that matches visual presentation.
  • Proper headings: Hierarchical heading structure (H1, H2, etc.) for efficient navigation.
  • Bookmarks: Navigation aids that allow quick movement through longer documents.
  • Tagged PDF: All content elements properly tagged with semantic markup.

❌ Example of Poor Structure: A financial report with no heading tags (just visually styled text), no bookmarks for navigation, and images or charts lacking proper tags.

✅ Example of Good Structure: A financial report with clear heading hierarchy (H1 for title, H2 for major sections, H3 for subsections), with each section properly tagged, and bookmarks that mirror this structure.

2. Text Accessibility

Text must be properly formatted and accessible:

  • Real text (not images of text).
  • Searchable content: Text must be selectable and readable by screen readers; scanned documents need OCR (Optical Character Recognition) first.
  • Sufficient color contrast: Minimum 4.5:1 ratio for normal text (3:1 for large text) under WCAG Level AA.
  • Font properties: Fonts must allow character extraction and proper rendering.

❌ Example of Poor Practice: Using a scanned image of text without OCR, making it impossible for screen readers to access the content.

✅ Example of Good Practice: Using properly formatted text with sufficient contrast (dark text on light background) and standard fonts.
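
The contrast requirement can be checked numerically before a document is generated. The sketch below follows the WCAG 2.x relative-luminance and contrast-ratio formulas; the function names are illustrative and not part of any library.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a color written as '#rrggbb'."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical colors) up to 21.0 (black on white)."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    """True when the pair meets the Level AA minimum (4.5:1, or 3:1 for large text)."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


print(round(contrast_ratio("#000000", "#ffffff"), 1))
print(passes_aa("#aaaaaa", "#eeeeee"))
```

Black on white reaches the maximum ratio, while light gray text on a light gray background fails even the large-text threshold.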

3. Non-Text Elements

Images, charts, and graphics require special attention:

  • Alternative text: Descriptive text for all meaningful images and graphics.
  • Complex graphics: Extended descriptions for charts and diagrams.
  • Decorative images: Should be marked as "artifacts" in PDF structure.
  • Tables: Properly structured with headers and simple layouts.

❌ Example of Poor Alt Text: For a chart: "Sales chart" or "Quarterly performance graph" (too vague, lacks actual data and insights).

✅ Example of Good Alt Text: For a sales chart: "Bar chart showing quarterly sales from Q1-Q4 2025. Q1: $1.2M, Q2: $1.5M, Q3: $1.8M, Q4: $2.1M, a steady increase of $0.3M per quarter".

4. Interactive Elements

Forms and interactive components need special consideration:

  • Descriptive Links: Clear indication of destination.
  • Accessible Forms: Properly labeled fields with instructions.
  • Keyboard Navigation: All functions accessible without a mouse.
  • Document Metadata: Title, language, and other properties correctly set.

✅ Example of Good Link: "Download the 2025 Tax Guide".

❌ Example of Poor Link: "Click here" or "Download" (no context about destination).

Technical Requirements for PDF Accessibility

[Figure: Diagram showing key technical requirements for PDF accessibility]

Document Structure Tags

A properly tagged PDF uses a hierarchical structure that communicates the relationships between different content elements to assistive technologies. The PDF specification defines the following standard structure types that are crucial for accessibility:

StructTreeRoot (structure tree root)
└── Document
    └── Part
        ├── H1 (Main heading)
        ├── P (Paragraph)
        ├── Figure
        │   └── Alt text (stored as the /Alt attribute)
        ├── H2 (Subheading)
        ├── Table
        │   └── TR (Table row)
        │       ├── TH (Table header)
        │       └── TD (Table data)
        ├── L (List)
        │   └── LI (List item)
        │       ├── Lbl (List label - bullet or number)
        │       └── LBody (List body - content)
        ├── Form (Form field)
        └── Link (Hyperlink)

Every element must have appropriate semantic tagging to convey its role and relationship within the document. Without proper tagging, screen readers can't interpret the content correctly.
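
To make the nesting rules concrete, here is a minimal sketch that checks a tag tree against a small, simplified subset of the parent/child rules for lists and tables. The tree representation (a tuple of tag name and child list) and the function name are assumptions made for this illustration; real PDFs would need a PDF library to extract the tag tree first.

```python
# Allowed direct children for a few container tags (simplified subset).
ALLOWED_CHILDREN = {
    "L": {"Caption", "LI"},
    "LI": {"Lbl", "LBody"},
    "Table": {"Caption", "TR", "THead", "TBody", "TFoot"},
    "THead": {"TR"},
    "TBody": {"TR"},
    "TFoot": {"TR"},
    "TR": {"TH", "TD"},
}


def find_nesting_errors(node, path=""):
    """Return messages for children that are not allowed under their parent tag."""
    tag, children = node
    here = f"{path}/{tag}"
    errors = []
    allowed = ALLOWED_CHILDREN.get(tag)  # None means "no rule in this subset"
    for child in children:
        if allowed is not None and child[0] not in allowed:
            errors.append(f"{here}: unexpected child {child[0]}")
        errors.extend(find_nesting_errors(child, here))
    return errors


good = ("Document", [
    ("H1", []),
    ("L", [("LI", [("Lbl", []), ("LBody", [])])]),
    ("Table", [("TR", [("TH", []), ("TD", [])])]),
])
# Label and body placed directly under L, without an LI wrapper.
bad = ("Document", [("L", [("Lbl", []), ("LBody", [])])])

print(find_nesting_errors(good))
print(find_nesting_errors(bad))
```

Tags without an entry in the rule table, such as LBody, accept any child here, which is what lets nested lists pass.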

Language Specification in PDFs

Specifying the correct language is crucial for screen readers to use proper pronunciation:

  • Set the document's default language in the document catalog with the /Lang entry
  • For multilingual documents, assign different languages to specific content using the /Lang entry in structure element dictionaries
  • Use BCP 47 language tags, which pair an ISO 639 language code with an optional ISO 3166 region code (e.g., "en-US" for U.S. English, "fr-CA" for Canadian French)

This allows assistive technologies to automatically switch pronunciation rules when encountering different languages, improving comprehension for users relying on screen readers.
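
A language value can be sanity-checked before it is written into the document. The sketch below accepts only the simple language or language-REGION shape used in the examples above and normalizes its casing; it is a deliberately narrow assumption, not a full BCP 47 parser (script subtags, variants, and extensions are rejected).

```python
import re

# Two- or three-letter language code, optionally followed by a two-letter region.
_SIMPLE_TAG = re.compile(r"^([A-Za-z]{2,3})(?:-([A-Za-z]{2}))?$")


def normalize_lang_tag(tag: str) -> str:
    """Return the tag with conventional casing, or raise ValueError."""
    match = _SIMPLE_TAG.match(tag.strip())
    if match is None:
        raise ValueError(f"not a simple language[-REGION] tag: {tag!r}")
    language, region = match.groups()
    if region is None:
        return language.lower()
    return f"{language.lower()}-{region.upper()}"


print(normalize_lang_tag("en-us"))
print(normalize_lang_tag("FR-ca"))
```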

PDF Document Properties

Accessible PDFs require proper document properties:

  • Document Title: A descriptive title in document properties (not just a filename).
  • Language Setting: Primary language specified in metadata.
  • PDF/UA Compliance: Conformance with accessibility standards.
  • Tagged PDF: Structure enabled for accessibility.

Reading Order Control

The reading order must be explicitly defined to ensure screen readers present content logically:

  1. Visual Order: What the eye sees when scanning the document.
  2. Logical Order: How assistive technology should navigate the content.
  3. Tag Order: Structure defined in the document's tag tree.

These three aspects must align for optimal accessibility. When they don't, screen reader users might hear content in a jumbled, meaningless order.
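
One way to spot a mismatch is to compare the order of blocks in the tag tree with their order on the page. The sketch below assumes a single-column page, coordinates measured from the top-left corner, and a hypothetical Block record; multi-column layouts would need a more careful definition of visual order.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    name: str
    top: float   # distance from the top edge of the page
    left: float  # distance from the left edge of the page


def visual_order(blocks):
    """Order blocks top-to-bottom, then left-to-right (single column only)."""
    return sorted(blocks, key=lambda b: (b.top, b.left))


def order_mismatches(tag_order):
    """List positions where the tag order and the visual order disagree."""
    expected = visual_order(tag_order)
    return [
        (index, tagged.name, seen.name)
        for index, (tagged, seen) in enumerate(zip(tag_order, expected))
        if tagged != seen
    ]


# Tag order as it might come out of a poorly tagged file.
tags = [
    Block("Title", top=50, left=72),
    Block("Sidebar note", top=300, left=400),
    Block("Intro", top=120, left=72),
]
print(order_mismatches(tags))
```

Each tuple reports the position, the block the tag tree places there, and the block a sighted reader would encounter there.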

List Structure in Accessible PDFs

When creating lists in PDFs, proper tagging is essential:

  • Use the List (/L) tag to contain the entire list
  • Each list item should use the List Item (/LI) tag
  • Within each item, use List Label (/Lbl) for bullets/numbers and List Body (/LBody) for content
  • For nested lists, include child List elements inside parent List Body elements
  • Ensure reading order follows the visual order of the list

Incorrectly structured lists will not be recognized by assistive technologies, making navigation difficult for users with disabilities.

Page Numbering and Navigation

Consistent page numbering is crucial for accessibility, especially with varying numbering schemes (Roman numerals for front matter, Arabic for main content):

  • Match logical page numbers to printed page numbers using the /PageLabels entry
  • Provide clear headers and footers to help maintain context
  • Include page numbers in a consistent location
  • Implement bookmarks that match the document's heading structure
  • For complex documents, consider section indicators in running headers

These features particularly help users with cognitive disabilities and those using screen magnifiers who see only portions of the page at once.
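
The mixed numbering scheme described above can be sketched as a function that maps a zero-based page index to its printed label. In the PDF itself, a /PageLabels number tree expresses the same idea with numbering styles such as /r (lowercase Roman) and /D (decimal); the helper names below are illustrative only.

```python
def to_roman(number: int) -> str:
    """Lowercase Roman numeral for 1 <= number <= 3999."""
    if not 1 <= number <= 3999:
        raise ValueError("number out of range for Roman numerals")
    numerals = [
        (1000, "m"), (900, "cm"), (500, "d"), (400, "cd"),
        (100, "c"), (90, "xc"), (50, "l"), (40, "xl"),
        (10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i"),
    ]
    parts = []
    for value, symbol in numerals:
        count, number = divmod(number, value)
        parts.append(symbol * count)
    return "".join(parts)


def page_label(page_index: int, front_matter_pages: int) -> str:
    """Label for a zero-based page index: i, ii, ... then 1, 2, ..."""
    if page_index < front_matter_pages:
        return to_roman(page_index + 1)
    return str(page_index - front_matter_pages + 1)


print([page_label(i, front_matter_pages=3) for i in range(6)])
```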

Creating Accessible PDFs with HTML to PDF Conversion

HTML to PDF conversion has emerged as one of the most reliable methods for creating accessible PDFs. This approach leverages the inherent accessibility features of HTML and, provided the converter emits tagged PDF, translates them into PDF structure, offering significant advantages over other conversion methods.

Why HTML to PDF Is Superior for Accessibility

  1. Semantic Structure: HTML's native semantic elements (headings, lists, tables) directly translate to PDF tags.
  2. Built-in Accessibility: ARIA attributes and HTML accessibility features can be preserved.
  3. Consistent Results: More reliable tag structure compared to word processors or design software.
  4. Automation Potential: Easier to integrate into automated workflows and CI/CD pipelines.
  5. Better Testing Options: HTML accessibility can be verified before conversion.

Preparing HTML for Accessible PDF Conversion

The accessibility of your final PDF depends heavily on your source HTML. Follow these best practices:

1. Document Structure

A properly structured HTML document provides clear organization that translates well to PDF tags. The header, navigation, main content, and footer should all be semantically defined, which helps assistive technologies understand the document's structure.

Here's how to structure your HTML document:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <title>Annual Financial Report 2025</title>
  <!-- The title tag becomes the PDF document title -->
</head>
<body>
  <header>
    <h1>Annual Financial Report</h1>
    <p>Fiscal Year 2025</p>
  </header>
  <nav>
    <!-- In-page links need targets that match section ids; bookmarks are typically generated from headings -->
    <ul>
      <li><a href="#overview">Executive Overview</a></li>
      <li><a href="#financials">Financial Statements</a></li>
    </ul>
  </nav>
  <main>
    <section id="overview">
      <h2>Executive Overview</h2>
      <!-- Content here -->
    </section>
    <!-- Additional sections -->
  </main>
  <footer>
    <p>© 2025 Company Name</p>
  </footer>
</body>
</html>

2. Critical Accessibility Attributes

These attributes and techniques enhance accessibility by providing essential context and functionality for assistive technologies:

  • Document language: The lang attribute ensures screen readers pronounce content correctly.
  • Alternative text: Describes images for people who cannot see them.
  • Skip links: Allow keyboard users to bypass repetitive navigation.
  • ARIA landmarks: Define regions of the page for easier navigation.
  • Form labels: Associate inputs with their descriptions for screen readers.
  • Focus visibility: Makes it clear which element is currently selected.
  • ARIA roles: Provide semantic meaning to custom interactive elements.
  • Heading structure: Creates a logical document outline.
  • Status messages: Announce updates without requiring user focus.
  • Color contrast: Ensures text is readable for users with low vision.

Here are examples of implementing these critical accessibility attributes:

<!-- Document language (required for screen readers) -->
<html lang="en">

<!-- Alternative text for images -->
<img src="chart.png" alt="Q1 2025 Sales increased by 15% compared to Q4 2024">

<!-- Skip navigation for keyboard users -->
<a href="#main-content" class="skip-link">Skip to main content</a>

<!-- ARIA landmarks (contents elided) -->
<header role="banner">...</header>
<nav role="navigation">...</nav>
<main role="main" id="main-content">...</main>
<footer role="contentinfo">...</footer>

<!-- Form labels -->
<label for="email">Email Address:</label>
<input type="email" id="email" name="email" required>

<!-- Focus visibility for interactive elements -->
<style>
  :focus {
    outline: 3px solid #0066CC;
    outline-offset: 2px;
  }
</style>

<!-- ARIA for custom UI components -->
<div role="button" tabindex="0" aria-pressed="false" class="custom-button">
  Toggle Feature
</div>

<!-- Proper heading structure -->
<h1>Main Document Title</h1>
<h2>Major Section</h2>
<h3>Subsection</h3>
<h3>Another Subsection</h3>
<h2>Another Major Section</h2>

<!-- Status messages -->
<div role="status" aria-live="polite">
  Your changes have been saved successfully.
</div>

<!-- Color contrast example -->
<!-- Good contrast: -->
<p style="color: #000000; background-color: #ffffff;">Dark text on light background</p>
<!-- Poor contrast: -->
<p style="color: #aaaaaa; background-color: #eeeeee;">Light text on light background</p>
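
Several of these attributes can be checked automatically before conversion. The sketch below uses Python's standard html.parser module to flag a missing lang attribute, images without an alt attribute, and skipped heading levels; the class and function names are made up for this example, and the checks cover only a small slice of what a full accessibility linter does.

```python
from html.parser import HTMLParser


class AccessibilityAudit(HTMLParser):
    """Collect a few common accessibility issues from an HTML string."""

    def __init__(self):
        super().__init__()
        self.issues = []
        self._last_heading = 0  # level of the most recent heading, 0 if none yet

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "html" and not attributes.get("lang"):
            self.issues.append("<html> is missing a lang attribute")
        elif tag == "img" and "alt" not in attributes:
            self.issues.append(f"<img src={attributes.get('src')!r}> has no alt attribute")
        elif len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if level > self._last_heading + 1:
                previous = f"h{self._last_heading}" if self._last_heading else "the document start"
                self.issues.append(f"<{tag}> skips a level after {previous}")
            self._last_heading = level


def audit(html: str) -> list[str]:
    """Return the issues found in the given HTML, in document order."""
    parser = AccessibilityAudit()
    parser.feed(html)
    parser.close()
    return parser.issues


sample = '<html><body><h1>Title</h1><h3>Section</h3><img src="c.png"></body></html>'
for issue in audit(sample):
    print(issue)
```

An empty alt attribute is deliberately accepted, since alt="" is the usual way to mark a decorative image.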

3. Table Structure for Accessibility

Accessible tables require special attention to ensure screen reader users can understand the data relationships. Key elements include:

  • Table captions: Provide context for the entire table.
  • Header cells: Properly marked with <th> elements.
  • Scope attributes: Help associate headers with data cells.
  • Organized sections: Use <thead>, <tbody>, and <tfoot> to structure content.
  • Simple layouts: Avoid complex merged cells when possible.

Here's how to create an accessible table structure:

<table>
  <!-- Caption provides context for screen reader users -->
  <caption>Quarterly Revenue by Region (in thousands)</caption>

  <!-- Table headers with scope attribute -->
  <thead>
    <tr>
      <th scope="col">Region</th>
      <th scope="col">Q1 2025</th>
      <th scope="col">Q2 2025</th>
      <th scope="col">Q3 2025</th>
      <th scope="col">Q4 2025</th>
    </tr>
  </thead>

  <!-- Table body with row headers -->
  <tbody>
    <tr>
      <th scope="row">North America</th>
      <td>$120</td>
      <td>$145</td>
      <td>$160</td>
      <td>$190</td>
    </tr>
    <tr>
      <th scope="row">Europe</th>
      <td>$95</td>
      <td>$110</td>
      <td>$125</td>
      <td>$140</td>
    </tr>
  </tbody>

  <!-- Table footer for totals -->
  <tfoot>
    <tr>
      <th scope="row">Total</th>
      <td>$215</td>
      <td>$255</td>
      <td>$285</td>
      <td>$330</td>
    </tr>
  </tfoot>
</table>
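
A quick structural check can confirm that a table carries a caption and that every header cell declares a scope. This is a minimal sketch using the standard html.parser module; it assumes the input holds a single table, and the names are illustrative.

```python
from html.parser import HTMLParser

VALID_SCOPES = {"col", "row", "colgroup", "rowgroup"}


class TableAudit(HTMLParser):
    """Record whether a table has a caption and count headers lacking a scope."""

    def __init__(self):
        super().__init__()
        self.has_caption = False
        self.unscoped_headers = 0

    def handle_starttag(self, tag, attrs):
        if tag == "caption":
            self.has_caption = True
        elif tag == "th" and dict(attrs).get("scope") not in VALID_SCOPES:
            self.unscoped_headers += 1


def audit_table(html: str) -> dict:
    """Summarize caption and header-scope findings for one table."""
    parser = TableAudit()
    parser.feed(html)
    parser.close()
    return {"has_caption": parser.has_caption, "unscoped_headers": parser.unscoped_headers}


table = (
    "<table><caption>Revenue</caption>"
    '<tr><th scope="col">Region</th><th>Q1</th></tr>'
    '<tr><th scope="row">Europe</th><td>$95</td></tr></table>'
)
print(audit_table(table))
```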

Advanced HTML to PDF Accessibility Techniques

Document Outline and Bookmarks

PDF bookmarks provide navigation aids that are especially helpful for users with disabilities. They can be automatically generated from HTML headings.

When properly implemented, bookmarks:

  • Allow quick navigation between sections.
  • Provide an overview of document structure.
  • Help users understand document organization.
  • Enable keyboard users to move through content efficiently.
  • Make long documents more manageable for all users.
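
Generating bookmarks from headings amounts to turning a flat list of heading levels into a nested tree. The sketch below shows that step on its own, assuming the headings have already been extracted as (level, title) pairs; the "Highlights" entry is a made-up subsection, and how the resulting tree is written into the PDF depends on the converter.

```python
def build_outline(headings):
    """Nest (level, title) pairs into a bookmark tree of dictionaries."""
    root = {"title": None, "children": []}
    stack = [(0, root)]  # (level, node) for each branch that is still open
    for level, title in headings:
        node = {"title": title, "children": []}
        # Close branches at the same or a deeper level than the new heading.
        while stack[-1][0] >= level:
            stack.pop()
        stack[-1][1]["children"].append(node)
        stack.append((level, node))
    return root["children"]


headings = [
    (1, "Annual Financial Report"),
    (2, "Executive Overview"),
    (3, "Highlights"),
    (2, "Financial Statements"),
]
outline = build_outline(headings)
print([child["title"] for child in outline[0]["children"]])
```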

Accessible Math Content

Mathematical equations require special handling to remain accessible in PDFs. There are two primary approaches:

  1. Using MathML for structured mathematical content that screen readers can interpret correctly.
  2. Using an ARIA approach with descriptive labels for simpler equations.

Here are examples of both techniques:

<!-- Use MathML for accessible math equations -->
<math xmlns="http://www.w3.org/1998/Math/MathML">
  <mrow>
    <mi>E</mi>
    <mo>=</mo>
    <mi>m</mi>
    <msup>
      <mi>c</mi>
      <mn>2</mn>
    </msup>
  </mrow>
</math>

<!-- Alternatively, provide fallback with aria-label -->
<span role="math" aria-label="E equals m c squared">
  E = mc<sup>2</sup>
</span>

Form Fields

Accessible form fields are essential for interactive PDFs. Proper field labeling and structure ensure that screen reader users can understand and interact with forms. Key techniques include explicit labels, grouped inputs, help text, and proper error handling.

Here are examples demonstrating accessible form implementation patterns:

<!-- Basic accessible form field -->
<div class="form-field">
  <label for="full-name">Full Name:</label>
  <!-- aria-describedby ties the instructions to the input -->
  <input type="text" id="full-name" name="full-name" required aria-required="true"
         aria-describedby="name-instructions">
  <div id="name-instructions" class="instructions">Enter your first and last name</div>
</div>

<!-- Radio button group with fieldset -->
<fieldset>
  <legend>Preferred Contact Method:</legend>
  <div class="radio-option">
    <input type="radio" id="contact-email" name="contact" value="email">
    <label for="contact-email">Email</label>
  </div>
  <div class="radio-option">
    <input type="radio" id="contact-phone" name="contact" value="phone">
    <label for="contact-phone">Phone</label>
  </div>
</fieldset>

<!-- Dropdown (select) menu -->
<div class="form-field">
  <label for="department">Select Department:</label>
  <select id="department" name="department">
    <option value="">Please select...</option>
    <option value="sales">Sales</option>
    <option value="support">Customer Support</option>
    <option value="legal">Legal</option>
  </select>
</div>

<!-- Error message association -->
<div class="form-field">
  <label for="email">Email Address:</label>
  <input type="email" id="email" name="email"
         aria-describedby="email-help email-error"
         aria-invalid="true">
  <div id="email-help" class="help-text">Enter your work email address</div>
  <div id="email-error" class="error-message" role="alert">
    Please enter a valid email address
  </div>
</div>

<!-- Required field indication -->
<div class="form-field">
  <label for="phone">
    Phone Number:
    <!-- The asterisk is visual only; the sr-only text is read as part of the label -->
    <span class="required" aria-hidden="true">*</span>
    <span class="sr-only">(required)</span>
  </label>
  <input type="tel" id="phone" name="phone" required aria-required="true">
</div>

These form examples demonstrate several important accessibility techniques:

  1. Explicit labels: Using <label for="id"> to connect labels with form controls.
  2. Grouped inputs: Using <fieldset> and <legend> for related controls.
  3. Help text: Instructions that explain expected input.
  4. Error messages: Accessible error notifications with aria-describedby and role="alert".
  5. Required fields: Marked both visually and with aria-required="true".
  6. Invalid state: Using aria-invalid="true" to indicate validation errors.
  7. Screen reader text: Using .sr-only class for content visible only to screen readers.

Each of these techniques ensures form fields are accessible to screen reader users and keyboard navigators.
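
The explicit-label technique lends itself to an automated check: every form control should have an id that some label points to. The sketch below uses the standard html.parser module and, as a simplifying assumption, ignores other valid labeling methods such as wrapping the control in a label or using aria-label; the names are illustrative.

```python
from html.parser import HTMLParser


class LabelAudit(HTMLParser):
    """Collect label targets and form control ids from an HTML string."""

    CONTROLS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.label_targets = set()
        self.control_ids = []  # None stands for a control without an id

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "label" and attributes.get("for"):
            self.label_targets.add(attributes["for"])
        elif tag in self.CONTROLS and attributes.get("type") != "hidden":
            self.control_ids.append(attributes.get("id"))


def unlabeled_controls(html: str) -> list:
    """Return ids of controls with no matching <label for>; None means no id at all."""
    parser = LabelAudit()
    parser.feed(html)
    parser.close()
    return [cid for cid in parser.control_ids if cid not in parser.label_targets]


form = (
    '<label for="a">Name</label><input id="a">'
    '<input id="b"><select name="c"></select>'
    '<input type="hidden" id="token">'
)
print(unlabeled_controls(form))
```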

HTML to PDF Conversion Tools and Libraries

Many libraries can convert HTML to accessible PDFs, each with different features and capabilities.

Here are several reliable options:

1. Puppeteer (Node.js)

Puppeteer provides one of the most reliable ways to create accessible PDFs by leveraging Chrome's PDF export capabilities. With Puppeteer, you can programmatically control a headless Chrome browser to generate PDFs with proper tagging and structure.

Learn more: For a detailed implementation guide, check out "Generate PDFs in Node.js Using Puppeteer".

2. WeasyPrint (Python)

WeasyPrint offers good accessibility support with proper configuration. This Python library excels at CSS-based styling and layout, which can be leveraged to enhance accessibility features.

Learn more: For step-by-step instructions and code examples, see "Generate PDF from HTML Using WeasyPrint and PyPDF2".

3. HTML to PDF API

For those seeking an easy-to-implement solution without managing complex infrastructure, a cloud-based HTML to PDF API like PDFBolt offers a convenient alternative.

Learn more: To get started with API-based PDF generation, read "How to Convert HTML to PDF Using an API".

Testing PDF Accessibility

After conversion, always test the accessibility of your PDFs using both automated and manual methods.

Automated Testing Tools

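Several specialized tools can help evaluate PDF accessibility, including PAC (PDF Accessibility Checker), the accessibility checker built into Adobe Acrobat Pro, and veraPDF.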

Note: Automated tools typically detect only 20–30% of accessibility issues. Manual testing is critical for evaluating reading order, logical flow, alternative text accuracy, and usability with assistive technology.

Screen Reader Testing

Manual testing with screen readers is essential to validate the real user experience. Commonly used screen readers include NVDA and JAWS on Windows and VoiceOver on macOS and iOS.

Manual Testing

Always include manual testing in your workflow.

Key areas to check include:

  • Document structure.
  • Reading order.
  • Form field accessibility.
  • Proper tagging.
  • Keyboard navigation.

Having a real screen reader user test your document provides the most valuable feedback.

Common Accessibility Challenges and Solutions

Addressing accessibility in PDFs often involves overcoming specific technical challenges that require targeted solutions.

Challenge: Complex Tables
Solution:
  • Simplify table structure.
  • Use proper header associations.
  • Provide table summaries.
  • Split into simpler tables.
Practical Application:
  • Avoid one complex financial table with merged cells.
  • Create separate quarterly tables with consistent headers.

Challenge: Charts and Visualizations
Solution:
  • Include comprehensive alt text.
  • Provide data tables as alternatives.
  • Use proper captions.
  • Include text descriptions.
Practical Application (for a sales chart):
  • Include alt text summarizing trends.
  • Add data table showing exact values.

Challenge: Scanned Documents
Solution:
  • Use high-resolution scanning (300+ DPI).
  • Process with OCR software supporting PDF/UA.
  • Verify text accuracy and add structure tags.
  • Set proper metadata and reading order.
Practical Application:
  • Process scanned contracts with quality OCR.
  • Manually review text recognition accuracy.
  • Add tags and properties to make content accessible.

Challenge: Legacy PDF Collection
Solution:
  • Create a complete document catalog (list all existing PDFs).
  • Run automated tests on the collection.
  • Prioritize by usage and importance.
  • Follow structured remediation process.
Practical Application:
  • Begin with most frequently accessed documents.
  • Tag untagged elements.
  • Correct reading order.
  • Add alt text to images.

Challenge: Highly Complex Documents
Solution:
  • Evaluate whether remediation or recreation is more efficient.
  • Consider source availability.
  • Assess visual complexity.
Practical Application:
  • Recreate if source files are available.
  • Rebuild heavily visual documents from scratch.
  • Replace very old PDFs with modern versions.
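
For a legacy collection, prioritizing by usage can be as simple as sorting an inventory. The sketch below assumes a hypothetical record with a view count and the number of failed automated checks per document; the field names and sample figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class LegacyPdf:
    name: str
    monthly_views: int
    failed_checks: int  # result of whatever automated test was run


def remediation_queue(documents):
    """Documents with failures, most viewed first; ties go to the one with more failures."""
    needing_work = [d for d in documents if d.failed_checks > 0]
    return sorted(needing_work, key=lambda d: (-d.monthly_views, -d.failed_checks))


inventory = [
    LegacyPdf("benefits-guide.pdf", monthly_views=900, failed_checks=4),
    LegacyPdf("2019-newsletter.pdf", monthly_views=12, failed_checks=9),
    LegacyPdf("price-list.pdf", monthly_views=900, failed_checks=7),
    LegacyPdf("tax-form.pdf", monthly_views=5000, failed_checks=0),
]
print([d.name for d in remediation_queue(inventory)])
```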

Conclusion

Making your PDFs accessible isn’t just about fulfilling legal obligations – it’s about promoting equal access to information for all users. By applying techniques like semantic tagging, maintaining a logical reading order, and providing alternative text, you ensure your documents are compatible with screen readers and other assistive technologies.

Whether you're converting from HTML to PDF or remediating existing documents, the techniques outlined in this WCAG compliance guide provide a clear roadmap to accessibility. Begin implementing these PDF accessibility standards to make your content available to all users, regardless of ability.

Remember: Accessible PDFs aren’t just better for people with disabilities – they’re better for everyone. 🌍