PDF Accessibility: A Developer's Guide to WCAG-Compliant Documents

Accessibility isn't optional anymore. In the EU, the European Accessibility Act (EAA) has applied since June 28, 2025. In the US, Section 508 of the Rehabilitation Act requires federal agencies to provide accessible documents, and courts have repeatedly ruled that the Americans with Disabilities Act (ADA) applies to digital content, including PDFs.

Beyond legal compliance, there's a compelling business case: roughly 1.3 billion people worldwide — about 16% of the population — live with some form of disability. If your invoices, contracts, reports, or documentation are inaccessible, you're excluding a significant portion of your audience.

This guide covers what you need to know to create programmatically generated PDFs that work with screen readers, keyboard navigation, and other assistive technologies.

What Makes a PDF "Accessible"?

An accessible PDF can be consumed by anyone, regardless of their abilities. This includes people who:

Use screen readers (JAWS, NVDA, VoiceOver) because they are blind or have low vision
Navigate by keyboard because they can't use a mouse
Use magnification software because they have low vision
Have cognitive disabilities that make dense layouts difficult to process
Are colorblind and can't rely on color alone to convey information

The key technical requirements:

1. Document Structure (Tags)

Unlike HTML, where structure is inherent (<h1>, <p>, <table>), a PDF is flat by default — it's just characters drawn at coordinates. To add structure, PDFs use tags, which are similar to HTML elements.

A tagged PDF includes a "tag tree" that describes the logical structure:

Document
├── H1: "Monthly Report"
├── P: "Generated on February 1, 2026"
├── H2: "Revenue Summary"
├── Table
│   ├── TR
│   │   ├── TH: "Quarter"
│   │   ├── TH: "Revenue"
│   │   └── TH: "Growth"
│   ├── TR
│   │   ├── TD: "Q1"
│   │   ├── TD: "$1.2M"
│   │   └── TD: "+15%"
│   └── ...
├── H2: "Key Insights"
├── P: "Revenue grew..."
└── Figure (alt: "Bar chart showing quarterly revenue")

Without tags, a screen reader has to infer the reading order from the order in which content was drawn or from its position on the page, which often produces nonsensical output for multi-column layouts or tables.
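To make the idea concrete, here is a minimal sketch of a tag tree as a data structure. The Tag class and reading_order function are hypothetical names chosen for illustration, not part of any PDF library; the point is only that a depth-first walk of the tree yields the order in which content is announced.

```python
from dataclasses import dataclass, field


@dataclass
class Tag:
    """One node of a simplified, hypothetical tag tree."""
    role: str                      # e.g. "H1", "P", "Table"
    text: str = ""                 # text content of leaf nodes
    children: list["Tag"] = field(default_factory=list)


def reading_order(node: Tag) -> list[str]:
    """Depth-first walk: the order in which content would be announced."""
    spoken = [f"{node.role}: {node.text}"] if node.text else []
    for child in node.children:
        spoken.extend(reading_order(child))
    return spoken


report = Tag("Document", children=[
    Tag("H1", "Monthly Report"),
    Tag("H2", "Revenue Summary"),
    Tag("Table", children=[
        Tag("TR", children=[Tag("TH", "Quarter"), Tag("TH", "Revenue")]),
        Tag("TR", children=[Tag("TD", "Q1"), Tag("TD", "$1.2M")]),
    ]),
])

for line in reading_order(report):
    print(line)
```

Because the order comes from the tree rather than from page coordinates, moving the table to a different spot on the page would not change what the listener hears.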

2. Reading Order

In a sighted experience, you scan a page visually. But for screen readers, there must be a logical reading order defined in the tag tree. This matters most for:

Multi-column layouts: Should the reader go down column 1, then column 2? Or alternate between them?
Sidebars and callouts: When should these be read relative to the main content?
Headers and footers: Should running headers be read on every page, or skipped?
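The multi-column case can be sketched in a few lines. This is an illustration under simplified assumptions (a hypothetical Fragment type, a single known column boundary), not how any particular reader or converter works: sorting text purely by position interleaves the two columns, while a column-aware order reads each column in full.

```python
from typing import NamedTuple


class Fragment(NamedTuple):
    text: str
    x: float   # left edge, in points
    y: float   # distance from the top of the page, in points


def naive_order(fragments):
    """Top-to-bottom, then left-to-right: what position alone suggests."""
    return [f.text for f in sorted(fragments, key=lambda f: (f.y, f.x))]


def column_order(fragments, column_split_x):
    """Read the whole left column first, then the right column."""
    left = sorted((f for f in fragments if f.x < column_split_x), key=lambda f: f.y)
    right = sorted((f for f in fragments if f.x >= column_split_x), key=lambda f: f.y)
    return [f.text for f in left + right]


page = [
    Fragment("Revenue grew", x=50, y=100),
    Fragment("Costs fell", x=320, y=100),
    Fragment("by 15% in Q1.", x=50, y=115),
    Fragment("by 3% in Q1.", x=320, y=115),
]

print(naive_order(page))
print(column_order(page, column_split_x=300))
```

The naive order splices the two sentences together, which is the kind of output a well-ordered tag tree is meant to prevent.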

3. Alternative Text

Every non-decorative image needs alternative text (alt text). This applies to:

Photographs and illustrations
Charts and graphs (describe the data, not just "a chart")
Logos (e.g., "Company Name Logo")
QR codes (describe what they link to)

Decorative images should be marked as artifacts so screen readers skip them entirely.

4. Language Declaration

The document must declare its primary language, and any content in a different language should be tagged with the appropriate language attribute. This allows screen readers to switch voices/pronunciation automatically.
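Language declarations behave like inheritance: an element uses the nearest language declared on itself or an ancestor. The sketch below models that rule with a hypothetical list of (role, lang) pairs running from the document root down to a piece of text; it is a simplification for illustration, not a PDF parser.

```python
def effective_language(path):
    """Return the language in force at the end of a root-to-leaf path.

    `path` is a list of (role, lang) pairs, where lang is None when the
    element does not declare a language of its own.
    """
    current = None
    for _role, lang in path:
        if lang is not None:
            current = lang   # a nearer declaration overrides the inherited one
    return current


# Document declared as English, with one French quotation inside a paragraph.
english_text = [("Document", "en-US"), ("P", None)]
french_quote = [("Document", "en-US"), ("P", None), ("Span", "fr-FR")]

print(effective_language(english_text))
print(effective_language(french_quote))
```

If the document-level declaration is missing, the function returns None for ordinary text, which corresponds to the situation where a screen reader has to guess which voice to use.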

5. Color and Contrast

Color contrast: Text must have at least a 4.5:1 contrast ratio against its background (WCAG AA level). Large text needs at least 3:1.
Color independence: Don't use color as the only way to convey information. For example, don't use red text alone to indicate errors — add an icon or text label too.
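The contrast ratio can be computed from two colors using the relative-luminance formula that WCAG defines. The sketch below is one way to do it; the function names are illustrative, and it assumes opaque colors given as '#RRGGBB' strings.

```python
def _linearize(channel_8bit):
    """Convert an sRGB channel (0-255) to linear light, per the WCAG formula."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a color given as '#RRGGBB'."""
    value = hex_color.lstrip("#")
    r, g, b = (int(value[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1.0 (identical colors) up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def passes_aa(foreground, background, large_text=False):
    """AA requires 4.5:1 for normal text and 3:1 for large text."""
    return contrast_ratio(foreground, background) >= (3.0 if large_text else 4.5)


print(round(contrast_ratio("#000000", "#FFFFFF"), 2))
print(passes_aa("#777777", "#FFFFFF"))
```

A check like this is easy to run over the color palette of a template before any PDF is generated, since the colors are usually defined in one place.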

PDF/UA: The Accessibility Standard

PDF/UA (Universal Accessibility — ISO 14289) is the standard specifically for accessible PDFs. It builds on PDF tagging and adds strict requirements:

Requirement	What it means
All content must be tagged	No untagged text or images
Tags must use standard types	H1, P, Table, Figure, etc., or custom tags role-mapped to them
Reading order must be logical	Tag tree order = reading order
All images need alt text	Or be marked as artifacts
Tables need headers	TH elements for row/column headers
Lists must be tagged	L, LI, Lbl, LBody
Language must be specified	Document-level and span-level
Fonts must be embedded	Every character must be renderable
Document title must be set	In metadata, not just on the first page

PDF/UA vs WCAG: What's the Difference?

WCAG (Web Content Accessibility Guidelines) was designed for web content. PDF/UA was designed specifically for PDFs. In practice, a PDF that meets PDF/UA will also satisfy most WCAG 2.1 requirements for non-web documents, though not all of them: the 4.5:1 contrast minimum, for example, comes from WCAG, not PDF/UA. Many regulations reference one or both standards.

Creating Accessible PDFs Programmatically

Approach 1: Tagged PDF from HTML

If you're generating PDFs from HTML, you get structure "for free" — your HTML headings, paragraphs, tables, and images already have semantic meaning. The challenge is ensuring your PDF converter preserves that structure as tags.

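Good news: Prince and WeasyPrint can generate tagged PDFs from well-structured HTML. Tagging may need to be switched on (for example, Prince's --tagged-pdf option or WeasyPrint's pdf/ua-1 variant), so check the documentation for the version you run.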

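Mixed news: Puppeteer/Chromium support depends on the version. Chromium can emit tagged PDFs, and Puppeteer's page.pdf() exposes a tagged option, but older Puppeteer releases left it off by default. Untagged output still contains real text, but only as characters drawn at coordinates, with no structure.

// Puppeteer: request tagging explicitly rather than relying on the default
const pdf = await page.pdf({ format: 'A4', tagged: true });
// Tagged is not the same as PDF/UA-compliant, so validate the result

If accessibility is a requirement and you're using a headless browser, validate its output. If the tags are missing or fail validation, you need a post-processing step to fix them, or you need to switch to a tool that generates compliant output natively.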
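Whichever converter you use, a quick smoke test can tell you whether its output looks tagged at all. The sketch below is a crude heuristic under stated assumptions, not a validator: it only searches the raw bytes for the two names a tagged PDF's catalog normally contains, and the function name is hypothetical.

```python
def looks_tagged(pdf_bytes: bytes) -> bool:
    """Crude smoke test: do the raw bytes mention the structures tagging needs?

    A tagged PDF normally has /StructTreeRoot and /MarkInfo entries in its
    catalog. This scan only checks that both names appear in plain text, so
    a False result on a file that stores its catalog in a compressed object
    stream is inconclusive, and a True result says nothing about quality.
    """
    has_struct_tree = b"/StructTreeRoot" in pdf_bytes
    has_mark_info = b"/MarkInfo" in pdf_bytes
    return has_struct_tree and has_mark_info


# Hand-written fragments standing in for real converter output.
tagged_sample = (
    b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /MarkInfo << /Marked true >> "
    b"/StructTreeRoot 5 0 R >>\nendobj\n"
)
untagged_sample = b"%PDF-1.7\n1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"

print(looks_tagged(tagged_sample))
print(looks_tagged(untagged_sample))
```

For a real file you would pass in the bytes read from disk. Treat a positive result as a reason to run a proper PDF/UA validator, not as a substitute for one.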

Approach 2: Tagged PDF from Low-Level Libraries

Some low-level libraries support PDF tagging:

TCPDF can render simple HTML through writeHTML(), but treat its tagging support as limited at best and check the output with a validator:

$pdf->writeHTML('<h1>Title</h1><p>Content</p>');

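PDFlib has excellent tagging support:

// Assumes the document was started with tagging enabled (the tagged=true option)
$id = $pdf->begin_item("H1", "Title={Chapter 1}");
$pdf->fit_textline("Chapter 1: Introduction", 50, 700, "");
$pdf->end_item($id);

// $tf is a textflow handle created earlier
$id = $pdf->begin_item("P", "");
$pdf->fit_textflow($tf, 50, 100, 500, 680, "");
$pdf->end_item($id);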

ReportLab (Python) offers Paragraph markup and AcroForm form fields, but neither is a tagging mechanism, so full PDF/UA support requires significant effort.

Approach 3: Using an API Service

Some PDF generation APIs handle accessibility automatically. When you design a template with headings, paragraphs, tables, and images with alt text, the API generates a tagged PDF without any extra work on your part. This is often the easiest path to compliance, especially for teams without PDF accessibility expertise.

Common Accessibility Patterns

Pattern 1: Accessible Tables

Tables are one of the trickiest elements for accessibility. Screen readers need to understand which cells are headers and which are data.

<!-- ✅ Accessible table -->
<table>
    <caption>Q1 2026 Revenue by Region</caption>
    <thead>
        <tr>
            <th scope="col">Region</th>
            <th scope="col">Revenue</th>
            <th scope="col">Growth</th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <th scope="row">North America</th>
            <td>$4.2M</td>
            <td>+12%</td>
        </tr>
        <tr>
            <th scope="row">Europe</th>
            <td>$3.1M</td>
            <td>+8%</td>
        </tr>
    </tbody>
</table>

Key points:

Use <caption> to describe the table's purpose
Use <th> for header cells with scope="col" or scope="row"
Don't use tables for layout — only for tabular data
For complex tables with merged cells, give each header cell an id and list the relevant ids in the headers attribute of each data cell
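Checks like these can be automated against your HTML templates before conversion. The following is a minimal sketch using Python's standard html.parser; the TableAudit class and audit_tables function are hypothetical names, and the rules are deliberately simplified (nested tables, scope="colgroup"/"rowgroup", and headers/id associations are not handled).

```python
from html.parser import HTMLParser


class TableAudit(HTMLParser):
    """Collects simple accessibility problems for tables in an HTML template."""

    def __init__(self):
        super().__init__()
        self.problems = []
        self._in_table = False
        self._has_caption = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "table":
            self._in_table = True
            self._has_caption = False
        elif tag == "caption" and self._in_table:
            self._has_caption = True
        elif tag == "th" and attributes.get("scope") not in ("col", "row"):
            self.problems.append("th without scope='col' or scope='row'")

    def handle_endtag(self, tag):
        if tag == "table":
            if not self._has_caption:
                self.problems.append("table without caption")
            self._in_table = False


def audit_tables(html):
    """Return a list of problem descriptions, empty if none were found."""
    parser = TableAudit()
    parser.feed(html)
    parser.close()
    return parser.problems


good = ('<table><caption>Revenue</caption><tr><th scope="col">Region</th></tr>'
        '<tr><td>Europe</td></tr></table>')
bad = '<table><tr><th>Region</th></tr></table>'

print(audit_tables(good))
print(audit_tables(bad))
```

Running a check like this on templates catches problems at the source, which is cheaper than discovering them in the generated PDF.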

Pattern 2: Meaningful Alt Text for Charts

Bad: alt="Chart" or alt="chart1.png"

Good: alt="Bar chart showing quarterly revenue growth from $3.2M in Q1 to $4.8M in Q4 2025, with the strongest growth in Q3 at 22%"

For complex charts, consider providing a data table as a supplement:

<figure>
    <img src="revenue-chart.png"
         alt="Quarterly revenue growth chart showing upward trend throughout 2025" />
    <figcaption>
        Figure 1: Quarterly Revenue Growth, 2025.
        See <a href="#table-1">Table 1</a> for exact figures.
    </figcaption>
</figure>
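The difference between bad and good alt text can be partly checked by machine. The sketch below flags missing alt attributes, file names, and one-word generic descriptions; the word list and function names are assumptions made for illustration, and no heuristic can judge whether a description actually conveys the data.

```python
import re
from html.parser import HTMLParser

# Hypothetical heuristics: words that say nothing on their own, and file names.
GENERIC_ALT = {"chart", "graph", "image", "picture", "logo", "figure"}
FILENAME = re.compile(r"\.(png|jpe?g|gif|svg|webp)$", re.IGNORECASE)


def alt_problem(alt):
    """Return a short description of what is wrong with `alt`, or None."""
    if alt is None:
        return "missing alt attribute"
    text = alt.strip()
    if text == "":
        return None  # in HTML, an empty alt marks the image as decorative
    if FILENAME.search(text):
        return "alt text is a file name"
    if text.lower() in GENERIC_ALT:
        return "alt text is too generic"
    return None


class ImageAudit(HTMLParser):
    """Collects (src, problem) pairs for every img element with weak alt text."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attributes = dict(attrs)
            problem = alt_problem(attributes.get("alt"))
            if problem:
                self.problems.append((attributes.get("src"), problem))


def audit_images(html):
    parser = ImageAudit()
    parser.feed(html)
    parser.close()
    return parser.problems


print(alt_problem("Chart"))
print(alt_problem("chart1.png"))
print(audit_images('<img src="a.png"><img src="b.png" alt="chart1.png" />'))
```

Whether an empty alt attribute ends up as an artifact in the PDF depends on the converter, so that part is worth confirming with a validator.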

Pattern 3: Accessible Form Fields in PDFs

If your PDFs contain fillable form fields (common in contracts and applications):

Every field must have a label
Required fields must be identified (not just by color)
Error messages must be associated with the relevant field
Tab order must be logical

Pattern 4: Document Metadata

Always set these metadata fields:

// In your PDF generation code (method names are illustrative and vary by library)
$pdf->setTitle('Invoice #INV-2026-0042');
$pdf->setAuthor('Acme Corp');
$pdf->setLanguage('en-US');
$pdf->setDisplayMode('UseOutlines'); // Show bookmarks panel

Testing PDF Accessibility

Automated Tools

PAC (PDF Accessibility Checker) — Free, thorough, checks PDF/UA compliance. Available for Windows, with a web version at pdfua.foundation.
Adobe Acrobat Pro — Built-in accessibility checker under Tools > Accessibility > Full Check. Good for manual review and fixing issues.
axe-core PDF — If you're already using Deque's axe for web accessibility, they have a PDF testing extension.
VeraPDF — Open-source validator that checks conformance to PDF/A and PDF/UA standards.

Manual Testing

Automated tools catch structural issues but miss context:

Screen reader test: Load the PDF in NVDA (Windows) or VoiceOver (macOS) and listen. Does the reading order make sense? Are tables navigable?
Keyboard navigation: Can you tab through all interactive elements? Can you navigate headings with keyboard shortcuts?
Zoom test: Zoom to 200%. Does the content stay readable, and does it reflow sensibly if your viewer offers a reflow mode?
High contrast test: Turn on high contrast mode. Is all content visible?

A Practical Testing Checklist

□ Document title is set in metadata (not just on the first page)
□ Document language is declared
□ All content is tagged (no untagged text)
□ Heading hierarchy is logical (H1 → H2 → H3, no skipping)
□ All images have alt text (or are marked as artifacts)
□ Tables have header cells with scope attributes
□ Reading order matches logical order
□ Links have descriptive text (not "click here")
□ Color contrast meets 4.5:1 ratio
□ Font is embedded (no system font dependencies)
□ Bookmarks are present for documents > 5 pages
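Some checklist items are mechanical enough to script. As one example, the heading-hierarchy rule can be checked with a few lines; this sketch assumes you have already extracted the heading levels in reading order (from your HTML template or from the tag tree), and the function name is hypothetical.

```python
def heading_problems(levels):
    """Report headings that skip a level, given levels in reading order.

    `levels` is a list of ints: 1 for H1, 2 for H2, and so on.
    """
    problems = []
    previous = 0  # nothing seen yet, so the first heading must be H1
    for position, level in enumerate(levels, start=1):
        if level > previous + 1:
            before = f"H{previous}" if previous else "no heading"
            problems.append(f"heading {position}: H{level} follows {before}")
        previous = level
    return problems


print(heading_problems([1, 2, 3, 2, 3]))
print(heading_problems([1, 3]))
```

Moving back up the hierarchy (H3 followed by H2) is allowed; only jumps downward by more than one level are reported.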

Implementation Roadmap

If you're retrofitting accessibility into an existing PDF generation system, prioritize by impact:

Phase 1: Quick Wins (1-2 days)

Set document title, language, and metadata
Add alt text to all images
Ensure fonts are embedded
Check color contrast ratios

Phase 2: Structure (1 week)

Switch to a PDF engine that supports tagging (if yours doesn't)
Ensure HTML templates use semantic elements (headings, lists, tables)
Define logical reading order
Tag all tables with proper headers

Phase 3: Validation (ongoing)

Integrate PAC or VeraPDF into your CI/CD pipeline
Add screen reader testing to your QA process
Monitor for regressions when templates change

Conclusion

Accessible PDFs aren't just about compliance — they're about ensuring everyone can use your documents. The technical requirements are well-defined, the tools are mature, and the effort is reasonable, especially if you start with accessibility in mind rather than retrofitting it later.

The key takeaway: if your HTML is already semantically structured, you're halfway there. Choose a PDF engine that preserves tags, add alt text to images, check your contrast ratios, and test with a screen reader. It's not as hard as it seems.

PDF-API.io generates tagged, accessible PDFs from your templates automatically, with no manual tagging required.