PDF Security: Encryption, Digital Signatures, and Permissions Explained
PDFs are the de facto standard for documents that matter — contracts, invoices, medical records, legal filings, financial statements. Yet most developers treat PDF security as an afterthought, applying a password and calling it done.
The reality is more nuanced. PDF security is a layered system with multiple mechanisms: encryption prevents unauthorized reading, permissions control what authorized users can do, digital signatures prove authenticity and integrity, and redaction permanently removes sensitive information. Getting any of these wrong can have real consequences.
This guide covers each layer in depth, with practical guidance for implementation.
Understanding PDF Security Layers
Think of PDF security as a multi-layered system, each layer serving a different purpose:
┌─────────────────────────────────────────────┐
│ Layer 4: Redaction                          │
│ Permanently remove sensitive content        │
├─────────────────────────────────────────────┤
│ Layer 3: Digital Signatures                 │
│ Prove who created it + hasn't changed       │
├─────────────────────────────────────────────┤
│ Layer 2: Permissions                        │
│ Control what users can do (print, copy)     │
├─────────────────────────────────────────────┤
│ Layer 1: Encryption                         │
│ Prevent unauthorized access entirely        │
└─────────────────────────────────────────────┘
Layer 1: Encryption
PDF encryption prevents the file's contents from being read without the correct key. The PDF specification supports two types of passwords:
User Password (Open Password)
The user password is required to open the document. If you set a user password, anyone without it will see a blank page or a password prompt.
Owner Password (Permissions Password)
The owner password controls permissions — printing, copying, editing. The document can be opened without the owner password, but certain actions are restricted.
Critical misconception: The owner password does NOT prevent the document from being opened. It only restricts certain actions. Many PDF readers don't even enforce these restrictions. If you need to prevent access entirely, use the user password.
Encryption Algorithms
The PDF spec has supported multiple encryption methods over the years:
| Algorithm | Key Length | PDF Version | Security Level |
|---|---|---|---|
| RC4 40-bit | 40-bit | PDF 1.1+ | ❌ Broken — trivially crackable |
| RC4 128-bit | 128-bit | PDF 1.4+ | ⚠️ Weak — avoid for sensitive data |
| AES-128 | 128-bit | PDF 1.6+ | ✅ Adequate for most uses |
| AES-256 | 256-bit | PDF 1.7 Extension Level 3, PDF 2.0 | ✅ Recommended (current standard) |
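Prefer AES-256 wherever your recipients' readers support it. RC4 has known vulnerabilities and should never be used for documents containing sensitive information. Compliance frameworks such as HIPAA, PCI-DSS, and GDPR call for strong encryption rather than naming one mandatory algorithm, and AES-256 is a common way to meet that expectation.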
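When auditing existing files, the algorithm in use can be read from the document's encryption dictionary. The sketch below maps the /V (algorithm version), /R (security handler revision), and crypt filter method (/CFM) values from the PDF specification onto the rows of the table above. The function name and the returned labels are illustrative assumptions, and the sketch assumes a PDF library has already parsed those values out of the file.

```python
from typing import Optional


def classify_pdf_encryption(v: int, r: int, cfm: Optional[str] = None) -> str:
    """Map /Encrypt dictionary values to a coarse algorithm label.

    v   -- the /V entry (algorithm version)
    r   -- the /R entry (standard security handler revision)
    cfm -- the crypt filter method (/CFM); only meaningful when v >= 4
    """
    if v == 1:
        return "RC4 40-bit (broken)"
    if v == 2:
        # /Length gives the real key size, anywhere from 40 to 128 bits.
        return "RC4 up to 128-bit (weak)"
    if v == 4:
        # V4 documents name their cipher in the crypt filter.
        labels = {"V2": "RC4 128-bit (weak)", "AESV2": "AES-128 (adequate)"}
        return labels.get(cfm, "unknown")
    if v == 5 and cfm == "AESV3":
        # R5 is the earlier Adobe extension; R6 is the PDF 2.0 form.
        if r == 6:
            return "AES-256 (recommended)"
        return "AES-256 with older R5 key derivation"
    return "unknown"


print(classify_pdf_encryption(4, 4, "AESV2"))
print(classify_pdf_encryption(5, 6, "AESV3"))
```

A check like this could run in an upload pipeline to reject or re-encrypt files that still use RC4.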
Implementing Encryption
With most PDF libraries, encryption is straightforward:
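// Using TCPDF
$pdf->SetProtection(
    // Permissions to BLOCK: TCPDF takes the actions to deny, not the ones to allow.
    // Printing and copying are left out of the list, so they stay allowed.
    ['modify', 'annot-forms', 'fill-forms', 'extract', 'assemble'],
    'user_password', // User password (required to open)
    'owner_password', // Owner password (required to change permissions)
    3 // Encryption mode: 0 = RC4 40-bit, 1 = RC4 128-bit, 2 = AES-128, 3 = AES-256
);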
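# Using ReportLab
from reportlab.lib.pdfencrypt import StandardEncryption
from reportlab.pdfgen.canvas import Canvas

enc = StandardEncryption(
    userPassword='user_pass',
    ownerPassword='owner_pass',
    canPrint=1,
    canModify=0,
    canCopy=0,
    canAnnotate=0,
    strength=256  # AES-256 (needs a ReportLab version that supports 256-bit keys)
)
canvas = Canvas('secure.pdf', encrypt=enc)
canvas.drawString(72, 720, 'Confidential')  # add content as usual
canvas.save()  # encryption is applied when the file is written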
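// Using pdf-lib (Node.js)
// Note: upstream pdf-lib does not provide encrypt(). The call below follows the
// shape exposed by encryption-enabled forks (for example pdf-lib-plus-encrypt),
// so check the documentation of the package you actually install.
const pdfDoc = await PDFDocument.create();
await pdfDoc.encrypt({
  userPassword: 'user_pass',
  ownerPassword: 'owner_pass',
  permissions: {
    printing: 'lowResolution',
    modifying: false,
    copying: false,
  },
});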
Password Security Best Practices
- Generate passwords programmatically: Don't hardcode passwords. Generate unique passwords per document and store them securely (encrypted in your database, or in a secrets manager).
- Use strong passwords: A PDF with AES-256 encryption but the password "1234" is effectively unprotected.
- Separate user and owner passwords: If you use the same password for both, anyone who can open the document has full control.
- Consider passwordless encryption: For API-generated documents, you might encrypt with a certificate instead of a password, allowing only the intended recipient to decrypt.
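The first three practices can be sketched with Python's standard secrets module. The function names, the 24-character default, and the 16-character minimum are assumptions chosen for illustration; storing the result in a database or secrets manager is left out.

```python
import secrets
import string
from typing import Tuple

# Letters and digits only, so the password survives copy-paste and URL encoding.
ALPHABET = string.ascii_letters + string.digits


def generate_pdf_password(length: int = 24) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    if length < 16:
        raise ValueError("use at least 16 characters for document passwords")
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def generate_password_pair(length: int = 24) -> Tuple[str, str]:
    """Return distinct (user_password, owner_password) values for one document."""
    user_password = generate_pdf_password(length)
    owner_password = generate_pdf_password(length)
    # A collision is astronomically unlikely, but ruling it out is cheap.
    while owner_password == user_password:
        owner_password = generate_pdf_password(length)
    return user_password, owner_password


user_pw, owner_pw = generate_password_pair()
print(len(user_pw), len(owner_pw), user_pw != owner_pw)
```

Calling generate_password_pair() once per document keeps passwords unique, so a leaked password exposes only a single file.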
Layer 2: Permissions
PDF permissions control what actions are allowed after the document is opened. These are enforced by the viewer application (not by the encryption itself), which means they're advisory — a determined attacker can bypass them.
Available permission flags:
| Permission | What it controls |
|---|---|
| Print (low-res) | Allows printing at degraded quality |
| Print (high-res) | Allows printing at full quality |
| Modify content | Allows editing the document's content |
| Copy/extract content | Allows selecting, copying, or extracting text and graphics |
| Add/modify annotations | Allows adding comments, highlights |
| Fill forms | Allows filling in form fields |
| Extract for accessibility | Allows extraction of text and graphics by assistive technology such as screen readers |
| Assemble | Allows inserting, rotating, or deleting pages |
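Inside the file, the flags in the table above are stored as individual bits of a single signed 32-bit integer, the /P entry of the encryption dictionary. The sketch below encodes and decodes that value using the bit positions from the PDF specification. The class and function names are illustrative assumptions, not part of any library.

```python
from enum import IntFlag


class Permission(IntFlag):
    PRINT = 1 << 2                   # bit 3
    MODIFY = 1 << 3                  # bit 4
    COPY = 1 << 4                    # bit 5
    ANNOTATE = 1 << 5                # bit 6
    FILL_FORMS = 1 << 8              # bit 9
    EXTRACT_ACCESSIBILITY = 1 << 9   # bit 10
    ASSEMBLE = 1 << 10               # bit 11
    PRINT_HIGH_RES = 1 << 11         # bit 12


# Reserved bits 7-8 and 13-32 are set to 1; bits 1-2 stay 0.
RESERVED_BITS = 0xFFFFF0C0
ALL_PERMISSION_BITS = 0xF3C


def encode_permissions(allowed: Permission) -> int:
    """Return the signed 32-bit /P value for a set of allowed actions."""
    unsigned = RESERVED_BITS | int(allowed)
    return unsigned - (1 << 32)  # reinterpret as two's complement


def decode_permissions(p: int) -> Permission:
    """Return the allowed actions encoded in a /P value."""
    return Permission(p & ALL_PERMISSION_BITS)


print(encode_permissions(Permission.PRINT | Permission.PRINT_HIGH_RES))  # -1852
print(encode_permissions(Permission(0)))                                 # -3904
```

Because the restrictions are plain bits that a viewer is asked to honor, changing them is a matter of rewriting one integer, which is why permissions should be treated as advisory.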
When Permissions Make Sense
Permissions are useful for:
- Preventing casual copying: A user won't be able to select and copy text in their PDF reader
- Discouraging editing: Prevents casual modifications in Acrobat
- Controlled printing: You might want to allow viewing but prevent printing of a confidential draft
They're NOT useful for:
- Preventing determined attackers: Tools exist to remove permissions from any PDF
- DRM: PDF permissions are not a DRM system
- Legal protection: Permissions don't constitute a legal access control
Layer 3: Digital Signatures
Digital signatures are the most powerful security feature in PDFs. They provide three guarantees:
- Authentication: Proves who signed the document
- Integrity: Proves the document hasn't been modified since signing
- Non-repudiation: The signer can't deny having signed it
How PDF Signatures Work
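A PDF digital signature uses public-key cryptography, with the signer's identity vouched for by a certificate issued through a public key infrastructure (PKI):
- The signer has a private key (secret) and a public key (shared via certificate)
- The PDF's bytes, excluding the placeholder reserved for the signature itself, are hashed
- The hash is signed with the signer's private key (for RSA this is often described as encrypting the hash), creating the signature
- The signature and the signer's certificate are embedded in the PDF
- A reader verifies the signature by using the public key to recover the signed hash and comparing it to a freshly computed hash of the document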
Signing:
PDF content → SHA-256 hash → Encrypt with private key → Embed signature
Verification:
Signature → Decrypt with public key → Compare hash to current PDF hash
Match? ✅ Document is authentic and unmodified
Mismatch? ❌ Document has been tampered with
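The flow above can be illustrated with a toy hash-then-sign example that needs only the standard library. Everything here is a simplification made for illustration: the RSA key is built from two small, publicly known primes, there is no padding scheme, and the whole byte string is signed rather than a byte range. Real PDF signatures use full-size keys, padded signatures, and a CMS container, so use a signing library for actual documents.

```python
import hashlib

# Toy RSA key built from two Mersenne primes. Far too small for real use.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """SHA-256 of the data, reduced so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Signing: transform the hash with the private exponent."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Verification: recover the hash with the public exponent and compare."""
    return pow(signature, E, N) == digest_as_int(data)


pdf_bytes = b"%PDF-1.7 ... invoice total: 100.00"
signature = sign(pdf_bytes)
print(verify(pdf_bytes, signature))                            # True
print(verify(pdf_bytes.replace(b"100", b"900"), signature))    # False
```

Changing any byte of the document changes its hash, so the value recovered from the signature no longer matches and verification fails.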
Certificate Types
Not all certificates are equal:
| Type | Trust Level | Use Case | Cost |
|---|---|---|---|
| Self-signed | ⚠️ Low | Internal documents, testing | Free |
| Organization-validated (OV) | ✅ Medium | Business documents | $100-500/year |
| Extended validation (EV) | ✅✅ High | Legal, financial, government | $200-1000/year |
| Qualified (eIDAS/EU) | ✅✅✅ Highest | EU legal equivalence to handwritten | $300-2000/year |
For self-signed certificates, the reader will show a warning that the signer's identity can't be verified. For certificates that chain to a Certificate Authority in the reader's trust list (for example, the Adobe Approved Trust List in Acrobat), the reader reports the signature as valid, typically with a green checkmark.
Implementing Digital Signatures
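// Using TCPDF with a PKCS#12 certificate
// setSignature() expects PEM data, so unpack the .pfx file first.
$pkcs12 = file_get_contents('/path/to/certificate.pfx');
if (!openssl_pkcs12_read($pkcs12, $certs, 'certificate_password')) {
    throw new RuntimeException('Could not read the PKCS#12 file');
}
$pdf->setSignature(
    $certs['cert'], // Signing certificate (PEM)
    $certs['pkey'], // Private key (PEM)
    '',             // Private key password (the key was decrypted above)
    '',             // Extra certificates file
    3,              // Changes permitted after signing: 1 = none, 2 = forms and signing, 3 = also annotations
    [
        'Name' => 'John Smith',
        'Location' => 'New York',
        'Reason' => 'Invoice approval',
        'ContactInfo' => 'john@example.com',
    ]
);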
# Using pyHanko (Python), a dedicated PDF signing library
from pyhanko.sign import signers
from pyhanko.pdf_utils.incremental_writer import IncrementalPdfFileWriter

signer = signers.SimpleSigner.load_pkcs12(
    pfx_file='certificate.pfx',
    passphrase=b'password'
)
with open('document.pdf', 'rb') as f:
    writer = IncrementalPdfFileWriter(f)
    # sign_pdf returns an in-memory buffer holding the signed document
    signed = signers.sign_pdf(
        writer,
        signers.PdfSignatureMetadata(
            field_name='Signature1',
            reason='Contract approval',
            location='Berlin'
        ),
        signer=signer
    )
# Write the signed copy to a new file; the original is left untouched
with open('document-signed.pdf', 'wb') as out:
    out.write(signed.getbuffer())
Long-Term Validation (LTV)
A signed PDF is only as trustworthy as the certificate chain at the time of verification. But certificates expire, and CRLs (Certificate Revocation Lists) may become unavailable years later.
Long-Term Validation embeds the information needed to verify the signature long after the signing certificate has expired or its issuer's revocation services have gone offline:
- The full certificate chain
- OCSP responses (Online Certificate Status Protocol)
- CRL data
- Timestamps from a trusted Time Stamping Authority (TSA)
LTV is essential for any document that needs to be verifiable years from now — contracts, legal agreements, compliance records.
Layer 4: Redaction
Redaction is the permanent, irreversible removal of content from a PDF. This is critical for:
- Removing personally identifiable information (PII) before sharing documents
- Sanitizing classified information for public release
- GDPR right-to-erasure compliance
The Danger of Fake Redaction
The most common redaction mistake is drawing a black rectangle over sensitive text. This does NOT remove the text — it just hides it visually. The underlying text can be extracted by:
- Selecting and copying from the PDF
- Using pdftotext or similar extraction tools
- Opening the PDF's raw content stream
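A short sketch makes the problem concrete. The content stream below is hand-written for illustration (the coordinates, font name, and SSN are made up): it draws a line of text and then paints a black rectangle over it. The extraction helper is deliberately naive and only handles simple literal strings followed by the Tj operator, which is enough to show that the covered text is still present.

```python
import re
from typing import List

# Hypothetical page content: show some text, then paint a black box on top.
content_stream = (
    b"BT /F1 12 Tf 100 700 Td (SSN: 123-45-6789) Tj ET\n"  # the text is drawn
    b"0 0 0 rg 95 695 160 16 re f\n"                        # black rectangle over it
)


def extract_literal_strings(stream: bytes) -> List[str]:
    """Return the string operand of every Tj (show text) operator."""
    pattern = re.compile(rb"\(([^)]*)\)\s*Tj")
    return [m.group(1).decode("latin-1") for m in pattern.finditer(stream)]


print(extract_literal_strings(content_stream))  # ['SSN: 123-45-6789']
```

The rectangle changes what is painted on screen, not what is stored in the file, so any tool that reads the stream sees the original text.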
This is not a theoretical risk. High-profile redaction failures include:
- A US Department of Justice filing where redacted text was extractable by copy-paste
- A Facebook legal document where "redacted" contractor names were easily revealed
- Multiple court filings where classified information was hidden behind black boxes but fully extractable
Proper Redaction
Proper redaction:
- Identifies regions to redact
- Removes the underlying content from the PDF's content stream
- Replaces the area with a visual mark (usually a black rectangle)
- Removes the content from any embedded search index
- Removes the content from the document's metadata and XMP data
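# Using pymupdf for proper redaction
import fitz  # pymupdf

doc = fitz.open("document.pdf")
page = doc[0]
# Mark areas for redaction
page.add_redact_annot(
    fitz.Rect(100, 200, 400, 220),  # Area containing SSN
    text="[REDACTED]",              # Replacement text
    fill=(0, 0, 0),                 # Black fill
    text_color=(1, 1, 1)            # White text, so the label is visible on the black fill
)
# Apply redaction: this permanently removes the content under the marked area
page.apply_redactions()
# garbage=4 drops unreferenced objects so removed content does not linger in the file
doc.save("redacted.pdf", garbage=4, deflate=True)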
After redaction, verify that the content is truly gone by searching the file with a hex editor or text extraction tool.
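A minimal sketch of such a check, using only the standard library, is shown below. It searches the raw bytes and every Flate-compressed stream for the strings that should be gone. The function name is an assumption, and the scan is deliberately simple: it does not decrypt encrypted files, and it will miss text stored as hex strings or in custom font encodings, so treat a clean result as a first pass rather than proof.

```python
import re
import zlib
from typing import Iterable, Set

# Capture everything between the "stream" and "endstream" keywords.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def find_leaks(pdf_bytes: bytes, needles: Iterable[bytes]) -> Set[bytes]:
    """Return the needles still present in the raw file or its inflated streams."""
    haystacks = [pdf_bytes]
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the line break before "endstream".
            haystacks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # not Flate data, or a filter this sketch does not handle
    return {n for n in needles for h in haystacks if n in h}


# Synthetic example: a compressed stream that still contains the secret.
secret = b"123-45-6789"
leaky = (
    b"1 0 obj\n<< /Filter /FlateDecode >>\nstream\n"
    + zlib.compress(b"BT (SSN: " + secret + b") Tj ET")
    + b"\nendstream\nendobj\n"
)
print(find_leaks(leaky, [secret]))
```

In practice you would read the redacted file with open(path, "rb").read() and pass in every value that was supposed to be removed.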
Security Architecture for Document Systems
When building a system that generates sensitive documents, consider the full security lifecycle:
At Generation
- Encrypt PDFs with AES-256 if they contain sensitive data
- Apply digital signatures to prove authenticity
- Set appropriate permissions (but don't rely on them for security)
- Never log or cache sensitive document content in plaintext
In Transit
- Always serve PDFs over HTTPS
- Use signed URLs with short expiration for downloads
- Consider streaming PDFs instead of storing them temporarily
- Set Cache-Control headers to prevent browser caching, and use Content-Disposition: attachment so the PDF is downloaded rather than rendered inline
return response($pdf)
->header('Content-Type', 'application/pdf')
->header('Content-Disposition', 'attachment; filename="invoice.pdf"')
->header('Cache-Control', 'no-store, no-cache, must-revalidate')
->header('X-Content-Type-Options', 'nosniff');
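Signed URLs with short expiration, mentioned in the list above, can be sketched with an HMAC over the path and expiry time. The parameter names (expires, signature), the five-minute default, and the secret key placeholder are assumptions for illustration; managed storage services provide their own signed URL mechanisms, which are usually the better choice.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder: load the real key from your secrets manager.
SECRET_KEY = b"replace-with-a-key-from-your-secrets-manager"


def _signature(path: str, expires: int) -> str:
    message = f"{path}?expires={expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_url(path: str, ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Return the path with an expiry timestamp and an HMAC signature appended."""
    issued_at = now if now is not None else time.time()
    expires = int(issued_at) + ttl_seconds
    query = urlencode({"expires": expires, "signature": _signature(path, expires)})
    return f"{path}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it has not expired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False
    current = now if now is not None else time.time()
    if current > expires:
        return False
    expected = _signature(parsed.path, expires)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(signature.encode(), expected.encode())


url = sign_url("/documents/42/invoice.pdf", ttl_seconds=60)
print(verify_url(url))  # True
```

Because the expiry time is part of the signed message, a client cannot extend a link's lifetime by editing the expires parameter.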
At Rest
- Encrypt PDFs at rest (S3 server-side encryption, database encryption)
- Implement access control — not everyone should be able to download every document
- Retain audit logs of who accessed which document and when
- Implement retention policies — delete documents when they're no longer needed
On Deletion
- Ensure deleted PDFs are actually removed from storage (not just soft-deleted)
- Remove from CDN caches
- Invalidate any outstanding signed URLs
- Update audit logs
Conclusion
PDF security is a spectrum, not a binary. Most documents don't need every layer — a public marketing brochure doesn't need AES-256 encryption. But financial documents, medical records, legal contracts, and any document containing PII absolutely do.
The minimum for sensitive documents:
- AES-256 encryption with strong, unique passwords
- Digital signatures from a trusted certificate authority
- HTTPS delivery with signed URLs
- Proper redaction (never just black rectangles over text)
- Encryption at rest in your storage system
Build security into your PDF pipeline from the start. Retrofitting it later is always harder and more error-prone.