How to Redact a PDF Without Adobe

Adobe Acrobat Pro costs $19.99/month and requires a desktop app. For developers who need to redact PDFs programmatically -- in a serverless function, an automated pipeline, or a backend API -- that's not a viable solution.

This guide covers how to redact PDFs programmatically without Adobe, including true content removal rather than visual overlays.

The Problem with Most PDF Redaction Tools

Most tools that claim to redact PDFs don't actually remove the text. They draw a black rectangle over it. The text is still in the file -- selectable, copyable, and extractable by anyone with a PDF reader or a script.

This includes many online "free PDF redaction" tools, Microsoft Word's PDF export, and basic PDF editors.

To verify: open a "redacted" PDF in Chrome, drag to select across the blacked-out area, and paste the selection into a text editor. If the original text appears, the redaction failed.

True redaction removes the text from the PDF content stream entirely. No selection, no extraction, nothing there.
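
The difference between an overlay and true removal can be sketched with a toy content stream. This is a simplified illustration, not how any particular tool parses PDFs: the two streams and the `visible_to_extractors` helper are made up for the example, and real content streams are usually compressed and use more operators than `Tj`.

```python
import re

# A simplified, uncompressed page content stream: text is drawn first,
# then a filled black rectangle is painted over the same area.
OVERLAY_STREAM = (
    "BT /F1 12 Tf 100 700 Td (SSN: 123-45-6789) Tj ET\n"
    "0 0 0 rg 95 695 160 16 re f\n"
)

# The same page after true redaction: the text-showing operator is gone,
# only the black rectangle remains.
REDACTED_STREAM = "0 0 0 rg 95 695 160 16 re f\n"

# Matches literal strings shown with the Tj operator, e.g. "(hello) Tj".
# Real streams also use TJ arrays, hex strings, and escaped parentheses.
TJ_LITERAL = re.compile(r"\(([^()]*)\)\s*Tj")


def visible_to_extractors(content_stream: str) -> list[str]:
    """Return every literal string a text extractor could still read."""
    return TJ_LITERAL.findall(content_stream)


print(visible_to_extractors(OVERLAY_STREAM))   # text survives the overlay
print(visible_to_extractors(REDACTED_STREAM))  # nothing left to extract
```

The black rectangle changes what is painted on screen, but only deleting the text operator changes what a script can read.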

What Adobe Acrobat's Redaction Actually Does

Adobe Acrobat Pro does support true redaction -- it's under Tools → Redact. It removes text from the content stream and scrubs metadata.

But it has significant limitations for developers:

Requires a paid Acrobat Pro subscription ($19.99/month)
Desktop-only -- no API, no automation
Manual workflow -- you mark regions by hand
No programmatic access to redaction logic
Can't be called from Node.js, Python, or any backend

If you need to redact one PDF occasionally, Acrobat works. If you need to redact hundreds of PDFs automatically, you need an API.

Option 1: Forme API (Recommended)

Forme's redact endpoint performs true content stream removal with a single HTTP call. It works from any language or environment that can send an HTTP request.

Basic redaction by coordinates

curl -X POST https://api.formepdf.com/v1/redact \
  -H "Authorization: Bearer $FORME_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pdf": "<base64-encoded-pdf>",
    "redactions": [
      { "page": 0, "x": 100, "y": 150, "width": 200, "height": 20 }
    ]
  }' \
  --output redacted.pdf
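
The `<base64-encoded-pdf>` placeholder has to be filled in from the file's bytes. Below is a minimal sketch of the same request using only the Python standard library. The endpoint, headers, and field names are taken from the curl example above. `build_redact_body` and `redact` are hypothetical helper names, not part of any Forme SDK, and the code assumes the response body is the redacted PDF itself, as `--output redacted.pdf` suggests.

```python
import base64
import json
import urllib.request

API_URL = "https://api.formepdf.com/v1/redact"  # endpoint from the curl example


def build_redact_body(pdf_bytes: bytes, redactions: list[dict]) -> bytes:
    """Encode the PDF as base64 and wrap it in the JSON body shown above."""
    payload = {
        "pdf": base64.b64encode(pdf_bytes).decode("ascii"),
        "redactions": redactions,
    }
    return json.dumps(payload).encode("utf-8")


def redact(pdf_bytes: bytes, redactions: list[dict], api_key: str) -> bytes:
    """POST the body and return the raw response bytes.

    Assumption: the response body is the redacted PDF.
    """
    request = urllib.request.Request(
        API_URL,
        data=build_redact_body(pdf_bytes, redactions),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read()
```

To use it, read `document.pdf` in binary mode, call `redact(pdf_bytes, [{"page": 0, "x": 100, "y": 150, "width": 200, "height": 20}], os.environ["FORME_API_KEY"])`, and write the returned bytes to `redacted.pdf`.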

Text-search redaction

Find and redact text by string or regex pattern -- no need to know coordinates:

curl -X POST https://api.formepdf.com/v1/redact \
  -H "Authorization: Bearer $FORME_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pdf": "<base64-encoded-pdf>",
    "patterns": [
      { "pattern": "John Smith", "pattern_type": "Literal" },
      { "pattern": "\\d{3}-\\d{2}-\\d{4}", "pattern_type": "Regex" }
    ]
  }' \
  --output redacted.pdf
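
The doubled backslashes in the curl body are JSON escaping, not part of the regex. The sketch below shows that round trip and dry-runs the patterns against sample text before any document is sent. It uses Python's own `re` engine, and Forme's regex dialect may differ in edge cases, so treat a local match as a sanity check rather than a guarantee. The sample sentence is invented for the example.

```python
import json
import re

SSN_PATTERN = r"\d{3}-\d{2}-\d{4}"  # one backslash per escape in a raw string

patterns = [
    {"pattern": "John Smith", "pattern_type": "Literal"},
    {"pattern": SSN_PATTERN, "pattern_type": "Regex"},
]

# json.dumps doubles each backslash, producing the \\d form used in the curl body.
body_fragment = json.dumps({"patterns": patterns})
print(body_fragment)

# Dry run against sample text using Python's regex engine.
sample = "Applicant John Smith, SSN 123-45-6789, phone 555-0100."
for entry in patterns:
    is_regex = entry["pattern_type"] == "Regex"
    # Literal patterns are escaped so they match character for character.
    regex = entry["pattern"] if is_regex else re.escape(entry["pattern"])
    print(entry["pattern"], "->", re.findall(regex, sample))
```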

Built-in presets for common PII

curl -X POST https://api.formepdf.com/v1/redact \
  -H "Authorization: Bearer $FORME_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pdf": "<base64-encoded-pdf>",
    "presets": ["ssn", "email", "phone", "date-of-birth"]
  }' \
  --output redacted.pdf

Available presets:

ssn -- Social Security Numbers
email -- Email addresses
phone -- US phone numbers
date-of-birth -- Common date formats
credit-card -- 16-digit card numbers

In Node.js

import { FormeClient } from '@formepdf/sdk';
import { readFileSync, writeFileSync } from 'fs';

const client = new FormeClient({
  apiKey: process.env.FORME_API_KEY
});

const pdf = readFileSync('document.pdf');

const redacted = await client.redact(pdf, {
  presets: ['ssn', 'email'],
  patterns: [
    { pattern: 'John Smith', pattern_type: 'Literal' }
  ],
});

writeFileSync('redacted.pdf', redacted);

In Python

import os

import formepdf

# Read the key from the environment, as in the curl and Node.js examples
client = formepdf.FormeClient(api_key=os.environ["FORME_API_KEY"])

with open("document.pdf", "rb") as f:
    pdf_bytes = f.read()

redacted = client.redact(
    pdf=pdf_bytes,
    presets=["ssn", "email"],
    patterns=[
        {"pattern": "John Smith", "pattern_type": "Literal"}
    ]
)

with open("redacted.pdf", "wb") as f:
    f.write(redacted)

Option 2: Self-Hosted

If your documents can't leave your infrastructure, Forme ships as a Docker image with the same redaction API running on your own servers:

docker pull formepdf/forme:0.9.0
docker run -p 4000:4000 formepdf/forme:0.9.0

Then call it the same way:

curl -X POST http://localhost:4000/v1/redact \
  -H "Content-Type: application/json" \
  -d '{
    "pdf": "<base64>",
    "presets": ["ssn", "email"]
  }' \
  --output redacted.pdf

Self-hosted deployments don't require an API key, and no data leaves your infrastructure. That makes this option a good fit for healthcare, legal, and government use cases.

Metadata Scrubbing

Forme also does something many tools miss: it scrubs metadata on every redaction.

PDFs store metadata in the document information dictionary and in XMP metadata streams, not in the file header: author name, creation date, edit history, software used. Files saved with incremental updates can also retain content from earlier versions. None of this is touched by a visual overlay redaction.

Forme strips the following automatically:

Author and creator fields cleared
Edit history removed
Producer replaced with "Forme"
Modification date updated

No opt-in required.
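
A rough way to see what metadata a file still carries is to scan its raw bytes. The sketch below is a heuristic, not a PDF parser. `scan_metadata` is a hypothetical helper, the sample bytes are synthetic, and the scan only sees literal-string values in uncompressed, unencrypted objects.

```python
import re

# Keys commonly found in the document information dictionary.
INFO_KEYS = (b"Author", b"Creator", b"Producer", b"Title")


def scan_metadata(pdf_bytes: bytes) -> dict:
    """Heuristic scan of raw PDF bytes for leftover metadata.

    Metadata inside object streams or encrypted files will not show up here.
    """
    found = {}
    for key in INFO_KEYS:
        match = re.search(rb"/" + key + rb"\s*\(([^)]*)\)", pdf_bytes)
        if match:
            found[key.decode()] = match.group(1).decode("latin-1")
    return {
        "info": found,
        # An XMP packet is a second copy of the metadata, stored as XML.
        "has_xmp": b"<x:xmpmeta" in pdf_bytes,
        # More than one %%EOF usually means incremental saves (or linearization),
        # so earlier revisions may still be present in the file.
        "eof_markers": pdf_bytes.count(b"%%EOF"),
    }


# Synthetic example: one info dictionary and two saved revisions.
sample = (
    b"%PDF-1.7\n1 0 obj\n<< /Author (Jane Doe) /Producer (Word) >>\nendobj\n"
    b"%%EOF\n2 0 obj\n<< >>\nendobj\n%%EOF\n"
)
print(scan_metadata(sample))
```

Running the same scan on a file before and after redaction gives a quick, if incomplete, picture of what was stripped.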

Verifying Your Redaction

After redacting:

Open the output PDF in Chrome
Try to select text in the redacted area
Text should not be selectable

You can also verify programmatically using Forme's text search:

import { findTextRegions } from '@formepdf/core';

// redactedPdf holds the redacted output, e.g. the value returned by client.redact()
const regions = await findTextRegions(redactedPdf, [
  { pattern: 'John Smith', pattern_type: 'Literal' }
]);

// Should be 0 if redaction worked
console.log(regions.length);
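
An independent check with a different extractor is also worthwhile, since it doesn't rely on the same code that performed the redaction. The sketch below assumes Poppler's `pdftotext` command is installed and on the PATH. `extract_text`, `find_leaks`, and `CHECKS` are hypothetical names chosen for this example.

```python
import re
import subprocess


def extract_text(pdf_path: str) -> str:
    """Extract text with Poppler's pdftotext, writing to stdout ("-")."""
    result = subprocess.run(
        ["pdftotext", "-layout", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def find_leaks(text: str, patterns: dict[str, str]) -> dict[str, list[str]]:
    """Return every pattern that still matches the extracted text."""
    leaks = {}
    for name, regex in patterns.items():
        hits = re.findall(regex, text)
        if hits:
            leaks[name] = hits
    return leaks


# The same targets used in the redaction request.
CHECKS = {
    "name": re.escape("John Smith"),
    "ssn": r"\d{3}-\d{2}-\d{4}",
}
```

Calling `find_leaks(extract_text("redacted.pdf"), CHECKS)` should return an empty dict when none of the targets survive. A non-empty result lists what is still extractable.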

Limitations

Scanned documents -- If a PDF is a scanned image, there are no text operators to remove. You need OCR first to make the text machine-readable before redacting it.

CJK text -- Chinese, Japanese, and Korean text is typically set in composite (CIDFont-based) fonts with multi-byte encodings, which require additional glyph mapping. Forme currently redacts only WinAnsi (Latin) encoded text.

Encrypted PDFs -- Must be decrypted before redaction.
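
Two of these limitations can be roughly detected before a file is submitted. The preflight sketch below uses only the standard library and is a heuristic: `is_encrypted` and `has_text_operators` are hypothetical helpers, the stream handling only covers uncompressed or Flate-compressed streams, and binary data can occasionally produce false positives.

```python
import re
import zlib

# Captures the bytes between the "stream" and "endstream" keywords.
STREAM = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)

# A string or array operand followed by a text-showing operator (Tj or TJ).
TEXT_OP = re.compile(rb"[)>\]]\s*T[jJ]\b")


def is_encrypted(pdf_bytes: bytes) -> bool:
    """Encrypted PDFs reference an /Encrypt dictionary from the trailer."""
    return b"/Encrypt" in pdf_bytes


def has_text_operators(pdf_bytes: bytes) -> bool:
    """True if any stream appears to contain a text-showing operator."""
    for match in STREAM.finditer(pdf_bytes):
        data = match.group(1)
        try:
            data = zlib.decompress(data)  # most content streams use FlateDecode
        except zlib.error:
            pass  # not Flate-compressed; inspect the raw bytes instead
        if TEXT_OP.search(data):
            return True
    return False
```

A file where `has_text_operators` returns `False` is likely a scan that needs OCR first, and one where `is_encrypted` returns `True` needs decrypting before redaction.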

Getting Started

npm install @formepdf/sdk

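import { FormeClient } from '@formepdf/sdk';
import { readFileSync, writeFileSync } from 'fs';

const client = new FormeClient({
  apiKey: process.env.FORME_API_KEY
});

// Load the source PDF, redact common PII, and save the result
const pdfBytes = readFileSync('document.pdf');

const redacted = await client.redact(pdfBytes, {
  presets: ['ssn', 'email', 'phone'],
});

writeFileSync('redacted.pdf', redacted);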

Self-hosting is available via the formepdf/forme Docker image for teams that need documents to stay on their own infrastructure.