How to Redact a PDF Without Adobe

Adobe Acrobat Pro costs $19.99/month and requires a desktop app. Here's how to redact PDFs programmatically with true content stream removal -- no Adobe required.

For developers who need to redact PDFs programmatically -- in a serverless function, an automated pipeline, or a backend API -- a paid desktop app isn't a viable solution.

This guide covers how to redact PDFs programmatically without Adobe, including true content removal rather than visual overlays.


The Problem with Most PDF Redaction Tools

Most tools that claim to redact PDFs don't actually remove the text. They draw a black rectangle over it. The text is still in the file -- selectable, copyable, and extractable by anyone with a PDF reader or a script.

This includes many online "free PDF redaction" tools, Microsoft Word's PDF export, and basic PDF editors.

To verify: open a "redacted" PDF in Chrome and try to select text in the blacked-out area. If you can select it, the redaction failed.

True redaction removes the text from the PDF content stream entirely. No selection, no extraction, nothing there.
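
To make the difference concrete, here is a minimal sketch built on a hand-written, uncompressed page content stream. The stream contents and the helper names (`overlay_redact`, `true_redact`, `visible_strings`) are illustrative assumptions, not any tool's actual implementation. Real PDFs usually compress their streams and encode text through fonts, so treat this as a model of the idea only.

```python
import re

# A toy, uncompressed page content stream: one line of text drawn at (100, 700).
ORIGINAL = b"BT /F1 12 Tf 100 700 Td (SSN: 123-45-6789) Tj ET\n"

# Filled black rectangle positioned over the text line.
BLACK_BOX = b"0 0 0 rg 95 695 150 16 re f\n"


def overlay_redact(stream: bytes) -> bytes:
    """Fake redaction: paint a black box on top and leave the text operator alone."""
    return stream + BLACK_BOX


def true_redact(stream: bytes) -> bytes:
    """Simplified real redaction: delete every literal-string Tj operator, then paint the box.

    A real tool would only remove text that falls inside the redaction region.
    """
    without_text = re.sub(rb"\((?:[^()\\]|\\.)*\)\s*Tj", b"", stream)
    return without_text + BLACK_BOX


def visible_strings(stream: bytes) -> list[bytes]:
    """Return every literal string still present in the stream."""
    return re.findall(rb"\(((?:[^()\\]|\\.)*)\)", stream)


print(visible_strings(overlay_redact(ORIGINAL)))  # [b'SSN: 123-45-6789']
print(visible_strings(true_redact(ORIGINAL)))     # []
```

Both outputs render as a black box on the page. Only the second one no longer carries the string that a script could extract.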


What Adobe Acrobat's Redaction Actually Does

Adobe Acrobat Pro does support true redaction -- it's under Tools → Redact. It removes text from the content stream and scrubs metadata.

But it has significant limitations for developers:

  • Requires a paid Acrobat Pro subscription ($19.99/month)
  • Desktop-only -- no API, no automation
  • Manual workflow -- you mark regions by hand
  • No programmatic access to redaction logic
  • Can't be called from Node.js, Python, or any backend

If you need to redact one PDF occasionally, Acrobat works. If you need to redact hundreds of PDFs automatically, you need an API.


Option 1: Forme API (Recommended)

Forme's redact endpoint performs true content stream removal with a single HTTP call. Works from any language, any environment.

Basic redaction by coordinates

curl -X POST https://api.formepdf.com/v1/redact \
  -H "Authorization: Bearer $FORME_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pdf": "<base64-encoded-pdf>",
    "redactions": [
      { "page": 0, "x": 100, "y": 150, "width": 200, "height": 20 }
    ]
  }' \
  --output redacted.pdf
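
The same request can be sketched with nothing but the Python standard library. The endpoint, headers, and JSON fields are taken from the curl call above. The helper names (`build_redact_request`, `redact_file`) are hypothetical, and the sketch assumes the response body is the raw redacted PDF, as the `--output` flag suggests.

```python
import base64
import json
import os
import urllib.request

API_URL = "https://api.formepdf.com/v1/redact"  # endpoint from the curl example


def build_redact_request(pdf_bytes: bytes, redactions: list[dict],
                         api_key: str) -> urllib.request.Request:
    """Build the POST request: base64-encode the PDF and attach the regions."""
    body = {
        "pdf": base64.b64encode(pdf_bytes).decode("ascii"),
        "redactions": redactions,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def redact_file(src: str, dst: str, redactions: list[dict]) -> None:
    """Send src to the API and write the response bytes to dst."""
    with open(src, "rb") as f:
        request = build_redact_request(f.read(), redactions,
                                       os.environ["FORME_API_KEY"])
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses
    with urllib.request.urlopen(request) as response:
        with open(dst, "wb") as out:
            out.write(response.read())


# Example call (needs a real key and a local document.pdf):
# redact_file("document.pdf", "redacted.pdf",
#             [{"page": 0, "x": 100, "y": 150, "width": 200, "height": 20}])
```

Keeping request construction separate from the network call makes the payload easy to inspect or unit-test without spending API operations.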

Text-search redaction

Find and redact text by string or regex pattern -- no need to know coordinates:

curl -X POST https://api.formepdf.com/v1/redact \
  -H "Authorization: Bearer $FORME_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pdf": "<base64-encoded-pdf>",
    "patterns": [
      { "pattern": "John Smith", "pattern_type": "Literal" },
      { "pattern": "\\d{3}-\\d{2}-\\d{4}", "pattern_type": "Regex" }
    ]
  }' \
  --output redacted.pdf
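
Regex patterns are easy to get wrong, and the doubled backslashes in the JSON body are a common source of confusion. The sketch below dry-runs the SSN pattern against made-up sample text and shows how the escaping comes out when the request body is serialized. It uses Python's `re` module, which is an assumption: the API's regex engine may differ on less common syntax.

```python
import json
import re

SSN_PATTERN = r"\d{3}-\d{2}-\d{4}"  # same regex as in the request above

# Dry-run against sample text (made-up values) to see what would match.
sample = "Name: John Smith, SSN: 123-45-6789, phone: 555-867-5309"
matches = re.findall(SSN_PATTERN, sample)
print(matches)  # ['123-45-6789']

# json.dumps doubles each backslash, producing the \\d form seen in the curl body.
patterns = [
    {"pattern": "John Smith", "pattern_type": "Literal"},
    {"pattern": SSN_PATTERN, "pattern_type": "Regex"},
]
encoded = json.dumps({"patterns": patterns})
print(encoded)
```

The phone number is left alone because its 3-3-4 digit grouping does not fit the 3-2-4 shape of the pattern.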

Built-in presets for common PII

curl -X POST https://api.formepdf.com/v1/redact \
  -H "Authorization: Bearer $FORME_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "pdf": "<base64-encoded-pdf>",
    "presets": ["ssn", "email", "phone", "date-of-birth"]
  }' \
  --output redacted.pdf

Available presets:

  • ssn -- Social Security Numbers
  • email -- Email addresses
  • phone -- US phone numbers
  • date-of-birth -- Common date formats
  • credit-card -- 16-digit card numbers

In Node.js

import { FormeClient } from '@formepdf/sdk';
import { readFileSync, writeFileSync } from 'fs';

const client = new FormeClient({
  apiKey: process.env.FORME_API_KEY
});

const pdf = readFileSync('document.pdf');

const redacted = await client.redact(pdf, {
  presets: ['ssn', 'email'],
  patterns: [
    { pattern: 'John Smith', pattern_type: 'Literal' }
  ],
});

writeFileSync('redacted.pdf', redacted);

In Python

import os

import formepdf

# Read the key from the environment instead of hard-coding it
client = formepdf.FormeClient(api_key=os.environ["FORME_API_KEY"])

with open("document.pdf", "rb") as f:
    pdf_bytes = f.read()

redacted = client.redact(
    pdf=pdf_bytes,
    presets=["ssn", "email"],
    patterns=[
        {"pattern": "John Smith", "pattern_type": "Literal"}
    ]
)

with open("redacted.pdf", "wb") as f:
    f.write(redacted)
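
For the "hundreds of PDFs" case, a small batch wrapper is usually all that is needed. This is a sketch, not part of the Forme SDK: `redact_directory` is a hypothetical helper that takes any function mapping PDF bytes to redacted bytes, so it works with the SDK client, a raw HTTP call, or a test stub.

```python
from pathlib import Path
from typing import Callable


def redact_directory(src_dir: Path, dst_dir: Path,
                     redact: Callable[[bytes], bytes]) -> dict[str, str]:
    """Redact every *.pdf in src_dir into dst_dir and return a per-file status."""
    dst_dir.mkdir(parents=True, exist_ok=True)
    results: dict[str, str] = {}
    for pdf_path in sorted(src_dir.glob("*.pdf")):
        try:
            redacted = redact(pdf_path.read_bytes())
        except Exception as exc:
            # Record the failure and keep going; one bad file should not stop the batch.
            results[pdf_path.name] = f"failed: {exc}"
            continue
        (dst_dir / pdf_path.name).write_bytes(redacted)
        results[pdf_path.name] = "ok"
    return results
```

With the client from the example above, a call might look like `redact_directory(Path("inbox"), Path("outbox"), lambda pdf: client.redact(pdf=pdf, presets=["ssn", "email"]))`. Failed files are never written to the output directory, so an unredacted original cannot slip through as if it were processed.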

Option 2: Self-Hosted

If your documents can't leave your infrastructure, Forme ships as a Docker image with the same redaction API running on your own servers:

docker pull formepdf/forme:0.9.0
docker run -p 4000:4000 formepdf/forme:0.9.0

Then call it the same way:

curl -X POST http://localhost:4000/v1/redact \
  -H "Content-Type: application/json" \
  -d '{
    "pdf": "<base64>",
    "presets": ["ssn", "email"]
  }' \
  --output redacted.pdf

No API key required for self-hosted. No data leaves your infrastructure. Ideal for healthcare, legal, and government use cases.


Metadata Scrubbing

One thing Forme does automatically on every redaction that many tools miss: metadata scrubbing.

PDFs store metadata in a document information dictionary and, often, an XMP metadata stream -- author name, creation date, software used, and more. Files saved incrementally can also retain edit history and content from earlier versions. None of this is touched by a visual overlay redaction.

Forme strips metadata on every redaction automatically:

  • Author and creator fields cleared
  • Edit history removed
  • Producer replaced with "Forme"
  • Modification date updated

No opt-in required.
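
A rough way to spot leftover metadata in any output file is to scan its raw bytes for common information-dictionary keys. The sketch below is a crude, standard-library-only check and the helper name `find_info_strings` is hypothetical. It only sees literal-string values stored in plain bytes, so it will miss hex or UTF-16 strings, XMP packets, and anything inside compressed object streams. A hit means metadata is present; no hits does not prove the file is clean.

```python
import re

INFO_KEYS = (b"Author", b"Creator", b"Producer", b"Title", b"Subject", b"Keywords")


def find_info_strings(pdf_bytes: bytes) -> dict[str, list[bytes]]:
    """Collect literal-string values of common Info keys visible in uncompressed bytes."""
    found: dict[str, list[bytes]] = {}
    for key in INFO_KEYS:
        # Matches e.g. "/Author (Jane Doe)", allowing escaped characters in the value.
        pattern = rb"/" + key + rb"\s*\(((?:[^()\\]|\\.)*)\)"
        values = re.findall(pattern, pdf_bytes)
        if values:
            found[key.decode("ascii")] = values
    return found


sample = b"1 0 obj\n<< /Author (Jane Doe) /Producer (Forme) >>\nendobj\n"
print(find_info_strings(sample))
```

Running this on a file before and after redaction gives a quick before/after comparison of which fields changed.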


Verifying Your Redaction

After redacting:

  1. Open the output PDF in Chrome
  2. Try to select text in the redacted area
  3. Text should not be selectable

You can also verify programmatically using Forme's text search:

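import { findTextRegions } from '@formepdf/core';
import { readFileSync } from 'fs';

// Load the redacted output written earlier
const redactedPdf = readFileSync('redacted.pdf');
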
const regions = await findTextRegions(redactedPdf, [
  { pattern: 'John Smith', pattern_type: 'Literal' }
]);

// Should be 0 if redaction worked
console.log(regions.length);

Limitations

Scanned documents -- If a PDF is a scanned image, there are no text operators to remove. You need OCR first to make the text machine-readable before redacting it.

CJK text -- Chinese, Japanese, and Korean text is typically stored with composite, CID-keyed fonts (CIDFont) rather than a single-byte encoding, which requires additional glyph mapping. Forme currently redacts only WinAnsi (Latin) encoded text.

Encrypted PDFs -- Must be decrypted before redaction.
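
The encrypted case can be caught before a file is sent anywhere. The sketch below is a rough pre-flight check with a hypothetical helper name (`preflight`). It relies on the fact that an encrypted PDF carries an `/Encrypt` entry in its trailer or cross-reference stream dictionary, which is stored uncompressed. It is a byte-level heuristic, not a parser, and it does not attempt to detect scanned or CJK documents.

```python
def preflight(pdf_bytes: bytes) -> list[str]:
    """Return a list of reasons this file probably cannot be redacted as-is."""
    problems = []
    # The %PDF- marker normally sits at the very start; allow some leading bytes.
    if b"%PDF-" not in pdf_bytes[:1024]:
        problems.append("no %PDF- header found; not a PDF?")
    # Encrypted files reference their encryption dictionary via /Encrypt.
    if b"/Encrypt" in pdf_bytes:
        problems.append("appears to be encrypted; decrypt before redacting")
    return problems


print(preflight(b"%PDF-1.7\ntrailer\n<< /Root 1 0 R /Encrypt 5 0 R >>\n"))
# ['appears to be encrypted; decrypt before redacting']
```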


Getting Started

Sign up at app.formepdf.com. The free plan includes 50 operations per month -- enough to test your redaction workflow.

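npm install @formepdf/sdk

import { FormeClient } from '@formepdf/sdk';
import { readFileSync, writeFileSync } from 'fs';

const client = new FormeClient({
  apiKey: process.env.FORME_API_KEY
});

// Load the source PDF from disk
const pdfBytes = readFileSync('document.pdf');

const redacted = await client.redact(pdfBytes, {
  presets: ['ssn', 'email', 'phone'],
});

writeFileSync('redacted.pdf', redacted);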

Self-hosting is available via the formepdf/forme Docker image for teams that need documents to stay on their own infrastructure.