
PII redaction is one of the most economically attractive AI workloads in 2026. The task is narrow, repetitive, and heavily input-driven. You are asking a model to find names, emails, phone numbers, addresses, account IDs, medical identifiers, or other sensitive spans inside text, then return either a masked version or structured spans for downstream redaction. That is not frontier-reasoning wizardry. It is disciplined pattern recognition with enough context to avoid stupid mistakes.

That matters because many teams still buy this workflow like they are commissioning a legal brief. They send every support log, contract, HR packet, and export dump to premium models by default. Then they act shocked when the monthly bill looks like a small payroll expense. The blunt answer is simple: most PII redaction should run on cheap or mid-tier models, with premium review reserved for the messy edge cases.

This guide uses current model prices from the AI Cost Check catalog and three practical workload shapes: short support logs, medium contracts or policy files, and large DSAR or HR export packets. If you already know the token math basics, great. If not, read the token guide after this. Either way, the recommendation will be the same: keep first-pass redaction cheap, keep output lean, and pay for premium review only when the risk justifies it.

What actually drives PII redaction cost

PII redaction costs are dominated by input tokens. The model has to read the document. The output should be relatively compact if you design the workflow correctly.

The cheapest production setup does not ask the model to rewrite the full document with every sensitive field replaced. That is the lazy version. The smart version asks for:

sensitive spans,
entity types,
character offsets or quoted text,
confidence scores,
optional reasons only for ambiguous cases.

That output format keeps token usage under control and makes the system auditable. Your application can mask the text deterministically after the model returns the spans.
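Deterministic masking from model-returned spans can be sketched in a few lines. This is a minimal illustration, not a production masker: the `Span` shape, the `[LABEL]` placeholder style, and the `mask_text` name are assumptions made for the example, and it assumes the model returns character offsets rather than quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int         # inclusive character offset (assumed output format)
    end: int           # exclusive character offset
    label: str         # entity type, e.g. "EMAIL"
    confidence: float  # model-reported score in [0, 1]


def mask_text(text: str, spans: list[Span]) -> str:
    """Replace each span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for span in sorted(spans, key=lambda s: (s.start, -s.end)):
        if not (0 <= span.start < span.end <= len(text)):
            raise ValueError(f"span out of range: {span}")
        if span.start < cursor:
            # Overlaps an already-masked span: absorb it so no
            # sensitive characters leak out after the placeholder.
            cursor = max(cursor, span.end)
            continue
        pieces.append(text[cursor:span.start])
        pieces.append(f"[{span.label}]")
        cursor = span.end
    pieces.append(text[cursor:])
    return "".join(pieces)


text = "Contact Jane Doe at jane@example.com."
spans = [Span(8, 16, "NAME", 0.97), Span(20, 36, "EMAIL", 0.99)]
masked = mask_text(text, spans)
print(masked)  # Contact [NAME] at [EMAIL].
```

Because the masking step is plain string slicing, it costs zero output tokens and produces the same result every time for the same spans, which is what makes the span list auditable.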

💡 Key Takeaway: PII redaction is cheap when the model returns spans, labels, and confidence. It gets expensive when you make the model rewrite whole documents for no good reason.

There are four cost levers that matter more than everything else:

Document length. A 2,500-token support log and a 120,000-token export packet do not belong on the same pricing spreadsheet.
Output style. Span lists are cheap. Full rewritten documents are not.
Escalation rate. If only 5% of files need premium review, your budget stays sane.
Context requirements. Long HR files, legal packets, and multi-document exports favor large-context models even when the nominal per-token price is higher than the absolute cheapest option.

For a broader intake-and-extraction view, pair this guide with AI OCR and document processing costs in 2026. OCR and redaction are adjacent problems, but they are not the same budget line.

$124/month: the cost to redact 100,000 contract-sized files on GPT-5 nano using structured spans instead of premium-by-default review.

That number is the real headline. Redaction at scale is affordable. Sloppy architecture is what makes it annoying.

Baseline model pricing for redaction work

These are the most useful candidates from the current pricing catalog for a text-first redaction pipeline:

Model	Input $/1M tokens	Output $/1M tokens	Context (tokens)	Redaction fit
GPT-5 nano	$0.05	$0.40	128K	Best for short, high-volume files
Mistral Small 3.2	$0.10	$0.30	128K	Cheap structured extraction
Llama 4 Scout	$0.08	$0.30	10M	Best cheap option for very long documents
GPT-5 mini	$0.25	$2.00	500K	Safe default when quality matters
DeepSeek V3.2	$0.28	$0.42	128K	Strong price-to-quality middle ground
Gemini 2.5 Flash	$0.30	$2.50	1M	Strong long-context operational default
Claude Sonnet 4.6	$3.00	$15.00	1M	Premium escalation only
Claude Opus 4.7	$5.00	$25.00	1M	Rare, high-stakes exception lane

Two opinions worth stating clearly:

GPT-5 nano is the budget king for short, repetitive redaction jobs.
Llama 4 Scout is the most interesting cheap option for long files because the 10M-token context removes a lot of chunking nonsense.

Premium Anthropic models are not bad choices. They are just bad defaults. If you send routine redaction work to Sonnet or Opus, you are paying review-lawyer prices for clerical labor.

$0.0018: Llama 4 Scout to redact a 20K-token contract

$0.0690: Claude Sonnet 4.6 for the same contract

That is the entire story in one line. Sonnet is useful. Sonnet-by-default is dumb.

Cost per document across three realistic workloads

To keep the math grounded, use these baseline document shapes:

Short support log: 2,500 input tokens, 250 output tokens
Contract or policy file: 20,000 input tokens, 600 output tokens
Large DSAR or HR export: 120,000 input tokens, 2,000 output tokens

Those outputs assume structured spans, entity labels, and confidence metadata. They do not assume a full rewritten redacted document.
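The per-file figures in the scenarios that follow come from one formula: input tokens times the input price plus output tokens times the output price, divided by one million. A small sketch that reproduces them, using the prices from the table above and the three document shapes; the function and dictionary names are illustrative only.

```python
# USD per 1M tokens (input, output), copied from this guide's pricing table.
PRICES = {
    "GPT-5 nano": (0.05, 0.40),
    "Mistral Small 3.2": (0.10, 0.30),
    "Llama 4 Scout": (0.08, 0.30),
    "GPT-5 mini": (0.25, 2.00),
    "DeepSeek V3.2": (0.28, 0.42),
    "Gemini 2.5 Flash": (0.30, 2.50),
    "Claude Sonnet 4.6": (3.00, 15.00),
    "Claude Opus 4.7": (5.00, 25.00),
}

# (input tokens, output tokens) for the three baseline document shapes.
WORKLOADS = {
    "support_log": (2_500, 250),
    "contract": (20_000, 600),
    "dsar_export": (120_000, 2_000),
}


def cost_per_doc(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one document on the given model."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


for model in PRICES:
    row = ", ".join(
        f"{name} ${cost_per_doc(model, *shape):.4f}"
        for name, shape in WORKLOADS.items()
    )
    print(f"{model}: {row}")
```

Multiply any per-file result by your monthly file count to get the monthly figure. Swap in your own token counts if your files or output format differ from these baselines.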

Scenario 1: short support logs and ticket transcripts

This is the common “remove customer identifiers before downstream analysis” case. Think support tickets, abuse reports, complaint summaries, CRM notes, or chat transcripts.

Model	Cost per file
GPT-5 nano	$0.0002
Llama 4 Scout	$0.0003
Mistral Small 3.2	$0.0003
DeepSeek V3.2	$0.0008
GPT-5 mini	$0.0011
Gemini 2.5 Flash	$0.0014
Claude Sonnet 4.6	$0.0113
Claude Opus 4.7	$0.0188

At this workload size, the premium tiers are economically absurd unless the redaction also includes delicate legal interpretation. For ordinary support and ops data, use GPT-5 nano or Mistral Small 3.2 and move on with your life.

Scenario 2: contracts, policies, and compliance documents

This is where people start overbuying. A contract is longer, yes. That does not mean it suddenly deserves Opus-level spending.

Model	Cost per file
GPT-5 nano	$0.0012
Llama 4 Scout	$0.0018
Mistral Small 3.2	$0.0022
DeepSeek V3.2	$0.0059
GPT-5 mini	$0.0062
Gemini 2.5 Flash	$0.0075
Claude Sonnet 4.6	$0.0690
Claude Opus 4.7	$0.1150

The cheap models are still cheap. The premium models are still expensive. The main difference is that long-context flexibility starts to matter more, which makes Scout and Gemini Flash more attractive than their raw per-token ranking might suggest.

📊 Quick Math: Redacting 1,000 contract-sized files costs about $1.80 on Llama 4 Scout, $6.20 on GPT-5 mini, and $69.00 on Claude Sonnet 4.6.

Scenario 3: DSAR packets, HR exports, and large mixed files

This is the ugly stuff: long employee records, data exports, combined case files, or giant internal packets that may contain identifiers across many sections.

Model	Cost per file
GPT-5 nano	$0.0068
Llama 4 Scout	$0.0102
Mistral Small 3.2	$0.0126
GPT-5 mini	$0.0340
DeepSeek V3.2	$0.0344
Gemini 2.5 Flash	$0.0410
Claude Sonnet 4.6	$0.3900
Claude Opus 4.7	$0.6500

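The cheap lane is still viable, but this is where context-window convenience matters. At 120,000 input tokens plus 2,000 output tokens, a 128K-context model has only a few thousand tokens of headroom left for instructions and variance in file size. If you can fit the whole file cleanly into Scout or Gemini Flash without chunking and reconciliation overhead, that is usually the correct engineering decision.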
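One way to make the context decision mechanical is to pick the cheapest model whose window fits the file with some safety margin. The sketch below is a hedged illustration: the 10% `headroom` default is an arbitrary assumption, "128K" is read as 128,000 tokens, and `cheapest_fit` is a hypothetical helper, not part of any pricing tool.

```python
from typing import Optional

# (input $/1M, output $/1M, context window in tokens) from this guide's table.
# Assumption: "128K" means 128,000 tokens and "10M" means 10,000,000.
CATALOG = {
    "GPT-5 nano": (0.05, 0.40, 128_000),
    "Mistral Small 3.2": (0.10, 0.30, 128_000),
    "Llama 4 Scout": (0.08, 0.30, 10_000_000),
    "GPT-5 mini": (0.25, 2.00, 500_000),
    "DeepSeek V3.2": (0.28, 0.42, 128_000),
    "Gemini 2.5 Flash": (0.30, 2.50, 1_000_000),
}


def cheapest_fit(input_tokens: int, output_tokens: int,
                 headroom: float = 0.10) -> Optional[str]:
    """Return the cheapest model that fits the file plus a safety margin."""
    needed = (input_tokens + output_tokens) * (1 + headroom)
    candidates = []
    for name, (in_price, out_price, context) in CATALOG.items():
        if needed <= context:
            cost = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
            candidates.append((cost, name))
    # None means no model fits and the file has to be chunked.
    return min(candidates)[1] if candidates else None


print(cheapest_fit(20_000, 600))     # contract-sized file
print(cheapest_fit(120_000, 2_000))  # DSAR-sized file with 10% headroom
```

With a 10% margin, the DSAR-sized file no longer fits the 128K models and the pick moves to Llama 4 Scout; with zero margin, GPT-5 nano still wins on price. How much headroom you need depends on your prompt overhead and how much your file sizes vary.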

Monthly cost at scale, where the bad habits get expensive

Per-document math is useful for demos. Monthly math is what forces discipline.

100,000 short support logs per month

Model	Monthly cost
GPT-5 nano	$22.50
Llama 4 Scout	$27.50
Mistral Small 3.2	$32.50
DeepSeek V3.2	$80.50
GPT-5 mini	$112.50
Gemini 2.5 Flash	$137.50
Claude Sonnet 4.6	$1,125.00
Claude Opus 4.7	$1,875.00

100,000 contract-sized files per month

Model	Monthly cost
GPT-5 nano	$124.00
Llama 4 Scout	$178.00
Mistral Small 3.2	$218.00
DeepSeek V3.2	$585.20
GPT-5 mini	$620.00
Gemini 2.5 Flash	$750.00
Claude Sonnet 4.6	$6,900.00
Claude Opus 4.7	$11,500.00

This is the moment where “quality-first” hand-waving usually falls apart. If your first-pass redaction workflow really needs Sonnet on all 100,000 contracts, your prompts, chunking strategy, or validation design probably suck.

⚠️ Warning: The fastest way to blow your redaction budget is to ask the model to rewrite the full document. A 20K-token contract that returns a 20K-token redacted rewrite costs about $0.0076 on Llama 4 Scout instead of $0.0018, and $0.3600 on Claude Sonnet 4.6 instead of $0.0690.
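The rewrite penalty in that warning is easy to check. The sketch below compares a 600-token span response against a 20,000-token full rewrite for the same 20K-token contract, using the two models named in the warning; `doc_cost` is an illustrative helper.

```python
def doc_cost(input_tokens: int, output_tokens: int,
             in_price: float, out_price: float) -> float:
    """Cost in USD for one document; prices are USD per 1M tokens."""
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


# (input $/1M, output $/1M) from this guide's pricing table.
MODELS = {
    "Llama 4 Scout": (0.08, 0.30),
    "Claude Sonnet 4.6": (3.00, 15.00),
}
INPUT_TOKENS = 20_000

for name, (in_price, out_price) in MODELS.items():
    spans = doc_cost(INPUT_TOKENS, 600, in_price, out_price)
    rewrite = doc_cost(INPUT_TOKENS, INPUT_TOKENS, in_price, out_price)
    print(f"{name}: spans ${spans:.4f}, rewrite ${rewrite:.4f}, "
          f"{rewrite / spans:.1f}x more expensive")
```

The multiplier is larger on models with a wide gap between input and output prices, because the rewrite moves most of the bill onto the output side.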

That is not a subtle difference. It is a tax on laziness.

The routing policy I would actually ship

A real PII-redaction system should have three lanes.

Lane 1: deterministic pre-filter

Use regexes, validators, and known schema rules first. Email addresses, obvious phone numbers, SSNs, passport patterns, and database field labels should not require a model call every time. Save the model for contextual cases, not the low-hanging fruit.
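A pre-filter of this kind can be a handful of compiled patterns. The ones below are deliberately simple, US-centric assumptions for illustration (they will miss international phone formats and unusual email forms), and the `prefilter` name is hypothetical. Real deployments usually add checksum validators and schema-aware rules on top.

```python
import re

# Simplified illustrative patterns; tune and extend for your own data.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def prefilter(text: str) -> list[tuple[int, int, str]]:
    """Return (start, end, label) for every pattern match, sorted by offset."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), match.end(), label))
    return sorted(hits)


sample = "Reach me at jane@example.com or 555-867-5309. SSN 123-45-6789."
hits = prefilter(sample)
for start, end, label in hits:
    print(label, repr(sample[start:end]))
```

Anything caught here never needs a model call, and the hits share the same span shape the model lane returns, so both can feed one masking step.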

Lane 2: cheap first-pass model

For most text, use GPT-5 nano, Mistral Small 3.2, or Llama 4 Scout depending on document length. Ask for spans, entity types, and confidence.

Lane 3: premium escalation

Escalate only when confidence is low, the document is regulated, the text is ambiguous, or the file mixes many identities and contexts. This is where GPT-5 mini or Claude Sonnet 4.6 earns its keep.
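The escalation rule can be expressed as a single predicate over the first-pass result. This is a sketch under assumptions: the field names, the 0.85 confidence floor, and the limit of 10 distinct identities are placeholders to tune against your own review data, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class FirstPassResult:
    min_confidence: float  # lowest span confidence the cheap model reported
    regulated: bool        # set from document metadata or policy rules
    distinct_people: int   # identities detected in the file


def needs_escalation(result: FirstPassResult,
                     confidence_floor: float = 0.85,
                     max_people: int = 10) -> bool:
    """True when the file should go to the premium lane."""
    return (
        result.min_confidence < confidence_floor
        or result.regulated
        or result.distinct_people > max_people
    )


routine = FirstPassResult(min_confidence=0.95, regulated=False, distinct_people=2)
messy = FirstPassResult(min_confidence=0.60, regulated=False, distinct_people=2)
print(needs_escalation(routine), needs_escalation(messy))
```

Logging which condition fired for each escalated file makes it easier to see whether the escalation rate is driven by genuinely hard documents or by thresholds that are too strict.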

For 100,000 contract-sized files per month, a smart routed stack looks like this:

Stack	Routing policy	Monthly cost	When to use
Cheapest high-volume stack	95% GPT-5 nano, 5% GPT-5 mini	$148.80	Support ops, standard privacy cleanup, moderate risk
Long-document stack	95% Llama 4 Scout, 5% Claude Sonnet 4.6	$514.10	HR archives, policy packets, long mixed files
Premium-by-default mistake	100% Claude Sonnet 4.6	$6,900.00	Use only if you enjoy lighting money on fire

The second row is the one I would ship for serious long-form redaction. It buys context headroom and a premium escape hatch without letting the entire pipeline inherit premium pricing.
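The blended figures in the routing table are a weighted average of per-file costs. A short sketch that reproduces them, using the contract-sized per-file costs (20,000 input tokens, 600 output tokens) at the table prices; `blended_monthly_cost` is an illustrative name.

```python
# Per-file cost in USD for a contract-sized file, at this guide's prices.
CONTRACT_COST = {
    "GPT-5 nano": 0.00124,
    "GPT-5 mini": 0.0062,
    "Llama 4 Scout": 0.00178,
    "Claude Sonnet 4.6": 0.069,
}


def blended_monthly_cost(files: int, per_file: dict[str, float],
                         shares: dict[str, float]) -> float:
    """Monthly cost when each file is billed once, split by traffic share."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("traffic shares must sum to 1.0")
    return files * sum(per_file[model] * share for model, share in shares.items())


cheap_stack = blended_monthly_cost(
    100_000, CONTRACT_COST, {"GPT-5 nano": 0.95, "GPT-5 mini": 0.05})
long_doc_stack = blended_monthly_cost(
    100_000, CONTRACT_COST, {"Llama 4 Scout": 0.95, "Claude Sonnet 4.6": 0.05})
print(f"${cheap_stack:,.2f}  ${long_doc_stack:,.2f}")
```

These totals assume each file is billed on exactly one model. If escalated files go through the cheap first pass before the premium model sees them, add the cheap model's cost for that 5% as well; the difference is small next to the premium share.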

✅ TL;DR: Use deterministic rules first, cheap models for the bulk pass, and premium review only for ambiguous or regulated documents. That is the whole playbook.

If you also need labeling, downstream QA, or human-review economics, the companion read is AI data labeling costs in 2026. Redaction quality is not just model quality. It is also queue design.

Where teams waste money on redaction

1. They treat redaction like generation

Redaction is an extraction-and-masking problem. If your prompt reads like you are commissioning a polished rewrite, you are paying output-token rates for zero extra business value.

2. They use premium models to compensate for missing policy logic

When the rules for “what counts as sensitive” are not clearly defined, teams reach for a bigger model. That is backwards. Fix your taxonomy first. A sharp prompt and a validation layer beat brute-force premium spend.

3. They ignore chunking until long files explode

If the model cannot fit the file cleanly, the system becomes fragile. This is why AI contract review costs in 2026 and redaction economics overlap: long documents punish sloppy context management.
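When a file does have to be chunked, the fragile parts are boundary handling and offset bookkeeping. The sketch below shows one common approach, overlapping windows with spans re-based to document offsets. The character-based sizes are an assumption standing in for token budgets (a real system would count tokens with the target model's tokenizer), and all three helper names are hypothetical.

```python
def chunk_text(text: str, chunk_chars: int = 8000,
               overlap_chars: int = 400) -> list[tuple[int, str]]:
    """Split text into overlapping windows; returns (document offset, chunk)."""
    if overlap_chars >= chunk_chars:
        raise ValueError("overlap must be smaller than the chunk size")
    step = chunk_chars - overlap_chars
    chunks = []
    start = 0
    while True:
        chunks.append((start, text[start:start + chunk_chars]))
        if start + chunk_chars >= len(text):
            break
        start += step
    return chunks


def rebase_spans(chunk_offset: int,
                 spans: list[tuple[int, int, str]]) -> list[tuple[int, int, str]]:
    """Convert chunk-relative (start, end, label) spans to document offsets."""
    return [(s + chunk_offset, e + chunk_offset, label) for s, e, label in spans]


def merge_spans(spans: list[tuple[int, int, str]]) -> list[tuple[int, int, str]]:
    """Drop exact duplicates found twice inside overlap regions."""
    return sorted(set(spans))


chunks = chunk_text("a" * 25, chunk_chars=10, overlap_chars=2)
print([(offset, len(chunk)) for offset, chunk in chunks])
```

The overlap exists so that an identifier straddling a boundary appears whole in at least one chunk. It also means every overlap region is billed twice as input, which is part of why a model that fits the whole file can be the cheaper engineering choice.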

4. They never measure false positives separately from false negatives

Over-redaction harms usability. Under-redaction harms compliance. These are different failures. The right system tracks both and routes the uncertain cases. Throwing everything at Opus is not a measurement strategy.
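Tracking the two failure types separately only needs a labeled sample and set arithmetic. The sketch below assumes exact-match scoring on (start, end, label) tuples, which is a simplification; many teams also credit partial overlaps. The `span_metrics` name is illustrative.

```python
def span_metrics(predicted, gold) -> dict:
    """Compare predicted spans to a labeled gold set using exact matches."""
    predicted, gold = set(predicted), set(gold)
    true_pos = len(predicted & gold)
    false_pos = len(predicted - gold)  # over-redaction: hurts usability
    false_neg = len(gold - predicted)  # under-redaction: hurts compliance
    precision = true_pos / (true_pos + false_pos) if predicted else 1.0
    recall = true_pos / (true_pos + false_neg) if gold else 1.0
    return {
        "false_positives": false_pos,
        "false_negatives": false_neg,
        "precision": precision,
        "recall": recall,
    }


predicted = [(0, 4, "NAME"), (10, 20, "EMAIL"), (30, 35, "PHONE")]
gold = [(0, 4, "NAME"), (10, 20, "EMAIL"), (40, 51, "SSN")]
metrics = span_metrics(predicted, gold)
print(metrics)
```

Run it per model and per document type. A cheap model with high recall and mediocre precision may be acceptable for the bulk pass, while low recall is the signal to tighten the escalation rule.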

5. They forget that auditability has token consequences

You do want confidence scores, labels, and reasons for borderline cases. You do not want verbose essay explanations for every field. Keep the default response terse and trigger expanded rationales only when the file is going to human review.

The right mental model is simple: redaction is a cheap bulk workflow with a small expensive exception queue. Build it that way.

Frequently asked questions

What is the cheapest model for AI PII redaction in 2026?

For short, high-volume text files, GPT-5 nano is the cheapest practical option in the current catalog at $0.05 input and $0.40 output per million tokens. For very long files, Llama 4 Scout is often the smarter cheap choice because its 10M-token context reduces chunking overhead.

How much does AI PII redaction cost per contract?

For a contract-sized file using 20,000 input tokens and 600 output tokens, the cost is roughly $0.0012 on GPT-5 nano, $0.0018 on Llama 4 Scout, $0.0062 on GPT-5 mini, and $0.0690 on Claude Sonnet 4.6. Use the calculator if your files are longer or your output format is heavier.

Should I use a premium model for privacy redaction?

No, not by default. Premium models belong in the escalation lane for ambiguous legal language, heavily mixed identity contexts, or regulated edge cases. For the bulk pass, cheap and mid-tier models are the correct economic choice.

Is it better to return spans or a fully redacted document?

Return spans unless you have a very specific reason not to. Span-based output is cheaper, easier to audit, and easier to combine with deterministic masking. Full rewritten documents increase output cost and make validation harder.

What is the best way to cut redaction costs without increasing risk?

Combine deterministic pattern rules, a cheap first-pass model, confidence thresholds, and a premium escalation queue. That architecture cuts cost harder than prompt tweaking and usually improves review quality at the same time.

Calculate your own redaction stack before shipping it

If you are budgeting a privacy workflow, do not guess. Run the numbers in AI Cost Check, then compare the surrounding workflows that usually travel with redaction: AI OCR and document processing costs, AI contract review costs, and what AI tokens are.

The correct strategy is boring and profitable: keep bulk redaction on cheap models, reserve premium reasoning for the ugly exceptions, and make the model emit the smallest useful output. That is how you protect privacy without turning compliance into a ridiculous API bill.


AI PII Redaction Costs in 2026: Cost Per Document, Per 100,000 Files, and the Cheapest Models
