Legal Tech

E-Discovery Sanctions From AI Redaction Failures: How Over-Redaction Became a Legal Liability

In Athletics Investment Group v. Schnitzer Steel (2024), improper redaction triggered discovery sanctions. With AI tools achieving only 22.7% precision rates on legal documents, the risk is systematic.

March 5, 2026 · 10 min read
e-discovery sanctions · redaction liability · AI redaction precision · document review · legal technology

The Double Liability of Improper Redaction

Legal teams face two distinct redaction failure modes, and both create liability.

Under-redaction exposes privileged content, confidential business information, or personal data that should have been withheld. The producing party has disclosed material it had the right — and in some cases the obligation — to protect.

Over-redaction withholds responsive information that opposing counsel is entitled to receive. The producing party has obstructed the discovery process, potentially hiding evidence behind illegitimate privilege claims. Courts treat over-redaction as a discovery violation subject to sanctions.

AI-assisted redaction tools that prioritize recall over precision, flagging as much potentially sensitive content as possible, systematically produce the second failure mode. When an AI redaction engine redacts 80% of a document's content to ensure it does not miss anything privileged, the resulting production is functionally useless and potentially sanctionable.

Athletics Investment Group v. Schnitzer Steel (2024)

The 2024 case of Athletics Investment Group v. Schnitzer Steel illustrates the judicial response to improper redaction in e-discovery.

The case involved a commercial dispute in which one party's document production included redactions that opposing counsel challenged as unjustified. The court examined the redacted materials and found that the redactions exceeded what privilege law or confidentiality doctrines permitted.

The consequence: discovery sanctions. The court imposed penalties on the producing party for the improper redactions — a remedy available under Federal Rule of Civil Procedure 37 for discovery violations. The producing party bore the burden of having used an inadequate redaction process.

The case is significant not because over-redaction sanctions are novel — courts have awarded them for years — but because it occurred in a litigation landscape where AI-assisted review tools are now common. The question the case raises is whether legal teams have evaluated the precision characteristics of their AI redaction tools before relying on them for production.

The 22.7% Precision Problem

Presidio, the open-source PII detection engine developed by Microsoft and widely used in legal technology applications, achieves a 22.7% precision rate on legal documents in independent benchmarking.

Precision measures how often the tool's positive identifications are correct. A 22.7% precision rate means that approximately 77 out of every 100 items flagged by the tool as sensitive do not actually meet the sensitivity threshold they were flagged for.
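The arithmetic behind that claim is straightforward. A minimal sketch in Python; the counts below are illustrative, chosen to match the 22.7% figure, and do not come from the benchmark itself:

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of a tool's flagged items that were correctly flagged."""
    return true_positives / (true_positives + false_positives)

# Illustrative counts: out of 1,000 items a tool flags as sensitive,
# a 22.7% precision rate means roughly 227 are genuinely sensitive
# and 773 are false positives that a reviewer must clear manually.
flagged = 1000
correct = round(flagged * 0.227)   # ~227 legitimate flags
false_alarms = flagged - correct   # ~773 unjustified flags

print(f"precision = {precision(correct, false_alarms):.3f}")  # precision = 0.227
```

Note that precision says nothing about what the tool *missed*; that is recall, the separate metric discussed below.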

For an e-discovery application, this has direct operational consequences. A production set of 10,000 documents processed with a tool achieving 22.7% precision will contain thousands of redactions that have no legitimate privilege or confidentiality basis. The producing party who relies on that output faces the same exposure as the party in Athletics Investment Group: a production that opposing counsel will challenge, a court that will examine the redacted content, and sanctions if the redactions cannot be justified.

The 22.7% figure reflects Presidio's out-of-box configuration on legal content. It does not represent all AI-assisted redaction tools — but it does represent the baseline performance of the most commonly deployed open-source engine in legal technology integrations.

The precision problem is structural: NLP-based entity recognition systems trained on general text corpora perform differently on legal language, which uses terms of art, abbreviations, document formatting conventions, and citation structures that differ from training data. A tool that achieves acceptable precision on medical records or financial statements may perform substantially worse on deposition transcripts, correspondence, and contract exhibits.
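The failure mode is easy to reproduce even with a deliberately naive detector. The sketch below is hypothetical: a crude capitalization-based "person name" heuristic stands in for a general-purpose NER model, and the sample sentence is invented. It illustrates the structural point, which is that surface patterns learned from everyday text misfire on legal terms of art:

```python
import re

# Naive "person name" heuristic: two consecutive capitalized words.
# Real NER models use far richer features, but share the underlying
# assumption that capitalization patterns from general text generalize.
# Legal writing, full of capitalized terms of art, breaks that assumption.
NAME_PATTERN = re.compile(r"\b[A-Z][a-z]+ [A-Z][a-z]+\b")

legal_text = (
    "Pursuant to Federal Rule 37, the Southern District granted the "
    "motion; see Smith Industries, plaintiff, versus John Doe."
)

hits = NAME_PATTERN.findall(legal_text)
print(hits)
# → ['Federal Rule', 'Southern District', 'Smith Industries', 'John Doe']
# Only one of the four hits is an actual person name: 25% precision.
```

A model tuned on general corpora makes subtler versions of the same mistake across citations, captions, and defined contract terms.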

What AI Chatbot Content Analysis Reveals

Usage data establishes the context for AI tool adoption in legal practice: 27.4% of AI chatbot content is sensitive, according to independent analysis of enterprise AI tool usage patterns.

This figure describes what employees submit to AI tools in the course of work tasks: not data they deliberately chose to share, but sensitive content included incidentally. For legal professionals using AI tools to draft correspondence, summarize depositions, analyze contracts, or research case law, sensitive content enters AI platforms as a byproduct of normal work.

The 27.4% figure establishes that nearly three in ten interactions with AI tools in a legal environment involve sensitive content — client information, privileged communications, confidential case strategy, or opposing party data. That content reaches the AI provider's infrastructure in usable form unless technical controls intercept it first.

For law firms evaluating their AI security posture, 27.4% is not a marginal risk. It is the baseline assumption: nearly a third of AI tool usage in a legal environment will involve content that warrants protection.

The Cascading Liability Chain

Over-redaction and AI tool data exposure create distinct but related liability chains for legal teams.

Over-redaction liability chain: AI tool flags documents maximally → attorney reviews output without examining each redaction individually → production submitted with unjustified redactions → opposing counsel challenges → court examines → sanctions.

AI exposure liability chain: Attorney uses AI tool to assist with case work → AI tool receives privileged client communications, confidential strategies, or sensitive case data → AI vendor infrastructure is breached → client data is exposed → attorney-client privilege is potentially implicated → malpractice exposure.

Both chains begin at the same point: legal teams deploying AI tools without understanding the technical characteristics of those tools or implementing controls appropriate to legal work.

The judicial standard for redaction is not recall-optimized. Courts evaluating challenged redactions ask whether each specific redaction was justified by privilege, confidentiality doctrine, or applicable protective order — not whether the producing party's tool flagged as much as possible to be safe.

A redaction that cannot be justified is a discovery violation regardless of whether it was produced by a human reviewer or an AI tool. The court's inquiry is document-specific, not system-level.

For legal teams, the operational implication is that redaction tools must be evaluated on precision — the percentage of flagged items that are legitimately privileged or confidential — not just recall. A tool that achieves 90% recall with 22.7% precision may catch more sensitive content, but it imposes a manual review burden for the 77.3% of false positives and creates systematic over-redaction risk when that review does not occur.
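The review burden implied by that tradeoff can be quantified from recall and precision alone. A sketch with illustrative numbers; the assumption of 1,000 genuinely privileged items is hypothetical, while the 90% recall and 22.7% precision figures are the ones discussed above:

```python
def review_burden(actual_sensitive: int, recall: float, precision: float):
    """Given the true count of sensitive items in a corpus and a tool's
    recall and precision, return (total flags, false positives, misses)."""
    true_positives = recall * actual_sensitive
    total_flags = true_positives / precision
    false_positives = total_flags - true_positives
    missed = actual_sensitive - true_positives
    return round(total_flags), round(false_positives), round(missed)

# Illustrative: 1,000 genuinely privileged items in a production set,
# processed by a tool with 90% recall and 22.7% precision.
flags, false_pos, missed = review_burden(1000, recall=0.90, precision=0.227)
print(flags, false_pos, missed)  # → 3965 3065 100
```

Roughly 3,965 redactions, of which about 3,065 have no legitimate basis, and 100 privileged items still slip through. Every one of those 3,065 unjustified redactions either gets cleared in manual review or becomes a potential discovery violation.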

The legal environment demands precision at the document level. Each redaction in a production represents an implicit assertion to the court that the redacted content is legitimately withheld. The post-Athletics Investment Group standard is clear: that assertion needs to be accurate.

