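Anonymization vs. Pseudonymization: The Key Distinction Behind a €20 Million Fine

GDPR treats anonymized data and pseudonymized data very differently. True anonymization takes data out of GDPR's scope entirely; pseudonymization does not. Confusing the two can expose you to the top-tier fine of up to €20 million.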

George Curta | May 8, 2026 | 8 min read

Tags: GDPR anonymization pseudonymization, Article 4 recital 26, personal data scope, 20 million EUR fine, anonymization compliance determination

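Article 83 of the GDPR caps fines at €20 million or 4% of worldwide annual turnover, whichever is higher. One legal question decides whether that exposure attaches to your dataset: does the law apply to the data you hold?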
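The ceiling is simple arithmetic. As a minimal sketch, assuming the upper tier of Article 83, where the cap is the greater of the two figures, it can be written as follows. The function name and the sample turnover figures are illustrative and not taken from the regulation.

```python
FIXED_CAP_EUR = 20_000_000   # fixed ceiling in euros
TURNOVER_PERCENT = 4         # share of worldwide annual turnover

def max_fine_eur(annual_turnover_eur: int) -> int:
    """Upper bound of the fine: whichever of the two ceilings is higher."""
    turnover_cap = annual_turnover_eur * TURNOVER_PERCENT // 100
    return max(FIXED_CAP_EUR, turnover_cap)

# Illustrative turnovers: 100 million and 2 billion euros.
print(max_fine_eur(100_000_000))    # 4% is 4 million, so the fixed cap governs
print(max_fine_eur(2_000_000_000))  # 4% is 80 million, so the turnover cap governs
```

This only computes the ceiling. The fine actually imposed in a given case depends on the supervisory authority's assessment.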

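Anonymization takes data out of GDPR's scope; pseudonymization does not. The legal consequences on either side of that line are entirely different.

Both definitions in plain terms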

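Recital 26 sets the bar for anonymization: the individual must be "not or no longer identifiable". The test is broad. It covers "all the means reasonably likely to be used", whether by the data controller or by any other person, which includes data processors and third parties.

Article 4(5) defines pseudonymization: the data can no longer be attributed to a specific person without additional information, such as a key. Take the key away and the data is still there. That additional information must be stored separately, but storing it separately does not turn the data into anonymized data.

Pseudonymized data is still personal data. The law applies to it in full, with no scope exemption and no exceptions.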

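The cost of mislabeling

Labeling a pseudonymized dataset as anonymized opens five compliance gaps at once:

An incorrect record of processing activities (ROPA) under Article 30
No procedures for data subject rights: access, erasure, and portability cannot be honored
No retention schedule, so nothing triggers deletion
No safeguards for cross-border transfers
No way to respond to right-to-be-forgotten requests

Each gap is a separate violation, and all five can exist in the same processing flow at the same time.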

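2025 enforcement signals

In 2025 the EDPB ran a coordinated enforcement action. Its report singled out a recurring violation: ineffective anonymization used as a substitute for the deletion obligation. Data protection authorities now check not only whether an anonymization step exists, but also whether that step actually works.

A tokenized dataset with a lookup table is pseudonymized, not anonymized. It has a key, and the key reverses the processing. Labeling it "anonymized" is exactly the error the 2025 report targets.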
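To make the reversal concrete, here is a minimal sketch of tokenization with a lookup table. The record layout, the `tok_` prefix, and both function names are hypothetical; the point is only that whoever holds the table can undo the processing.

```python
import secrets

def tokenize(records, field):
    """Replace each value of `field` with a random token.

    Returns the tokenized records and the lookup table (token -> original).
    The lookup table is the "key" in the sense used above.
    """
    table = {}
    tokenized = []
    for rec in records:
        token = "tok_" + secrets.token_hex(8)
        table[token] = rec[field]
        tokenized.append({**rec, field: token})
    return tokenized, table

def reidentify(tokenized, table, field):
    """Anyone holding the table can restore the original values."""
    return [{**rec, field: table[rec[field]]} for rec in tokenized]

customers = [
    {"name": "Ada Example", "plan": "pro"},
    {"name": "Bo Sample", "plan": "free"},
]
tokenized, table = tokenize(customers, "name")
restored = reidentify(tokenized, table, "name")
print(restored == customers)  # the round trip recovers every name
```

Storing `table` on a separate system is good practice, but as long as it exists somewhere, the tokenized records remain pseudonymized.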

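Choosing the right method

**True anonymization: out of scope.** Use the Redact method, which removes the personal data completely and leaves no path back. Alternatively, hash high-entropy data so that no preimage attack path exists. Document the basis for the decision. The output carries no GDPR obligations.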
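The restriction to high-entropy data matters because an unsalted hash of a low-entropy value can be reversed by enumerating every possible input. The sketch below illustrates this with a hypothetical 4-digit customer number; the helper names are assumptions made for the example.

```python
import hashlib

def sha256_hex(value: str) -> str:
    """Unsalted SHA-256 digest of a string, as lowercase hex."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

def recover_by_enumeration(target_hash: str, candidates):
    """Hash every candidate until one matches; return None if none does."""
    for candidate in candidates:
        if sha256_hex(candidate) == target_hash:
            return candidate
    return None

# Hypothetical low-entropy identifier: only 10,000 possible values.
published = sha256_hex("4821")
all_four_digit = (f"{n:04d}" for n in range(10_000))
print(recover_by_enumeration(published, all_four_digit))  # prints 4821
```

Ten thousand hashes take a fraction of a second, so this hash is a pseudonym, not an anonymization. The same attack applies to any identifier whose value space is small enough to enumerate.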

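**Pseudonymization: still in scope.** Use the Replace, Mask, or Encrypt method. The law applies to the result in full. Pseudonymization reduces the harm a breach can cause, but it does not reduce your legal obligations.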

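**Controlled reversibility: for research or audit use.** Use the Encrypt method with a client-held key. Key custody must meet the key separation requirements of EDPB Guidelines 01/2025 on Pseudonymisation, and the data protection impact assessment (DPIA) must state the applicable pseudonymization domain.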

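A worked example

A company sells "anonymized" customer records to research institutions and processes the data with the Redact method: the personal data is removed completely, with no token table, no hash preimage, and no path to re-identification.

The data protection officer documents this in detail in the DPIA: the method used, the identifier types covered, the technical reason the process cannot be reversed, and the residual risk level. The output is out of scope, so data subject rights and cross-border transfer rules do not apply to the research copy.

The method matches the claim, the process is compliant, and it stands up to regulatory scrutiny.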

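Why documentation matters

A company cannot establish anonymization by assertion alone; the claim needs a complete written record behind it. The DPIA must show four things: which identifiers are covered, which method was used, why no path to re-identification exists, and what the residual risk level is.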
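One way to keep those four items from being skipped is to treat them as required fields. The following is a minimal sketch with a hypothetical record structure and field names; a real DPIA is a written assessment, and a completeness check like this says nothing about whether the reasoning in it holds.

```python
from dataclasses import dataclass, fields

@dataclass
class AnonymizationRecord:
    """The four items the DPIA must show (field names are illustrative)."""
    identifiers_covered: list[str]
    method: str
    why_no_reidentification_path: str
    residual_risk: str

def missing_items(record: AnonymizationRecord) -> list[str]:
    """Names of required items that were left empty."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]

record = AnonymizationRecord(
    identifiers_covered=["name", "email", "phone"],
    method="redact",
    why_no_reidentification_path="",   # not yet written
    residual_risk="low",
)
print(missing_items(record))  # ['why_no_reidentification_path']
```

A release step could refuse to label a dataset "anonymized" while `missing_items` returns anything.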

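Without that record, a regulator will treat the dataset as in scope, and the full set of obligations applies: the ROPA entry must exist, cross-border transfer safeguards must exist, and a response path for right-to-be-forgotten requests must exist. No proof, no exemption.

For how the right to be forgotten intersects with anonymized data, see GDPR Right to Be Forgotten and the EDPB 2025 Guidance. For the transfer rules on data shared across borders, see Data Transfer Compliance and the TikTok Fine.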
