Last updated 2026-03-25

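One Tool, 45 Countries, 260+ Entity Types

Brazil's CPF carries check digits, India's PAN is a 10-character alphanumeric format, and IBANs differ from country to country across the EU. A global e-commerce platform cannot deploy a separate tool for every jurisdiction.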

March 25, 2026 · 7 min read
Tags: global PII compliance, 260 entity types, Brazilian CPF, Indian PAN, IBAN formats

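A global platform handles personal data from many countries at once. Every country has its own identifier formats, and every format has its own rules. A detection tool has to handle all of them, and most tools do not.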

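The identifier fragmentation problem

An e-commerce platform with sellers in 45 countries receives onboarding documents in widely differing formats. A Brazilian seller submits a CPF: 11 digits, the last two of which are check digits computed with a specific weighted algorithm. An Indian seller submits a PAN: 10 characters, with letters and digits in fixed positions. A German seller submits a Steuer-ID: 11 digits, the last being a check digit computed with the ISO 7064 MOD 11,10 procedure. A Dutch seller submits a BSN: 9 digits, validated with a modulo-11 test.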
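The four formats above can be sketched as small validators. This is a minimal illustration, not the product's implementation: the function names are hypothetical, the checksum rules are the publicly documented ones, and further structural rules (for example the digit-repetition rules of the Steuer-ID) are deliberately left out.

```python
import re


def cpf_is_valid(value: str) -> bool:
    """Brazilian CPF: 11 digits, the last two are weighted mod-11 check digits."""
    digits = [int(c) for c in re.sub(r"[^0-9]", "", value)]
    if len(digits) != 11 or len(set(digits)) == 1:
        return False  # wrong length, or a repeated-digit placeholder such as 111.111.111-11
    for n in (9, 10):  # verify the 10th digit, then the 11th
        total = sum(d * w for d, w in zip(digits[:n], range(n + 1, 1, -1)))
        if (total * 10) % 11 % 10 != digits[n]:
            return False
    return True


def pan_is_well_formed(value: str) -> bool:
    """Indian PAN: five letters, four digits, one letter (format check only)."""
    return re.fullmatch(r"[A-Z]{5}[0-9]{4}[A-Z]", value) is not None


def steuer_id_check_digit_ok(value: str) -> bool:
    """German Steuer-ID: 11 digits, last one computed with ISO 7064 MOD 11,10."""
    if re.fullmatch(r"[0-9]{11}", value) is None:
        return False
    product = 10
    for ch in value[:10]:
        s = (int(ch) + product) % 10
        if s == 0:
            s = 10
        product = (s * 2) % 11
    return (11 - product) % 10 == int(value[10])


def bsn_is_valid(value: str) -> bool:
    """Dutch BSN: 9 digits passing the modulo-11 test (last weight is -1)."""
    if re.fullmatch(r"[0-9]{9}", value) is None or value == "000000000":
        return False
    weights = (9, 8, 7, 6, 5, 4, 3, 2, -1)
    return sum(int(c) * w for c, w in zip(value, weights)) % 11 == 0
```

None of the four functions can stand in for another, which is the fragmentation problem in miniature: each jurisdiction adds its own length, alphabet, and checksum.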

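Each format differs in length and structure. A regular expression built for one format will not match the others. A broad "10 to 12 digits" pattern, on the other hand, captures far too much: it flags prices, dates, and reference numbers, and the number of false positives grows sharply with business volume.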
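The over-capture is easy to reproduce. The sketch below runs a broad digit pattern over an invented sample sentence and then keeps only the candidates that pass the CPF check digits. The sample text, the `BROAD` pattern, and the helper name are assumptions made for illustration.

```python
import re

# The "10 to 12 digits" pattern described above.
BROAD = re.compile(r"\b[0-9]{10,12}\b")


def cpf_is_valid(digits: str) -> bool:
    """Check length and the two weighted mod-11 check digits of a CPF."""
    if len(digits) != 11 or len(set(digits)) == 1:
        return False
    nums = [int(c) for c in digits]
    for n in (9, 10):
        total = sum(d * w for d, w in zip(nums[:n], range(n + 1, 1, -1)))
        if (total * 10) % 11 % 10 != nums[n]:
            return False
    return True


# Invented sample: an order number, a reference number, and one real-looking CPF.
text = "Order 20260325114 shipped, ref 4400129981, seller CPF 52998224725."

candidates = BROAD.findall(text)                          # everything the broad pattern flags
confirmed = [c for c in candidates if cpf_is_valid(c)]    # what survives the checksum

print(candidates)
print(confirmed)
```

The broad pattern flags all three numbers; only the last one survives checksum validation. Checksums do not remove every false positive, but they cut the candidate set far more than length alone can.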

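The 40-identifier gap

Most enterprise PII tools ship with roughly 40 built-in identifier types. Common ones include:

  • US Social Security numbers
  • US passport formats
  • US driver's licenses
  • Generic credit card formats with Luhn validation
  • Email addresses
  • North American phone formats
  • IP addresses

These formats cover North American compliance needs well, but they cannot support global operations.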

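Coverage gaps by region

South America: Brazil's CPF and CNPJ use check-digit algorithms defined by the Brazilian federal tax authority, Argentina's CUIT uses a different weighted-sum formula, and Colombia's NIT has its own validation method. None of these share anything with US formats.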
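To make "a different weighted-sum formula" concrete, here is a hedged sketch of the numeric CNPJ and CUIT checks side by side. The weights follow the publicly documented schemes; the function names are hypothetical, and the CUIT branch treats a computed check value of 10 as invalid, which is one common convention rather than a confirmed rule of any particular tool.

```python
import re


def _digits(value: str) -> list[int]:
    """Strip punctuation and return the remaining digits as integers."""
    return [int(c) for c in re.sub(r"[^0-9]", "", value)]


def cnpj_is_valid(value: str) -> bool:
    """Brazilian CNPJ (numeric form): 14 digits, two mod-11 check digits."""
    d = _digits(value)
    if len(d) != 14 or len(set(d)) == 1:
        return False
    weights = [6, 5, 4, 3, 2, 9, 8, 7, 6, 5, 4, 3, 2]
    for n in (12, 13):  # first check digit uses the last 12 weights, second uses all 13
        total = sum(x * w for x, w in zip(d[:n], weights[13 - n:]))
        remainder = total % 11
        expected = 0 if remainder < 2 else 11 - remainder
        if d[n] != expected:
            return False
    return True


def cuit_is_valid(value: str) -> bool:
    """Argentine CUIT: 11 digits, one mod-11 check digit with its own weights."""
    d = _digits(value)
    if len(d) != 11:
        return False
    weights = (5, 4, 3, 2, 7, 6, 5, 4, 3, 2)
    check = 11 - sum(x * w for x, w in zip(d[:10], weights)) % 11
    if check == 11:
        check = 0
    # A computed value of 10 can never equal a single digit, so it is rejected here.
    return check == d[10]
```

Both are "weighted sum modulo 11" schemes, yet the lengths, weights, and remainder handling differ, so neither validator can be reused for the other.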

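Asia: India's PAN, Aadhaar, GSTIN, and voter ID each have a distinct format; Japan's My Number is 12 digits; South Korea's resident registration number and China's resident identity card number each need a dedicated recognizer.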
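As one example of what a dedicated recognizer involves, the sketch below checks the final character of an 18-character Chinese resident identity number, which is computed with an ISO 7064 MOD 11-2 scheme and may be the letter X. It is an illustration under assumptions: the function name is hypothetical, and the region code and birth date embedded in the number are not validated.

```python
import re

# Position weights and the check-character table for the 18-character format.
WEIGHTS = (7, 9, 10, 5, 8, 4, 2, 1, 6, 3, 7, 9, 10, 5, 8, 4, 2)
CHECK_CHARS = "10X98765432"


def cn_resident_id_checksum_ok(value: str) -> bool:
    """Return True if the 18th character matches the weighted mod-11 checksum."""
    value = value.upper()
    if re.fullmatch(r"[0-9]{17}[0-9X]", value) is None:
        return False
    total = sum(int(c) * w for c, w in zip(value[:17], WEIGHTS))
    return CHECK_CHARS[total % 11] == value[17]
```

A digits-only pattern would miss every number that ends in X, which is one reason a generic numeric recognizer does not transfer to this format.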

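EU member states: Full EU coverage requires the IBAN formats of all 27 member states (length and structure differ per country) as well as each country's national identifier formats, including Germany's Steuer-ID, France's NIR, the Netherlands' BSN, Poland's PESEL, and Sweden's Personnummer, along with Slovenia's EMŠO, Croatia's OIB, Bulgaria's EGN, and Romania's CNP.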
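IBAN validation shows why per-country data is needed even when the checksum is shared. A minimal sketch, assuming a hypothetical `IBAN_LENGTHS` table that lists only a handful of countries (a full implementation would carry every country in the IBAN registry):

```python
import re

# Illustrative subset of country-specific IBAN lengths.
IBAN_LENGTHS = {
    "BE": 16, "NL": 18, "DE": 22, "ES": 24,
    "SE": 24, "FR": 27, "IT": 27, "PL": 28,
}


def iban_is_valid(value: str) -> bool:
    """Check country-specific length, then the mod-97 checksum."""
    iban = re.sub(r"\s+", "", value).upper()
    if re.fullmatch(r"[A-Z]{2}[0-9]{2}[A-Z0-9]+", iban) is None:
        return False
    # Countries missing from the table are rejected in this sketch.
    if IBAN_LENGTHS.get(iban[:2]) != len(iban):
        return False
    # Move the country code and check digits to the end, map A..Z to 10..35.
    rearranged = iban[4:] + iban[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1
```

For example, `iban_is_valid("DE89 3704 0044 0532 0130 00")` passes, while the same string with one digit dropped fails the length check before the checksum is even computed. The mod-97 step is the same for every country; the length table is what varies.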

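What 260+ entity types cover

A library of 260+ entity types covers:

  • National identifiers for all 27 EU member states
  • Validation of all EU IBAN formats
  • South American identifiers (Brazil's CPF and CNPJ, Argentina's CUIT, Colombia's NIT)
  • Asian identifiers (India's PAN, Aadhaar, and GSTIN, Japan's My Number, South Korea's RRN)
  • UK identifiers (National Insurance numbers, including NINO variants, and NHS numbers)
  • Healthcare identifiers (US NPI, DEA numbers, hospital MRN formats)
  • Financial identifiers (SWIFT/BIC codes, account number patterns)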
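The healthcare and UK entries follow the same pattern of a format plus a checksum. As a hedged example, the NHS number is 10 digits with a modulo-11 check digit; the function name below is hypothetical and the sketch says nothing about how the entity library itself implements the check.

```python
import re


def nhs_number_is_valid(value: str) -> bool:
    """UK NHS number: 10 digits, the last is a weighted mod-11 check digit."""
    digits = re.sub(r"[\s-]", "", value)
    if re.fullmatch(r"[0-9]{10}", digits) is None:
        return False
    # Weights 10 down to 2 apply to the first nine digits.
    total = sum(int(c) * w for c, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - total % 11
    if check == 11:
        check = 0
    # A computed value of 10 means the number is invalid; it never equals a digit.
    return check == int(digits[9])
```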

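Why detection coverage is a compliance issue

Every compliance framework requires identifying and protecting the identifiers within its jurisdiction. GDPR covers data from EU sellers, LGPD covers data from Brazilian sellers, and India's DPDP Act covers data from Indian sellers.

"Appropriate protection" presupposes that the tool has found the identifier in the first place. Missing an Aadhaar number is not a configuration error; it is a coverage gap. For a global platform, that gap marks the line between partial compliance and real protection.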
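The gap can be expressed as a set difference. In the sketch below, the entity names, the `REQUIRED` mapping, and the example recognizer set are all hypothetical and chosen only for illustration; they are not legal lists of what each framework demands.

```python
# Illustrative identifiers a platform might need to detect per framework.
REQUIRED = {
    "GDPR": {"IBAN", "DE_STEUER_ID", "NL_BSN"},
    "LGPD": {"BR_CPF", "BR_CNPJ"},
    "DPDP": {"IN_PAN", "IN_AADHAAR"},
}

# A hypothetical North America-centric recognizer set.
us_centric_tool = {"US_SSN", "US_PASSPORT", "CREDIT_CARD", "EMAIL", "IP_ADDRESS", "IBAN"}


def coverage_gaps(recognizers: set[str]) -> dict[str, list[str]]:
    """Return, per framework, the identifiers the recognizer set cannot detect."""
    return {
        framework: sorted(needed - recognizers)
        for framework, needed in REQUIRED.items()
        if needed - recognizers
    }


gaps = coverage_gaps(us_centric_tool)
for framework, missing in gaps.items():
    print(framework, "missing:", ", ".join(missing))
```

A non-empty entry for a framework means data subject to that framework can pass through undetected, regardless of how the tool is configured.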

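A single deployment with 260+ entity types handles all of these jurisdictions: no separate regional tools, no split processing pipelines, and no manual backfilling of the formats that a 40-recognizer tool misses.

For how coverage maps to GDPR obligations, see the GDPR compliance resources. For audit trails and update policy, see the security and compliance details.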
