Last updated 2026-06-05


LGPD Brazil: CPF, CNPJ, and Data Protection

LGPD covers 215 million Brazilians, and the ANPD began significant enforcement actions in 2024. English-language tools detect CPF numbers with only 45% accuracy.

June 5, 2026 · 8 min read
Brazil LGPD · CPF detection · Brazilian Portuguese PII · ANPD compliance · South America data protection


Brazil's Lei Geral de Proteção de Dados (LGPD) covers 215 million people. It is the world's third-largest data protection law by population, covering roughly as many people as Germany, France, and the United Kingdom combined. The Autoridade Nacional de Proteção de Dados (ANPD) issued its first significant sanctions in 2024. The grace period that followed LGPD taking effect in 2020 is over.

There is also a technical problem. Documents that fall under LGPD are written in Brazilian Portuguese. Brazil's national IDs differ from those used in Portugal, and from the IDs of every other country.

Why Brazilian PII is different

Brazil's federal and state ID systems developed separately from European digital identity systems. The result is a unique set of identifiers. Most NLP tools are trained on English or European data, so they fail to detect these local IDs.

CPF (Cadastro de Pessoas Físicas): 11-digit taxpayer number. Format: XXX.XXX.XXX-XX. It has two check digits. The formula uses two separate arithmetic steps, one per check digit, and both must match for the CPF to be valid.
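The two-step check can be sketched in Python as follows. This is a minimal illustration of the commonly documented modulo-11 scheme, not a reference implementation; the function names are our own. Rejecting numbers made of one repeated digit is a widespread convention, because such numbers pass the arithmetic but are not issued.

```python
import re

def cpf_check_digits(base: str) -> str:
    """Compute the two check digits for a 9-digit CPF base."""
    digits = [int(c) for c in base]
    for _ in range(2):
        # Step 1 weights the 9 base digits 10..2.
        # Step 2 weights those 9 digits plus the first check digit 11..2.
        weights = range(len(digits) + 1, 1, -1)
        total = sum(d * w for d, w in zip(digits, weights))
        remainder = (total * 10) % 11
        digits.append(0 if remainder == 10 else remainder)
    return "".join(str(d) for d in digits[-2:])

def is_valid_cpf(value: str) -> bool:
    """Accept formatted or bare input; both check digits must match."""
    digits = re.sub(r"\D", "", value)
    if len(digits) != 11 or len(set(digits)) == 1:
        return False
    return cpf_check_digits(digits[:9]) == digits[9:]

print(is_valid_cpf("529.982.247-25"))  # True
print(is_valid_cpf("529.982.247-26"))  # False
```

The validator strips punctuation first, so the same function serves both the formatted and the bare 11-digit form.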

The detection gap is large. NLP tools trained on English detect CPF numbers with only 45% accuracy (ANPD, 2024). Two reasons explain this. First, tools that match 11-digit numbers without the two-step check-digit logic confuse valid CPF numbers with random digit sequences. Second, CPF numbers sometimes appear without the XXX.XXX.XXX-XX formatting. This happens in OCR output and in plain-text forms.
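One way to address both failure modes is to match candidates loosely and then keep only those whose check digits verify. The sketch below assumes that separators, when present, are a dot, hyphen, or single space; `CPF_CANDIDATE` and `find_cpfs` are illustrative names, and real OCR output may need a more forgiving pattern.

```python
import re

# Matches XXX.XXX.XXX-XX and bare 11-digit runs; every separator is optional.
# The lookarounds stop the pattern from firing inside a longer digit run.
CPF_CANDIDATE = re.compile(r"(?<!\d)\d{3}[. ]?\d{3}[. ]?\d{3}[- ]?\d{2}(?!\d)")

def _check_digit(digits):
    """Modulo-11 check digit over the digits seen so far."""
    weights = range(len(digits) + 1, 1, -1)
    remainder = sum(d * w for d, w in zip(digits, weights)) * 10 % 11
    return 0 if remainder == 10 else remainder

def find_cpfs(text):
    """Return candidate strings whose two check digits both verify."""
    hits = []
    for match in CPF_CANDIDATE.finditer(text):
        digits = [int(c) for c in re.sub(r"\D", "", match.group())]
        if len(set(digits)) == 1:
            continue  # repeated-digit sequences are rejected by convention
        first = _check_digit(digits[:9])
        second = _check_digit(digits[:9] + [first])
        if digits[9:] == [first, second]:
            hits.append(match.group())
    return hits

sample = "CPF: 529.982.247-25; OCR: 52998224725; pedido 12345678901"
print(find_cpfs(sample))  # ['529.982.247-25', '52998224725']
```

The order number `12345678901` has the right length but fails the second check digit, so it is dropped.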

CNPJ (Cadastro Nacional da Pessoa Jurídica): 14-digit company ID number. Format: XX.XXX.XXX/XXXX-XX. It also has two check digits. The formula is similar to the CPF formula but not identical.
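The difference from CPF lies mainly in the weights. A minimal sketch of the commonly documented scheme, assuming the all-numeric 14-digit form, follows; the helper names are our own.

```python
import re

# Weights for the first check digit (over 12 digits) and the second (over 13).
FIRST_WEIGHTS = [5, 4, 3, 2, 9, 8, 7, 6, 5, 4, 3, 2]
SECOND_WEIGHTS = [6] + FIRST_WEIGHTS

def _cnpj_digit(digits, weights):
    """Modulo-11 check digit: remainders 0 and 1 map to 0."""
    remainder = sum(d * w for d, w in zip(digits, weights)) % 11
    return 0 if remainder < 2 else 11 - remainder

def is_valid_cnpj(value: str) -> bool:
    """Accept formatted or bare input; both check digits must match."""
    digits = [int(c) for c in re.sub(r"\D", "", value)]
    if len(digits) != 14 or len(set(digits)) == 1:
        return False
    first = _cnpj_digit(digits[:12], FIRST_WEIGHTS)
    second = _cnpj_digit(digits[:12] + [first], SECOND_WEIGHTS)
    return digits[12:] == [first, second]

print(is_valid_cnpj("11.222.333/0001-81"))  # True
print(is_valid_cnpj("11.222.333/0001-82"))  # False
```

Where CPF uses a single descending run of weights, CNPJ cycles through 2 to 9, which is why a CPF validator cannot simply be reused with a longer input.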

RG (Registro Geral): state-issued civil ID card. The format varies by state. São Paulo uses 2 letters and 5-9 digits. Rio de Janeiro uses 7-8 digits with a hyphen. Minas Gerais uses 7-9 digits. Other states have their own formats. A tool that recognizes only one state's RG will miss most RG numbers.

CNH (Carteira Nacional de Habilitação): 11-digit driver's license number. It has one check digit. The format includes a district code.

Título de Eleitor: 12-digit voter ID number. It has three parts: an 8-digit ID code, a 2-digit state code, and 2 check digits.
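The three-part layout described above can be sketched as a simple split. This illustration only separates the fields; it deliberately does not verify the check digits, and the `TituloParts` and `split_titulo` names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

class TituloParts(NamedTuple):
    id_code: str       # first 8 digits
    state_code: str    # next 2 digits
    check_digits: str  # last 2 digits

def split_titulo(value: str) -> Optional[TituloParts]:
    """Split a 12-digit voter ID into its three parts, or return None."""
    digits = re.sub(r"\D", "", value)
    if len(digits) != 12:
        return None
    return TituloParts(digits[:8], digits[8:10], digits[10:])

parts = split_titulo("1234 5678 01 91")
print(parts.state_code if parts else None)
```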

SUS number (Cartão SUS): 15-digit national health ID. Every person in the country receives one. It appears in all hospital and clinic records.

PIS/PASEP: 11-digit social program number. It appears in every employment record.

The LGPD anonymization standard

Article 12 of LGPD defines anonymous data. The standard: the data "cannot be identified, taking into account the reasonable technical means available at the time of processing". This standard is relative to the state of technology. Data that is anonymous today may not stay anonymous as re-identification methods improve.

The ANPD gives additional guidance. Deleting direct identifiers such as the CPF and the name is not enough. Groups of quasi-identifiers can still allow re-identification. Age range, city, gender, and occupation together can identify a person. They must be treated by grouping values or by adding noise.
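One common way to check whether grouping has gone far enough is a k-anonymity test: every combination of quasi-identifiers should be shared by at least k records. The source does not name this technique or any threshold, so the sketch below is an assumption throughout, including the field names, the 10-year age bands, and the default k.

```python
from collections import Counter

# Hypothetical field names for the quasi-identifiers mentioned in the text.
QUASI_IDENTIFIERS = ("age_range", "city", "gender", "occupation")

def age_band(age: int, width: int = 10) -> str:
    """Generalize an exact age into a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def risky_groups(records, k: int = 5):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)
    return {combo: n for combo, n in counts.items() if n < k}

records = [
    {"age_range": age_band(34), "city": "São Paulo", "gender": "F", "occupation": "nurse"},
    {"age_range": age_band(37), "city": "São Paulo", "gender": "F", "occupation": "nurse"},
    {"age_range": age_band(45), "city": "Manaus", "gender": "M", "occupation": "pilot"},
]
print(risky_groups(records, k=2))
```

Combinations that come back from `risky_groups` would need wider bands, coarser locations, or suppression before the data set is released.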

For AI training data, the ANPD requires one of three conditions. First: the data meets the Article 12 standard. Second: every data subject gave explicit consent for the specific training use. Third: a valid, documented purpose exists.

Portuguese language requirements

Brazilian Portuguese differs from European Portuguese. Vocabulary, spelling, and document forms are not the same. NLP models trained on text from Portugal reach only about 71% accuracy when compared with models trained on local text. This figure comes from an ANPD technical assessment.

Key differences for PII detection:

  • Names: The use of double surnames and the order of names differ from Portugal.
  • Addresses: CEP codes use the format XXXXX-XXX. This format is unique to the country and needs its own detection logic.
  • Document terms: Brazil uses "Carteira de Identidade" where Portugal uses "Bilhete de Identidade". Agency names also differ.
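As an example of country-specific detection logic, the CEP format from the list above can be matched with a short pattern. This is a minimal sketch; the names are illustrative, and a production detector would likely also look for context words such as "CEP" to reduce false positives.

```python
import re

# Five digits, a hyphen, three digits, not embedded in a longer digit run.
CEP_PATTERN = re.compile(r"(?<!\d)\d{5}-\d{3}(?!\d)")

def find_ceps(text: str):
    """Return all substrings shaped like a CEP postal code."""
    return CEP_PATTERN.findall(text)

print(find_ceps("Av. Paulista, 1000, São Paulo - SP, CEP 01310-100"))
# ['01310-100']
```

The trailing lookahead keeps the pattern from matching the first eight digits of a longer code such as a US ZIP+4.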

What ANPD compliance requires

Four technical requirements cover ANPD compliance. CPF and CNPJ detection must include two-step check-digit validation. RG detection must cover all states. Detection of SUS numbers and the Título de Eleitor is also required. NLP models must be trained on local Portuguese text.

See our guides on global PII identifier detection and on LGPD enforcement actions in 2024.


