anonym.legal

Why Your PII Tool Detects SSNs but Misses Brazilian CPF, Indian Aadhaar, and UAE Emirates ID

GDPR applies to German Steuer-IDs, French NIRs, Swedish Personnummers, and 260+ other identifier types most tools have never heard of. Your SSN detector is not GDPR compliant. Here's what complete EU and global coverage actually requires.

March 5, 2026 · 8 min read

The US-Centric PII Tool Problem

Most PII detection tools were built in the United States for US data formats. The Social Security Number, nine digits in AAA-GG-SSSS format (a three-digit area number, a two-digit group number, and a four-digit serial number), was the primary design target. Tools built around SSN detection reliably detect SSNs. They may also detect phone numbers, email addresses, and US driver's license formats. They systematically miss the national identifier formats used in other countries.
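To make the gap concrete, here is a minimal sketch of an SSN-only detector run over text that also contains a Brazilian CPF and a German Steuer-ID. The pattern and the name `SSN_PATTERN` are illustrative assumptions, not the implementation of any particular tool, and the sample identifiers are test values.

```python
import re

# Illustrative SSN-only pattern (an assumption, not any vendor's actual rule):
# three digits, two digits, four digits, separated by hyphens, excluding
# area numbers 000, 666 and 9xx, group 00, and serial 0000.
SSN_PATTERN = re.compile(r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b")

sample = "SSN 123-45-6789, CPF 529.982.247-25, Steuer-ID 86095742719"

# Only the SSN is reported; the CPF and the Steuer-ID pass through undetected.
found = SSN_PATTERN.findall(sample)
print(found)
```

A tool limited to patterns like this one reports a clean result for the CPF and Steuer-ID even though both are personal data.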

GDPR does not recognize US-centricity as a compliance exemption. A German Steuer-ID (Steuerliche Identifikationsnummer) is an 11-digit tax identification number issued by the Bundeszentralamt für Steuern; its final digit is a check digit computed from the first ten. It identifies a German resident as uniquely as an SSN identifies an American. GDPR Article 4 defines personal data as "any information relating to an identified or identifiable natural person". A Steuer-ID is personal data under GDPR regardless of whether your PII tool knows the format.
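As an illustration of what format-aware detection involves, the sketch below verifies a Steuer-ID check digit using the iterative modulo 11,10 scheme commonly documented for this identifier. The function name `steuer_id_checksum_ok` is hypothetical, and the sketch checks only the check digit; a production validator would also enforce the structural rules on the first ten digits.

```python
def steuer_id_checksum_ok(steuer_id: str) -> bool:
    """Check the 11th digit of a Steuer-ID (iterative modulo 11,10 scheme).

    Sketch only: structural rules on digit repetition are not enforced here.
    """
    if len(steuer_id) != 11 or not (steuer_id.isascii() and steuer_id.isdigit()):
        return False
    product = 10
    for ch in steuer_id[:10]:
        total = (int(ch) + product) % 10
        if total == 0:
            total = 10
        product = (total * 2) % 11
    # product is never 0 here, so 11 - product is in 1..10; 10 maps to 0.
    check = (11 - product) % 10
    return check == int(steuer_id[10])


# Synthetic test values, not real taxpayers' identifiers.
print(steuer_id_checksum_ok("86095742719"))
print(steuer_id_checksum_ok("86095742718"))
```

Validating the check digit after a pattern match is what separates a real Steuer-ID from an arbitrary 11-digit string such as an order number.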

GDPR fines have been issued for EU country-specific PII exposure in data systems that processed EU residents' data using tools configured only for US formats. The compliance gap is not theoretical — it has produced enforcement actions.

The European Identifier Landscape

The scale of the identifier coverage gap, in Europe and beyond:

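Germany: Steuer-ID (11-digit, checksum), Sozialversicherungsnummer (12 characters: 11 digits plus one letter, ending in a check digit), Reisepass (passport number: 9 alphanumeric characters in current documents, 10 with the check digit from the machine-readable zone; older numbers begin with an issuing authority code)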

France: NIR/Numéro de Sécurité Sociale (15 digits encoding gender [1], birth year [2], birth month [2], department [2], commune [3], registry number [3], check key [2]), Carte Vitale (health insurance card carrying the 15-digit NIR), SIRET (14-digit business establishment identifier), SIREN (9-digit business identifier)

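Sweden: Personnummer (10-digit, format YYMMDD-XXXX; the first two digits after the separator indicated the birth county in numbers issued before 1990, the third encodes gender, and the fourth is a Luhn check digit), Samordningsnummer (coordination number for non-residents, same format with 60 added to the day of birth)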

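Norway: Fødselsnummer (11-digit, format DDMMYYNNNKK; the last digit of the individual number NNN encodes gender, and KK are two modulo-11 check digits), D-nummer (temporary identifier in the same format, with 40 added to the day of birth)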

Brazil: CPF (Cadastro de Pessoas Físicas, 11-digit with two check digits), CNPJ (14-digit business identifier)

India: Aadhaar (12-digit biometric identity, with Verhoeff algorithm check digit), PAN (10-character alphanumeric for income tax)

UAE: Emirates ID (15-digit: 784-birth year-sequence-check)
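Several of the check rules in the list above can be sketched in a few lines each. The block below is a minimal illustration, assuming the commonly documented algorithms: the modulo-97 key for the French NIR, Luhn for the Swedish Personnummer, two modulo-11 check digits for the Norwegian Fødselsnummer, and two modulo-11 check digits for the Brazilian CPF. All function names are hypothetical, the sample numbers are synthetic, and date plausibility is not checked.

```python
import re


def _digits(value: str) -> list[int]:
    """Return the ASCII digits of value, ignoring separators."""
    return [int(ch) for ch in re.findall(r"[0-9]", value)]


def nir_is_valid(nir: str) -> bool:
    """French NIR: key (last 2 digits) = 97 - (first 13 digits mod 97)."""
    compact = re.sub(r"\s", "", nir).upper()
    # Corsican departments 2A/2B are mapped to 19/18 before the computation.
    if compact[5:7] in ("2A", "2B"):
        compact = compact[:5] + ("19" if compact[5:7] == "2A" else "18") + compact[7:]
    if not re.fullmatch(r"[0-9]{15}", compact):
        return False
    return 97 - int(compact[:13]) % 97 == int(compact[13:])


def personnummer_is_valid(pnr: str) -> bool:
    """Swedish Personnummer (10-digit form): Luhn over all ten digits."""
    digits = _digits(pnr)
    if len(digits) != 10:
        return False
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 0:  # 1st, 3rd, 5th, ... digits are doubled
            d = d * 2 - 9 if d > 4 else d * 2
        total += d
    return total % 10 == 0


def fodselsnummer_is_valid(fnr: str) -> bool:
    """Norwegian Fødselsnummer: both weighted sums must be divisible by 11."""
    d = _digits(fnr)
    if len(d) != 11:
        return False
    k1_weights = (3, 7, 6, 1, 8, 9, 4, 5, 2, 1)      # covers digits 1-10
    k2_weights = (5, 4, 3, 2, 7, 6, 5, 4, 3, 2, 1)   # covers digits 1-11
    return (sum(a * b for a, b in zip(d, k1_weights)) % 11 == 0
            and sum(a * b for a, b in zip(d, k2_weights)) % 11 == 0)


def cpf_is_valid(cpf: str) -> bool:
    """Brazilian CPF: two check digits, each from a descending-weight sum."""
    digits = _digits(cpf)
    if len(digits) != 11 or len(set(digits)) == 1:  # reject 000..., 111..., etc.
        return False
    for n in (9, 10):
        total = sum(d * w for d, w in zip(digits[:n], range(n + 1, 1, -1)))
        if (total * 10) % 11 % 10 != digits[n]:
            return False
    return True


print(nir_is_valid("1 84 12 76 451 089 46"))
print(personnummer_is_valid("811228-9874"))
print(fodselsnummer_is_valid("01017012343"))
print(cpf_is_valid("529.982.247-25"))
```

Each validator is different, which is the point: a single generic "nine to fifteen digits" pattern cannot distinguish these identifiers from invoice numbers or phone numbers without the country-specific check.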

A global HR manager processing payroll data for employees across 12 countries needs a tool that detects all 12 countries' national ID formats in a single pass — without configuring 12 separate country-specific tools or maintaining 12 separate regex libraries.
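One way to picture single-pass, multi-country detection is a registry of recognizers that all run over the same text. The sketch below is an assumption about how such a registry could be organized, not a description of any specific product; `Recognizer`, `REGISTRY`, `detect`, and the entity labels are hypothetical names, and only the Swedish entry carries a checksum validator to keep the example short.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Recognizer:
    entity: str                                   # label reported for a hit
    pattern: re.Pattern                           # candidate format
    validate: Optional[Callable[[str], bool]] = None  # optional checksum


def luhn_ok(value: str) -> bool:
    """Luhn check over the ASCII digits of value."""
    digits = [int(c) for c in re.findall(r"[0-9]", value)]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # every second digit from the right is doubled
            d = d * 2 - 9 if d > 4 else d * 2
        total += d
    return total % 10 == 0


# Hypothetical registry: adding a country means adding an entry, not a tool.
REGISTRY = [
    Recognizer("US_SSN", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    Recognizer("SE_PERSONNUMMER", re.compile(r"\b\d{6}[-+]\d{4}\b"), luhn_ok),
    Recognizer("BR_CPF", re.compile(r"\b\d{3}\.\d{3}\.\d{3}-\d{2}\b")),
    Recognizer("AE_EMIRATES_ID", re.compile(r"\b784-\d{4}-\d{7}-\d\b")),
]


def detect(text: str) -> list[tuple[int, str, str]]:
    """Run every recognizer over text; return (offset, entity, match) tuples."""
    hits = []
    for rec in REGISTRY:
        for m in rec.pattern.finditer(text):
            if rec.validate is None or rec.validate(m.group()):
                hits.append((m.start(), rec.entity, m.group()))
    return sorted(hits)


payroll = ("US 123-45-6789; SE 811228-9874; "
           "BR 529.982.247-25; AE 784-1990-1234567-1")
for offset, entity, value in detect(payroll):
    print(offset, entity, value)
```

The design choice worth noting is that the caller invokes one function on one document; country coverage lives in the registry data rather than in separate tools or pipelines.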

The 285+ Entity Type Architecture

The 285+ entity type library covers the full EU member state identifier set, major identifiers from Asia, Latin America, and the Middle East (Aadhaar, PAN, Thai citizen ID, CPF, CNPJ, Emirates ID), and US identifiers (SSN, EIN, driver's license by state) in a single detection engine. The library is maintained and updated as country-specific formats evolve.
