
Internal Employee IDs Are PII Too: Detecting Proprietary Identifiers Without Writing Code

Every large organization has proprietary internal identifiers that can link anonymized records back to real people. According to DLA Piper's 2025 GDPR Annual Report, 34% of GDPR fines involve inadequate technical measures. Generic PII tools cannot detect custom formats, yet anonymization under GDPR has to cover quasi-identifying data as well as direct identifiers.

March 5, 2026 · 8 min read
employee ID anonymization, proprietary identifier detection, quasi-PII, GDPR custom entities, no-code pattern builder

The Quasi-PII Problem

GDPR Article 4 defines personal data as "any information relating to an identified or identifiable natural person." The key word is "identifiable" — not just currently identified, but capable of identification through additional processing. A value that is not directly identifying but can be linked to a real person through internal systems is personal data under GDPR.

Internal employee IDs are the most common example. "EMP-EU-123456" does not directly identify anyone. But the HR database holds a table: EMP-EU-123456 → Maria Schmidt, Senior Engineer, Munich. Any document containing EMP-EU-123456 can be linked to Maria Schmidt by anyone with access to the HR database. Under GDPR, EMP-EU-123456 is personal data — it is information relating to an identifiable natural person.

The same analysis applies to customer account numbers (linking to CRM records), project codes (linking to client identity in contract databases), internal reference numbers for legal matters (linking to case participants in the DMS), and medical record numbers in external systems (linking to patient records in the hospital's EHR).

Organizations that anonymize the obvious PII (names, email addresses, national IDs) but leave internal identifiers untouched have not achieved GDPR-compliant anonymization. They have only turned re-identification into a two-step process instead of a one-step process: an attacker (or an overly curious employee) has to look the identifier up in the HR database rather than read the name directly in the document.
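The two-step lookup described above can be sketched in a few lines of Python. This is an illustration only: the `HR_DIRECTORY` dictionary, the sample document text, and the `reidentify` helper are hypothetical stand-ins for a real HR system, and the regular expression assumes the EMP-[REGION]-six-digit format used in this article.

```python
import re

# Hypothetical HR lookup table (illustrative data taken from the example above).
HR_DIRECTORY = {
    "EMP-EU-123456": "Maria Schmidt, Senior Engineer, Munich",
}

# A document where the name was redacted but the internal ID was left in place.
redacted_doc = (
    "Performance review for [NAME REDACTED] (EMP-EU-123456): "
    "exceeds expectations."
)

# Assumed ID format: EMP-, 2 to 4 uppercase letters, -, 6 digits.
ID_PATTERN = re.compile(r"\bEMP-[A-Z]{2,4}-\d{6}\b")


def reidentify(text, directory):
    """Return directory entries reachable from identifiers left in the text."""
    return [directory[m] for m in ID_PATTERN.findall(text) if m in directory]


linked = reidentify(redacted_doc, HR_DIRECTORY)
print(linked)
```

Anyone who holds both the document and the lookup table recovers the person behind the redaction, which is why the identifier itself has to be treated as personal data.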

The Coverage Gap in Practice

DLA Piper's 2025 GDPR Annual Report found that 34% of all GDPR fines involve inadequate technical measures under Article 32 — the requirement to implement appropriate technical safeguards. Inadequate anonymization, including the failure to detect and remove quasi-identifying internal identifiers, is a documented category of Article 32 violations.

The EDPB processed 900+ consistency mechanism cases in 2024, reflecting the increasing volume of enforcement coordination across EU member states. Cross-border enforcement (where the lead supervisory authority in one country coordinates with others) means that an Article 32 violation in a data set shared across EU borders can trigger coordinated enforcement.

The No-Code Pattern Solution

For a global logistics company's compliance team anonymizing employee records for an external HR audit:

Employee IDs follow the format EMP-[REGION]-[0-9]{6}, for example EMP-EU-123456, EMP-APAC-789012, and EMP-AMER-345678. The compliance team provides these 3 examples to the AI pattern helper. The AI returns a detected pattern, EMP-[A-Z]{2,4}-\d{6}, which matches all provided examples; a suggested entity name, EMPLOYEE-ID; and a recommendation to test against edge cases, including different region codes.
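For readers who want to see what such a pattern does outside the tool, here is a minimal Python sketch. It takes the pattern the helper returns, written with `\d` for a digit, and checks it against the three examples. The `near_misses` list and the `is_employee_id` helper are assumptions added for illustration; they are not part of the product's output.

```python
import re

# Pattern from the text: EMP-, 2 to 4 uppercase letters, -, exactly 6 digits.
EMPLOYEE_ID = re.compile(r"EMP-[A-Z]{2,4}-\d{6}")

# The three examples given to the pattern helper.
examples = ["EMP-EU-123456", "EMP-APAC-789012", "EMP-AMER-345678"]

# Hypothetical near misses that a well-formed pattern should reject.
near_misses = [
    "EMP-eu-123456",      # lowercase region code
    "EMP-EU-12345",       # only five digits
    "EMP-EUROPE-123456",  # region code longer than four letters
    "EMP-EU-1234567",     # seven digits
]


def is_employee_id(value):
    """True only if the whole string matches the employee ID pattern."""
    return EMPLOYEE_ID.fullmatch(value) is not None


matched = [is_employee_id(v) for v in examples]
rejected = [not is_employee_id(v) for v in near_misses]
print(all(matched), all(rejected))
```

`fullmatch` is used here so that a longer string containing a valid-looking prefix is not accepted by accident. When scanning free text instead of single values, word boundaries (`\b`) around the pattern serve a similar purpose.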

The team tests against 10 additional samples, including EMP-DACH-000001 and EMP-APAC-999999. The pattern validates correctly. The custom entity is saved to the GDPR compliance preset shared with all team members. All 47 documents in the HR audit package are processed in one batch. All employee IDs are replaced with role-based pseudonyms. The audit firm receives documents that cannot be linked to individual employees through any internal database.
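The batch replacement step could look roughly like the sketch below. It is not the product's implementation: the `ROLES` lookup, the `Pseudonymizer` class, the pseudonym format, and the sample documents are all hypothetical, chosen only to show how the same employee ID can map to the same role-based pseudonym across every document in a batch.

```python
import re
from collections import defaultdict

# Assumed ID format, with word boundaries for scanning free text.
ID_PATTERN = re.compile(r"\bEMP-[A-Z]{2,4}-\d{6}\b")

# Hypothetical role lookup; unknown IDs fall back to a generic role.
ROLES = {
    "EMP-EU-123456": "ENGINEER",
    "EMP-APAC-789012": "ANALYST",
}


class Pseudonymizer:
    """Replace employee IDs with stable role-based pseudonyms."""

    def __init__(self, roles):
        self.roles = roles
        self.assigned = {}                # employee ID -> pseudonym
        self.counters = defaultdict(int)  # role -> next sequence number

    def pseudonym_for(self, emp_id):
        # Reuse an existing pseudonym so references stay consistent.
        if emp_id not in self.assigned:
            role = self.roles.get(emp_id, "EMPLOYEE")
            self.counters[role] += 1
            self.assigned[emp_id] = f"{role}-{self.counters[role]:03d}"
        return self.assigned[emp_id]

    def process(self, text):
        return ID_PATTERN.sub(lambda m: self.pseudonym_for(m.group(0)), text)


docs = [
    "Shift log: EMP-EU-123456 approved by EMP-APAC-789012.",
    "Follow-up: EMP-EU-123456 escalated to EMP-DACH-000001.",
]

pseudonymizer = Pseudonymizer(ROLES)
cleaned = [pseudonymizer.process(d) for d in docs]
for line in cleaned:
    print(line)
```

One design point worth noting: the `assigned` mapping is itself a re-identification key. If it is retained, the output is pseudonymized rather than anonymized, so in a sketch like this the mapping would need to be discarded or kept away from the recipient of the documents.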
