The EDPB's January 2025 Clarification
The European Data Protection Board's Guidelines 01/2025 on Pseudonymisation, published in January 2025, introduced several clarifications with significant compliance implications for organizations using data anonymization tools.
The most consequential clarification concerns the "pseudonymisation domain": the context, defined by the controller, in which pseudonymization is meant to prevent the data from being attributed to specific individuals. Pseudonymized data remains personal data under GDPR for as long as it can be attributed to a person using additional information, such as the pseudonymization key. The guidelines state that this holds even when the pseudonymized data and the additional information are held by different parties: the data stays subject to all GDPR obligations even if it appears non-identifying to a recipient who does not hold the key.
This clarification affects organizations that believed "tokenization" or "pseudonymization with keys" had removed their data from GDPR's scope. Under the January 2025 guidelines, it has not. The organization holding the pseudonymization key remains a GDPR data controller for the pseudonymized data.
The Tool Marketing Gap
Many tools marketed as "anonymization" tools actually produce pseudonymized data. The distinction:
True anonymization (irreversible): The transformation cannot be reversed by any party using means reasonably likely to be used, which is the test set out in GDPR Recital 26. True anonymization removes data from GDPR scope entirely.
Pseudonymization (reversible): The transformation can be reversed using a key, a lookup table, or additional information held separately. Pseudonymization does not remove data from GDPR scope — it remains personal data for parties who hold or can derive the key.
Token-based systems (replacing PII with consistent tokens and maintaining a mapping table), encryption-based systems (replacing PII with encrypted values and maintaining a decryption key), and format-preserving encryption all produce pseudonymized data. The data remains personal data under the EDPB's January 2025 guidelines.
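To see why a token vault counts as pseudonymization, consider the minimal sketch below. The TokenVault class and its method names are hypothetical and not taken from any particular product; the point is only that whoever holds the mapping table can always undo the transformation.

```python
import secrets


class TokenVault:
    """Hypothetical token vault: swaps PII values for random tokens and keeps the mapping."""

    def __init__(self) -> None:
        self._token_by_value: dict[str, str] = {}
        self._value_by_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Reuse the existing token so the same input always maps to the same output.
        if value not in self._token_by_value:
            token = "tok_" + secrets.token_hex(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token: str) -> str:
        # Anyone with access to the mapping table can recover the original value.
        return self._value_by_token[token]


vault = TokenVault()
token = vault.tokenize("alice@example.com")
recovered = vault.detokenize(token)  # the original email address comes back
```

The token itself reveals nothing, but the existence of detokenize (or of the table behind it) is what keeps the tokenized data inside GDPR scope.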
Hashing (applying a one-way hash function to PII values) is closer to anonymization, provided the hash function is cryptographically secure and no preimage lookup is feasible. However, the EDPB's guidelines note that hashes of low-entropy data (short strings such as names or common identifiers) are vulnerable to rainbow table attacks and may not constitute true anonymization.
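The weakness of hashing low-entropy values can be illustrated with a short sketch. It uses a plain dictionary lookup, a simpler relative of a rainbow table, and assumes unsalted SHA-256 over first names; the sample names and the dictionary_attack helper are made up for illustration.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Return the unsalted SHA-256 digest of a string as hex."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def dictionary_attack(hashed_values, candidates):
    """Map each hash back to its plaintext by hashing every candidate once."""
    lookup = {sha256_hex(c): c for c in candidates}
    return {h: lookup[h] for h in hashed_values if h in lookup}


# Hypothetical "anonymized" column: unsalted SHA-256 of first names.
published = [sha256_hex(name) for name in ["maria", "john"]]

# The attacker only needs a list of plausible names, not the original data.
common_names = ["anna", "john", "maria", "peter", "wei"]
recovered = dictionary_attack(published, common_names)
```

Every published hash whose plaintext appears in the candidate list is recovered, so the cost of the attack depends on how many values are plausible, not on the strength of the hash function.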
Compliance Strategy Under the New Guidelines
DPOs need to reassess their data classification strategy in light of the EDPB's January 2025 guidelines:
For data classified as "anonymized" (outside GDPR scope): re-evaluate whether the transformation is truly irreversible. If any party can reverse it — including the data controller itself — it is pseudonymized and GDPR still applies.
For data that must remain outside GDPR scope (for analytics, archiving, or research): use irreversible anonymization methods — redaction (permanent removal), masking with non-recoverable values, or cryptographic hashing of high-entropy data. Document the method and the basis for the anonymization determination.
For data that benefits from controlled reversibility (research with re-contact requirements, audit trails, discovery obligations): explicitly classify it as pseudonymized personal data, maintain all GDPR obligations, and document the pseudonymization key custody arrangements per the EDPB's key separation requirements.
The five-method framework (Replace, Redact, Mask, Hash, Encrypt) maps directly onto this classification. Replace, Mask, and Encrypt produce pseudonymized data whenever a mapping, the original value, or a key is retained, so GDPR still applies. Redact and Hash (of high-entropy data) approach true anonymization, subject to completeness and entropy analysis.
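The mapping above could be encoded as a small lookup for use in a data inventory. This is a sketch only: the Status enum, the classify function, and the conservative choice to treat hashes of low-entropy data as pseudonymized are assumptions made for illustration, and the result is a starting point for a documented assessment, not a legal determination.

```python
from enum import Enum


class Status(Enum):
    PSEUDONYMIZED = "pseudonymized personal data (GDPR applies)"
    CANDIDATE_ANONYMIZED = "candidate for anonymization (needs documented analysis)"


def classify(method: str, high_entropy_input: bool = False) -> Status:
    """Classify a transformation method following the five-method mapping."""
    method = method.lower()
    if method in {"replace", "mask", "encrypt"}:
        return Status.PSEUDONYMIZED
    if method == "redact":
        # Still subject to a completeness check on what was removed.
        return Status.CANDIDATE_ANONYMIZED
    if method == "hash":
        # Assumption: low-entropy hashes are treated as pseudonymized to stay on the safe side.
        return Status.CANDIDATE_ANONYMIZED if high_entropy_input else Status.PSEUDONYMIZED
    raise ValueError(f"unknown method: {method!r}")
```

For example, classify("hash") yields Status.PSEUDONYMIZED, while classify("hash", high_entropy_input=True) yields Status.CANDIDATE_ANONYMIZED.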