The 18 HIPAA Identifiers Your PII Tool Is Probably Missing

HIPAA lists 18 categories of PHI identifiers, and the Safe Harbor method in 45 CFR 164.514 requires removing all of them. Most anonymization tools detect perhaps six. Medical Record Numbers vary by institution, with no standard US format. OCR guidance was updated in 2024 to address AI-assisted re-identification risks.

March 5, 2026 | 9 min read

The 18-Identifier Requirement

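HIPAA's Privacy Rule (45 CFR 164.514(b)(2)) specifies the Safe Harbor de-identification method: to de-identify protected health information, 18 specific identifier categories must be removed for the individual and for the individual's relatives, employers, and household members. Safe Harbor is one of two HIPAA de-identification approaches (the other is Expert Determination). It is the more commonly used because compliance is largely deterministic: if all 18 categories are removed, and the covered entity has no actual knowledge that the remaining information could identify an individual, the data is treated as de-identified under the rule.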

The 18 categories:

  1. Names
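  2. Geographic data (subdivisions smaller than a state, including street address, city, county, and ZIP code; the first three ZIP digits may be kept where the area they cover has more than 20,000 people)
  3. Dates (except year) directly related to the individual, such as birth, admission, discharge, and death dates, plus all ages over 89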
  4. Telephone numbers
  5. Fax numbers
  6. Email addresses
  7. Social Security numbers
  8. Medical record numbers (MRNs)
  9. Health plan beneficiary numbers
  10. Account numbers
  11. Certificate/license numbers
  12. Vehicle identifiers and serial numbers
  13. Device identifiers and serial numbers
  14. Web URLs
  15. IP addresses
  16. Biometric identifiers (fingerprints, voiceprints)
  17. Full-face photographs and comparable images
  18. Any other unique identifying number, characteristic, or code

Most PII detection tools reliably detect categories 1, 4, 6, and 7 — names, phone numbers, email addresses, and SSNs. They systematically fail on categories 8, 9, 10, 11, 13, and 18.
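The gap described above can be written down as a simple checklist. The sketch below is illustrative only: the labels abbreviate the list above, and the `DETECTED_BY_TYPICAL_TOOL` set just encodes the four categories this article says most tools handle reliably. It is not a measurement of any particular product.

```python
# Safe Harbor categories, numbered as in the list above (labels abbreviated).
SAFE_HARBOR_CATEGORIES = {
    1: "Names",
    2: "Geographic data",
    3: "Dates",
    4: "Telephone numbers",
    5: "Fax numbers",
    6: "Email addresses",
    7: "Social Security numbers",
    8: "Medical record numbers",
    9: "Health plan beneficiary numbers",
    10: "Account numbers",
    11: "Certificate/license numbers",
    12: "Vehicle identifiers",
    13: "Device identifiers",
    14: "Web URLs",
    15: "IP addresses",
    16: "Biometric identifiers",
    17: "Full-face photographs",
    18: "Other unique identifying numbers or codes",
}

# Assumption: the categories the article describes as reliably detected.
DETECTED_BY_TYPICAL_TOOL = {1, 4, 6, 7}


def coverage_gaps(detected):
    """Return {number: label} for every category that has no detector."""
    return {
        number: label
        for number, label in SAFE_HARBOR_CATEGORIES.items()
        if number not in detected
    }


gaps = coverage_gaps(DETECTED_BY_TYPICAL_TOOL)
for number, label in gaps.items():
    print(f"{number:>2}. {label}")
```

Running the same check against the entity list of whichever tool is actually deployed gives a quick inventory of which Safe Harbor categories still need a custom detector or a manual step.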

The MRN Detection Gap

Medical Record Numbers are explicitly listed as a PHI identifier (category 8). MRN formats are institution-specific: there is no standardized national format. Hospital A uses a 7-digit integer. Hospital B uses "PT-YYYY-NNNN", where YYYY is the year and NNNN is a sequence number. Hospital C uses an 8-character alphanumeric string. Hospital D uses "MRN: " followed by a 9-digit number.

A generic PII detection tool that does not know Hospital B's MRN format will not detect "PT-2024-8847" as a PHI identifier. The document containing this MRN will be treated as de-identified after standard processing — when it is not.
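A minimal sketch of that miss, using only Python's standard `re` module. The "generic" patterns are simplified stand-ins invented for illustration, not the rules of any real tool, and the hospital patterns are assumptions derived from the example formats above. The Hospital B pattern tolerates the sequence number with or without a separating hyphen.

```python
import re

# Hypothetical, simplified stand-ins for what a generic tool ships with.
GENERIC_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

# Institution-specific patterns (assumed from the example formats above).
# Hospital A (bare 7-digit integer) and Hospital C (any 8-character
# alphanumeric string) are omitted: a bare regex for either would match
# far too much ordinary text without surrounding context.
MRN_PATTERNS = {
    "HOSPITAL_B_MRN": re.compile(r"\bPT-\d{4}-?\d{4}\b"),
    "HOSPITAL_D_MRN": re.compile(r"\bMRN: \d{9}\b"),
}


def find_entities(text, patterns):
    """Return (label, matched_text) pairs for every pattern match."""
    return [
        (label, match.group())
        for label, pattern in patterns.items()
        for match in pattern.finditer(text)
    ]


note = "Patient PT-2024-8847 seen for follow-up; call 555-867-5309."

generic_hits = find_entities(note, GENERIC_PATTERNS)
custom_hits = find_entities(note, {**GENERIC_PATTERNS, **MRN_PATTERNS})

print(generic_hits)  # the phone number only; the MRN is not flagged
print(custom_hits)   # the phone number and the Hospital B MRN
```

The generic pass returns without error and without the MRN, which is exactly why the output looks clean.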

This creates a compliance failure mode that is invisible to the organization: the de-identification appears complete because the tool did not flag any violations. The missing detection is the problem.
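One way to make this silent failure visible is to run the pipeline over a test document seeded with known identifiers and check which ones survive. The sketch below is an assumption-laden illustration, not a description of any product's audit feature: `deidentify`, `leaked_canaries`, and the single-pattern pipeline are all hypothetical, and the values are made-up test strings rather than patient data.

```python
import re


def deidentify(text, patterns):
    """Replace every match of every pattern with a [LABEL] placeholder."""
    for label, pattern in patterns.items():
        text = pattern.sub(f"[{label}]", text)
    return text


def leaked_canaries(output, canaries):
    """Return the seeded identifiers that are still present in the output."""
    return [canary for canary in canaries if canary in output]


# Hypothetical pipeline that only knows about SSNs.
patterns = {"SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b")}

# Seeded test values (fabricated for the test document).
canaries = ["123-45-6789", "PT-2024-8847"]
document = "SSN 123-45-6789, record PT-2024-8847."

output = deidentify(document, patterns)
leaks = leaked_canaries(output, canaries)

print(output)
print("Leaked:", leaks)
```

Here the tool reports no problem, but the canary check lists the MRN as leaked, turning a missing detection into an explicit failure.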

The Custom Entity Solution

Healthcare organizations that need MRN detection have three options. First, implement the detection directly in Microsoft Presidio as a custom recognizer, which requires Python programming expertise and ongoing maintenance as MRN formats evolve. Second, maintain a manual review step specifically for MRNs, which creates a systematic weak link in the de-identification pipeline. Third, use a system that provides AI-assisted custom entity creation without requiring code.

The AI pattern helper approach: the clinical informatics team provides 5 sample MRN values (SVHS-0012345, SVHS-0987654, SVHS-1122334, SVHS-4455667, SVHS-8899001) and requests a detection pattern. The AI generates a regex, SVHS-\d{7}, and validates it against the provided examples. The pattern is saved to the team's HIPAA compliance preset. All subsequent de-identification sessions detect this MRN format automatically.
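The validation step can be sketched in a few lines of standard-library Python. This uses the five samples from the text and the pattern written with its `\d` digit class. The negative examples and the `validate` helper are assumptions added for illustration; the article does not describe how the tool itself validates patterns.

```python
import re

# The five sample MRNs from the text.
SAMPLES = [
    "SVHS-0012345",
    "SVHS-0987654",
    "SVHS-1122334",
    "SVHS-4455667",
    "SVHS-8899001",
]

# Assumed negative examples: strings that should NOT count as MRNs.
NEGATIVES = ["SVHS-12345", "SVHS-d{7}", "SVH-0012345", "SVHS-00123456"]

MRN_PATTERN = re.compile(r"SVHS-\d{7}")


def validate(pattern, positives, negatives):
    """Return (missed positives, wrongly accepted negatives)."""
    missed = [s for s in positives if not pattern.fullmatch(s)]
    false_hits = [s for s in negatives if pattern.fullmatch(s)]
    return missed, false_hits


missed, false_hits = validate(MRN_PATTERN, SAMPLES, NEGATIVES)
print("missed:", missed)
print("false hits:", false_hits)

# For scanning free text, word boundaries keep the pattern from matching
# inside a longer token such as "SVHS-00123456".
FREE_TEXT_PATTERN = re.compile(r"\bSVHS-\d{7}\b")
found = FREE_TEXT_PATTERN.findall("Admitted under SVHS-1122334 on transfer.")
print(found)
```

Five samples confirm the pattern matches what it should, but they say little about what it should reject, so a handful of near-miss negatives is worth adding before the pattern goes into a preset.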

The same approach applies to other institution-specific identifiers: health plan beneficiary number formats, equipment serial number formats, and any proprietary identifying codes that are specific to the organization.
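As a rough sketch of what a saved preset might look like, the example below stores several custom entities as JSON and applies them in one pass. Everything here is hypothetical: the preset structure, the `load_preset` and `redact` helpers, and the beneficiary and device serial formats are invented for illustration. Only the SVHS MRN format comes from the text.

```python
import json
import re

# Hypothetical preset. Only the MRN format is taken from the article;
# the other two formats are made up to show the idea.
preset = {
    "name": "hipaa-custom",
    "entities": [
        {"label": "MRN", "regex": r"\bSVHS-\d{7}\b"},
        {"label": "BENEFICIARY_ID", "regex": r"\bHPB-[A-Z]{2}\d{8}\b"},
        {"label": "DEVICE_SERIAL", "regex": r"\bSN-\d{4}-[A-Z0-9]{6}\b"},
    ],
}

# Serialized form, as it might be stored and shared across a team.
saved = json.dumps(preset)


def load_preset(raw):
    """Compile each entity's regex from a JSON preset string."""
    data = json.loads(raw)
    return {e["label"]: re.compile(e["regex"]) for e in data["entities"]}


def redact(text, patterns):
    """Replace every match with a [LABEL] placeholder."""
    for label, pattern in patterns.items():
        text = pattern.sub(f"[{label}]", text)
    return text


text = "MRN SVHS-0012345, plan HPB-CA12345678, pump SN-2023-A1B2C3."
redacted = redact(text, load_preset(saved))
print(redacted)
```

Keeping the patterns as data rather than code means a new identifier format becomes a one-line addition to the preset instead of a change to the pipeline.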
