De-identify PHI for research data sets under the HIPAA Privacy Rule: CCPA/HIPAA-compliant de-identification per 45 CFR §164.514
Under 45 CFR §164.512(i), research may use PHI without patient authorization when an IRB or Privacy Board waives the authorization requirement. Separately, data de-identified under 45 CFR §164.514(a)-(b) is no longer PHI, so the authorization requirement does not apply to it. anonym.legal prepares research data sets by applying Safe Harbor or Expert Determination de-identification, so the downstream analytical dataset requires neither an IRB or Privacy Board waiver nor individual authorization.
When this applies
Apply this workflow when a research team has IRB approval for a study that requires access to PHI but wants to minimize re-identification risk by giving analysts a de-identified dataset, or when the proposed data use falls outside the scope of an IRB waiver and de-identification is the Privacy Rule compliance path.
How anonym.legal handles it
- Upload the PHI dataset (CSV, XLSX, FHIR JSON, or SAS transport) to anonym.legal.
- Confirm the de-identification method: Safe Harbor (§164.514(b)(2)) for straightforward removals, or Expert Determination (§164.514(b)(1)) when the research design requires retention of quasi-identifiers.
- The engine removes or transforms all identifier categories per the selected method, retaining clinical variables — diagnosis codes, lab values, biomarker results, outcome scores — relevant to the research question.
- For longitudinal studies, the engine assigns consistent participant pseudocodes across all time points so temporal analyses remain valid.
- A de-identification certificate and field-transformation log are generated to support IRB documentation requirements.
- The de-identified research dataset is delivered; the mapping table (if pseudonymization was used) is retained under the covered entity's data governance policy.
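The Safe Harbor step above can be illustrated with a minimal Python sketch. It is not anonym.legal's engine: the field names (`name`, `mrn`, `zip`, `admit_date`, `age`, `dx_code`), the `safe_harbor_row` helper, and the contents of `RESTRICTED_ZIP3` are hypothetical. It shows only three of the Safe Harbor rules in §164.514(b)(2): dropping direct identifiers, truncating ZIP codes to three digits (or `000` for sparsely populated prefixes), reducing dates to the year, and aggregating ages over 89.

```python
from datetime import date

# Hypothetical column names for direct identifiers; a real dataset will differ.
DIRECT_IDENTIFIERS = {"name", "mrn", "ssn", "phone", "email"}

# Illustrative values only. The real set is every 3-digit ZIP prefix covering
# 20,000 or fewer people according to current Census Bureau data.
RESTRICTED_ZIP3 = {"036", "059"}


def safe_harbor_row(row: dict) -> dict:
    """Apply a subset of Safe Harbor transformations to one record."""
    out = {}
    for field, value in row.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # direct identifiers are removed outright
        if field == "zip":
            zip3 = str(value)[:3]
            out["zip3"] = "000" if zip3 in RESTRICTED_ZIP3 else zip3
        elif isinstance(value, date):
            out[field + "_year"] = value.year  # keep only the year
        elif field == "age":
            out["age"] = "90+" if value > 89 else value
        else:
            out[field] = value  # clinical variables pass through unchanged
    return out


record = {
    "name": "Jane Doe",
    "mrn": "A123",
    "zip": "03601",
    "admit_date": date(2021, 3, 14),
    "age": 93,
    "dx_code": "E11.9",
}
print(safe_harbor_row(record))
# {'zip3': '000', 'admit_date_year': 2021, 'age': '90+', 'dx_code': 'E11.9'}
```

A complete implementation would have to cover all 18 Safe Harbor identifier categories, including identifiers embedded in free-text fields, which this sketch does not attempt.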
What you provide
- PHI dataset for the research study (CSV, XLSX, FHIR JSON, or SAS transport format)
- Research protocol summary identifying required analytical variables
- IRB approval documentation (to confirm the research scope and data minimization requirements)
- Preferred de-identification method and, for Expert Determination, the statistical risk threshold
Limitations & cautions
- De-identification under §164.514 allows use and disclosure for research without individual authorization, but it does not remove obligations under the IRB approval; the research team must still operate within the scope of the IRB-approved protocol.
- Longitudinal datasets with fine-grained temporal resolution may retain re-identification risk from rare event combinations even after Safe Harbor de-identification; Expert Determination is recommended for dense longitudinal records.
- De-identified research datasets should not be combined with auxiliary datasets that could re-identify participants without a prior statistical disclosure-control review.
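One simple way to look for the rare combinations mentioned above is a small-cell count over the quasi-identifiers that remain in the dataset. The sketch below is an illustration, not a substitute for Expert Determination or a formal disclosure-control review. The `small_cells` helper, the column names, and the threshold `k=5` are assumptions chosen for the example; the Privacy Rule does not prescribe a specific cell size.

```python
from collections import Counter


def small_cells(rows, quasi_identifiers, k=5):
    """Return each quasi-identifier combination shared by fewer than k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


# Hypothetical de-identified rows: five common records and one rare one.
rows = [{"birth_year": 1980, "zip3": "941", "dx_code": "E11.9"}] * 5
rows.append({"birth_year": 1951, "zip3": "100", "dx_code": "G12.21"})

print(small_cells(rows, ["birth_year", "zip3", "dx_code"]))
# {(1951, '100', 'G12.21'): 1}
```

Any combination the check returns is a candidate for further generalization or suppression before release, and the same check can be rerun after joining an auxiliary dataset to see whether the join creates new small cells.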
FAQ
Does de-identification eliminate the need for an IRB waiver of HIPAA authorization?
Yes. Once a dataset is de-identified under 45 CFR §164.514, it is no longer PHI and the Privacy Rule's authorization requirement does not apply to its use or disclosure. The research team does not need an IRB or Privacy Board waiver to use de-identified data. However, other IRB obligations for the research study itself remain.
Can de-identified data be combined across covered entities for a multi-site study?
Yes. De-identified data is no longer PHI, so it may be combined across covered entities and disclosed to a central research coordinating center without a business associate agreement (BAA). Before combining datasets, confirm that each contributing site's de-identification method meets the same standard.
How should participant pseudocodes be managed for longitudinal follow-up?
The engine stores the participant pseudocode-to-identity mapping under the covered entity's access control policy. At each follow-up wave, new data for the same participant is processed using the same pseudocode, preserving longitudinal linkage without exposing the participant's real identity to analysts.
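A minimal sketch of this kind of stable pseudocode assignment follows. It assumes an in-memory mapping and a hypothetical `PseudocodeRegistry` class; the actual engine's storage and code format are not described in this document. Codes are generated randomly rather than hashed from the identifier, because §164.514(c) requires that a re-identification code not be derived from information about the individual.

```python
import secrets


class PseudocodeRegistry:
    """Hypothetical identity-to-pseudocode mapping held by the covered entity."""

    def __init__(self):
        self._by_identity = {}
        self._issued = set()

    def code_for(self, identity_key: str) -> str:
        """Return the existing code for a participant, or issue a new random one."""
        code = self._by_identity.get(identity_key)
        if code is None:
            code = "P-" + secrets.token_hex(8)
            while code in self._issued:  # regenerate on the unlikely collision
                code = "P-" + secrets.token_hex(8)
            self._by_identity[identity_key] = code
            self._issued.add(code)
        return code


registry = PseudocodeRegistry()
wave_1 = registry.code_for("MRN-001")  # baseline visit
wave_2 = registry.code_for("MRN-001")  # follow-up visit, same participant
other = registry.code_for("MRN-002")   # a different participant

print(wave_1 == wave_2)  # True
print(wave_1 == other)   # False
```

In practice the mapping would live in access-controlled persistent storage rather than in memory, and only the pseudocodes would travel with the research dataset.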