anonym.legal

Air-Gapped Privacy: How to Anonymize Sensitive Documents When the Cloud Isn't an Option

FedRAMP and ITAR environments have one thing in common — the cloud is not an option. Reversible pseudonymization under GDPR Art. 4(5) reduces compliance risk. Only 23% of anonymization tools offer true reversibility (IAPP 2024).

March 5, 2026 (9 min read)
air-gapped anonymization, SCIF document processing, ITAR compliance, FedRAMP offline tools, offline PII detection

The Air-Gap Requirement

Defense contractors, government intelligence agencies, and critical infrastructure operators manage networks where external internet connectivity is physically impossible, not merely prohibited by policy. A SCIF (Sensitive Compartmented Information Facility) is a room or facility designed to prevent electronic eavesdropping and signals intelligence collection; it is typically RF-shielded so that wireless signals neither enter nor leave. A network holding technical data controlled under ITAR (International Traffic in Arms Regulations) cannot transmit that data to unauthorized parties, a category that includes cloud service providers whose environments are not ITAR-compliant.

For organizations in these environments, "cloud SaaS" is not a risk to be managed — it is a technical impossibility. Any anonymization tool that requires an active network connection cannot be deployed. Any tool that phones home for licensing verification is a non-starter. Any tool whose detection models require cloud API calls for inference cannot function.
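For a pipeline written in Python, one way to spot accidental network use during testing is to make socket creation fail while processing runs. The sketch below is a hypothetical test guard, not part of any tool named in this article. It only catches sockets opened through Python's `socket` module, so native code can bypass it, and it does not replace a physical air gap.

```python
import socket
from contextlib import contextmanager

@contextmanager
def no_network():
    """Make any attempt to open a Python socket raise while the block runs."""
    original = socket.socket

    def blocked(*args, **kwargs):
        raise RuntimeError("network access attempted in offline mode")

    socket.socket = blocked
    try:
        yield
    finally:
        # Always restore the real socket class, even if processing failed.
        socket.socket = original
```

Wrapping a processing call in `with no_network():` turns a licensing check or a cloud inference call into an immediate, visible error rather than a silent connection attempt.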

The Ollama community specifically cites air-gapped deployment as the primary justification for local AI tooling: "All data stays on your device with Ollama, with no information sent to external servers — particularly important for sensitive work like doctors handling patient notes or lawyers reviewing case files." The same rationale applies at the organizational level for classified and ITAR-controlled environments.

The ITAR Use Case

A data scientist at a defense contractor processing personnel records under ITAR requirements needs to de-identify files before they are released in response to a journalist's FOIA request. The contractor's network is air-gapped. The processing must occur on the air-gapped machine and must produce outputs suitable for public release.

This use case has no cloud solution. The only path is a tool that runs entirely on the local machine, applies detection models stored locally, and produces anonymized outputs without any external communication. The Tauri 2.0-based Desktop Application runs in exactly this configuration: after download and installation, no network calls are made during document processing. The spaCy NER models, the regex patterns, and the transformer inference run locally. The processing output never leaves the machine unless explicitly exported by the user.
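To make the regex layer of local detection concrete, here is a minimal sketch using only the Python standard library. The patterns, labels, and function names are illustrative assumptions, not the Desktop Application's actual rules, and a real deployment would combine patterns like these with locally stored NER models.

```python
import re

# Illustrative patterns only (assumed for this sketch).
PATTERNS = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "US_PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def detect(text):
    """Return (start, end, label) spans sorted by start offset."""
    spans = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)

def redact(text):
    """Replace each detected span with a [LABEL] placeholder."""
    out, cursor = [], 0
    for start, end, label in detect(text):
        if start < cursor:  # skip a match that overlaps one already replaced
            continue
        out.append(text[cursor:start])
        out.append(f"[{label}]")
        cursor = end
    out.append(text[cursor:])
    return "".join(out)

print(redact("Contact jane.doe@example.mil or 555-867-5309; SSN 123-45-6789."))
# prints: Contact [EMAIL] or [US_PHONE]; SSN [US_SSN].
```

Nothing in this sketch imports a networking module or reads remote resources, which is the property that matters on an air-gapped machine.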

Reversible Pseudonymization for Classified Operations

A related requirement in classified and government contexts: reversible pseudonymization that maintains analytical utility while protecting real identities. GDPR Article 4(5) defines pseudonymization as processing personal data so that it can no longer be attributed to a specific person without additional information, provided that information is kept separately and protected by technical and organizational measures. Pseudonymized data remains personal data under the GDPR, but the regulation names pseudonymization as a safeguard (for example in Articles 25 and 32), so applying it with proper key separation reduces compliance risk.

IAPP research (2024) found that only 23% of anonymization tools offer true reversibility — the ability to decrypt pseudonymized data back to original values using a key that is kept separate from the output. The majority of tools implement permanent replacement (the original data is overwritten and cannot be recovered) or masking (partial display of the original value).

For government operations where pseudonymized datasets must be shareable across compartments, with one team receiving the pseudonymized dataset for analytical work and another holding the decryption key for re-identification when legally required, reversible encryption with key separation is the architecture that fits. Permanent replacement and masking cannot support later re-identification at all.

The zero-knowledge approach extends this further: the encryption key is generated client-side and never transmitted. Even if the anonymization tool's provider were subpoenaed, they cannot produce the decryption key because they never received it. For classified environments where chain of custody for encryption keys is itself a security requirement, this architecture provides the required assurance.
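The split between an analytical team and a key-holding team can be sketched in a few lines of Python. This is a simplified stand-in, not the product's implementation: the text above describes reversible encryption, whereas this sketch uses keyed HMAC tokens plus a lookup table (the "vault"), because the Python standard library has no authenticated cipher. The names `new_key`, `pseudonymize`, and `reidentify` are hypothetical.

```python
import hashlib
import hmac
import secrets

def new_key():
    """Generate a 256-bit secret on the local machine; nothing is transmitted."""
    return secrets.token_bytes(32)

def pseudonymize(values, key, prefix="PERSON"):
    """Return (tokens, vault). The vault maps each token to its original value."""
    tokens, vault = [], {}
    for value in values:
        digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
        # Truncated to 16 hex characters for readability; a longer token
        # lowers the chance of collisions in large datasets.
        token = f"{prefix}_{digest[:16]}"
        vault[token] = value
        tokens.append(token)
    return tokens, vault

def reidentify(tokens, vault):
    """Map tokens back to original values; requires access to the vault."""
    return [vault[token] for token in tokens]
```

Because the tokens are deterministic under a given key, the same person receives the same token everywhere in the dataset, which preserves joins and counts for the analytical team. Only the party holding the vault can run `reidentify`.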

EDPB Guidance Compliance

EDPB Guidelines 01/2025 on Pseudonymisation require key separation: the pseudonymization key must be held by a different party than the party receiving the pseudonymized dataset, or stored with technical controls that prevent the receiving party from accessing both the data and the key simultaneously.

The combination of client-side key generation (key never leaves the user's device), local processing (data never leaves the air-gapped environment), and separate export of pseudonymized outputs and decryption keys satisfies the EDPB's key separation requirement while meeting the air-gapped operational constraint.
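Separate export can be enforced in code as well as in procedure. The sketch below is a hypothetical helper with assumed file names; it writes the pseudonymized records and the re-identification material to two different directories and refuses layouts where one location contains the other. In practice the two targets would be separate removable media or separately controlled volumes.

```python
import json
import os
from pathlib import Path

def export_separately(records, vault, data_dir, key_dir):
    """Write pseudonymized records and the vault to two different
    directories and return both file paths."""
    data_dir = Path(data_dir).resolve()
    key_dir = Path(key_dir).resolve()
    # Refuse layouts where one location is the same as, or inside, the other.
    if data_dir == key_dir or data_dir in key_dir.parents or key_dir in data_dir.parents:
        raise ValueError("data and key material must live in separate locations")
    data_dir.mkdir(parents=True, exist_ok=True)
    key_dir.mkdir(parents=True, exist_ok=True)

    data_path = data_dir / "pseudonymized.json"
    vault_path = key_dir / "vault.json"
    data_path.write_text(json.dumps(records, indent=2), encoding="utf-8")
    vault_path.write_text(json.dumps(vault, indent=2), encoding="utf-8")
    os.chmod(vault_path, 0o600)  # owner-only; has limited effect on Windows
    return data_path, vault_path
```

The directory check is only a guard against mistakes. The actual separation comes from who controls each location.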
