
Building GDPR-Compliant Customer Support AI: Stripping PII AND Custom Identifiers Before Sending to AI Vendors

Customer support AI receives customer messages with names, emails, AND order IDs. Standard PII tools strip email addresses but leave order IDs intact — partial anonymization that fails GDPR pseudonymization requirements. Here's the complete solution.

March 5, 2026 · 7 min read
customer support AI · GDPR AI compliance · order ID detection · Intercom GDPR · Zendesk privacy · AI vendor data


Your customer support team uses an AI assistant to draft responses, summarize ticket history, and suggest solutions. The AI is good. Productivity is up. Then your DPO reviews the implementation.

Customer messages pasted into the AI interface contain:

  • Customer name: "Hi, I'm Sarah Johnson and my order..."
  • Email address: "Please email me at sarah.j@gmail.com"
  • Order ID: "ORD-4521893 hasn't arrived yet"

The name and email are personal data. The order ID is also personal data — it's linked to Sarah Johnson in your order management system, which the AI vendor can cross-reference if they process data for multiple clients, or which creates re-identification risk if the AI training data is ever exposed.

You're sending personal data to an external AI vendor without a valid legal basis or appropriate safeguards. This is a GDPR violation.

Why Order IDs Are Personal Data

GDPR's definition of personal data is deliberately broad: "any information relating to an identified or identifiable natural person." A person is identifiable if they can be identified "directly or indirectly, in particular by reference to an identifier."

An order ID (ORD-4521893) is an indirect identifier. Alone, it doesn't identify Sarah Johnson. But combined with your order management database — which the AI vendor may or may not have access to — it identifies her with certainty.
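The re-identification risk fits in two lines of code. The order database is the "additional information" that turns the pseudonym back into a person (the record contents here are the article's running example, not real data):

```python
# The order management system holds the pseudonym key: anyone with
# access to it can resolve an order ID back to a named customer.
orders = {
    "ORD-4521893": {"name": "Sarah Johnson", "email": "sarah.j@gmail.com"},
}

# One lookup re-identifies the data subject behind the "anonymous" ID.
print(orders["ORD-4521893"]["name"])  # -> Sarah Johnson
```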

GDPR Article 4(5)'s pseudonymization concept applies here: order IDs are pseudonyms that require additional information (the order database) for re-identification. When the organization controlling the pseudonym key (you, the data controller) sends that pseudonym to an external AI vendor, you're sharing pseudonymous data that may be re-identifiable.

The legal analysis: pseudonymous data sent to a third party who doesn't have the key is protected from re-identification by that third party — but you've still shared personal data requiring a legal basis and DPA agreement.

The Standard Anonymization Gap

Support teams implementing GDPR compliance for their AI tools often deploy standard PII detection:

What gets removed:

  • Customer names (PERSON entity detection) ✓
  • Email addresses (EMAIL_ADDRESS detection) ✓
  • Phone numbers (PHONE_NUMBER detection) ✓
  • Credit card numbers (CREDIT_CARD detection) ✓

What stays:

  • Order IDs (ORD-XXXXXXX format — not in standard entity library) ✗
  • Account numbers (ACC-XXXXXXXX-XX format) ✗
  • Ticket reference numbers (TKT-XXXXX format) ✗
  • Internal user IDs (UUID or custom format) ✗
  • Subscription IDs (SUB-XXXXXXXX format) ✗

The anonymized message looks like: "Hi, I'm [PERSON_1] and my order ORD-4521893 hasn't arrived yet. Please email me at [EMAIL_1]."

The order ID remains. Anyone who knows it's ORD-4521893 (which is literally everyone in your organization with CRM access) can immediately identify the customer this message refers to. The anonymization is incomplete.
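The gap is easy to demonstrate. Here is a minimal sketch of a "standard" PII pass using only regex patterns for emails and phone numbers (real tools also run NER for names, omitted here for brevity); the point is that no default entity library includes your order ID format:

```python
import re

# Sketch of default-library PII detection: only the entity types a
# standard tool ships with. Order IDs are not among them.
STANDARD_PATTERNS = {
    "EMAIL_1": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE_1": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

def anonymize_standard(text: str) -> str:
    """Replace only the identifiers the default entity library knows."""
    for label, pattern in STANDARD_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

msg = ("Hi, I'm Sarah Johnson and my order ORD-4521893 hasn't arrived yet. "
       "Please email me at sarah.j@gmail.com")
print(anonymize_standard(msg))
# The email is masked, but ORD-4521893 passes through untouched.
```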

Chrome Extension: Real-Time Custom Identifier Detection

For support agents using web-based AI tools (Claude, ChatGPT, Gemini) directly in their browser, the Chrome Extension provides real-time anonymization at the point of input:

  1. Support agent copies customer message to clipboard or types into the AI interface
  2. The Chrome Extension detects that the destination is an AI platform
  3. Standard PII is automatically detected and replaced
  4. Custom entity patterns (order IDs, account numbers in your specific format) are detected using saved team configuration
  5. The agent sees the anonymized message in the AI interface — never the original PII

The custom entity configuration (ORD-XXXXXXX pattern) is set once by the DPO or compliance team and applied to all team members using the extension. Individual agents don't need to know the technical details of what's being anonymized — they paste the message, it's clean.

MCP Server: API-Level Detection for Integrated Tools

For customer support platforms using AI through API integrations (Intercom with AI responses, Zendesk with AI drafting), the MCP Server provides middleware anonymization:

Integration flow:

  1. Customer message received in support platform
  2. Before passing to AI model: message routed through MCP anonymization endpoint
  3. Anonymization applied (standard + custom entities)
  4. Anonymized message sent to AI model
  5. AI response generated (no PII exposure)
  6. Response returned to support platform, agent reviews and edits

This integration is transparent to support agents: the workflow is unchanged. Anonymization happens at the API layer and requires no agent action.

Connector configuration: Define custom entities once in the MCP configuration. All API calls through the MCP automatically apply the full entity detection including custom patterns.
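The middleware step above can be sketched as follows. The function names (`detect_and_replace`, `call_ai_model`) are placeholders for whatever your anonymization endpoint and AI vendor client actually expose; the pattern set is a stand-in for the full standard-plus-custom configuration:

```python
import re

def detect_and_replace(text: str) -> str:
    """Stand-in for the anonymization endpoint (standard + custom entities)."""
    text = re.sub(r"[\w.+-]+@[\w-]+\.[\w.]+", "[EMAIL]", text)  # standard
    text = re.sub(r"\bORD-\d{7}\b", "[ORDER_ID]", text)         # custom
    return text

def call_ai_model(prompt: str) -> str:
    """Placeholder for the vendor API call; returns a canned draft here."""
    return f"Drafted reply for: {prompt}"

def handle_support_message(message: str) -> str:
    clean = detect_and_replace(message)  # steps 2-3: anonymize first
    return call_ai_model(clean)          # steps 4-5: vendor never sees raw PII

print(handle_support_message("ORD-4521893 missing, contact sarah.j@gmail.com"))
```

The key design point is ordering: the anonymization call sits between the support platform and the vendor client, so there is no code path where the raw message reaches the AI model.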

DPO Implementation Checklist

For the DPO reviewing AI-assisted customer support implementation:

1. Inventory all data flowing to AI:

  • Direct paste/input (browser-based AI tools)
  • API calls (AI integrated into support platform)
  • File attachments (if agents upload screenshots or documents)

2. Identify all identifier types in customer messages:

  • Standard PII: names, emails, phones (covered by default detection)
  • Custom identifiers: order IDs, account numbers, ticket numbers (require custom configuration)

3. Configure custom entity patterns: For each custom identifier format, define the pattern, test it against sample messages, and save it to the team preset.
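Step 3 in code, as one possible shape: each pattern is checked against sample messages before it is saved to the preset. The regexes match the example formats used in this article (ORD- plus seven digits, and so on); substitute your organization's actual formats.

```python
import re

# Candidate custom entity patterns, derived from the example formats.
PATTERNS = {
    "ORDER_ID": re.compile(r"\bORD-\d{7}\b"),
    "ACCOUNT_NUMBER": re.compile(r"\bACC-\d{8}-\d{2}\b"),
    "TICKET_REF": re.compile(r"\bTKT-\d{5}\b"),
}

# Sample messages with the entity types each one should trigger.
SAMPLES = [
    ("ORD-4521893 hasn't arrived", {"ORDER_ID"}),
    ("charge on ACC-12345678-01 looks wrong", {"ACCOUNT_NUMBER"}),
    ("re: TKT-00871, still waiting", {"TICKET_REF"}),
    ("no identifiers in this one", set()),
]

for text, expected in SAMPLES:
    found = {name for name, p in PATTERNS.items() if p.search(text)}
    assert found == expected, f"{text!r}: expected {expected}, got {found}"
print("all sample messages matched as expected -- safe to save to preset")
```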

4. Implement anonymization at the appropriate layers:

  • Browser-based AI: Chrome Extension with team preset
  • API-integrated AI: MCP Server or API-level preprocessing

5. Document for ROPA: Record that customer support AI processing uses automated PII anonymization, including which custom identifiers are detected. This is the technical safeguard documentation.

6. Validate with test scenarios: Send test messages containing all identifier types through the implemented anonymization. Verify all identifiers are removed before they reach the AI model.
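Step 6 can itself be automated. A sketch of a validation run, assuming a regex-based pipeline like the ones above: send a message containing every identifier type through the full anonymizer, then assert that none survive.

```python
import re

# Every pattern the pipeline is supposed to catch, standard and custom.
ALL_PATTERNS = [
    r"[\w.+-]+@[\w-]+\.[\w.]+",  # email (standard)
    r"\bORD-\d{7}\b",            # order ID (custom)
    r"\bSUB-\d{8}\b",            # subscription ID (custom)
]

def anonymize(text: str) -> str:
    for pattern in ALL_PATTERNS:
        text = re.sub(pattern, "[REDACTED]", text)
    return text

# One test message exercising all identifier types at once.
test_msg = "sarah.j@gmail.com asked about ORD-4521893 and SUB-20250107"
clean = anonymize(test_msg)

for pattern in ALL_PATTERNS:
    assert not re.search(pattern, clean), f"identifier survived: {pattern}"
print("all identifiers removed before reaching the AI model")
```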

Real-World Example: SaaS Customer Support

A SaaS company's customer support team uses Claude (via their internal AI platform) to draft support responses. Customer messages include:

  • Customer names and emails
  • Order IDs (ORD-XXXXXXX format)
  • Subscription IDs (SUB-XXXXXXXX format)
  • Feature flag names (sometimes contain internal customer identifiers)

Before GDPR review: All message content sent directly to AI model including order and subscription IDs.

After implementing custom entity detection:

  • ORD-XXXXXXX and SUB-XXXXXXXX patterns configured as custom entities
  • Chrome Extension deployed to support team with shared preset
  • DPO verified: test messages through the system show all identifiers removed

Support workflow change: Zero. Agents paste messages as before. Anonymization is invisible to them. The DPO has documentation of the technical safeguard.

Conclusion

GDPR-compliant customer support AI requires more than removing names and emails. Order IDs, account numbers, and ticket references are personal data that standard PII tools miss. The compliance gap between "we anonymize PII before AI" and "we actually anonymize all identifiers" is closed with custom entity configuration.

The fix is not complex: define your organization's identifier formats, test against sample messages, deploy to the team. The DPO can configure this in an afternoon. The ongoing compliance benefit — all customer PII removed before external AI processing — is permanent.

