
How Samsung Lost Proprietary Source Code to ChatGPT Three Times in One Month

Three separate Samsung engineering teams pasted proprietary code and confidential data into ChatGPT in April 2023. Each incident revealed a different aspect of the same technical gap — and triggered an industry-wide AI ban wave.

March 5, 2026 · 9 min read
Samsung ChatGPT leak, source code protection, enterprise AI controls, insider data leakage, MCP Server anonymization

Three Engineering Teams, Three Incidents, One Month

In April 2023, Samsung Semiconductor disclosed three separate incidents in which employees had transmitted proprietary data to ChatGPT within a single month.

The incidents were not related to each other. They involved different employees in different roles, pursuing different tasks, on different days. They shared only two characteristics: each employee used ChatGPT to accomplish a legitimate work goal, and each inadvertently transmitted data that Samsung had not intended to share with OpenAI's infrastructure.

Incident 1: A software engineer was debugging code related to semiconductor equipment. Debugging complex systems is a common AI tool use case — providing code to an AI model and asking it to identify the source of unexpected behavior. The engineer pasted source code from Samsung's proprietary semiconductor equipment systems into ChatGPT. The code contained intellectual property related to Samsung's manufacturing processes.

Incident 2: An employee was preparing a meeting summary. AI-assisted note-taking and meeting summarization have become standard workflow tools across industries. The employee submitted meeting notes to ChatGPT for summarization. Those meeting notes contained confidential internal discussions — business strategy, technical roadmaps, and other information Samsung considered non-public.

Incident 3: A third employee sought optimization suggestions for a database query. Database optimization is a technically demanding task where AI assistance provides genuine value. The employee provided the database structure and query logic to ChatGPT. The query logic contained references to proprietary data structures and business logic.

Why the Employees Did It

None of the three Samsung employees were acting irresponsibly by their own professional standards. They were using an AI tool for tasks that AI tools are designed to assist with: code debugging, text summarization, technical optimization.

The missing element in each case was technical friction. No system intercepted the submission before it reached OpenAI's servers. No control flagged proprietary code identifiers before they left the corporate network. No architectural layer stood between the employee's legitimate work need and the AI provider's infrastructure.

The employees were rational. The AI tool provided genuine assistance with legitimate work tasks. The policy warning existed but imposed no technical barrier. The consequence of non-compliance — potential disciplinary action for an accidental act — was abstract and remote compared to the immediate productivity benefit of the tool.

The result: three incidents in one month, three disclosures of proprietary information, and a corporate crisis that triggered a global wave of enterprise AI bans.

The Industry Response

Samsung's internal response was swift: ChatGPT access was restricted on corporate devices. The disclosure triggered a broader industry reaction that revealed how widespread the underlying condition was.

The organizations that announced AI tool bans or restrictions following the Samsung disclosure included Bank of America, Citigroup, Goldman Sachs, JPMorgan Chase, Apple, and Verizon. The financial sector response was particularly comprehensive — multiple major institutions simultaneously concluded that the risk profile of AI tools without technical controls was incompatible with their compliance obligations.

Each organization reached the same conclusion: the employees are not the problem, and policy warnings are not sufficient controls. Data was leaving their networks because no technical barrier prevented it, and policy alone cannot create a technical barrier.

The 71.6% Bypass Rate

The ban approach has a documented failure rate. LayerX research from 2025 found that 71.6% of employees subject to enterprise AI bans continued using AI tools through personal accounts or devices.

The bypass rate reflects basic behavior: when a tool provides genuine productivity value, users find workarounds rather than permanently abandon it. An employee who discovers that AI assistance substantially accelerates their work output will not stop using the tool because corporate policy prohibits it on corporate devices. They will use personal accounts on personal devices, through channels the security team cannot see.

The practical consequence of the 71.6% bypass rate is that the AI ban achieves the worst possible outcome: corporate data reaches AI providers through channels with no security controls at all. At least corporate device access could theoretically be monitored. Personal account usage is invisible to the security team entirely.

Samsung's three incidents happened on corporate devices through corporate access. The employees who bypass the ban are doing the same thing — providing work-related data to AI models — through channels with no enterprise oversight.

The Technical Control That Addresses the Root Cause

The Samsung incidents were not caused by employee carelessness. They were caused by an architecture that provided no interception layer between employee AI use and external AI infrastructure.

Model Context Protocol (MCP) architecture provides a transparent proxy between AI clients and AI model APIs. For developers using Claude Desktop or Cursor IDE — the primary tools for the type of code debugging that caused Samsung's first incident — the MCP Server sits in the protocol path.

Before any text reaches the AI model, the MCP Server processes it through an anonymization engine. Source code is analyzed for proprietary identifiers: function names, variable names, internal API endpoints, database schema details, configuration values. These are replaced with structured tokens before the code reaches the AI model.
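The core substitution idea can be illustrated with a minimal sketch. This is not the MCP Server's actual implementation; the identifier names below are hypothetical stand-ins for proprietary ones, and a production engine would use language-aware parsing rather than a fixed list. The sketch shows the two halves of the technique: replace sensitive identifiers with structured tokens on the way out, and restore them in the model's response on the way back.

```python
import re

# Hypothetical proprietary identifiers -- stand-ins for illustration only.
PROPRIETARY_IDENTIFIERS = [
    "calibrate_etch_chamber",
    "WaferDefectMap",
    "FAB3_CONFIG_ENDPOINT",
]

def anonymize(source: str) -> tuple[str, dict[str, str]]:
    """Replace proprietary identifiers with tokens; return text plus the mapping."""
    mapping: dict[str, str] = {}
    for i, name in enumerate(PROPRIETARY_IDENTIFIERS, start=1):
        token = f"__IDENT_{i}__"
        pattern = re.compile(rf"\b{re.escape(name)}\b")
        if pattern.search(source):
            mapping[token] = name
            source = pattern.sub(token, source)
    return source, mapping

def deanonymize(text: str, mapping: dict[str, str]) -> str:
    """Restore the original identifiers in the model's response."""
    for token, name in mapping.items():
        text = text.replace(token, name)
    return text

code = "result = calibrate_etch_chamber(WaferDefectMap, FAB3_CONFIG_ENDPOINT)"
safe, mapping = anonymize(code)
print(safe)  # result = __IDENT_1__(__IDENT_2__, __IDENT_3__)
print(deanonymize(safe, mapping) == code)  # True
```

The mapping stays inside the corporate network: only the tokenized text crosses to the AI provider, and the response is re-identified locally before the developer sees it.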

A developer asking Claude to debug proprietary Samsung semiconductor code through an MCP Server equipped with anonymization would transmit code in which proprietary identifiers had been replaced with tokens. The AI model assists with the debugging task using the anonymized code — which is sufficient for code analysis. The proprietary specifics never reach the AI provider's servers.

Incident 1 becomes technically impossible. The source code leaves the network in anonymized form. The AI provides the debugging assistance the engineer needed. Samsung's intellectual property stays in Samsung's control.

The same architecture applies to Incident 2 (meeting note summarization through browser-based AI, addressed by the Chrome Extension) and Incident 3 (database query optimization through any AI coding interface, addressed by MCP anonymization).
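For the database case, the same token substitution applies to schema names. A brief sketch, again with hypothetical table and column names standing in for proprietary ones: the query structure the AI needs for optimization advice survives, while the business-meaningful names do not leave the network.

```python
import re

# Hypothetical schema names mapped to structured tokens (illustrative only).
SCHEMA_NAMES = {
    "customer_credit_scores": "__TBL_1__",
    "internal_risk_tier": "__COL_1__",
}

def anonymize_query(sql: str) -> str:
    """Substitute proprietary schema identifiers before the query leaves the network."""
    for name, token in SCHEMA_NAMES.items():
        sql = re.sub(rf"\b{re.escape(name)}\b", token, sql)
    return sql

query = "SELECT internal_risk_tier FROM customer_credit_scores WHERE internal_risk_tier > 3"
print(anonymize_query(query))
# SELECT __COL_1__ FROM __TBL_1__ WHERE __COL_1__ > 3
```

An index recommendation or rewrite suggestion returned for `__TBL_1__` applies unchanged once the tokens are mapped back to the real names.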

The Samsung incidents were a preview of a systematic problem. The technical controls that address the root cause now exist. The question is whether enterprises will deploy them or continue relying on bans that 71.6% of their employees are already bypassing.
