LLM06: Sensitive Information Disclosure
LLMs can inadvertently reveal sensitive information through their responses — including PII from training data, API keys from conversation context, proprietary business logic, and internal system details. This occurs through memorization of training data, context window leakage, and system prompt extraction. The risk is amplified when LLMs have access to internal databases, documents, or APIs.
Training Data Memorization
LLMs memorize portions of their training data, especially rare or repeated sequences. GPT-3 was shown to reproduce phone numbers, email addresses, and code snippets from its training corpus when given the right prompts. This means any PII in the training data can potentially be extracted by a determined attacker. Fine-tuned models are particularly vulnerable because fine-tuning data is often memorized more strongly due to lower volume and higher repetition.
Context Window Leakage
In multi-turn conversations or RAG applications, the LLM has access to all information in its context window. A prompt injection attack can cause the model to output information from earlier in the conversation, including other users' data in shared-context applications. System prompts, which often contain business logic and API instructions, can also be extracted through prompt injection.
Third-Party API Data Exposure
When applications send user data to commercial LLM APIs (OpenAI, Anthropic, Google), that data travels to third-party servers. If the data includes PII, medical records, financial information, or trade secrets, this creates compliance risks under GDPR, HIPAA, and SOC2. Even with data processing agreements, the data exists on third-party infrastructure.
⚔️ Attack Examples & Code Patterns
System prompt extraction
Extracting hidden system instructions from a chatbot:
PII leakage via RAG retrieval
Sensitive customer data exposed through RAG context:
🔍 Detection Checklist
- ☐Audit all data sent to LLM APIs for PII and credentials
- ☐Implement egress filtering on LLM outputs for sensitive patterns
- ☐Test system prompt extraction with known attack techniques
- ☐Verify RAG retrieval includes PII masking before LLM context
- ☐Check logging — ensure LLM inputs/outputs don't log sensitive data
- ☐Review data processing agreements with LLM API providers
🛡️ Mitigation Strategy
Implement data masking before sending sensitive information to LLMs. Apply egress filters on LLM outputs to detect and redact PII, credentials, and internal data. Use system prompt protection techniques. Minimize the data accessible to the LLM through the principle of least privilege.
How Precogs AI Protects You
Precogs AI identifies hardcoded secrets in LLM orchestration code, detects sensitive data flows to commercial LLM APIs, and scans for PII leakage patterns in RAG retrieval pipelines. AutoFix PRs add data masking and egress filtering.
Start Free ScanHow do LLMs leak sensitive information?
LLMs leak data through training data memorization (reproducing PII from training), context window leakage (exposing system prompts or other users' data via injection), and third-party API exposure (sending sensitive data to commercial LLM providers). Prevention requires data masking, egress filtering, and system prompt protection.
Protect Against LLM06: Sensitive Information Disclosure
Precogs AI automatically detects llm06: sensitive information disclosure vulnerabilities and generates AutoFix PRs.