Proposal: Audit PII-handling system prompts for defense gaps

## Context

Presidio excels at detecting and redacting PII. But when Presidio is used with LLMs (e.g., the PII de-identification pipeline), the system prompts that instruct the LLM how to handle PII are themselves a security surface.

If the system prompt lacks explicit data protection instructions, the LLM may:
- Leak the PII it's supposed to redact
- Follow injected instructions in the PII-containing text
- Output the original sensitive data when asked

## Proposal

Add a sample/recipe that audits PII-handling system prompts for defense gaps before deployment:

```python
import subprocess, json

# The prompt you use to instruct the LLM for PII handling
pii_prompt = "You are a PII redaction assistant. Replace all personal information with [REDACTED]."

result = subprocess.run(
    ["npx", "prompt-defense-audit", "--json", pii_prompt],
    capture_output=True, text=True
)
audit = json.loads(result.stdout)

# Check specifically for data protection and indirect injection defenses
for check in audit["checks"]:
    if check["id"] in ["data-leakage", "indirect-injection"] and not check["defended"]:
        print(f"WARNING: {check['name']} defense missing in PII handling prompt")
```

## Why this matters

We scanned 1,646 production system prompts and found 94.9% lack indirect injection defense. For PII-handling prompts, this is especially dangerous — an attacker could embed instructions in a document containing PII, causing the LLM to output the PII instead of redacting it.

## Tool

[prompt-defense-audit](https://github.com/ppcvote/prompt-defense-audit) — MIT, zero dependencies, <5ms. Available on npm.

Happy to contribute a sample notebook if this direction is useful.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: Audit PII-handling system prompts for defense gaps #1933

Context

Proposal

Why this matters

Tool

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Proposal: Audit PII-handling system prompts for defense gaps #1933

Description

Context

Proposal

Why this matters

Tool

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions