What is System Prompt Extraction?

System prompt extraction is an attack that recovers the hidden system instructions a deployer has set for an LLM application: the natural-language directives that define the assistant's persona, scope, available tools, and forbidden behaviors. It corresponds to LLM07:2025 (System Prompt Leakage) in the OWASP Top 10 for LLM Applications and is often the first reconnaissance step before more targeted attacks.

What's in a system prompt — and why exposing it matters

A typical production system prompt contains:

Exposure damages the deployment in several ways:

Common extraction techniques

Defending against extraction