Lab
Jailbreak Sandbox
Test jailbreak techniques against progressively hardened defenses with simulated model responses and scoring.
Educational security demonstration
What this is
An interactive sandbox for testing common LLM jailbreak techniques against layered defense configurations. See how each defense mechanism reduces attack effectiveness through pre-computed simulated outcomes.
What you'll learn
- Common jailbreak technique categories
- How defense layers interact and complement each other
- Why defense-in-depth matters for LLM safety
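The intuition behind defense-in-depth can be sketched with a simple multiplicative model: if each enabled defense independently blocks some fraction of jailbreak attempts, the residual success likelihood is the product of what every layer lets through. This is an illustrative model only, not the sandbox's actual scoring logic; the rates used are made-up numbers.

```python
def residual_success(base_rate: float, block_rates: list[float]) -> float:
    """Jailbreak success likelihood after stacking defenses (lower is safer).

    base_rate:   technique's success rate with no defenses enabled
    block_rates: fraction of attempts each defense layer blocks
    """
    rate = base_rate
    for blocked in block_rates:
        rate *= (1.0 - blocked)  # each layer passes only the unblocked share
    return rate


# A technique that succeeds 70% of the time undefended, run through three
# moderately effective layers (50%, 40%, 30% block rates):
print(round(residual_success(0.70, [0.5, 0.4, 0.3]), 3))  # 0.147
```

No single layer here is strong on its own, yet together they cut success from 70% to under 15%, which is why the sandbox scores improve as you enable more defenses.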
Technique Library
Select a jailbreak technique to test against the current defense configuration.
Defense Configuration
Enable defenses to see how they reduce jailbreak effectiveness.
Test Arena
Run the selected technique against the current defense configuration.
Select a technique above, then click Run Test.
Effectiveness Matrix
Overview of technique success rates across defense configurations. Scores represent jailbreak success likelihood (lower is safer).
| Technique | (defense configurations) |
|---|---|
| *(populated interactively from pre-computed scores)* | |
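Since every outcome is pre-computed, the matrix can be thought of as a lookup table keyed by (technique, defense configuration). The sketch below shows one way such a table could be rendered as markdown; the technique names, configuration labels, and scores are illustrative stand-ins, not the sandbox's actual data.

```python
# Hypothetical pre-computed scores: (technique, defense config) -> success likelihood.
SCORES: dict[tuple[str, str], float] = {
    ("role-play", "no defenses"): 0.80,
    ("role-play", "all defenses"): 0.10,
    ("payload splitting", "no defenses"): 0.70,
    ("payload splitting", "all defenses"): 0.15,
}


def render_matrix(scores: dict[tuple[str, str], float]) -> str:
    """Render the score table as a markdown matrix (lower scores are safer)."""
    techniques = sorted({t for t, _ in scores})
    configs = sorted({c for _, c in scores})
    header = "| Technique | " + " | ".join(configs) + " |"
    separator = "|---" * (len(configs) + 1) + "|"
    rows = [
        "| " + t + " | " + " | ".join(f"{scores[(t, c)]:.2f}" for c in configs) + " |"
        for t in techniques
    ]
    return "\n".join([header, separator] + rows)


print(render_matrix(SCORES))
```

Because the table is a plain dictionary, toggling a defense in the UI only changes which pre-computed key is looked up; nothing is ever sent to a model.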
Keyboard shortcuts
- 1-6 Select technique
- Enter Run test
- R Reset
Security model (30 seconds)
This tool runs entirely in your browser. No prompts are sent to any AI model; all responses are pre-computed and embedded in the page. No data is sent to any server. This is a pure simulation for educational purposes.