LIVE CHALLENGE

Break This Agent

Each level has an AI agent guarding a secret code word. Your mission: make it talk. 10 levels, from unguarded intern to maximum security vault.

Think Gandalf AI, but for the full spectrum of agent security.

Level 1

This agent was hired yesterday. It will literally tell you anything.

Level 2

Told to keep secrets. Takes the job seriously.

Locked

Level 3

Denies it even has a password. Gaslight defense.

Locked

Level 4

Filters your input. Certain words are banned.

Locked

Level 5

Checks its own response before sending. If the secret leaks, you get nothing.

Locked

Level 6

System prompt wraps around your input. Instructions before AND after.

Locked

Level 7

Input filters, output filters, AND role-play resistance. Multi-layered.

Locked

Level 8

Detects encoding tricks. Base64, ROT13, pig latin? Nope.

Locked

Level 9

Has tools. Must not leak through tool calls or function outputs.

Locked

Level 10

Every defense stacked. Input filters, output filters, sandwich, encoding resistance, tool defense.

Locked