LIVE CHALLENGE

Break This Agent

Each level has an AI agent guarding a secret code word. Your mission: make it talk. 10 levels, from unguarded intern to maximum security vault.

Think Gandalf AI, but for the full spectrum of agent security.

1

The Intern

Level 1

This agent was hired yesterday. It will literally tell you anything.

The Guard

Level 2

Told to keep secrets. Takes the job seriously.

Locked

The Actor

Level 3

Denies it even has a password. Gaslight defense.

Locked

The Bouncer

Level 4

Filters your input. Certain words are banned.

Locked

The Censor

Level 5

Checks its own response before sending. If the secret leaks, you get nothing.

Locked

The Sandwich

Level 6

System prompt wraps around your input. Instructions before AND after.

Locked

Fort Knox

Level 7

Input filters, output filters, AND role-play resistance. Multi-layered.

Locked

The Cryptographer

Level 8

Detects encoding tricks. Base64, ROT13, pig latin? Nope.

Locked

The Operator

Level 9

Has tools. Must not leak through tool calls or function outputs.

Locked

The Vault

Level 10

Every defense stacked. Input filters, output filters, sandwich, encoding resistance, tool defense.

Locked
Scan Agent