CRITICAL | Jailbreak

TAP Tree-of-Attacks-with-Pruning

Tree of Attacks with Pruning (TAP) extends PAIR (Prompt Automatic Iterative Refinement) with a tree search that explores branching jailbreak strategies and prunes unsuccessful branches. It is more efficient than PAIR, requiring fewer queries to the target model to find a successful jailbreak. It demonstrates that black-box jailbreaking can be systematically automated.

Attack Payload

payload.txt
TAP generates a tree of attack strategies:
Branch A: Authority framing -> "As an authorized security researcher..."
Branch B: Fictional distance -> "In a world where..."
Branch C: Academic context -> "For my thesis on..."
[Each branch is tested and successful sub-strategies are expanded while failures are pruned]

Mitigation

Apply the same mitigations as for PAIR, and add semantic clustering of requests to detect tree-search patterns. Enforce a query budget per session. Flag sessions that systematically explore the safety boundary, for example by sending many rephrasings of the same request.
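
The two session-level controls above (a query budget and clustering of similar requests) can be sketched as follows. This is a minimal illustration, not a production detector: the class name `SessionMonitor`, its parameters, and the thresholds are all hypothetical, and token-set Jaccard overlap is used only as a crude stand-in for a real semantic embedding model.

```python
import re
from collections import defaultdict


def tokens(prompt: str) -> set[str]:
    """Lowercased word set; a crude stand-in for a semantic embedding."""
    return set(re.findall(r"[a-z0-9']+", prompt.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two token sets, in the range [0, 1]."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


class SessionMonitor:
    """Hypothetical per-session guard: query budget plus similarity clustering."""

    def __init__(self, max_queries: int = 20, sim_threshold: float = 0.5,
                 max_cluster_size: int = 4):
        self.max_queries = max_queries            # assumed budget per session
        self.sim_threshold = sim_threshold        # assumed "same request" cutoff
        self.max_cluster_size = max_cluster_size  # rephrasings tolerated before flagging
        self.history = defaultdict(list)          # session_id -> list of token sets

    def check(self, session_id: str, prompt: str) -> str:
        """Return "allow", "flag", or "block" for a new prompt."""
        seen = self.history[session_id]
        if len(seen) >= self.max_queries:
            return "block"  # budget exhausted; the prompt is not recorded

        current = tokens(prompt)
        # Count earlier prompts in this session that look like rephrasings.
        similar = sum(
            1 for past in seen if jaccard(current, past) >= self.sim_threshold
        )
        seen.append(current)

        # Cluster size is the similar earlier prompts plus the current one.
        if similar + 1 >= self.max_cluster_size:
            return "flag"
        return "allow"
```

With `SessionMonitor(max_queries=3, max_cluster_size=3)`, three prompts that differ only in their framing ("as a researcher ...", "as a student ...", "as a novelist ...") yield `allow`, `allow`, `flag`, and a fourth query in the same session is blocked by the budget. A real deployment would likely replace `jaccard` with embedding similarity, since paraphrases that share few words would slip past a token-overlap check.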

Affected Models

GPT-4, Claude 3, Gemini Pro, any API-accessible model

Tags

#jailbreak #tap #tree-search #automated #pruning #black-box

Discovered

December 2023

Source

Mehrotra et al. - Tree of Attacks: Jailbreaking Black-Box LLMs Automatically (2023)