Jailbreak Taxonomy


We classify the methods based on the following two criteria:

| Jailbreak Taxonomy | Jailbreak Method | Require White-Box Access? | Modify the Original Question? | Initial Jailbreak Seeds? |
| --- | --- | --- | --- | --- |
| Human-Based | AIM | | | / |
| Human-Based | Devmoderanti | | | / |
| Human-Based | Devmodev2 | | | / |
| Obfuscation-Based | Base64 | | | / |
| Obfuscation-Based | Combination | | | / |
| Obfuscation-Based | Zulu | | | / |
| Obfuscation-Based | DrAttack | | | |
| Heuristic-Based | AutoDAN | | | |
| Heuristic-Based | GPTFuzz | | | |
| Heuristic-Based | LAA | | | |
| Feedback-Based | GCG | | | |
| Feedback-Based | COLD | | | |
| Feedback-Based | PAIR | | | |
| Feedback-Based | TAP | | | |
| Fine-Tuning-Based | Masterkey | | | |
| Fine-Tuning-Based | AdvPrompter | | | |
| Generation-Parameter-Based | Generation Exploitation | | | / |