Leaderboard


Welcome to the leaderboard page.

| Jailbreak Taxonomy | Jailbreak Method | Modify the Questions? | Access | ChatGLM3 | Llama2-7b-chat-hf | Vicuna-7b-v1.5 | GPT-3.5-turbo | GPT-4 | PaLM2 | Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Human-Based | AIM | | Black-Box | 0.93 | 0.13 | 0.99 | 0.99 | 0.62 | 0.88 | 0.76 |
| Human-Based | Devmoderanti | | Black-Box | 0.79 | 0.14 | 0.91 | 0.73 | 0.08 | 0.61 | 0.54 |
| Human-Based | Devmode v2 | | Black-Box | 0.65 | 0.20 | 0.89 | 0.53 | 0.51 | 0.54 | 0.55 |
| Obfuscation-Based | Base64 | | Black-Box | 0.02 | 0.11 | 0.15 | 0.14 | 0.49 | 0.01 | 0.15 |
| Obfuscation-Based | Combination | | Black-Box | 0.09 | 0.06 | 0.12 | 0.31 | 0.74 | 0.04 | 0.23 |
| Obfuscation-Based | Zulu | | Black-Box | 0.04 | 0.08 | 0.18 | 0.79 | 0.76 | 0.01 | 0.31 |
| Optimization-Based | AutoDAN | | White-Box | 0.90 | 0.58 | 0.98 | / | / | / | 0.82 |
| Optimization-Based | GCG | | White-Box | 0.44 | 0.56 | 0.87 | / | / | / | 0.62 |
| Optimization-Based | COLD | | White-Box | 0.50 | 0.45 | 0.42 | / | / | / | 0.46 |
| Optimization-Based | GPTfuzz | | Black-Box | 0.88 | 0.41 | 0.79 | 0.85 | 0.41 | 0.48 | 0.64 |
| Optimization-Based | PAIR | | Black-Box | 0.54 | 0.48 | 0.76 | 0.62 | 0.80 | 0.78 | 0.66 |
| Optimization-Based | TAP | | Black-Box | 0.76 | 0.44 | 0.74 | 0.81 | 0.71 | 0.74 | 0.70 |
| Optimization-Based | Masterkey | | Black-Box | 0.82 | 0.11 | 0.88 | 0.90 | 0.54 | 0.76 | 0.67 |
| Parameter-Based | Generation Exploitation | | White-Box | 0.80 | 0.72 | 0.95 | / | / | / | 0.82 |
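
The Average column above is consistent with a plain mean over the models that have a score, rounded to two decimals, with "/" entries left out. The page does not state this rule; it is inferred from the numbers. The helper name `average_score` is hypothetical and used only for illustration. A minimal sketch of that calculation:

```python
from statistics import mean


def average_score(cells):
    """Mean of the numeric cells, rounded to two decimals.

    Assumption: a "/" cell means no score was reported for that model,
    so it is skipped rather than counted as zero.
    """
    values = [float(cell) for cell in cells if cell != "/"]
    return round(mean(values), 2)


# Per-model scores copied from two rows of the table.
aim = ["0.93", "0.13", "0.99", "0.99", "0.62", "0.88"]  # black-box, six models
autodan = ["0.90", "0.58", "0.98", "/", "/", "/"]       # white-box, three models

print(average_score(aim))      # 0.76
print(average_score(autodan))  # 0.82
```

Under this reading, the averages of white-box methods cover only the three open-weight models, so they are not directly comparable with the six-model averages of black-box methods.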