Leaderboard
Welcome to the leaderboard page.
| Jailbreak Taxonomy | Jailbreak Method | Modify the Questions? | Access | ChatGLM3 | Llama2-7b-chat-hf | Vicuna-7b-v1.5 | GPT-3.5-turbo | GPT-4 | PaLM2 | Average |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Human-Based | AIM | ✓ | Black-Box | 0.93 | 0.13 | 0.99 | 0.99 | 0.62 | 0.88 | 0.76 |
| Human-Based | Devmoderanti | ✓ | Black-Box | 0.79 | 0.14 | 0.91 | 0.73 | 0.08 | 0.61 | 0.54 |
| Human-Based | Devmode v2 | ✓ | Black-Box | 0.65 | 0.20 | 0.89 | 0.53 | 0.51 | 0.54 | 0.55 |
| Obfuscation-Based | Base64 | ✓ | Black-Box | 0.02 | 0.11 | 0.15 | 0.14 | 0.49 | 0.01 | 0.15 |
| Obfuscation-Based | Combination | ✓ | Black-Box | 0.09 | 0.06 | 0.12 | 0.31 | 0.74 | 0.04 | 0.23 |
| Obfuscation-Based | Zulu | ✓ | Black-Box | 0.04 | 0.08 | 0.18 | 0.79 | 0.76 | 0.01 | 0.31 |
| Optimization-Based | AutoDAN | ✓ | White-Box | 0.90 | 0.58 | 0.98 | / | / | / | 0.82 |
| Optimization-Based | GCG | ✓ | White-Box | 0.44 | 0.56 | 0.87 | / | / | / | 0.62 |
| Optimization-Based | COLD | ✓ | White-Box | 0.50 | 0.45 | 0.42 | / | / | / | 0.46 |
| Optimization-Based | GPTfuzz | ✓ | Black-Box | 0.88 | 0.41 | 0.79 | 0.85 | 0.41 | 0.48 | 0.64 |
| Optimization-Based | PAIR | ✓ | Black-Box | 0.54 | 0.48 | 0.76 | 0.62 | 0.80 | 0.78 | 0.66 |
| Optimization-Based | TAP | ✓ | Black-Box | 0.76 | 0.44 | 0.74 | 0.81 | 0.71 | 0.74 | 0.70 |
| Optimization-Based | Masterkey | ✓ | Black-Box | 0.82 | 0.11 | 0.88 | 0.90 | 0.54 | 0.76 | 0.67 |
| Parameter-Based | Generation Exploitation | ✗ | White-Box | 0.80 | 0.72 | 0.95 | / | / | / | 0.82 |
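
The Average column looks like a plain mean over the models each method was evaluated on, with "/" cells left out of both the sum and the count. The page does not state this rule; it is inferred from the listed numbers. A minimal Python sketch of that reading follows, where `ROWS` and `row_average` are illustrative names and `None` stands in for a "/" cell:

```python
# Scores copied from three rows of the table, in column order:
# ChatGLM3, Llama2-7b-chat-hf, Vicuna-7b-v1.5, GPT-3.5-turbo, GPT-4, PaLM2.
# None marks a "/" cell (no score reported for that model).
ROWS = {
    "AIM": [0.93, 0.13, 0.99, 0.99, 0.62, 0.88],
    "AutoDAN": [0.90, 0.58, 0.98, None, None, None],
    "Generation Exploitation": [0.80, 0.72, 0.95, None, None, None],
}


def row_average(scores):
    """Mean of the reported scores, rounded to two decimals.

    Assumption: missing cells are skipped rather than counted as zero.
    """
    available = [s for s in scores if s is not None]
    return round(sum(available) / len(available), 2)


for method, scores in ROWS.items():
    print(method, row_average(scores))
# AIM 0.76
# AutoDAN 0.82
# Generation Exploitation 0.82
```

Under this reading, averages for white-box methods cover three open models only, so they are not directly comparable with the six-model averages of black-box methods.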