Overview
Rogue’s red teaming engine includes 30+ attack techniques organized into three categories: single-turn, multi-turn, and agentic attacks. Each attack is designed to test specific vulnerability types and can be combined for comprehensive security testing.Attack Categories
Single-Turn Attacks
Single-Turn Attacks
Quick, single-message attacks that attempt immediate exploitation. These are the fastest to execute and provide good coverage for basic security testing.
Example: Base64 AttackExample: Roleplay Attack
Free Single-Turn Attacks
Available without a premium API key:| Attack ID | Name | Description |
|---|---|---|
base64 | Base64 Encoding | Encodes payload in Base64 to bypass text-based filters |
rot13 | ROT13 Encoding | Uses ROT13 cipher to obfuscate malicious content |
prompt-injection | Prompt Injection | Direct system instruction override attempts |
roleplay | Roleplay | Uses persona/character to bypass safety filters |
prompt-probing | Prompt Probing | Attempts to extract system prompts through questions |
Premium Single-Turn Attacks
Require a Qualifire API key for access:| Attack ID | Name | Description |
|---|---|---|
hex | Hex Encoding | Hexadecimal encoding to bypass filters |
leetspeak | Leetspeak | Character substitution (1337 speak) |
homoglyph | Homoglyph Encoding | Unicode lookalike characters |
math-problem | Math Prompt | Embeds requests in mathematical context |
gray-box | Gray Box | Injects fake internal system information |
multilingual | Multilingual | Uses translation framing for bypass |
context-poisoning | Context Poisoning | Injects malicious context |
goal-redirection | Goal Redirection | Shifts conversation goals mid-prompt |
input-bypass | Input Bypass | Splits payload using delimiters |
permission-escalation | Permission Escalation | Claims elevated privileges |
system-override | System Override | Explicit system override commands |
semantic-manipulation | Semantic Manipulation | Complex phrasing to disguise intent |
citation | Citation | Frames content as academic references |
gcg | GCG | Greedy Coordinate Gradient adversarial suffixes |
likert-jailbreak | Likert-based Jailbreaks | Likert scale framing manipulation |
best-of-n | Best-of-N | Generates multiple variations |
Multi-Turn Attacks
Multi-Turn Attacks
Sophisticated attacks that build context over multiple conversation turns. These are more effective against agents with strong single-turn defenses.
Multi-Turn Session Management:
| Attack ID | Name | Description |
|---|---|---|
social-engineering-prompt-extraction | Social Engineering | Trust-building to extract prompts |
multi-turn-jailbreak | Multi-turn Jailbreaks | Progressive jailbreaking |
goat | GOAT | Generative Offensive Agent Tester |
mischievous-user | Mischievous User | Persistent user trying tactics |
simba | Simba | Simulation-based adversarial attacks |
crescendo | Crescendo | Gradually escalating intensity |
linear-jailbreak | Linear Jailbreaking | Sequential linear progression |
sequential-jailbreak | Sequential Jailbreak | Combines techniques in sequence |
bad-likert-judge | Bad Likert Judge | Manipulative evaluator persona |
Agentic Attacks
Agentic Attacks
AI-driven adaptive attacks that use intelligent strategies to find vulnerabilities. These represent the most advanced attack capabilities.
| Attack ID | Name | Description |
|---|---|---|
iterative-jailbreak | Iterative Jailbreaks | AI-driven iterative refinement |
meta-agent-jailbreak | Meta-Agent Jailbreaks | Meta-agent orchestrated strategies |
hydra | Hydra Multi-turn | Multi-headed parallel exploration |
tree-jailbreak | Tree-based Jailbreaks | Tree search exploration of vectors |
single-turn-composite | Single Turn Composite | Combines multiple attacks in one |
Attack Execution Flow
Attack Selection Strategy
For Basic Scans
Uses free attacks only:For Full Scans
Includes all attacks (premium key required):For Custom Scans
Select specific attacks based on testing needs:Attack Statistics
Rogue tracks effectiveness metrics for each attack:Implementing Custom Attacks
Attacks follow a simple interface:Premium Attack Service
Premium attacks are executed via the Deckard service:Attack-Vulnerability Mapping
Each vulnerability has default attacks that are most effective:| Vulnerability | Recommended Attacks |
|---|---|
| Prompt Extraction | prompt-probing, system-override, gray-box, base64 |
| PII Direct | prompt-injection, prompt-probing, permission-escalation |
| SQL Injection | prompt-injection, input-bypass, base64 |
| Excessive Agency | roleplay, goal-redirection, permission-escalation |
| Hate Speech | prompt-injection, roleplay, context-poisoning |
| Hallucination | prompt-injection, roleplay, goal-redirection |