Rogue’s workflow is designed to be simple and intuitive, and all four steps are managed through its web interface.
  1. Configure: You provide the endpoint and authentication details for the agent you want to test, and select the LLMs Rogue should use for its own services (scenario generation and judging).
  2. Generate Scenarios: You input the “business context” or a high-level description of what your agent is supposed to do. Rogue’s LLM Service uses this context to generate a list of relevant test scenarios. You can review and edit these scenarios.
  3. Run & Evaluate: You start the evaluation. The Scenario Evaluation Service spins up the EvaluatorAgent, which begins a conversation with your agent for each scenario. You can watch this conversation happen live.
  4. View Report: Once all scenarios are complete, the LLM Service analyzes the results and generates a Markdown-formatted report, giving you a clear summary of your agent’s performance.
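
To make the data flow between the four steps concrete, here is a minimal Python sketch. It is not Rogue's code or API: every name in it (`Config`, `make_agent`, `generate_scenarios`, `run_evaluation`, `build_report`) is hypothetical, the LLM calls are replaced by deterministic stubs, and each scenario is reduced to a single request and reply rather than a full multi-turn conversation.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Config:
    """Step 1: what the user supplies up front (all fields are hypothetical)."""
    agent_url: str       # endpoint of the agent under test
    auth_token: str      # authentication detail for that agent
    scenario_model: str  # LLM chosen for scenario generation
    judge_model: str     # LLM chosen for judging


@dataclass
class Result:
    scenario: str
    reply: str
    passed: bool


def make_agent(config: Config) -> Callable[[str], str]:
    # Stub: a real client would send the message to config.agent_url.
    def agent(message: str) -> str:
        return "I can only help with pizza orders."
    return agent


def generate_scenarios(business_context: str) -> List[str]:
    # Step 2 stub: a real system would prompt an LLM with the business context.
    return [
        f"Agent handles a normal request ({business_context})",
        f"Agent declines an off-topic request ({business_context})",
    ]


def run_evaluation(
    scenarios: List[str],
    agent: Callable[[str], str],
    judge: Callable[[str, str], bool],
) -> List[Result]:
    # Step 3: one exchange per scenario, each judged independently.
    results = []
    for scenario in scenarios:
        reply = agent(scenario)
        results.append(Result(scenario, reply, judge(scenario, reply)))
    return results


def build_report(results: List[Result]) -> str:
    # Step 4: summarize the outcomes as Markdown.
    passed = sum(r.passed for r in results)
    lines = [
        "# Evaluation Report",
        "",
        f"Passed {passed} of {len(results)} scenarios.",
        "",
    ]
    for r in results:
        status = "PASS" if r.passed else "FAIL"
        lines.append(f"- **{status}**: {r.scenario}")
    return "\n".join(lines)


def keyword_judge(scenario: str, reply: str) -> bool:
    # Stub judge: a real judge would ask an LLM whether the reply is acceptable.
    return "pizza" in reply.lower()


config = Config(
    agent_url="http://localhost:8000",  # placeholder value
    auth_token="example-token",         # placeholder value
    scenario_model="example-model",     # placeholder value
    judge_model="example-model",        # placeholder value
)
scenarios = generate_scenarios("takes pizza orders")
results = run_evaluation(scenarios, make_agent(config), keyword_judge)
report = build_report(results)
print(report)
```

Passing the agent and the judge in as plain callables keeps the loop in `run_evaluation` independent of any particular transport or model, which is one way to swap the stubs for real clients without touching the rest of the pipeline.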