Agent / Enterprise Knowledge Assistant
Release evaluation suite · 15 scenarios
Live RAG execution

Agent Playground

Run the same question across versions and watch the agent progress from wrong, through safer but slow, to production-ready.

Conversation
v1-production · acting as employee

How many annual leave days do full-time employees receive?

Enterprise Knowledge Assistant

Submit the question to run it with the selected version and role.

Evaluation prompts
Run production-grade checks across v2, v3, and v4 to inspect behavior changes.
Judge view
Not run yet

Pick a version and ask a question to see whether this agent is ready to ship.

Why this happened
Operational evidence, not hidden reasoning
Evidence used
The documents that shaped the answer
0 documents. No context was retrieved or exposed.