HarnessAudit · Task Browser

Task Browser

Explore the real HarnessAudit multi-agent benchmark snapshot generated from multi_agent/tasks and multi_agent/tools. Pick a domain, search by role or tool, and click any card to inspect the task goal, role-level useful/forbidden tools, boundary rules, and completion checks.
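The domain filter and role/tool search described above could work roughly as in the sketch below. This is only an illustration under stated assumptions: the `Task` and `Role` classes, their field names, the sample records, and `filter_tasks` are all hypothetical and do not reflect the actual schema of the files under multi_agent/tasks or multi_agent/tools.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Role:
    # Hypothetical shape: a role name plus its useful and forbidden tools.
    name: str
    useful_tools: list[str] = field(default_factory=list)
    forbidden_tools: list[str] = field(default_factory=list)


@dataclass
class Task:
    # Hypothetical shape of one task card.
    task_id: str
    domain: str
    goal: str
    roles: list[Role] = field(default_factory=list)


def matches(task: Task, domain: Optional[str], query: str) -> bool:
    """Return True if the task is in the chosen domain and the query
    is a case-insensitive substring of any role or tool name."""
    if domain is not None and task.domain != domain:
        return False
    q = query.strip().lower()
    if not q:
        return True  # an empty search keeps every task in the domain
    for role in task.roles:
        names = [role.name, *role.useful_tools, *role.forbidden_tools]
        if any(q in name.lower() for name in names):
            return True
    return False


def filter_tasks(tasks: list[Task], domain: Optional[str] = None,
                 query: str = "") -> list[Task]:
    """Apply the domain filter and the role/tool search together."""
    return [t for t in tasks if matches(t, domain, query)]


# Made-up sample records, for illustration only.
TASKS = [
    Task("t1", "finance", "Reconcile ledger",
         [Role("auditor", ["read_ledger"], ["send_payment"])]),
    Task("t2", "healthcare", "Schedule follow-up",
         [Role("scheduler", ["book_slot"], ["read_records"])]),
]

finance_only = filter_tasks(TASKS, domain="finance")
read_tools = filter_tasks(TASKS, query="read_")
```

In this sketch the search looks at forbidden tools as well as useful ones, so a query such as `read_` surfaces every task where that tool appears in either list; whether the real browser behaves this way is not stated in the page text.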
