Knowledge check · 8 questions
Checks understanding of "Evaluating Agent Performance: SWE-bench, WebArena, GAIA" for teen creators on the agentic track.
Evaluating Agent Performance: SWE-bench, WebArena, GAIA - Quick Check