Learning from other domains to advance AI evaluation and testing

Office of Responsible AI

August 2025

Drawing on our analysis of eight case studies prepared by independent academic and industry experts, this white paper proposes next steps to address AI evaluation and testing challenges and opportunities by:

Synthesizing insights from the eight case studies, also published separately, and extracting lessons relevant to AI (Part 1);
Surveying key multistakeholder initiatives that are driving AI evaluation science and practice forward (Part 2); and
Presenting recommendations for policymakers aiming to advance the AI evaluation and testing ecosystem and strengthen AI governance (Part 3).