Learning from other domains to advance AI evaluation and testing

  • Office of Responsible AI

Drawing on our analysis of eight case studies prepared by independent academic and industry experts, this white paper proposes next steps to address AI evaluation and testing challenges and opportunities by:

  • Synthesizing insights from the eight case studies, also published separately, and
    extracting lessons relevant to AI (Part 1);
  • Surveying key multistakeholder initiatives that are driving AI evaluation science and
    practice forward (Part 2); and
  • Presenting recommendations for policymakers aiming to advance the AI evaluation and
    testing ecosystem and strengthen AI governance (Part 3).