Policy Review Evaluation

Sample policy document used for testing AI model policy review capabilities

What Models Must Do

AI models are provided with safety programs, written procedures, and policy documents and must:

  • Review the document for completeness and accuracy
  • Identify gaps, errors, and areas for improvement
  • Demonstrate understanding of regulatory requirements and industry standards
  • Provide specific, actionable feedback and recommendations
  • Evaluate compliance with OSHA and relevant safety regulations

Scoring Criteria

Certified safety professionals evaluate model responses based on:

  • Completeness of findings - Did the model identify all significant gaps?
  • Accuracy of regulatory knowledge - Are citations and requirements correct?
  • Quality of recommendations - Are suggestions specific, practical, and actionable?
  • Understanding of best practices - Does the model recognize industry standards?
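
As a rough illustration, one reviewer's ratings against the four criteria above could be recorded and combined as in the sketch below. The source does not state a rating scale or how the criteria are weighted, so the 0 to 5 scale, the equal weighting, and the names `RubricScore`, `MAX_RATING`, and `overall` are all assumptions made for this example.

```python
from dataclasses import dataclass, fields

MAX_RATING = 5  # assumed scale; the evaluation's actual scale is not documented here


@dataclass(frozen=True)
class RubricScore:
    """One reviewer's ratings for one model response (hypothetical structure)."""

    completeness: int     # did the model identify all significant gaps?
    accuracy: int         # are citations and requirements correct?
    recommendations: int  # are suggestions specific, practical, and actionable?
    best_practices: int   # does the model recognize industry standards?

    def __post_init__(self):
        # Reject ratings outside the assumed 0..MAX_RATING range.
        for f in fields(self):
            value = getattr(self, f.name)
            if not 0 <= value <= MAX_RATING:
                raise ValueError(
                    f"{f.name} must be between 0 and {MAX_RATING}, got {value}"
                )

    def overall(self) -> float:
        """Equal-weight mean of the four criteria (an assumed aggregation)."""
        values = [getattr(self, f.name) for f in fields(self)]
        return sum(values) / len(values)


score = RubricScore(completeness=3, accuracy=4, recommendations=5, best_practices=4)
print(score.overall())  # 4.0
```

Keeping the four ratings as separate fields, rather than storing only a total, preserves which criterion a response was weak on.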

Sample Policy: Exposure Response Program

This is one example of a policy document provided to models for review.

Testing Context

Models receive policy documents without additional prompting beyond "Review this safety policy and provide feedback." This tests their ability to understand context, recognize relevant standards, and provide useful analysis without extensive guidance.
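
The minimal-prompt setup described above might be assembled along the following lines. Only the instruction sentence comes from the source. The function name `build_prompt` and the layout (instruction first, then a blank line, then the policy text) are assumptions for illustration.

```python
# The single instruction quoted in the testing context.
INSTRUCTION = "Review this safety policy and provide feedback."


def build_prompt(policy_text: str) -> str:
    """Combine the fixed instruction with the policy text (hypothetical layout)."""
    cleaned = policy_text.strip()
    if not cleaned:
        raise ValueError("policy_text is empty")
    # No extra guidance is added: the model sees only the instruction and the policy.
    return f"{INSTRUCTION}\n\n{cleaned}"
```

For example, `build_prompt(open("policy.txt").read())` would produce the full prompt for a policy stored in a local file, where `policy.txt` is a placeholder path.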