OpenAI has introduced a public-facing "Safety Evaluations Hub" to share how its models perform on critical safety evaluations, including hallucinations, jailbreak resistance, and harmful content generation. The move follows criticism of the company's disclosure practices, particularly after recent controversies over pre-release testing and model behavior.

OpenAI Launches Public Hub to Share Safety Evaluations – Key Points
Launch of OpenAI Safety Evaluations Hub (May 15, 2025):
OpenAI launched a webpage dedicated to publishing results from its ongoing safety evaluations. The Safety Evaluations Hub tracks model performance against hallucination risks, jailbreak attempts, and toxic output generation, with updates planned around major model releases.
Ongoing and Scalable Safety Evaluations:
The company committed to continuously updating the hub with safety evaluations as its models evolve. OpenAI emphasized that this initiative supports scalable testing methods and contributes to a transparent ecosystem for responsible AI development.
Partial View of Broader Safety Evaluations Process:
OpenAI clarified that the hub presents only a subset of its internal safety evaluations. While full assessments still accompany official model launches, the Safety Evaluations Hub offers public access to performance snapshots between those releases.
Backlash Over o1 Model Testing Gaps:
OpenAI was criticized for not repeating safety evaluations on the final version of the o1 model. Johannes Heidecke, Head of Safety Systems, noted that evaluations were conducted on “near-final” versions and changes were minimal. He admitted that the company could have communicated this distinction more clearly.
Response to the GPT-4o “Sycophancy” Incident:
Following widespread user reports in April 2025 that GPT-4o had become overly agreeable, even in response to harmful input, OpenAI rolled back the update and pledged improvements to its safety evaluation process. The company also introduced an alpha testing program that lets users provide feedback before model launches.
Leadership Transparency Issues:
Sam Altman has been accused of misleading other OpenAI executives about the status of safety evaluations prior to his brief removal as CEO in November 2023. The company's renewed focus on public-facing safety evaluations is seen, in part, as an attempt to rebuild trust and correct governance missteps.
Meta’s Parallel Research Transparency Effort:
On the same day, Meta’s FAIR team partnered with the Rothschild Foundation Hospital to release a dataset supporting AI-powered molecular discovery. The open-access effort aligns with the broader industry trend toward sharing research and safety evaluations openly with the scientific community.
Why This Matters:
OpenAI's expanded safety evaluations initiative sets a new transparency benchmark for the AI industry. By committing to regular publication of key safety evaluations, the company acknowledges growing concerns about how AI models are tested, deployed, and monitored. The move aims to strengthen public and institutional trust, reinforce accountability, and enable external validation of AI behavior.
In related news, OpenAI has asked the White House for regulatory relief to preempt conflicting state AI regulations, warning that fragmented laws could hinder US innovation and global competitiveness.