OpenAI Proposes Mandatory AI Evaluations, Diverging from White House Plan

OpenAI has proposed mandatory safety evaluations for advanced artificial intelligence models, a policy stance that breaks from the White House's existing approach to AI regulation. The proposal, if adopted, could extend oversight beyond national security agencies to civilian regulators, a shift with potential consequences for how quickly new AI technologies reach the market.

A break from the White House plan

The White House's current framework for AI regulation leans on voluntary commitments and executive orders focused heavily on national security risks. OpenAI's plan would replace that voluntary system with required testing before advanced models are deployed. The company argues that binding evaluations would create a clearer baseline for safety across the industry, rather than relying on case-by-case agency reviews.

That puts OpenAI at odds with the administration's approach, which has so far resisted mandatory pre-deployment testing. The difference isn't just procedural: it reflects a deeper disagreement over who should decide when an AI system is safe enough to release.

From national security to civilian oversight

The proposal would hand significant authority to civilian agencies, not just defense or intelligence bodies. That's a notable shift. Under the current White House plan, the Pentagon and the Department of Homeland Security take the lead on evaluating frontier AI models. OpenAI's mandatory requirement would open that process to agencies like the Commerce Department or a new federal AI office.

Proponents of broader civilian oversight say it can account for a wider range of risks, such as bias, misinformation, and economic disruption, that national security agencies aren't built to handle. Critics worry that putting multiple civilian agencies in charge could slow down approvals and create regulatory overlap.

Mandatory evaluations could change how AI companies plan their product timelines. A lengthy testing phase before deployment would mean longer development cycles and higher costs. Startups with fewer resources might struggle to keep up, while larger players with dedicated compliance teams could absorb the burden more easily.

OpenAI itself would be subject to the tests it's proposing. The company has acknowledged that the evaluations could delay its own releases, but says the trade-off is necessary for public trust. Whether the broader tech industry agrees is another question. Some companies have already argued that mandatory pre-deployment checks would stifle competition and slow progress on beneficial uses of AI.

No specific legislation has been introduced yet based on OpenAI's proposal. The next step is likely to be a series of hearings where lawmakers will ask what such evaluations would look like, who would run them, and how to keep them from becoming a bottleneck. The White House hasn't publicly responded to the plan.
