Anthropic Hands Off AI Safety Tool Petri 3.0 to Meridian Labs

Anthropic has released Petri 3.0, a new version of its open-source AI alignment tool, and handed development over to Meridian Labs. The move is meant to make the tool more neutral and encourage broader industry adoption.

Why Anthropic gave up control

By transferring Petri 3.0 to an independent organization, Anthropic hopes to remove any perception that the tool favors its own research priorities. Meridian Labs now oversees all future updates, bug fixes, and community contributions. The company said the change should help align the tool with a wider range of safety approaches used across the AI field.

What Petri 3.0 does

The tool helps developers test whether their AI models behave as intended, flagging outputs that deviate from specified safety rules. Version 3.0 adds support for more complex model architectures and includes a library of pre-built tests for common failure modes, such as generating harmful instructions or leaking private data. Anthropic open-sourced the original Petri in 2023, but version 3.0 is the first major update since then.

What happens next

Meridian Labs now runs the project’s repository and will accept pull requests from outside contributors. The lab plans to publish a roadmap for future releases within the next 60 days. Developers can download Petri 3.0 today from the project’s GitHub page, which has already been transferred to Meridian Labs’ account.
