We’re releasing prompt-based teen safety policies for gpt-oss-safeguard. They’re designed to help you identify and moderate teen-specific content, and turn safety requirements into classifiers for real-time filtering or offline analysis. https://t.co/t5i1CZNLnF
OpenAI Releases Prompt-Based Teen Safety Policies for gpt-oss-safeguard
OpenAI· Updated
OpenAI released open-source, prompt-based teen safety policies for gpt-oss-safeguard, its 20B open-weight safety classifier. Developers building on open-weight models often start from scratch on safety rules — these policies provide a tested, extensible foundation covering six teen-specific risk categories.
The core problem is the gap between high-level safety goals and the precise operational rules classifiers require. Teams building on open-weight models frequently start from scratch, leading to inconsistent enforcement. These definitions were developed with input from Common Sense Media and everyone.ai to reflect teens' distinct developmental needs.
Developers can pull these policies from the ROOST Model Community on GitHub and apply them to gpt-oss-safeguard. They're designed to be extended to new risk areas, translated into other languages, and layered with additional safeguards — not used as a final solution.
Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →




