OpenAI Integrates Moderation Scores Directly into Generation APIs

OpenAI

Jun 4, 2026

OpenAI now provides moderation scores directly within its Responses API and Completions API. This allows developers to get safety signals for both input and generated content in a single request, simplifying the integration of content policies into AI applications.

The scores are available in the Responses API (a new API primitive for agents) and the Completions API (the standard interface for text generation). Developers can include a moderation object in a generation request to receive signals for both the input and the generated output. The feature uses the `omni-moderation-latest` model, which classifies harmful content in both text and images.

Moderation Model: `omni-moderation-latest`
Supported Inputs: Text, Images
Image File Size Limit: 20 MB
Moderation Endpoint Cost: Free
Streaming Behavior: Scores arrive after full output
Tool-Calling Coverage: Tool-call arguments, tool outputs in conversation content
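
The 20 MB image limit listed above can be checked locally before a request is sent. The following is a minimal sketch, not part of OpenAI's SDK: the helper name `image_within_limit` is hypothetical, and reading "20 MB" as 20 * 1024 * 1024 bytes is an assumption, since the source does not say whether the limit uses binary or decimal megabytes.

```python
import os

# Assumption: "20 MB" is interpreted as 20 * 1024 * 1024 bytes.
# The source does not specify binary vs. decimal megabytes.
MAX_IMAGE_BYTES = 20 * 1024 * 1024


def image_within_limit(path: str, limit: int = MAX_IMAGE_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

A caller could skip or downscale any image for which this returns False instead of sending it for moderation.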

Because the scores arrive in the same request flow as the generated content, developers no longer need to make a separate call to a moderation endpoint. They can decide sooner whether to log an output, route it for human review, or block it, which simplifies enforcing application policies.
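
The log, review, or block decision described above could be sketched as a small routing function. This is an illustration under stated assumptions: the moderation result is treated as a plain dictionary with the `flagged` and `category_scores` fields, scores are assumed to lie between 0 and 1, and the function name and thresholds are hypothetical values an application would tune for its own policy.

```python
from typing import Any, Mapping

# Hypothetical thresholds; not values recommended by OpenAI.
LOG_THRESHOLD = 0.2
REVIEW_THRESHOLD = 0.5
BLOCK_THRESHOLD = 0.9


def decide_action(moderation: Mapping[str, Any]) -> str:
    """Map a moderation result to 'allow', 'log', 'review', or 'block'."""
    scores = moderation.get("category_scores", {})
    # Use the highest category score as the overall severity signal.
    top = max(scores.values(), default=0.0)
    if top >= BLOCK_THRESHOLD:
        return "block"
    if moderation.get("flagged", False) or top >= REVIEW_THRESHOLD:
        return "review"
    if top >= LOG_THRESHOLD:
        return "log"
    return "allow"
```

Using the single highest score keeps the sketch short; a production policy would more likely set a separate threshold per category.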

Developers can use the inline scores to enforce their application's content policy, for example by filtering outputs or flagging content for review. The moderation endpoint itself is free to use. The `omni-moderation-latest` model covers several harm categories across text and images and accepts image files up to 20 MB. For streaming responses, moderation scores arrive only after the full output is available.
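
Since streamed scores arrive only after the full output, an application that wants to block flagged text before anyone sees it has to buffer the stream. The sketch below simulates that with a hypothetical event shape, `("text", str)` for output chunks and `("moderation", dict)` for the final scores. The real stream format is not described in the source and may differ.

```python
from typing import Any, Iterable, List, Optional, Tuple

# Hypothetical event shape used only for this illustration.
Event = Tuple[str, Any]


def release_if_clean(events: Iterable[Event]) -> Optional[str]:
    """Buffer streamed text; return it only if the final scores are not flagged."""
    chunks: List[str] = []
    moderation = None
    for kind, payload in events:
        if kind == "text":
            chunks.append(payload)
        elif kind == "moderation":
            moderation = payload
    # Fail closed: withhold the text if no scores arrived or the output was flagged.
    if moderation is None or moderation.get("flagged", True):
        return None
    return "".join(chunks)
```

Buffering trades away the latency benefit of streaming. An application that shows chunks as they arrive can only react to the scores after the fact, for example by retracting or logging the output.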

View the full update on developers.openai.com

OpenAI Developers

@OpenAIDevs, Jun 4

Moderation scores are now available in the Responses API and Completions API. Return moderation signals in the same request flow as generation, then decide how your app uses them for logging, routing, review, or blocking. https://t.co/0FMSLek2je


Still wondering? A few quick answers below.

What are moderation scores in OpenAI's APIs?

Moderation scores are signals that indicate the presence of potentially harmful content in text or images. They include a `flagged` status, the specific `categories` of harm detected, and `category_scores` representing the model's confidence for each category.

Which OpenAI APIs now include moderation scores?

Moderation scores are now available directly within OpenAI's Responses API and Completions API. Developers receive these safety signals as part of their content generation requests, which streamlines the moderation workflow.

What content types can the `omni-moderation-latest` model moderate?

The `omni-moderation-latest` model accepts both text and image inputs for moderation. It can detect various harm categories; some apply to both text and images (e.g., `violence`, `self-harm`), while others are text-only (e.g., `harassment`, `hate`).

How do moderation scores help developers?

Developers can use moderation scores to enforce their application's content policies. This includes logging flagged content, routing it for human review, or blocking it entirely. Receiving scores inline with generation simplifies the integration of safety checks into AI-powered applications.

Is the moderation endpoint free to use?

Yes, the moderation endpoint itself is free to use. Developers can implement content safety measures without incurring additional costs for standalone moderation requests.

Every HeadsUpAI update is written based on its original source and reviewed before it's published.
