Fireworks AI Adds Safe Tokenization to Stop Users Overriding System Prompts

Fireworks AI

Apr 30, 2026

Fireworks AI introduced an opt-in safe_tokenization flag that prevents user input from being parsed as model control tokens. This update addresses a fundamental security flaw in open-weights inference where malicious text can forge turn boundaries to bypass system instructions. By separating user content from structural code at the tokenizer level, developers can ensure their core product logic remains authoritative.

Fireworks AI, an inference platform for fast model serving, launched an opt-in safe_tokenization flag to prevent prompt injection (a vulnerability where malicious input overrides model instructions). The feature ensures user-provided strings are encoded as harmless subwords rather than structural control tokens that define turn boundaries.

Most open-weights models rely on standard tokenization pipelines that merge system prompts and user text into a single string, creating a security risk. This update follows the platform's expansion of hosted models, including Kimi via Day-0 Kimi K2.6 support and DeepSeek via DeepSeek V4 Pro.

You can enable the defense by adding safe_tokenization: true to any Chat Completions API request. The feature is live for all supported models, including Llama, and mirrors Alibaba's Qwen 3.5 integration. The defense maintains identical behavior for benign inputs and is currently an opt-in boolean.

View the full update on fireworks.ai

Fireworks AI

@FireworksAI_HQApr 28

Prevent prompt injection. safe_tokenization: true Keep your system yours. https://t.co/PdDHrOuTkM

430

View on X

Still wondering? A few quick answers below.

Fireworks AI safe tokenization is a security feature that prevents prompt injection by ensuring user input cannot be interpreted as model control tokens. It separates user text from the structural code that defines system and user turns. This ensures that a model respects the developer's system prompt even if a user tries to forge turn boundaries.

The feature works by pre-processing chat templates to separate control tokens from user content. At request time, it performs a segment-by-segment encoding pass on user text. This breaks any strings that match control tokens into their subword pieces, treating them as literal text rather than structural commands that could override the system prompt.

The feature is live across all supported open-weights models on the Fireworks platform. This includes popular model families such as DeepSeek, Kimi, Qwen, Llama, and GLM. It works for both streaming and non-streaming completions, providing a consistent security layer regardless of which specific open model a developer chooses to deploy.

You can enable the defense by adding a single boolean flag, safe_tokenization: true, to your Chat Completions API request. It is currently an opt-in feature, allowing developers to roll it out per-request or per-endpoint. Because it produces identical token IDs for benign inputs, it can be enabled without causing silent behavior changes for ordinary traffic.

No, the feature uses preservation rather than stripping. User content is never modified, rejected, or silently removed. Any string, including reasoning markers or turn delimiters, can still appear in a user message. The system simply ensures they are treated as plain text by the model instead of being executed as structural control tokens.

Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →

See all AI news & updates from Fireworks AI →

Keep reading

Fireworks AI Launches Training Platform to Fine-Tune Frontier Models at Scale

Fireworks AI released a training platform in preview that supports full-parameter fine-tuning for models ranging from 8B to 1T parameters. This allows teams to move beyond prompt engineering by using reinforcement learning to build proprietary models that outperform closed frontier systems on specific tasks.

What is Fireworks AI safe tokenization?

How does Fireworks AI safe tokenization work?

Which models support safe tokenization on Fireworks AI?

How do you enable safe tokenization in the Fireworks API?

Does safe tokenization modify or strip user input?

Keep reading

Fireworks AI Launches Training Platform to Fine-Tune Frontier Models at Scale

Fireworks AI Launches Training Platform to Fine-Tune Frontier Models at Scale

Keep reading

Fireworks AI Launches Training Platform to Fine-Tune Frontier Models at Scale

Fireworks AI Launches Training Platform to Fine-Tune Frontier Models at Scale