HeadsUpAI

DeepSeek Makes 75 Percent Discount on V4 Pro API Permanent

DeepSeek, an AI research lab focused on cost-efficient frontier performance, has made its 75 percent discount on the DeepSeek-V4-Pro API permanent. The discount began as a limited-time promotion; pricing for the flagship model, which features a 1-million-token context window, will now remain at one-fourth of its original list price indefinitely.
Context window: 1M tokens
Pricing (input, cache hit): $0.003625 per 1M tokens
Pricing (input, cache miss): $0.435 per 1M tokens
Pricing (output): $0.87 per 1M tokens
Thinking mode: Supported

This shift solidifies the aggressive pricing strategy introduced during the DeepSeek-V4 preview launch. By removing the "pricing cliff" previously set for May 2026, the lab is pressuring competitors to justify higher margins. The move mirrors a broader trend of API price cuts, such as HeyGen's API cost reductions, aimed at lowering barriers to programmatic use at scale.

You can access the permanent rates immediately via the DeepSeek API, with input costs starting at $0.003625 per million tokens for cache hits. The model supports both thinking and non-thinking modes, which makes it a viable foundation for high-volume reasoning workflows, including hosted offerings such as Fireworks AI's DeepSeek V4 Pro hosting.

DeepSeek (@deepseek_ai) on X:

We are making our discount permanent! 🎉 Enjoy building with DeepSeek-V4-Pro and bring your innovative ideas to life! 🚀 https://t.co/V8atbTaogH

1.6k retweets, 14k likes

Still wondering? A few quick answers below.

The DeepSeek-V4-Pro API is now permanently priced at one-fourth of its original list price. For every one million tokens, the cost is $0.003625 for input cache hits, $0.435 for input cache misses, and $0.87 for output. These rates were previously part of a temporary 75 percent discount promotion.
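
For readers who want to check the "one-fourth" claim against the numbers, here is a minimal Python sketch that backs out the implied pre-discount list prices from the discounted rates quoted above. It assumes the 75 percent discount applies uniformly to all three price tiers; the article does not state the original list prices directly, so the results are derived values, not official figures. The names `DISCOUNTED_RATES` and `implied_list_price` are illustrative only.

```python
# Discounted per-1M-token rates quoted in this article (USD).
DISCOUNTED_RATES = {
    "input_cache_hit": 0.003625,
    "input_cache_miss": 0.435,
    "output": 0.87,
}

# 75 percent off means one-fourth of the list price remains.
# Assumption: the same discount applies to every tier.
DISCOUNT = 0.75


def implied_list_price(discounted: float, discount: float = DISCOUNT) -> float:
    """Back out the pre-discount price from a discounted price."""
    return discounted / (1.0 - discount)


# Round to 6 decimal places to hide floating-point noise.
implied = {
    name: round(implied_list_price(rate), 6)
    for name, rate in DISCOUNTED_RATES.items()
}

for name, price in implied.items():
    print(f"{name}: ${price} per 1M tokens (implied list price)")
```

Under that assumption, the implied list prices work out to roughly $0.0145, $1.74, and $3.48 per million tokens for cache hits, cache misses, and output, respectively.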

The 75 percent discount on DeepSeek-V4-Pro no longer has an expiration date. While the promotional period was originally scheduled to end on May 31, 2026, DeepSeek has officially made these discounted rates the permanent pricing for the model. Users can continue building with these lower costs indefinitely without a scheduled price increase.

DeepSeek-V4-Pro is a frontier-level model designed for advanced reasoning, coding, and complex agent workflows. It features a standard 1-million-token context window, allowing it to process large amounts of data in a single request. The model also supports a thinking mode for internal reasoning and a non-thinking mode for faster, direct responses.

DeepSeek-V4-Pro uses an input caching system to reduce costs for developers. When the model processes text that has been recently seen and cached, the price drops to $0.003625 per million tokens. If the input is new and results in a cache miss, the price is $0.435 per million tokens, which is still significantly discounted.
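
To see how much caching matters in practice, the following Python sketch estimates the cost of a single request from its token counts, using the three rates quoted in this article. It is a rough estimator under stated assumptions, not DeepSeek's billing logic: it assumes input tokens split cleanly into cached and uncached portions, that each portion is billed linearly at its per-million rate, and that `output_tokens` already includes any tokens billed as output. The function name `estimate_cost` and the example token counts are hypothetical.

```python
PER_MILLION = 1_000_000

# Per-1M-token rates quoted in this article (USD).
RATE_CACHE_HIT = 0.003625
RATE_CACHE_MISS = 0.435
RATE_OUTPUT = 0.87


def estimate_cost(
    cached_input_tokens: int,
    uncached_input_tokens: int,
    output_tokens: int,
) -> float:
    """Estimate the USD cost of one request (linear billing assumed)."""
    if min(cached_input_tokens, uncached_input_tokens, output_tokens) < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        cached_input_tokens * RATE_CACHE_HIT
        + uncached_input_tokens * RATE_CACHE_MISS
        + output_tokens * RATE_OUTPUT
    )
    return total / PER_MILLION


# Hypothetical long-context request: 900,000 input tokens, 2,000 output tokens.
# Case 1: 800,000 of the input tokens are served from cache.
with_cache = estimate_cost(800_000, 100_000, 2_000)
# Case 2: nothing is cached, so every input token is a cache miss.
without_cache = estimate_cost(0, 900_000, 2_000)

print(f"with cache:    ${with_cache:.5f}")
print(f"without cache: ${without_cache:.5f}")
```

In this hypothetical example the mostly cached request costs about $0.048, against about $0.393 when nothing is cached, which illustrates why workloads that reuse a long shared prefix benefit most from the cache-hit rate.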

DeepSeek-V4-Pro is available for commercial use through the DeepSeek API. Developers and businesses can integrate the model into their own applications and services at the permanent discounted pricing. The model is optimized for production-grade agentic engineering, where autonomous tasks require high-volume reasoning and long-context processing.
