Better inference efficiency, lower costs, broader access.
MiMo-V2.5 Series API pricing is now permanently reduced, by up to 99% compared to previous pricing.
Unified pricing across all context lengths.
MiMo Token Plans have also been upgraded:
- 5–8× more usable tokens at the same price
- Simpler and more transparent billing rules
As a thank-you to current users, all current Token Plan credits will be fully reset.
MiMo-V2.5-TTS remains free for a limited time.
Effective May 26 at 6:00 PM PDT.
These improvements are powered by continued inference optimization and serving efficiency upgrades across the MiMo stack.
We'll also publish a detailed technical blog on the inference optimizations later. Stay tuned.
Xiaomi MiMo Slashes V2.5 API Pricing by Up to 99 Percent
Xiaomi MiMo has permanently reduced pricing for its MiMo-V2.5 model series by up to 99%. The update removes tiered pricing for long-context inputs and charges a single rate regardless of context length. It follows the launch of the Xiaomi MiMo-V2.5 series, which introduced native multimodality and reasoning capabilities.
- Price reduction: up to 99%
- Token Plan quota: 5–8x increase
- Context window: 1M tokens
- KV Cache data transfer: reduced to nearly 1/7 of original volume
- Effective date: May 26, 2026
This move intensifies the industry-wide shift toward low-cost inference. By slashing costs, Xiaomi MiMo answers DeepSeek's V4 Pro API discount and lowers the economic barrier for autonomous agents. The efficiency gains stem from Sliding Window Attention, which cuts KV Cache data transfer by roughly 85%.
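The 85% figure is consistent with the "nearly 1/7" figure quoted for KV Cache data. A quick arithmetic check, treating 1/7 as exact (an assumption, since the source only says "nearly"):

```python
from fractions import Fraction

# Assumption: treat "nearly 1/7 of original volume" as exactly 1/7.
remaining = Fraction(1, 7)
reduction = 1 - remaining  # share of the original transfer volume that is eliminated

reduction_pct = float(reduction) * 100
print(f"{reduction_pct:.1f}% less data transferred")  # prints "85.7% less data transferred"
```

Rounded down, that is the roughly 85% reduction cited above.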
You can access the new rates immediately via the MiMo API, with Token Plan quotas increased by 5–8x at no additional cost. All active credits for current subscribers have been fully reset to reflect the new limits. While the initial creator incentive program has concluded, benefits for Apache Software Foundation members remain active.
Xiaomi MiMo
@XiaomiMiMo
Still wondering? A few quick answers below.
Xiaomi has permanently reduced the API pricing for the MiMo-V2.5 series by up to 99 percent compared to previous rates. The new billing system is simpler and no longer differentiates by input context length, so developers pay the same per-token rate regardless of how much of the one-million-token window a request uses.
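To illustrate the difference between unified and context-tiered billing, here is a minimal sketch. Every rate, tier boundary, and function name in it is a hypothetical placeholder; the source does not state MiMo's actual prices or how the previous tiers were structured.

```python
from decimal import Decimal

# All rates below are HYPOTHETICAL placeholders, not MiMo's actual prices.
FLAT_RATE_PER_M = Decimal("1.00")  # price per 1M input tokens at any context length

# Hypothetical legacy tiers: (max context length in tokens, price per 1M tokens).
TIERED_RATES_PER_M = [
    (256_000, Decimal("1.00")),
    (1_000_000, Decimal("2.00")),
]

def flat_cost(input_tokens: int) -> Decimal:
    """Unified billing: one per-token rate regardless of context length."""
    return FLAT_RATE_PER_M * input_tokens / 1_000_000

def tiered_cost(input_tokens: int) -> Decimal:
    """Tiered billing (assumed model): the whole request is charged at the
    rate of the tier its context length falls into."""
    for limit, rate in TIERED_RATES_PER_M:
        if input_tokens <= limit:
            return rate * input_tokens / 1_000_000
    raise ValueError("request exceeds the context window")

print(flat_cost(800_000), tiered_cost(800_000))
```

Under these made-up numbers, a short request costs the same either way, while a long-context request is cheaper under the flat rate because no long-context surcharge applies.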
The price cuts are driven by significant inference optimizations in the MiMo technical stack. By implementing Sliding Window Attention on top of SGLang HiCache, the team reduced the data transfer volume for the KV Cache to nearly one-seventh of previous levels. This increased the number of cacheable tokens fivefold, drastically improving throughput and efficiency.
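As a rough illustration of why sliding window attention shrinks the KV Cache, the sketch below keeps key/value entries only for the most recent tokens. It is a generic toy model, not MiMo's or SGLang HiCache's implementation; the class name and window size are hypothetical.

```python
from collections import deque

class SlidingWindowKVCache:
    """Toy cache that keeps key/value entries for only the last `window` tokens."""

    def __init__(self, window: int):
        self.window = window
        self.entries = deque(maxlen=window)  # oldest entry is evicted automatically

    def append(self, position: int, kv: tuple) -> None:
        self.entries.append((position, kv))

    def visible_positions(self) -> list:
        """Token positions the next query is allowed to attend to."""
        return [pos for pos, _ in self.entries]

cache = SlidingWindowKVCache(window=4)  # hypothetical window size
for pos in range(10):
    cache.append(pos, (f"k{pos}", f"v{pos}"))

print(len(cache.entries))         # 4: bounded by the window, not the sequence length
print(cache.visible_positions())  # [6, 7, 8, 9]
```

Because the stored state stops growing with sequence length, there is less cache data to move between memory tiers, which is the kind of saving the figures above describe.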
The 100 Trillion Token Creator Incentive Plan officially concluded on May 26, 2026, after all tokens were claimed by developers ahead of schedule. While this specific program has ended, the exclusive benefit program for Apache Software Foundation committers remains available long-term and is not affected by the conclusion of the broader incentive plan.
Existing Token Plan subscribers now receive five to eight times more usable tokens for the same price under the upgraded billing rules. As a one-time benefit, Xiaomi has fully reset the credit quotas for all users with an active plan, allowing them to start fresh with the significantly higher token allowances effective immediately.
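Getting five to eight times more tokens for the same fee implies a lower effective price per token. A small worked calculation, assuming the plan fee is unchanged and the multiplier applies uniformly (the function name is illustrative):

```python
from fractions import Fraction

def per_token_price_cut(multiplier: int) -> Fraction:
    """Fractional drop in effective per-token price when the same fee
    buys `multiplier` times as many tokens."""
    return 1 - Fraction(1, multiplier)

for m in (5, 8):
    cut = per_token_price_cut(m)
    print(f"{m}x tokens -> {float(cut) * 100:.1f}% lower effective price per token")
# prints:
# 5x tokens -> 80.0% lower effective price per token
# 8x tokens -> 87.5% lower effective price per token
```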
The MiMo-V2.5-TTS model, which provides text-to-speech capabilities, remains free for a limited time following this pricing update. This is part of Xiaomi's broader effort to encourage developers to integrate and experience the full MiMo-V2.5 series, which also includes the Pro and Omni flagship models designed for complex reasoning and multimodal tasks.





