Fireworks AI Launches Infrastructure for Training Trillion Parameter MoE Models

Fireworks AI


Fireworks AI released a major update to its Training SDK featuring Blackwell-native kernels and 4D parallelism for trillion-parameter Mixture-of-Experts models. By fusing reinforcement learning losses and optimizing for asynchronous data, the platform enables frontier-grade model training that was previously restricted to elite research labs.

Fireworks AI updated its Training SDK with a specialized engine for trillion-parameter Mixture-of-Experts models such as Qwen3.5 and Kimi K2.5. The system introduces composable 4D parallelism, which automatically orchestrates sharding along the data, pipeline, context, and expert dimensions. Cursor used this infrastructure to train its Composer 2 model.
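To make the four axes concrete, here is a minimal sketch of how a job's GPUs can be laid out as a 4D grid, with one coordinate per parallelism axis. The function names, the axis order, and the 16-GPU split are illustrative assumptions, not the Fireworks Training SDK API.

```python
from itertools import product

# Hypothetical axis order; real frameworks choose their own layout.
AXES = ("data", "pipeline", "context", "expert")


def build_mesh(degrees):
    """Map each global rank to a coordinate on a 4D parallelism grid.

    degrees: dict of axis name -> number of shards along that axis.
    Ranks are assigned in row-major order, so the last axis varies fastest.
    """
    sizes = [degrees[axis] for axis in AXES]
    return {
        rank: dict(zip(AXES, coord))
        for rank, coord in enumerate(product(*(range(s) for s in sizes)))
    }


def group_along(mesh, axis, rank):
    """Ranks that match `rank` on every axis except `axis`.

    These are the peers a rank would communicate with for that axis.
    """
    me = mesh[rank]
    return [
        r for r, coord in mesh.items()
        if all(coord[a] == me[a] for a in AXES if a != axis)
    ]


# Example: 16 GPUs split as 2 (data) x 2 (pipeline) x 1 (context) x 4 (expert).
mesh = build_mesh({"data": 2, "pipeline": 2, "context": 1, "expert": 4})
print(mesh[5])
print(group_along(mesh, "expert", 5))
print(group_along(mesh, "data", 5))
```

The product of the four degrees must equal the number of GPUs in the job, which is why the axes have to be chosen together rather than independently.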

Training frontier models is increasingly an infrastructure bottleneck rather than a modeling one. The new stack uses MXFP8 kernels on NVIDIA Blackwell hardware, which the company says deliver significant speedups over BF16 without losing numerical accuracy. Fused reinforcement learning losses also make PPO roughly 2x faster by eliminating redundant forward passes.
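For reference, the quantity a PPO loss computes is the standard clipped surrogate objective, sketched below in plain Python. It needs per-token log-probabilities under both the current policy and the policy that generated the samples. The source does not describe how the SDK fuses this computation, so the sketch shows only the textbook math; the function name and the default `clip_eps` value are assumptions.

```python
import math


def ppo_clipped_loss(new_logprobs, old_logprobs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate loss over a batch of sampled tokens.

    new_logprobs: log-probabilities under the policy being trained.
    old_logprobs: log-probabilities under the policy that sampled the data.
    advantages:   advantage estimate for each sample.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_logprobs, old_logprobs, advantages):
        # Probability ratio between the current and the sampling policy.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # PPO maximizes the pessimistic (smaller) objective; negate for a loss.
        total += -min(ratio * adv, clipped * adv)
    return total / len(advantages)


# A ratio of 2 with a positive advantage is clipped to 1.2.
print(ppo_clipped_loss([math.log(2.0)], [0.0], [1.0]))
# The same ratio with a negative advantage is not clipped.
print(ppo_clipped_loss([math.log(2.0)], [0.0], [-1.0]))
```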

You can now access these training shapes through the Training SDK to fine-tune models at context lengths up to one million tokens. For resource-constrained environments, the platform supports LoRA fine-tuning of trillion-parameter models on a single 8-GPU node using 4x expert quantization. Managed fine-tuning and custom training loops are available via the API.
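A rough sense of why LoRA fits on far less hardware than full fine-tuning comes from counting trainable parameters. LoRA freezes a weight matrix of shape d_out x d_in and trains two small matrices of rank r in its place. The layer shape and rank below are hypothetical values chosen for illustration, not figures from Fireworks AI.

```python
def full_params(d_out, d_in):
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters in a LoRA adapter pair.

    B has shape (d_out, rank) and A has shape (rank, d_in).
    """
    return rank * (d_out + d_in)


# Hypothetical layer shape and adapter rank.
d_out, d_in, rank = 4096, 4096, 16

full = full_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)
fraction = lora / full

print(full, lora, fraction)
```

The frozen base weights still have to fit in memory, which is the part that quantizing the experts addresses.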

Fireworks AI (@FireworksAI_HQ) on X:

Training trillion-parameter MoEs is an infra problem disguised as a modeling problem. So we built the infra solution. Cursor used it to train Composer 2. Now it's available for Kimi K2.5, Qwen3.5 397B, MiniMax M2.5, and more:

→ Fused RL loss (~2x faster PPO)
→ MXFP8 expert kernels on Blackwell
→ Composable 4D parallelism
→ 1M+ token context training validated

Here's how it all works ↓ https://t.co/PA20I8EFaD

