MiniMax M3 is live and Together AI is powering its inference 🚀 Tomorrow at 6pm PT we're going live on X Spaces with the teams behind the model and the infrastructure to give you a deep dive. https://t.co/wPayfOWmNg
Together AI powers MiniMax M3 with 1M context and sparse attention
- Context Window
- 1,000,000 tokens
- Architecture
- MiniMax Sparse Attention
- Coding Benchmark
- 59.0% SWE-Bench Pro
- Agent Benchmark
- 74.2% MCP Atlas
- Input Modalities
- Text, Image, Video
MiniMax M3 matches full attention performance across multiple benchmarks while reducing per-token compute to 1/20th of previous generations at a 1-million-token context length. This architecture enables autonomous workflows like CUDA kernel optimization, building on the MiniMax M3 technical highlights. The model's native multimodality allows semantic spaces to merge deeply during training.
Access MiniMax M3 via the MiniMax Code app or the Together AI API, available alongside other providers like SiliconFlow. The model supports "thinking" modes for reasoning and "computer use" for desktop automation. Together AI provides the research-optimized infrastructure required to deploy and scale these models in production.
Still wondering? A few quick answers below.
Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →






