Command A+ sets a new high for Cohere's machine translation capabilities, opening a clear gap over open-source peers Mistral Medium 3.5, DeepSeek, and OpenAI's gpt-oss, as well as Claude Opus 4.6. A+ also outperforms specialist systems like Google Translate. RWS is better... but we built that with them too.
Cohere Command A Plus Outperforms Google Translate on Machine Translation Benchmarks
Cohere, an AI company building enterprise models for search and business applications, released new machine translation benchmark results alongside its Command A+ launch. The 218-billion-parameter model now leads open-source and proprietary peers on the WMT24++ benchmark and outperforms specialist systems like Google Translate.
- Supported languages: 48
- xCOMET-XL score lead: +2.4 pts
- Compression gain (Arabic): 20%
- Hardware (minimum): 2x H100 or 1x B200 (W4A4)
- License: Apache 2.0
This update matters for sovereign AI, where organizations must process sensitive multilingual data without sending it to external APIs. Because it outperforms specialist systems and rivals the Language Weaver Pro model that Cohere built with RWS, Command A+ lets regulated firms run high-accuracy translation entirely within private, air-gapped environments.
Beyond accuracy, a new tokenizer needs fewer tokens for non-Latin languages, improving tokenization efficiency by 20% for Arabic and 18% for Japanese and lowering inference costs accordingly. You can download the Apache 2.0 weights on Hugging Face; with 4-bit (W4A4) quantization they run on two NVIDIA H100 GPUs or a single NVIDIA B200.
Still wondering? A few quick answers below.
Command A Plus is a 218-billion-parameter large language model designed for enterprise and sovereign AI workloads. It uses a mixture-of-experts architecture that activates only 25 billion parameters per token, allowing it to deliver advanced reasoning and multimodal capabilities while remaining efficient enough to run on private, on-premises hardware.
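To make the mixture-of-experts idea concrete, here is a minimal Python sketch of top-k routing, in which a gate scores every expert and only the best few run. The four toy experts, the gate scores, and k=2 are illustrative assumptions; the article does not describe the model's actual expert count or routing scheme. Only the 218-billion and 25-billion figures come from the text.

```python
import math

TOTAL_PARAMS = 218e9   # total parameters (from the article)
ACTIVE_PARAMS = 25e9   # parameters active at a time (from the article)

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_logits, k):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_logits, k):
    """Run only the selected experts and mix their outputs by gate weight."""
    routes = route_top_k(gate_logits, k)
    return sum(w * experts[i](x) for i, w in routes)

# Hypothetical toy setup: four scalar "experts", two active per input.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
gate_logits = [0.1, 2.0, -1.0, 1.5]   # made-up gate scores for one input

print(route_top_k(gate_logits, k=2))                 # experts 1 and 3 are selected
print(moe_forward(3.0, experts, gate_logits, k=2))   # blend of experts 1 and 3 only
print(f"active share: {ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")  # 11.5%
```

Experts 0 and 2 are never evaluated for this input, which is the source of the compute savings: by the article's figures, roughly 11.5% of the model's parameters do work at any one time.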
The model sets a new high for Cohere's translation capabilities, outperforming specialist systems like Google Translate and open-source peers like Mistral Medium 3.5 and DeepSeek. It shows significant gains on the WMT24++ benchmark, particularly in European languages like French and Spanish, as well as high-impact non-Latin languages including Arabic, Japanese, and Korean.
Yes, Cohere has released Command A Plus under the Apache 2.0 license, making it a fully open-weight model. This licensing allows developers and organizations to download, run, and adapt the model for their own production environments without vendor lock-in, supporting data sovereignty for governments and regulated industries that require private deployments.
Despite its 218-billion-parameter size, the model is engineered for hardware efficiency. By using 4-bit quantization (W4A4), which compresses the model to reduce memory needs, Command A Plus can run on as little as a single NVIDIA B200 GPU or two NVIDIA H100 GPUs while maintaining output quality.
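A rough back-of-the-envelope check shows why 4-bit weights make these hardware minimums plausible. The sketch below counts weight storage only and ignores the KV cache, activations, and quantization metadata, so real requirements are higher. The GPU capacities are assumptions: 80 GB for the H100, and a nominal 192 GB for the B200, whose usable memory may be lower in some configurations.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Memory needed to store the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9

def fits(required_gb, gpu_memory_gb, num_gpus):
    """True if the weights fit in the combined memory of the GPUs."""
    return required_gb <= gpu_memory_gb * num_gpus

PARAMS = 218e9    # total parameters (from the article)
H100_GB = 80      # assumption: 80 GB H100 variant
B200_GB = 192     # assumption: nominal B200 capacity; usable memory may be lower

fp16_gb = weight_memory_gb(PARAMS, 16)   # 436.0 GB at 16 bits per weight
int4_gb = weight_memory_gb(PARAMS, 4)    # 109.0 GB at 4 bits per weight

print(fits(int4_gb, H100_GB, 1))   # False: 109 GB exceeds one 80 GB H100
print(fits(int4_gb, H100_GB, 2))   # True: 109 GB fits in 160 GB
print(fits(int4_gb, B200_GB, 1))   # True: 109 GB fits in 192 GB
print(fits(fp16_gb, H100_GB, 2))   # False: 436 GB needs far more memory
```

Under these assumptions the 4-bit weights need about 109 GB, which is too much for one H100 but leaves headroom on two H100s or a single B200, consistent with the minimum hardware quoted above.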
The model features an updated tokenizer that processes text more efficiently, especially for non-European languages. It requires fewer tokens to represent the same amount of information, which directly lowers inference costs. For example, tokenization efficiency improved by 20 percent for Arabic and 18 percent for Japanese compared to previous model generations.
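As a worked example of how token savings translate into cost, the sketch below assumes the quoted gains mean 20% and 18% fewer tokens for the same text, and that inference is billed per token. The baseline of one million tokens and the price of 2.00 per million tokens are hypothetical values chosen only for illustration.

```python
# Token reductions quoted in the article, read here as "fewer tokens for the same text".
REDUCTION = {"Arabic": 0.20, "Japanese": 0.18}

def tokens_after(baseline_tokens, reduction):
    """Token count for the same text once the tokenizer needs fewer tokens."""
    return round(baseline_tokens * (1 - reduction))

def cost(tokens, price_per_million):
    """Cost of processing `tokens` at a flat per-million-token price."""
    return tokens / 1_000_000 * price_per_million

BASELINE_TOKENS = 1_000_000   # hypothetical workload under the old tokenizer
PRICE_PER_MILLION = 2.00      # hypothetical price, in any currency

for language, reduction in REDUCTION.items():
    new_tokens = tokens_after(BASELINE_TOKENS, reduction)
    saved = cost(BASELINE_TOKENS, PRICE_PER_MILLION) - cost(new_tokens, PRICE_PER_MILLION)
    print(f"{language}: {new_tokens} tokens, saves {saved:.2f}")
```

With per-token billing the saving scales linearly with the token reduction, so the same percentages apply at any workload size or price.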




