Cohere Releases Open Source Transcribe Model Outperforming Whisper on Accuracy

CohereCohere

· Updated

Cohere launched Transcribe, a 2-billion parameter open-source speech recognition model that currently holds the top spot on the HuggingFace Open ASR Leaderboard. By achieving a 5.42% word error rate, it provides a high-accuracy, high-throughput alternative for enterprise workflows that previously relied on larger or proprietary models.

Cohere released Cohere Transcribe, an open-source speech recognition model under the Apache 2.0 license. This 2-billion parameter model uses a Conformer-based architecture to transcribe 14 languages. It currently ranks first for English accuracy on the HuggingFace Open ASR Leaderboard, outperforming Whisper Large v3 and ElevenLabs Scribe v2.

High-performance transcription usually requires massive models or expensive closed APIs. This model shifts the trade-off between accuracy and cost by delivering a superior accuracy-to-speed ratio in a smaller footprint. It enables state-of-the-art transcription on consumer-grade GPUs or at the edge without the high error rates typical of lightweight models.

Download the weights for cohere-transcribe-03-2026 for local deployment or use the API for experimentation. For production needs, the model is available through Model Vault. Future updates will integrate the model into North, the company’s agent orchestration platform, to power voice-enabled enterprise agents and real-time support workflows.

Cohere
Cohere
@cohere
X

Introducing: Cohere Transcribe – a new state-of-the-art in open source speech recognition. https://t.co/l87Z6oyQdM

264retweets
View on X

Every HeadsUpAI update is written based on its original source and reviewed before it's published. Read our editorial standards →

Share this update