xAI Launches Grok Text to Speech API with Five Natural Voices

xAI


Grok's Text to Speech API is now available for developers, with five voices and expressive controls over pitch, pacing, laughter, and tone. Priced at $4.20 per million characters, the API is built for telephony and web.

xAI has launched its Text to Speech API, which converts text to natural-sounding speech in five voices: Eve, Ara, Leo, Rex, and Sal. The API supports multiple audio formats and inline speech tags that control pauses, laughter, whispers, pitch, speed, and emphasis. Pricing is $4.20 per million characters, with rate limits of 600 requests per minute and 10 requests per second. The TTS API is in beta and joins xAI's Voice Agent API ($0.05 per minute, served over WebSocket, with MCP support) and Speech to Text on the same platform.
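To put the pricing in concrete terms, here is a minimal Python sketch that estimates cost from character count. It assumes every character in the input is billed, including whitespace and any inline tags; the announcement does not say how characters are counted, so treat the result as a rough estimate. The function names are illustrative and not part of any xAI SDK.

```python
from decimal import Decimal

# Published rate from the announcement: $4.20 per million characters.
PRICE_PER_MILLION_CHARS = Decimal("4.20")


def estimate_cost_for_chars(char_count: int) -> Decimal:
    """Return the estimated cost in USD for a given number of characters."""
    if char_count < 0:
        raise ValueError("char_count must be non-negative")
    return PRICE_PER_MILLION_CHARS * char_count / Decimal(1_000_000)


def estimate_cost_for_text(text: str) -> Decimal:
    """Estimate cost for a string, ASSUMING each character is billed once."""
    return estimate_cost_for_chars(len(text))


if __name__ == "__main__":
    # A 500,000-character manuscript at the published rate.
    print(estimate_cost_for_chars(500_000))
```

At this rate, a 500,000-character manuscript works out to $2.10, and a 500-character notification to about a fifth of a cent.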

This gives developers a single voice platform (text-to-speech, speech-to-text, and conversational agents) under one xAI API key. The inline tagging system lets you shape how a voice sounds from within the text itself.

Add natural speech to any user-facing surface in your app by sending your content to the TTS endpoint and using inline tags to match the voice's tone to the context.
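As a rough illustration of that flow, the Python sketch below builds (but does not send) an HTTP request for a TTS call. The announcement does not give the endpoint URL, the request schema, or the tag syntax, so `TTS_URL`, the `text` and `voice` field names, and the bearer-token header are all placeholders to be replaced with the values from xAI's documentation. Only the five voice names come from the announcement.

```python
import json
import urllib.request

# PLACEHOLDER endpoint: replace with the real URL from xAI's documentation.
TTS_URL = "https://example.invalid/tts"

# Voice names listed in the announcement.
VOICES = {"Eve", "Ara", "Leo", "Rex", "Sal"}


def build_tts_request(text: str, voice: str, api_key: str) -> urllib.request.Request:
    """Build a POST request for a TTS call without sending it.

    The JSON field names ("text", "voice") and the Authorization scheme are
    ASSUMPTIONS for illustration, not the documented xAI schema.
    """
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}; expected one of {sorted(VOICES)}")
    if not text:
        raise ValueError("text must not be empty")
    body = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return urllib.request.Request(
        TTS_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_tts_request("Your order has shipped.", "Eve", "YOUR_API_KEY")
    print(req.get_method(), req.full_url)
    # To send it once the real endpoint and schema are filled in:
    # with urllib.request.urlopen(req) as resp:
    #     audio_bytes = resp.read()
```

Validating the voice name before the call avoids spending a request, and part of the rate limit, on input that is already known to be invalid.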

xAI (@xai) on X:

Grok's Text to Speech API is now available. Start building with natural voices and expressive controls to bring your apps to life. https://t.co/SMxWTB9m6N https://t.co/UtHT0uN148


Every HeadsUpAI update is written based on its original source and reviewed before it's published.
