HeadsUpAI

Vapi adds xAI Grok STT and TTS for enterprise voice agents

Vapi has launched native support for xAI's Grok STT and Grok TTS models. Developers can now select these models directly in the Vapi dashboard to handle real-time transcription (speech to text) and speech synthesis (text to speech) for automated phone agents and enterprise voice products.

Integration type: Native dashboard support
Supported models: Grok STT and Grok TTS
Configuration: Selectable via provider dropdown
Platform: Vapi
Availability: Live for all enterprise users

This integration follows the release of xAI's Grok Text to Speech API, bringing xAI's audio stack into Vapi's managed orchestration layer. Rather than wiring the models up manually, teams select xAI as the provider and Vapi handles the transcription and speech pipeline for production voice agents.

Grok's speech-to-text and text-to-speech can be combined within a single agent built for regulated, customer-facing workflows. Both models are available now on Vapi.

Vapi (@Vapi_AI) on X:

Grok STT and Grok TTS from @xai are now live on Vapi, the platform for enterprise voice AI. Build on Vapi to create custom voice agents that speak your customers' language, capture the details that matter in regulated workflows, and sound noticeably more human on every call. https://t.co/L2PpEUve5c

Still wondering? A few quick answers below.

To enable Grok STT and Grok TTS, open the agent configuration section of the Vapi dashboard and select xAI as the provider for the transcriber (STT) setting, the voice (TTS) setting, or both. The integration is native to the Vapi platform, so no separate wiring is needed.
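The provider choice described above can be pictured as two independent settings on an agent, one for the transcriber and one for the voice. The following is a minimal sketch of that idea only, not Vapi's actual configuration schema: the `AgentAudioSettings` class, the `select_xai` helper, and the `"xai"` identifier are all hypothetical names invented for illustration.

```python
from dataclasses import dataclass, replace

# Assumed identifier: the source only says xAI is picked from a provider
# dropdown, not how the provider is spelled in any configuration.
XAI = "xai"


@dataclass(frozen=True)
class AgentAudioSettings:
    """Hypothetical model of an agent's two audio provider choices."""
    transcriber_provider: str  # speech-to-text (STT)
    voice_provider: str        # text-to-speech (TTS)


def select_xai(settings: AgentAudioSettings,
               stt: bool = True,
               tts: bool = True) -> AgentAudioSettings:
    """Return a copy with xAI chosen for the requested components."""
    return replace(
        settings,
        transcriber_provider=XAI if stt else settings.transcriber_provider,
        voice_provider=XAI if tts else settings.voice_provider,
    )


# Placeholder provider names, used only to show that each setting
# can be switched independently of the other.
current = AgentAudioSettings("other-stt", "other-tts")
both = select_xai(current)
stt_only = select_xai(current, tts=False)
```

Here `both` uses xAI for the transcriber and the voice, while `stt_only` switches only the transcriber and leaves the existing voice provider in place.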

Grok STT (Speech-to-Text) is responsible for transcribing live audio from a caller into text that the AI can process. Grok TTS (Text-to-Speech) performs the opposite function, converting the AI's text responses into natural-sounding human speech. Both are now available as selectable components within the Vapi ecosystem.
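The division of labour between the two models can be sketched as a single conversational turn: caller audio goes in, text is produced, the AI replies in text, and that reply is turned back into audio. This is a conceptual sketch under stated assumptions, not how Vapi or xAI implement the pipeline. The `handle_turn` function and the three stub functions are hypothetical stand-ins; real STT and TTS calls would replace them.

```python
from typing import Callable


def handle_turn(caller_audio: bytes,
                transcribe: Callable[[bytes], str],
                respond: Callable[[str], str],
                synthesize: Callable[[str], bytes]) -> bytes:
    """Run one turn: speech in, text reasoning, speech out."""
    caller_text = transcribe(caller_audio)  # STT step
    reply_text = respond(caller_text)       # the AI's text response
    return synthesize(reply_text)           # TTS step


# Stand-in stubs so the flow can be exercised without any real service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_respond(text: str) -> str:
    return f"You said: {text}"


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


reply_audio = handle_turn(b"hello", fake_transcribe,
                          fake_respond, fake_synthesize)
```

Passing the three steps in as callables mirrors the idea of selectable components: swapping the transcriber or the voice does not change the shape of the turn.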

Vapi positions the Grok STT and Grok TTS integration for enterprise and regulated voice workflows, such as automated phone agents that need to capture key details accurately and sound more human on calls. Developers can pair Grok's speech-to-text and text-to-speech within a single agent, or choose xAI as the provider for only the component that fits their use case.

Every HeadsUpAI update is written based on its original source and reviewed before it's published.