Google AI Releases Gemini 3.5 Live Translate for Natural Streaming Speech Translation

Google

Jun 10, 2026 · Updated Jun 20, 2026

Google AI released Gemini 3.5 Live Translate, an audio model for live speech-to-speech translation in more than 70 languages. It streams the translation continuously while preserving the speaker's intonation and pacing, with the goal of removing the awkward pauses of turn-based translation so that cross-language conversations flow more naturally.

The model streams translations as a speaker talks, balancing speed against quality so that it stays only seconds behind the conversation. It also detects languages automatically, handles multilingual input, and filters ambient noise.

Supported Languages: 70+
Latency: Mere seconds behind speaker
Audio Preservation: Pacing, pitch, intonation
Developer Access: Gemini Live API, Google AI Studio (public preview)
Consumer Access: Google Translate app (Android/iOS)
Enterprise Access: Google Meet (private preview)

The model aims to make cross-language communication feel natural by preserving the speaker's intonation, pacing, and pitch, unlike traditional turn-by-turn systems that introduce pauses. This continuous, low-latency approach seeks to foster more fluid and human-like interactions.

Gemini 3.5 Live Translate is available in public preview for developers via the Gemini Live API and Google AI Studio. You can also experience it in the Google Translate app on Android and iOS, with a new Android-exclusive "listening mode" for private earpiece translations. It will also roll out to Google Meet for select business customers.

View the full update on blog.google

Google AI

@GoogleAI · Jun 9

Today, we released Gemini 3.5 Live Translate, our latest audio model for live speech-to-speech translation. It supports over 70 languages and starts translating as soon as you start talking, streaming translations while listening to what you say next. No awkward pauses or choppy audio, just real connection without language barriers. So, how does it work? 🤔 The model is able to make split-second decisions to juggle speed and translation quality so conversations actually feel fluid, human, and natural. In order to do this, the model must receive and contextualize the input while simultaneously outputting the translated speech. Through this process, Gemini 3.5 Live Translate manages to stay mere seconds behind each speaker and can even maintain pacing, pitch, and intonation across extended sessions. See it in action below, or try it yourself in the Google Translate app on iOS & Android.


Still wondering? A few quick answers below.

What is Gemini 3.5 Live Translate?

It is Google AI's latest audio model for live speech-to-speech translation, designed to provide fluid, natural conversations across more than 70 languages by streaming translations in near real-time.

How does Gemini 3.5 Live Translate work?

The model processes speech as it streams, making split-second decisions to balance translation speed and quality. It continuously outputs translated speech while listening to the next input, staying just seconds behind the speaker.

Where can I use Gemini 3.5 Live Translate?

It is available for developers via the Gemini Live API and Google AI Studio in public preview. Consumers can use it in the Google Translate app on Android and iOS, and it will soon be in Google Meet for select business customers.

What new features does it offer in Google Translate?

In the Google Translate app, it offers a new "listening mode" for Android users, allowing translations to be heard directly through the phone's earpiece for private listening without headphones.

Does Gemini 3.5 Live Translate preserve the speaker's voice characteristics?

Yes, the model is designed to maintain the speaker's pacing, pitch, and intonation across extended sessions, aiming for a more natural and human-like translated output.
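For readers curious what "outputting translated speech while listening to the next input" can look like in code, here is a minimal Python sketch of a wait-k style policy, a well-known approach to simultaneous translation. It is not Google's method, which the announcement does not describe. The word-for-word lexicon, the function names, and the parameter k are all hypothetical and chosen only for illustration; a real system works on audio and uses a learned model rather than a lookup table.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

# Hypothetical toy lexicon standing in for a real translation model.
TOY_LEXICON = {"el": "the", "gato": "cat", "come": "eats", "pescado": "fish"}


def translate_word(word: str) -> str:
    """Look up one word; unknown words pass through unchanged."""
    return TOY_LEXICON.get(word, word)


def wait_k_translate(
    source_words: Iterable[str], k: int = 2
) -> Iterator[Tuple[int, str]]:
    """Yield (words_heard_so_far, translated_word) pairs.

    Output starts once k source words have been heard, then one word is
    emitted for every new word that arrives (assumption: a wait-k style
    policy, used here purely as an illustration).
    """
    pending: Deque[str] = deque()
    heard = 0
    for word in source_words:
        pending.append(word)
        heard += 1
        if len(pending) >= k:
            # Enough context is buffered: emit the oldest word and keep listening.
            yield heard, translate_word(pending.popleft())
    # The speaker has stopped: flush whatever is still buffered.
    while pending:
        yield heard, translate_word(pending.popleft())


if __name__ == "__main__":
    for heard, out in wait_k_translate("el gato come pescado".split(), k=2):
        print(f"heard {heard} word(s) -> said {out!r}")
```

A larger k gives the translator more context before it commits to output, at the cost of trailing further behind the speaker; that is the speed versus quality trade-off the announcement refers to, shown here in its simplest form.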

Every HeadsUpAI update is written based on its original source and reviewed before it's published.

Keep reading

Google Launches Gemini 3.1 Flash Live for Natural Real Time Voice Agents

Google DeepMind released Gemini 3.1 Flash Live, a low-latency audio model optimized for real-time dialogue and complex task execution. The model improves function calling and tonal recognition, allowing voice agents to handle multi-step workflows and emotional nuances more reliably. This enables more fluid interactions in noisy environments without losing conversational context.

Google · Apr 24

Google Gemini 3.1 Flash TTS Becomes Flagship for Expressive Speech

Google designated Gemini 3.1 Flash TTS as its most expressive speech generation model to date. The model uses natural language audio tags to allow developers to direct emotional delivery and vocal character within generated audio.

Google AI Studio · Apr 13

Google Gemini 3.1 Flash Live Claims Top Spot for Production Voice Agents

Google's Gemini 3.1 Flash Live model reached the #1 position on the Tau Voice Bench leaderboard for real-time voice agents. The update delivers significantly lower latency and higher precision, signaling that multimodal voice AI is now reliable enough for production-grade applications.

Gemini · Jun 5

Google Gemini Live Now Creates and Edits Images in Real-Time with Camera

Gemini Live now lets users create and edit images directly within the app, using a live camera feed. This brings AI into real-time visual interactions, turning spoken or typed instructions into immediate on-screen changes.
