Google Gemini 3.5 Live Translate Explained

Key Takeaway

Google has launched Gemini 3.5 Live Translate, a new audio model for near real-time speech-to-speech translation across more than 70 languages. The feature is rolling out through Google Translate, developer tools, and Google Meet previews.

Gemini 3.5 Live Translate – Key Points

The Story

Google is moving live translation from turn-by-turn interpretation toward continuous, AI-generated speech translation.

Gemini 3.5 Live Translate is designed to listen to spoken language, detect it automatically, and generate translated speech while the person is still talking. The translated audio stays only a few seconds behind the speaker while preserving elements such as intonation, pacing, and pitch.

The release extends Google’s long-running translation work, which now handles more than a trillion translated words each month across Google products.

The Facts

Availability

Google announced Gemini 3.5 Live Translate on June 9, 2026.
The feature is rolling out globally to Google Translate on Android and iOS.
Developers can access it in public preview through the Gemini Live API and Google AI Studio.
Enterprises can test it in private preview through Google Meet for select Google Workspace customers starting this month, with a broader rollout planned later this year.
The developer model name is gemini-3.5-live-translate-preview.
Pricing for production API use has not been disclosed.

Capabilities

The model supports speech-to-speech translation across more than 70 languages.
It can automatically detect the languages being spoken, including in multilingual speech, without manual language configuration.
It generates translated audio and can also provide a text transcript.
It processes streamed speech continuously instead of waiting for a speaker to finish every sentence.
The model is designed to handle loud and unpredictable environments.
All model-generated audio is watermarked with SynthID.

Technical Details

Gemini 3.5 Live Translate is optimized for low-latency audio-to-audio translation.
It supports audio input and translated audio output.
Google’s developer documentation lists a 131,072-token input limit and a 65,536-token output limit for the preview model.
The model is available through the Gemini Live API but does not support features such as file search, image generation, code execution, structured outputs, or search grounding.
Independent benchmark results for latency, accuracy, accents, and lower-resource languages are not yet available.
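
For developers, the documented constraints above can be captured as plain constants and checked before a session is opened. The sketch below is illustrative only. The feature labels (such as "file_search") and the helper name find_unsupported are hypothetical names chosen for this example, not identifiers from Google's SDK. Only the model name and the token limits come from the figures cited above.

```python
# Values reported in this article for the preview model.
MODEL_NAME = "gemini-3.5-live-translate-preview"
INPUT_TOKEN_LIMIT = 131_072   # equal to 2**17
OUTPUT_TOKEN_LIMIT = 65_536   # equal to 2**16

# Hypothetical labels for the features the article lists as unsupported.
UNSUPPORTED_FEATURES = frozenset({
    "file_search",
    "image_generation",
    "code_execution",
    "structured_outputs",
    "search_grounding",
})


def find_unsupported(requested_features):
    """Return, in sorted order, the requested features the model does not support."""
    return sorted(set(requested_features) & UNSUPPORTED_FEATURES)


# Example: an app that wants audio output plus two unsupported features.
wanted = ["audio_output", "code_execution", "search_grounding"]
print(find_unsupported(wanted))  # ['code_execution', 'search_grounding']
```

A check like this keeps an application from assuming a feature that the preview model does not offer. Because the model is in preview, the list may need updating as the documentation changes.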

What Is New

The important change is not only the number of supported languages but also the translation method.

Traditional live translation tools often work in turns. One person speaks, the system waits, then the translation is produced. Gemini 3.5 Live Translate reduces that delay by generating translated speech continuously, while balancing immediate translation with enough context to improve accuracy.

In Google Meet, the update expands speech translation from five languages to more than 70. It also enables more than 2,000 language combinations in a single meeting, instead of translating only to and from English.
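
The 2,000 figure is consistent with simple pair counting. As a rough check, the sketch below assumes exactly 70 languages (the announcement says "more than 70") and treats a "combination" as a pair of distinct languages. How Google actually counts combinations is not stated, so both the unordered and the directional counts are shown.

```python
from math import comb

# Assumption: exactly 70 supported languages (the article says "more than 70").
LANGUAGES = 70

# Unordered pairs: each pair of languages is counted once, regardless of direction.
unordered_pairs = comb(LANGUAGES, 2)

# Ordered pairs: source -> target is counted separately from target -> source.
ordered_pairs = LANGUAGES * (LANGUAGES - 1)

print(unordered_pairs)  # 2415
print(ordered_pairs)    # 4830
```

Either way of counting gives more than 2,000 pairs once the language list reaches 70.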

Where Users Will See It

Google Translate

The most immediate consumer use case is the Google Translate app on Android and iOS. Users can tap “Live translate” in the bottom-left corner and connect headphones to hear translated speech that mirrors the speaker’s tone across more than 70 languages.

On Android, a new listening mode streams translated audio through the phone’s earpiece. This lets users hold the phone like a regular call when they do not have headphones or do not want the translation played aloud.

Google Meet

For businesses, Google Meet is the more strategic rollout. Real-time multilingual meetings could reduce friction for global teams, sales calls, support sessions, education, and cross-border collaboration.

On the web version of Google Meet, speech translation will be available through a new control button for faster access inside meetings.

Developer Apps

Through the Gemini Live API and Google AI Studio, developers can build live translation into apps, services, kiosks, learning tools, meeting products, accessibility tools, customer-facing platforms, lessons, broadcasts, and multilingual calls.

Developer platforms including Agora, Fishjam, LiveKit, Pipecat, and Vision Agents are supporting integrations that handle real-time media streaming infrastructure. Grab is also testing the model to support near real-time communication between drivers and travelers at pickups, across more than 10 million voice calls per month.

Why It Matters for End Users

For ordinary users, live translation becomes more useful when it feels less mechanical. Preserving tone, rhythm, and speech flow can make translated conversations easier to understand and less awkward.

It also lowers the barrier for people who need quick communication across languages but do not want to type, wait, or rely on scripted translation.

Limitations

Gemini 3.5 Live Translate is still in public preview for developers and in private preview for enterprises using Google Meet. That means access, quality, pricing, latency, and enterprise controls may change.

The feature also depends on speech recognition, audio quality, accents, background noise, language pairs, network conditions, and context. Sensitive conversations may still require human review, especially in legal, medical, governmental, or contractual settings.

Developers should test the model with their target language pairs and real audio conditions before assuming consistent production performance.
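
One way to organize that testing is a small matrix of language pairs and audio conditions. The sketch below is only a planning aid under stated assumptions: the language codes and condition labels are hypothetical placeholders, and it does not call the Gemini Live API or measure anything itself.

```python
from itertools import permutations, product

# Hypothetical target languages and audio conditions; replace with real ones.
languages = ["en", "es", "ja"]
conditions = ["quiet room", "street noise", "phone call"]

# Directed pairs: en -> es is tested separately from es -> en.
language_pairs = list(permutations(languages, 2))

# One test case per (source, target, condition) combination.
test_matrix = [
    {"source": src, "target": tgt, "condition": cond}
    for (src, tgt), cond in product(language_pairs, conditions)
]

print(len(test_matrix))  # 18
print(test_matrix[0])    # {'source': 'en', 'target': 'es', 'condition': 'quiet room'}
```

Each entry can then be paired with a recorded audio sample and reviewed for latency and accuracy by someone fluent in both languages.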

Why This Matters

Gemini 3.5 Live Translate shows how AI translation is shifting from text conversion to real-time voice mediation. For users, it could make multilingual communication more natural. For businesses, it could reduce language friction across meetings, support, training, and global operations.

This article was drafted with the assistance of generative AI. All facts and details were reviewed and confirmed by an editor prior to publication.