The best real-time voice translation apps for video calls in 2026 are MirrorCaption, Zoom Translated Captions, Google Meet Translated Captions, Microsoft Teams Live Translated Captions, Microsoft Translator, Notta, and Otter.ai. Each one suits a different situation: some are platform-locked, some require a meeting bot, and only two can speak the translation aloud during a live call.

The gap that matters most is not which tools exist. It's whether your translation tool works during the conversation or only after it. When a Korean partner says something ambiguous at minute 12 of a 45-minute call, you need the translation in seconds — not in a polished summary an hour later.

Illustrative scenario

A logistics sales team is on a call with a new distributor in South Korea. At the 14-minute mark, the distributor shifts into Korean to explain a concern about delivery windows. The rep's post-meeting transcript will capture those words accurately — in about 60 minutes. A real-time streaming translator surfaces the same sentence within a second, while the conversation is still alive enough to address it directly.

We evaluated seven tools on four criteria: whether translation is genuinely streaming (word-by-word, not post-processing), whether it requires a bot in the meeting, whether it can speak the translation aloud, and what it actually costs.

Key Takeaways

What "Real-Time Voice Translation" Actually Means for Video Calls

Two different things get called "real-time translation," and they work very differently in practice.

Streaming transcription and translation produces words on screen while the speaker is still talking. The text appears word-by-word — often with partial results that self-correct as more context arrives. You're reading what's being said as it happens. MirrorCaption and the platform-native translated captions features work this way.

Near-real-time or post-processing produces a polished transcript or translation after the utterance is complete, sometimes with a short delay, sometimes only after the full meeting ends. Otter.ai and Notta are primarily in this category. Their strengths lie in note quality and action items, not in mid-call comprehension.
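To make the "partial results that self-correct" behavior concrete, here is a minimal sketch of how a caption display might handle a streaming recognizer's output. The `CaptionBuffer` class, the `on_result` method, and the `(text, is_final)` event shape are all hypothetical names chosen for illustration; they are not taken from any of the tools discussed here.

```python
from dataclasses import dataclass, field


@dataclass
class CaptionBuffer:
    """Holds committed caption text plus the current, still-changing partial."""
    finalized: list[str] = field(default_factory=list)
    partial: str = ""

    def on_result(self, text: str, is_final: bool) -> None:
        # A partial result overwrites the previous partial (the self-correction).
        # A final result is committed and the partial slot is cleared.
        if is_final:
            self.finalized.append(text)
            self.partial = ""
        else:
            self.partial = text

    def render(self) -> str:
        # What the viewer sees: committed text followed by the live partial.
        parts = self.finalized + ([self.partial] if self.partial else [])
        return " ".join(parts)


# Hypothetical event stream, as a streaming recognizer might emit it.
events = [
    ("the delivery", False),
    ("the delivery window", False),
    ("The delivery window is too tight.", True),
    ("can we", False),
]

buffer = CaptionBuffer()
snapshots = []
for text, is_final in events:
    buffer.on_result(text, is_final)
    snapshots.append(buffer.render())

for line in snapshots:
    print(line)
```

A post-processing tool, by contrast, would only ever show the finalized entries, and possibly only once the meeting is over.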

There is also a distinction most comparison articles overlook entirely: text output vs. spoken output.

All seven tools on this list can display translated text on screen. Only two can speak the translation aloud during the live exchange. That distinction matters most when one participant cannot read a screen comfortably, when you're presenting to a room, or when the other side needs to hear the translated response rather than read it.

When a Japanese client says 「ちょっと難しいです」 (literally "a little difficult," and in business settings often a polite way of saying no), a streaming tool surfaces that phrasing mid-call, while there are still 40 minutes left to respond. A post-meeting note gives you the same three words after the conversation has moved on.

The 7 Best Real-Time Voice Translation Apps for Video Calls

Best for Zoom Teams

2. Zoom Translated Captions — Best if Your Whole Team Uses Zoom

Zoom offers Translated Captions as a host-side feature available on select paid plans. When the host enables it, each participant can choose a target language and see captions translated in real time during the call. No third-party tool to open. No extra login.

For teams whose entire meeting stack lives in Zoom, this is the lowest-friction path. The language pairs available and the plan tier required are listed on Zoom's support page and change as Zoom expands coverage — check the current list before assuming your language pair is supported.

Limitations: Platform-locked to Zoom. No translated transcript export on most plans. If any participant in your workflow uses a different meeting platform, this feature does not help.

Google Workspace

3. Google Meet Translated Captions — Best for Google Workspace Teams

Google Meet includes Translated Captions in select Google Workspace plans. Turn them on during a meeting and captions appear in the participant's target language in real time. Like Zoom's version, it's built-in — no extra window needed.

The free personal Google account tier does not include Translated Captions. Availability and supported language pairs vary by Workspace plan and are documented at support.google.com/meet.

Limitations: Platform-locked to Google Meet. Captions are ephemeral at the standard tier — no searchable exported transcript.

Microsoft 365

4. Microsoft Teams Live Translated Captions — Best for Microsoft 365 Organizations

Microsoft Teams offers Live Translated Captions as part of Teams Premium and certain Microsoft 365 plans. Each participant can select a target language and see meeting speech captioned and translated in real time.

For organizations already running Microsoft 365, this is the natural choice for Teams-native calls. As with the Zoom and Google Meet equivalents, its usefulness ends at the Teams boundary.

Limitations: Requires Teams Premium on top of the standard Teams license. Platform-locked to Microsoft Teams.

Free Option

5. Microsoft Translator — Best Free Option (With a Catch)

Microsoft Translator offers a free Conversations feature: multiple participants join a shared translation session, each on their own device, and see others' speech translated into their chosen language in real time. It supports text-to-speech so each device can read the translated speech aloud.

The catch: it is a standalone app experience, not an integration with existing video call platforms. For a video call, all participants need Microsoft Translator open separately alongside their meeting. That friction is manageable for some use cases — particularly in-person conversations — but it's not a transparent drop-in replacement for a browser-tab translation tool.

Limitations: All participants must actively open and join the Translator session. Does not capture meeting audio from another platform automatically.

Meeting Notes

6. Notta — Best for Post-Meeting Translated Notes

Notta is an AI note-taker that transcribes meetings in real time and can produce translated summaries and notes, primarily after the meeting concludes. It works via a meeting bot that joins calls or via a browser extension.

Notta's strength is the polished deliverable after the meeting: clean transcript, translated summary, shareable notes. For teams who need multilingual meeting records rather than in-call comprehension, it's a practical choice. It is less suited to mid-call use as a real-time voice translator.

Limitations: Meeting bot is visible to other participants and triggers a recording notification in most platforms. Translation experience during the call is secondary to the post-meeting workflow.

English Teams

7. Otter.ai — Best for English-Primary Teams

Otter.ai is one of the most widely used meeting transcription tools. Its real-time English transcription is genuinely strong — clear speaker labels, rolling AI summaries, and action items that appear as the meeting progresses via OtterPilot.

Translation capability exists in higher-tier plans, but Otter is fundamentally English-primary. For meetings where all participants speak English and the goal is notes and summaries, Otter competes well. For multilingual calls where mid-conversation comprehension matters, it falls short.

Limitations: OtterPilot joins the meeting as a visible participant. Translation quality in non-English languages trails dedicated multilingual tools. Not suitable for teams where a bot presence is unwelcome.

Try MirrorCaption on Your Next Call

1 free hour. No credit card. Works alongside browser-based Zoom, Teams, Meet, and Webex in desktop Chrome or Edge.


How to Choose the Right Real-Time Voice Translation App

Four questions narrow the field quickly.

Do you need the translation spoken aloud, or is text enough?

If everyone on the call can read captions, text works fine, and all seven tools above produce text. If one participant cannot easily read a screen, or you need the other side to hear the translated response during a live presentation or face-to-face conversation, only MirrorCaption (via Speak Translations) and Microsoft Translator (within its own app) support spoken output. For cross-border sales calls where the prospect needs to hear the translation rather than read it, this distinction is decisive.

Are all your video calls on one platform?

If yes, and that platform is Zoom, Meet, or Teams, the built-in translated caption features are the lowest-friction path. No extra login, no extra window, and no per-seat add-on as long as your existing plan already includes the feature (Teams, for example, requires Teams Premium).

If you host or join calls across multiple platforms, or want the same tool for in-person conversations, platform-native features don't travel. MirrorCaption works across browser-based Zoom, Teams, Meet, and Webex calls in desktop Chrome or Edge, and adds Talk mode for face-to-face use on mobile. For a broader look at cross-platform translation tools, see our best meeting translator 2026 roundup.

Does your organization restrict meeting bots or third-party extensions?

Meeting bots (used by Notta and Otter.ai) join calls as a visible participant and trigger a recording notification in most platforms. Many IT policies block or discourage third-party bots. MirrorCaption captures audio from the browser tab directly — no bot joins the meeting.

Note that organizational policies on browser screen-sharing and web-app access still apply. Many teams can set up MirrorCaption without filing an IT ticket, but check your organization's browser and screen-capture policies. For a direct comparison on the bot question, see MirrorCaption vs Zoom AI Companion.

How often do you actually need translation?

For occasional use (a handful of calls per month), MirrorCaption's one-time free hour or Microsoft Translator's free tier may cover it. For regular use, compare the €99 one-time Premium (200h hosted credit included) against recurring per-seat tools such as Otter Pro at approximately $16.99/month. At those prices the one-time plan pays for itself after roughly six months of a subscription (99 / 16.99 ≈ 5.8, before currency conversion), and at two hours of translated calls per week the 200 included hours last about 100 weeks.
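The cost comparison can be checked with a few lines of arithmetic. This is a rough sketch using only the prices quoted in this article; the exchange rate is an assumption (set to parity here for simplicity), and the function names are illustrative.

```python
import math


def break_even_months(one_time_cost: float, monthly_cost: float) -> int:
    """First whole month in which cumulative subscription spend reaches the one-time cost."""
    return math.ceil(one_time_cost / monthly_cost)


def months_of_coverage(included_hours: float, hours_per_week: float) -> float:
    """How long the included hours last at a steady weekly usage."""
    weeks = included_hours / hours_per_week
    return weeks * 12 / 52


ONE_TIME_PREMIUM_EUR = 99.0   # from this article
OTTER_PRO_USD_MONTH = 16.99   # approximate, from this article
EUR_TO_USD = 1.0              # ASSUMPTION: parity; substitute the current rate

months = break_even_months(ONE_TIME_PREMIUM_EUR * EUR_TO_USD, OTTER_PRO_USD_MONTH)
coverage = months_of_coverage(included_hours=200, hours_per_week=2)

print(f"Break-even after {months} months of subscription")   # 6 at parity
print(f"200h lasts about {coverage:.0f} months at 2h/week")  # about 23
```

The result moves with the exchange rate and with whichever subscription you compare against, so treat it as a template for your own numbers, not a fixed answer.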

Quick Comparison: Real-Time Voice Translation Apps for Video Calls

| Tool | Streaming Real-Time | Spoken Output | Bot Required | Works On | Starting Cost |
|---|---|---|---|---|---|
| MirrorCaption | Yes | Yes (Speak Translations) | No | Chrome/Edge desktop; Chrome mobile | Free 1h; €99 one-time Premium |
| Zoom Translated Captions | Yes | No | No | Zoom only | Paid Zoom plans |
| Google Meet Translated Captions | Yes | No | No | Google Meet only | Select Workspace plans |
| Teams Live Translated Captions | Yes | No | No | Teams only | Teams Premium required |
| Microsoft Translator | Yes | Yes (app TTS) | No | Standalone app only | Free |
| Notta | Partial | No | Yes | Zoom, Meet, Teams | Subscription (see site) |
| Otter.ai | Partial (EN) | No | Yes | Zoom, Meet, Teams | $16.99/month Pro |
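If you prefer to narrow the list programmatically, the comparison above can be encoded as data and filtered by your constraints. This is a small sketch: the `Tool` record and the `shortlist` function are hypothetical helpers, and the values simply mirror the comparison rows as stated in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    streaming: str       # "yes" or "partial", as in the comparison
    spoken_output: bool
    bot_required: bool


# Values copied from the comparison above.
TOOLS = [
    Tool("MirrorCaption", "yes", True, False),
    Tool("Zoom Translated Captions", "yes", False, False),
    Tool("Google Meet Translated Captions", "yes", False, False),
    Tool("Teams Live Translated Captions", "yes", False, False),
    Tool("Microsoft Translator", "yes", True, False),
    Tool("Notta", "partial", False, True),
    Tool("Otter.ai", "partial", False, True),
]


def shortlist(tools, need_spoken=False, allow_bot=True, need_full_streaming=True):
    """Return the names of tools that satisfy every stated constraint."""
    return [
        t.name
        for t in tools
        if (t.spoken_output or not need_spoken)
        and (allow_bot or not t.bot_required)
        and (t.streaming == "yes" or not need_full_streaming)
    ]


# Example: spoken translation is required and IT policy forbids meeting bots.
print(shortlist(TOOLS, need_spoken=True, allow_bot=False))
```

Platform coverage and price are left out of the filter on purpose, since both depend on plan details that change over time.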

Frequently Asked Questions

Does Zoom have real-time voice translation for video calls?

Yes. Zoom offers Translated Captions as part of select paid plans. When a host enables the feature, participants see captions in their chosen target language in real time during the meeting. It is text-only — there is no spoken output. Available language pairs and the plan tier required are listed on Zoom's support page and are updated as Zoom expands coverage.

Is there a real-time voice translation app that doesn't join my meeting as a bot?

Yes. MirrorCaption runs in your browser tab and captures audio directly from the meeting tab in desktop Chrome or Edge. No bot joins the meeting and no additional participant appears in the attendee list. The platform-native options (Zoom Translated Captions, Google Meet Translated Captions, and Teams Live Translated Captions) also require no bot, but each works only within its own platform.

Can a real-time translator speak the translation aloud during a video call?

Yes. MirrorCaption's Speak Translations feature reads the user's translated speech aloud in the target language with near-real-time timing. Playback options include the laptop speaker, a phone paired via QR code, or a Mac virtual microphone that routes translated audio into Zoom, Meet, or Teams as mic input — so the other side hears the translation through the call. Microsoft Translator also supports text-to-speech playback, but this works within its own standalone app rather than as an integrated layer over an existing video call.

How accurate is AI voice translation on video calls?

Accuracy depends on speaker clarity, microphone quality, the language pair, and accent. Tools that pass earlier conversation segments as context into each translation call generally perform better on multi-turn dialogue than tools that translate each sentence in isolation. For the most demanding use cases — legal, medical, high-stakes negotiation — treat AI translation as a strong real-time aid, not a certified substitute for a professional interpreter. For a closer look at how AI translation quality varies by tool and language, see our analysis of real-time translation accuracy.
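As an illustration of what "passing earlier segments as context" can look like, here is a minimal sketch of a rolling context window. The `ContextWindow` class and the request dictionary shape are hypothetical; no specific tool's or vendor's API is implied, and a real system would send the request to whatever translation backend it uses.

```python
from collections import deque


class ContextWindow:
    """Keeps the last few source segments to send alongside each new one."""

    def __init__(self, max_segments: int = 3):
        # deque with maxlen drops the oldest segment automatically.
        self._recent = deque(maxlen=max_segments)

    def build_request(self, segment: str, target_lang: str) -> dict:
        # Hypothetical request shape: prior segments travel as context so
        # pronouns and elided subjects can be resolved by the translator.
        request = {
            "context": list(self._recent),
            "text": segment,
            "target_lang": target_lang,
        }
        self._recent.append(segment)
        return request


window = ContextWindow(max_segments=2)
segments = [
    "We reviewed the delivery schedule.",
    "It is a little difficult.",
    "Could you extend it by a week?",
    "Then we can agree.",
]
requests = [window.build_request(s, "en") for s in segments]

print(requests[-1]["context"])
```

A translator that sees only "It is a little difficult." has to guess what "it" refers to; with the preceding segment attached, the delivery schedule is the obvious referent.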

What is the best free real-time voice translation app for video calls?

Platform-native options (Zoom Translated Captions, Google Meet Translated Captions, Teams Live Translated Captions) are effectively free if you already pay for the hosting plan, but each is locked to one platform. Microsoft Translator is free with no platform lock-in, but requires all participants to open its standalone app alongside the call. MirrorCaption offers a one-time free hour — no credit card, no monthly reset — which is enough to evaluate the streaming translation experience on a real call before committing to a plan.

Read Every Word — During the Meeting

MirrorCaption works alongside browser-based Zoom, Teams, Meet, and Webex. No bot. No install. 1 free hour to try.


The Bottom Line

Most teams gravitate toward whichever translation feature is built into the platform they already use. That works well when everyone stays on the same tool. The moment a call moves to a different platform, or a conversation happens in person, the platform-native feature disappears entirely.

MirrorCaption is built for that gap: a single browser tab that works across browser-based video calls, captures audio without a bot, and optionally speaks the translated output aloud via Speak Translations — fast enough to keep a real conversation moving. Start with the free 1-hour trial on your next multilingual call.