The best real-time voice translation apps for video calls in 2026 are MirrorCaption, Zoom Translated Captions, Google Meet Translated Captions, Microsoft Teams Live Translated Captions, Microsoft Translator, Notta, and Otter.ai. Each one suits a different situation: some are platform-locked, some require a meeting bot, and only two can speak the translation aloud during a live call.
The gap that matters most is not which tools exist. It's whether your translation tool works during the conversation or only after it. When a Korean partner says something ambiguous at minute 12 of a 45-minute call, you need the translation in seconds — not in a polished summary an hour later.
Illustrative scenario
A logistics sales team is on a call with a new distributor in South Korea. At the 14-minute mark, the distributor shifts into Korean to explain a concern about delivery windows. The rep's post-meeting transcript will capture those words accurately — in about 60 minutes. A real-time streaming translator surfaces the same sentence within a second, while the conversation is still alive enough to address it directly.
We evaluated seven tools on four criteria: whether translation is genuinely streaming (word-by-word, not post-processing), whether it requires a bot in the meeting, whether it can speak the translation aloud, and what it actually costs.
- MirrorCaption is the only browser-based tool on this list that works across browser-based Zoom, Teams, Meet, and Webex calls without a bot, with optional spoken output via Speak Translations.
- Zoom, Google Meet, and Teams each offer built-in real-time translated captions — convenient if your whole team uses that one platform, useless the moment a call moves to another tool.
- Microsoft Translator is free and supports spoken output, but requires all participants to open the Translator app separately alongside the call.
- Notta and Otter.ai are primarily post-meeting tools — strong for note-taking, but not streaming voice translators in the strict sense.
- Only MirrorCaption (via Speak Translations) and Microsoft Translator (within its own app) can read the translated speech aloud during a live exchange.
What "Real-Time Voice Translation" Actually Means for Video Calls
Two very different things both get called "real-time translation."
Streaming transcription and translation produces words on screen while the speaker is still talking. The text appears word-by-word — often with partial results that self-correct as more context arrives. You're reading what's being said as it happens. MirrorCaption and the platform-native translated captions features work this way.
Near-real-time or post-processing produces a polished transcript or translation after the utterance is complete, sometimes with a short delay, sometimes only after the full meeting ends. Otter.ai and Notta are primarily in this category. Their strengths lie in note quality and action items, not in mid-call comprehension.
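The streaming behavior described above, where partial results are shown immediately and then corrected, can be sketched in a few lines. This is a minimal illustration only: the `(text, is_final)` event shape and the `CaptionBuffer` name are assumptions made for the example, not the API of any tool on this list.

```python
from dataclasses import dataclass, field


@dataclass
class CaptionBuffer:
    """Holds finalized text plus the current, still-changing partial result."""

    finalized: list[str] = field(default_factory=list)
    partial: str = ""

    def on_result(self, text: str, is_final: bool) -> str:
        """Apply one recognizer event and return the line to display."""
        if is_final:
            # The utterance is complete: commit it and clear the partial.
            self.finalized.append(text)
            self.partial = ""
        else:
            # A newer partial replaces the older one (the "self-correction").
            self.partial = text
        return self.display()

    def display(self) -> str:
        parts = self.finalized + ([self.partial] if self.partial else [])
        return " ".join(parts)


# Hypothetical event stream: (text, is_final) pairs as a recognizer might emit them.
events = [
    ("the delivery", False),
    ("the delivery window", False),
    ("The delivery window is too tight.", True),
    ("can we", False),
]

buffer = CaptionBuffer()
screens = [buffer.on_result(text, is_final) for text, is_final in events]
for screen in screens:
    print(screen)
```

A post-processing tool, by contrast, would behave as if only the `is_final` events (or only the end of the meeting) produced any output at all.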
There is also a distinction most comparison articles overlook entirely: text output vs. spoken output.
All seven tools on this list can display translated text on screen. Only two can speak the translation aloud during the live exchange. That distinction matters most when one participant cannot read a screen comfortably, when you're presenting to a room, or when the other side needs to hear the translated response rather than read it.
When a Japanese client says 「ちょっと難しいです」 — literally "a little difficult" — a streaming tool surfaces that phrasing mid-call, while there are still 40 minutes left to respond. A post-meeting note gives you the same three words after the conversation has moved on.
The 7 Best Real-Time Voice Translation Apps for Video Calls
1. MirrorCaption — Best for Cross-Platform and Spoken Translation
MirrorCaption is a browser-based real-time transcription and translation tool supporting 50+ selectable languages. Open it in desktop Chrome or Microsoft Edge alongside any browser-based video call — Zoom, Google Meet, Teams, Webex — and it captures meeting audio directly from the browser tab. No bot joins the meeting. No admin install needed for participants.
Transcription appears word-by-word with sub-second latency. Translation follows immediately, side-by-side with the original. Each translated word is tappable to reveal the source word it came from, which matters when a phrase like "ちょっと難しいです" needs more than a literal rendering.
What distinguishes MirrorCaption on this list is Speak Translations — an optional feature that reads the user's translated speech aloud in the target language. Speak in Chinese, translate to English, and MirrorCaption can synthesize the English output while the exchange is still live. Playback options: laptop speaker, a phone paired via QR code, or a Mac virtual microphone that routes translated audio directly into Zoom, Meet, or Teams as microphone input so the other side hears it through the call.
For face-to-face conversations, Talk mode on mobile runs as a continuous session — both sides speak in turns inside the same session without restarting capture for every sentence. It's the difference between a continuous interpreter session and a tap-to-translate phrasebook.
- Pricing (mirrorcaption.com/#pricing): Free: 1 hour one-time, no credit card · Annual: €54.99/year, 100h hosted credit included · Premium (lifetime plan): €99 one-time, 200h hosted credit included, all future updates and new features with priority access, and the lowest Voice Pack rate for additional hours (Voice Packs sold separately)
- Works on: Desktop Chrome or Edge (Meet mode, for meeting-tab audio); Chrome on mobile (Talk mode, for face-to-face)
- Bot required: No
- Spoken output: Yes — Speak Translations (laptop speaker, paired phone, or Mac virtual mic)
- Languages: 50+ selectable
Limitations: Meet mode requires desktop Chrome or Edge — Safari and Firefox do not support meeting-tab audio capture. Mobile Talk mode uses the microphone and is not designed for meeting-tab audio. Workplace screen-capture and web-app policies still apply; most teams can self-serve, but check your organization's browser settings.
2. Zoom Translated Captions — Best if Your Whole Team Uses Zoom
Zoom offers Translated Captions as a host-side feature available on select paid plans. When the host enables it, each participant can choose a target language and see captions translated in real time during the call. No third-party tool to open. No extra login.
For teams whose entire meeting stack lives in Zoom, this is the lowest-friction path. The language pairs available and the plan tier required are listed on Zoom's support page and change as Zoom expands coverage — check the current list before assuming your language pair is supported.
- Pricing: Included with eligible paid Zoom plans — see zoom.us/pricing for current plan requirements
- Works on: Zoom only
- Bot required: No (host-side feature)
- Spoken output: No — text captions only
- Languages: A selection of language pairs; see Zoom's support article for the current list
Limitations: Platform-locked to Zoom. No translated transcript export on most plans. If any participant in your workflow uses a different meeting platform, this feature does not help.
3. Google Meet Translated Captions — Best for Google Workspace Teams
Google Meet includes Translated Captions in select Google Workspace plans. Turn them on during a meeting and captions appear in the participant's target language in real time. Like Zoom's version, it's built-in — no extra window needed.
The free personal Google account tier does not include Translated Captions. Availability and supported language pairs vary by Workspace plan and are documented at support.google.com/meet.
- Pricing: Available in select Google Workspace plans — not available on the free personal tier
- Works on: Google Meet only
- Bot required: No
- Spoken output: No — text captions only
- Languages: A selection of language pairs; current list at support.google.com/meet
Limitations: Platform-locked to Google Meet. Captions are ephemeral at the standard tier — no searchable exported transcript.
4. Microsoft Teams Live Translated Captions — Best for Microsoft 365 Organizations
Microsoft Teams offers Live Translated Captions as part of Teams Premium and certain Microsoft 365 plans. Each participant can select a target language and see meeting speech captioned and translated in real time.
For organizations already running Microsoft 365, this is the natural choice for Teams-native calls. As with the Zoom and Google Meet equivalents, its usefulness ends at the Teams boundary.
- Pricing: Requires Teams Premium or an eligible Microsoft 365 plan — verify current requirements at learn.microsoft.com
- Works on: Microsoft Teams only
- Bot required: No (admin must enable the feature)
- Spoken output: No — text captions only
- Languages: A selection of language pairs; see Microsoft's documentation for the current list
Limitations: Requires Teams Premium on top of the standard Teams license. Platform-locked to Microsoft Teams.
5. Microsoft Translator — Best Free Option (With a Catch)
Microsoft Translator offers a free Conversations feature: multiple participants join a shared translation session, each on their own device, and see others' speech translated into their chosen language in real time. It supports text-to-speech so each device can read the translated speech aloud.
The catch: it is a standalone app experience, not an integration with existing video call platforms. For a video call, all participants need Microsoft Translator open separately alongside their meeting. That friction is manageable for some use cases — particularly in-person conversations — but it's not a transparent drop-in replacement for a browser-tab translation tool.
- Pricing: Free — translator.microsoft.com
- Works on: Standalone web and mobile app; not automatically integrated into Zoom, Teams, Meet, or other call platforms
- Bot required: No
- Spoken output: Yes — device TTS within the Translator app
- Languages: Wide language coverage; see translator.microsoft.com for the current list
Limitations: All participants must actively open and join the Translator session. Does not capture meeting audio from another platform automatically.
6. Notta — Best for Post-Meeting Translated Notes
Notta is an AI note-taker that transcribes meetings in real time and can produce translated summaries and notes, primarily after the meeting concludes. It works via a meeting bot that joins calls or via a browser extension.
Notta's strength is the polished deliverable after the meeting: clean transcript, translated summary, shareable notes. For teams who need multilingual meeting records rather than in-call comprehension, it's a practical choice. It is less suited to mid-call use as a real-time voice translator.
- Pricing: Subscription plans — see notta.ai/pricing for current tiers
- Works on: Zoom, Google Meet, Microsoft Teams, and others via bot or browser extension
- Bot required: Yes
- Spoken output: No
- Languages: Transcription in many languages; translation features vary by plan
Limitations: Meeting bot is visible to other participants and triggers a recording notification in most platforms. Translation experience during the call is secondary to the post-meeting workflow.
7. Otter.ai — Best for English-Primary Teams
Otter.ai is one of the most widely used meeting transcription tools. Its real-time English transcription is genuinely strong — clear speaker labels, rolling AI summaries, and action items that appear as the meeting progresses via OtterPilot.
Translation capability exists in higher-tier plans, but Otter is fundamentally English-primary. For meetings where all participants speak English and the goal is notes and summaries, Otter competes well. For multilingual calls where mid-conversation comprehension matters, it falls short.
- Pricing (otter.ai/pricing): Free (limited minutes) · Pro $16.99/month · Business $30/month
- Works on: Zoom, Google Meet, Microsoft Teams (via OtterPilot bot)
- Bot required: Yes (OtterPilot joins meetings visibly)
- Spoken output: No
- Languages: Primarily English
Limitations: OtterPilot joins the meeting as a visible participant. Translation quality in non-English languages trails dedicated multilingual tools. Not suitable for teams where a bot presence is unwelcome.
Try MirrorCaption on Your Next Call
1 free hour. No credit card. Works alongside browser-based Zoom, Teams, Meet, and Webex in desktop Chrome or Edge.
Open MirrorCaption Free
How to Choose the Right Real-Time Voice Translation App
Four questions narrow the field quickly.
Do you need the translation spoken aloud, or is text enough?
If everyone on the call can read captions, text works fine, and all seven tools above produce text. If one participant cannot easily read a screen, or you need the other side to hear the translated response during a live presentation or face-to-face conversation, only MirrorCaption via Speak Translations and Microsoft Translator (within its own app) support spoken output. For cross-border sales calls where the prospect needs to hear the translation rather than read it, this distinction is decisive.
Are all your video calls on one platform?
If yes — and that platform is Zoom, Meet, or Teams — the built-in translated caption features are the lowest-friction path. No extra login, no extra window, no per-seat add-on beyond the existing plan.
If you host or join calls across multiple platforms, or want the same tool for in-person conversations, platform-native features don't travel. MirrorCaption works across browser-based Zoom, Teams, Meet, and Webex calls in desktop Chrome or Edge, and adds Talk mode for face-to-face use on mobile. For a broader look at cross-platform translation tools, see our best meeting translator 2026 roundup.
Does your organization restrict meeting bots or third-party extensions?
Meeting bots (used by Notta and Otter.ai) join calls as visible participants and trigger a recording notification in most platforms. Many IT policies block or discourage third-party bots. MirrorCaption captures audio from the browser tab directly, so no bot joins the meeting.
Note that organizational policies on browser screen-sharing and web-app access still apply. Many teams can set up MirrorCaption without filing an IT ticket, but check your organization's browser and screen-capture policies. For a direct comparison on the bot question, see MirrorCaption vs Zoom AI Companion.
How often do you actually need translation?
For occasional use (a handful of calls per month), MirrorCaption's one-time free hour or Microsoft Translator's free tier may cover it. For regular use, compare the €99 one-time Premium (200h hosted credit included) against recurring per-seat tools such as Otter Pro at approximately $16.99/month. Setting the currency difference aside, €99 is roughly six months of that subscription, and at two hours of translated calls per week the included 200 hours last about 100 weeks.
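A rough way to sanity-check this kind of comparison for your own usage is sketched below. The prices and hours are the figures quoted in this article; the helper names are illustrative, and the euro/dollar difference is deliberately ignored, so treat the result as an approximation.

```python
# Figures quoted in the article; currency conversion is deliberately ignored.
PREMIUM_ONE_TIME = 99.0       # EUR, one-time Premium price
PREMIUM_HOURS = 200           # hosted hours included with Premium
SUBSCRIPTION_MONTHLY = 16.99  # USD per month (Otter Pro, as quoted above)
HOURS_PER_WEEK = 2            # usage assumption from the article
WEEKS_PER_MONTH = 52 / 12     # average weeks in a month


def break_even_months(one_time: float, monthly: float) -> float:
    """Months of subscription fees that add up to the one-time price."""
    return one_time / monthly


def credit_lifetime_months(hours: float, hours_per_week: float) -> float:
    """How long the included hours last at a steady weekly usage."""
    return hours / hours_per_week / WEEKS_PER_MONTH


months_to_break_even = break_even_months(PREMIUM_ONE_TIME, SUBSCRIPTION_MONTHLY)
months_of_credit = credit_lifetime_months(PREMIUM_HOURS, HOURS_PER_WEEK)

print(f"Break-even vs. subscription: {months_to_break_even:.1f} months")
print(f"Included hours last: {months_of_credit:.1f} months")
```

Swap in your own plan prices and weekly hours; the outcome is sensitive to both, and to the exchange rate on the day you compare.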
Quick Comparison: Real-Time Voice Translation Apps for Video Calls
| Tool | Streaming Real-Time | Spoken Output | Bot Required | Works On | Starting Cost |
|---|---|---|---|---|---|
| MirrorCaption | Yes | Yes (Speak Translations) | No | Chrome/Edge desktop; Chrome mobile | Free 1h; €99 one-time Premium |
| Zoom Translated Captions | Yes | No | No | Zoom only | Paid Zoom plans |
| Google Meet Translated Captions | Yes | No | No | Google Meet only | Select Workspace plans |
| Teams Live Translated Captions | Yes | No | No | Teams only | Teams Premium required |
| Microsoft Translator | Yes | Yes (app TTS) | No | Standalone app only | Free |
| Notta | Partial | No | Yes | Zoom, Meet, Teams | Subscription — see site |
| Otter.ai | Partial (EN) | No | Yes | Zoom, Meet, Teams | $16.99/month Pro |
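If you prefer to filter the table programmatically, the sketch below encodes its rows and applies the selection questions from the previous section. The field names and the `shortlist` helper are illustrative assumptions. Microsoft Translator is modeled as covering no call platform because it runs as a standalone app, which is a simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    streaming: str                 # "yes" or "partial", as in the table
    spoken_output: bool
    bot_required: bool
    platforms: frozenset[str]      # call platforms the tool works with


TOOLS = [
    Tool("MirrorCaption", "yes", True, False,
         frozenset({"Zoom", "Teams", "Meet", "Webex"})),
    Tool("Zoom Translated Captions", "yes", False, False, frozenset({"Zoom"})),
    Tool("Google Meet Translated Captions", "yes", False, False, frozenset({"Meet"})),
    Tool("Teams Live Translated Captions", "yes", False, False, frozenset({"Teams"})),
    # Standalone app: modeled here as integrated with no call platform.
    Tool("Microsoft Translator", "yes", True, False, frozenset()),
    Tool("Notta", "partial", False, True, frozenset({"Zoom", "Meet", "Teams"})),
    Tool("Otter.ai", "partial", False, True, frozenset({"Zoom", "Meet", "Teams"})),
]


def shortlist(tools, *, need_spoken=False, allow_bot=True, platforms=frozenset()):
    """Return names of tools that satisfy every stated requirement."""
    required = frozenset(platforms)
    return [
        tool.name
        for tool in tools
        if (tool.spoken_output or not need_spoken)
        and (allow_bot or not tool.bot_required)
        and required <= tool.platforms   # tool must cover every required platform
    ]


print(shortlist(TOOLS, need_spoken=True))
print(shortlist(TOOLS, allow_bot=False, platforms={"Zoom", "Meet"}))
```

The first call answers "who can speak the translation aloud?" and the second answers "what works on both Zoom and Meet without a bot?".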
Frequently Asked Questions
Does Zoom have real-time voice translation for video calls?
Yes. Zoom offers Translated Captions as part of select paid plans. When a host enables the feature, participants see captions in their chosen target language in real time during the meeting. It is text-only — there is no spoken output. Available language pairs and the plan tier required are listed on Zoom's support page and are updated as Zoom expands coverage.
Is there a real-time voice translation app that doesn't join my meeting as a bot?
Yes. MirrorCaption runs in your browser tab and captures audio directly from the meeting tab in desktop Chrome or Edge. No bot joins the meeting and no additional participant appears in the attendee list. The platform-native options (Zoom Translated Captions, Google Meet Translated Captions, and Teams Live Translated Captions) also require no bot, but each works only within its own platform.
Can a real-time translator speak the translation aloud during a video call?
Yes. MirrorCaption's Speak Translations feature reads the user's translated speech aloud in the target language while the exchange is still live. Playback options include the laptop speaker, a phone paired via QR code, or a Mac virtual microphone that routes translated audio into Zoom, Meet, or Teams as mic input, so the other side hears the translation through the call. Microsoft Translator also supports text-to-speech playback, but this works within its own standalone app rather than as an integrated layer over an existing video call.
How accurate is AI voice translation on video calls?
Accuracy depends on speaker clarity, microphone quality, the language pair, and accent. Tools that pass earlier conversation segments as context into each translation call generally perform better on multi-turn dialogue than tools that translate each sentence in isolation. For the most demanding use cases — legal, medical, high-stakes negotiation — treat AI translation as a strong real-time aid, not a certified substitute for a professional interpreter. For a closer look at how AI translation quality varies by tool and language, see our analysis of real-time translation accuracy.
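The idea of passing earlier segments as context can be sketched as a small rolling window. This is an illustration under stated assumptions: the `translate(segment, context)` callable is a hypothetical stand-in for whatever translation backend a tool uses, and nothing here describes how any specific product implements it.

```python
from collections import deque
from typing import Callable, Deque, List

# Hypothetical backend signature: (segment, earlier_segments) -> translated text.
TranslateFn = Callable[[str, List[str]], str]


class ContextualTranslator:
    """Feeds the last few source segments into each translation call."""

    def __init__(self, translate: TranslateFn, max_context: int = 3) -> None:
        self._translate = translate
        # deque(maxlen=...) silently drops the oldest segment when full.
        self._history: Deque[str] = deque(maxlen=max_context)

    def translate(self, segment: str) -> str:
        context = list(self._history)          # snapshot of earlier segments
        result = self._translate(segment, context)
        self._history.append(segment)          # remember this turn for the next call
        return result


seen_contexts: List[List[str]] = []


def fake_translate(segment: str, context: List[str]) -> str:
    """Stand-in backend that records the context it was given."""
    seen_contexts.append(context)
    return f"[{len(context)} earlier] {segment}"


translator = ContextualTranslator(fake_translate, max_context=2)
outputs = [translator.translate(s) for s in ["a", "b", "c", "d"]]
for line in outputs:
    print(line)
```

A tool that translates each sentence in isolation corresponds to `max_context=0`: every call sees an empty context list.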
What is the best free real-time voice translation app for video calls?
Platform-native options (Zoom Translated Captions, Google Meet Translated Captions, Teams Live Translated Captions) are effectively free if you already pay for the hosting plan, but each is locked to one platform. Microsoft Translator is free with no platform lock-in, but requires all participants to open its standalone app alongside the call. MirrorCaption offers a one-time free hour — no credit card, no monthly reset — which is enough to evaluate the streaming translation experience on a real call before committing to a plan.
Read Every Word — During the Meeting
MirrorCaption works alongside browser-based Zoom, Teams, Meet, and Webex. No bot. No install. 1 free hour to try.
Start for Free
The Bottom Line
Most teams gravitate toward whichever translation feature is built into the platform they already use. That works well when everyone stays on the same tool. The moment a call moves to a different platform, or a conversation happens in person, the platform-native feature disappears entirely.
MirrorCaption is built for that gap: a single browser tab that works across browser-based video calls, captures audio without a bot, and optionally speaks the translated output aloud via Speak Translations — fast enough to keep a real conversation moving. Start with the free 1-hour trial on your next multilingual call.