Turn Every Call Into a Language Lesson

Real-time transcription in 60+ languages. Tap any word to see the original. Build vocabulary from real conversations.

Meeting transcription for language learners gives you something textbook apps don't: a searchable, word-level record of every real conversation you have in your target language.

Yuki has studied Japanese for three years. Her weekly session with a Tokyo-based tutor runs 45 minutes on Zoom. Last Tuesday, her tutor used a phrase she'd heard before but never fully understood: 「そのまま使えばいいよ」. She caught the encouraging tone. She missed the nuance. By the time she thought to ask, the topic had moved on.

That's the gap no language drill fills. You need the tutor. You need the real conversation. But you also need the record: the exact words, the original alongside the translation, and a way to save what you didn't catch before the next sentence arrives. That's what MirrorCaption does. Open it beside your Zoom call. Read the transcript in real time. Tap any translated word to see the original. Save 「そのまま」 to your vocabulary deck before the lesson moves on.

Why Language Apps Can't Teach You This

Duolingo teaches vocabulary through repetition. Babbel runs you through scripted scenarios. Both are useful for building a foundation. Neither prepares you for a real conversation.

Real conversations are fast, full of idioms, and don't repeat themselves. When your Japanese tutor says 「そのまま使えばいいよ」 in the middle of a 45-minute session, there's no notification that says "new vocabulary item." There's just the sentence, the context, and your ability to catch it.

Language researchers have identified comprehensible input (authentic spoken language at or just above your current level) as a primary driver of acquisition. Scripted exercises are comprehensible, but not authentic. Real tutoring sessions are authentic, but they disappear. You finish the call having learned something. You just can't always say what.

MirrorCaption makes real conversations reviewable. It doesn't replace your tutor; it gives you the tools to get more from every session: a real-time transcript, a tap-to-original feature that links each translated word back to its source, and a vocabulary deck built from the actual language your tutor uses.

How MirrorCaption Turns a Call into a Lesson

Using MirrorCaption for language learning requires no setup and no paid account to start. Here's the workflow from opening the app to reviewing vocabulary:

  1. Open MirrorCaption in a browser tab: it works in Chrome, Safari, or Edge on any desktop or laptop. No download, no extension.
  2. Start your tutoring call in another tab: Zoom, Google Meet, Skype, FaceTime, or any browser-based platform. MirrorCaption works alongside all of them.
  3. Select your audio source: "Tab audio" captures your tutor's voice; add "Microphone" to include your own. Using both simultaneously is the default for most learners.
  4. Read the real-time transcript: transcription streams word by word as your tutor speaks. Tap any translated word to see the source term it came from. If 「難しい」 appears in English as "difficult," tap it and the original Japanese highlights in the transcript.
  5. Save words to your vocabulary deck: one tap saves the word alongside the sentence it came from. Open your deck after the session to review in context, not in isolation.

The session transcript stays in your browser until you clear it. Export as plain text or Markdown to use as study material or share with a study partner.
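To make the "word plus its sentence" idea concrete, here is a minimal Python sketch of how a saved word could be stored with its context and exported as Markdown. MirrorCaption's actual data model and export layout aren't described on this page, so `VocabEntry`, `VocabDeck`, and the Markdown layout below are illustrative assumptions, not the app's real format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class VocabEntry:
    """One saved word plus the sentence it was heard in (hypothetical model)."""
    original: str      # word in the source language, e.g. "そのまま"
    translation: str   # word as shown in the translated transcript
    sentence: str      # full original sentence, kept for context


@dataclass
class VocabDeck:
    """A deck of saved words; names and layout are assumptions for illustration."""
    entries: List[VocabEntry] = field(default_factory=list)

    def save(self, original: str, translation: str, sentence: str) -> None:
        # Skip exact duplicates so tapping the same word twice adds one entry.
        entry = VocabEntry(original, translation, sentence)
        if entry not in self.entries:
            self.entries.append(entry)

    def to_markdown(self) -> str:
        # One bullet per word, with the source sentence indented beneath it.
        lines = ["# Vocabulary deck", ""]
        for e in self.entries:
            lines.append(f"- **{e.original}**: {e.translation}")
            lines.append(f"  - Context: {e.sentence}")
        return "\n".join(lines)


deck = VocabDeck()
deck.save("そのまま", "as it is", "そのまま使えばいいよ")
deck.save("そのまま", "as it is", "そのまま使えばいいよ")  # duplicate tap, ignored
print(deck.to_markdown())
```

Keeping the sentence next to each word is what lets you review in context later, rather than from a bare word list.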

Ready to try it in your next session? Start free: 2 hours per month, no credit card needed.

Try Free

The Features Language Learners Actually Need

Most transcription tools are built for business teams: meeting summaries, action items, CRM integrations. Those features don't help a language learner. These do.

Vocabulary Builder

Every call in your target language is a vocabulary source. The problem is retention: you hear an unfamiliar word, and three sentences later it's gone. MirrorCaption's vocabulary builder lets you save any word with a single tap during the call. The word saves alongside the sentence it came from, so you have context, not just a list of isolated terms.

Tap to See the Original Word

Reading a translation is useful. Knowing which source word produced it is how you actually learn. Every translated word in MirrorCaption links back to the term it came from. Tap "agreement" in the English transcript and the corresponding Japanese, French, or German word highlights in the original. This turns passive reading into active comprehension practice: the difference between consuming a translation and learning from it.
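One way to picture tap-to-original is as a lookup from each translated word to the source words it came from. The Python sketch below is only an illustration of that idea: the page doesn't say how MirrorCaption aligns words internally, and the sentence, the `alignment` table, and `original_for` are all hypothetical.

```python
from typing import Dict, List

# Hypothetical example data: a tokenized source sentence and its translation.
source_words: List[str] = ["この", "問題", "は", "難しい"]
translated_words: List[str] = ["This", "problem", "is", "difficult"]

# Assumed alignment: translated word index -> indices of the source words
# it came from. "is" has no entry because this Japanese sentence has no
# explicit copula.
alignment: Dict[int, List[int]] = {0: [0], 1: [1], 3: [3]}


def original_for(translated_index: int) -> List[str]:
    """Return the source words linked to a tapped translated word."""
    return [source_words[i] for i in alignment.get(translated_index, [])]


print(original_for(3))  # tapping "difficult"
print(original_for(2))  # tapping "is", which has no direct source word
```

Words with no direct counterpart return an empty list here, which is often where a translation has added or smoothed over something.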

Side-by-Side Original and Translation

On desktop, the original transcript and translation run in parallel columns, not one replacing the other. Advanced learners use this to check their own comprehension: read the target language, form your interpretation, then compare it to the translation column. It's the closest thing to a bilingual parallel text, happening live. The side-by-side view is also how you catch the moments where a translation smooths over something the original actually said, which is usually where the most interesting vocabulary lives.

Real-Time Streaming, Under 500ms

The transcript arrives as the speaker talks, with under 500ms end-to-end latency, using Soniox's streaming speech-to-text engine. That means you can use the transcript during the call: save a word while the sentence is still on screen, flag something for clarification, or catch what you missed before it scrolls past. Post-meeting transcripts are useful for review. Real-time transcription is useful for the conversation itself.

60+ Languages, Including Non-European Ones

Japanese, Mandarin, Cantonese, Korean, Arabic, Hebrew, Hindi, alongside Spanish, French, German, Portuguese, Russian, and 50 others. Most consumer translation tools focus on high-resource European languages. MirrorCaption covers the languages that are genuinely hard to find good real-time tools for. See the multilingual transcription guide for a full language breakdown.

Every Place Language Learning Happens

🎓

Online Tutoring Sessions

Book a session on iTalki or Preply. Your tutor joins Zoom or Google Meet. MirrorCaption runs in a separate browser tab. Your tutor never sees a bot or a notification, just the lesson.

💻

International Work Meetings

Learning Mandarin? Every call with a Taipei colleague is listening practice. Learning French? The Paris client call is a lesson. The transcript is searchable, so after the meeting you can review the exact phrase you didn't catch.

📚

University Lectures and Classes

Open MirrorCaption in a second browser tab during the lecture. The transcript builds in real time. After class you have a searchable, bilingual record to supplement your notes, plus a vocabulary deck from your professor's actual language.

🤝

In-Person Conversation

MirrorCaption has a Talk mode for face-to-face conversation. Open it on your phone, set the source language to the one the other person speaks, and place the phone between you. Both sides read the real-time transcription. There's no app for the other person to install.

Carlos is a software engineer from Mexico City working for a Berlin startup. Team standups and sprint planning run in English, his second language. For the first year he followed the general flow but missed the specific technical objections, the exact scope agreements, and the idioms his German colleagues used to signal hesitation. He started using MirrorCaption during the weekly planning calls, reading the Spanish transcript alongside the live audio. After three months he stopped reading the translation. He still uses MirrorCaption, but now mainly to save new technical terms to his vocabulary deck: "technical debt," "scope creep," "rubber duck debugging." All of them come from real engineering conversations, not from flashcard apps.

What Language Learners Pay

Most language learning tools charge monthly. That works if you use them daily. If you're learning through real conversations (tutoring calls, work meetings, classes), your usage is irregular, and a subscription becomes expensive for occasional learning.

Tool | Real-Time? | Vocabulary Builder | Languages | Price
MirrorCaption | ✓ Under 500ms | ✓ Yes | 60+ | €49 once
Otter.ai | ✓ English only | No | English primary | $16.99/month
Notta | | No | 58 | $13.99/month
Google Translate | ✗ Text only | No | 130+ | Free
Duolingo | ✗ Scripted only | App-only | 40+ | $9.99/month

The free tier (2 hours per month, no credit card) covers a weekly 30-minute tutoring session for the full month. That's enough to evaluate whether it improves your learning before you spend anything.

For heavier use: the Lifetime plan is €49, one payment, 200 hours included. Voice Packs add hours at €2.99 per 5 hours when you need more. The Annual plan (€29/year, 100 hours) makes sense if you have regular weekly sessions. Either way: no per-seat pricing, no auto-renewing trial.
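If you want to compare the options by effective price per hour, the arithmetic is simple. The Python sketch below uses only the prices and included hours quoted on this page; the names `PLANS`, `cost_per_hour`, and `free_tier_sessions` are made up for illustration.

```python
# Plan figures as quoted on this page; "hours" are the included hours.
PLANS = {
    "Lifetime": {"price_eur": 49.00, "hours": 200},
    "Annual": {"price_eur": 29.00, "hours": 100},
    "Voice Pack": {"price_eur": 2.99, "hours": 5},
}

FREE_HOURS_PER_MONTH = 2


def cost_per_hour(price_eur: float, hours: float) -> float:
    """Effective price of one included hour, in euros."""
    return price_eur / hours


def free_tier_sessions(session_minutes: int) -> int:
    """How many full sessions of a given length fit in the free monthly allowance."""
    return (FREE_HOURS_PER_MONTH * 60) // session_minutes


for name, plan in PLANS.items():
    rate = cost_per_hour(plan["price_eur"], plan["hours"])
    print(f"{name}: €{rate:.3f} per hour")

print(f"Free tier: {free_tier_sessions(30)} sessions of 30 minutes per month")
```

On these figures the Lifetime plan works out to roughly €0.25 per hour, the Annual plan to €0.29, and a Voice Pack to about €0.60, while the free tier fits four 30-minute sessions a month.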

MirrorCaption doesn't replace a tutor; it makes every session more effective. And unlike the monthly fees of other real-time translation tools, the €49 Lifetime plan is a one-time payment, less than the cost of a single iTalki session.

Start free. 2 hours per month, no credit card. Try it in your next class or tutoring session.

Get Started Free

Frequently Asked Questions

Does MirrorCaption work alongside Zoom or Google Meet?

Yes. MirrorCaption works with any browser-based meeting platform: Zoom, Google Meet, Teams, Skype, or any other. Open it in a second browser tab while your call runs in the first. Select "Tab audio" to capture your tutor's voice. There's no integration to set up and no extension to install.

Will my tutor see a bot join the call?

No. MirrorCaption never joins your meeting. It captures audio locally in your browser and runs in a separate tab on your side. Your tutor's participant list stays unchanged. Your tutoring platform has no visibility into it.

Which languages does MirrorCaption support?

60+ languages, including Japanese, Mandarin, Cantonese, Korean, Arabic, Hebrew, Hindi, Spanish, French, German, Portuguese, Russian, Italian, Dutch, and many others. You set the source language (what your tutor speaks) and the target language (what you want to read) independently.

How is this different from Google Translate?

Google Translate is designed for text snippets. It doesn't stream speech, has no speaker detection, no vocabulary builder, no session history, and no export. MirrorCaption is purpose-built for spoken conversation: real-time streaming transcription, word-level tap-to-original, and a vocabulary deck that persists across sessions. See how it compares to other tools in our guide for education use cases.

Does MirrorCaption store my tutoring session recordings?

No. Audio is processed in real time and never stored on any server: it streams from your browser to Soniox for transcription, then is discarded. Transcripts are saved locally in your browser's storage and stay on your device. You can export or delete them at any time.

Can I review the full transcript after the class?

Yes. The full session transcript stays in your browser until you clear it. Export it as plain text or Markdown to use as study material, add it to your notes app, or share it with a study partner for joint review.

Your Next Session Is a Lesson Either Way

MirrorCaption makes sure you can review it and build vocabulary from it. Start free, no credit card needed.

Start Free, 2h/month