Real-Time Medical Interpretation in Any Browser

No phone call. No interpreter wait. Works for telehealth and in-person appointments in 60+ languages.

MirrorCaption is a browser-based medical interpretation app that streams real-time transcription and translation during clinical appointments — no phone call required, no per-minute billing, no bot joining your telehealth call. Open a tab on your laptop or phone, select the patient's language, and read what they're saying as they say it. It supports 60+ languages and works for both in-person encounters and telehealth visits on any platform.

It's 2:30 on a Tuesday afternoon. Dr. Linh Nguyen's 3 o'clock is a new intake — Gurpreet Singh, 58, possible chest pain, Punjabi speaker. The clinic's staff interpreter works Monday, Wednesday, Friday. Linh has three options: wait 8–12 minutes for a telephone interpreter to connect (assuming one is available for Punjabi at this hour), ask the patient's adult son to describe chest-pain symptoms he has no cardiology vocabulary for, or open a browser tab.

She opens the tab. Selects Punjabi → English. Thirty seconds later, Gurpreet is describing his symptoms while Linh reads the translation in real time — word by word, as he speaks.

Language Barriers in Healthcare Are a Clinical Risk

Roughly 25 million people in the United States have limited English proficiency (LEP): they speak a language other than English at home and have significant difficulty communicating in English in a healthcare setting. According to US Census Bureau data, the largest LEP populations speak Spanish, Chinese, Vietnamese, Tagalog, Korean, Arabic, Punjabi, and Haitian Creole.

The clinical consequences are well documented. Research has found that LEP patients experience significantly higher rates of serious adverse events tied to miscommunication, including wrong medication dosages, misunderstood discharge instructions, and delayed diagnoses. These aren't edge cases — they're patterns that repeat across emergency departments, primary care clinics, and specialist offices wherever language access is inconsistent.

Legally, federally funded healthcare providers are required to provide language access under Title VI of the Civil Rights Act. The HHS Office for Civil Rights enforces this requirement. The National CLAS Standards (Culturally and Linguistically Appropriate Services) go further: Standard 5 explicitly requires offering language assistance to every patient who needs it, free of charge, at all points of contact.

Most clinics understand the obligation. The gap isn't awareness — it's logistics. Telephone interpreter lines involve wait times. Video remote interpretation (VRI) tablets require enterprise contracts. And the informal alternatives are ethically and medically problematic in ways that are hard to defend if an adverse event occurs.

How Clinics Handle Language Access Today — and Where Each Approach Falls Short

Over-the-Phone Interpretation (OPI)

LanguageLine Solutions, CyraCom, and similar services connect callers to a live interpreter in 200+ languages. The model works when an interpreter is available. In practice, call wait times for less common languages — Somali, Hmong, Haitian Creole, Punjabi — can run 8–15 minutes during peak hours. Billing is per-minute: roughly $1.50–$3.00 depending on language, time of day, and volume commitment.

Do the math out loud: a 20-minute appointment with 20 minutes of active interpretation runs $30–$60 each time. A clinic seeing three LEP patients per week pays $90–$180 per week, or $4,680–$9,360 per year. For a five-clinician practice where each clinician sees similar volume, that scales to $23,400–$46,800 annually. Community health centers operating on grant funding are often priced out before they start.
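The arithmetic above can be checked with a short sketch. The function and variable names are illustrative, and the rates, visit length, and weekly volume are the figures quoted in the text, not vendor price lists.

```typescript
// Illustrative helper (hypothetical name): yearly cost of per-minute interpretation.
function annualInterpretationCost(
  ratePerMinute: number,   // dollars per interpreted minute
  minutesPerVisit: number, // interpreted minutes per appointment
  visitsPerWeek: number,   // LEP appointments per week
  weeksPerYear: number = 52
): number {
  return ratePerMinute * minutesPerVisit * visitsPerWeek * weeksPerYear;
}

// Figures from the text: $1.50 to $3.00 per minute, 20 minutes, three visits per week.
const lowEstimate = annualInterpretationCost(1.5, 20, 3);  // 4680
const highEstimate = annualInterpretationCost(3.0, 20, 3); // 9360

// Five clinicians, each with the same weekly volume.
const practiceLow = 5 * lowEstimate;   // 23400
const practiceHigh = 5 * highEstimate; // 46800

console.log(`One clinician: $${lowEstimate} to $${highEstimate} per year`);
console.log(`Five clinicians: $${practiceLow} to $${practiceHigh} per year`);
```

Changing any one input (a lower negotiated rate, shorter visits, more patients) shows how quickly the yearly total moves.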

OPI is also functionally incompatible with telehealth. When the appointment is a Zoom or Teams video call, the clinician cannot hold a phone to their ear for a telephone interpreter while conducting the visit. The workflows don't merge.

Video Remote Interpretation (VRI) Tablets

Hospitals deploy VRI systems — iPads or dedicated tablets mounted on stands, connected to video interpreter services like MARTTI or InDemand. The model provides a visual connection, which matters for emotional register and for signed languages. The limitation: VRI hardware is tied to the institution. A community health center serving rural patients via telehealth can't use a VRI tablet in a Zoom window. Solo practitioners can't access enterprise VRI pricing tiers. The tool solves the in-hospital in-person problem well, and nothing else.

Google Translate — The Tool Most Practices Already Use

It's worth naming the elephant in the room. A meaningful number of primary care physicians admit to using Google Translate with patients. It's understandable: it's free, instant, and available on any phone. It's also inadequate for clinical use, in several specific ways.

Google Translate processes typed text snippets, not streaming speech. It has no speaker diarization (no way to separate the clinician's voice from the patient's), no medical-context training for terminology accuracy, and no export — the exchange disappears when the window closes. For documentation, it leaves no record. The tool is designed for translating a restaurant menu in a foreign city. It was not designed for a differential diagnosis conversation.

There is also no HIPAA pathway. Google's terms do not include a Business Associate Agreement for the consumer Translate product. Clinicians using it are doing so outside any compliance framework.

Staff and Family as Interpreters

Using untrained staff or family members as interpreters is the most common informal approach — and the one with the highest documented error rate. Family interpreters omit and modify clinical information, often unintentionally, due to limited vocabulary and emotional involvement. A parent asking their child to interpret a cancer diagnosis, or a spouse softening a prognosis they don't want to deliver, are patterns that clinical staff recognize immediately. Untrained staff face similar vocabulary gaps and are not trained in interpreter ethics (confidentiality, accuracy, non-interference).

There is a fifth option. MirrorCaption runs entirely in your browser. Open the tab before the patient walks in. Remote teams across languages use it for the same reason clinicians do: the transcript appears while someone is still speaking.

Try Free →

What a Browser-Based Medical Interpreter Looks Like in Practice

MirrorCaption runs as a Progressive Web App — no download, no browser extension, no IT approval required. Open it in Chrome, Edge, or Safari on a laptop, tablet, or phone. Here is the clinical workflow:

  1. Open the app at mirrorcaption.com/app. Sign in or continue with the free tier (2 hours/month, no credit card).
  2. Select audio source. For in-person appointments, choose microphone only. For telehealth calls, choose system audio + microphone to capture both sides of the video call.
  3. Set the languages. Select the patient's spoken language and your reading language. The transcript displays both columns side by side.
  4. Begin the session. Words appear on screen in under 500 ms of being spoken. Partial words auto-correct as context accumulates.
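The auto-correction in step 4 is typical of streaming speech recognition, where the recognizer first emits provisional text and later replaces it with a final version. A minimal sketch of that display logic follows. The `Token` shape, the `isFinal` flag, and the `LiveTranscript` class are hypothetical and are not taken from MirrorCaption or any particular speech API.

```typescript
// Hypothetical token shape: a piece of text plus whether the recognizer has committed to it.
interface Token {
  text: string;
  isFinal: boolean;
}

class LiveTranscript {
  private finalized: string[] = [];   // text that will no longer change
  private provisional: string[] = []; // current best guess for the unfinished tail

  // Each update carries newly finalized tokens plus a fresh guess for the tail.
  // The previous guess is discarded, which is what makes words "auto-correct".
  apply(update: Token[]): void {
    this.provisional = [];
    for (const token of update) {
      if (token.isFinal) {
        this.finalized.push(token.text);
      } else {
        this.provisional.push(token.text);
      }
    }
  }

  // Text to show on screen: committed words followed by the current guess.
  render(): string {
    return this.finalized.concat(this.provisional).join("");
  }
}

const demo = new LiveTranscript();
demo.apply([{ text: "chest ", isFinal: false }, { text: "pane", isFinal: false }]);
console.log(demo.render()); // provisional guess
demo.apply([{ text: "chest ", isFinal: true }, { text: "pain ", isFinal: true }]);
console.log(demo.render()); // corrected once context arrived
```

The design choice worth noting is that provisional text is never edited in place. It is thrown away and replaced on every update, so the screen cannot accumulate stale guesses.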

In-person mode: Hand the phone or tablet to the patient. They speak into the microphone; you read the translation on your screen. You speak; they read your words translated into their language. One device, one browser tab, no extra hardware.

Telehealth mode: MirrorCaption captures the call audio from the browser tab using the browser's built-in display-audio capture. No bot joins the call. No participant receives a notification. The patient sees and hears only you; MirrorCaption runs in a second window on your side.
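As a rough illustration of the two audio modes, the sketch below maps each mode to the browser capture calls it would need: `getUserMedia` for the microphone and `getDisplayMedia` for call audio. The names `AudioMode`, `planCapture`, and `openAudioTracks` are hypothetical, the small `...Like` interfaces exist only so the sketch compiles without DOM typings, and none of this is MirrorCaption's actual code. Audio support in display capture varies by browser, and the user has to agree to share audio in the browser's picker.

```typescript
type AudioMode = "in-person" | "telehealth";

interface CapturePlan {
  microphone: { audio: boolean };                     // constraints for getUserMedia
  display: { video: boolean; audio: boolean } | null; // constraints for getDisplayMedia, if needed
}

// Decide which capture calls a mode needs. Pure function, no browser access.
function planCapture(mode: AudioMode): CapturePlan {
  const microphone = { audio: true };
  if (mode === "in-person") {
    return { microphone, display: null };
  }
  // Display capture is requested with video enabled; the video track is dropped afterwards.
  return { microphone, display: { video: true, audio: true } };
}

// Minimal structural stand-ins for MediaStreamTrack, MediaStream, and MediaDevices.
interface TrackLike { kind: string; stop(): void; }
interface StreamLike { getTracks(): TrackLike[]; }
interface MediaDevicesLike {
  getUserMedia(constraints: { audio: boolean }): Promise<StreamLike>;
  getDisplayMedia(constraints: { video: boolean; audio: boolean }): Promise<StreamLike>;
}

// Open the streams a mode needs and keep only their audio tracks.
async function openAudioTracks(mode: AudioMode, devices: MediaDevicesLike): Promise<TrackLike[]> {
  const plan = planCapture(mode);
  const streams: StreamLike[] = [await devices.getUserMedia(plan.microphone)];
  if (plan.display) {
    streams.push(await devices.getDisplayMedia(plan.display));
  }
  const audioTracks: TrackLike[] = [];
  for (const stream of streams) {
    for (const track of stream.getTracks()) {
      if (track.kind === "audio") {
        audioTracks.push(track);
      } else {
        track.stop(); // the shared tab's video is not needed for transcription
      }
    }
  }
  return audioTracks; // may hold only the microphone if no call audio was shared
}
```

In a browser, `openAudioTracks("telehealth", navigator.mediaDevices)` would prompt for microphone access and then for the tab or screen to share. Nothing in this flow joins the call as a participant, which matches the "no bot" behavior described above.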

🏥

Community Clinic Intake

A Somali-speaking patient arrives for a same-day appointment. Staff interpreter unavailable. Open the app on the intake desk laptop, select Somali → English, hand the patient the screen to read. History-taking begins in under a minute.

📱

Telehealth Visit

Conducting a follow-up via Zoom with a Cantonese-speaking patient in a rural county. MirrorCaption captures the call audio in a second tab — no interpreter phone call, no bot joining the video. Both sides communicate in real time.

🤝

In-Person Handoff

An urgent care patient speaks Haitian Creole. The clinician opens MirrorCaption on their phone, switches to face-to-face mode, and places it on the examination table between them. Both parties read the other's words as they're spoken.

🩺

Rural Clinic, No Staff Interpreter

A solo practitioner in a rural clinic sees Vietnamese-speaking patients three days a week. The nearest certified interpreter is 40 miles away. MirrorCaption runs on the clinic's existing laptop — no hardware, no contract, no per-minute bill.

Languages That Matter Most in Clinical Settings

MirrorCaption supports 60+ languages, including the highest-need clinical languages identified in US and EU health equity data, such as Spanish, Vietnamese, Punjabi, Somali, and Haitian Creole.

Translation is bidirectional — the clinician speaks or types in English and the patient reads in their language, and vice versa, in the same session. Speaker detection automatically labels distinct voices in the transcript, so the record shows who said what. That matters for clinical documentation and for reviewing the conversation after the appointment.

One honest note on accuracy: MirrorCaption performs well on conversational medical dialogue — symptom descriptions, medication instructions, follow-up scheduling. It is not trained on specialist clinical jargon at the level a certified medical interpreter is. For general outpatient appointments, that is not usually a limitation. For neurosurgery consultations or psychiatric evaluations, a human interpreter remains the appropriate standard.

HIPAA and Privacy — A Straight Answer

The first question any healthcare provider asks is: "Is this HIPAA compliant?" The honest answer requires two parts.

What MirrorCaption does with audio: none is stored on servers. Audio streams from your browser to the Soniox speech-to-text API for real-time transcription, then is discarded. MirrorCaption's servers never receive or retain audio. Transcripts are saved locally in your browser's IndexedDB storage — they live on your device, not in a cloud database. When you close the session, the audio is gone.

The structural compliance question: MirrorCaption is a software tool, not a healthcare entity. It does not currently offer a Business Associate Agreement (BAA). For practices operating in strictly regulated environments with formal HIPAA compliance programs, that matters and you should review it with your compliance officer before deploying any non-BAA tool in a patient-facing workflow.

For practical comparison: Google Translate — the tool a significant number of clinics currently use informally — also does not offer a BAA, routes text through Google's servers, and logs query data under Google's standard privacy policy. MirrorCaption's local-storage model is meaningfully more privacy-protective than the informal alternatives most practices already rely on, even without a formal BAA.

The Real Cost of Language Access — and a More Affordable Option

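The math that telephone interpretation vendors don't put on their websites: at $1.50–$3.00 per minute, three 20-minute interpreted appointments per week add up to $4,680–$9,360 per year.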

VRI tablet services run $0.80–$2.00/minute plus equipment costs and enterprise licensing. Community health centers and solo practitioners are not the target customer for enterprise VRI — they're priced out before the first conversation.

MirrorCaption's Lifetime plan is €49 one-time. That includes 200 hours of managed transcription and translation. At three 20-minute language-assist appointments per week, that's 52 hours per year — the Lifetime plan covers nearly four years at that volume before a top-up is needed. Voice Packs add 5 hours for €2.99 or 15 hours for €7.99 when you need more.
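The coverage estimate can be reproduced the same way. This sketch uses only the prices and hours stated above; the function and variable names are illustrative.

```typescript
// Hours of interpretation used per year at a given appointment volume.
function hoursPerYear(visitsPerWeek: number, minutesPerVisit: number, weeksPerYear: number = 52): number {
  return (visitsPerWeek * minutesPerVisit * weeksPerYear) / 60;
}

// Years a prepaid block of hours lasts at that yearly usage.
function yearsCovered(includedHours: number, usedPerYear: number): number {
  return includedHours / usedPerYear;
}

const yearlyUsage = hoursPerYear(3, 20);              // three 20-minute visits per week
const lifetimeYears = yearsCovered(200, yearlyUsage); // 200 included hours

// Effective price per hour for each option, in euros.
const lifetimePerHour = 49 / 200;
const smallPackPerHour = 2.99 / 5;
const largePackPerHour = 7.99 / 15;

console.log(`Usage: ${yearlyUsage} hours per year`);
console.log(`Lifetime plan lasts about ${lifetimeYears.toFixed(2)} years`);
console.log(`Per hour: plan ${lifetimePerHour.toFixed(3)}, 5h pack ${smallPackPerHour.toFixed(3)}, 15h pack ${largePackPerHour.toFixed(3)}`);
```

At the stated volume the result is a little under four years, and the included hours work out cheaper per hour than either Voice Pack.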

Who This Is — and Isn't — For

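MirrorCaption is best suited for routine outpatient appointments: symptom descriptions, medication instructions, and follow-up scheduling, in person or over telehealth, particularly at community clinics and solo or rural practices without a staff interpreter on hand.

MirrorCaption is not a replacement for certified human interpreters in high-stakes encounters that involve complex specialist language, such as surgical risk explanations, neurosurgery consultations, psychiatric evaluations, and complex dosing calculations.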

If you're uncertain whether a clinical encounter falls into the "human interpreter required" category, default to professional service. MirrorCaption is designed for the routine appointments where an 8-minute OPI wait creates friction without adding safety. For contrast, see how MirrorCaption differs from general meeting tools, which don't address the clinical context at all.

Frequently Asked Questions

Is a real-time AI medical interpreter accurate enough for clinical use?

For routine outpatient appointments — describing symptoms, confirming medication schedules, discussing follow-up care, explaining a common diagnosis — yes, with appropriate clinical judgment. MirrorCaption's accuracy is comparable to other AI translation tools that clinicians already use informally, with the advantage that both parties see the same real-time text. For high-stakes decisions involving complex specialist jargon (surgical risk explanation, detailed psychiatric assessment, complex dosing calculations), a certified human medical interpreter remains the appropriate standard. Use your clinical judgment case by case, as you would with any other tool.

Does it work during a Zoom or Teams telehealth call?

Yes. MirrorCaption captures browser tab audio using the browser's built-in display-audio API (getDisplayMedia), so it works with any browser-based video call — Zoom, Teams, Google Meet, doxy.me, and others. No bot joins the call. No participant receives a notification that translation software is running. The patient sees and hears only you. MirrorCaption runs in a separate browser window on your device. You can check the full overview of meeting translation tools for a broader comparison.

Does the patient need to install anything?

No. In in-person mode, you open the app on your device and hand it to the patient. They see the screen — no account, no app, no download needed on their part. For telehealth, MirrorCaption runs on your side only; the patient uses their normal video call interface unchanged.

Is MirrorCaption HIPAA compliant?

MirrorCaption does not store audio on servers and keeps transcripts in your browser's local storage only. It does not currently offer a Business Associate Agreement (BAA). For strictly regulated clinical environments with formal HIPAA compliance programs, discuss this with your compliance officer. For the majority of outpatient clinic workflows, the local-storage architecture is meaningfully more private than tools most practices already use without a compliance review — including Google Translate, which routes text through Google's servers under its consumer privacy policy.

What languages are available for patient appointments?

60+ languages, including Spanish, Mandarin Chinese, Cantonese, Vietnamese, Tagalog, Punjabi, Hindi, Urdu, Arabic, Somali, Haitian Creole, Russian, Portuguese, Korean, Polish, Romanian, French, German, Italian, and Dutch. The full list is available in the app settings. If you serve a language community not on this list, contact support — language coverage is actively expanding.

Start Before Your Next Appointment

2 free hours every month. No credit card. No install. Open the tab before the patient walks in.

Get Started Free

Language barriers in clinical settings cause preventable harm. Professional interpretation services are the gold standard — and for complex, high-stakes encounters, nothing replaces a qualified human interpreter. For the routine appointment where a 12-minute OPI wait creates friction without adding safety, a browser-based medical interpretation app is the practical middle ground. No phone calls. No enterprise contracts. No IT approval. Open the tab and read what your patient is saying — in real time.