ChatGPT Outperforms Google Translate and UD Talk in Chinese-Japanese Medical Translations

June 18, 2026 By Blab.com AI Team

When a single‑center trial in Taiwan set out to compare three translation tools, the unexpected winner was the large‑language model ChatGPT.

Researchers at Fu Jen Catholic University Hospital recorded 20 cardiology and pulmonology outpatient visits between December 2024 and November 2025. Each encounter was audio‑recorded, transcribed verbatim, anonymised, and fed into ChatGPT, Google Translate, and the subtitle app UD Talk. Eight professional medical interpreters scored the accuracy of selected exchanges on a 6‑point Likert scale, while 14 Japanese‑speaking lay participants rated overall satisfaction.

The results were striking. ChatGPT achieved a median score of 5.0 (IQR 4.0‑5.0) for both accuracy and satisfaction in both specialties, whereas Google Translate and UD Talk scored a median of 2.0 (IQR 1.0‑3.0). The difference was statistically significant (P < 0.001). A similarity analysis found that Google Translate and UD Talk produced identical translations in 87% of exchanges, whereas ChatGPT’s output matched theirs in only 5%.

The study attributes ChatGPT’s edge to the context‑aware generation of its underlying GPT‑4o model, which can parse colloquial speech, incomplete sentences, and specialised medical terminology. In contrast, Google’s neural machine translation engine and UD Talk’s speech‑recognition pipeline tend toward literal, sentence‑level translation. One illustrative error involved the Chinese term “腎指數”: Google Translate and UD Talk rendered it as “kidney‑index finger,” while ChatGPT correctly translated it as “renal function value.”

Despite ChatGPT’s superior performance, the authors caution that it should not replace professional interpreters. Trained human interpreters provide not only accurate translation but also cultural mediation and clarification, functions that current AI systems cannot fully replicate.

The study notes several limitations. The language pair is narrow (Chinese‑Japanese), the sample size is small (20 visits), and a single physician per specialty was involved, all of which may affect generalisability. The analysis relied solely on GPT‑4o; newer or alternative LLMs may perform differently.

The findings suggest that AI‑assisted translation could serve as a supplementary aid in outpatient settings where interpreter services are scarce, especially for brief interactions or preliminary triage. However, the authors recommend continued human oversight to safeguard patient safety and communication quality.

Future research should broaden the scope to other language pairs, larger datasets, and diverse clinical contexts to determine whether ChatGPT’s advantages persist across settings.

In sum, the study provides evidence that ChatGPT can outperform conventional machine‑translation tools in translating Chinese‑Japanese medical dialogue, but it underscores the need for complementary use alongside professional interpreters to ensure accurate, culturally appropriate, and safe patient‑provider communication.
