How Google Translate Works, and Why It Doesn’t Measure Up
Posted by Transparent Language on Sep 2, 2015 in Archived Posts
With over 200 million daily translations, there’s no denying that Google Translate is a wildly popular translation service. Indeed, machine translation has come a long way since its early days. Instead of translating words at face value, machine translators now rely on complex algorithms to deliver more accurate translations, and some even take colloquial language and idioms into account. Still, the very nature of machine translators prevents them from ever doing a human’s job. Let’s take a look at how machine translators (such as Google Translate) work, what their limitations are, and why they can’t replace the quintessential human touch.
How machine translation works
Google Translate, like other machine translators, operates on statistics rather than rules. That is, it looks for patterns in hundreds of millions of documents that have already been translated by human translators. Google Translate makes special use of UN documents, which are translated into all six official UN languages and thus provide ample linguistic data. This way, it can weigh the options that different human translators have chosen for a given phrase and make an educated guess by selecting the one that occurs most frequently. For example, it detects that the Spanish phrase “darse cuenta” is usually translated as “realize” in English. Therefore, based on statistics, Google Translate will correctly translate the phrase as “realize”, rather than producing a word-for-word translation, which would look more like “give account”.
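To make the "most frequent wins" idea concrete, here is a minimal sketch in Python. It is not Google's implementation: the tiny corpus, its counts, and the function names are all invented for illustration, and a real statistical system weighs far more than raw frequency.

```python
from collections import Counter, defaultdict

# Hypothetical toy "parallel corpus": each pair is a Spanish phrase and the
# English rendering one human translator chose. The counts are made up.
ALIGNED_PHRASES = [
    ("darse cuenta", "realize"),
    ("darse cuenta", "realize"),
    ("darse cuenta", "notice"),
    ("darse cuenta", "realize"),
    ("darse cuenta", "give account"),
]


def build_phrase_table(pairs):
    """Count how often each source phrase was rendered as each target phrase."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def translate_phrase(table, phrase):
    """Return the most frequent human rendering, or None if the phrase is unseen."""
    candidates = table.get(phrase)
    if not candidates:
        return None
    best, _count = candidates.most_common(1)[0]
    return best


phrase_table = build_phrase_table(ALIGNED_PHRASES)
print(translate_phrase(phrase_table, "darse cuenta"))  # realize
```

Note that the sketch can only return renderings it has already seen. A phrase missing from the corpus yields nothing, which hints at why the size of the training data matters so much.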
Finding a body of linguistic data large enough to support sound statistical analysis is no easy feat. Given that more documents are available in English than in any other language, Google Translate almost always uses English as an intermediary step when translating between two languages that aren’t English. For example, when translating from Russian to Spanish, Google Translate will first translate the text from Russian to English, and then from English into Spanish. As a result, translations between two languages other than English actually involve two translation steps.
In fact, some language pairs involve even more steps. If you want to translate some text from, say, Catalan to Japanese, Google will first translate it into Spanish, as most existing Catalan translations are in Spanish. Then, this Spanish version of the original Catalan text will be translated into English. Finally, the English version of the Spanish version of the Catalan text will make it into Japanese and, if you’re lucky, it will still bear some resemblance to the original meaning.
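The chaining described above can be sketched in a few lines of Python. Everything here is a hypothetical stand-in: the one-word dictionaries play the role of full translation models, and the route is hard-coded to the Catalan, Spanish, English, Japanese path from the example.

```python
# Hypothetical "direct" models: assume only these language pairs have enough
# translated data. Each one-word dictionary stands in for a real model.
DIRECT_MODELS = {
    ("ca", "es"): {"gat": "gato"},
    ("es", "en"): {"gato": "cat"},
    ("en", "ja"): {"cat": "猫"},
}


def translate_via_route(text, route):
    """Translate text hop by hop along a list of language codes.

    Returns the final text plus a log of every intermediate result, so the
    number of hops (and any drift in meaning) stays visible.
    """
    hops = []
    for source_lang, target_lang in zip(route, route[1:]):
        model = DIRECT_MODELS[(source_lang, target_lang)]
        text = model[text]  # each hop only sees the previous hop's output
        hops.append((source_lang, target_lang, text))
    return text, hops


result, hops = translate_via_route("gat", ["ca", "es", "en", "ja"])
for source_lang, target_lang, output in hops:
    print(f"{source_lang} -> {target_lang}: {output}")
```

Because each hop works only from the previous hop's output, a mistake made early in the chain cannot be corrected later; it is simply carried along into the final language.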
Why it doesn’t make the cut
Google Translate does a good job with very basic translations — especially those whose target language is English — and now even offers alternative interpretations for certain words and phrases. However, the very methodology upon which Google Translate is based prevents it from ever competing with human translators. Here’s why:
Statistics don’t have feelings. Google Translate is based on statistics: it chooses the “best” translation based on how certain words and phrases have been translated in other documents. Machine translators therefore choose the most probable translation, not the most interesting or poetic one. Even when translations are accurate (which they often aren’t), they adopt a robotic, lifeless tone. It takes a human translator, with feelings and creativity, to reproduce the tone, color, and vibrancy of the original text.
Machine translations struggle with complex grammar. Language is based on rules, and as a result, a statistics-based translator like Google will struggle with complex grammatical concepts, such as the difference between the imperfect and preterite past tenses in Romance languages. This is especially true given that Google almost always uses English — a language that does not grammatically distinguish between preterite and imperfect tenses — as an intermediary step when translating into Romance languages. Therefore, Google Translate often incorrectly translates the imperfect past as the preterite past (and vice versa), making ongoing or habitual acts seem like one-time, completed events.
Google can’t write for an audience. Every translator knows that you need to tailor your work to whom you’re writing for. For example, if this article were written for a casual blog, my use of the word “whom” in the previous sentence may come off as overly formal. However, given that this article appears on a language interest blog, grammarians and language experts may applaud my correct distinction between “who” and “whom” (though they may scoff at my decision to end a sentence in a preposition). Machines cannot make such judgment calls — Google cannot take into account who the intended audience is for the article it translates. Only a human translator can make that kind of decision.
Google Translate vs. a human being
To illustrate the difference between Google Translate and a living, breathing human translator, I will employ both to translate the following Spanish text, which appears on a website selling Argentine wine, into English. Try to guess which translation was written by a human, and which was produced by a machine (spoiler alert: it won’t be hard).
Original text:
Después de una excelente cosecha como la que le precedió, la cosecha 2009 muestra sus virtudes en este vino base Cabernet Sauvignon, mas el ensamble de tres variedades de gran personalidad que encontraron en San Rafael el terruño ideal para la expresión de sus mejores cualidades. Vino aun de color rojo violáceo intenso a pesar de los años en botella, ya en la copa se nos muestra intenso y seductor con aromas especiados que se entremezclan con nítidos y frescos aromas a frutas de ciruelas, cerezas negras y moras, mientras que se van desprendiendo lentamente los aromas tostados que recuerdan a granos de café molidos.
Translation 1:
After an excellent harvest like the one that preceded it, the 2009 vintage shows its virtues in this Cabernet Sauvignon base wine, plus the assembly of three varieties of great personality that found in San Rafael the ideal terroir for the expression of its best qualities. It was still intense violet red despite years in the bottle, already in the glass it is intense and seductive with spicy aromas that intermingle with clear and fresh aromas of plums, black cherries and blackberries, while leaving slowly releasing toasted aromas reminiscent of ground coffee beans.
Translation 2:
After a great harvest like the one that preceded it, the 2009 harvest shows its virtues in this cuvée Cabernet Sauvignon. It’s an expressive mixture, articulated by three varieties of great personality that are found in San Rafael, the perfect region to bring out its best qualities. The wine still preserves a strong purplish-red color, in spite of the amount of years gone by since it was bottled. Once poured into the glass, it remains intense and seductive. Its spiced scents mix together with clear and fresh fruit aromas of plum, black cherry and blackberry, while its toasted scents release slowly, reminiscent of the delicious smell of ground coffee beans.
You probably guessed right: the first translation was done by Google; the second, by a professionally trained bilingual translator. As you can see, the machine translation is comprehensible, but inelegant and sometimes confusing. On the other hand, the human translation flows smoothly, is organized coherently, and matches the elegant tone of the original article.
Machine translation isn’t without its perks. It can be a life-saver in a pinch, when you need to get a rough idea of what a certain phrase means, or when you need to decipher a street sign while traveling in a foreign country. However, when it comes to translating important documents, the limitations of machine translation prevent it from being a viable option. As the above examples demonstrate, a translation that is based on statistical patterns will never match the quality of one created by a professional, who understands the rules and nuances of language. Indeed, machine translation has come a long way, but it’s still far from replacing the human touch.
This post is from Paul, an English teacher who lives in Argentina. Paul writes on behalf of Language Trainers, a language teaching service that offers foreign language movie reviews as well as other free language-learning resources on their website. Check out their Facebook page or send an email to paul(at)languagetrainers.com for more information.
Comments:
Matthew Rothenberg:
“Never”? That’s a bold assertion based on only a few decades of machine translation. I’ve certainly noticed in recent years that Google Translate has become far more comprehensible and handles many more idioms in the languages I can read.
I do think there are choices a human translator makes that a machine would be hard-pressed to match. And certainly the current state of the science is limited in the ways you describe. However, machine translation is helping people get the gist of writing far more complex than street signs. (I often use it to get a general understanding of what’s being discussed in Arabic newspapers, for example. It’s not pretty, but it’s much better than nothing.)
So … All good points, but IMO short-sighted about what “never” really means.
Jay:
@Matthew Rothenberg Like it😃
Chris:
Not exactly on-topic, but “Every translator knows that you need to tailor your work to whom you’re writing for”
even as a native English speaker, I had to read that twice!
–> “…tailor your work to the person for whom you are writing” maybe
“tailor your work to your audience” would be even clearer, but misses out your deliberate use of “whom”.
Just playing devil’s avocado,
Chris.
paul avermaete:
“…tailor your work to the person for whom you are writing”,,,
“c’est le Ton qui fait la Chançon”,,,(French proverb)
Very interesting explanation. I think the best we users can do is ALWAYS first translate into English, and then into the chosen language. Do you agree?
thanks, paul Antwerp Belgium
Jon:
@paul avermaete Le Ton doesn’t like that either, lol.
.tailor your work for the person to whom you are writing..
..tailor your work to whomever you are writing for..
just having fun
JOn
chris:
” c’est le Ton qui fait la Chançon ”
does not work well in Google Translate.
Terry E Ferrell:
Google Translate doesn’t work that well. All of the stuff on my phone still comes out in Korean. I can’t even get into the music programs that I want from Google Translate. All for nothing: Korean and K-pop stuff I cannot read. What is the problem?
Lily W:
Google Translate is good for single words. Not full sentences or paragraphs. The translations are completely wrong and I’m not going to pretend I know why. Don’t use it for important conversations. You’ll just look like a fool. Shell out some money and purchase Rosetta Stone or something.
Lily W:
P.S. This was a great article!!
Keshia:
@Lily W Yes this is a good article
Elena:
Hi!
It was interesting to read about 2 intermediaries in the translation from Catalan to Japanese. When I explored this Google feature in 2011, I found that Ukrainian and Russian were “on equal terms”: Russian was not an intermediary language for translating to and from Ukrainian. It seemed a bit political))
Translations from Russian to Ukrainian and vice versa were 95% correct (which gives hope that translation between grammatically very close languages can already be automated for the most part, e.g. Turkic, Eastern/Southern/Western Slavonic languages).
http://www.translationdirectory.com/articles/article2408.php
Janmejai:
nice article but I needed a technical article in detail.
Stephsnie:
I think Google Translate is better than not trying at all. Have you ever spoken to someone who spoke very little English but you could understand what they needed even if you wanted to giggle at how they asked? Americans usually don’t even try to express themselves in another language. If they are encouraged to try through the tool of Google, it is priceless!
Toni Molik:
I tried to translate one word from Malayalam (Thirovosthi) to English. I don’t understand the procedure. How does it work? Toni
P De Mario:
I think that Google gives us an idea of what we want to translate, but the final conclusions depend on us.
Sam:
What is missing here is the fact that not all human translators are excellent linguists translating into their native languages. Human translations can be awkward and inaccurate as well. Google Translate makes it possible to read news or other websites in foreign languages. Sure, the output is not perfect, but it’s usually much better than what you could get by trying to look up each word one by one in a dictionary. Much faster, too.
fari:
Google cannot translate at all. It does not make any sense. Very disappointed.
Swift:
Example: https://translate.google.com/translate?hl=en&sl=auto&tl=en&u=http%3A%2F%2Fotorten.ru%2Fdocheri-otortena.html The entire text is translated word by word, often with wrong words that are assumed to be typos or homonyms, and with no syntax at all. And it doesn’t accept any corrections; from the AI’s point of view this is a 100% sure translation.
Michael:
This thing about Google Translate using intermediary languages… how do you know that? What’s your source?
j:
@Michael Here is a discussion of the subject. The topmost answer provides translation examples showing that even for French->German translation of very short texts (a simple sentence), Google Translate is going through English.
https://www.quora.com/Does-Google-Translate-use-English-as-an-intermediary-step-language
Adrian Wallwork:
The example given makes no sense. No human would translate a document on the fly and then not spend a few minutes revising it. Google Translate simply provides you with a first draft which you can (must) then work on.
Patrick:
Good info…some thoughts…
As a theologian we have had to deal with biblical translations and versions since forever…for example the popular Jerusalem Bible in English is considered a version since it is a translation from the original French translation which was translated from the original languages. In other words the English JB is a translation of a translation…and in these cases serious biblical students know something is usually lost when that happens. However key meanings usually are not.
AI gets better the more it is used. I use Google Translate when I get stuck writing to my French family and friends in French. I will try a phrase or a sentence in various ways before hitting the send key. I use the one that seems to be the clearest. But the author is right…the human touch along with kindly corrections is the best teacher.
JM Jalaron:
The battle between man and machine concerning language translation is a constant phenomenon. It seems endless, given that machine translators like Google Translate have acquired productive approaches to text translation through the years. But personally, I never really trust any translation machine from the world wide web, simply because I still feel secure with human translators: I know I can count on them when it comes to portraying the perfect thoughts and emotions in the translated language. So it’s a NO NO to machine translators. They often suck. And when they do, it’s more hopeless than world peace!
mike:
Recently I received an e-mail from the Kiev Archives in Ukraine responding to my request for information/documents regarding a genealogy project for family members in Ukraine. I received a response in Ukrainian and translated it on Google Translate. The English version said they had the records. When I requested those records, they informed me that I had misunderstood them and that Google Translate was incorrect. I also had a Russian/Ukrainian/English/German translator over there in Ukraine inform me that Google Translate was wrong. We are talking about the word “not” in a sentence saying records have been found versus records have not been found. This is not good. That is too big a difference for a translation to get wrong. Please fix your Google translation problem. Do you have another suggestion? I would appreciate your response. Thank you.
Sincerely,
Mike Yesnes
Mota:
Current google translation : “After an excellent harvest like the one that preceded it, the 2009 vintage shows its virtues in this Cabernet Sauvignon base wine, but the ensemble of three varieties of great personality that found in San Rafael the ideal soil for the expression of its best qualities. It even came from deep red violet despite the years in the bottle, and in the glass we are shown intense and seductive with spicy aromas that are intermingled with crisp and fresh aromas of fruits of plums, black cherries and blackberries, while they leave slowly releasing the roasted aromas reminiscent of ground coffee beans.”
Easier to upgrade Machine translations 🙂
Transparent Language:
@Mota Upgraded with the help of people who speak the language, most likely! Comes full circle. 😛
Brent K:
In the last 3 weeks, I’ve found myself in a situation where I’m living with a Chinese person who speaks essentially zero English while I speak zero Chinese. For the first day or so, it was just awkward nods of pretending to understand one another.
So the next day I installed the Google Translator App for Android. It has a conversation mode, where we can take turns at speaking and it will translate what we say and even read it out in our languages. From then on, we were having lengthy discussions and were able to co-ordinate our actions for the next three weeks. I think over that time in total, we’ve used the app as a live translator for 10 hours. Granted, some of the translations from Chinese to English weren’t smooth, but they were good enough to carry conversations about what needed to be done around the house, what to make for dinner, and even complex topics about differences in our cultures.
I’m not saying that at this stage Google Translator is as good as a live human translator (it’s definitely improved since 2015). But it’s amazing to me that I’ve been able to live and communicate in this situation, and all through my cheap phone with a free app. We’ve also learned a little of each other’s language on the way.
Because of this experience of the last 3 weeks, I am extremely impressed with Google Translator App, esp. the live translation.
Transparent Language:
@Brent K Glad to hear this! We’re genuinely impressed by and excited about the improvements Google Translate has made (especially in the last few years since this post was published). For informal situations like surviving with a roommate, ordering a meal, and so on, it can be a life saver. For individuals, organizations, and government agencies for whom formal/high-proficiency language skills are not optional, it still doesn’t quite measure up. There’s a time and place for Google Translate and machine translation in general (it can even be a helpful tool when learning a language!), as well as for seriously learning a language.
Guerry’s:
Google Translate is not good in Khmer!
Alex Potemkin:
Just a couple of days ago, I started to look away from Google Translator. I used it frequently because English is not my native language: sometimes I translate complicated things to understand them faster, and even more frequently I use it to check my English writing, with the simple idea “if the Machine can understand me, so a Human will too”.
But something went wrong. In the last months(?) Google Translator $$$ (at least on my Android phone, when I use it via “tap and translate”) tends to ignore huge parts of translated texts. $$$ Sometimes it’s one or a couple of words, but frequently it’s whole sentences or a big block of text across a couple of sentences. It changes the whole sense of the translation dramatically.
Which part of machine algorithm is responsible for that? 🙂
P.S. For this particular text, Google Translator did the same trick. I marked the part which was ignored with $$$ at the beginning and the end of the lost block.
P.P.S. Sorry my English 🙂
Ronald:
This brings clarity into mind. Well done!