If you’ve ever tried to translate something online, you know how frustrating that can be. Some words translate flawlessly, others translate verbatim but don’t seem to fit in the sentence, and others don’t translate at all. Why is it all so hit or miss?
The Washington Post says it’s all in the numbers. Researchers used to rely on linguistic models to translate text, but they came to realize that it’s not so easy to get a program to “think” in a certain language, and even harder to get it to understand the cultural nuances behind that language.
IBM researchers thought there was a better way of going about machine translation. They believed statistics and probabilities could help solve the problems with translation sites. Machines don’t need to understand the translations they produce; they just need to produce accurate translations that people can understand.
The system that followed took a massive number of words and sentences in each language and looked at how sentences are most commonly constructed, so it could place words where they most likely belong. Now that researchers have the stats under control, they’re adding native syntax back in.
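To make the statistical idea concrete, here is a minimal sketch in Python. It is not IBM’s system or any real translation engine; the four-sentence corpus, the function names, and the scoring choices (a Dice score for word pairs, neighboring-word counts for word order) are all assumptions made up for illustration. It shows the two steps described above: picking the most likely word from counts, then arranging the words the way the target language most commonly does.

```python
from collections import Counter
from itertools import permutations

# Hypothetical toy corpus of (English, Spanish) sentence pairs.
# Real systems learn from millions of pairs, not four.
PARALLEL = [
    ("the house", "la casa"),
    ("the table", "la mesa"),
    ("green house", "casa verde"),
    ("green table", "mesa verde"),
]

def train(pairs):
    """Count how often words appear, appear together, and sit side by side."""
    src_count, tgt_count = Counter(), Counter()
    pair_count, bigrams = Counter(), Counter()
    for src, tgt in pairs:
        src_words, tgt_words = src.split(), tgt.split()
        src_count.update(set(src_words))
        tgt_count.update(set(tgt_words))
        # Every source word is paired with every target word in the same sentence pair.
        pair_count.update((s, t) for s in set(src_words) for t in set(tgt_words))
        # Adjacent target-language words, used later to judge word order.
        bigrams.update(zip(tgt_words, tgt_words[1:]))
    return src_count, tgt_count, pair_count, bigrams

def best_word(word, src_count, tgt_count, pair_count):
    """Pick the target word that co-occurs most strongly with `word` (Dice score)."""
    def dice(t):
        return 2 * pair_count[(word, t)] / (src_count[word] + tgt_count[t])
    return max(tgt_count, key=dice)

def translate(sentence, model):
    """Translate word by word, then choose the ordering with the most familiar neighbors."""
    src_count, tgt_count, pair_count, bigrams = model
    words = [best_word(w, src_count, tgt_count, pair_count) for w in sentence.split()]
    def fluency(order):
        return sum(bigrams[b] for b in zip(order, order[1:]))
    return " ".join(max(permutations(words), key=fluency))

MODEL = train(PARALLEL)
print(translate("the green house", MODEL))  # prints "la casa verde"
```

Notice that the program never saw “the green house” in its corpus and knows no grammar. Word by word it would produce “la verde casa”, the kind of verbatim result that doesn’t fit the sentence; the word-order counts are what move “verde” after “casa”. Trying every ordering only works for very short sentences, so treat this as a picture of the idea rather than a practical method.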
Machine translations today are a marriage of math and language. This does not mean machines are capable of translating with the same degree of accuracy as people, but the gap is definitely a lot smaller than it used to be.
To get the most out of a machine translation (if you absolutely must use one), there are two things you should know. First, machine translations are most accurate between languages in the same family. For example, Spanish and French are both Romance languages and share common roots and grammatical structures. Second, keep your English simple. Writing everything in the present tense, for instance, helps you avoid potential hang-ups.