Google’s AI can translate languages it never learned
- Fariha Khan
- November 25, 2016
Google says that Google Translate no longer has a separate system for each language pair. Instead it uses a single system, with an artificial token added to the input to tell the model which language to translate into. The AI learns from enormous numbers of examples, which is why the team wondered whether it could translate between two languages without ever being taught how.
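As a rough illustration (not Google's actual implementation), steering one shared model can be as simple as prepending a marker that names the desired output language; the `<2xx>` token format and the helper below are assumptions:

```python
# Illustrative sketch only: a single multilingual model is steered by an
# artificial token prepended to the source text that names the desired
# output language. The "<2xx>" format and this helper are assumptions,
# not Google's actual API.

def add_target_token(source_sentence: str, target_lang: str) -> str:
    """Prepend a target-language token so one shared model handles every pair."""
    return f"<2{target_lang}> {source_sentence}"

# The same English sentence routed to two different output languages:
print(add_target_token("Hello, how are you?", "ja"))  # "<2ja> Hello, how are you?"
print(add_target_token("Hello, how are you?", "ko"))  # "<2ko> Hello, how are you?"
```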
So can the system translate between a language pair it has never seen before? Korean⇄Japanese is a good example: no Korean⇄Japanese sentence pairs were ever shown to the system during training. Remarkably, it still produces reasonable Korean⇄Japanese translations, despite never having been taught that combination. This is called “zero-shot” translation.
Another astounding aspect of the AI is that it operates as a ‘black box’: the programmers do not know exactly how the system does what it does. According to Google, the team therefore had to find a way to work out how the AI was actually accomplishing all this.
Another question is coupled with the success of zero-shot translation: is the system learning a shared representation in which sentences with the same meaning are encoded in similar ways regardless of language, i.e. an “interlingua”? Using a 3-dimensional projection of the network’s internal data, the team could see how the system behaves as it translates a set of sentences between all possible pairs of English, Japanese and Korean.
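To give a flavour of that kind of analysis, here is a minimal sketch of projecting high-dimensional sentence representations down to three dimensions with t-SNE. The `encode()` stand-in and the use of scikit-learn are assumptions for illustration; Google analysed its real model’s internals with its own tooling.

```python
# Illustrative sketch only: project internal sentence representations to 3-D
# so that sentences with the same meaning can be inspected across languages.
import numpy as np
from sklearn.manifold import TSNE

def encode(sentence: str) -> np.ndarray:
    """Stand-in for the translator's internal sentence representation (hypothetical)."""
    seed = sum(ord(c) for c in sentence) % (2 ** 32)
    return np.random.default_rng(seed).normal(size=1024)

# Two meanings, each expressed in English, Japanese and Korean.
sentences = [
    "The house is blue.", "家は青いです。", "집은 파란색입니다.",
    "I like coffee.", "コーヒーが好きです。", "저는 커피를 좋아합니다.",
]
vectors = np.stack([encode(s) for s in sentences])

# Project the high-dimensional internal states down to 3 dimensions for viewing.
points_3d = TSNE(n_components=3, perplexity=2, init="random").fit_transform(vectors)
print(points_3d.shape)  # (6, 3): one 3-D point per sentence
```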
The analysis shows that the network must be encoding something about the semantics of the sentence rather than simply memorising phrase-to-phrase translations. This is taken as a sign that an interlingua exists within the network.
A separate lip-reading task was reported by New Scientist. A professional lip-reader trying to interpret 200 randomly selected TV clips achieved a success rate of just 12.4%, whereas Google’s DeepMind system achieved 46.8%.
The artificial-intelligence system was trained on about 5,000 hours of footage from six different TV programmes (BBC Breakfast, Question Time and more). In total, the videos contained 118,000 sentences.
Just by looking at each speaker’s lips, the system transcribed whole phrases. It was also able to cope when the audio and video were out of sync.
The system was first shown the correct correspondence between sounds and mouth shapes. With this information it could work out how far the two feeds were out of sync when they did not match, and realign them.
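As a rough sketch of that kind of realignment (not DeepMind’s actual method), the lag between an audio feature track and a video feature track can be estimated by cross-correlation; the toy arrays below are placeholders for real features such as speech energy and mouth openness:

```python
# Minimal sketch under stated assumptions: given per-frame audio features and
# per-frame video features, estimate how far the feeds are out of sync via
# cross-correlation, then shift the video back into alignment.
import numpy as np

def estimate_video_lag(audio_feat: np.ndarray, video_feat: np.ndarray) -> int:
    """Return how many frames the video lags behind the audio (positive = late)."""
    corr = np.correlate(audio_feat - audio_feat.mean(),
                        video_feat - video_feat.mean(), mode="full")
    # np.correlate output index i corresponds to lag k = i - (len(video_feat) - 1)
    best_lag = int(np.argmax(corr)) - (len(video_feat) - 1)
    return -best_lag

# Toy signals: a burst of activity that shows up 3 frames later in the video feed.
t = np.arange(200)
audio = np.exp(-(t - 80) ** 2 / 50.0)
video = np.roll(audio, 3)

lag = estimate_video_lag(audio, video)   # -> 3
realigned_video = np.roll(video, -lag)   # shift the video back into sync
print(lag)
```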
One potential use mentioned by the team is silent dictation to Siri in noisy settings, with the iPhone reading your lips through its camera instead of listening to your voice.
With this and much more, let us keep our fingers crossed and see what else Google and AI can do in the future.