Universal Translators in the Next Few Years?

“No hablo español.”

The universal translator is a sci-fi staple: Star Trek made it famous. Star Wars had C-3PO. Hitchhiker’s Guide had the Babel fish. Stargate and Doctor Who both had some variation of a voice-to-voice translating device. In some ways, the future is already here: Google Translate can turn around a workable text translation almost instantly (automatically in Chrome), and it’s letting the multilingual web talk to itself. Word Lens will even translate text you see in real time as augmented reality on your smartphone. Text translation is all well and good, but when will the holy grail arrive? When will voice-to-voice translation become a reality? When can you finally toss your Rosetta Stone software?

Actually, it’s already here – it’s just not as smooth as you might have hoped (yet). All the basic pieces of software necessary for a universal translator have already arrived: speech recognition (voice-to-text), language translation (text-to-text), and speech synthesis (text-to-voice). In fact, this combination is already being used in a number of sectors with current technology. Granted, the process is pretty clunky, but it’s here and it works.
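To make the three-stage idea concrete, here is a minimal Python sketch of how the pieces chain together. It is an illustration only: the class name `SpeechTranslator`, the `fake_*` functions, and the one-entry `PHRASEBOOK` are hypothetical stand-ins, not the API of any real product. A real system would plug an actual recognizer, translator, and synthesizer into the same three slots.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SpeechTranslator:
    """Chains the three stages: voice-to-text, text-to-text, text-to-voice."""
    recognize: Callable[[bytes], str]    # speech recognition
    translate: Callable[[str], str]      # language translation
    synthesize: Callable[[str], bytes]   # speech synthesis

    def __call__(self, audio: bytes) -> bytes:
        text = self.recognize(audio)
        translated = self.translate(text)
        return self.synthesize(translated)


# Toy stand-ins (assumptions for illustration): "audio" is just UTF-8 text.
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


PHRASEBOOK = {"i do not speak spanish": "no hablo español"}


def fake_translate(text: str) -> str:
    # Look the phrase up; fall back to the input if it is unknown.
    return PHRASEBOOK.get(text.lower().strip(" ."), text)


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


translator = SpeechTranslator(fake_recognize, fake_translate, fake_synthesize)
spoken = translator(b"I do not speak Spanish.")
```

Because each stage is just a function, any one of them can be swapped out without touching the other two, which is roughly why improvements in recognition, translation, or synthesis can each be dropped into an existing system.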

The Army has been using systems developed under DARPA’s Spoken Language Communication and Translation System for Tactical Use (TRANSTAC) program to help soldiers communicate in foreign countries. One such system, IraqComm, was developed in conjunction with SRI International and translates back and forth between English and colloquial Iraqi Arabic. Check out the system in action:

I also found this video of Ray Kurzweil demoing a basic version of a translator a few years ago:

From these videos alone, you can get a good idea of what needs to improve. First, the translation process isn’t nearly fast enough to hold a fluid, natural conversation. This particular hurdle shouldn’t be a difficult one to overcome; faster computers will be able to run the recognition, translation, and synthesis algorithms much more quickly. However, there is a limit to how fast these systems can get: the recognition software needs to hear most of a sentence before it can pass a text copy on to the translator. Languages don’t correspond to one another word-for-word (a verb that opens a sentence in one language may close it in another), so truly instantaneous, word-by-word translation isn’t really possible. At best, we should expect news-correspondent delays.
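The delay described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function name `translate_stream` is hypothetical, sentence boundaries are detected naively from punctuation, and `translate_sentence` stands in for whatever translation engine is used. The point is only that nothing can be emitted until a whole sentence has been heard.

```python
from typing import Callable, Iterable, Iterator, List

SENTENCE_END = (".", "?", "!")


def translate_stream(
    words: Iterable[str],
    translate_sentence: Callable[[str], str],
) -> Iterator[str]:
    """Buffer recognized words and translate one full sentence at a time."""
    buffer: List[str] = []
    for word in words:
        buffer.append(word)
        # Word order differs between languages, so wait for the sentence
        # to finish before handing anything to the translator.
        if word.endswith(SENTENCE_END):
            yield translate_sentence(" ".join(buffer))
            buffer = []
    if buffer:
        # Flush whatever is left if the speaker trailed off mid-sentence.
        yield translate_sentence(" ".join(buffer))


# str.upper is a placeholder "translator" so the buffering is easy to see.
chunks = list(translate_stream(["where", "is", "it?", "thank", "you"], str.upper))
```

Here the listener hears nothing until "it?" arrives, then gets the whole first sentence at once. Faster hardware shrinks the translation step but not the wait for the speaker to finish.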

Second, the translation is a bit rough, and doesn’t always catch the finer points of what was said. Again, this is technology that has been improving over time (and nowadays, text translators can almost always capture the general idea in their translation). The newer generation of translators, Google Translate included, is a significant leap from the generations that came before it. It wouldn’t surprise me to see the newer algorithms picking up slang, idioms, and the like as they are refined.

Finally, the automated voice sounds mechanical and awkward. I find this to be true of all the speech software I’ve encountered, and it tends to bother me (however, I have friends who listen to PDFs this way and don’t mind it). Certainly this kind of software is improving as well, but I have yet to hear speech software that sounds completely natural. This might actually be the last hurdle to be overcome. It reminds me of how you can’t lock eyes with someone over a webcam because the cameras aren’t behind the monitor: we’re always looking slightly to the side. The ideal speech translator would reproduce your own voice, as if you spoke that language, but needless to say this is a long way off. There might also be an uncanny valley along the way.

These three pieces are now being integrated more seamlessly, and the hardware is already here to support the improving software. Imagine using your smartphone and a Bluetooth headset to translate in real time in a foreign country. I doubt it’ll be absolutely perfect in the foreseeable future, but there are already some early versions coming. Earlier this year Google told The Times it was working on such a package, and hopes to have something that will “work reasonably in a few years time.”

We can add one more job to the robot-replacement endangered list: translators.

Drew Halley
Drew Halley is a graduate student researcher in Anthropology and is part of the Social Science Matrix at UC Berkeley. He is a PhD candidate in biological anthropology at UC Berkeley studying the evolution of primate brain development. His undergraduate research looked at the genetics of neurotransmission, human sexuality, and flotation tank sensory deprivation at Penn State University. He also enjoys brewing beer, photography, public science education, and Dungeness crab. Drew was recommended for the Science Envoy program by UC Berkeley anthropologist/neuroscientist Terrence Deacon.