Neural Machine Translation: How Artificial Intelligence Works When Translating Language

Last updated: February 18, 2017 1:43AM

As most marketing professionals know, an explosion of big data has revolutionized the way that companies drive operational efficiency and innovation—and it’s only set to keep going. Experts predict that data volumes will continue to increase by 40 percent year over year.

Of course, as companies find effective ways to utilize data, they’ve been presented with challenges in dealing with overloads of information. But big data also presents big opportunities—one of which is taking a business global through localization.

Along with huge increases in data, two other factors are creating a shift in the localization industry. The first is the exponential growth of computing power. The second is increased interest in so-called deep learning, a type of machine learning used by Google in its image and voice recognition algorithms.

Bearing these factors in mind, it’s no surprise that deep learning recently kicked up a storm in translation and localization to create what we now know as Neural Machine Translation (NMT). After all, as data volumes and technology advancements increase, so does translatable material. But what exactly is NMT, and how does it increase localization efficiency?

In a recent webinar, Lionbridge’s Director of Machine Translation, Jay Marciano, discussed the application of this new and more accurate translation method and how it’s leading to industry advancements.

How Neural Machine Translation works

Neural Machine Translation is a relatively new paradigm, first explored toward the end of 2014. Before this, machine translation relied on statistical models, which learn from a database of previous translations known as translation memories.

While NMT still trains on translation memories as Statistical Machine Translation does, it uses deep learning—and possibly a higher volume of training data—to build an artificial neural network.

Marciano uses a game of chess to illustrate how Statistical Machine Translation works. In a chess program, there is a limited universe in which a limited number of moves can be made. The program simply calculates all possible moves to find the best one. Similarly, the machine learning that takes place in an SMT system works by comparing n-grams (contiguous groupings of n words in a sentence, up to six words long in this case) from a source sentence to those that occur in the target language to find correlations.
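To make the idea of word groupings concrete, here is a minimal Python sketch of how n-grams can be pulled out of a sentence. The function name `ngrams` and the sample sentence are illustrative assumptions, not part of any particular SMT system.

```python
def ngrams(tokens, n):
    """Return every contiguous grouping of n words from a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical example sentence, split on whitespace for simplicity.
sentence = "the cat sat on the mat".split()

bigrams = ngrams(sentence, 2)    # two-word groupings
six_grams = ngrams(sentence, 6)  # the whole six-word sentence as one grouping
too_long = ngrams(sentence, 7)   # empty: the sentence has only six words
```

An SMT system would count how often groupings like these line up with groupings in the target language. Anything that depends on words farther apart than the longest grouping never appears together in a single n-gram.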

On the other hand, Neural Machine Translation could be described as “raising” a neural system, as Marciano explains. It’s like playing the piano: When you make a mistake, you back up, try again, and repeat until you have it down. Neural MT systems try to find their way through neural networks in the same way.

In this sense, Neural MT is much more effective than the limited, and often inaccurate, n-gram-based model. For one thing, NMT systems run on powerful GPUs (graphics processing units), rather than on CPUs (central processing units) as SMT systems do. Neural MT does take longer to translate a sentence because of the wealth of data involved, just as SMT systems took much longer than older rule-based systems. But Statistical MT presents big problems with languages where rules apply across words that fall outside of the six-word unit.

Of course, NMT does still run into a few issues: for example, when translating highly technical content. But source material containing unknown technical abbreviations would not be translated well by any machine translation system, Neural MT included. For language directions that don’t have much training data—for instance, German to Korean—deep learning opens up the possibility of using indirect, or “pivoted,” training data from the source material of another language.

The major difference between NMT and SMT? When you present training material to the deep learning algorithms, you don’t necessarily tell them what to look for. You let the system find patterns by itself, such as contextual clues around the source sentence. The specifics of the process, however, remain mysterious in many ways.

Neural MT and big data: casting away limited abilities

Neural networks were first used in image and speech recognition programs, by training systems with supervised data—such as an image of a dog with metadata attached. In reading its metadata, the system would know to identify the content of the image as a dog.

Then, the system would try to find the best way through the neural network to make that link, backing up and finding better pathways if it finds the wrong answer, and eventually developing a neural pathway that results in the correct answer. This is the pathway that would be emphasized going forward.
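The try, check, and adjust loop described above can be sketched with a single artificial neuron. This is a toy illustration only: the two numeric features, the labels, the learning rate, and the function names are all made-up assumptions, and real image recognition networks have many layers and millions of weights.

```python
import math


def predict(weights, bias, features):
    """Return the neuron's confidence (0 to 1) that the input is a dog."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=500, learning_rate=0.5):
    """Repeatedly guess, measure the error, and nudge the weights."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            # Back up and adjust: move each weight against its share of the error.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias


# Hypothetical supervised data: (feature values, label), where 1 means "dog".
EXAMPLES = [
    ((1.0, 0.0), 1),
    ((0.9, 0.2), 1),
    ((0.1, 0.9), 0),
    ((0.0, 1.0), 0),
]

weights, bias = train(EXAMPLES)
dog_confidence = predict(weights, bias, (1.0, 0.0))
other_confidence = predict(weights, bias, (0.0, 1.0))
```

After training, the weights that lead to the right label have been strengthened, which is the "emphasized pathway" idea in miniature.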

In speech recognition, for a given recorded sentence in a given language, there is generally only one correct transcription for deep learning to find—making the training pretty straightforward. Translation involves “noisier” training material and is a more complex task.

Yet deep learning and big data, as Marciano describes, allow us to cast away our limited abilities to perceive and analyze the world. As big data yields so much information, we’re able to identify complicated patterns, and associations among these patterns, in ways that are beyond human ability to recognize.

But it’s difficult to build a mental picture of the NMT process. Much of the processing is done in “hidden layers” of complicated data, meaning it’s hard to see how the neural network makes its decisions.

This is why we can only present the training material, let the algorithms do their thing, and tweak the training material if the translations aren’t accurate. Lionbridge uses GeoFluent to clean up errors in the Neural MT output, too.

Using automatic quality evaluation methods, such as BLEU, becomes a gray area, because these methods score a translation by how closely its wording overlaps with a reference translation. If a Neural MT system chooses a translation that’s different from the reference translation for an obscure reason, then it may be penalized for its vocabulary choice, even if it’s perfectly correct.
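The penalty for a valid but different word choice can be illustrated with a simplified overlap score. The sketch below computes only clipped n-gram precision, which is one ingredient of BLEU, not the full metric (it omits the brevity penalty and the averaging across n-gram sizes). The function name and the example sentences are illustrative assumptions.

```python
from collections import Counter


def ngram_precision(candidate, reference, n=1):
    """Fraction of the candidate's n-grams that also appear in the reference."""
    cand = Counter(tuple(candidate[i:i + n])
                   for i in range(len(candidate) - n + 1))
    ref = Counter(tuple(reference[i:i + n])
                  for i in range(len(reference) - n + 1))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Clip each match so a repeated word cannot be credited more times
    # than it occurs in the reference.
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


# Hypothetical reference translation and two candidate outputs.
reference = "the physician examined the patient".split()
identical = "the physician examined the patient".split()
synonym = "the doctor examined the patient".split()

identical_score = ngram_precision(identical, reference, n=1)
synonym_unigram_score = ngram_precision(synonym, reference, n=1)
synonym_bigram_score = ngram_precision(synonym, reference, n=2)
```

Here "doctor" is a perfectly acceptable rendering, yet the synonym candidate matches only four of five single words and two of four word pairs, so its score drops even though a human reader would accept it.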

The future of neural networks and communication

Although it’s tricky to debug a neural network and understand its decision-making, the improvement in fluency we’re seeing from Neural MT is encouraging enough for it to be a strong consideration. So, are any other machine translation vendors providing Neural MT now?

The short answer is no, at least not in production-ready form. There are three Neural MT systems that you can try right now on the internet: Google Translate (which can be integrated into any given computer-aided translation [CAT] tool), Microsoft Translator, and Systran Pure Neural Machine Translation. However, we are still a little bit ahead of the curve in terms of production-ready systems that have complete training tool sets. Look out for announcements about upcoming NMT systems this year from Microsoft, Google, Systran, Baidu, Facebook, Amazon, and others.

The Neural MT rollout will happen first on those language directions that show the biggest improvement over the SMT systems. At Lionbridge, we plan to evaluate available neural translation systems to see how these tools fit into our localization processes and meet our customers’ needs before rolling them out ourselves. Visit our Machine Translation thought leadership page for the latest trends on MT.

But one thing is for certain: Neural MT is a game changer. Considering how young this model is, improvements in translation have been enormous compared to the last 10 years. The difference between traditional translation and machine translation will continue to narrow—and we’re intent on finding out just how far this can go.

To learn more about the benefits of Neural MT and our expectations for the future of machine learning, watch the full webinar: Neural MT: What It Is, and How it Impacts Translation Efficiency


AUTHOR

Lionbridge
