Lionbridge Expert Commentary: Automated Translation Analysis

Lionbridge technology experts examine the Machine Translation and generative AI paradigms and share insights into the latest automated translation trends.

Machine Translation Technology Maintains Its Relevance Despite the Disruptive Nature of Generative AI


Changes abound: Understanding developments in automated translation

We’ve been saying for some time that the Machine Translation (MT) paradigm was ripe for disruption. Read our expert commentaries, and you’ll learn why.

Our automated translation experts offer insight into numerous topics, including:

  • The translation performance of MT engines and generative AI (GenAI) models at given points in time and what the results mean in a larger context
  • The limitations of automated translation tools
  • Ways to bolster the effectiveness of Machine Translation

The more you understand MT and GenAI, the more you can deploy the tools selectively to meet your needs. Capitalize on the strengths offered by each paradigm to ultimately achieve enhanced translation efficiency, increased content output, and cost savings.

Featured Lionbridge Expert Commentary

Noteworthy GPT-4 Peculiarities, October 2023

We’ve enhanced the Lionbridge Machine Translation (MT) Tracker, given the prevalence and promise of GenAI / Large Language Models (LLMs). From here on in, the tracker will include GPT-4 translation results in addition to GPT-3.5 and Davinci results and, of course, Neural MT (NMT) engine performance.

What are some of our latest findings? Some noteworthy GPT-4 peculiarities.

We encountered several issues with GPT-4, including slow performance, occasional failures to return a translation, and inconsistent behavior, such as translations missing in some runs but not in others.

Finding #1 — GPT-4’s failure to translate some text.

GPT-4 failed to translate a particular sentence in our MT test set.

After some research, we determined that a term with a sexual connotation in particular contexts caused the issue. To be clear, the sentence in our test set was entirely standard and acceptable. Nonetheless, the term triggered GPT-4’s sexual content filter, and the model censored the translation of that sentence, outputting nothing. We were surprised by this result for two reasons:

  • The typical use of that term in isolation had no issues.
  • The context of that particular sentence had no problematic interpretation.

This observation led us to conclude that perhaps part of the GPT-4 filtering mechanism is based on a simple forbidden word list that also includes ambiguous terms. This approach is problematic because it is prone to overfiring and producing false positives, which is a serious issue for professional translation.
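To see why a simple blocklist overfires, consider the toy filter below. The word list and sentences are invented for illustration; we do not know GPT-4’s actual filtering mechanism.

```python
# Toy illustration (invented word list): a context-blind blocklist filter
# of the kind we suspect may be involved. Not GPT-4's actual mechanism.

FORBIDDEN = {"breast", "escort"}  # ambiguous terms with perfectly benign senses

def is_blocked(sentence: str) -> bool:
    """Block any sentence containing a listed word, regardless of context."""
    words = {w.strip(".,!?").lower() for w in sentence.split()}
    return not FORBIDDEN.isdisjoint(words)

# A standard culinary sentence is censored: a false positive.
print(is_blocked("Chicken breast is high in protein."))  # True (overfires)
print(is_blocked("The patient recovered quickly."))      # False
```

Because the filter never looks at context, every benign use of an ambiguous term is censored along with the problematic ones.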

Because earlier Machine Translation technologies, such as Neural MT engines, do not have this type of content filtering issue, we can conclude it is a limitation of LLM technology.

The limitation has implications for real-world scenarios. For instance, imagine you need to translate medical content associated with gynecology or sexual education. You may be surprised that the LLM will not translate some of your text.

Interestingly, this issue happened to us only when translating that sentence into a particular language, Chinese, but not when translating it into other languages. This result indicates that the filter was on the GPT-4 output. The solution is to turn off the content filters for translation tasks.

Finding #2 — GPT-4’s output variability.

We found LLM Machine Translation output highly variable after five weeks of tracking, particularly with GPT-4.

While we expected this outcome for generative AI, the variability was more significant than anticipated — even when we used Temperature and Top Probability (Top_p) parameter settings to reduce creativity and make the output more deterministic. The translation output was different in every single GPT run we conducted, even when we ran translations one right after the other.
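For readers unfamiliar with these settings, the sketch below shows how such decoding parameters might be assembled for an LLM request. The model name and message format are illustrative, and no API call is actually made here.

```python
# Sketch: decoding parameters one might send to an LLM API to make output
# more deterministic. The model name and message format are illustrative,
# and no request is actually sent.

def build_translation_request(text: str, target_lang: str) -> dict:
    """Assemble request parameters for a (hypothetical) translation call."""
    return {
        "model": "gpt-4",            # illustrative model identifier
        "temperature": 0.0,          # minimize sampling randomness
        "top_p": 1.0,                # no nucleus cutoff on top of temperature
        "messages": [
            {"role": "system", "content": f"Translate the user text into {target_lang}."},
            {"role": "user", "content": text},
        ],
    }

req = build_translation_request("Good morning.", "Spanish")
print(req["temperature"], req["top_p"])  # 0.0 1.0
```

As noted above, even settings like these did not make GPT output fully deterministic in our runs; they reduce variability rather than eliminate it.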

Two translations from successive runs may both be acceptable even though they differ. Nonetheless, this is another aspect to control and another difference from the previous Neural MT paradigm.

We are starting to intuit that this potential paradigm change — from NMT to LLM MT — may be not only a technological change but also a change in mindset: We may need to be prepared to live with less deterministic output, even with the very same input and the very same parameters, and to expect more variability than we are used to with current automation.

While we may have to live with more uncertainty to some extent, it may be possible to use some mechanisms and best practices to make that variability somewhat controllable.

Final note: There was a decrease in the Edit Distance for GPT-4 at the time of publication; this finding does not indicate decreasing quality. It is merely a reflection of the variability of GPT outputs.

 

    —Rafa Moral, Lionbridge Vice President, Innovation

Index of Expert Commentary Topics

Browse the executive summaries below to explore the topics of our past expert commentaries.

March 2023 — A Large Language Model (LLM) outperforms a Neural Machine Translation (MT) engine: Now what?

February 2023 — Enhancing Machine Translation (MT): MT customization vs. MT training

January 2023 — Translation quality comparison between ChatGPT and the major MT engines

November 2022 — Microsoft MT improvement

October 2022 — MT and language formality

September 2022 — Using terminology for enhanced MT quality

August 2022 — Overcoming catastrophic errors during MT

July 2022 — Language ranking for MT

June 2022 — Accurately analyzing MT quality

May 2022 — Amazon and Yandex performance in May

April 2022 — Yandex performance in April

March 2022 — Custom MT comparative evaluations

February 2022 — The future of Neural Machine Translation (NMT) 

January 2022 — MT engine performance in January

December 2021 — Lionbridge adds Yandex MT to the MT Quality Tracker competitive check

November 2021 — Bing Translator makes improvements

October 2021 — How Amazon’s MT engine is progressing

September 2021 — Amazon makes improvements to MT quality

August 2021 — Top tech companies and their MT engine development

The Lionbridge Machine Translation Tracker

The Lionbridge Machine Translation Tracker is the longest-standing measure of MT in the industry.

The tracker measures the overall performance of the five major Neural MT engines and several GenAI models. It also evaluates translation quality by language pair and domain. With some exceptions, GenAI does not outperform the major Neural MT engines. However, these models produce decent results, especially considering they haven’t been trained explicitly for translation.

What’s the takeaway? Amidst the strong interest in deploying GenAI/LLMs, Machine Translation continues to prove itself to be a worthy automated translation tool.

Translation results are constantly changing, and the tracker captures these fluctuations.

Lionbridge Expert Commentaries

Gain insight from our automated translation experts.

March 2023

Generative Artificial Intelligence (AI) has achieved a significant milestone: It outperformed a Neural Machine Translation (MT) engine in one of our comparative evaluations. Specifically, Large Language Model (LLM) GPT-4 provided slightly better quality than Yandex for the English-to-Chinese language pair, as shown in Figure 1.

This development is noteworthy because it’s the first time a different type of MT approach has beaten a Neural MT engine since the advent of Neural MT. Moreover, a non-MT approach — a multi-purpose language automation not specifically prepared for Machine Translation — has beaten the Neural MT engine.

Why should you care about this occurrence? If you are an MT provider, you must be at the forefront of technological advancements and consider how they will impact your current MT offering to stay competitive. If you are an MT buyer, you must be privy to these developments to make sound MT investments, which will likely include some LLM-based technology instead of pure Neural MT offerings.

It's worth noting that generative AI is still in its early stages. As such, it falls short in some key areas. For instance, it produces variable outputs during multiple runs, has Application Programming Interface (API) instability, and makes more errors than Neural MT engines. These issues must be resolved for the technology to mature, and we are already seeing improvements being made at breathtaking speed.

The incredible speed at which LLMs can improve supports the notion that LLMs will become the next paradigm for Machine Translation. We expect a hybrid period whereby Neural MT providers integrate some aspects of LLMs into the Neural MT architecture as the paradigm evolves.

Read our blog for a translation quality comparison between Neural MT and LLM for two more language pairs and additional thoughts on whether it is the beginning of the end of the Neural Machine Translation paradigm.

 

    —Rafa Moral, Lionbridge Vice President, Innovation



February 2023

Generic Machine Translation (MT) engines will frequently provide an adequate output for companies seeking to automate their translations. However, these engines may produce subpar suggestions — especially when dealing with technological or highly specialized content.

Companies seeking to improve Machine Translation (MT) results to meet specific goals can consider two options: MT customization and/or MT training. Either method — or a combination of both — can produce better results during the automated translation process.

However, the approaches differ from one another, and they are not interchangeable. Table 1 provides an overview of MT customization and MT training and offers some considerations when evaluating each method.

Machine Translation Customization vs. Machine Translation Training

MT Customization

What it is and how it works: An adaptation of a pre-existing Machine Translation engine with a glossary and Do Not Translate (DNT) list to improve the accuracy of machine-generated translations

What it does: Improves MT’s suggestions for more accurate output and reduces the need for post-editing

Specific benefits: Enables companies to adhere to their brand name and terminology and achieve regional variations

The risks of using it: The MT could make poor suggestions and negatively impact overall quality when executed improperly

When to use it: Ideal for technological and detail-oriented content and any content that requires:
  • Accurate translations of terminology
  • Regional variation, but you lack sufficient data for MT training

Success factors: An experienced MT expert who can successfully manage input and output normalization rules, glossaries, and DNT lists

Cost considerations: There is a one-time cost to update the profile that goes into the MT engine and some ongoing costs to maintain a glossary over time; costs are relatively inexpensive when factoring in the potential benefits and are typically lower than MT training costs

MT Training

What it is and how it works: The building and training of an MT engine using extensive bilingual data from corpora and Translation Memories (TMs) to improve the accuracy of machine-generated translations

What it does: Improves MT’s suggestions for more accurate output and reduces the need for post-editing

Specific benefits: Enables companies to attain a specific brand voice, tone, and style and achieve regional variations

The risks of using it: MT training may fail to impact output if there is not enough quality data to train the engine; the MT could generate poor suggestions and negatively impact overall quality if inexperienced authors overuse terminology

When to use it: Ideal for highly specialized content, marketing and creative content, and any content that requires:
  • A specific brand voice, tone, or style
  • Regional variation, and you have enough data for MT training

Success factors: A minimum of 15K unique segments to adequately train the engine

Cost considerations: There are costs associated with the first training and potential costs for additional training, which may be considered over time if MT performance monitoring indicates room for improvement; MT training can be worth the investment in certain cases when factoring in the potential benefits

Table 1. A comparison between MT customization and MT training

 

    —Thomas McCarthy, Lionbridge MT Business Analyst



January 2023

Would Large Language Models (LLMs) be a good alternative to a Neural Machine Translation (NMT) paradigm for Machine Translation (MT)? To find out, we compared the translation performance of ChatGPT, OpenAI’s latest version of its GPT-3 family of LLMs, to the five major MT engines we use in our MT Quality Tracking.

As expected, specialized NMT engines translate better than ChatGPT. But surprisingly, ChatGPT does a respectable job. As shown in Figure 1, ChatGPT performed almost as well as the specialized engines.

We calculated the quality level based on the inverse edit distance using multiple references for the English-to-Spanish language pair. The edit distance measures the number of edits a human must make to the MT output for the resulting translation to be as good as a human translation. For our calculation, we compared the raw MT output against 10 different human translations — multiple references — instead of just one human translation. The inverse edit distance means the higher the resulting number, the better the quality.

Figure 1. Comparison of automated translation quality between ChatGPT and the major Machine Translation engines based on the inverse edit distance using multiple references for the English-to-Spanish language pair.
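As a minimal sketch of the metric described above, the snippet below computes a character-level Levenshtein distance against each human reference, keeps the closest match, normalizes by length, and inverts the score so that higher means better. The normalization choice is illustrative, not Lionbridge’s exact published formula.

```python
# Minimal sketch of the multiple-reference metric: character-level
# Levenshtein distance to each human reference, keeping the closest one,
# normalized by length and inverted so that higher means better quality.

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def inverse_edit_distance(mt_output: str, references: list[str]) -> float:
    """Score in [0, 1]; 1.0 means the output matches some reference exactly."""
    best = min(levenshtein(mt_output, ref) / max(len(mt_output), len(ref))
               for ref in references)
    return 1.0 - best

refs = ["Buenos días a todos.", "Buenos días para todos."]
print(inverse_edit_distance("Buenos días a todos.", refs))  # 1.0
```

Using several references rewards any valid translation choice rather than penalizing every deviation from a single “gold” rendering.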

These results are remarkable because the generic model has been trained to do Natural Language Processing (NLP) tasks and has not been specifically trained to execute translations. ChatGPT’s performance is similar to the quality level MT engines produced two or three years ago.

Given the evolution of LLMs — based on the public’s attention and the significant investments tech companies are making in this technology — we may soon see whether ChatGPT overtakes MT engines or whether MT will start adopting a new LLM paradigm. MT may use LLMs as a base but then fine-tune the technology specifically for Machine Translation. It would be like what OpenAI and other LLM companies are doing to improve their generic models for specific use cases, such as making it possible for the machines to communicate with humans in a conversational manner. Specialization adds accuracy to the performed tasks.

One great thing about these Large Language “Generic” Models is that they can do many different things and offer outstanding quality in most of their tasks. For example, DeepMind’s GATO, another general intelligence model, has been tested in more than 600 tasks, with State-of-the-Art (SOTA) results in 400 of them.

Two development lines will continue to exist — generic models, such as GPT, Megatron, and GATO, and specialized models for specific purposes based on those generic models. The generic models are important for advancing Artificial General Intelligence (AGI) and possibly even more impressive developments in the longer term. Specialized models will have practical uses in the short run for specific areas. One of the remarkable things about LLMs is that both lines can progress and work in parallel.

We are intrigued by what the future holds. We will continue to evaluate LLMs and publish the results so you can stay up to date on this exciting evolution. Read our blogs to delve deeper into ChatGPT’s translation performance and to learn more about ChatGPT and localization and why it’s probably a game-changer.

 

    —Rafa Moral, Lionbridge Vice President, Innovation



November 2022

We’ve seen a nice overall improvement in Microsoft’s Machine Translation (MT) results during October 11-November 1. With this recent quality increase by Bing Translator, the main MT engines are producing very similar results. As such, they face a tight battle for the top leadership position.

The major MT engines have not shown interesting improvements for months. Let’s hope this development from Microsoft breaks that trend and is the start of forthcoming progress by these engines. 

We went beyond our usual measure of single-reference translations and confirmed the Microsoft improvement results with a second tracking that encompassed multiple references. In this MT evaluation, we used 10 reference translations completed by humans — the gold standard — instead of just one translation to get a more precise Edit Distance metric that considers multiple possible correct translations in the final results.

As we reach the end of the year, we note that 2022 has had very flat MT results. We observed little change; this Microsoft Bing MT development may be the most notable advancement of the whole year. As commented on earlier in the year, the current MT paradigm may be reaching a plateau. We look forward to seeing what 2023 holds for Machine Translation.

 

    —Rafa Moral, Lionbridge Vice President, Innovation



October 2022

This month, we want to bring your attention to language formality and how difficult — but not impossible — it is to get it right when using Machine Translation (MT).

Machine Translation (MT) engines can produce incorrect and inconsistent formality. Why? MT models typically return a single translation for each input segment. When the input segment is ambiguous, the model must choose a translation among several valid options, regardless of the target audience. Letting the model choose between different valid options may result in inconsistent translations or translations that have an incorrect level of formality.

It is especially challenging to get the correct output when the source language has fewer formality levels than the target language. For instance, languages like French have well-defined formal modes — tu vs. vous — while English does not.

While most MT systems do not support language formality or gender parameters, we are seeing progress. At present, DeepL (API) and Amazon (console and SDK) offer features that control formality. Lionbridge’s Smairt MT™, an enterprise-grade Machine Translation solution, allows linguistic rules to be applied to the target text to produce Machine Translations with the desired style or formality.
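To make these formality controls concrete, here is a small sketch mapping a desired register onto the parameter names the providers document (DeepL’s `formality` values and Amazon Translate’s `Formality` setting). No request is sent; consult each provider’s API reference for the full call shape.

```python
# Sketch: mapping a desired register onto the formality parameters these
# providers expose (DeepL: formality "more"/"less"; Amazon Translate:
# Settings {"Formality": "FORMAL"/"INFORMAL"}). No API call is made.

def formality_params(provider: str, formal: bool) -> dict:
    """Return the provider-specific parameter fragment for the desired register."""
    if provider == "deepl":
        return {"formality": "more" if formal else "less"}
    if provider == "amazon":
        return {"Settings": {"Formality": "FORMAL" if formal else "INFORMAL"}}
    raise ValueError(f"no known formality setting for provider: {provider}")

print(formality_params("deepl", formal=True))  # {'formality': 'more'}
```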

It’s critical to effectively translate your source to meet the needs of your target audiences, which includes addressing formal and informal language in your MT output. Translations that come across as “off” or — even worse — as rude can put you at risk of alienating your audiences.

Read our blog to learn more about Machine Translation and formal vs. informal language.

 

    —Yolanda Martin, Lionbridge MT Specialist 



September 2022

It can be advantageous to use Machine Translation (MT), but you must proceed with caution. Generic MT engines can produce erroneous translations, and in specific domains, the results can be especially problematic from a terminological point of view. The impact can be particularly harmful in the medical and legal fields. But there are things you can do to enhance MT output.

Using terminology can enable you to improve the quality of MT and achieve accurate, consistent translations.

It’s imperative to train customized MT systems with domain-specific bilingual texts that include specialized terminology. Still, accurate translations cannot be guaranteed when engines are trained with specialized texts if the terminology is not used consistently. Research in this area proposes to inject linguistic information into Neural Machine Translation (NMT) systems. The implementation of manual or semi-automatic annotation depends on available resources, such as glossaries, and constraints, such as time, cost, and availability of human annotators.

Lionbridge’s Smairt MT™ allows the application of linguistic rules to the source and target text, as well as the enforcement of terminology based on Do Not Translate (DNT) and glossary lists added to a specific profile. We help our customers create and maintain glossaries, which are regularly refined to include new, relevant terms and retire obsolete terminology. When glossaries are created once in Smairt MT, they can then be used for all the MT engines, saving time and money.

Using glossaries for MT projects is not as simple as it may seem. Glossaries, if used inappropriately, can negatively affect the overall quality of Machine Translation. The best way to enforce terminology in MT is through MT training. The combination of trained MT engines, glossary customization, and the identification of preprocessing and post-processing rules ensures that MT output contains proper terminology and is similar in style to the customer's documentation.
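As a rough illustration of the pre- and post-processing idea, the sketch below masks DNT terms with placeholders before MT so the engine passes them through, then restores them and enforces a glossary choice afterward. The terms, placeholder format, and pretend MT output are invented examples, not a description of Smairt MT™ internals.

```python
# Rough illustration of pre-/post-processing for terminology enforcement:
# mask DNT terms before MT, then restore them and apply glossary choices.
# Terms, placeholders, and the pretend MT output are invented examples.

import re

DNT = ["Lionbridge"]                # brand names to leave untranslated
GLOSSARY = {"engine": "motor"}      # enforced target-language terminology

def preprocess(text: str) -> tuple[str, dict]:
    """Replace DNT terms with opaque placeholders before sending to MT."""
    mapping = {}
    for i, term in enumerate(DNT):
        token = f"__DNT{i}__"
        if term in text:
            text = text.replace(term, token)
            mapping[token] = term
    return text, mapping

def postprocess(text: str, mapping: dict) -> str:
    """Restore DNT terms and apply glossary substitutions to MT output."""
    for token, term in mapping.items():
        text = text.replace(token, term)
    for src, tgt in GLOSSARY.items():
        text = re.sub(rf"\b{re.escape(src)}\b", tgt, text)
    return text

masked, mapping = preprocess("Lionbridge uses an engine.")
print(masked)                                    # __DNT0__ uses an engine.
mt_output = "__DNT0__ usa un engine."            # pretend raw MT output
print(postprocess(mt_output, mapping))           # Lionbridge usa un motor.
```

The placeholder trick keeps the engine from translating protected terms, while the post-processing pass guarantees the glossary choice appears in the final output.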

Read our blog for more insight into using terminology to enhance MT output.

 

    —Yolanda Martin, Lionbridge MT Specialist 



August 2022

As companies increasingly rely on Machine Translation (MT) as a standard practice, employees will need to prevent the dissemination of catastrophic errors.

Catastrophic errors are more problematic than standard MT errors, which pertain to error typology related to linguistic features, such as spelling, grammar, or punctuation. Catastrophic errors transcend linguistics and occur when engine output dangerously deviates from the intended message. Resulting misinformation or misunderstandings have the potential to cause companies reputational, financial, or legal problems and may lead to adverse public safety or health consequences. It is essential to find ways to identify these errors and stop them from compromising your communications.

Lionbridge administers specific automated quality checks in translated texts to detect critical errors while preserving MT speed and reducing the need for human intervention.

These automated methods detect:

  • Opposite meanings between original and translated texts
  • Offensive, profane, or highly sensitive words
  • Incorrect translations of proper names of individuals and organizations that are also common words
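As a toy illustration of the first kind of check, the sketch below flags a possible opposite-meaning error when a negation appears on only one side of an English-Spanish pair. The word lists are invented and far from exhaustive; production checks are considerably more sophisticated.

```python
# Toy illustration of an opposite-meaning check: flag a segment pair when
# a negation word appears on only one side. Word lists are invented and
# far from exhaustive.

NEGATORS = {"en": {"not", "no", "never"}, "es": {"no", "nunca", "jamás"}}

def tokens(text: str) -> set[str]:
    return {w.strip(".,!?¿¡").lower() for w in text.split()}

def negation_mismatch(source_en: str, target_es: str) -> bool:
    """True when exactly one side of the pair contains a negation word."""
    src_neg = bool(tokens(source_en) & NEGATORS["en"])
    tgt_neg = bool(tokens(target_es) & NEGATORS["es"])
    return src_neg != tgt_neg

print(negation_mismatch("Do not exceed the dose.", "Exceda la dosis."))     # True
print(negation_mismatch("Do not exceed the dose.", "No exceda la dosis."))  # False
```

A dropped negation is exactly the kind of error that is catastrophic in medical or legal content while remaining invisible to spelling and grammar checks.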

Companies will be better protected from catastrophic errors when computer scientists improve existing MT technology to prevent these translation errors. Until such time, we can use automated technology to identify potential issues, revise problematic sentences, and promote accuracy during the translation process.

Read our blog for a more in-depth examination of catastrophic errors during Machine Translation.

 

    —Luis Javier Santiago, MT Group Leader, and Rafa Moral, Lionbridge Vice President, Innovation



July 2022

Google NMT, Bing NMT, Amazon, DeepL, Yandex — which engine is best? Last month’s data — and the current general trend — show major engines perform similarly. That’s why it’s worthwhile to consider additional factors when developing your MT strategy, such as the ease with which MT engines translate specific language pairs.

Identifying how challenging it is for engines to handle specific language pairs will help you allocate your budget when planning translation costs across languages. For instance, you’ll need to allocate more effort to achieve high-quality translations when dealing with complex language pairs. Having insight into language complexity can help support your business decisions.

Ranking languages by translatability is not a straightforward process; however, we can use different metrics for evaluation. Edit Distance, the number of changes a post-editor makes to bring the final text to human quality, can provide a sense of MT complexity and translatability (machine-translatability, or m-translatability) for each language pair.

Most Romance languages, such as Portuguese, Spanish, French and Italian, require fewer changes to reach high-quality levels when translated from English. We identified these target languages as the easiest for machines to handle, and they took the first four spots in our m-translatability ranking. Hungarian and Finnish — two Uralic languages — are more complex languages; they placed last in our ranking, taking the 27th and 28th spots. Estonian, another language in the same family, is also among the more complex languages. These results — based on millions of sentences processed by Lionbridge — underscore the importance of language families in MT results.
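The ranking idea can be sketched in a few lines: average the per-segment Edit Distance for each target language, then sort ascending so the most machine-translatable languages come first. The numbers below are invented placeholders, not Lionbridge data.

```python
# Sketch of the m-translatability ranking (invented placeholder numbers):
# average per-segment Edit Distance per target language, sorted ascending.

from statistics import mean

edit_distances = {
    "Portuguese": [4, 6, 5],
    "Spanish":    [5, 6, 6],
    "Hungarian":  [14, 12, 16],
    "Finnish":    [13, 12, 14],
}

ranking = sorted(edit_distances, key=lambda lang: mean(edit_distances[lang]))
print(ranking)  # ['Portuguese', 'Spanish', 'Finnish', 'Hungarian']
```

Even with toy numbers, the Romance languages cluster at the top and the Uralic languages at the bottom, mirroring the pattern described above.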

While cross-language comparison has its limitations, the ranking can provide some interesting insights to better manage multilingual projects. Read our blog to see the Lionbridge language ranking table in its entirety.

 

    —Rafa Moral, Lionbridge Vice President, Innovation 



June 2022

In June, we observed a tiny improvement in Russian translations by Yandex’s MT engine and a tiny dip in translation results by Microsoft Bing’s MT engine. Are these noteworthy changes or insignificant, spurious outcomes? To find out, we analyzed the results differently.

Instead of using a single gold standard that measures the distance from the MT translation to one “perfect” human translation, we used multiple reference translations. We compared each translation made by machines to 10 translations by professional translators. When we took this approach, the small fluctuations in translation quality by Yandex and Microsoft Bing in June disappeared. As such, we can conclude that there were no changes to the MT translation quality. June results are flat.

Sometimes data and its graphical representations may be misleading. This often happens when there are small deltas among different measurements. It’s good practice to use more than one approach to evaluate data to interpret results accurately.

We project little movement in MT engine quality in the coming months. We will use this section to provide analysis and general MT observations. Next month, look for comparisons among MT language pairs. We’ll explore whether it is possible to use data to classify languages and language families by MT complexity and determine whether machines can translate some language pairs easier than others.

 

    —Rafa Moral, Lionbridge Vice President, Innovation 



May 2022

It has primarily been another static month for the MT engines.

We’ve noticed Amazon has made an incremental improvement in the way its engine handles the English-Spanish pair. It is now the leading engine in this language pair. Amazon also made minor strides in the other languages, but smaller than its improvements to the English-Spanish pair. We speculate these advancements are due to some generic setting changes as well as targeted work on the English-Spanish pair. The enhancements appear to affect the treatment of some special characters and strings with measurement expressions.

For the second month in a row, Yandex made minor improvements. Interestingly, these improvements also affect Spanish.

As we’ve previously noted, there have been no significant changes. All the engines perform similarly. In the coming months, we will analyze some specific MT areas and provide general observations. Of course, we will also track major developments.

 

    —Rafa Moral, Lionbridge Vice President, Innovation 



April 2022

After several months of flat MT engine performance, Yandex has made some progress, particularly with its German engine.

In one detailed analysis, we saw advancements in Yandex engines' handling of sentences with punctuation characters — such as question marks, exclamation points, parentheses, and slashes — and units of measurement. These developments may result from some fine-tuning to the MT settings rather than improvements in the models. However, we also saw improvement in our tracking of rare terms, so Yandex’s progress may also be due to some refinements of the models or more data training.

Around this time last year, several MT engines showed some improvements that we found interesting. Is there a time pattern involved with these advancements? Will we see something this year like what we observed in 2021? We’re tracking the MT performance of these engines, and we'll report our findings in the next month or so.

Generally, there is increased interest in MT engine evaluation. Today, almost everyone will agree that MT is a mature technology. People recognize the technology’s usefulness for almost any translation case — with or without human intervention and hybrid approaches. But MT users are still struggling to find the right way to evaluate, measure, and improve MT results.

 

    —Rafa Moral, Lionbridge Vice President, Innovation 



March 2022

If you’ve been following these pages, you’re familiar with our generic MT comparative evaluations. Each month, we identify which MT engines are performing best for given language pairs and track engine improvements. In March, the performance of the different MT engines was flat. It’s a trend we’ve been noticing for some time already. As we commented last month, it may indicate that a new MT paradigm is needed.

While we share generic results, companies are increasingly pursuing custom MT comparative evaluations. Unlike the generic version, these evaluations take a company’s specific needs into consideration when determining the most advantageous MT engines.

When a company wants to start using MT or improve the way it currently uses MT, it is critical to identify which MT engines will work best. When we execute custom evaluations, we take a similar approach to the one demonstrated on this page, but we make recommendations based on a company’s content type and language pair requirements.
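A custom comparative evaluation can be thought of as scoring each candidate engine on the company’s own content slices, then recommending the top performer per language pair and content type. The sketch below assumes hypothetical engine names and invented scores purely for illustration.

```python
# Hypothetical per-engine quality scores (e.g., from an automated metric),
# broken down by language pair and content domain. All numbers are invented.
scores = {
    ("en-de", "legal"):     {"engine_a": 0.62, "engine_b": 0.71, "engine_c": 0.66},
    ("en-de", "marketing"): {"engine_a": 0.70, "engine_b": 0.64, "engine_c": 0.68},
    ("en-ru", "legal"):     {"engine_a": 0.58, "engine_b": 0.60, "engine_c": 0.65},
}

def recommend(scores):
    """Pick the top-scoring engine for each (language pair, domain) slice."""
    return {
        slice_: max(engine_scores, key=engine_scores.get)
        for slice_, engine_scores in scores.items()
    }

recommendation = recommend(scores)
# Different slices can favor different engines, which is why a single
# "best engine overall" answer is rarely the right recommendation.
```

The takeaway mirrors the point above: the best engine for one language pair and content type is often not the best for another.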

While custom MT comparative evaluations have been available for years, there’s greater demand for them. We attribute this trend to the important role MT plays in helping companies succeed in a digital marketplace.

 

    —Rafa Moral, Lionbridge Vice President, Innovation 



February 2022

Google’s MT engine showed a tiny improvement during January and February of 2022, while the other engines we track remained stagnant. These observations may lead us to start asking some pointed questions. Is the Neural Machine Translation (NMT) paradigm reaching a plateau? Is a new paradigm shift needed given the engines’ inability to make significant strides? We observed similar trends when NMT replaced Statistical MT.

At the end of the Statistical MT era, there was virtually no change in MT quality output, and the quality output of different MT engines converged. We are seeing similar trends now. NMT may not be imminently replaced, but consider the history: Rule-based MT had a 30-year run, Statistical MT a decade of prominence, and NMT is now in its sixth year. If we believe in theories of exponential growth and accelerating returns, a new paradigm shift may not be so far away.

 

    —Rafa Moral, Lionbridge Vice President, Innovation 



January 2022

During January, the main Machine Translation (MT) engines did not show significant changes in their performance. 

Google demonstrated small, incremental improvements across some languages and domains. The performance of most of the other engines has been flat. Microsoft had shown improvements over the last few months, but its performance plateaued in January. Overall, the quality of Google Translate continues to lead in general-purpose MT technology.

In December, we added a fifth MT engine to our tracker. By monitoring Yandex, we can better analyze the MT quality of the Russian language.

 

    —Rafa Moral, Lionbridge Vice President, Innovation 



December 2021

In December, we added Yandex MT to our MT Quality Tracker comparative check.

According to our test sets, so far, Yandex:

  • Performs better than MS Bing, similarly to Google, and not as well as Amazon and DeepL for Russian.
  • Performs similarly to Amazon and MS Bing for German.
  • Does not perform as well as the main MT engines for the other language pairs we track.
  • Works well when addressing sentences that are longer than 50 words.

In other observations, MS Bing improved its output noticeably during the last months of 2021; translations into Chinese improved in particular. Amazon has also made some strides. As we start the new year, Google is taking the baton and improving its output, specifically translations into Spanish, Russian, and German. Yandex’s performance has been flat during the five weeks we have been tracking it.

 

    —Rafa Moral, Lionbridge Vice President, Innovation 



November 2021

After a few weeks of experimentation and fluctuation in overall performance, it’s clear that Microsoft’s NLP engineers are on to something. Bing Translator has shown overall improvements during the past few weeks, and improvements for Chinese in particular, making this MT engine last month’s big winner. Bing Translator has closed some gaps in most areas, even surpassing the performance of some of its competitors. Bing Translator remains one of the most trainable engines, and its enhancements position it to be a good choice when building customized models that are specific to your content.

 

    —Jordi Macias, Lionbridge Vice President, Language Excellence



October 2021

Amazon’s Machine Translation (MT) engines continued to evolve positively during October, building on improvements that began about a month ago. These enhancements are the second set of incremental improvements we’ve seen in the last few months.

As a reminder, here are some of the areas where Amazon’s MT engines have continued to evolve over the past couple of months:

  • They produce a more informal style than before
  • They treat units of measurement differently
  • Both imperial and metric measurements are now output consistently
  • Imperial measurements now appear before metric measurements
  • Numbers that correspond to measurements are now translated correctly
  • "Euro" is now spelled out, replacing the currency symbol €

 

    —Jordi Macias, Lionbridge Vice President, Language Excellence



September 2021

September has proven to be a good month for Amazon’s Machine Translation (MT) engines. First, the company improved its MT quality output for the German and Russian languages. Then, we saw spikes for the Spanish and Chinese language pairs. These enhancements are the second set of incremental improvements we’ve seen in the last few months.

Here are some more changes to the Amazon MT engines:

  • They produce a more informal style than before
  • They treat units of measurement differently
  • Both imperial and metric measurements are now output consistently
  • Imperial measurements now appear before metric measurements
  • Numbers that correspond to measurements are now translated correctly
  • "Euro" is now spelled out, replacing the currency symbol €

 

    —Yolanda Martin, Lionbridge MT Specialist 



August 2021

All the big technology companies have developed their own MT engines, including Microsoft, Google, Amazon, Facebook and now Apple. Many other big players in markets outside of the U.S. are also competing in the space. Clearly, big tech companies believe that MT and Natural Language Processing (NLP) are must-have tools for today’s interconnected, global world.

Watch this space as Lionbridge follows the competition. We’ll identify the best MT engine options based on a company’s specific needs, taking its desired language pair and content type into account.

We expect the MT/NLP race to accelerate with so many top tech companies investing in this space. There’s no doubt that Apple—with its attention to detail and quality—will drive other companies to step up their game.

 

    —Rafa Moral, Lionbridge Vice President, Innovation 



Meet the Experts

Rafa Moral

Vice President, Innovation 

Rafa oversees R&D activities related to language and translation, including Machine Translation initiatives, Content Profiling and Analysis, Terminology Mining, and Linguistic Quality Assurance and Control.


Yolanda Martin

MT Specialist

Yolanda is responsible for the creation of customized translation models, as well as quality analysis and the development of strategies to fine-tune them. In parallel, she collaborates with the R&D department to develop new linguistic tools and resources.


Thomas McCarthy

MT Business Analyst

Thomas ensures Lionbridge customers and stakeholders obtain maximum benefits from MT-related technologies, services, and consultancy.


Contact Us