Lionbridge Machine Translation Tracker

Introducing our assessment tool to help you choose the best Machine Translation engine for your needs.

 

Comparing Four Machine Translation Deployment Strategies

Machine Translation (MT) has been around for decades, and in recent years it has advanced rapidly. As companies produce ever-growing amounts of content in different languages, they have come to see MT as an opportunity to extend their reach in an increasingly globalized world.

Companies seeking to deploy MT can explore the following four basic strategies. 

Public MT

This strategy includes services like Google Translate or Bing Translator. Such services are readily available at no cost. However, the engines are not secure and are not trained for specific domains or particular use cases.

On-Premise MT

This strategy requires a company to deploy an MT server in its own IT environment. While this is the most secure option, it comes at a significant cost, is complex to deploy and manage, and requires ongoing maintenance. Importantly, this strategy often produces suboptimal MT output across multiple language pairs or content types.

Cloud MT

This strategy works like Public MT and is also hosted in the cloud, but it creates a company-dedicated instance. Any data shared with the service is tightly secured and not shared with third parties. It provides additional capabilities, such as terminology customization. However, it can result in vendor lock-in and less-than-optimal MT quality across multiple language pairs.

Best of Breed MT

This is a single platform that allows companies to leverage multiple MT engines. It provides a single layer of terminology customization, an easy-to-manage interface and the ability to choose the best engines for different language pairs, industries or domains, and types of content. 

No matter which strategy you are contemplating, selecting the right engine may be challenging without the proper data and MT experience. Lionbridge is an expert in MT. In addition to having more than two decades of MT experience, we have gathered a large volume of linguistic and quality data about MT technology that will help you make the right choice. This webpage provides basic information about the performance of popular MT engines for the most common language pairs to help you select the best option based on your content. 

 

Which Machine Translation (MT) engine is best? There’s no simple answer.


When choosing among the many available MT systems, it’s important to note some engines address a specific function or domain. If your needs don’t align with that purpose, the engine may perform sub-optimally no matter how advanced it is. To determine the best option, first, identify why you are using MT.

If you want an MT engine for general use, it may be appropriate to use Google Translate or Bing Translator. If you seek MT services for a specific language or domain, you may achieve better results by turning to Amazon Translate or DeepL Translator.

Lionbridge’s Machine Translation Tracker analyzes engine performance monthly to help you figure out the best MT engine depending on the language pairs you use. The next time you ask which MT engine is best, reframe the question to, “Which MT engine is best for me?” And count on Lionbridge for guidance.

Want to learn more about the different types of MT technologies? Check out our blog Machine Translation in Translation.

Lionbridge Expert's Commentary

October 2023

We’ve enhanced the Lionbridge Machine Translation (MT) Quality Tracker report, given the prevalence and promise of generative AI (GenAI) / Large Language Models (LLMs). From here on in, the report will include GPT-4 translation results in addition to GPT-3.5 and Davinci results and, of course, Neural MT (NMT) engine performance.

What are some of our latest findings? Some noteworthy GPT-4 peculiarities.

We faced several issues associated with GPT-4, including slow performance, its inability to provide translations for various reasons, and inconsistent behavior, such as missing translations in some runs but not in others.

Finding #1 — GPT-4’s failure to translate some text.

GPT-4 failed to translate a particular sentence in our MT test set.

After some research, we determined that a term with a sexual connotation in particular contexts caused the issue. To be clear, the sentence in our test set was entirely standard and acceptable. Nonetheless, the term triggered GPT-4’s sexual content filter, and the AI censored the translation of that sentence and returned no output. We were surprised by this result for two reasons:

  • The typical use of that term in isolation had no issues.
  • The context of that particular sentence had no problematic interpretation.

This observation led us to conclude that part of the GPT-4 filtering mechanism may be based on a simple forbidden-word list that also includes ambiguous terms. This approach is problematic because it is prone to overtriggering and producing false positives, a serious issue for professional translation.
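
To illustrate why a plain word list produces false positives, here is a minimal Python sketch. It is a hypothetical filter, not GPT-4’s actual mechanism, which is not public. The `BLOCKLIST` contents, the function name `is_blocked`, and the sample sentences are all assumptions chosen for illustration.

```python
import re

# Hypothetical blocklist containing an ambiguous term; not taken from any real system.
BLOCKLIST = {"breast"}

def is_blocked(text: str) -> bool:
    """Return True if any blocklisted word appears, regardless of context."""
    words = re.findall(r"[a-z]+", text.lower())
    return any(word in BLOCKLIST for word in words)

# A benign medical sentence is rejected because the filter ignores context.
print(is_blocked("Annual breast cancer screening is recommended."))  # True
print(is_blocked("The weather is nice today."))  # False
```

A filter of this kind cannot tell a medical or everyday use of a term from a problematic one, which matches the behavior described above.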

Because earlier Machine Translation technologies, such as Neural MT engines, do not have this type of content filtering issue, we can conclude it is a limitation of LLM technology.

The limitation has implications for real-world scenarios. For instance, imagine you need to translate medical content associated with gynecology or sexual education. You may be surprised that the LLM will not translate some of your text.

Interestingly, this issue happened to us only when translating that sentence into one particular language, Chinese, but not when translating it into other languages. Because the source sentence was the same in every case, this result suggests that the filter was applied to the GPT-4 output rather than to the input. The solution is to turn off the content filters for translation tasks.

Finding #2 — GPT-4’s output variability.

After five weeks of tracking, we found LLM Machine Translation output to be highly variable, particularly with GPT-4.

While we expected this outcome for generative AI, the variability was more significant than anticipated — even when we used Temperature and Top Probability (Top_p) parameter settings to reduce creativity and make the output more deterministic. The translation output was different in every single GPT run we conducted, even when we ran translations one right after the other.

Both translations may be acceptable even though they differ. Nonetheless, this is another aspect to control and another difference from the previous Neural MT paradigm.
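
One way to quantify this run-to-run variability is to compare repeated outputs for the same source sentence. The sketch below is an illustration only, not Lionbridge’s tooling: the sample strings are invented, and the function name `variability_report` is an assumption. It uses Python’s standard `difflib` module to compute the average pairwise similarity between runs.

```python
from difflib import SequenceMatcher
from itertools import combinations

def variability_report(outputs: list[str]) -> dict:
    """Summarize how much repeated translations of one input differ."""
    pairs = list(combinations(outputs, 2))
    # ratio() is 1.0 for identical strings and lower for dissimilar ones.
    similarities = [SequenceMatcher(None, a, b).ratio() for a, b in pairs]
    return {
        "runs": len(outputs),
        "distinct_outputs": len(set(outputs)),
        "mean_similarity": sum(similarities) / len(similarities) if similarities else 1.0,
    }

# Invented outputs standing in for three runs of the same prompt.
runs = [
    "Please restart the device.",
    "Please reboot the device.",
    "Restart the device, please.",
]
report = variability_report(runs)
print(report["runs"], report["distinct_outputs"])  # 3 3
```

A fully deterministic engine would yield one distinct output and a mean similarity of 1.0; lower values indicate more variation between runs.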

We are starting to sense that this potential paradigm shift, from NMT to LLM MT, may be more than a technological change. It may also require a change in mindset: we may need to be prepared to live with less deterministic outputs, even when using the very same input and the very same parameters, and to expect more variability than we are used to with current automation.

While we may have to live with more uncertainty to some extent, it may be possible to use some mechanisms and best practices to make that variability somewhat controllable.

Finally, as you review our chart, note that the Edit Distance decrease for GPT-4 does not indicate decreasing quality. It is merely a reflection of the variability of GPT outputs. Next month, we may see the line going up. Watch this space for developments and more insight.

 

    —Rafa Moral, Lionbridge Vice President, Innovation


[Charts: Evaluating Overall MT Performance; Per Language Pair Quality (German, Spanish, Russian, or Chinese); Per Domain Performance (by domain/subject matter). Each chart plots results over time.]

For more insights and future trends about Machine Translation, read our Future of Language Tech blog post – Future of Machine Translation.

Lionbridge Machine Translation Tracker Methodology

Lionbridge uses inverse edit distance as a scoring method. The edit distance measures the number of characters (for Asian languages) or words (for Western languages) that a human post-editor needs to change before the MT output reaches the quality level produced by a human translator. Because the score is the inverse of that distance, the higher the metric, the better the quality.
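
The scoring idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Lionbridge’s implementation: the source does not give the exact formula, so the normalization used here (1 minus the Levenshtein distance divided by the longer sequence length) is an assumption, as are the function names and sample sentences.

```python
def levenshtein(a, b) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, token_a in enumerate(a, start=1):
        current = [i]
        for j, token_b in enumerate(b, start=1):
            cost = 0 if token_a == token_b else 1
            current.append(min(
                previous[j] + 1,         # delete token_a
                current[j - 1] + 1,      # insert token_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]

def inverse_edit_distance(mt_output: str, post_edited: str, unit: str = "word") -> float:
    """Score in [0, 1]; 1.0 means no edits were needed (assumed normalization)."""
    if unit == "char":
        hyp, ref = list(mt_output), list(post_edited)  # character-level comparison
    else:
        hyp, ref = mt_output.split(), post_edited.split()  # word-level comparison
    longest = max(len(hyp), len(ref))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(hyp, ref) / longest

# One missing word out of six: one edit is needed.
word_score = inverse_edit_distance("the cat sat on mat", "the cat sat on the mat")
# One missing character out of six, compared at the character level.
char_score = inverse_edit_distance("今天天气好", "今天天气很好", unit="char")
print(round(word_score, 3), round(char_score, 3))
```

Both examples need a single edit against a six-unit reference, so under this assumed normalization they receive the same score.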

Out of the four MT engines we assessed, Google and Bing NMT performed best across different language pairs and for general content. However, specialized engines performed best for certain language combinations. For instance, DeepL had the strongest performance in German, and Amazon translated Chinese best.

Disclaimer

  1. Machine Translation engines in this report are assessed monthly by Lionbridge.
  2. The data provided is for illustration purposes and each case should be treated and assessed individually.
  3. This report is generated based on source data preselected by Lionbridge Machine Translation teams. The same source data is submitted to every Machine Translation engine and language pair each time, making comparisons between translation engines possible.
  4. No customer data has been used in the generation of the report.

Get Smaⁱrter™ About MT

Smaⁱrt MT™: Machine Translation for the Digital Age

Find out how to leverage MT to offer digital experiences in local languages, achieve better customer satisfaction ratings, and excel in global markets.

The Future of Language Technology: The Future of Machine Translation

Machine translation will continue to evolve and become increasingly important for translation productivity if it is deployed properly.

Machine Translation vs. Machine Translation Plus Post-Editing

When is it best to rely on Machine Translation? When should you consider a hybrid model that incorporates traditional, human translation? We go through the scenarios. 

Neural Machine Translation: How Artificial Intelligence Works When Translating Language

Delve into what Neural Machine Translation is and why it is considered a game changer for the language industry. 

Machine Translation in Translation

This handy cheat sheet will bring you up to speed on the most important terms associated with Machine Translation. 
