Machine Translation (MT) has been around for decades. In recent years, it has evolved at an exponential rate. As companies produce ever-growing amounts of content in different languages, they have come to see MT as an opportunity to extend their reach in an increasingly globalized world.
Companies seeking to deploy MT can explore the following four basic strategies.
No matter which strategy you are contemplating, selecting the right engine may be challenging without the proper data and MT experience. Lionbridge is an expert in MT. In addition to having more than two decades of MT experience, we have gathered a large volume of linguistic and quality data about MT technology that will help you make the right choice. This webpage provides basic information about the performance of popular MT engines for the most common language pairs to help you select the best option based on your content.
When choosing among the many available MT systems, note that some engines are built for a specific function or domain. If your needs don’t align with that purpose, the engine may perform sub-optimally no matter how advanced it is. To determine the best option, first identify why you are using MT.
If you want an MT engine for general use, it may be appropriate to use Google Translate or Bing Translator. If you seek MT services for a specific language or domain, you may achieve better results by turning to Amazon Translate or DeepL Translator.
Lionbridge’s Machine Translation Tracker analyzes engine performance monthly to help you identify the best MT engine for the language pairs you use. The next time you ask which MT engine is best, reframe the question as, “Which MT engine is best for me?” And count on Lionbridge for guidance.
Want to learn more about the different types of MT technologies? Check out our blog Machine Translation in Translation.
We’ve enhanced the Lionbridge Machine Translation (MT) Quality Tracker report, given the prevalence and promise of generative AI (GenAI) / Large Language Models (LLMs). From now on, the report will include GPT-4 translation results in addition to GPT-3.5 and Davinci results and, of course, Neural MT (NMT) engine performance.
We faced several issues associated with GPT-4, including slow performance, its inability to provide translations for various reasons, and inconsistent behavior, such as missing translations in some runs but not in others.
GPT-4 failed to translate a particular sentence in our MT test set.
After some research, we determined that a term that carries a sexual connotation in certain contexts caused the issue. To be clear, the sentence in our test set was entirely standard and acceptable. Nonetheless, the term triggered GPT-4’s sexual content filter, and the AI censored the translation of that sentence and returned no output. We were surprised by this result for two reasons:
This observation led us to conclude that part of the GPT-4 filtering mechanism may be based on a simple forbidden-word list that also includes ambiguous terms. This approach is problematic because it is prone to overfiring and producing false positives, which is a serious issue for professional translation.
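To illustrate why a plain word list overfires, here is a minimal Python sketch. It is not GPT-4's actual filter, which is not public; the blocklist, the term chosen, and the function name `naive_filter_blocks` are all hypothetical and exist only to show the failure mode.

```python
import re

# Hypothetical blocklist for illustration only; the real filter is not public.
# "breast" is ambiguous: it appears in ordinary medical and culinary text.
BLOCKLIST = {"breast"}

def naive_filter_blocks(text: str, blocklist: set[str] = BLOCKLIST) -> bool:
    """Return True if any blocklisted word occurs in the text, ignoring context."""
    words = re.findall(r"[a-z]+", text.lower())
    return any(word in blocklist for word in words)

sentences = [
    "Schedule a breast cancer screening every two years.",
    "Grill the chicken breast for ten minutes.",
    "Restart the device after installing the update.",
]

# One flag per sentence: True means the naive filter would block it.
flags = [naive_filter_blocks(s) for s in sentences]
```

The first two sentences are harmless, yet a context-free lookup blocks both of them. That is the false-positive pattern described above.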
Because earlier Machine Translation technologies, such as Neural MT engines, do not have this type of content filtering issue, we can conclude it is a limitation of LLM technology.
The limitation has implications for real-world scenarios. For instance, imagine you need to translate medical content associated with gynecology or sexual education. You may be surprised that the LLM will not translate some of your text.
Interestingly, this issue occurred only when we translated that sentence into one particular language, Chinese, and not when we translated it into other languages. This result indicates that the filter was applied to the GPT-4 output rather than to the input. The solution is to turn off the content filters for translation tasks.
After five weeks of tracking, we found LLM Machine Translation output highly variable, particularly with GPT-4.
While we expected this outcome for generative AI, the variability was more significant than anticipated — even when we used Temperature and Top Probability (Top_p) parameter settings to reduce creativity and make the output more deterministic. The translation output was different in every single GPT run we conducted, even when we ran translations one right after the other.
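One way to quantify this run-to-run variability is to translate the same sentence several times with fixed settings and compare the outputs pairwise. The sketch below is an assumption-laden illustration, not the tracker's method: `DECODING_PARAMS`, `run_repeatedly`, and `fake_translate` are hypothetical names, the parameter values are arbitrary, and the stand-in translator returns canned strings instead of calling a real model.

```python
from difflib import SequenceMatcher
from itertools import combinations, cycle
from typing import Callable

# Assumed low-creativity settings; the values are illustrative only.
DECODING_PARAMS = {"temperature": 0.0, "top_p": 0.1}

def mean_pairwise_similarity(outputs: list[str]) -> float:
    """Average similarity ratio over all output pairs (1.0 = all runs identical)."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        return 1.0
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

def run_repeatedly(translate: Callable[..., str], source: str, runs: int = 5) -> list[str]:
    """Translate the same source several times with identical parameters."""
    return [translate(source, **DECODING_PARAMS) for _ in range(runs)]

# Stand-in for a real LLM call: alternates between two canned outputs
# to mimic a model that answers differently on consecutive runs.
_canned = cycle(["Der Bericht ist fertig.", "Der Bericht ist abgeschlossen."])

def fake_translate(source: str, temperature: float, top_p: float) -> str:
    return next(_canned)

outputs = run_repeatedly(fake_translate, "The report is finished.", runs=4)
score = mean_pairwise_similarity(outputs)  # below 1.0 because the runs differ
```

A score of 1.0 would mean every run produced the same text. Tracking this number over time, per engine and language pair, gives a rough picture of how stable the output is.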
Both translations may be acceptable even though they differ. Nonetheless, this is another aspect to control and another difference from the previous Neural MT paradigm.
We are starting to suspect that this potential paradigm change, from NMT to LLM MT, may be not only a technological change but also one that requires a change in mindset: We may need to be prepared to live with less deterministic outputs, even when using the very same input and the very same parameters, and expect to see more variability than we are used to with current automation.
While we may have to live with more uncertainty to some extent, it may be possible to use some mechanisms and best practices to make that variability somewhat controllable.
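As one possible example of such a mechanism (our own assumption, not a practice described in this report), you could request several candidate translations and keep the one that agrees most with the others. The function name `pick_consensus` is hypothetical.

```python
from difflib import SequenceMatcher

def pick_consensus(candidates: list[str]) -> str:
    """Return the candidate with the highest total similarity to all the others."""
    if not candidates:
        raise ValueError("need at least one candidate")

    def total_similarity(i: int) -> float:
        return sum(
            SequenceMatcher(None, candidates[i], other).ratio()
            for j, other in enumerate(candidates)
            if j != i
        )

    # max() keeps the earliest index on ties, so the result is deterministic.
    best = max(range(len(candidates)), key=total_similarity)
    return candidates[best]

candidates = ["The cat sat.", "The cat sat.", "A cat was sitting."]
chosen = pick_consensus(candidates)
```

This does not remove variability. It only makes the final choice repeatable for a given set of candidates, at the cost of extra model calls.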
Finally, as you review our chart, note that the Edit Distance decrease for GPT-4 does not indicate decreasing quality. It is merely a reflection of the variability of GPT outputs. Next month, we may see the line going up. Watch this space for developments and more insight.
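For readers unfamiliar with the metric, the sketch below shows a standard Levenshtein edit distance. How the tracker tokenizes and normalizes its Edit Distance is not stated here, so the word-level comparison and the normalization by the longer segment in `normalized_edit_distance` are assumptions made for illustration.

```python
from typing import Sequence

def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def normalized_edit_distance(mt_output: str, reference: str) -> float:
    """Word-level edit distance divided by the longer segment's length (assumed scheme)."""
    mt_tokens, ref_tokens = mt_output.split(), reference.split()
    longest = max(len(mt_tokens), len(ref_tokens))
    return levenshtein(mt_tokens, ref_tokens) / longest if longest else 0.0

distance = normalized_edit_distance("the cat sat on the mat", "the cat sat on a mat")
```

Because the score depends on the exact wording of each output, two equally acceptable translations of the same sentence can produce different values. This is why month-to-month movement in the GPT-4 line can reflect output variability and not a change in quality.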
—Rafa Moral, Lionbridge Vice President, Innovation
For more insights and future trends about Machine Translation, read our Future of Language Tech blog post – Future of Machine Translation.