Smaⁱrt Data™

High-quality linguistic data for exceptional translations.

Curated Data Collected From Professional Translation Work. All Domains. Any Language.


Data that delivers intelligence to the Lainguage Cloud™.

AI is only as effective as the data that trains it. Lionbridge has the best. We leverage our vast repositories of high-quality, curated language data to train our language AI and fuel our Lainguage Cloud. Smairt Data helps route each piece of content to the right workflow, find the right translator for it, detect issues and errors, and power our Computer-Assisted Translation (CAT) tools. Smairt Data works for you at every stage of the content journey and enhances the accuracy of Lainguage Cloud automations.

Smaⁱrt Data: The Lionbridge Difference

Smairt Data is the result of more than two decades of work on language data acquisition. We’ve sorted and tagged this data to create a uniform, centralized multilingual data repository. 

Lionbridge sources its data internally and curates it from high-quality translations. We don’t use poor-quality, user-generated, or scraped content. This practice eliminates segmenting issues and other problems.

We rely solely on large volumes of high-quality, accurate language data that works for you during every step of the content journey.

The Best Multi-language Dataset in the Industry

We’ve been building our dataset since our inception 25 years ago. In that time, we’ve amassed one of the largest repositories of bilingual corpora. It comprises high-quality translations and language datasets in more than 2,000 language pairs. The repository contains important elements for translation and localization.

Translation Memories

500,000 bilingual translation memories in every language, domain, and industry

Words

40+ billion words and 4 billion sentences of high-quality translations in dozens of domains and specialties

Languages

350 languages and more than 2,000 language pairs, making Smairt Data one of the most comprehensive data repositories in the industry

Metadata

4 types of metadata that are organized and tagged to increase the efficiency and speed of the Lionbridge Lainguage Cloud, which applies relevant data to tasks

Data Privacy and Your Language Service Provider

How does your LSP handle your data? What happens to your data when your LSP completes the job? Does your LSP have a privacy program that is continuously assessed? It’s time to find out. We take data privacy and security seriously and go to great lengths to protect your data.

The Metadata That Fuels Smairt Data

Smairt Data is powered by two things: a vast repository of high-quality language data and relevant metadata. Lionbridge gathers four types of metadata to extend the usefulness of Smairt Data. 

Content Data

Data on overall content characteristics, such as domain, language type, content complexity, source quality, and frequently encountered issues, enables us to profile inbound content and understand it better in the early stages of localization.

Traⁱnslation Community™ Data

Data involving translator performance is imperative for the Lainguage Cloud to match the right content to the right translator. We use extensive translator profile data, quality assessments, and information about translators’ experience, skills, specializations, speed, and delivery success rates to make assignments. We strive to set translators up for success. We provide them with diverse, engaging work to continuously develop their skills.

Order Data

Data about project operations enables us to better understand the nature of the work and improve the way the Lainguage Cloud operates.

General Data

Data relating to language rules, generic language information, glossaries, and terms helps us understand languages better. It also enables translators to deliver precise translations as language evolves.

A New Way To Think About AI and Localization

Why does Lionbridge embed “ai” in select words? It’s to underscore the important role Artificial Intelligence (AI) plays throughout the translation and localization process.

What is Smairt?

The combination of smart and AI points to the evolution of a frictionless localization process as AI learns and enables that process to improve over time.

What is Lainguage?

The combination of language and AI emphasizes how important language AI is to succeeding in a digital-first marketplace: it boosts content output and enhances the quality and performance of that output across more languages.

What is Trainslation?

The combination of translation and AI refers to the synergy created when AI complements the work of talented translators, which enables companies to handle more content and grow their businesses.
