Last Updated: December 27, 2019 2:57AM
The black-box magic of AI algorithms is not possible without a little human intervention. Data collection, annotation, and validation still need human input to build the best systems. As Rana el Kaliouby told Forbes, some companies can synthesize previously collected data to help develop their AI, but “data synthesis doesn’t eliminate the need for collecting real-world data—this will always be critical to the development of accurate AI algorithms.” That’s one reason why Lionbridge values its vast global network of experts and laypeople; they are irreplaceable resources when supporting our clients as they design, train, and test machine learning systems.
Whether a machine learning algorithm handles text, sound, or video, the early data collection phase is key. Collecting high-quality data is a challenge described so often in conversations about AI that it nearly goes without saying. Transfer learning techniques can reduce the need for new discrete data. A consistent, diverse source of data points is especially important for algorithms that must change constantly, such as voice recognition tools that need to recognize new slang or industry terms. Once the data is collected, it needs to be transformed into something a computer can parse.
A pile of unorganized data is not in and of itself useful for training an AI. Accurate annotation of any kind ensures that the algorithm is learning from correct information. The Lionbridge crowd can annotate semantically, categorize text and content, extract entity information, and draw bounding boxes on images and video. Regardless of language, file format, or specialized knowledge required, we can find the perfect match to feed your algorithm crisply identified data.
Of course, without testing, no algorithm is complete. Test users for AI need to be as similar as possible to the targeted users of a “complete” algorithm to make sure the system will understand their inputs. That’s why, in addition to the highly skilled linguists and industry professionals who can help annotate data, Lionbridge has a community of laypeople ready to test drive AI outputs.
Rinse and Repeat
Iteration is the name of the game in technology, and development of an AI is no different. Our customers can return to us again and again to improve on their systems or create something new. Take a look at why and how we can support your AI development in our whitepaper.