Turbocharge Translation Quality: The Power of Custom Engine Training

Discover insights from Smartling AI leaders on how training a custom machine translation engine can significantly improve translation quality.

July 8th, 2024

As global use of AI-powered translation continues to grow, so does interest in ensuring that its outputs align with brand standards and are tailored to customers' specific needs. One effective way to achieve this is with custom machine translation (MT) engines. By drawing on a company's linguistic assets, such as translation memory, style guides, do-not-translate lists, and glossaries, these engines can be customized to optimize outputs for brand compliance and quality.

However, the pivotal question remains: what is the actual impact of customized engines on aspects like translation quality and brand alignment? Are they truly worth the investment, and how can they be utilized most efficiently? Smartling AI leaders Olga Beregovaya and Alex Yanishevsky explored these questions and more during a recent Smartling webinar. Read on for a few key takeaways from the conversation.

1. Why Machine Translation (MT) in the age of Generative AI (GenAI)? The proof is in the pudding.

While the buzz around GenAI might suggest the industry has moved on from MT, the data shows otherwise. MT continues to outperform large language models (LLMs) like GPT-4 in Smartling benchmark assessments, achieving consistently higher BLEU scores and lower edit distances. That said, GenAI can still play a significant role in translation workflows (see item 4 of this list).

2. Custom engines deliver higher quality output. Period.

Trained engines achieve BLEU scores that are 18% higher on average than those of their untrained counterparts, and their outputs require 22% fewer edits than those of generic engines. These higher-quality outputs need less review by human translators, so translation teams can work more efficiently. Better still, when combined with self-learning AI features like dynamic fuzzy match repair or formality switching, custom engines deliver consistently higher-quality output than generic engines or GenAI on its own.
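To make "fewer edits" concrete, here is a minimal sketch of one common way to measure edits: the word-level Levenshtein distance between raw MT output and its human post-edited version. This is an illustration only; the function names are hypothetical, and the exact edit-distance metric used in Smartling's benchmarks is not described in the webinar recap.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or token lists)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_edit_distance(mt_output, post_edited):
    """Share of words a reviewer changed, from 0.0 (none) to 1.0 (all)."""
    mt_tokens = mt_output.split()
    pe_tokens = post_edited.split()
    longest = max(len(mt_tokens), len(pe_tokens))
    if longest == 0:
        return 0.0
    return edit_distance(mt_tokens, pe_tokens) / longest
```

Averaging this score over the same test set for a generic engine and a trained engine gives a like-for-like comparison of how much post-editing each one leaves for reviewers.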

3. The golden rule of custom engine training: Garbage in, garbage out.

Training an MT engine requires uploading linguistic assets like style guides and glossaries, but an “upload and pray” approach can only take an engine so far in terms of quality. Hence the saying common in the AI community: “garbage in, garbage out.” MT engines work best with clear, unambiguous entries in a format they can process (TMX files). Effective custom training requires a highly granular approach to uploading linguistic content, where your team keeps close control over what your engine is trained on and how it’s trained.
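As a rough illustration of what a more granular approach can look like, the sketch below screens a TMX translation memory before it is used for training. The filtering rules (drop empty segments, drop pairs where the target is identical to the source, drop exact duplicates) and the function name are assumptions chosen for this example, not Smartling's training pipeline.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def clean_tmx_pairs(tmx_text, src_lang, tgt_lang):
    """Return de-duplicated (source, target) pairs from a TMX document."""
    root = ET.fromstring(tmx_text)
    seen = set()
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            text = "".join(seg.itertext()) if seg is not None else ""
            # Collapse runs of whitespace so near-identical entries match.
            segments[lang] = " ".join(text.split())
        src = segments.get(src_lang.lower(), "")
        tgt = segments.get(tgt_lang.lower(), "")
        if not src or not tgt:
            continue  # one side is missing or empty
        if src == tgt:
            continue  # heuristic: probably untranslated; review by hand
        if (src, tgt) in seen:
            continue  # exact duplicate
        seen.add((src, tgt))
        pairs.append((src, tgt))
    return pairs
```

Identical source and target text can be legitimate, for example for do-not-translate terms, so in practice a team might route those entries to a human reviewer instead of discarding them.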

Ensure your engine evolves with you.

Training a custom engine is like maintaining proper dental hygiene. Without regular check-ups, cavities can form and cause damage. Similarly, as your business grows, your custom engine needs regular updates with new terminology, messaging, and style guidelines to continue producing accurate outputs.

4. LLMs + custom engines: “either/or” or “both/and?”

Our answer: both/and!

While LLMs alone may not consistently deliver higher quality outputs compared to traditional MT, they excel in refining and enhancing MT outputs for specific tasks such as formality switching, adherence to style guidelines, and audience persona alignment.

Smartling’s latest innovation, the AI Translation Toolkit, harnesses the power of LLMs to optimize MT output. Dynamic features like AI Fuzzy Match Repair and Glossary Term Insertion use LLMs to correct grammatical errors and ensure brand consistency, leading to higher-quality MT output.

5. It all comes down to ROI.

Whether custom engine training is right for your translation workflow depends on the return you can expect. If you already have a robust translation memory that covers most of the content you plan to translate in the future, a generic MT engine can likely meet your needs without the investment in a custom engine. Generic engines, however, might lack the vocabulary for particular domains or languages. If that’s the case, custom engine training is a worthy investment to streamline your translation workflows.
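One way to frame the ROI question is as a break-even volume: how many words you need to translate before savings on human review cover the cost of training. The sketch below assumes, for simplicity, that review cost falls in proportion to the reduction in edits. The function name and all inputs are hypothetical placeholders to be replaced with your own figures.

```python
import math


def break_even_words(training_cost, review_cost_per_word, edit_reduction):
    """Words to translate before review savings cover the training cost.

    training_cost: one-off cost of training the custom engine.
    review_cost_per_word: what human review costs per word today.
    edit_reduction: fraction of edits saved (0.22 for 22% fewer edits).
    Assumes review cost scales linearly with the number of edits.
    """
    savings_per_word = review_cost_per_word * edit_reduction
    if savings_per_word <= 0:
        return None  # no savings, so training never pays for itself
    return math.ceil(training_cost / savings_per_word)
```

If your expected translation volume is well above the break-even figure, training is easier to justify; if it is well below, a generic engine paired with a strong translation memory may be enough.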

At the end of the day, custom engine training is here to stay.

Keep up with more upcoming events from Smartling here.