Huawei can use LLMs to revamp Document-Level AI translation

Huawei plans to improve its document-level AI translation capability by making better use of LLMs (large language models). The company’s researchers recently proposed a method to make AI translation of documents easier and more convenient.

This is not the first time Huawei has used LLMs to improve document-level AI translation. Earlier approaches broke documents into segments and translated them one at a time. That method required prompting the language model with a document summary and the translations of certain terms, such as names, places, and events.

This time, Huawei is taking a new approach: fine-tuning LLMs for AI translation. The method produces two different versions of the same translation, one generated sentence by sentence (Sent2Sent) and one generated from the full document (Doc2Doc).

Sent2Sent tends to be more fluent and accurate at the sentence level but lacks consistency across the document. Doc2Doc, meanwhile, is more consistent and context-aware but may omit certain details or even entire phrases.

There is a solution, however. The researchers combine both outputs and ask the LLM to refine them into a single, improved translation. The researchers said:

“We propose fine-tuning LLMs for translation refinement using two intermediate translations, combining the strengths of both Sent2Sent and Doc2Doc.”
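To make the refinement step concrete, here is a minimal sketch of how the source document and the two intermediate translations might be packed into a single prompt for the refining model. The function name `build_refinement_prompt`, the prompt wording, and the default language pair are all assumptions made for illustration; the researchers' actual prompt format is not given here.

```python
def build_refinement_prompt(source_doc, sent2sent, doc2doc,
                            src_lang="English", tgt_lang="German"):
    """Assemble one refinement prompt from a source document and two drafts.

    Hypothetical helper: the layout below is an illustrative assumption,
    not the prompt used by the Huawei researchers.
    """
    return (
        f"You are refining a {src_lang}-to-{tgt_lang} document translation.\n\n"
        f"Source document:\n{source_doc}\n\n"
        # Draft 1: translated one sentence at a time (fluent, less consistent).
        f"Translation A (sentence by sentence):\n{sent2sent}\n\n"
        # Draft 2: translated as a whole document (consistent, may omit details).
        f"Translation B (full document):\n{doc2doc}\n\n"
        "Combine the strengths of both drafts into one improved translation."
    )


if __name__ == "__main__":
    prompt = build_refinement_prompt(
        source_doc="Hello world. It is raining.",
        sent2sent="Hallo Welt. Es regnet.",
        doc2doc="Hallo, Welt. Es regnet gerade.",
    )
    print(prompt)
```

The resulting string would be sent to whichever fine-tuned model performs the refinement; that call is left out because it depends on the serving setup.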

Huawei can use LLMs to revamp Document-Level AI translation (Image Credits: Huawei)

Huawei researchers fine-tuned two open-source models: LLaMA-3-8B Instruct and Mistral-Nemo-Instruct.

The models are trained on a dataset of source documents, two intermediate translations, and a human reference translation. They receive quality-aware training that lets them focus on difficult or error-prone segments and compare the two inputs to produce a better result.
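As a rough illustration of that training setup, the sketch below shows one way a training record could be represented, along with a simple weighting scheme that gives lower-quality segments more influence. The `RefinementExample` class, the `segment_weights` function, the quality scores, and the "one minus quality" formula are all hypothetical; the article does not describe how the researchers implement quality-aware training.

```python
from dataclasses import dataclass


@dataclass
class RefinementExample:
    """One training record: a source, two drafts, and a human reference."""
    source: str
    sent2sent: str   # intermediate translation, sentence by sentence
    doc2doc: str     # intermediate translation, full document
    reference: str   # human reference translation (training target)


def segment_weights(quality_scores, floor=0.1):
    """Turn per-segment quality scores in [0, 1] into normalized weights.

    Illustrative assumption: a segment with a low quality score is treated
    as harder, so it receives a larger weight. `floor` keeps high-quality
    segments from being ignored entirely.
    """
    raw = [max(1.0 - q, floor) for q in quality_scores]
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    example = RefinementExample(
        source="Hello world.",
        sent2sent="Hallo Welt.",
        doc2doc="Hallo, Welt.",
        reference="Hallo Welt!",
    )
    print(example)
    # Hypothetical quality scores for three segments (higher means better).
    print(segment_weights([0.9, 0.5, 0.2]))
```

In a real training loop, weights like these could scale each segment's loss, but that is only one possible reading of "quality-aware".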

Researchers added:

“Our refinement approach, based on the two intermediate translations… significantly improves translation performance across all language pairs like English, German, French, Chinese, and Russian.”
