
Huawei has better Ascend chip-based AI training tech than DeepSeek


Huawei's Pangu team has reportedly said that the company has developed an Ascend chip-based AI training technique that improves on DeepSeek's. The company recently published a paper outlining the new training method.

An AI training method is the process of teaching a computer system (here, an AI large language model) to make decisions or predictions by learning from data. It involves feeding the system data so that it can identify patterns and draw inferences.
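To make that idea concrete, here is a minimal, purely illustrative sketch of a training loop: a tiny model with one learnable parameter is repeatedly adjusted to reduce its prediction error on example data. The data, model, and learning rate below are made up for illustration; LLM training applies the same principle at a vastly larger scale.

```python
# Illustrative sketch of "training on data": a tiny model with a single
# learnable parameter is repeatedly nudged to reduce its prediction error
# on example (input, target) pairs. All values here are made up.

data = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2), (4.0, 8.1)]

w = 0.0    # the single learnable parameter
lr = 0.01  # learning rate (how big each adjustment step is)

for step in range(1000):
    grad = 0.0
    for x, y in data:
        pred = w * x                  # the model's prediction
        grad += 2 * (pred - y) * x    # gradient of squared error w.r.t. w
    w -= lr * grad / len(data)        # adjust the parameter to reduce error

print(f"learned weight: {w:.2f}")     # settles near 2.0, the pattern in the data
```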

In the new paper, Huawei Pangu team researchers say the company has developed a concept called MoGE.

Mixture of Grouped Experts (MoGE) is described as an upgraded version of Mixture of Experts (MoE), the technique used in DeepSeek's cost-saving AI models.

Before diving into the details, let’s understand the difference between MoE and MoGE.

MoE vs MoGE

MoE and MoGE are both machine learning techniques. The former uses many specialized experts (components within a larger model, each of which handles specific tasks) to tackle a complex task. It typically offers low execution costs and greater learning capacity, but it can also introduce inefficiencies.

According to Huawei's new paper, uneven activation of experts can degrade the performance of AI models when they run across multiple devices in parallel.

Huawei Ascend (Source – Huawei)

MoGE avoids that problem. It groups the experts during selection and balances the workload across them, so the technique offers more efficient execution for both model training and inference.
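To make the difference concrete, here is a small, purely illustrative Python sketch of the two routing styles: global top-k selection (MoE-style), where all of a token's chosen experts can end up on the same device, versus grouped top-k selection (MoGE-style), where an equal number of experts is chosen from each device's group so the per-device workload stays balanced. The expert counts, group sizes, and router scores are assumptions made up for this example and are not taken from Huawei's paper.

```python
import numpy as np

# Illustrative contrast between global top-k routing (MoE-style) and
# grouped top-k routing (MoGE-style). All sizes and scores are made up.

num_experts = 8                       # total experts
num_devices = 4                       # experts are split evenly across devices
experts_per_device = num_experts // num_devices
top_k = 4                             # experts activated per token

rng = np.random.default_rng(0)
scores = rng.random(num_experts)      # router scores for one token

# MoE: pick the top-k experts globally. Nothing prevents all of them
# from living on the same device, which skews the per-device workload.
moe_choice = np.argsort(scores)[-top_k:]

# MoGE: pick the same number of experts, but an equal share from each
# device's group, so every device receives the same amount of work.
per_group = top_k // num_devices
moge_choice = []
for d in range(num_devices):
    group = scores[d * experts_per_device:(d + 1) * experts_per_device]
    best = np.argsort(group)[-per_group:] + d * experts_per_device
    moge_choice.extend(best)

def load_per_device(chosen):
    return [sum(1 for e in chosen if e // experts_per_device == d)
            for d in range(num_devices)]

print("MoE  per-device load:", load_per_device(moe_choice))   # can be uneven
print("MoGE per-device load:", load_per_device(moge_choice))  # always balanced
```

In this toy setup the grouped router always spreads the chosen experts evenly across devices, which is the load-balancing property the paper credits for more efficient parallel training and inference.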

Huawei Pangu team researchers tested the new MoGE architecture on Ascend NPUs, which are built to accelerate AI workloads, and found that it delivered better results than the MoE technique.

Huawei's new MoGE-powered Pangu model has reportedly obtained state-of-the-art performance on many benchmarks and achieved higher efficiency in long-context training than DeepSeek-V3, Alibaba's Qwen 2.5-72B, and Meta's Llama-405B.

The Pangu Ultra model also achieved top scores in language comprehension and reasoning tasks. It has 135 billion parameters optimized for NPUs. The new model and Huawei's Ascend chip-based AI training system may be made available to commercial customers in the future.

