China’s AI Chip Ecosystem Needs Improvement, Says Member of Chinese Academy of Engineering

By  Annabelle  Aug 04, 2024, 11:06 p.m. ET

TMTPOST -- Large AI models are evolving from single-modality to multi-modality, and their growing range of applications is driving explosive growth in demand for computing power, said Zheng Weimin, an academician of the Chinese Academy of Engineering and a professor in the Department of Computer Science and Technology at Tsinghua University.

China's domestic AI chip ecosystem is less developed than Nvidia's, he pointed out.

Zheng made the remarks at the 2024 annual seminar of the China Information Technology Association (ChinaInfo100) on Sunday.

He explained that the computing power required by large models spans four main stages: model development, model training, model fine-tuning, and model inference. Consequently, computing power is essential throughout the lifecycle of a large model.

He highlighted the high cost of computing power. For instance, he said, the development of GPT-4 used 800 Nvidia A100 GPUs at a monthly cost of $2 million. Training with 10,000 A100 GPUs costs $200 million, and the daily inference cost for ChatGPT is $700,000. At large model companies, computing power accounts for 70% of model training costs and 95% of inference costs.
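To put these figures on a common footing, the short Python sketch below derives a per-GPU monthly cost for the development stage and an annualized inference bill. The input numbers are the ones Zheng cited; the function names, the 365-day year, and the assumption that the daily inference cost stays constant are illustrative choices, not part of his remarks.

```python
# Figures as cited by Zheng Weimin in the article.
DEV_GPUS = 800                      # A100 GPUs used in the development stage
DEV_COST_PER_MONTH = 2_000_000      # USD per month
INFERENCE_COST_PER_DAY = 700_000    # USD per day (ChatGPT)


def cost_per_gpu_month(total_monthly_cost, gpu_count):
    """Average monthly cost attributable to one GPU (hypothetical helper)."""
    return total_monthly_cost / gpu_count


def annualized(daily_cost, days_per_year=365):
    """Yearly cost, assuming the daily cost stays constant (an assumption)."""
    return daily_cost * days_per_year


per_gpu = cost_per_gpu_month(DEV_COST_PER_MONTH, DEV_GPUS)
yearly_inference = annualized(INFERENCE_COST_PER_DAY)

print(f"Development: ${per_gpu:,.0f} per GPU per month")
print(f"Inference:   ${yearly_inference:,.0f} per year")
```

Under these assumptions, the development figure works out to $2,500 per GPU per month, and the inference figure to roughly $255.5 million per year.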

At the model training level, Zheng identified three support systems. The first is GPU systems based on Nvidia chips. They offer excellent hardware performance and a robust programming ecosystem, but they are not sold to China, which makes them hard to obtain and significantly more expensive.

The second is systems based on domestic AI chips. Although domestic chips have made significant progress in both software and hardware, users are reluctant to adopt them because the ecosystem is underdeveloped.
