NBD reporter: Zhu Chengxiang | NBD editor: Yang Xia
From July 4 to 6, the 2024 World Artificial Intelligence Conference and High-Level Meeting on Global AI Governance was held in Shanghai. During the event, experts in fields such as large models and computing power shared forward-looking views.
Xu Li, CEO of SenseTime, believes that the essence of a large model is memory: remembering the world's knowledge. What intelligence it has comes from memorizing the higher-order reasoning logic behind that knowledge. In vertical industries, therefore, constructing synthetic data that carries this higher-order reasoning logic is often the key to success and differentiation. It is also the key to China's path in artificial intelligence.
Qiu Xiaoxin, founder and chairman of Aixin Yuanzhi, believes that truly deploying large models at scale requires tight integration across the three layers of cloud, edge, and device, and that the key to combining edge and device lies in AI computing and perception.
Qiu Xiaoxin, founder and chairman of Aixin Yuanzhi. Image source: Courtesy of the interviewee
Large model applications are being deployed
Regarding the application of large models, Xu Li said that if an industry is to change, its mode of interaction must change first. Real-time interactivity produces a fluid experience and is at the heart of driving super-moments and application change. What struck outside observers most about the release of GPT-4o was its ability to interact with people in real time, redefining the interface of human-computer interaction.
In addition, a major factor hindering the deployment of large models is the "hallucination" problem of large models.
Yan Junjie, founder and CEO of MiniMax, emphasized reducing the error rate. He said that since ChatGPT came out last year, followed by various versions of GPT-4, many domestic companies have been catching up and launching models of their own. The core problem is that current models' error rates remain relatively high. For example, GPT-4 may achieve only 60% to 70% accuracy on many benchmarks, that is, an error rate of 30% to 40%; the overall error rate of domestic models is still 60% to 70%.
Yan Junjie added: why do large model products take the form of dialogue? Because dialogue has a relatively high tolerance for error. Why can't a model yet act as an independent agent? Because each step carries a 30% to 40% error rate. The core problem, therefore, is how to reduce the error rate of large models from 30%-40% to 2%-4%, an order-of-magnitude reduction. This is the core marker of AI evolving from a tool that assists humans into a system that can complete work on its own.
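To see why per-step reliability dominates agent viability, a toy calculation helps (a minimal sketch; it assumes errors at each step are independent, which real agent workflows only approximate):

```python
# A minimal sketch of why per-step error rate decides whether a model
# can act as a multi-step agent (assumes independent per-step errors).
def task_success_rate(per_step_error: float, steps: int) -> float:
    """Probability an agent completes every step without a single error."""
    return (1.0 - per_step_error) ** steps

for error in (0.30, 0.03):
    for steps in (1, 5, 10):
        print(f"per-step error {error:.0%}, {steps:2d} steps -> "
              f"success {task_success_rate(error, steps):.1%}")
```

At a 30% per-step error rate, a ten-step task succeeds under 3% of the time; at 3%, it succeeds roughly 74% of the time, which is the order-of-magnitude gap Yan Junjie describes.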
Zhang Peng, CEO of Zhipu AI, believes that accuracy is only one dimension: discussions of accuracy are mostly confined to specific evaluation tasks and numerical, quantitative metrics, while some abilities are genuinely hard to quantify, such as logic and abstract thinking. Zhang Peng emphasized that these are precisely the areas where current models are stronger than humans or traditional methods.
Zhang Peng, CEO of Zhipu AI. Image source: Courtesy of the interviewee
Zhang Peng believes that breaking through multimodality is essential for large models. When a real person solves a problem in the real world, the information they take in is multimodal: besides natural language, there are vision, hearing, touch, and common sense, all of which must be integrated to solve many common real-world problems, not even complex ones.
For example, tasks such as sweeping, cooking, and washing clothes all require multimodal input, and breakthroughs in these capabilities will unlock the dividends of AI.
Regarding industrial large model applications, Li Shaobin, president of the Hong Kong Industrial Artificial Intelligence and Robotics Research and Development Centre (FLAIR), told the NBD reporter: "When we have more data, we can train large industrial models. Then we can directly ask a piece of equipment: how is your status? Is anything wrong? The equipment can reply with something like, 'I have found a possible minor problem with a certain component; I should be able to hold out for a week, so think about scheduling a replacement and doing preventive maintenance.'"
He added: "We want to combine large model technology with some of our solutions, so that in the future, communication between equipment and people on the shop floor will feel more like communication between people."
Cloud-side + device-side computing power coordination
At present, large model applications place growing emphasis on coordinating cloud-side and device-side computing power.
Xu Li believes that if all resources are concentrated in the cloud, inference costs will rise significantly and inference efficiency will fall, because network congestion inevitably makes services less smooth. SenseTime has therefore stepped up optimization of its models on the device side, improving accuracy by 10% while greatly increasing speed and cutting costs: first-packet latency is down 40% and inference efficiency is up 15%.
Xu Li, CEO of SenseTime. Image source: Courtesy of the organizer
Qiu Xiaoxin, founder and chairman of Aixin Yuanzhi, believes that intelligent chips and multimodal large models have become the "golden combination" of the artificial intelligence era. As large models are applied ever more widely, "more economical, more efficient, and more environmentally friendly" will become the watchwords for smart chips, and efficient inference chips equipped with AI processors will be the more sensible choice for deploying large models, which is also the key to making AI broadly accessible.
Jia Chao, vice president of Facewall Intelligence, believes that with its advantages in cost, privacy, latency, and reliability, device-side AI will become a global trend, which also means that large models have officially entered a lightweight era. In this context, "model knowledge density doubles on average every eight months" will become the new Moore's law of the large model era. Jia Chao emphasized that to develop device-side large models, enterprises must work from both the algorithm side and the chip side, using device-side chips to run device-side models efficiently in user scenarios and deliver the best possible experience.
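Taken at face value, the doubling claim implies exponential growth in knowledge density. A minimal sketch of the arithmetic (the eight-month doubling period is the only figure taken from Jia Chao's remark; the time points below are illustrative):

```python
# A minimal sketch of the proposed "new Moore's law": knowledge density
# doubling every 8 months, i.e. density(t) = 2 ** (t / 8) with t in months.
DOUBLING_PERIOD_MONTHS = 8

def density_multiplier(months: float) -> float:
    """Relative knowledge density after `months`, normalized to 1.0 today."""
    return 2 ** (months / DOUBLING_PERIOD_MONTHS)

for months in (8, 16, 24, 48):
    print(f"after {months:2d} months: {density_multiplier(months):5.1f}x density")
```

If the trend held, a model four years out would pack roughly 64 times today's knowledge into the same footprint, which is what would make a lightweight, device-side era plausible.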
National Business Daily