Unlike last year, when a hundred-model war raged and general large models were built by stacking computing power, this year the large model industry has to prove itself through business.
By Zhao Yanqiu and Xu Xin
Edited by Niu Hui
This is the first year of large model industrialization. Since the beginning of the year, customers have raised their expectations: they want large models to solve more real business scenarios.
In the process, agents have exploded in popularity, with more and more customers wanting their applications to evolve toward the next generation of agent-based systems.
To put it simply, if a large model is the brain, an agent is its hands and feet: it breaks down a customer's complex requirements, calls workflows and tools, and becomes a real business assistant. And because the barrier to entry is low enough, more people can get started. Today, most AI-native applications can be built with agents.
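The "brain plus hands" idea can be made concrete with a minimal sketch: a model-driven planner decides which registered tool to call, and the agent loop executes it. The tool names and the keyword-based planner below are illustrative stand-ins, not JD's actual Yanxi API.

```python
# Hypothetical agent loop: the "brain" (here a trivial keyword planner standing
# in for an LLM) picks a tool, and the agent "hands" execute it.

def check_inventory(sku: str) -> str:
    return f"SKU {sku}: 42 units in stock"   # stubbed business system

def create_ticket(text: str) -> str:
    return f"ticket opened: {text}"

TOOLS = {"check_inventory": check_inventory, "create_ticket": create_ticket}

def plan(request: str) -> tuple[str, str]:
    """Stand-in for the LLM: map a request to a tool call and its argument."""
    if "stock" in request or "inventory" in request:
        return "check_inventory", request.split()[-1]
    return "create_ticket", request

def run_agent(request: str) -> str:
    tool_name, arg = plan(request)
    return TOOLS[tool_name](arg)

print(run_agent("how much stock for A123"))  # SKU A123: 42 units in stock
```

A real platform replaces `plan` with a large model call and `TOOLS` with registered workflows and enterprise APIs, but the control flow is the same.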
Every large model company and ecosystem player in the industry is now working on agents. In other words, large model deployment has entered a stage where agents are table stakes.
At the 2024 JD Cloud Summit held in Shanghai on July 30, JD officially released the Yanxi agent platform, a one-stop agent development platform. JD Cloud's judgment is that agents, digital humans, and embodied intelligence will be the core interaction media between large models and end users; among them, agents are the most cloud-native and an important driver of enterprises' AI-native applications.
In fact, JD Cloud has released a full stack of products supporting large model deployment, including the Yanxi agent platform, to accelerate adoption of large models across all scenarios.
"The general model is stacked up by computing power, and the enterprise model is run out by business." Cao Peng, Chairman of JD Technical Committee and President of JD Cloud Business Division, said at the Cloud Summit. Through the increasingly perfect product and tool platform system, the large model can be combined with the industry and give full play to its potential.
Digital Intelligence Frontline learned that the full-stack products of JD Cloud's large model were incubated in the super incubator of JD Supply Chain. At present, there are more than 100 large-scale model applications within JD.com, supporting the applications of more than 600,000 employees and 200,000 merchants.
01
"I was amazed by the enthusiasm of the front line for agents"
A person in charge at JD said that, in practice, the agent platform has proven to be one of the most effective tools in this year's large model deployments.
The Yanxi agent platform JD released is in fact a product born internally and then externalized. Development began last October; it was opened to JD employees and some ecosystem companies this spring, and in just a few months employees built more than 3,300 agents. "It surprised us," the person in charge said. Thousands of workflows and knowledge bases have also accumulated on the platform, and its popularity has exceeded expectations.
"This may be related to the fact that JD.com has many business personnel and the chain is long enough." An interesting phenomenon is that in the past few years, JD.com has invested a large number of AI algorithm teams in the core retail supply chain. However, this large-scale model change has brought subversive changes to those groups who have not been affected by AI in the past. Front-line business, function, and product managers all combine their own work to create agents.
For example, someone has set up a long video editing agent. Because there are a large number of training videos to be edited within JD.com, in the past, everyone had to manually find the corresponding part, and then use editing software to cut it, and only a few can be cut a week. Now this video editing assistant, as long as you upload videos and requirements, you can use the multi-modal capabilities of large models to find the corresponding frames, and employees can edit them, and hundreds of them can be completed in a week.
Another intelligent brother assistant does path planning and intelligent prompts for the courier brother to free their hands; In JD.com's agent market, there are a large number of active agents. There is a telemarketing quality inspection agent, which has generated millions of visits, replacing manual review and verification of marketing calls...... In addition, JD.com has more than 600,000 employees and a large number of general scenario agents, such as reimbursement and learning.
Because so many front-line employees use it, the Yanxi agent platform supports zero-code development, so employees without an algorithm background can build agents through a visual, drag-and-drop interface. "The agent platform exists so that every AI inspiration can be implemented quickly," the JD person in charge said.
Agents have also changed organizational collaboration. In the past, when business departments collaborated, they had to meet, build integrations, and agree on schedules. Now everyone spontaneously registers their own tools and APIs on the agent platform, linking the underlying capabilities so that employees in other departments can call them directly. Collaboration has become a different experience.
After this internal tempering, the Yanxi agent platform is now open to the public, offered both as a public cloud service and as a private deployment.
Many agent platforms already on the market target individual developers, but the Yanxi agent platform is more industry-oriented. During internal incubation it was widely used in JD's retail, health, and logistics businesses, and corresponding industry solutions were distilled from that experience. In this release, the platform comes preset with relevant configuration templates and plug-ins, plus more than 100 industry solutions, so customers can build a dedicated agent in one minute.
On how to make good use of agents, the JD person in charge suggested paying special attention to two things. The first is to keep digging for hit applications and building benchmarks; for example, every week JD selects agents, based on data, to feature in the official marketplace and attract more users. "Hackathons push it up a notch, wave after wave."
The second concerns platform operations: how do you support applications from thousands of employees? On one hand, build a category system with enterprise characteristics, organizing thousands of agents into categories so everyone can find them; on the other, promptly distill mature solutions that users can adopt directly.
AI-native applications built with agents are also changing the enterprise software market. Some agents directly replace a company's existing SaaS software; others are embedded into SaaS systems. The platform also provides a simple release path: enterprises can publish the agents they build, via Web, API, and other channels, to internal IM such as WeCom and other collaborative office tools.
The industry recognizes that agents are still at a very early stage, and more unexpected agent capabilities will emerge; that depends on thousands of enterprise application builders. The process of their adoption will be the process of the AI agent's evolution.
02
Behind the agents: the assembly line that forges large models
At present, the Yanxi agent platform connects to dozens of large models, all delivered by the Yanxi AI development and computing platform. During the Shanghai summit, that platform was fully upgraded to 2.0, with key capabilities highlighted around the pain points of large models.
The first capability is model compression plus distillation. Since the end of last year, enterprises' main effort has been to prune and distill various small models from base large models according to customers' usage scenarios. Cao Peng explained that even after the arms-race level of investment, general models still fall just short in real scenarios and need special tuning. Moreover, many scenarios require fast response and low inference cost, so the market for small models is larger.
At present, enterprises generally use a group of models to cover different scenarios. These model groups are extracted and compressed from a general large model and then amplified with infused enterprise knowledge. On the Yanxi AI development and computing platform, users can obtain an enterprise's professional models through this push-and-pull process with zero code, with inference costs sharply reduced and speed increased 1.5 times.
"At present, the industry usually does a two-step process – compression and vertical model fine-tuning." The relevant person in charge of Jingdong's artificial intelligence business department said that the Yanxi AI development and computing platform can infuse vertical domain knowledge into the process of compression. This is also a scheme widely adopted by JD.com.
The second capability is data preparation. Every industry has a great deal of data, including multimodal data, which demands very strong processing capabilities from the tool chain.
Meanwhile, the lack of process data is the biggest obstacle to deploying large models in industry. "For example, we can see a symptom and the expert's advice on how to handle it, but we don't know the expert's reasoning logic," a person in charge at JD Health told Digital Intelligence Frontline. Without that reasoning logic, the hallucination problem cannot be solved.
"We put a lot of effort into it today, through experts and big models to complement it." One is through RAG, which is an indispensable technology in the industry, to give the model literature and let it automatically capture the inference link; The other is supplemented by specialists. Whether the tool platform can help the expert team improve efficiency is also the key to the technology that the industry is fighting.
In addition, synthetic data generated by large models is very popular. "The team has also done in-depth work on how to synthesize vertical-domain data that stays close to the seed data," said the person in charge of JD's AI business unit.
The third capability is model evaluation, covering both general and vertical capabilities. For general large models there are good public leaderboards, with open code and evaluation datasets, so scoring can be fully automated. "We need to protect general capability first. Without general capability, there is no vertical domain."
For vertical-domain evaluation, JD has evaluation datasets in health, retail, and other sectors, and users can evaluate their tuned models on them, also automatically. There are manual evaluations too: in the health scenario, for example, people who understand the business better take part, and the platform provides a crowdsourcing-style mechanism for them to join in.
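The automated part of this flow reduces to running the tuned model over an evaluation set and scoring it. The sketch below uses exact-match accuracy with a tiny invented dataset and a stubbed "model"; these are placeholders, not JD's evaluation assets or metrics.

```python
# Hypothetical automated evaluation: score a model endpoint on an eval set.

EVAL_SET = [
    {"question": "capital of France?", "answer": "Paris"},
    {"question": "2 + 2?", "answer": "4"},
    {"question": "largest planet?", "answer": "Jupiter"},
]

def fake_model(question: str) -> str:
    # Stand-in for a real model endpoint; answers two of three correctly.
    return {"capital of France?": "Paris", "2 + 2?": "4"}.get(question, "Saturn")

def evaluate(model, eval_set) -> float:
    """Exact-match accuracy over the evaluation set."""
    correct = sum(model(item["question"]) == item["answer"] for item in eval_set)
    return correct / len(eval_set)

score = evaluate(fake_model, EVAL_SET)
print(f"exact-match accuracy: {score:.2f}")  # 0.67
```

Real vertical benchmarks would add task-specific scorers (e.g. rubric grading for open-ended medical answers), but the harness structure is this simple loop.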
Building an evaluation system matters enormously in industry. "It's not so much about how to train a vertical model; what matters is having an evaluation system that tells you which direction the model should go." Beyond the tool platform, JD has also set up a dedicated evaluation team: "They are our behind-the-scenes heroes."
03
The AI foundation accelerates large models across all scenarios
As large models and agents reach the front lines of industry, the sector has found that the underlying infrastructure of computing, networking, and storage must also adapt to the new situation and solve new challenges.
Most typically, current industrial practice with large models calls for a more open infrastructure platform, one that supports multi-cloud, multi-chip, and multi-active deployment and can host a variety of models to meet complex application scenarios and business needs.
Building multimodal large models today requires 10 or even 100 times more computing power than before. Enterprises worldwide increasingly complete model training, invocation, and inference on heterogeneous computing power, easing the general compute shortage and improving cost-effectiveness.
Beyond computing, on the storage side, peak periods of model training may require terabytes of data to be processed within tens of seconds, whereas in traditional applications the processing of such masses of small files could be spread over months. Storage products must therefore evolve toward higher throughput, higher IOPS, higher bandwidth, and lower latency. "At the same GPU scale, storage performance can make a threefold difference in the model training cycle," Cao Peng noted in his speech.
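A back-of-the-envelope check shows why this claim implies a qualitative change in storage requirements: moving terabytes in tens of seconds demands tens of gigabytes per second of sustained throughput. The figures below are illustrative round numbers, not measured values.

```python
# Rough throughput implied by "terabytes within tens of seconds".

data_tb = 1        # e.g. a 1 TB checkpoint or data-shuffle burst (assumed)
window_s = 30      # "tens of seconds" (assumed)

required_gb_per_s = data_tb * 1000 / window_s
print(f"required throughput ≈ {required_gb_per_s:.1f} GB/s")
```

For comparison, a conventional enterprise NAS typically sustains well under 10 GB/s, which is the gap driving the move to parallel, high-bandwidth storage for training clusters.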
Under the scaling law, hyperscale clusters also challenge the efficiency of the network between hardware. Not long ago, Musk announced with fanfare that a 100,000-GPU H100 supercomputing center had been built, and the industry has been intensely interested in the network architecture behind it, NVIDIA Spectrum-X. After all, a robust, low-latency network determines how fully a cluster's computing resources can be used.
These trends show that infrastructure shapes the pace of large model adoption. Facing this systems war, vendors need to build a more powerful AI foundation to accelerate large models across all scenarios.
At the summit, JD Cloud disclosed the recent evolution of its AI foundation. In high-performance storage, for example, Yunhai, JD Cloud's self-developed new-generation storage product, has been tempered by more than ten years of JD's own complex scenarios and can deliver the extreme performance required for moving the enormous datasets of large models.
Yunhai, fully upgraded to 3.0 at the summit, further improves throughput, bandwidth, and latency, providing more solid support for large model deployment, and is compatible with all mainstream domestic platforms.
It is understood that JD Cloud Yunhai currently provides the underlying data storage for large model training at heavyweight financial institutions such as China Construction Bank, while at the application layer it also helps these enterprises with online digital transformation through data-element cooperation.
On the computing side, JD has built the Yunship computing power cloud platform. It can manage and schedule heterogeneous computing resources, including various CPUs, GPUs, and domestic AI accelerator chips; it supports unified scheduling of distributed computing power across multiple regions and provides cost-effective compute supply.
In addition, its vGPU pooling solution pools heterogeneous GPU resources to raise AI compute utilization by 70%, effectively cutting compute costs. This is JD Cloud's answer to the industry's widespread anxiety about computing power.
Meanwhile, JD Cloud's large model security and trustworthiness platform covers more than 200 proprietary red-blue adversarial attack methods and all 31 risk categories required for regulatory compliance, with risk-identification accuracy above 95%.
On this year's price war among large model companies, Cao Peng noted that at last year's summit JD Cloud pledged to cut costs through technical upgrades, compare prices across the web, and undercut mainstream cloud vendors' lowest transacted prices by 10%. Today, more than 100 JD Cloud product SKUs are sold under this price-comparison scheme, and JD Cloud has set aside a 1-billion-yuan price-comparison fund to return the dividends of technology to developers.
Overall, at the 2024 JD Cloud Summit, by releasing full-stack products spanning infrastructure, model services, and agent applications, JD is accelerating its integration with industry, so that more enterprises can deploy large models and prove them out through business.