Author丨Gu Jian
Producer丨 Yan Xi
In the past few days, OpenAI has been pushed into the spotlight. Some US media outlets have calculated that unless this AI star company improves its profitability, it could run out of cash within a year. If even OpenAI is in this position, the other players racing to compete on technology are naturally faring no better.
The optimism of techno-idealists is a double-edged sword. Like an ultimate move in a game, it can suddenly produce a world-changing product; the trouble is that nobody knows how long the skill cooldown will last. Right now, technology geeks want to grow their products into the strongest foundation models and then dominate the entire industry with high-level general-purpose capabilities. When the time comes, they reason, they can swoop down on whichever scenario makes money and generate a solution for it.
This kind of thinking overestimates the technical precision the application layer requires and underestimates the complexity of its scenarios. Put simply, the "academic school" focuses too much on performance gains and ignores the actual needs of industrial applications.
Consider a few examples. A courier picking up parcels needs to get around the problem of sweaty hands and complete order entry by voice. Brand merchants hope to increase sales, using digital humans to take on more live-streaming hours and lift conversion rates. Small and medium-sized businesses with limited budgets want to save on marketing expenses while improving efficiency, using AI to create attractive, practical product promotional images. None of these scenarios requires especially high-precision AI, but all of them require a deep understanding of professional know-how and the characteristics of the scenario.
The 2024 JD Cloud Summit offered a "practical school" approach to these problems. The message of the summit was no longer a cliché like "look how strong my large model is". Instead, it was about bringing large models to more industries and scenarios and lowering the threshold for building and using large model applications, so that large models get running in industry first and deliver real economic benefits.
1. Let go of the "strongest" complex and let large models run in industry first
"Our technology is not simply technical research done in the laboratory. We also expect applications in real business and industrial scenarios to feed back into the evolution of the technology, and we use real industrial demand to polish our ability to productize large models."
This was said by one of the company's project leaders on the eve of the 2024 JD Cloud Summit, and it largely reflects JD Cloud's philosophy for industrial large models. In the era of large models, JD Cloud has kept to the principle of "born in the industry, serving the industry". To test whether the company practices what it preaches, the retail scenario, within the supply chain industry that JD.com has long cultivated, is an ideal touchstone.
Warren Buffett, whose investment win rate is almost godlike, has repeatedly stumbled in the retail industry. He concluded that ever-changing sales channels and consumption habits make it difficult for retailers to maintain a competitive edge and form a "moat" in the investment sense. His longtime partner Munger was also hit hard after investing in a top e-commerce company, admitting that "no matter how high its market position is, it is still a damn retailer."
Everyone takes part in the retail chain to some degree, yet even top investors have not been able to distill a methodology from it.
The main reason is that, with digitalization and integration, a retail platform no longer simply sells goods. It has to link up with merchants' warehousing and logistics, provide sales forecasting and marketing guidance, and even cover some social media functions. The vast majority of platforms handle these complex scenarios through a market division of labor.
In contrast, JD.com, which runs both a self-operated business and an open ecosystem, is better suited to serve as an incubator for large models.
Take the "people" dimension as an example. As retail becomes more specialized, merchants need large numbers of professionals to assist with operations, which adds to the survival pressure in an industry where margins are already thin. Fortunately, JD Cloud has brought several strongly targeted large model capabilities this time to create professional industry assistants for merchants.
△ Large model applications across the whole retail process: AIGC content generation platform
For example, the shopping guide assistant Jingjingyan, which reads users' needs, can provide empathetic, professional, and personalized shopping guidance, effectively relieving merchants' customer service pressure;
JD's AIGC intelligent creative platform can deliver end-to-end advertising without manual intervention, which is like giving merchants an extra professional advertising engineer out of thin air;
JD Wanshang AI Marketing Assistant is an all-round marketing copywriter that can work across platforms and switch styles at will. It can create high-quality content matching the tone of platforms such as Douyin and Xiaohongshu in as little as 30 seconds, and the average acceptance rate of its generated video content is as high as 90%;
The AIGC content generation platform Jingdiandian can produce AI-generated main product images, product detail page images, product marketing images, bundle-purchase scene images, white-background product images, and transparent-background images, in effect giving the store an AI designer.
At the "goods" level, what matters is whether merchants can put the "AI employees" described above to full use.
Some enterprising merchants have already found workable approaches.
For example, for one sashimi knife merchant, the biggest marketing pain point was live product photography: producing a full set of image and copy materials for a single SKU took 2 weeks. Once he started using Jingdiandian's image generation, he could directly generate product scene images comparable to real photographs and produce matching copy informed by the industry's actual click-through and conversion rates, which greatly reduced marketing spend and time costs.
At the "place" level, some merchants use JD Wanshang's AI marketing assistant to make copy and short videos more expressive; some use the AIGC intelligent creative platform to diversify their advertising images; others refresh in-store marketing materials at high frequency without adding operations staff, becoming active stores in consumers' eyes.
A condiment merchant now has "1 designer operating 6 stores", with 80% of its main product images, detail images, and selling-point advertising images generated by Jingdiandian AI, producing considerable economic benefits.
Evidently, JD Cloud's industrial large model has no "strongest" complex and does not hype superiority on any single metric; it solves the retail industry's real pain points. This shows that large models should be tied closely to specific scenarios and yield more products that can be deployed and put to use right away.
2. Take the lead in commercializing digital humans at scale: a "practical school" one step ahead
Before the JD Cloud Summit, I came across a large model industry application case that left a deep impression.
An online chess and card platform regularly plans competitive events, which generate a large volume of editing work. The technical team considered training a large model to recognize "highlight videos" so that editing and clipping could be automated.
Approached with academic thinking, this is a rather complex problem. First, have the large model learn from a large number of videos of hands being played; then, combining behavior analysis, object detection, and machine vision, track changes in the players' expressions and movements so that the model gradually learns "what makes an exciting game"; finally, deploy it on the application side for repeated testing and fine-tuning.
But the team soon realized that this kind of training was impractical. It quickly changed course and, drawing on its own understanding of the industry, directly picked a few indicators to define an "exciting game": for example, moments when the number of comments suddenly surges, changes in the host's speaking pace, and the number of "bombs" played in a game. With these, the tool was deployed quickly.
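The indicator-based approach can be sketched in a few lines. The following is a minimal illustration, not the platform's actual tool: the `Window` fields, the thresholds, and the "two of three signals" rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Window:
    """One fixed-length slice of a broadcast (hypothetical schema)."""
    start_s: int      # offset of the slice from the start of the video, in seconds
    comments: int     # viewer comments posted during the slice
    host_wpm: float   # host speaking pace in words per minute
    bombs: int        # number of bomb hands played during the slice


def find_highlights(windows, surge_ratio=2.0, pace_shift=0.3,
                    min_bombs=1, lookback=3):
    """Return start offsets of slices where at least two signals fire.

    Each slice is compared with the average of up to `lookback`
    preceding slices; all thresholds are illustrative assumptions.
    """
    highlights = []
    for i, w in enumerate(windows):
        prior = windows[max(0, i - lookback):i]
        if not prior:
            continue  # no baseline yet for the first slice
        base_comments = mean(p.comments for p in prior)
        base_wpm = mean(p.host_wpm for p in prior)
        # Signal 1: comment count surges relative to the recent baseline.
        comment_surge = (base_comments > 0
                         and w.comments >= surge_ratio * base_comments)
        # Signal 2: host speaking pace shifts noticeably, up or down.
        pace_change = (base_wpm > 0
                       and abs(w.host_wpm - base_wpm) / base_wpm >= pace_shift)
        # Signal 3: enough bomb hands were played.
        bomb_played = w.bombs >= min_bombs
        if sum([comment_surge, pace_change, bomb_played]) >= 2:
            highlights.append(w.start_s)
    return highlights


sample = [
    Window(0, 10, 150, 0),
    Window(10, 12, 155, 0),
    Window(20, 11, 150, 0),
    Window(30, 40, 210, 1),
    Window(40, 12, 150, 0),
]
print(find_highlights(sample))
```

The point of the sketch is that a handful of cheap, domain-chosen signals can stand in for a trained video-understanding model, which is what makes rapid deployment possible.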
In the same way, the pragmatists represented by JD Cloud tend to get the large model running in supply chain scenarios first, then iterate quickly in practice on the strength of their rich industrial experience.
The JD Cloud Yanxi Digital Human is a prime product of this philosophy.
In the past, the industry's understanding of digital humans stopped at "art + AI", with the main focus on general capabilities. But not long ago, a test that put large models through the college entrance examination showed that top large models, including GPT, performed poorly on the science papers, with total scores only at the level of three or four hundred points.
Applying digital humans in retail scenarios also requires strong professional capabilities. Relying too heavily on general capabilities invites uncontrollable "hallucinations" that are likely to cause a surge in customer complaints.
The pragmatism of the JD Cloud Yanxi digital human team lies in admitting from the outset that "if a digital human is to be used independently in serious scenarios, it must achieve zero hallucinations." Until that is achieved, the team advocates combining digital humans with real people and using various AI assistance tools to serve the relevant business scenarios. It breaks the problem down from the perspective of broad accessibility, reducing application cost as much as possible while ensuring quality.
The JD Cloud Yanxi Digital Human 3.0 platform newly released at this conference mainly strengthens the product's professional ability, sensory appeal, customization capability, and ease of use. On that basis, not only brand merchants but also small and micro business customers, such as course trainers and self-media bloggers, can afford it and use it well.
The professional ability relies mainly on JD's e-commerce knowledge graph, which is infused into a corresponding algorithm model. This mechanism can quickly generate professional-grade introductory copy matched to the digital human, which operations staff can put to use with little or no adjustment. Together with the product matrix of Yanxi intelligent customer service, Yanxi AI outbound calling, Yanxi Jing Xiaozhi, Yanxi AIGC product image generation, and others, it can cover a large number of application scenarios.
△ JD Cloud Yanxi Digital Human 3.0
In terms of sensory appeal, backed by 200,000 hours of training data, JD Cloud Yanxi Digital Human 3.0 has been greatly improved in the naturalness and expressiveness of its voice. In few-shot synthesis, as little as 3-5 seconds of audio is enough to reproduce a person's timbre and overall way of speaking. Upgrades to video generation, art, body coordination, and lip movement have also effectively improved efficiency of use.
According to the person in charge, the new digital human can already hold its own in retail and financial scenarios. In some businesses, user satisfaction and problem resolution rates have improved significantly: the 24-hour problem resolution rate is as high as 85%, and the satisfaction rate exceeds 90%. The team also found that an engaging digital human persona increases users' willingness to listen and communicate.
Customization capability is tied to each enterprise's specific application scenarios. For example, the Mulan digital human customized for Datong Cultural Tourism has been used in cultural tourism, customer service, and e-commerce live streaming. The platform can also meet some enterprises' demand for large-scale, personalized digital human output, or the demand for high-precision likenesses and dialects in digital humans customized for company presidents and professional procurement and sales managers.
In April this year, the "Procurement and Sales Dongge AI Digital Human", lifelike in appearance and speaking in Suqian dialect, drew more than 20 million views in 1 hour and a GMV of over 50 million in its live-streaming debut. During the 618 period, digital humans of the presidents of more than 18 brands also appeared in live-streaming rooms, including Dong Mingzhu of Gree, Hu Jianyong of Hisense, and Li Dongshan of LG, among other top 500 companies, which shows that the JD Cloud Yanxi digital human has been fully recognized by these "bosses".
In terms of ease of use, setting up a JD Cloud Yanxi digital human live-streaming room takes only about 15 minutes, covering the whole process of choosing an image and voice, importing product links, generating intelligent copy, setting the atmosphere, and generating a warm-up video, so broadcasting can start quickly. The JD Cloud Yanxi Digital Human 3.0 platform currently offers 100+ personalized characters and 50+ industry-specific scenarios, and in e-commerce live streaming its performance exceeds that of 80% of the industry's human anchors. With this, JD Cloud Yanxi Digital Human has been commercialized at scale.
What exceeded expectations is the JD Cloud Yanxi digital human's influence on purchase decisions.
Originally, the industry generally believed that digital humans were only suitable for selling low-priced goods. But JD.com data shows that during the 618 period, JD Cloud Yanxi digital humans delivered gains in categories that are "double high" in professional knowledge and average order value, such as automobiles, jewelry, 3C digital products, home appliances, and even medical devices. Conversion rates in these categories rose by more than 30%, with cumulative live-streaming time of more than 400,000 hours, more than 100 million cumulative viewers, and 5 million+ interactions.
In conversations with the relevant JD Cloud teams, we learned that JD Cloud Yanxi Digital Human is positioned as the core interaction medium between large models and end users in the future. It is set to move toward a multi-modal and quantifiable development path, using digital human live streaming as the anchor point to define the next scenario for interactive shopping and to renew the e-commerce experience and model.
AlphaGo consumes about 20,000W of power to play a game of Go, and ChatGPT will consume nearly $10 billion in training and personnel expenses this year. If large model and digital human applications demanded resources on that scale, the vast majority of people could not take part.
Compared with a Jarvis-style super artificial intelligence, what industry needs is AI products that everyone can afford and that create value. The development model of JD Cloud Yanxi Digital Human has turned large models from a top-down "game for the few" into a new technological dividend that benefits thousands of industries and lets every link of the industrial chain gain.
3. Large model application innovation: from the ivory tower to thousands of industries
It must be admitted that, with enough qualifiers attached, every large model can be "the strongest" on some specialized target. But at the application level, it makes no sense to beat rivals by some percentage on a benchmark, and customers will not pay for intangible on-paper efficiency. Compared with rankings on authoritative leaderboards, whether practitioners are willing to use a product is more convincing.
This is also what most sets JD Cloud apart, and it is determined by JD Cloud's genes.
In the early days, when e-commerce platforms held that "light is beautiful", JD.com took the lead in investing heavily in warehousing and logistics. At that time, the group's R&D closely followed the self-operated business, focusing on scenario efficiency and staff productivity. As the industrial chain kept growing, a supply chain ecosystem covering all formats and scenarios took shape.
Tens of millions of self-operated SKUs, more than 8 million active enterprise customers, and the real needs of more than 2,000 industrial belts across the country have accumulated a large body of high-quality, supply-chain-native data, giving JD Cloud's Yanxi model both general and professional capabilities. At the same time, the large model applications it incubates can be brought to market quickly, turning on-paper efficiency into real value in practice.
△ Yanxi agent platform
To incubate more enterprise-friendly applications for the market, JD.com has also built a one-stop AI agent development platform, the "Yanxi Agent Platform", which opens up its accumulated professional knowledge and dozens of large models to users in low-code form. JD now has more than 3,300 agent applications. Users can integrate agents into business systems through APIs, turning those systems into AI-native applications.
This model is distilled from JD's powerful supply chain, with incubation and testing completed in real industrial scenarios. It not only speeds up the deployment of large model applications but also avoids wasted computing resources, ensuring that products are "applied as soon as they are incubated".
Compared with the academic school's approach of "building the product first, then searching the world for demand scenarios", JD Cloud's model is undoubtedly more economical and feasible. And JD.com's keen instinct for "industrial technology" has been borne out repeatedly. We therefore have good reason to believe that JD Cloud's approach of training its large models while running them in industry is the more reliable path.