
The first batch of large models passes filing, and more aggressive investment begins

Author: LatePost
When the "speed limit" is released, it's time to see who's fast and who's slow.

Text | He Qianming

Editor | Cheng Manqi

On the evening of August 30, Baidu's public relations team worked overtime preparing promotional materials, aiming to announce that its large-model application Wenxin Yiyan was fully open to the public as soon as the clock struck midnight. The announcement went out at 0:02.

Baidu had also provisioned a large amount of computing power in advance for Wenxin Yiyan and other large-model applications, to cope with a possible surge in usage after the full opening.

Shortly afterwards, at 1:44 a.m. on the 31st, the large-model startup Zhipu AI announced the official launch of its application "Zhipu Qingyan". At 3:09, Baichuan Intelligence said it had passed the "generative artificial intelligence filing" and opened its services to the public.

More announcements followed early this morning: companies and institutions such as ByteDance, SenseTime, MiniMax, the Chinese Academy of Sciences, and the Shanghai Artificial Intelligence Laboratory also said that their large models had passed the filing and would begin providing official services.

LatePost has learned that the large models developed by iFLYTEK, Huawei, Tencent and Alibaba are also in the first batch of filings. Alibaba's Tongyi Qianwen is also about to open to the public.

"For us, today's milestone is more important than the release of the large model on March 16," a Baidu source said. He and many of his colleagues have kept count: 167 days have passed since Baidu released its large model.

With the first batch of large-model applications launched after filing, China's large-model market has entered a new stage of competition. Products that technology companies and institutions build on large models can now serve all users, whereas previously they could only be tested with a limited number of users.

"The evolution of large models depends heavily on user feedback. The more people use them, the more feedback data there is to improve them," said Yu Huan, director of Baidu's Center for Technology and Society, adding that Baidu is trying to speed up the iteration of its large models. "We had planned to release a new version of the model at the end of the year, but now we are accelerating and will release it as early as possible."

Wang Xiaochuan, founder of Baichuan Intelligence, told LatePost that the company will release a 100-billion-parameter model in the fourth quarter of this year and a "super application" in the first quarter of next year. It is understood that iFLYTEK, which released a new version of its large model half a month ago, will also step up the promotion of large-model applications now that it has passed the filing.

The new environment has turned the big model competition into a comprehensive capability test: the winning factor will no longer be just a company's technical ability to train a large model, but also its ability to understand market needs, develop matching applications, and operate well.

More aggressive investment around new users and customers, growth and products is about to begin. When the "speed limit" is released, it's time to see who's fast and who's slow.

The policy takes effect, and a batch of large models launches publicly

The "Interim Measures for the Management of Generative Artificial Intelligence Services" took effect on August 15, a key milestone on Chinese large-model companies' path to filing. A large-model practitioner said that afterwards the relevant departments began convening some large-model companies for meetings, providing filing training and issuing templates for filing materials.

It is understood that during the filing process, regulators focus on data security and data sources, such as whether the data infringes intellectual property rights or invades privacy. Regulators also suggested that the "rejection rate should not be too high" when a model completes chat tasks.

Around the start of the filing work, the large technology companies that passed in this round had each disclosed some progress on their large models:

  • At the end of July, Tencent began testing its Hunyuan model across multiple lines of business, and it is expected to announce new developments next month. Two months ago, Tencent CEO Ma Huateng said he was in no hurry to push out a half-finished product.
  • In early August, ByteDance began public testing of the large-model application "Doubao", whose underlying model is the "Skylark" model that passed the filing this time.
  • On August 4, Huawei announced that it would integrate the Pangu model into HarmonyOS (Hongmeng), where the phone's voice assistant will offer functions such as drafting emails and automatically controlling mobile apps.
  • On August 15, iFLYTEK released version 2.0 of its Spark model, which adds the ability to generate and understand images and code, and jointly launched the Spark All-in-One Machine with Huawei, a solution for government and enterprise customers to deploy large models on premises.
  • A few days ago, Baidu sent mass text messages telling Wenxin Yiyan users that they had qualified as a "Baidu Search AI Partner" and could use New Bing-like functions through the Baidu App and the Baidu search engine.

Among the first batch of startups that passed the large-model filing, Zhipu AI, Baichuan Intelligence and MiniMax have also been iterating rapidly on their models. In June, Zhipu AI launched the upgraded ChatGLM2 series, adding 3 models of different parameter sizes that can handle up to 32,000 tokens (the number of tokens is roughly proportional to the amount of text processed).

Baichuan Intelligence, established in April this year, has launched 3 models in the past 4 months, two open source and one closed source, the largest with 53 billion parameters. MiniMax, founded at the end of 2021, completed a major upgrade of its own model, ABAB, in July and has been improving its performance on a weekly basis.

The startups also have big companies behind them. Meituan participated in Zhipu AI's Series B-2 financing round this year, and Tencent invested in MiniMax in June. Zhipu AI and MiniMax have both become unicorns valued at more than $1 billion.

At present, most of the large model companies that have passed the filing have announced that they are open to the public. However, filing itself may not be a long-term advantage for large model competition.

The view of many industry insiders who participated in the filing is that more large-model companies will pass the filing in the future, "not only the first batch, but also the second and third batches."

Promotion is no longer restricted, and the commercialization of large models accelerates

Once a large-model application has passed the "generative artificial intelligence filing", the most immediate change is that the product can serve the public directly.

Before this, most companies were relatively restrained in promoting large-model applications. Their products for individual users were offered only as internal or invitation-only tests, so ordinary users could not simply register and use them, and companies did not actively advertise their large-model products. All of this held back the spread of the products.

The implementation of the policy will push companies to invest resources in promoting their large models, and ultimately accelerate commercialization. At present, there are four main monetization models in the large-model industry:

  • Develop a large-model chat app and charge users a monthly or annual fee, as OpenAI does with its ChatGPT Plus service.
  • Sell access to large-model APIs and charge companies or developers by the number of calls, as in the cooperation between MiniMax and Kingsoft Office's WPS.
  • Sell large-model development services directly and deliver industry-specific large-model solutions to traditional enterprises, such as the industry model solutions promoted by Baidu, Tencent, iFLYTEK, and Huawei.
  • Use large models to transform existing businesses and make products more competitive, and so earn greater commercial returns. Companies such as Google and Baidu are using large models to optimize their search products; DingTalk is integrating large models into its product features; Alibaba has said that it will use large models to transform its e-commerce business.

One of the most obvious market changes after the filing of large models is that products directed to individual consumers will become more and more active.

It is reported that MiniMax will launch a public-facing product next, but no details have been disclosed.

Wang Xiaochuan said that Baichuan plans to launch its first "super app" for individuals in the first quarter of next year. He said at a media conference in early August that Baichuan Intelligence "will have more than one super application in the future; (more products) are in development."

Large-scale consumer promotion also drives traffic to the enterprise business. A Baidu source said that Baidu will not charge for the public-facing Wenxin Yiyan product for the time being, but that the product is a good way to demonstrate technical capabilities and "help attract enterprise users."

The enterprise market was the direction the entire industry had been focusing on before this round of filings. Tencent, iFLYTEK, and Huawei have said on different occasions that they have released dozens or even hundreds of large-model solutions covering more than ten industries. MiniMax also announced that its open platform for enterprise customers has signed up more than 100 paying customers.

The technical race over the large models themselves continues. A Baidu source said that the company is accelerating development of a new version of its large model at full speed, hoping to release it ahead of schedule. Baichuan Intelligence said that, according to its existing research and development plan, the 7-billion-parameter and 13-billion-parameter versions of Baichuan2 will be released in succession, and that it plans to launch a model on the scale of 100 billion parameters by the end of the year. iFLYTEK plans to launch on October 24 a large model that surpasses ChatGPT in Chinese and matches it in English, and to benchmark against GPT-4 in the first half of next year.

The development of large models has entered a new stage

So far, China has hundreds of large models with more than 1 billion parameters. The flip side of policy implementation and accelerated commercialization is that participants will face fiercer and more comprehensive competition. When the "speed limit" is released, the leaders' limits will be tested, while those who run slower may face elimination.

Judging from overseas markets, where large models have developed faster and regulation is applied after the fact, the competitiveness of large models shows up mainly in three areas:

  • Computing infrastructure. When a large-model application acquires a large number of users, it consumes a lot of computing power. OpenAI once suspended paid user registration and strictly limited the number of times users could call GPT-4; the core reason was that computing power could not keep up with user growth.
  • Proprietary data. Most pre-trained large models on the market are trained with the same architecture, public datasets, and similar methods. What lets a large model differentiate itself is the data used to fine-tune it. The quantity and quality of that data directly determine the model's capabilities.
  • Business applications. Building applications on large models is not difficult, but running applications on models with tens or even hundreds of billions of parameters requires a large number of GPUs for inference. An industry insider estimated that, for a large model with hundreds of billions of parameters, the ratio of training cost to inference cost is about 1:9. This means that scenarios with sufficient commercial value and enough money behind them must be found for large-model applications to be cost-effective. In larger application scenarios, large-model suppliers can also get feedback from more users and continuously improve their models.
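
As a rough illustration of what a 1:9 split implies, the Python sketch below works through the arithmetic. The function name `cost_split` and the training budget of 10 units are hypothetical, chosen only for illustration; the 1:9 ratio is the insider's estimate quoted above, not a measured figure.

```python
def cost_split(training_cost, training_parts=1, inference_parts=9):
    """Split total compute cost given a training:inference cost ratio.

    The default 1:9 ratio is the insider's estimate quoted in the article.
    Returns (inference_cost, total_cost, inference_share_of_total).
    """
    inference_cost = training_cost * inference_parts / training_parts
    total_cost = training_cost + inference_cost
    return inference_cost, total_cost, inference_cost / total_cost


# Hypothetical training budget of 10 units (arbitrary currency units).
inference_cost, total_cost, inference_share = cost_split(10.0)
print(f"inference: {inference_cost}, total: {total_cost}, "
      f"inference share: {inference_share:.0%}")
```

Under this assumption, inference accounts for nine tenths of total spending, which is why the bulk of a model's cost depends on how much it is used rather than on how it was trained.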

In some ways, the competition in large models will favor big companies with deep pockets and large user bases, such as Baidu, Tencent, Huawei, Alibaba, iFLYTEK and ByteDance.

However, an entrepreneur who develops generative writing applications on top of other companies' large models told LatePost that he is not very worried that, after filing, large companies will step up investment at the application level and squeeze out small and medium-sized companies. "The hype has already cooled considerably, and many applications have entered the stage of deep AI integration. That is, AI itself is not the selling point; the key is to grasp users' needs and scenarios." On this front, he believes there are opportunities for both large and small companies, with Notion and DingTalk as representative products.

Many startups are also forming partnerships with companies that have large user bases to improve their capabilities. For example, MiniMax and Zhipu AI are both integrated into Kingsoft Office's WPS.

It is understood that before Meituan invested in Zhipu AI, it had already spent tens of millions of yuan on a license for Zhipu's large model and planned to explore related applications on that basis.

The next big test for all large-model companies is how to find a truly profitable and sustainable business model for large models.

"We can't just push AI without a business model that underpins it," Frank Slootman, CEO of cloud database company Snowflake, said on an August earnings call. "Many company executives describe their attempts to get into big models as experimental, exploratory, and they're still trying to figure out how challenging that is," he said.

So far, almost all of the companies that have made money from the large-model wave are those "selling shovels", NVIDIA being one example. In its second fiscal quarter, NVIDIA's GPU-related business revenue increased 171% year-over-year to $10.3 billion, and the company's net profit increased eightfold year-over-year to $6.2 billion.

This round of policy implementation may also let internet advertising platforms be among the first to profit. A large-model practitioner in Beijing said that they are waiting for their filing to be completed and will then resume advertising their product on short-video and search platforms. Earlier, while products were mostly still in testing, the company judged that large-scale ad spending was not cost-effective. Before that, it had at one point spent millions of yuan a month on product advertising.

"We are not yet at the point where there are super apps," said one large-model practitioner, who believes it may take another two to three years and that so far there are only some early signs. "When the technical capability is stronger, the application results are good enough, and the cost is low enough, real super applications will appear."

Zhu Likun also contributed to this article.

Image source: Chariots of Fire