Today, it was revealed that the large-model startup "Baichuan Intelligence" has recently completed a round of financing totaling 5 billion yuan.
iDark Horse asked Baichuan Intelligence for verification of the above information, and Baichuan Intelligence gave the following response:
Baichuan Intelligence has indeed recently completed its Series A financing, totaling 5 billion yuan, and will start its Series B round at a valuation of 20 billion yuan.
Series A investors include major technology companies and market-oriented investment institutions such as Alibaba, Xiaomi, Tencent, Asia Investment Capital, and CICC, as well as state-owned industrial investment funds such as Beijing Artificial Intelligence Industry Investment Fund, Shanghai Artificial Intelligence Industry Investment Fund, and Shenzhen Capital Group.
Baichuan Intelligence has always adhered to a dual-engine development strategy of "super model + super application". To date it has released 12 large models, and its first AI assistant, Baixiaoying, was launched in May this year.
Baichuan Intelligence has always believed that medical care is the "crown jewel of large models" and the best scenario for building super applications. As the only leading large-model company in China focusing on medical care, Baichuan Intelligence has made phased progress in AI medical technology and applications.
Its self-developed general-purpose, medically enhanced model has surpassed GPT-4 in many authoritative evaluations, and the company demonstrated its AI medical application, the AI health consultant, for the first time at the World Artificial Intelligence Conference (WAIC) held in early July.
Wang Xiaochuan spoke at the Artificial Intelligence Forum of the 2024 Global Digital Economy Conference, held recently. The following is the content of his talk, edited by iDark Horse:
Today, the whole of Chinese society, especially the tech community, is in a state of excitement and anxiety.
On the one hand, we feel we must follow the United States in the race to build models, and we want to invest billions or tens of billions of dollars to make super models; on the other hand, people are calling for scenarios in which large models can be deployed and applied. Everyone worries that focusing only on deployment means missing the future, while thinking only about the future means having no present.
Baichuan was established in April last year, and in January this year, friends sent me New Year greetings on WeChat, saying "Happy New Year's Day 2024". I replied, "Happy 2nd Year of the Intelligent Era".
What does this phrase mean? Although AGI has not been achieved yet, we have entered a new era, the age of intelligence.
We are fortunate to have come from the Internet era and enjoyed 20 years of its rapid development, and now to witness and participate in the development of the intelligent era.
01
Through language, we are approaching intelligence
I would like to use one key phrase to express my understanding of the central axis of the intelligent age: transforming language into mathematical models.
When we talk about large language models (LLMs) today, the first key word is scale. From tens of billions of parameters to hundreds of billions, to trillions, and then to the quadrillion parameters attributed to GPT-5, the belief is that scale can bring intelligence. In the end, everyone comes to think that intelligence can be achieved simply by investing enough money. At the same time, we also cheer developments such as this year's launch of Sora in video generation, and ask whether it represents another kind of intelligence for the future.
None of these are complete answers, because the path artificial intelligence is taking today runs not through Sora but through language. It is by turning language into a mathematical model that we have obtained the artificial intelligence we have today.
Many people say that the difference between humans and animals may be that humans have language.
For example, image recognition is very powerful and autonomous driving is very powerful, but I joke that dogs, too, can find their own way around and recognize images. The difference is that language is the carrier of our thinking, communication, culture and knowledge. So it is through language that we are approaching intelligence. Today's paradigm has not departed from language, which is why we emphasize language as the central axis. This is a key piece of logic in general intelligence.
Another key word is energy. Data, algorithms, and computing power are already clear to everyone, but since the first half of this year, energy has been mentioned more and more. Abroad in particular, controlled nuclear fusion is being pursued in preparation for the large-scale deployment of new models. If AGI or artificial intelligence takes off further, the energy behind computing power will become a major bottleneck. China has not yet begun to explore this issue, and has nothing like the bold ideas about electricity seen in the United States. This is also one of the differences between the development of large models in China and in the United States.
02
Large models bring an increase in productivity
There is a knowledge management framework called the DIKW model, which stands for Data, Information, Knowledge, Wisdom. In the past, machines generated information; the core of intelligence in this era is producing knowledge. In other words, large models not only transmit information but can also generate knowledge, which represents cognition of the world. Going one step further, wisdom can be generated.
We know that the Internet age was more about changing the relations of production. Whether it is Taobao, Didi or Meituan, they all use the Internet to connect more merchants with more users and improve the efficiency of matching supply and demand.
However, in the era of intelligence, large models bring a direct increase in productivity. GPT is the first application product of this era, and I believe that more new applications will be produced in the future.
In the information age, we used computers with a tool mindset. Now, we should think of the large model as a partner. If you still treat the large model like an old-fashioned calculator, the role it can play is too limited. We should think of AI as a companion and service partner. I predict that in the next two to three years, we can move beyond this tool mindset and make AI as capable of serving and accompanying people as humans are.
In this context, one current direction of exploration is to empower thousands of industries, that is, to improve productivity. There are two other areas: intelligent assistants and virtual worlds.
In the personal realm, intelligent assistants did not actually do very well in the past. At the moment, Perplexity should be considered a good intelligent assistant, providing a better presentation of knowledge by summarizing search results. GPT-4o represents more of an innovation in the mode of interaction, communicating with users in a human way. I think that in the future there will be new consumer (To C) product forms for intelligent assistants, rather than just redoing existing fields such as e-commerce and games.
In the virtual world, as technology advances over the next 3 to 5 years, we foresee AI playing a more important role in many areas, especially in the entertainment and gaming industry. The development of the virtual world will bring new impetus and change to the global economy. We believe that this will be a process of transformation from production relations to productivity, and that it will eventually empower the virtual world.
So far, we have discussed many concepts, including search augmentation, multimodality, and agents, which are the main technical routes for improving the capabilities of large models. The development of these technologies could become a transformative force in the future.
03
Reinforcement learning enables large models to think
In addition to language and scale, another line of development for large models is reinforcement learning. OpenAI has said that its models can surpass humans in some areas, and that a large model could reach the level of a PhD student within 18 months. This cannot be achieved within the current paradigm. Within the current paradigm, it would already be good for a large model to reach the level of a high school student, so it can only do auxiliary work.
But there will be a new paradigm in the future, one that deserves everyone's expectation and attention. At the heart of this paradigm is the use of reinforcement learning to change the way today's models are set up. Broadly speaking, this is like the familiar AlphaGo, an artificial intelligence that specializes in playing Go. AlphaGo and large models have taken opposite paths. For large models, the more data the better: however much data you have is however much intelligence you get. The first version of AlphaGo combined the games of millions of human Go experts. But from AlphaGo Zero onward, human game records were no longer needed, and the system became more intelligent without that data.
Today's large models can only handle fast thinking; they do not have the ability to think slowly. Therefore, for AI to surpass humans, it will need long periods of slow thinking through reinforcement learning. In terms of the learning method advocated by Confucius, the large model is "learning without thinking": it learns a great many things but does not think, and so is not particularly intelligent. AlphaGo is "thinking without learning": it only delves deeply into one field. Combining the learning system and the thinking system is therefore what can bring about the final explosion of the intelligent era. The two technologies we have today have not yet been fused, and these are some of my basic judgments about the future. Some universities in China are exploring this direction, and we even suspect that the United States may already have achieved a breakthrough here.
In our own practice, we emphasize how to balance ideals and reality. Pursuing "poetry and distant horizons" means catching up with OpenAI, but that offers no deployment scenario. If you focus only on deployment scenarios, you may not get far with large models, losing "slow thinking" and the possibility of general intelligence that large models represent.
Therefore, in view of the current situation in China, including chip capabilities, talent reserves and the market environment, we have put forward a principle: "one step slower on the ideal, three steps faster on deployment". In other words, on the ideal side we do not simply follow them, but we make sure the path we take is viable. Our priority is to open up deployment scenarios while making sure we do not lose sight of the ultimate ideal of large models. Our goal is therefore to develop both super models and super applications. When we empower thousands of industries, we choose scenarios with particularly high knowledge density that really need 100-billion-scale and trillion-scale models, and then we reduce it to the 10-billion scale. This is Baichuan's positioning.
Baichuan developed rapidly last year, becoming one of the first eight large-model companies to be registered by the Ministry of Industry and Information Technology, and the only one of them established that year. In May this year, Baichuan launched its latest-generation foundation model, Baichuan 4, and at the same time launched its first AI assistant, "Baixiaoying".
04
Medical care is the crown jewel of large models
For deployment, Baichuan chose the medical field. Medical care is the crown jewel of large models. Geoffrey Hinton, the father of backpropagation, has emphasized that healthcare will be one of the most important AI application areas and will unleash the maximum potential of AI. A large model's mastery of knowledge and experience, its multimodal, memory, and reasoning abilities, and its ability to reduce hallucinations and communicate with empathy can all be put to use in the medical field. Indeed, the better the model, the more advanced the medical care it can support.
At present, the more successful players in the medical field are Internet hospitals, such as Haodaifu (Good Doctor Online), but they rely on existing doctors, so this is in fact still a change in production relations. The supply of doctors is insufficient, which is why the medical industry has the saying "those who get doctors win the world", and increasing the supply of doctors is a major opportunity.
At the same time, we have also divided AI medical care into levels, from L0 to L5, similar to the levels used for autonomous driving.
L0 is traditional medical care, without AI intervention.
L1 is a single-function machine assist that can be used to make recommendations at a single point.
L2 is multimodal assistance: AI can integrate multiple kinds of data, such as medical records and images, to provide information.
L3 is conditional automated diagnosis and treatment, similar to today's autonomous driving, which can fully take over the driving but still needs the driver's confirmation at critical moments. Our goal is to reach the L3 level between this year and next, so that AI can serve as an assistant to doctors and as a health consultant that follows up with consumers, while key decisions still need to be confirmed by doctors.
L4 requires AGI-level capability: a general-purpose artificial intelligence that can "create doctors" and work exactly as doctors do.
L5 is fully automated health management that goes beyond AGI and manages the patient's entire health journey, from prevention and diagnosis to treatment. At that point, we begin to move from a language model to a life model, decoding life and surpassing the level of doctors.
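As a rough illustration only, the L0 to L5 levels above could be encoded as an ordered enumeration. The names `MedicalAILevel` and `doctor_confirms_key_decisions` are hypothetical and do not come from Baichuan. The talk states only that key decisions need a doctor's confirmation at L3; treating L4 and L5 as not requiring it is an assumption made for this sketch.

```python
from enum import IntEnum


class MedicalAILevel(IntEnum):
    """Hypothetical encoding of the L0-L5 AI medical care levels."""
    L0 = 0  # traditional medical care, no AI intervention
    L1 = 1  # single-function assistance at a single point
    L2 = 2  # multimodal assistance (medical records, images, ...)
    L3 = 3  # conditional automation; doctors confirm key decisions
    L4 = 4  # requires AGI; works exactly like a doctor
    L5 = 5  # fully automated health management, beyond AGI


def doctor_confirms_key_decisions(level: MedicalAILevel) -> bool:
    """Return True when a human doctor still signs off on key decisions.

    L0-L2 are assistive and L3 explicitly needs confirmation.
    Returning False for L4 and L5 is an assumption, not a claim
    made in the talk.
    """
    return level <= MedicalAILevel.L3


# The stated near-term goal in the talk is L3.
TARGET_LEVEL = MedicalAILevel.L3
```

Using an ordered enumeration makes comparisons such as "at or below L3" a single expression, which mirrors how the autonomous-driving levels are usually discussed.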
Along the way, we are putting two things into practice. The first, in the early stage, is managing the whole course of the disease, and the second is doing precision medicine on that basis.
At present, doctors' resources are far from sufficient, and it is impossible for doctors to accompany patients all the time. The large model can serve as a physician assistant inside the hospital and a health consultant outside it, and can manage the whole course of the disease. In this process, the large model allows as much patient data as possible to be collected, which for the doctor is also like a clinical experiment. Doctors are not only clinicians but also researchers: they need to summarize their observations and experience, write papers, and turn them into knowledge. Each doctor's energy is limited, and a doctor in China may only be able to see 30,000 patients in a lifetime. Large models can learn quickly and iteratively and can build a model for each individual patient, ultimately moving towards precision medicine, so that prevention, diagnosis, and intervention can reach new heights.
Looking ahead, Baichuan will on the one hand continue to tackle hard problems, pushing the scale of language-centered models and building super models. At the same time, we are making many enhancements for medical care, and we look forward to achieving AGI through AIGC and improving medical services along the way.
[This article is an iDark Horse original, written by iDark Horse. To reprint it, please contact the WeChat public account (ID: iheima) for authorization.]