The WAIC Qiming Venture Partners · Entrepreneurship and Investment Forum was successfully held

On July 6, the "Qiming Venture Partners · Entrepreneurship and Investment Forum - Super Model, Super Application, Super Opportunity", hosted by Qiming Venture Partners as part of the 2024 World Artificial Intelligence Conference (WAIC), was successfully held in the Red Hall of the Shanghai World Expo Center. Well-known experts and scholars in large language models, multimodal models, embodied intelligence, and generative AI applications, together with top investors and leading entrepreneurs, gathered to exchange ideas on the progress of foundational generative AI technologies, the prospects for commercial applications, and the venture capital ecosystem.

Qiming Venture Partners was the earliest investment institution to enter China's AI field and has made the most investments there. This is the second consecutive year it has hosted the forum, which was also the only sub-forum at this year's World Artificial Intelligence Conference initiated by a venture capital institution to showcase and discuss generative AI from an innovation perspective.

Qiming Venture Partners has invested systematically in artificial intelligence since 2013. Over more than ten years of sustained work spanning AI 1.0 to AI 2.0, it has backed many projects in the field, a number of which have gone public or grown into unicorns.

In his opening speech, "Technological Breakthrough to Application Transformation - A New Chapter in AI Development", Zhou Zhifeng, Managing Partner of Qiming Venture Partners, said that the firm's investment strategy in artificial intelligence has evolved: from investing in AI as a technology or a vertical field to treating it as a foundational capability and seeking out its huge potential across all industries.

Zhou Zhifeng, Managing Partner of Qiming Venture Partners

Zhou Zhifeng predicted that applications will take off significantly earlier in the current AI wave than they did in the Internet wave. Generative AI has already gained large numbers of users in three "C" categories: Copilot, Creativity, and Companionship. Its trajectory resembles that of Internet applications, shifting from applications that improve efficiency (Save Time) to applications aimed at enjoyment (Kill Time). He pointed out that the Internet reduced the marginal cost of distributing information to almost zero, while the core of generative AI is reducing the marginal cost of creating digital content to almost zero, so AI technology is bound to unlock enormous value.

Zhou Zhifeng pointed out that China's huge market, strong technical capabilities and talent pool, and the application-building experience accumulated over the past 20 years have laid a solid foundation for China to lead the next generation of AI-native applications. According to statistics on more than 400 AI startups with which the Qiming Venture Partners investment team has held in-depth discussions, the proportion of multimodal applications has risen compared with last year, and many new application categories based on large-model technology have emerged. He also shared three typical profiles of generative AI startup founders: AI scientists and AI research leaders from tech giants; industry experts and senior product or operations executives from large enterprises; and emerging entrepreneurs and technical prodigies.

On the problems facing generative AI applications, Zhou Zhifeng named three priorities: first, reducing the cost of model use, which is required for generative AI to become widespread; second, improving the performance of large models; and third, raising the user retention rate of generative AI applications. Generative AI application companies take longer to grow from 0 to 1 than companies in other fields, and they must achieve both TPF (technology-product fit) and PMF (product-market fit) at the same time. Founding teams therefore need more patience and determination, and need to understand the technology (where its boundaries lie), the product (the new features and new distribution mechanisms of AI-native products), and the world (the opportunities for global development).

Zhou Zhifeng also offered ten predictions for generative AI in 2024, covering large language models, multimodal models, business opportunities, and more:

1. The two core technologies of generative AI, GPT and diffusion models, will gradually converge, giving rise to new model capabilities.

2. The acquisition and curation of high-quality data will significantly shape the next generation of models, and the proportion of synthetic data in pre-training will rise sharply.

3. Multi-agent technology will take a leap forward, significantly improving the efficiency and effectiveness of generative AI by optimizing collaboration and division of labor.

4. A unified continuous representation of images and text will appear, and joint image-text diffusion models built on it will reach GPT-4o-level capability.

5. The compression rate of latent-space representations for images and video will increase more than fivefold, making generation more than five times faster.

6. Within 3 years, video generation will take off across the board; combined with 3D capabilities, controllable video generation will change how film, television, animation, and short films are produced.

7. We will witness super multimodal large models that compress information from more modalities, such as text, images, speech, music, 3D, and sensor data (control signals, eye-movement signals, gesture information, radar signals, etc.).

8. Generative AI will open a conversion channel between human language and machine language, and the cost of directing machines to complete complex tasks will fall significantly, bringing about huge productivity changes.

9. On-device inference will grow enormously, driven by three factors together: inference optimization algorithms, on-device inference chips, and on-device large models.

10. AI will dominate several highly digitized industries and will reshape the vast majority of enterprise software.

During the World Artificial Intelligence Conference, StepFun debuted three new general-purpose large models in its Step series, comprehensively upgrading the capabilities of its foundation models. At the forum, Jiang Daxin, founder and CEO of StepFun, said in his speech "The Path and Practice of Climbing Toward AGI: Trillion Parameters + Multimodal Fusion" that on the road to AGI, the Scaling Law and multimodality are two complementary and indispensable directions, and that pursuing both together is what will ultimately reach AGI.

Jiang Daxin, founder and CEO of StepFun

In Jiang Daxin's view, the Scaling Law still holds: model performance continues to improve as a power law of the number of parameters, the amount of data, and the amount of compute. StepFun worked through the system and algorithm challenges and completed training of Step-2, a trillion-parameter MoE large model. At the same time, multimodality is a basic capability for building a world model. On the challenge of unifying understanding and generation in a single model, StepFun has made some progress: its newly upgraded Step-1.5V, a 100-billion-parameter multimodal large model, offers greatly improved performance and better video understanding, and the newly released Step-1X image generation model is the company's first multimodal generative large model.

On the first day of the conference, Infinite Lightyear, a trusted large model company, also released its Lightlanguage large model, which is "gray box" trustworthy and whose tens-of-billions-parameter model outperforms the ultra-large-scale GPT-4 Turbo. Qi Yuan, Haoqing Distinguished Professor of Fudan University, Dean of Shanghai Institute of Scientific Intelligence, and Founder of Infinite Lightyear, pointed out from a technical perspective that the Scaling Law has changed artificial intelligence but will not lead directly to AGI. The goal of AGI is to discover the unknown laws of a complex world, yet current large models depend heavily on data, and unknown laws may not be backed by massive data. At the forum, Qi Yuan introduced his standard for the most advanced artificial intelligence, a brain that both discovers the unknown laws of a complex world and is energy-efficient: AI Einstein.

Qi Yuan, Haoqing Distinguished Professor of Fudan University, Dean of Shanghai Institute of Scientific Intelligence, and Founder of Infinite Lightyear

Qi Yuan explained that today's large models are mainly the "black box" probabilistic prediction of the connectionist school. Combining symbolic computation with large models adds the "white box" logical ability of slow thinking, and integrating the two approaches to achieve "gray box" trustworthiness is an important direction for AGI. Deep learning fits the data, and the approach can be extended to areas where no data is available; when knowledge rules and key data contradict each other, the knowledge rules can be adjusted, which reduces dependence on data. He added that by combining symbolic computing with neural networks, the "gray box" can address both the hallucination problem of large models and specialized problems in vertical fields. Looking ahead, he hopes the company will go deep into specific scenarios, deliver gray-box trustworthiness, and unlock the productivity of large models to empower every industry.

Training and inference are two indispensable stages in the life cycle of large models, and both require powerful computing resources. During the 2024 World Artificial Intelligence Conference, Wuwen Xinqiong released the world's first hybrid training platform for heterogeneous chips that reaches thousand-card scale on a single task, providing strong computing infrastructure for the large model industry. Xia Lixue, co-founder and CEO of Wuwen Xinqiong, said in his keynote speech "Building AI Native Infrastructure" that computing power has become the cornerstone of AI's development and continued progress. He listed four key infrastructure problems facing AI-native applications: activating "sleeping chips" and promoting the integration of heterogeneous computing power; improving large-model computing performance across different accelerator cards; keeping training and inference stable on large-scale training clusters; and making more efficient use of limited on-device computing resources.

Xia Lixue, co-founder and CEO of Wuwen Xinqiong

Across multiple chip types, Wuwen Xinqiong is committed to providing a high-quality computing platform that efficiently integrates heterogeneous computing resources, middleware that supports joint software-hardware optimization and acceleration, and easy-to-use development and serving tools for large-model applications, so that heterogeneous computing power can be fully utilized. Xia Lixue said that Wuwen Xinqiong hopes to keep lowering the deployment cost of large-model applications through algorithm innovation, model computing, computing platforms, and hardware inference optimization, so that more people can embrace the new technology.

In his keynote speech "U-ViT: The Transformation and Future of Multimodal Large Models", Bao Fan, co-founder and CTO of Shengshu Technology, said that the company has full-stack, independent R&D capabilities in multimodal large models and is building out multimodal capabilities including image, 3D, and video generation. Earlier, Shengshu Technology and Tsinghua University officially released Vidu, China's first long-duration, high-consistency, high-dynamics video model. Vidu is the first video model in the world to achieve a major breakthrough since the release of Sora, and its performance is benchmarked against the top international level. The model adopts the team's original architecture fusing Diffusion and Transformer, U-ViT.

Bao Fan, co-founder and CTO of Shengshu Technology

At the forum, Bao Fan also introduced the principles of the U-ViT architecture and pointed out that it delivers optimal generation quality, controllable computing overhead, scalability in parameter count, and emergent capabilities. Shengshu Technology was the first company to successfully apply the ViT architecture to large-scale model training. Its multimodal diffusion model, UniDiffuser, supports diverse styles, meets an "artistic" aesthetic standard, and shows outstanding semantic understanding during image generation. The company has also made progress on Vidu, its large video generation model, which now supports audio-video synthesis and 4D animation generation and continues to improve in generation quality.

With the rapid development of artificial intelligence and robotics, and continued advances in sensors, actuators, computing power, and AI algorithms, embodied intelligence has become a hot topic in both academia and industry. From technological breakthrough to industrial deployment, where does embodied intelligence stand today? In the special session "Embodied Intelligence: From Technological Breakthrough to Industrial Landing", Zhou Xiaofei, an investor on Qiming Venture Partners' technology team, moderated a discussion with Chen Jianyu, assistant professor at Tsinghua University and founder of Xingdong Era; Lu Cewu, professor at Shanghai Jiao Tong University and co-founder of Dome Intelligence; and Wang He, assistant professor at Peking University and director of the Peking University-Galaxy General Joint Laboratory.

Xingdong Era is a leading humanoid robot company in China, and its Xingdong No. 1 is the world's first humanoid robot to climb the Great Wall. Chen Jianyu believes that humanoid robots will be the ultimate form of general-purpose robots: a fully humanoid form with legs and hands is more compatible with existing environments, and training data is also easier to transfer from the human world. In terms of technical paradigm, an end-to-end scheme that fuses the cerebrum and cerebellum will be an important research direction, because using human language as the interface between the cerebrum and the cerebellum has limited effect.

Chen Jianyu believes that future robots can be expected to reach peak performance across a variety of tasks. In the near future, it may be possible to design a kind of Turing test for robots: a robot interacts with a human, and behind the robot may be either autonomous AI control or a human teleoperator. When the technology reaches the point where it is hard to tell which of the two is behind the robot, that may be the day robots truly achieve intelligence and generality. Finally, Chen Jianyu remains optimistic about the prospects for embodied intelligence in China. He believes every startup should think about how to leverage the Chinese market, make the most of the domestic supply chain, and create globally competitive hardware products.

Lu Cewu is the first person in the world to be shaved by a robot, a demonstration of the advanced precision force-control technology behind Dome Intelligence's robotic arm. Lu Cewu believes that the endgame of embodied intelligence must take both technology iteration and business needs into account, and that embodied intelligence, as software algorithms running on hardware, welcomes every type of robot form. As for the technical path, embodied intelligence algorithms need two core elements: a world model that can perceive and understand the world, and a highly robust skill manipulation model. Within the manipulation model, the force feedback mechanism is very important: it adds an interaction dimension beyond the image dimension, and it reduces reliance on millisecond-level decisions from the world model. The skills Dome Intelligence displayed at the exhibition, such as peeling cucumbers and folding clothes, show that a robust manipulation model can greatly expand the potential application space.

Speaking of the future of embodied intelligence, Lu Cewu believes that in the near future we will see a succession of "ChatGPT moments" for manipulation skills, steadily enriching what robots can do and gradually setting the commercial flywheel of robotics in motion. At the same time, young scholars in China keep entering the embodied intelligence industry, and doctoral applications in the field have been very popular in recent years. China's talent density and potential are very large, and in the future China's top universities and companies will compete with their peers on the international stage.

Galaxy General recently released its first-generation embodied large-model robot with generalization capability, showing the possibilities of general-purpose robots entering ordinary households in the future. Wang He believes that humanoid robots are the greatest common divisor of the future general-purpose robot market, but that on the way to this ultimate goal every step needs a healthy business model so that robots can actually enter real scenarios. A humanoid upper body on a chassis base will be the most practical solution likely to be deployed within three years. From a technical point of view, Galaxy General focuses on making the skill control model at the cerebellum level general enough. For cerebellum-level skills, Galaxy General has developed and synthesized tens of millions of scenes and billions of grasping data points. Trained on this synthetic data, the Galaxy General robot has achieved a success rate of more than 95% in grasping randomly placed transparent, highly reflective, and other objects, and it demonstrated strong generalization at its WAIC booth by grasping arbitrary objects provided by the audience. On this basis, Galaxy General is gradually exploring commercialization.

Wang He believes that deployable robots must be low-cost enough and durable enough, which requires technology companies to keep refining their hardware and supply chain capabilities, an area where domestic startups have natural advantages. At the end of the discussion, Wang He called for confidence in the development of embodied intelligence in China: once China can mass-produce humanoid robots and achieve general-purpose embodied intelligence, it will bring general-purpose humanoid robots to market at scale with the most reliable supply chain and the most comprehensive manufacturing base. The industry needs sustained support from capital and long-term commitment from talent, and the future of embodied general-purpose robots will surely belong to China.

Breakthroughs in large models have given strong impetus to the development of super applications. As the generative AI industry shifts from super models to super applications, what super applications will be born, and what changes will they bring to human life? In the AI application session "New Opportunities for Super Applications: Mutual Benefit and Win-Win with Model Breakthroughs", Hu Qi, an investor on Qiming Venture Partners' technology team, moderated a discussion with Zhang Fan, COO of Zhipu AI; Ding Li, founder and CEO of Mikue AI; Zhu Jianxiong, COO of Infinite Lightyear; Sun Yiqiao, founder and CEO of Xizhi Intelligence; and Ding Ning, chief algorithm scientist of Zhiyuan Technology.

Zhang Fan explained that Zhipu AI, as a large model company, has core algorithms and a complete model matrix with independent intellectual property rights, covering large language models, code models, and multimodal models. He believes disruptive super applications may emerge in the next few years, but that such applications are hard to design in advance and instead emerge gradually through continuous iteration. He emphasized that the core contribution of large models is to increase the bandwidth of human-computer interaction: the shift from the early keyboard to today's natural language has greatly improved interaction, and each increase in interaction bandwidth reshapes user needs and the ways applications are used.

On Zhipu AI's distinctive advantages, Zhang Fan pointed out that large models have lowered the cost and threshold of AI applications, turning AI from an advanced capability exclusive to a few large companies into a basic factor of production available to everyone. The spread of this capability has stimulated more people's creativity and driven change across industries. Zhang Fan also mentioned that Zhipu AI was the first to propose the concept of "Model as a Service" (MaaS), whose platform lets enterprises and developers reduce the cost of using and training models and more easily explore and build super applications. Zhipu AI has also promoted the adoption and depth of application of AI technology through open source releases and price cuts.

On the future of AI-driven super apps, Zhang Fan was optimistic: although super apps are not easy to build, many previously unimaginable applications will emerge in the AI era. This process requires improvements in computing power, networking, hardware, and user habits, and it follows the principle of starting with small applications and growing incrementally. Zhang Fan emphasized that by embracing existing AI technologies and gradually transforming existing applications and products, the industry will usher in the super applications of the AI era.

Mikue AI is committed to combining AI technology with content production to help creators make better works with less effort, with the goal of becoming a leading AI comics and animation platform. Its three founders bring together industry, academia, and research. Ding Li has worked at NetEase, Huya Live, Bilibili, and other companies. Technical partner Niu Li is an associate professor at Shanghai Jiao Tong University and an international pioneer in image composition, a subfield of image editing. Operating partner Chen Dazhi has 12 years of investment experience, specializing in animation and ACG (anime, comics, and games) projects.

Ding Li believes that disruptive super applications will appear in drawing, comics, and 2D animation in the next few years. He pointed out that Webtoon in Korea has achieved high-frequency updates through industrialized, assembly-line production, which has improved the user experience. Mikue AI expects to use AI technology to produce comics efficiently, greatly increase update frequency, shift user consumption from paid to free, bring short-drama-style high-frequency updates to the comics industry, and improve both user experience and industry efficiency.

On the business model, Ding Li said that AI technology has lowered the threshold for content creation: creators can focus on scripts, outlines, and ideas while AI handles the tedious drawing process, which improves creative efficiency and lets more creative people join the cultural and creative industry. Mikue AI's technology has increased drawing speed more than tenfold, making comic creation more efficient and less costly.

On the challenges, Ding Li stressed the importance of building a friendly ecosystem with industry practitioners, and said that AI should be used as a tool to raise productivity and efficiency rather than to replace humans entirely. AI entrepreneurship currently has high capital and technology thresholds, and teams need to work together to overcome challenges in order to stand out in fierce market competition.

In the roundtable, Zhu Jianxiong shared his views on the future of super apps and on his company's strategy. He believes super apps will emerge in a number of areas in the next few years. The wide acceptance of large-model technology and the falling cost of using it have prompted many enterprises and startups to actively explore the field. Zhu Jianxiong pointed out that the shift in traffic entry points from the PC Internet era to the mobile Internet era offers a reference: a similar shift will occur in the era of large models, and companies with deep, scenario-specific service capabilities may grow into super applications.

On the relationship between trusted large models and super applications, Zhu Jianxiong said that large-model technology faces an "impossible triangle" of generality, professional depth, and economy. He emphasized that Infinite Lightyear has chosen to go deep on professional depth: it builds vertical large models grounded in industry knowledge and ensures the reliability of model output through neuro-symbolic computing. It has launched products in finance and healthcare, such as an investment research writing assistant and a physical examination report writing assistant. These products have significantly improved work efficiency and have been highly recognized by users.

Discussing the challenges of AI-driven innovative applications, Zhu Jianxiong noted that the role of product managers and the requirements placed on them have changed. Today's product managers not only need to define scenarios and requirements but also need to pass this information to the large model for evaluation and validation. He believes that product managers who understand both models and customers are currently scarce, but that this shortage will ease over time.

Xizhi Intelligence focuses on AI education, especially AI teaching and problem solving. Sun Yiqiao, who founded the company as an undergraduate at Tsinghua University, believes that purely statistical models lack logic and robustness. He said that GPT-4 could not be relied on to pilot a spaceship, but that building a white-box system containing human knowledge can significantly improve the capabilities of large models.

Sun Yiqiao said that Xizhi Intelligence has greatly improved its model's reasoning ability by building a complete knowledge system for mathematics and other disciplines, and that its mathematical problem-solving ability is significantly higher than GPT-4o's. Its existing products have nearly 2 million users in the United States and annual revenue of nearly one million US dollars, and the company is working with major domestic players such as New Oriental to develop large models.

Sun Yiqiao believes that future super applications should start from demand and solve problems in vertical domains, and that education is a field with great potential. He emphasized that education is fertile ground for AI super applications because demand is both frequent and essential, and that such applications can create great value by improving teaching efficiency and students' willingness to learn.

On improving the mathematical ability of large models, Sun Yiqiao mentioned OpenAI's Q* project and its reinforcement learning method, and said that the logical reasoning ability of large models can be significantly improved by optimizing mathematical problem-solving step by step. Xizhi Intelligence uses a similar method, combined with a complete mathematical knowledge system, to teach the large model to solve problems step by step and so improve its ability.

In AI problem solving, Sun Yiqiao pointed out that the key to combining specialized AI capabilities with large models lies in improving the ecosystem, and that Xizhi Intelligence is committed to doing so through win-win cooperation. Vertical applications require a great deal of fine-tuning and reinforcement learning, and he hopes the ecosystem will collaborate more efficiently in the future to improve the capabilities of base models.

Ding Ning brought a distinctive science and engineering perspective and a deep technical background to the roundtable. He introduced Zhiyuan Technology's strategy of integrating model and application and combining the general with the specialized, and emphasized improving specialization on the basis of general-purpose technology.

On disruptive super applications, Ding Ning pointed to the potential of large models to process many kinds of information sequences, such as text, video, and DNA. He proposed two key dimensions, the gain when a task succeeds and the loss when it fails, and said that opportunities lie in scenarios where the gain from success is large and the loss from failure is small, such as scientific discovery and advertising. He emphasized specializing general-purpose models so that they create value by achieving target tasks at the lowest cost.

On improving large-model technology, Ding Ning pointed to the challenges large models face in processing input and output sequences, especially the difficulty of learning in scenarios with a high proportion of negative signals. He emphasized the ability to specialize general-purpose models quickly and efficiently, discussed the use of reward models to improve model performance, and stressed the importance of low cost and high efficiency.

On the challenges facing generative AI-driven super apps, Ding Ning described two inertia traps, resource inertia and technology inertia, and emphasized the importance of keeping an open mind. He also named missing data, especially the lack of high-quality data for scenarios with a high proportion of negative signals, as a key challenge for the future.
