Huang: The key to machine learning is the flywheel, and Nvidia doesn't talk about market share

How much computing power does the world need at present? Is Nvidia looking to expand its market share?

Recently, in an interview that lasted about an hour and a half, Jensen Huang shared his views on the future of AGI (artificial general intelligence), machine learning, and AI, and commented on Musk, xAI, OpenAI, and his own working life.

Huang said AGI will soon become a personal "pocket assistant" in some form. It will be a useful tool at first but not a perfect one, and like all technology it will improve over time, which he called the beauty of technology.

"We reinvented computing": the key to machine learning is the flywheel effect

Huang said Nvidia has "reinvented computing," cutting the cost of computing by a factor of 100,000 over the past decade, far more than Moore's Law alone would have delivered.

He believes that innovations such as accelerated computing, new numerical precisions, new architectures, and extremely fast memory have all contributed to the rapid growth of computing power and made it possible to move from human programming to machine learning. Machine learning itself also moves fast, and as it redefined how computing is distributed, Nvidia introduced various forms of parallel computing and became adept at inventing new algorithms and training methods on top of them. These technologies and innovations, stacked on top of each other, ultimately led to remarkable progress.

"The whole stack is growing, we're innovating at all levels, so we're seeing an unprecedented rate of scaling... Previously we were talking about the scaling of pre-trained models, doubling the model size and the amount of data every year, and therefore quadrupling the computing power requirements every year. But now we're also seeing scaling in the post-training and inference phases. Pre-training is no longer seen as the hard part, and inference has become complex. It's absurd to treat all human thinking as one-shot, and the concepts of fast thinking and deep reasoning, reflection, iteration, and simulation are now starting to emerge."
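The arithmetic behind "quadrupling every year" can be sketched in a few lines of Python. This is only an illustration, assuming that training compute scales with the product of model size and data volume; the function name `relative_compute` and its parameters are hypothetical and do not come from the interview.

```python
def relative_compute(years: int, size_growth: float = 2.0, data_growth: float = 2.0) -> float:
    """Training compute relative to year 0.

    Assumption (for illustration only): compute is proportional to
    model size multiplied by the amount of training data.
    """
    if years < 0:
        raise ValueError("years must be non-negative")
    # Doubling size and doubling data each year multiplies compute by 2 * 2 = 4.
    return (size_growth * data_growth) ** years


for year in range(6):
    print(f"year {year}: {relative_compute(year):,.0f}x the starting compute")
```

Under these assumptions, five years of annual doubling in both model size and data compounds to roughly a thousandfold increase in compute demand.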

Huang emphasized that many people, in the past and still today, believe that designing a better chip simply means delivering more computing power and more floating-point operations. Computing power does matter, he said, but this way of thinking is outdated. In the past, the software running on a system was static, so the best way to improve performance was to build faster chips. Machine learning is different: it is not human programming and not just software, but an entire data-processing pipeline. The key to machine learning is the "flywheel effect," and the question is how to make the flywheel turn efficiently.

Many people do not realize that AI is needed just to curate the data used for training, a process that is itself very complex. Because smarter AI is now available to curate data, there is synthetic data generation and a variety of other ways of preparing data, so a great deal of data processing happens before training even begins. The flywheel should therefore be viewed as a whole rather than with a focus on training alone, and the computing system and architecture should be designed to make every step of the flywheel as efficient as possible, not just training for specific application scenarios.

"Training is just one step, and every step is hard. There's no easy part to machine learning," Huang said. "Whether it's OpenAI or DeepMind's Gemini team, what they're doing is not simple. So you should pay attention to the whole process, speed up every step, and respect Amdahl's law. If a step takes up 30% of the time, then even if it is accelerated by a factor of three, the overall improvement is limited. The key is to create a system that can speed up each step, so that you can really improve the cycle time and the efficiency of the entire flywheel."
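Huang's 30% example can be checked with the standard form of Amdahl's law, speedup = 1 / ((1 - p) + p / s), where p is the fraction of time spent in the accelerated step and s is the factor by which that step is sped up. The short Python sketch below is illustrative only; the function name `amdahl_speedup` is an assumption made for this example.

```python
def amdahl_speedup(fraction: float, factor: float) -> float:
    """Overall speedup when `fraction` of the total time is accelerated by `factor`."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if factor <= 0.0:
        raise ValueError("factor must be positive")
    # The untouched part keeps its share of time; the accelerated part shrinks by `factor`.
    return 1.0 / ((1.0 - fraction) + fraction / factor)


one_step = amdahl_speedup(0.30, 3.0)          # the case Huang describes
ceiling = amdahl_speedup(0.30, float("inf"))  # even an infinitely fast step
every_step = amdahl_speedup(1.0, 3.0)         # every step accelerated threefold

print(f"30% of the time, 3x faster:  {one_step:.2f}x overall")
print(f"30% of the time, infinitely: {ceiling:.2f}x overall")
print(f"all steps, 3x faster:        {every_step:.2f}x overall")
```

Tripling the speed of a step that takes 30% of the time yields only a 1.25x gain overall, and no amount of acceleration of that one step can exceed about 1.43x, which is why the argument is to speed up every step of the flywheel.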

He believes that accelerating the flywheel, and with it the pace of learning, will eventually lead to exponential improvement, and that Nvidia accelerates the entire process through CUDA.

Inference will grow 100-million-fold, and Nvidia "never talks about market share"

Regarding Nvidia's "moat," Huang emphasized that Nvidia's strength lies in algorithms and in the deep integration of the upper-layer science with the underlying architecture. He believes the company's moat in inference will be as deep as its moat in training.

He argues that training is really inference at large scale. If a model is trained well on a particular architecture, inference will also perform well on it, and a model built on that architecture will run on it even without special consideration. Architectural compatibility is therefore essential for inference, just as it is for iPhones and other devices.

At the same time, Huang said more than 40% of Nvidia's current revenue comes from inference, and inference is about to grow significantly with the emergence of chains of reasoning. He called it a revolution in the production of intelligence and said inference will grow 100-million-fold: "It's like going to school so that you can contribute to society later. Training models is important, but the ultimate goal is inference."

Huang said Nvidia's goal is to create a ubiquitous computing platform, "and we're trying to create a new computer every year that delivers two to three times the performance, two to three times lower cost, and two to three times better energy efficiency." He called the progress incredible and said Nvidia therefore recommends that customers buy new equipment in batches year by year to average out their costs, which also has the benefit of maintaining architectural compatibility.

Huang said it is very difficult to build separate systems at the company's pace of improvement. The difficulty is that Nvidia does not simply sell these innovations as infrastructure or as a service, but breaks them down and integrates them into multiple platforms. "Because every customer's integration needs are different, we must integrate all the architecture libraries, algorithms, and frameworks into their systems, including our security systems and networks. We basically do about 10 integrations a year. It's a miracle, but it also drives me crazy just thinking about it."

As for the market, he said Nvidia does not want to take market share from anyone. "If you look at our slides, you will find that we never talk about market share. Internally, we talk about how to create the next thing, what the next problem in the flywheel is that we can solve, how to better serve people, how to shorten a flywheel cycle that used to take a year to a month... While considering these things, we are convinced that our mission is very unique. The only question is whether this mission is necessary... All great companies should have a mission at their core, and it's all about what you're doing and whether it's necessary, valuable, influential, and helpful to others."

What does he think of OpenAI and Musk? Are million-GPU clusters needed?

Huang considers OpenAI one of the most influential companies of our time, a company focused on AI and committed to the vision of AGI, and said ChatGPT marked the awakening of artificial intelligence: "I really appreciate their speed and their singular goal of advancing this field."

When asked about Musk and xAI, Huang was also full of praise, saying that in 19 days Musk built a cluster of 100,000 GPUs along with a huge factory that was liquid-cooled, powered, and permitted. "As far as I know, there is only one person in the world who can do that, and that's Elon." He added that the industry has now entered the era of clusters with 200,000 to 300,000 GPUs.

So do clusters need to scale to 500,000 or even 1 million GPUs? "If you look at the scalability, the simple math, the fourfold growth in model size and computing power each year, and the increased usage demand, you can see that we need millions of GPUs, and there's no doubt about that," Huang answered. "But the question is, how do we approach architectural design from a data center perspective? This is closely related to the size of the data center: is it measured in gigawatts, or is it 250 megawatts? I think it's going to be both... All the ongoing breakthroughs in model parallelism and distributed training, all the batching and so on, exist because we worked hard in the early days, and now we're doing the early work for the future."

On the issue of open source versus closed source, Huang said the question is related to safety, but is not only about safety. "There is nothing wrong with closed-source models, which can be the engines of business models, and they are necessary to drive innovation, and I fully support that. What matters is that the two should not be opposites, but coexist." He agrees that open source is essential for many industries, bringing great potential to financial services, healthcare, transportation, and more.

On open source, Huang offered an analogy: "The way I imagine it, if you lock a super-smart person in a padded room for a month, they probably won't come out any smarter. But if two or three people sit together, then through communication, discussion, and questioning each other, all of them can become smarter. So the concepts of interaction and debate between AI models, reinforcement learning, and the generation of synthetic data are reasonable."

"I don't expect my work to always be fun," and he uses AI every day

At the end of the interview, Huang also shared his thoughts on himself and the industry. "I don't think everything we do is fun," he said. "My job wasn't always fun, and I didn't expect it to be fun forever. If you ask me whether that was my expectation, I will say that this work is important. I don't take myself too seriously, but I take my work, my responsibilities, and my contribution in this day and age very seriously... Like family, friends, and children, they are not always fun, but we always love them deeply."

The real question, he argues, is how long he can remain relevant. He said he uses AI every day, and even when he knows the answer, he uses AI to double-check it and to discover something new. "AI as a mentor, as an assistant, as a brainstorming partner, checking my work: it has completely changed everything. This is a revolution for information workers. I hope to maintain this relevance and continue to contribute, because this work is very important to me and I want to continue to pursue it. I can hardly believe the quality of life I have now, and I can't imagine missing a moment like this."
