The AI community is embracing the S-curve – after an initial period of rapid growth, progress is starting to flatten out as we encounter natural constraints.
Translated from The Current State of LLMs: Riding the Sigmoid Curve, by Patrick McFadin.
If you've been following the field of artificial intelligence lately, you've probably noticed a shift. The unbridled optimism of a year ago has given way to a calmer, more realistic outlook. As someone who spends most of his weekends immersed in AI code and contributing to projects like LangChain and LlamaIndex, I've seen this shift firsthand.
Recently, I attended two AI conferences – the AI Quality Conference and the AI Engineers World Expo – and the change in people's mood was palpable. It feels like a big milestone in the AI journey, and I wanted to share my thoughts on where we are and where we're going.
Remember when we thought AI growth was exponential and ready to leave us all behind? Well, reality has turned out differently. The AI community is now adopting a different model: the S-curve. This S-shaped curve suggests that after an initial phase of rapid growth, progress begins to flatten out as we encounter natural limitations.
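The S-curve the community has in mind is the classic logistic function. Here is a minimal sketch of why it feels exponential early on and flat later (the parameter names `L`, `k`, and `t0` are the standard logistic parameters, not anything from the article):

```python
import math

def logistic(t, L=1.0, k=1.0, t0=0.0):
    """Logistic (S-shaped) growth: value approaches the ceiling L,
    with the steepest gains near the midpoint t0."""
    return L / (1.0 + math.exp(-k * (t - t0)))

# Early in the curve, each step of effort buys a big jump;
# later, the same step buys almost nothing.
early_gain = logistic(1) - logistic(0)   # near the midpoint
late_gain = logistic(6) - logistic(5)    # near the ceiling
print(f"early gain: {early_gain:.3f}, late gain: {late_gain:.3f}")
```

The same input (one unit of "time" or effort) yields a gain roughly 50x smaller near the top of the curve than near the middle, which is the intuition behind "progress is flattening out."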
Why the shift in perspective? It comes down to the limitations we face in developing large language models.
Triple Threat: Data, Energy, and Economy
The first is data availability. The Internet is huge, but the amount of high-quality data in it is still finite. Sure, companies like OpenAI are scrambling to strike deals for more data to train GPT-5, but what happens when GPT-6 needs 10x that data? Synthetic data will bridge some of the gap, but this is a hard problem.
Then there are energy and infrastructure costs. Training these massive models requires staggering computing power. We're talking about rows and rows of GPUs running constantly, generating enough heat to warm a small town. Not only is this expensive, but it is reaching a point of diminishing returns. In some cases, resource availability limits what is even possible: the new xAI data center in Memphis, Tennessee, requires a staggering 1 million gallons of water per day and 150 megawatts of electricity. Researchers and startups are looking to eliminate the need for GPUs, but it's still early days.
Finally, there is the question of economic viability. Currently, large, cutting-edge models are being subsidized by deep-pocketed cloud providers. But as the true cost of LLMs becomes clearer, we may see a shift in how these models are developed and deployed. Training a cutting-edge model has become a multi-billion-dollar club, one that practically requires having Nvidia CEO Jensen Huang's personal mobile number.
The Crisis of Trust in AI
And if these limitations weren't enough, we are also facing what is known as an "AI trust crisis." This was a hot topic at the AI Engineers conference. What is the problem? By design, LLMs tend to be... creative. That's great for writing the next great American novel, but not for automating critical business processes. This disconnect stems from magical thinking about AI combined with a lack of understanding of how it is implemented. LLMs are probabilistic models, and in some cases they simply get lost.
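The "probabilistic by design" point can be made concrete with a toy token sampler. This is an illustrative sketch, not any particular library's API: an LLM scores candidate next tokens (logits), and decoding samples from the resulting distribution, so even an unlikely token is occasionally chosen.

```python
import math
import random

def sample_token(logits, temperature=1.0):
    """Sample an index from a softmax over logits. Higher temperature
    flattens the distribution, making unlikely ('creative') tokens
    more probable; lower temperature concentrates on the top choice."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)                          # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    r = random.random()                      # inverse-CDF sampling
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return i
    return len(probs) - 1
```

At a low temperature the best-scoring token nearly always wins; at a high temperature the model can wander, which is exactly what you want in a novelist and exactly what you don't want in an unsupervised business process.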
I've seen firsthand what some customers have tried: replacing an analytics process by feeding large amounts of data into an LLM, or worse, replacing an entire category of work by letting an LLM run unsupervised. Of course, none of these ideas worked out, leaving the initiators frustrated and with a negative view of what this new AI can do. Even insiders realize that the Transformer architecture is not enough, and GPT-4-level performance is now being matched across many models. We are still one or two breakthroughs away from credible automation, or everyone's favorite buzzword, AGI (Artificial General Intelligence).
Entering the Trough: Evidence from the Gartner Hype Cycle
If you're wondering where we are on the AI roller coaster, look no further than the Gartner hype cycle. This reliable tool gives us a visual representation of technology maturity and adoption. I can't embed Gartner's AI hype cycle diagram for licensing reasons, but the folks at Gartner publish plenty of graphics on it. It's worth taking a few minutes to see where your favorite new technology sits on the curve.
According to Gartner's AI hype cycle, foundation models and generative AI are entering the "trough of disillusionment." Don't be fooled by the name: it's not a bad thing. This is a necessary step for any technology to mature. Early adopters get everyone excited, and power users discover many of the early benefits. Later adopters start comparing it to more mature technologies, find the sharp edges, and declare it "all hype" (yes, I mean you, enterprises). Eventually come support contracts, architecture diagrams, and a raft of products that make it all more reliable and secure. Ah, the slope of enlightenment.
The Hope of the S-Curve
Now, before you conclude that everything is bleak, let me assure you that this S-curve and the "trough of disillusionment" come with some significant benefits. If you're ready to trust the process, here are some things to be happy about.
- Adaptation time – As the rate of change slows, organizations have a chance to catch their breath and figure out how to use these tools effectively. No more constant scrambling to keep up with the latest models that make last week's work obsolete. That means you can stop getting bogged down in endless POCs and actually deliver something.
- Improved risk management – With a clear understanding of AI's capabilities and limitations, companies can make more informed decisions about where and how to implement these technologies. Even a little bit of AI can have an amazing impact on the productivity of your products and end users.
- Strategic planning opportunities – As the fog of hype lifts, the path forward becomes easier to see. Companies can begin to plan their AI strategy with a more realistic view of future capabilities. Not so long ago, there was some crazy speculation about firing an entire software engineering team or all the marketers. AI does everything, right? Now, it's clear that AI is a new skill in these professions that increases productivity and adds new features. Plan accordingly.
Current State of the Game: From "Wow" to "How"
So, where does this put us? If we look at the Gartner hype cycle, we see that while foundational models and GenAI are entering a trough, other AI technologies are at different stages. For example, knowledge graphs are finally coming out of the trough, which may be driven by their usefulness in AI applications.
What are the key takeaways? AI isn't going away, but it's moving into a more measured, realistic phase of advancement. We're transitioning from a "wow" phase to a "how" phase: how do we actually implement these technologies to add real value? Having absorbed our current state, my advice is to relax and get comfortable with what we have today. If you're building a chatbot, it should increase user productivity in some way. Otherwise, you're just doing more AI research.
Looking to the Future
What can we expect as we move along this S-curve? I believe we are entering a period of consolidation and improvement. The gap between models is narrowing, with many approaching GPT-4 quality. This is great news for advanced users, who can now build on a more stable foundation.
We may also see a shift toward more focused and efficient models. The era of "bigger is always better" is coming to an end, replaced by a more nuanced approach that balances power and efficiency. While we may not be racing toward AGI at a breakneck pace, we are entering a phase that could be even more exciting: an era of practical innovation, where the real-world impact of AI begins to become clear. So, AI enthusiasts, fasten your seatbelts. The ride may be smoother than expected, but it's far from over.