The growth of NoSQL databases has slowed recently compared with that of vector databases. However, NoSQL vendors believe their products are best suited for AI.
Translated from "NoSQL Database Growth Has Slowed, but AI Is Driving Demand" by Richard MacManus.
Four years ago, I wrote about the rapid growth of NoSQL databases – in large part because of their compatibility with artificial intelligence (AI) and machine learning (ML). But that was before the generative AI craze began, with OpenAI releasing ChatGPT in November 2022.
So, what has happened to NoSQL databases since the advent of ChatGPT? In the new era of vector databases, are NoSQL database systems – such as document stores (MongoDB), key-value stores (Redis), and wide column stores (Cassandra) – still growing?
Back in 2020, to illustrate the growth of NoSQL database systems, I used the following chart from DB-Engines:
The chart shows a steep upward trajectory from 2013 to 2020 for systems such as MongoDB, Redis, and Cassandra (although all three declined slightly at the end of this period). Compared to the flat—and eventually declining—curves of traditional relational databases like Oracle and MySQL, NoSQL's growth curve is significant.
Here's the latest DB-Engines popularity chart, covering the past 36 months:
It's important to note that the chart measures popularity (not actual user numbers). Unsurprisingly, vector databases have seen explosive growth since 2021 – although that growth appears to have peaked at the end of last year. Over the same period, document stores and key-value stores declined slightly.
However, if we look at the chart going back to 2013, we can see that the growth of vector databases has not yet matched the peak reached by document stores and key-value stores (let's ignore the wide column store line, as its dataset seems to have changed on DB-Engines since my 2020 post).
In addition, despite the slight decline, NoSQL database systems remain among the most popular choices for developers. The chart below shows that the top 10 database systems have changed very little over the past two years, with the top six (including MongoDB at 5th and Redis at 6th) remaining the same. We also see that the top four database systems are all relational, and that they rank significantly higher in popularity than MongoDB and Redis.
NoSQL and AI
When Redis announced a controversial license change earlier this year, the Linux Foundation almost immediately announced support for an open-source fork of Redis called Valkey. Redis' position is that large cloud providers have an unfair market advantage, and that the new licenses are its way of trying to get those providers to pay. MongoDB took a similar step in 2018, when it tightened its own license terms.
I'll leave the debate over Redis' new license to others, but I do want to highlight a blog post that Redis published the day after the announcement. The article, titled "The Future of Redis," focuses on the AI uses of Redis. "We have always been at the forefront of the GenAI wave," wrote CEO Rowan Trollope and CTO Yiftach Shoolman, adding, "We were one of the first companies to recognize the need for vector search capabilities in databases, even before ChatGPT and LLMs became household names."
The post details plans for an AI-powered assistant called Redis CoPilot (now available) that "allows developers to interact directly with their data using language and convert that data into code." Redis also intends to improve vector processing performance through product quantization and by taking advantage of the latest hardware and GPU advancements, making Redis "more cost-effective for RAG use cases."
MongoDB is also targeting generative AI use cases. In a recent article published in The New Stack, Rick Houlihan, Developer Relations Team Lead, explicitly compares MongoDB's solution to PostgreSQL, a popular open-source relational database system. According to Houlihan, systems like PostgreSQL are not designed for the types of workloads that AI requires:
"Given the well-known performance limitations of RDBMS when it comes to processing wide rows and big data attributes, it's no surprise that these tests show that platforms like PostgreSQL struggle to handle the rich, complex document data required for generative AI workloads."
Not surprisingly, he concludes that using a document database, such as MongoDB, "provides better performance than using tools that are not designed for these workloads."
In PostgreSQL's defense, there's no shortage of hosting providers that offer AI-related features for Postgres. Earlier this year, I interviewed a "Postgres as a Platform" company called Tembo that saw a huge demand for AI scaling. "Postgres has an extension called pgvector," Samay Sharma, CTO at Tembo, told me. "So, it allows you to add a simple data type called vector to your existing table. So, even if you have existing rows of data, you can add a vector data type – it's a converted embedding."
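To make the pgvector idea concrete, here is a minimal sketch of the SQL involved in adding a vector column to an existing table and running a similarity query. It only builds SQL strings and never connects to a database. The helper function names, the `items` table, the `embedding` column, and the three-dimensional vectors are all hypothetical, chosen for illustration; the statements themselves follow pgvector's documented syntax, where `<->` is the L2 distance operator.

```python
# Hypothetical helpers that only build pgvector SQL as strings.
# Table name, column name and dimensions are illustrative assumptions.

def to_vector_literal(values):
    """Format numbers as a pgvector text literal, e.g. [0.1,0.2,0.3]."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"


def pgvector_statements(table, column, dims):
    """SQL to enable pgvector and add a vector column to an existing table."""
    return [
        "CREATE EXTENSION IF NOT EXISTS vector;",
        f"ALTER TABLE {table} ADD COLUMN {column} vector({dims});",
    ]


def nearest_neighbor_query(table, column, query_vector, limit=5):
    """SQL returning the rows closest to query_vector by L2 distance (<->)."""
    literal = to_vector_literal(query_vector)
    return f"SELECT * FROM {table} ORDER BY {column} <-> '{literal}' LIMIT {limit};"


if __name__ == "__main__":
    for statement in pgvector_statements("items", "embedding", 3):
        print(statement)
    print(nearest_neighbor_query("items", "embedding", [0.1, 0.2, 0.3]))
```

In real code, the query vector would normally be passed as a bound parameter through a database driver rather than interpolated into the string; the string form is used here only to keep the sketch self-contained.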
AI data is in abundant supply
Of course, every database company now claims that it works well with AI. Just last month, Oracle released an AI-driven update to its Oracle APEX low-code development platform, which the company says enables non-developers to execute vector queries in less than two minutes without having to know SQL.
When it comes to AI, there's no shortage of demand right now – all database companies and projects, whether SQL or NoSQL, benefit from it.