DuckDB 1.0 Released: 300,000 Lines of C++ Engine Code, Millions of Downloads!

Author: Not bald programmer
DuckDB 1.0.0, codenamed "Snow Duck", signals that DuckDB has reached a new level in the field of data analytics.

On June 3, 2024, DuckDB, the highly anticipated data analysis engine, reached an important milestone: version 1.0.0 was released.

DuckDB is a high-performance analytical relational database designed for efficient data analysis. It is easy to install, very fast, and runs in-process, inside the application that uses it.

The core idea behind DuckDB is to retain the simplicity and ease of use of SQLite while adding fast analytical processing for OLAP workloads and fast data transfer between R/Python and the database.

The project was named "DuckDB" because the creators believed that ducks were resilient and could live off anything, similar to how they envisioned a database system to operate.

The 1.0.0 release, codenamed "Snow Duck", marks a new level of maturity for DuckDB in the field of data analysis and reflects the DuckDB team's persistent pursuit of system stability.

The DuckDB project started in 2018. After nearly 6 years of continuous polishing, it has grown into a mature open source project with more than 300,000 lines of C++ engine code, 42,000 commits, and 4,000 issues.

DuckDB has earned wide praise in the industry for its query performance and ease of use, with tens of thousands of followers on GitHub and social media platforms, millions of downloads per month, and more than 4 TB of download traffic per day for extensions alone. Now, even Wikipedia is starting to recognize DuckDB.

DuckDB chose to release version 1.0.0 at this time because the team realized that the data management system is a core component of the application, and there is a trust contract between the developer and the user.

Users rely on databases to provide correct query results and keep data secure, and system developers need to be aware of their responsibility not to arbitrarily disrupt user applications.

The release of version 1.0.0 marks a major breakthrough in the stability of storage formats and query semantic consistency, providing users with a strong stability guarantee.

According to the announcement, one of the main hurdles to releasing v1.0.0 was the storage format. DuckDB has its own custom data storage format, which allows users to manage many (and possibly very large) tables in a single file, with full transactional semantics and state-of-the-art compression.

But designing a new file format was not without its challenges, and over time DuckDB had to make significant changes to the format. This left users in a sub-optimal situation: whenever a new DuckDB version was released, files created with the old version did not work with the new one and had to be upgraded manually.

This issue was addressed in the February release of v0.10.0, which introduced backward compatibility and limited forward compatibility for the DuckDB storage format. The team says that this feature has been in use for some time now with no serious issues, and that they are confident DuckDB files created with DuckDB v1.0.0 will be compatible with future versions of DuckDB.

In addition to storage format stability, another core aspect of DuckDB 1.0.0 is stability across versions. While it may not be possible to never disrupt anyone's workflow, the team says it will be more careful about user-facing changes in the future. In particular, it plans to focus on providing stability for the SQL dialect as well as the C API.

Looking ahead, the DuckDB team will continue to focus on system stability and plans to enrich the extension ecosystem around DuckDB. Through community-contributed extensions, DuckDB is expected to become a foundation of the next data revolution, providing users with a high-performance, unified SQL interface.

In addition, DuckDB has strong financial backing and a long-term development strategy behind it.

DuckDB Labs currently has a core team of nearly 20 people focused on the long-term strategic development of DuckDB, and the non-profit DuckDB Foundation ensures that DuckDB continues to be developed under the MIT open source license.

P.S. DuckDB Labs CTO Mark is said to be the project's most important programmer, having written about 50% of the code himself. The rest of the team is said to consist of 13 programmers, 1 test intern, 1 person working on the developer ecosystem, and 1 on training and documentation.

References

https://github.com/duckdb/duckdb/releases/tag/v1.0.0

https://duckdb.org/2024/06/03/announcing-duckdb-100.html