DuckDB co-creator: “It was clear that a new architecture was necessary”

News Room
Last updated: 2026/05/20 at 3:13 AM
News Room Published 20 May 2026
Hannes Mühleisen

(Image: Hannes Mühleisen)

Hannes Mühleisen is a co-creator of DuckDB and CEO of DuckDB Labs. Together with Mark Raasveldt, he originally launched DuckDB as a research project at the Centrum Wiskunde & Informatica (CWI) in Amsterdam.

the next big thing – Golo Roden

Golo Roden is the founder and CTO of the native web GmbH. He designs and develops web and cloud applications and APIs, with a focus on event-driven and service-based distributed architectures. His guiding principle is that software development is not an end in itself, but must always serve the underlying business domain.

Golo: Hannes, you are one of the co-creators of DuckDB and co-founder of DuckDB Labs. When DuckDB version 1.0 was released in the summer of 2024, I reported on it for heise – and a lot has happened since then. Before we go into the details, I would like to start at the beginning: DuckDB has its roots in your research at the CWI in Amsterdam, where you and Mark Raasveldt worked on database internals for years. What was the moment (or gap) when you both decided that the world actually needed another database, and what did you originally want it to be?

Hannes: Back then, we worked quite closely with statisticians who had to analyze large survey datasets. It was clear to us that they needed database technology! But when we suggested this, they said that they didn’t really want a database in the classic sense. For example, before Docker, it wasn’t easy to install a database locally without being an expert. In addition, you couldn’t easily share the state of the database with someone else.

It was clear that a new architecture was needed: an embedded analytical database system. Nothing like that existed back then. It quickly became clear that we needed to build something entirely new – a clean design tailored to the embedded deployment model, with a modern system architecture.

In the summer of 2018, we decided to make this a reality and started implementing DuckDB.
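
To make the embedded deployment model concrete, here is a minimal sketch in Python. It uses the standard library's sqlite3 module as a stand-in, not DuckDB itself, because SQLite follows the same in-process, single-file model described above. The table name, columns and values are invented for illustration.

```python
import os
import sqlite3
import tempfile


def build_survey_db(path):
    """Create a single-file database in-process: no server, no install step."""
    con = sqlite3.connect(path)
    con.execute(
        "CREATE TABLE responses (respondent INTEGER, age INTEGER, score REAL)"
    )
    con.executemany(
        "INSERT INTO responses VALUES (?, ?, ?)",
        [(1, 34, 7.5), (2, 51, 6.0), (3, 29, 9.0), (4, 51, 8.0)],
    )
    con.commit()
    con.close()


def mean_score_by_age(path):
    """Whoever receives the file can reopen it and query the same state."""
    con = sqlite3.connect(path)
    rows = con.execute(
        "SELECT age, AVG(score) FROM responses GROUP BY age ORDER BY age"
    ).fetchall()
    con.close()
    return rows


# Hypothetical file location; the whole database state lives in this one file.
db_path = os.path.join(tempfile.mkdtemp(), "survey.db")
build_survey_db(db_path)
result = mean_score_by_age(db_path)
print(result)
```

The point of the sketch is the deployment model rather than the engine: the database runs inside the host process, and sharing its state means sharing one file.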

Golo: The term “SQLite for Analytics” has been attached to DuckDB for years. It captures a lot in just three words, but can also seem reductive. How accurate do you find this framing from your current perspective, and where does it fall short?

Hannes: “SQLite for Analytics” was an apt description of the project for the first five years. Over time, we have added a powerful extension mechanism that allows DuckDB to work with almost any file format, such as Parquet, JSON or Iceberg, and with many popular storage options, such as the S3 API. That’s why we started calling DuckDB a general-purpose data tool.

This may be less memorable than the original description, but it captures that the system is now much more versatile. And if you need SQLite for analytics, you can still use DuckDB for that.

Beyond Big Data

Golo: You’ve argued for some time that distributed systems are simply overkill for the vast majority of analytical workloads – and that a single modern machine can do significantly more than the industry usually assumes. I took up this argument myself in a detailed iX test, in which I positioned DuckDB as a lean alternative to Apache Spark. Would you like to lay out this thesis in your own words? And how do you respond to people who immediately accuse you of underestimating their problem?

Hannes: My argument rests on three pillars. First, hardware development has made great strides, and modern computers are amazingly powerful. Today, a powerful laptop ships with a dozen fast CPU cores, tens of gigabytes of memory, and a fast SSD with terabytes of storage. A server can easily offer ten times that amount or more.

Second, the field of database architecture has evolved significantly since 2010, when big data emerged. We were able to build on results in column-based storage, vectorized query processing, parallelism, and concurrency control. We have also conducted our own research on topics such as compression and operators for data volumes that exceed RAM.

Third, what most people don’t consider is that even if an organization is sitting on petabytes of data, it never needs to process all of that data in a single query. There is now robust evidence of this: In recent years, both Snowflake and Redshift have published samples and statistics of their user queries – veritable treasure troves for understanding real workloads. George Fraser at Fivetran has an excellent analysis of this, showing that even among queries on Snowflake and Redshift, the 99.9th percentile scans about 300 GB, so it could easily run on a single node.
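
A rough back-of-the-envelope calculation shows why a scan of that size is plausible on one machine. The 300 GB figure comes from the answer above; the sequential read throughput of 3 GB/s is purely an assumption for this sketch, and the estimate ignores compression, column pruning and CPU cost.

```python
def scan_seconds(scan_gb, read_gb_per_s):
    """Time to read scan_gb once at a sustained sequential read rate."""
    return scan_gb / read_gb_per_s


SCAN_GB = 300                # 99.9th percentile scan size cited in the interview
ASSUMED_SSD_GB_PER_S = 3     # assumption: sustained read rate of one fast SSD

seconds = scan_seconds(SCAN_GB, ASSUMED_SSD_GB_PER_S)
print(f"{seconds:.0f} s to read {SCAN_GB} GB at {ASSUMED_SSD_GB_PER_S} GB/s")
```

Under these assumptions, even the extreme tail of queries comes down to reading for a couple of minutes on a single drive, which is the kind of workload the single-node argument is about.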

Golo: Performance is one of the most striking aspects of DuckDB – many early adopters describe their first experience with the words “that can’t be right, let me check the result again”. Which architectural decisions do you think are most important, and which of them are not obvious to outsiders?

Hannes: We have already talked about opting for a single-node architecture, which eliminates various kinds of overhead in implementation, operation and performance. But there are also some non-trivial architectural decisions.

We chose vectorized execution over JIT compilation because it is well suited to analytical workloads and much easier to maintain in the long term. We didn’t use GPUs or exotic hardware like AI accelerators, but rather put all our energy into writing the most efficient algorithms for the CPU. And finally, we deliberately avoided SIMD intrinsics (hand-written vector instructions) when implementing these algorithms. Instead, we wrote scalar code and let the compiler auto-vectorize it. The result is highly portable yet powerful code.
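
The structure of vectorized execution can be sketched in a few lines of Python. This is an illustration of the general idea, not DuckDB's implementation: operators pass fixed-size chunks of values to each other instead of single rows, so each operator runs a tight loop over one chunk. The function names and the chunk size of 2048 are assumptions for this sketch, and plain Python will not show the speedup a compiled engine gets from such loops.

```python
from typing import Iterator, List

VECTOR_SIZE = 2048  # assumption for this sketch: values per chunk


def scan(values: List[int], vector_size: int = VECTOR_SIZE) -> Iterator[List[int]]:
    """Source operator: hand out the column one chunk at a time."""
    for start in range(0, len(values), vector_size):
        yield values[start:start + vector_size]


def filter_greater_than(chunks: Iterator[List[int]], threshold: int) -> Iterator[List[int]]:
    """Filter operator: one simple loop per chunk, no per-row operator calls."""
    for chunk in chunks:
        selected = [v for v in chunk if v > threshold]
        if selected:
            yield selected


def sum_aggregate(chunks: Iterator[List[int]]) -> int:
    """Aggregate operator: fold each incoming chunk into a running total."""
    total = 0
    for chunk in chunks:
        total += sum(chunk)
    return total


def tuple_at_a_time_sum(values: List[int], threshold: int) -> int:
    """Row-by-row baseline for comparison; it computes the same result."""
    total = 0
    for v in values:
        if v > threshold:
            total += v
    return total


data = list(range(10_000))
vectorized_total = sum_aggregate(filter_greater_than(scan(data), 4_999))
row_total = tuple_at_a_time_sum(data, 4_999)
print(vectorized_total, row_total)
```

In a compiled engine, the per-chunk loops are the scalar code that the compiler can auto-vectorize, which is how this design fits with the decision to avoid hand-written intrinsics.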

Additionally, as mentioned earlier, a lot of current research has been incorporated into DuckDB. Processing data volumes that exceed RAM by offloading them to disk is a key contributor to DuckDB’s performance. Most modern database systems can spill to disk, but when they do, their performance collapses. DuckDB uses modern flash-based storage to handle this much more gracefully – users often barely notice that their queries have been offloaded to disk.
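
What "offloading to disk" means for an operator can be illustrated with a textbook external merge sort. This is a generic technique chosen for the sketch, not DuckDB's actual algorithm: the operator keeps a bounded buffer in memory, writes sorted runs to temporary files when the buffer is full, and merges the runs at the end. The function names and the memory limit are hypothetical.

```python
import heapq
import os
import tempfile
from typing import Iterable, Iterator, List


def _spill_run(sorted_values: List[int], spill_dir: str) -> str:
    """Write one sorted run to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(dir=spill_dir, suffix=".run", text=True)
    with os.fdopen(fd, "w") as f:
        for v in sorted_values:
            f.write(f"{v}\n")
    return path


def _read_run(path: str) -> Iterator[int]:
    """Stream a spilled run back from disk, one value at a time."""
    with open(path) as f:
        for line in f:
            yield int(line)


def external_sort(values: Iterable[int], memory_limit: int) -> Iterator[int]:
    """Yield values in sorted order while buffering at most memory_limit of them."""
    with tempfile.TemporaryDirectory() as spill_dir:
        runs: List[str] = []
        buffer: List[int] = []
        for v in values:
            buffer.append(v)
            if len(buffer) >= memory_limit:
                # Buffer is full: sort it and move it out of memory.
                runs.append(_spill_run(sorted(buffer), spill_dir))
                buffer = []
        # Merge the on-disk runs with whatever is still in memory.
        streams = [_read_run(p) for p in runs]
        streams.append(iter(sorted(buffer)))
        yield from heapq.merge(*streams)


# 7919 is coprime with 1000, so this is a shuffled permutation of 0..999.
data = [(i * 7919) % 1000 for i in range(1000)]
result = list(external_sort(data, memory_limit=128))
print(result[:5], result[-5:])
```

The sketch only shows the mechanism. How gracefully performance degrades once spilling starts depends on the storage and on how the engine schedules these reads and writes, which is where the answer above places DuckDB's advantage.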

The ecosystem

Golo: DuckDB’s reach into the Python and R communities, into Node.js, into all sorts of tools and notebooks is remarkable. Was this ecosystem strategy a conscious choice from the start, or did it come about because people pulled DuckDB into their workflows?

Hannes: Of course you have to meet the users where they are. Initially, we envisioned that DuckDB would be used for data science workloads, and that determined the initial selection of clients. We obviously needed a command line client. On the language side, Python was already very strong, and we had strong connections with the R community, so we decided to implement these clients first.

Node.js followed soon after. As DuckDB grew, the community began developing clients independently. This allowed us to monitor their adoption before investing the core team’s work into fifteen different drivers. For example, the DuckDB Go driver was initially implemented by Marc Boeker, who later gave the code to the DuckDB Foundation.

Golo: The extension mechanism seems like a rather quiet but very consequential design decision. It allows DuckDB to read formats it wasn’t built for, work with object stores, and even talk to other databases. How do you think about the line between what belongs in the core and what is better off in an extension?

Hannes: We see DuckDB being used in resource-constrained environments – single-board computers, browser tabs, memory-limited containers. To enable this use, we want to keep the core of DuckDB small and only include the essentials: the SQL parser, the database engine, the storage engine, the CSV reader – and the extension mechanism. Most other features such as the Parquet reader or even HTTPS support are available as extensions.

A nice side effect of this powerful extension mechanism is that our community can build its own extensions. There are currently more than 180 community extensions for DuckDB, each of which brings new features to the system and can be installed with a single line.
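
The split between a small core and loadable extensions can be sketched as a registry pattern. Everything here is hypothetical and only illustrates the design idea: the class TinyCore, its method names and the "jsonl" format are invented for this example and are not DuckDB's API.

```python
import csv
import io
import json
from typing import Callable, Dict, List

Reader = Callable[[str], List[dict]]


class TinyCore:
    """Hypothetical minimal core: ships a CSV reader plus a registry for more."""

    def __init__(self) -> None:
        # Only the essential format is built in.
        self._readers: Dict[str, Reader] = {"csv": self._read_csv}

    @staticmethod
    def _read_csv(text: str) -> List[dict]:
        return list(csv.DictReader(io.StringIO(text)))

    def load_extension(self, fmt: str, reader: Reader) -> None:
        """Register a reader that the core itself does not ship."""
        self._readers[fmt] = reader

    def read(self, fmt: str, text: str) -> List[dict]:
        if fmt not in self._readers:
            raise LookupError(f"no reader for {fmt!r}; load an extension first")
        return self._readers[fmt](text)


def json_lines_reader(text: str) -> List[dict]:
    """Stand-in for an extension: parse one JSON object per line."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


core = TinyCore()
csv_rows = core.read("csv", "id,name\n1,duck\n")

core.load_extension("jsonl", json_lines_reader)
json_rows = core.read("jsonl", '{"id": 1, "name": "duck"}\n')
print(csv_rows, json_rows)
```

The design choice this illustrates is that the core stays small enough for constrained environments, while additional formats cost nothing until someone loads them.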


