DuckDB, an In-Process Analytical DBMS

Feb 24, 2022 · New York, United States of America

Continuing the series of talks about using R with external computing, we have Hannes Mühleisen discussing DuckDB.

About the Talk:
Using databases to wrangle and retrieve data from R and Python can be challenging: traditional systems like SQLite or MySQL are not built for analytical workloads, and moving data into the analysis environment suffers from low bandwidth. DuckDB is a new database management system that runs directly in-process, greatly streamlining setup and data transfer. DuckDB uses a column-vectorized query processing architecture to run analytical SQL queries very quickly. DuckDB can also directly query data that lives in R or Pandas data frames and in external files without a dedicated import step. DuckDB is available as free and open-source software. In my talk, I will describe the rationale behind building DuckDB and give some usage examples for statistical programming.

About Hannes:
Hannes Mühleisen is a Senior Researcher in the Database Architectures Group at CWI Amsterdam, the Dutch National Research Center for Computer Science and Mathematics. Hannes is one of the creators of DuckDB, the first in-process analytical database management system. He is also co-founder and CEO of DuckDB Labs, a CWI spin-off that provides commercial services around the DuckDB project.

The talk will begin at 7 PM Eastern Time (America/New_York), and we will start admitting people to the event shortly before. Since this is completely remote, there will be no pizza, but everyone is encouraged to have pizza individually.

