In this discussion, @mehdio and Pedro Holanda from @duckdb go over all things CSV. They also dive into a practical example of how the CSV parser works in DuckDB. Enjoy!
Don't miss the next livestream: motherduck.com/events/
📓 Resources
* DuckDB CSV docs: duckdb.org/docs/data/csv/over...
* Pedro's LinkedIn: / pdet
➡️ Follow Us
LinkedIn: / motherduck
Twitter: / motherduck
Blog: motherduck.com/blog/
0:00 Intro
1:11 About Pedro
3:29 When did CSV support come into DuckDB?
7:24 Where are CSVs being used?
9:52 What is the TPC-H benchmark? How is it linked with CSVs?
12:35 Downsides of the benchmark
13:35 History of the CSV parser of DuckDB
18:32 CSVs in banking and Excel freedom
26:36 Will there be a file format to replace CSV?
28:45 Why DuckDB built their custom CSV parser
32:35 Hands-on: What horrible CSV can we throw at DuckDB?
53:33 Wrapping up and future work
#duckdb #dataengineering #sql #python
Why CSVs Still Matter: The Indispensable File Format