If you’ve ever had to delete a set of records for regulatory compliance, update a set of records to fix an issue in the ingestion pipeline, or apply changes from a transaction log to a fact table, you know that row-level operations are becoming critical for modern data lake workflows. Even though the industry has seen a tremendous amount of innovation in this area, row-level operations can still be fairly expensive if the underlying data has to be shuffled.
This session will explain how Apache Spark™ can completely avoid shuffles during row-level operations by leveraging storage-partitioned joins, a key technique for efficiently modifying data at petabyte scale.
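As a rough illustration of the idea, the sketch below shows how a MERGE might run shuffle-free when the target and source tables share the same storage partitioning. The `spark.sql.sources.v2.bucketing.enabled` setting is the real Spark 3.3+ switch for storage-partitioned joins over V2 data sources (such as Apache Iceberg); the table and column names are hypothetical, and exact behavior depends on the table format and Spark version.

```sql
-- Enable storage-partitioned joins for V2 data sources (Spark 3.3+).
SET spark.sql.sources.v2.bucketing.enabled = true;

-- Assumption: both tables are partitioned identically (e.g. by days(event_ts)),
-- so Spark can pair up matching storage partitions directly and skip the
-- shuffle that a regular sort-merge join would require before the MERGE.
MERGE INTO warehouse.events AS t
USING warehouse.event_updates AS s
  ON t.event_id = s.event_id
WHEN MATCHED AND s.op = 'DELETE' THEN DELETE
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
```

The design point is that partitioning metadata from the storage layer is surfaced to the query planner, which can then prove co-partitioning and drop the exchange nodes from the plan.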
Talk by: Anton Okolnychyi and Chao Sun
Here’s more to explore:
Rise of the Data Lakehouse: dbricks.co/3NHT7CD
Lakehouse Fundamentals Training: dbricks.co/44ancQs
Connect with us: Website: databricks.com
Twitter: / databricks
LinkedIn: / databricks
Instagram: / databricksinc
Facebook: / databricksinc
Talk title: Eliminating Shuffles in Delete, Update, and Merge