Care and Feeding of Catalyst Optimizer

You’ve seen the technical deep dives on Spark’s Catalyst query optimizer. You understand how to fix joins and how to spot common traps in a logical query plan. But what happens when you’re alone with the Spark UI and the cluster goes idle for 40 minutes? How can you diagnose what’s gone wrong with your query and fix it? Spark SQL’s ease of use can hide a deceptively steep operational curve. Queries can look innocent yet cause issues that take a sophisticated understanding of Spark internals to diagnose and solve. A tour through puzzles and edge cases, this talk pushes us toward a better practical understanding of Spark’s Catalyst optimizer:
- Everything about how you, and the optimizer, reason about UDFs is based on the idea that they’re cheap to run. What if they’re not? Betrayed by salt: a surprising source of skew!
- What do you do when Spark’s codegen stage generates a method that exceeds 64 KB? What’s really going on, and can you fix it without simply disabling whole-stage codegen?
- How can tuning the JVM code cache improve your Spark application’s performance?
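
As a rough illustration of the knobs the last two questions touch, here is a minimal sketch that assembles `spark-submit` arguments. The setting names (`spark.sql.codegen.wholeStage`, `spark.sql.codegen.hugeMethodLimit`, and the HotSpot flag `-XX:ReservedCodeCacheSize`) are existing Spark and JVM options, but the helper functions and the values are assumptions made for illustration, not recommendations taken from the talk.

```python
# Sketch only: the option names are real Spark/HotSpot settings, but the
# helper names and default values here are illustrative assumptions.

def codegen_and_code_cache_conf(huge_method_limit=8000, code_cache_mb=512):
    """Build conf entries for codegen method size and the JVM code cache."""
    jvm_flag = f"-XX:ReservedCodeCacheSize={code_cache_mb}m"
    return {
        # Keep whole-stage codegen on instead of disabling it outright.
        "spark.sql.codegen.wholeStage": "true",
        # Subtrees whose generated method exceeds this bytecode size
        # fall back from whole-stage codegen.
        "spark.sql.codegen.hugeMethodLimit": str(huge_method_limit),
        # Reserve more room for JIT-compiled code on driver and executors.
        "spark.driver.extraJavaOptions": jvm_flag,
        "spark.executor.extraJavaOptions": jvm_flag,
    }


def to_submit_args(conf):
    """Render a conf dict as a flat list of spark-submit arguments."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    print(" ".join(to_submit_args(codegen_and_code_cache_conf())))
```

The JVM options are passed at submit time rather than set inside the application, because the driver JVM is already running by the time application code can touch its configuration in client mode.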
About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
