Presto On Spark: A Unified SQL Experience

Presto was originally designed to run interactive queries against data warehouses, but it has since evolved into a unified SQL engine for open data lake analytics, serving both interactive and batch workloads. However, Presto doesn't scale to very large and complex batch pipelines. Presto Unlimited was designed to address these scalability challenges, but it didn't fully solve fault tolerance, isolation, and resource management.
Spark is the tool of choice across the industry for running large-scale, complex batch ETL pipelines. This motivated the development of Presto On Spark.
Presto On Spark runs Presto as a library that is submitted with spark-submit to a Spark cluster. It leverages Spark for scaling shuffle, worker execution, and resource management. Because the same Presto SQL runs in both settings, no query conversion is needed between interactive and batch use cases. The result is a performant and scalable platform with a seamless end-to-end experience for exploring and processing data.
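As a rough illustration of "Presto as a library submitted with spark-submit", the sketch below assembles such a command line in Python. The launcher class name and the `--package`, `--config`, `--catalogs`, `--catalog`, `--schema`, and `--file` options are assumed from the example in the open source Presto documentation and should be checked against the Presto version in use. The helper `build_presto_on_spark_command`, the file names, the paths, and the master URL are all hypothetical placeholders, not taken from the talk.

```python
from typing import List


def build_presto_on_spark_command(
    launcher_jar: str,
    package_tarball: str,
    query_file: str,
    master: str = "spark://spark-master:7077",          # placeholder master URL
    config_path: str = "/presto/etc/config.properties",  # placeholder path
    catalogs_dir: str = "/presto/etc/catalogs",          # placeholder path
    catalog: str = "hive",
    schema: str = "default",
) -> List[str]:
    """Assemble the argv for a spark-submit run of the Presto launcher.

    Everything before the launcher jar is an option for spark-submit itself;
    everything after it is passed to the Presto launcher (assumed flags).
    """
    return [
        "spark-submit",
        "--master", master,
        # Assumed launcher entry point; verify against your Presto release.
        "--class", "com.facebook.presto.spark.launcher.PrestoSparkLauncher",
        launcher_jar,
        "--package", package_tarball,
        "--config", config_path,
        "--catalogs", catalogs_dir,
        "--catalog", catalog,
        "--schema", schema,
        "--file", query_file,  # the unmodified Presto SQL to run as a batch job
    ]


command = build_presto_on_spark_command(
    launcher_jar="presto-spark-launcher.jar",
    package_tarball="presto-spark-package.tar.gz",
    query_file="query.sql",
)
print(" ".join(command))
```

The point of the sketch is that the query file is plain Presto SQL: the same statement an analyst ran interactively is handed to Spark unchanged, and Spark supplies the shuffle, executors, and resource management.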
Many analysts at Intuit use Presto to explore data in the Data Lake/S3 and use Spark for batch processing. Previously, these analysts would spend several hours converting exploratory SQL written for Presto into Spark SQL in order to operationalize and schedule it as data pipelines.
Presto On Spark is now used by analysts at Intuit to run thousands of critical jobs. Because no query conversion is required, analysts are more productive and can deliver insights faster.
Benefits of this session:
Attendees will learn about the Presto On Spark architecture
Attendees will learn when to use Spark's execution engine with Presto
Attendees will learn how Intuit runs thousands of Presto jobs daily on the Databricks platform, and how they can apply this to their own work

Comments: 2

@andrescmarin
2 months ago
Is there something like this for Trino (formerly Presto SQL)?
@hasanmougharbel8030
2 years ago
Hey there, God bless your efforts on this channel. As a new SQL learner, I have a few questions. I have made up my mind to work on database systems that are designed for data analytics and not merely transactional functions. Is it a good idea to start by learning SQL Server, or should I consider other software? Also, are there any ETL tools that I should use right from the beginning to ease my learning process? I aim to start with open source software or inexpensive solutions throughout my learning process. Thanks for taking care of this. Looking forward to learning from you.
