☕ www.buymeacoffee.com/johnnych...
ℹ️ johnnychivers.co.uk/
ℹ️ github.com/johnny-chivers/glu...
ℹ️ aws.amazon.com/glue/
00:00 - Intro
00:19 - What is Glue ETL?
00:58 - Glue vs EMR
02:21 - How to use Glue ETL
03:39 - What we will cover in the tutorial
04:00 - Hands on tutorial
04:06 - Architecture overview
05:29 - S3 configuration
05:47 - Creating an ETL job
08:36 - Running an ETL job
09:37 - Creating the 'output' database
09:49 - Creating a new crawler
11:24 - Athena
11:57 - Recap of what we coded
In this series of videos we take a look at AWS Glue. We mix the theory with the practical as we build a functioning ETL application using the Glue Data Catalog, Crawlers, Glue ETL, Triggers, Workflows and Dev Endpoints.
In this video we take a look at Glue ETL. Using the data we ingested and registered in the AWS Glue Data Catalog in lesson 1, we create an AWS Glue ETL job to transform our CSV data into Parquet. We then register this new dataset with the AWS Glue Data Catalog before querying it using Athena.
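For anyone who prefers to read the idea as code first, the CSV to Parquet job described above could look roughly like the Glue PySpark script below. This is a minimal sketch, not the script from the video: the database, table and bucket names are hypothetical placeholders, and the helper names `csv_to_parquet` and `run_job` are made up for illustration. The `awsglue` modules are only available inside the Glue runtime, so they are imported inside `run_job`.

```python
import sys


def csv_to_parquet(glue_context, database, table_name, output_path):
    """Read a catalogued CSV table and write it back to S3 as Parquet."""
    # Load the table registered in the Glue Data Catalog as a DynamicFrame.
    source = glue_context.create_dynamic_frame.from_catalog(
        database=database,
        table_name=table_name,
    )
    # Write the same rows to S3 in Parquet format.
    glue_context.write_dynamic_frame.from_options(
        frame=source,
        connection_type="s3",
        connection_options={"path": output_path},
        format="parquet",
    )
    return source


def run_job():
    """Entry point for the Glue job (assumes the Glue Spark runtime)."""
    from awsglue.context import GlueContext
    from awsglue.job import Job
    from awsglue.utils import getResolvedOptions
    from pyspark.context import SparkContext

    # Glue passes the job name as a --JOB_NAME argument.
    args = getResolvedOptions(sys.argv, ["JOB_NAME"])
    glue_context = GlueContext(SparkContext.getOrCreate())
    job = Job(glue_context)
    job.init(args["JOB_NAME"], args)

    csv_to_parquet(
        glue_context,
        database="input_db",  # hypothetical catalog database
        table_name="customers_csv",  # hypothetical table from lesson 1
        output_path="s3://example-bucket/output/customers/",  # hypothetical
    )
    job.commit()
```

When used as the actual job script, the file would end with a call to `run_job()`. The new crawler and the Athena query from the later chapters then work against the Parquet files written under the output path.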
AWS Glue is a serverless data integration service that makes it easy to discover, prepare, and combine data for analytics, machine learning, and application development. AWS Glue provides all of the capabilities needed for data integration so that you can start analyzing your data and putting it to use in minutes instead of months.
Data integration is the process of preparing and combining data for analytics, machine learning, and application development. It involves multiple tasks, such as discovering and extracting data from various sources; enriching, cleaning, normalizing, and combining data; and loading and organizing data in databases, data warehouses, and data lakes. These tasks are often handled by different types of users that each use different products.
AWS Glue provides both visual and code-based interfaces to make data integration easier. Users can easily find and access data using the AWS Glue Data Catalog. Data engineers and ETL (extract, transform, and load) developers can visually create, run, and monitor ETL workflows with a few clicks in AWS Glue Studio. Data analysts and data scientists can use AWS Glue DataBrew to visually enrich, clean, and normalize data without writing code. With AWS Glue Elastic Views, application developers can use familiar Structured Query Language (SQL) to combine and replicate data across different data stores.
😎 About me
I have spent the last decade immersed in the world of big data, working as a consultant for some of the globe's biggest companies. My journey into the world of data was not the most conventional. I started my career working as a performance analyst in professional sport at the top levels of both rugby and football. I then transitioned into a career in data and computing. This journey culminated in the study of a Master's degree in Software Development, alongside many a professional certification in AWS and MS SQL Server.
AWS Glue 101 | Lesson 2: Glue ETL