To be honest, I didn't find this to be very helpful. I'm a project manager tasked with redesigning the whole data environment in a small enterprise; I'm technically minded but never formally studied the subject.

It seemed like the presenter didn't make the case for the presentation's title, "Build frameworks, not pipelines." I didn't observe a part where he discounted pipelines. The first 10 minutes, about the many units used across Britain as an analogy for different technologies and systems in data, didn't reveal any insights and can be safely skipped IMO. After that, the diagramming of a framework from the data source all the way to a data warehouse seems more like an explanation for beginners, but without the clarity that such an explanation should possess. Overall, it seemed like an inadequately organized way to present a basic idea.

That said, here are some individual points I took away from this presentation:
- Keep the HTML files from web scraping, not just the extracted fields, so the data is accessible at any time without going back to the original source
- Maintain a layer for failed data extractions: this has been my idea for a long time, but it's good to see it articulated by an actual data engineer
- Maintain a layer as a staging data warehouse, prior to the production data warehouse

Instead, I found this recommended video better, even though it was more complex: kzitem.info/news/bejne/pGx3yKpucHZml4o
It goes more in-depth about one company's challenges in designing a new data pipeline and offers insights that are generalizable to anyone setting up or upgrading such a pipeline.
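For anyone who wants to see the three takeaways above in concrete form, here is a minimal sketch. It is not from the presentation: the function name `land_page`, the `raw`/`staging`/`failed` folder names, and the idea of passing in an `extract` callable are all assumptions made for illustration, and plain folders stand in for whatever storage a real setup would use.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable


def land_page(root: Path, url: str, html: str,
              extract: Callable[[str], dict]) -> str:
    """Store one scraped page and route it to the staging or failed layer.

    Hypothetical helper: returns the name of the layer the record landed in.
    """
    # Derive a stable file name from the URL.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]

    # Raw layer: always keep the full HTML, so fields can be re-extracted
    # later without going back to the original site.
    raw_dir = root / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)
    (raw_dir / f"{key}.html").write_text(html, encoding="utf-8")

    try:
        record = {"url": url, "fields": extract(html)}
        layer = "staging"  # staging area, ahead of any production warehouse
    except Exception as exc:  # broad on purpose: any extraction error is kept
        record = {"url": url, "error": repr(exc)}
        layer = "failed"   # failed-extraction layer, for later inspection

    layer_dir = root / layer
    layer_dir.mkdir(parents=True, exist_ok=True)
    (layer_dir / f"{key}.json").write_text(json.dumps(record), encoding="utf-8")
    return layer
```

Because the raw HTML is written before extraction is attempted, a page that lands in `failed` can be re-processed later from the saved file once the extractor is fixed.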
@ooker777
1 year ago
Thanks for your time and effort to write a detailed review
@efeorikpete8774
2 years ago
Fast-forward to 3 years later: Airflow now has robust documentation for authoring, scheduling and monitoring your data pipeline.
@MrKane101111
2 years ago
Great presentation, really nice analogy and very clear.
@RedShipsofSpainAgain
1 year ago
For the first 10 minutes he talks about different measuring units in Britain as an analogy for the importance of standards in modern data engineering, but it has zero relevance to data engineering platforms. Really poor analogy. Just skip to 10:20.
@AshokTak
2 years ago
00:00 Welcome
00:34 Merchant John Story
08:17 Need for standardization
22:26 Q&A
Will update it later.
@TheSolbiatii
2 years ago
00:00 Welcome
00:34 Merchant John Story
08:17 Need for standardization
10:25 Traditional Pipeline vs Ideal Framework with Validations
18:02 Principles
22:26 Q&A
@vansf3433
1 year ago
It's too simple; anyone can learn the process of sorting, transforming and transmitting data without needing a good knowledge of CS.
@firefoxmetzger9063
2 years ago
Somehow this makes me think of XKCD's Standards comic.
@julianatlas5172
2 years ago
I like the xkcd about date formats. There is only one good date format according to ISO 8601, which is YYYY-MM-DD, e.g. 2021-12-15.
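As a small illustration of the YYYY-MM-DD form mentioned above, here is a sketch using Python's standard `datetime` module. The variable names are only for illustration; the point is that this format round-trips cleanly and that sorting the strings as plain text also sorts them chronologically.

```python
from datetime import date

# Format a date in the ISO 8601 calendar form YYYY-MM-DD.
d = date(2021, 12, 15)
iso = d.isoformat()  # '2021-12-15'

# Parse the same string back into a date object.
parsed = date.fromisoformat(iso)

# Plain string sorting matches chronological order for this format.
stamps = ["2021-12-15", "2021-02-01", "2020-12-31"]
ordered = sorted(stamps)  # ['2020-12-31', '2021-02-01', '2021-12-15']
```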
@boudehoucherahma8083
2 years ago
Very interesting presentation. Thanks🙏
@severtone263
2 years ago
This was very helpful. That analogy is simply the best.
Comments: 16