Mock Interview for Data Engineers | Spark Optimizations | Real-time Project Challenges and Scenarios

To enhance your career as a Cloud Data Engineer, check trendytech.in/?src=youtube&sub=mockdec for curated courses developed by me.
I have trained over 20,000 professionals in the field of Data Engineering in the last 5 years.
30 INTERVIEWS IN 30 DAYS - BIG DATA INTERVIEW SERIES
This mock interview series is launched as a community initiative under the Data Engineers Club, aimed at aiding the community's growth and development.
A highly experienced guest interviewer, Himanshu Mishra (www.linkedin.com/in/himanshu-mishra-4796014b/), conducts an engaging interview covering all the important topics that a Data Engineer should be aware of.
Our talented guest interviewee, Hamida Bano (www.linkedin.com/in/hamida-bano-793804208/), answers the interview questions in a simple way with good examples.
Links to the free SQL and Python series developed by me are given below:
SQL Playlist - kzitem.info/door/PLtgiThe4j67rAoPmnCQmcgLS4iIc5ungg
Python Playlist - kzitem.info/door/PLtgiThe4j67pQSwkaEF9uHXzr8Td9IEpV
Don't miss out - Subscribe to the channel for more such informative interviews and unlock the secrets to success in this thriving field!
Social Media Links:
LinkedIn - www.linkedin.com/in/bigdatabysumit/
Twitter - twitter.com/bigdatasumit
Instagram - instagram.com/bigdatabysumit/
Student Testimonials - trendytech.in/#testimonials
Discussed Questions: Timestamps
1:40 Introduction
2:21 Challenges you faced in your project
4:40 What was your contribution to your project?
6:20 Which file formats have you worked with in your project?
7:53 What are wide and narrow transformations?
9:38 What is lazy evaluation in Spark?
11:25 What is fault tolerance in Spark and MapReduce, and how does it work?
13:32 Client mode and cluster mode in Spark
14:15 Broadcast joins in Spark
15:18 Memory management in Spark
18:12 In live production, if you face an out-of-memory error, what approach do you follow to debug it?
19:51 What is data skewness?
20:16 What is caching?
21:38 How do you test your Spark code?
22:17 What performance tuning techniques do you use to tune your Spark job?
23:18 What is coalesce and when should we use it?
24:54 Managed and external tables with a use case
26:28 How do you deploy your Spark code?
27:29 How did you schedule your workflow?
28:14 What version control tools have you used?
28:49 What is shuffling and why do we need to think about minimising it?
29:50 One of the Spark jobs you've developed is experiencing slow performance. How would you go about resolving this issue?
31:00 What transformations and actions have you performed in the current project?
32:03 How does Spark work? Explain the Spark architecture.
33:05 What is lineage in Spark?
33:50 What are the different types of joins in Spark? Give a use case for any one of them.
35:25 What is a Spark session and how do we initialise it?
36:33 How do you read a Parquet file into a DataFrame?
37:37 How can you perform filters on a DataFrame?
39:20 How do you remove duplicates in a DataFrame?
39:56 Consider a scenario where we want to rename a column in a DataFrame. How would you do this?
40:40 Usage of withColumn
41:27 How do you remove a column from a DataFrame?
41:50 Have you handled null values in your DataFrame?
42:37 SQL coding question
Tags
#mockinterview #bigdata #career #dataengineering #data #datascience #dataanalysis #productbasedcompanies #interviewquestions #apachespark #google #interview #faang #companies #amazon #walmart #flipkart #microsoft #azure #databricks #jobs

Comments: 25

@chetankakkireni8870
3 months ago
She spoke about user memory, executor memory, and cache memory that uses off-heap memory, which is not managed by the garbage collector. I found that very useful.
@PraveenSingh-no8ol
3 months ago
Sumit Sir, kindly make a video on a person who has transitioned from a non-IT to a Data Engineering profile. It will be really helpful.
@poojabarawkar1808
3 months ago
Thanks
@akshaythengane4302
1 month ago
This series is too good! Keep em coming!
@prannay19
3 months ago
Thanks again. I am following these closely and feel that these would be immensely helpful in cracking the interviews. Appreciate it. 👍
@sumitmittal07
3 months ago
definitely
@_-_Abhinav_-_33
3 months ago
This interview is really very helpful. Thank you so much Sir for this entire series.
@sumitmittal07
3 months ago
Pleasure to share more such content for all my supportive followers!
@swapnildande4706
3 months ago
Really thanks sir for mock interview playlist 🙏🏻
@sumitmittal07
3 months ago
Most welcome
@zaffer2024
3 months ago
🙏
@sadiqueahmad6781
3 months ago
Insightful interview 👍
@sumitmittal07
3 months ago
thank you
@karthikeyanudayakumar9553
3 months ago
Excellent mock interview 👍
@sumitmittal07
3 months ago
Glad you enjoyed it!
@user-rx3vl2en5i
3 months ago
Hi sir, good morning. It was helpful to us. Please also make some AWS data engineering interviews instead of Azure.
@sumitmittal07
3 months ago
Noted
@user-rx3vl2en5i
3 months ago
Yeah please we facing the end to end data pipeline AWS side explanation where use etl used nd which transfer that used and so on.
@pritamkabiraj7691
1 month ago
Hi Sumit Sir, I also want to appear for a mock interview. Is there any process involved, or can you help me with the process to appear?
@umeshpagoti1017
3 months ago
Sir continue the python videos
@sumitmittal07
3 months ago
yes
@shiprasarwada
3 months ago
Sir, please hold mock interviews for GCP data engineers
@sumitmittal07
3 months ago
sure
@karthikeyanr1171
3 months ago
too many questions
@telugoons2292
3 months ago
Thanks
