To enhance your career as a Cloud Data Engineer, check trendytech.in/?src=youtube&sub=mockdec for curated courses developed by me.
I have trained over 20,000 professionals in the field of Data Engineering in the last 5 years.
30 INTERVIEWS IN 30 DAYS - BIG DATA INTERVIEW SERIES
This mock interview series is launched as a community initiative under the Data Engineers Club, aimed at aiding the community's growth and development.
A highly experienced guest interviewer, Himanshu Mishra (www.linkedin.com/in/himanshu-mishra-4796014b/), conducts an engaging interview covering all the important topics that a Data Engineer should be aware of.
Our talented guest interviewee, Hamida Bano (www.linkedin.com/in/hamida-bano-793804208/), answers the interview questions in a simple way with good examples.
Links to the free SQL & Python series developed by me are given below -
SQL Playlist - kzitem.info/door/PLtgiThe4j67rAoPmnCQmcgLS4iIc5ungg
Python Playlist - kzitem.info/door/PLtgiThe4j67pQSwkaEF9uHXzr8Td9IEpV
Don't miss out - Subscribe to the channel for more such informative interviews and unlock the secrets to success in this thriving field!
Social Media Links:
LinkedIn - www.linkedin.com/in/bigdatabysumit/
Twitter - twitter.com/bigdatasumit
Instagram - instagram.com/bigdatabysumit/
Student Testimonials - trendytech.in/#testimonials
Discussed Questions: Timestamps
1:40 Introduction
2:21 Challenges you faced in your project
4:40 What is your contribution to your project?
6:20 Which file formats have you worked with in your project?
7:53 What are wide and narrow transformations?
9:38 What is lazy evaluation in Spark?
11:25 What is fault tolerance in Spark and MapReduce, and how does it work?
13:32 Client mode and cluster mode in Spark?
14:15 What are broadcast joins in Spark?
15:18 Memory management in Spark?
18:12 In live production, if you are facing an out-of-memory error, what approach do you follow to debug it?
19:51 What is data skewness?
20:16 What is caching?
21:38 How do you test your Spark code?
22:17 What are the performance tuning techniques that you use to tune your Spark job?
23:18 What is coalesce and when should we use it?
24:54 Managed and external tables with a use case
26:28 How do you deploy your Spark code?
27:29 How did you schedule your workflow?
28:14 What version control tools have you used?
28:49 What is shuffling and why do we need to think about minimising it?
29:50 One of the Spark jobs you've developed is experiencing slow performance. How would you go about resolving this issue?
31:00 What transformations and actions have you performed in the current project?
32:03 How does Spark work? Explain the Spark architecture.
33:05 What is lineage in Spark?
33:50 What are the different types of joins in Spark? Give a use case for any one of them.
35:25 What is a Spark session and how do we initialise it?
36:33 How to read a Parquet file into a dataframe?
37:37 How can you perform filters on a dataframe?
39:20 How to remove duplicates in a dataframe?
39:56 Consider a scenario where we want to update a column name in a dataframe. How will you do this?
40:40 Usage of withColumn?
41:27 How to remove a column from a dataframe?
41:50 Have you handled any null values in your dataframe?
42:37 SQL Coding Question
Tags
#mockinterview #bigdata #career #dataengineering #data #datascience #dataanalysis #productbasedcompanies #interviewquestions #apachespark #google #interview #faang #companies #amazon #walmart #flipkart #microsoft #azure #databricks #jobs
Mock Interview for Data Engineers | Spark Optimizations | Real-time Project Challenges and Scenarios