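My solution to the second problem:

    from pyspark.sql.functions import coalesce, col, expr, when

    # Replace nulls with empty strings so the comparisons below behave predictably
    city_df = city_df.fillna(subset=['city1', 'city2', 'city3'], value='')

    # Take the first non-empty city, checking city1, then city2, then city3
    city_result = city_df.withColumn(
        'result',
        when(expr('city1 != ""'), col('city1'))
        .when(expr('city2 != ""'), col('city2'))
        .otherwise(col('city3'))
    ).select('result')
    city_result.display()  # display() is available in Databricks notebooks

Another solution:

    # Turn empty strings into nulls, then let coalesce pick the first non-null value
    null_df = city_df.select([
        when(col(column) == '', None).otherwise(col(column)).alias(column)
        for column in city_df.columns
    ])
    result_df = null_df.select(coalesce('city1', 'city2', 'city3').alias('result'))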
@mridulkrishnrawat4385
10 months ago
Your code will fail on the second question if the city3 value is anything other than AP. For example, if row 1 of city3 were Delhi, your output would miss Delhi.
@abhigyapranshu4791
8 months ago
For the second question, simply use the GREATEST function.
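A note on the GREATEST suggestion: Spark's `greatest` skips nulls and returns the largest remaining value, while `coalesce` returns the first non-null value in argument order. The two should agree only when at most one city column is populated per row. The sketch below is a plain-Python model of those two semantics, not Spark code; `coalesce_model` and `greatest_model` are hypothetical helper names, and the sample rows are invented for illustration.

```python
from typing import Optional, Sequence


def coalesce_model(values: Sequence[Optional[str]]) -> Optional[str]:
    """First non-None value in argument order, or None if all are None."""
    for v in values:
        if v is not None:
            return v
    return None


def greatest_model(values: Sequence[Optional[str]]) -> Optional[str]:
    """Largest non-None value by string comparison, or None if all are None."""
    present = [v for v in values if v is not None]
    return max(present) if present else None


# Invented sample rows: (city1, city2, city3)
rows = [
    ("Goa", None, None),
    (None, "AP", None),
    (None, None, "Delhi"),
    ("Bangalore", None, "Delhi"),  # two populated columns: the results differ
]

for row in rows:
    print(row, "->", coalesce_model(row), "|", greatest_model(row))
```

An empty string is not null, so `coalesce_model(("", "AP", None))` returns `""`. That is why the "another solution" in the first comment converts empty strings to nulls before calling `coalesce`.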
@mayankpatni5639
10 months ago
Data engineer vs. DevOps vs. cloud engineer: which has more job openings, better pay, and better future growth?
@tarunpothala2071
9 months ago
Hi bro, in the first question you treated the hobbies column as a list type directly, but in real-world data (and in the question as well) it is not an array or list type. Can you please solve it with the hobbies column as a string? In practice, when you read data from a file, columns are typically read as strings (for example, a CSV read without schema inference). Here is what I did, treating the hobbies column as a string:

    from pyspark.sql.functions import split

    input_df = spark.createDataFrame(hobbies_data, hobbies_schema)

    # Split the comma-separated string into an array column expression
    split_hobbies = split(input_df['hobbies'], ',')

    # One select per array position, stacked with unionAll
    output_df = input_df.select(input_df['name'], split_hobbies.getItem(0).alias('hobbies')) \
        .unionAll(input_df.select(input_df['name'], split_hobbies.getItem(1))) \
        .orderBy('name')

Any better solution than this is welcome.
@GeekCoders
9 months ago
Yes, you are right; we can solve it that way too.
@AnilKumar-pb7eo
8 months ago
Hi bro, what if the string is very long? The code above would then get lengthy, so we can use the explode function instead, right?
Can you give an explanation as well? It would be more helpful.
@GeekCoders
10 months ago
I gave one, right?
@krishnakumarr3031
4 months ago
    df1 = df.withColumn('col1', coalesce('num1', 'num2', 'num3'))
This one line will solve the second problem statement, right?
@GeekCoders
4 months ago
Yes
@suriyas6338
7 months ago
Hey! explode is used when the column value is an array or a map. What will you do if the source column is a comma-separated string, e.g. "tennis, badminton, cricket"? You need to convert the string to an array first and then explode it.
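The split-then-explode idea can be modelled row by row in plain Python. In PySpark this is usually written with `split` and `explode` from `pyspark.sql.functions`. The sketch below is not Spark code: `split_and_explode` is a hypothetical helper, and the sample data is invented. It also trims the space after each comma and drops empty pieces, which are choices of this sketch and not something `explode` does for you.

```python
from typing import Iterable, Iterator, Tuple


def split_and_explode(
    rows: Iterable[Tuple[str, str]], sep: str = ","
) -> Iterator[Tuple[str, str]]:
    """Yield one (name, hobby) pair per separated item, trimming whitespace."""
    for name, hobbies in rows:
        for item in hobbies.split(sep):
            hobby = item.strip()
            if hobby:  # skip empty pieces, e.g. from a trailing comma
                yield name, hobby


# Invented sample data: (name, comma-separated hobbies)
people = [
    ("Alice", "tennis, badminton, cricket"),
    ("Bob", "chess"),
]

exploded = list(split_and_explode(people))
for name, hobby in exploded:
    print(name, hobby)
```

Unlike the `getItem(0)` / `getItem(1)` approach earlier in the thread, this shape does not depend on how many hobbies each row contains.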
Comments: 25