Hi there! If you want to stay up to date with the latest machine learning and big data analysis tutorials, please subscribe here: kzitem.info. Also drop your ideas for future videos and let us know what topics you're interested in! 👇🏻
Wow, very informative, much better than the Databricks documentation. It would be cool to do something with time series and use dates, products and categories to illustrate how useful this function can be in this context. Awesome!
@DecisionForest
3 years ago
Thank you Alejandro!
@mingmiao364
4 years ago
Amazing stuff. It helped me keep my job. Thank you for posting.
@DecisionForest
4 years ago
This made my day, glad that you found it useful.
@Ohy89
3 years ago
I spent a long time trying to understand window functions with no success. You are doing an amazing job. Thank you!
@DecisionForest
3 years ago
Happy I could help!
@ChrisLovejoy
4 years ago
Amazing! The other tutorials on this weren't great - this was fantastic, thanks.
@DecisionForest
4 years ago
Thank you Chris!
@RajmohanBalachandran
2 years ago
Thank you, I am able to understand window functions through a simple and clear explanation.
@DecisionForest
2 years ago
Glad you found it useful!
@selimberntsen7868
2 years ago
Amazing explanation! Thanks a lot, I found it difficult to wrap my head around this concept. However, it is much clearer now.
@aidataverse
2 years ago
Thanks for such a wonderful explanation
@Aryan91191
4 years ago
This was the best hands-on tutorial on the subject I have seen. Thank you. Please post more examples.
@DecisionForest
4 years ago
Thank you! Will do!
@oshinverma1787
2 years ago
Great work! Please keep on posting
@DataTranslator
11 months ago
Extremely informative. Thank you.
@nferraz
3 years ago
Amazing content! Keep up the excellent work on your channel.
@DecisionForest
3 years ago
Thank you Jose! Will do my best.
@yueminzhou1869
4 years ago
Thanks for the video, Radu! It is very well explained! Are you using Dataiku to present?
@arunasingh8617
1 year ago
great explanation!
@Mene0
8 months ago
Very helpful, thanks
@ferrerolounge1910
1 year ago
subscribed. Such clarity!
@alvinspark1875
3 years ago
Very nicely done... Thanks bro
@DecisionForest
3 years ago
Cheers Alvin!
@1UniverseGames
2 years ago
I was wondering: for node analysis of a tree, how can I create a VectorCell() function in PySpark? I have a pair of nodes, and this VectorCell should find whether a node exists, whether the node is a leaf, and do vector analysis on the pair of nodes. Do you have any video tutorial on creating this node tree representation?
@oussamadebboudz3771
2 years ago
Instead of rowsBetween() ... we could also use F.collect_set instead of list ... right?
@imDanoush
3 years ago
Great video thanks!
@bhubannayak6155
4 years ago
Hi Radu, nice tutorial with a clear explanation. Please also attach the notebooks here; that would be helpful.
@gustavorocha6592
2 years ago
Great video! Congrats
@DecisionForest
2 years ago
Thanks Gustavo!
@prmurali1leo
4 years ago
Wow, too good, I haven't seen anyone go this far to explain this. I have a question: is this very demanding and slow (when there are millions of rows)?
@DecisionForest
4 years ago
Thank you so much, glad it was helpful. To your question, if you run it on a cluster it will be pretty fast. Even if you run it locally if you have 16 cores it should perform well.
@gabrielalusquinos3913
3 years ago
Thank you very much! A video that is very easy to follow and a great help!
@DecisionForest
3 years ago
Thanks Gabriela!
@purnamaheshimmandi1212
1 year ago
Helpful!
@JoaoVictor-sw9go
2 years ago
For some use cases, it is basically the same as using the groupby and then joining the groupby result with the original dataframe, right?
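For the simple case this comment describes (an aggregate over a window that is partitioned but not ordered), the equivalence can be sketched in plain Python, without Spark. The rows and column meanings below are made up for illustration; the sketch only compares a "group by, then join back" result with a "per-row aggregate over the partition" result.

```python
from collections import defaultdict

# Hypothetical rows: (department, employee, salary)
rows = [
    ("sales", "ann", 3000),
    ("sales", "bob", 4000),
    ("sales", "cat", 5000),
    ("it", "dan", 6000),
    ("it", "eve", 7000),
]

# Step 1: "group by" department and aggregate the salaries.
totals = defaultdict(int)
for dept, _, salary in rows:
    totals[dept] += salary

# Step 2: "join" the aggregate back onto every original row.
joined = [(dept, name, salary, totals[dept]) for dept, name, salary in rows]

# Window style: for each row, aggregate over all rows in the same partition.
windowed = [
    (dept, name, salary, sum(s for d, _, s in rows if d == dept))
    for dept, name, salary in rows
]

print(joined == windowed)
```

Once the window is ordered or given an explicit frame, each row sees a different subset of its partition, and a single group-by plus join no longer reproduces the result.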
@MrChaomen
1 year ago
Do you know of any in-depth guide on how Spark physically computes window functions? There are guides about the physical implementation of joins and the algorithms used, but I want to know what algorithm is used for window functions and how it affects memory usage.
@ParthPatel-fp8lm
4 years ago
Thanks for the great explanatory example.
@DecisionForest
4 years ago
Thank you as well for the kind words. Happy it helped!
@eduardopalmiero6701
3 years ago
Hi! Nice guide. Why is it that, when you order the window by ascending salary, the salary list and the other computed aggregate columns don't give the same result as when the window is not ordered?
@Dyslexic_Neuron
3 years ago
excellent video ... Thanks
@DecisionForest
3 years ago
Thank you, glad you liked it!
@kevinfranciscochaconvargas8149
3 years ago
Thanks man, well explained and an excellent example.
@DecisionForest
3 years ago
Cheers Kevin!
@stevetrabajo4065
2 years ago
At 9:25, on row 1, is it possible to make average_salary and total_salary null, since row 1 has no preceding row to fill the frame between -1 and Window.currentRow?
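One possible way to think about this question, sketched in plain Python rather than Spark: a frame of "one row back up to the current row" simply contains fewer rows at the start of a partition, so the first row is aggregated over itself only. Returning null there would need an extra condition on the frame size. The salaries and the helper `frame_stats` are hypothetical, and the video itself does not show this variant.

```python
# Hypothetical salaries for one partition, already in window order.
salaries = [3000, 4000, 5000]

def frame_stats(values, null_if_incomplete=False):
    """Return (frame, total, average) per row for a frame of
    'previous row .. current row'."""
    out = []
    for i in range(len(values)):
        # Previous row (if there is one) plus the current row.
        frame = values[max(0, i - 1): i + 1]
        if null_if_incomplete and len(frame) < 2:
            # No preceding row exists, so report nulls instead.
            out.append((frame, None, None))
        else:
            out.append((frame, sum(frame), sum(frame) / len(frame)))
    return out

default = frame_stats(salaries)
nulled = frame_stats(salaries, null_if_incomplete=True)
for d, n in zip(default, nulled):
    print(d, n)
```

In this sketch the default behaviour keeps row 1's own salary as both total and average, while the `null_if_incomplete` variant blanks them out because the frame holds fewer than two rows.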
@shyamraj1766
3 years ago
Nice, it helps a lot
@DecisionForest
3 years ago
Glad to hear that!
@shirsendubasu8246
4 years ago
Great Video, appreciated !!
@mahdiakbarizarkesh5603
3 years ago
thanks, so useful
@DecisionForest
3 years ago
Cheers Mahdi!
@nestorguemez4846
3 years ago
Great video man 😎🤙
@DecisionForest
3 years ago
Appreciate it, thank you!
@pratyushraizada1472
4 years ago
Nice explanation, thanks a lot!
@DecisionForest
4 years ago
That’s very kind, glad you enjoyed it!
@martinparent7564
4 years ago
Nice trick listing the elements that go into computing the sum and average, quite useful for debugging! I don't quite get why ordering by salary changes the average and sum of salaries. From a "finance" point of view, a salary sort would not change the total weekly salary payout to employees. Is it that, from a Spark perspective, the "orderby" becomes another grouping?
@DecisionForest
4 years ago
Good question, and yes, the total would be the same if you averaged / added ALL of the values with a groupby. But with window functions using orderby, we add / average over the values UP TO and including the current value. That is why I listed the elements, so you can see what is being added (compare the output of cells 4 and 5, the list_salary column). Hope it makes sense now.
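To make this reply concrete, here is a rough plain-Python sketch of the behaviour it describes (no Spark needed). The employee data and the helper name `window_sum` are made up for illustration. It mimics Spark's default window frames: without an ordering, the frame is the whole partition; with an ordering, the frame runs from the start of the partition up to the current row, with rows of equal ordering value included.

```python
from collections import defaultdict

# Hypothetical rows: (department, employee, salary)
rows = [
    ("sales", "ann", 3000),
    ("sales", "bob", 4000),
    ("sales", "cat", 5000),
    ("it", "dan", 6000),
    ("it", "eve", 7000),
]

def window_sum(rows, ordered):
    """Emulate sum(salary) over a window partitioned by department."""
    partitions = defaultdict(list)
    for dept, name, salary in rows:
        partitions[dept].append((name, salary))

    result = {}
    for members in partitions.values():
        for name, salary in members:
            if ordered:
                # Ordered window: every salary up to and including this one.
                frame = sorted(s for _, s in members if s <= salary)
            else:
                # Unordered window: the whole partition.
                frame = [s for _, s in members]
            result[name] = (frame, sum(frame))
    return result

unordered = window_sum(rows, ordered=False)
running = window_sum(rows, ordered=True)
for name in unordered:
    print(name, unordered[name], running[name])
```

The per-row lists play the same role as the list_salary column in the notebook: they show which values went into each sum, so the ordered case reads as a running total while the unordered case repeats the department total on every row.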
@elzbietadoniek5810
1 year ago
How can I use window partition by for all columns in a dataframe (Scala)?
@PeterS123101
4 years ago
Thank you.
@sangilimurugansankarathand2464
4 years ago
Nice Explanation.
@DecisionForest
4 years ago
Thank you! Glad you found it useful.
@mayankupadhyay4447
1 year ago
How can we get the first non-null value from every column of a PySpark DataFrame?
@fuwizeye
4 years ago
Great explanation
@DecisionForest
4 years ago
Glad it was helpful!
@tomgt428
3 years ago
Cool
@ramojiraoyalamati4035
4 years ago
These videos on PySpark are informative. If you provided the code via Jupyter or GitHub, it would be more helpful.
@DecisionForest
4 years ago
Thank you, glad it was helpful. I do provide the Jupyter notebook; you can find the link in the description.
Comments: 68