Estimate change in CEO departures with bootstrap resampling

Views: 3,088

Julia Silge

Comments: 20

@oc1194
3 years ago
Thank you Julia, I always learn so much from working through your examples.
@garyboy7135
3 years ago
Thank you @Julia, I really love the video. Can you explain a bit more about when you'd use bootstrap resampling vs. k-fold cross-validation to measure a model?
@JuliaSilge
3 years ago
This Cross Validated question has some really thoughtful discussion on that: stats.stackexchange.com/questions/18348/differences-between-cross-validation-and-bootstrapping-to-estimate-the-predictio
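For readers who want the mechanical difference spelled out, here is a minimal sketch in plain Python (not rsample code; the helper names `k_fold_indices` and `bootstrap_indices` are hypothetical and exist only for illustration). It assumes the standard definitions: k-fold partitions the rows so each row is held out exactly once, while the bootstrap draws n rows with replacement and holds out whatever was never drawn.

```python
import random

def k_fold_indices(n, k, seed=0):
    """Partition row indices 0..n-1 into k folds; each fold is held out once."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    # (analysis indices, held-out indices) for every fold
    return [(sorted(set(idx) - set(fold)), sorted(fold)) for fold in folds]

def bootstrap_indices(n, times, seed=0):
    """Draw n indices with replacement; rows never drawn are 'out of bag'."""
    rng = random.Random(seed)
    resamples = []
    for _ in range(times):
        drawn = [rng.randrange(n) for _ in range(n)]
        out_of_bag = sorted(set(range(n)) - set(drawn))
        resamples.append((drawn, out_of_bag))
    return resamples

n = 20
folds = k_fold_indices(n, k=5)
boots = bootstrap_indices(n, times=5)
```

In this sketch the k-fold analysis sets are smaller than the original data and contain no duplicates, whereas each bootstrap analysis set has the same size as the original data but repeats some rows.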
@pwylll
3 years ago
I did not know that RStudio supported code snippets. Thanks for another great video!
@davidjackson7675
3 years ago
Julia, thanks for another excellent and concise video.
@mmohan7200
3 years ago
But the link is not working properly, miss.
@AngelFelizF
3 years ago
Great video. I learnt many things. Thanks
@diegouriarte
3 years ago
Great video! Thanks
@addersnap2885
3 years ago
Does the rsample package effectively supersede Davis Vaughan's strapgod package? Or should strapgod still be used if performance is a concern (and splitting is unnecessary)?
@JuliaSilge
3 years ago
I'd say that Davis' work on strapgod informed some of the design of rsample but strapgod is more performant in some situations. You wouldn't use strapgod if you want to integrate with the rest of tidymodels but it can be great in some situations.
@addersnap2885
3 years ago
@JuliaSilge Awesome, thanks for the reply. I'm currently in a situation where I need to get bootstrap estimates from every entry in a column of nested data frames, so I'll take any performance or memory gain I can get for now. Although I definitely think it would be interesting to play with mapping some tidymodels functions over columns of tibbles as well.
@wortelsorbet
3 years ago
For small samples there is also the loo_cv() function. Would you recommend bootstrapped samples over loo_cv() and why?
@JuliaSilge
3 years ago
I definitely recommend bootstrap for basically all realistic situations; we write about this a bit here: www.tmwr.org/resampling.html#leave-one-out-cross-validation
@wortelsorbet
3 years ago
One thing I am still struggling with a bit: the bootstrap method seems to be applied to the training set only. What if you would like to use it to split the data repeatedly into a training and a test set? With a small sample, I otherwise still have the problem that the test set (and performance on the test set) will depend very much on the particular draw.
@JuliaSilge
3 years ago
@wortelsorbet I might not be understanding, but I don't think that data resampling strategy is what we would typically recommend. You can read more here: www.tmwr.org/splitting.html And here: www.tmwr.org/resampling.html Notice that you create resamples from the *training* data if you have created a training/testing split.
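To make that workflow concrete, here is a minimal sketch in plain Python rather than tidymodels code. The function names `train_test_split` and `bootstrap_resamples`, the 25% test fraction, and the toy data are all assumptions chosen for illustration. The point it shows is that the test set is carved off once and never enters any resample.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=0):
    """Hold out a test set once, up front (hypothetical helper)."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def bootstrap_resamples(train, times=25, seed=0):
    """Resample the training rows only; out-of-bag rows form the assessment set."""
    rng = random.Random(seed)
    n = len(train)
    resamples = []
    for _ in range(times):
        idx = [rng.randrange(n) for _ in range(n)]
        chosen = set(idx)
        analysis = [train[i] for i in idx]
        assessment = [train[i] for i in range(n) if i not in chosen]
        resamples.append((analysis, assessment))
    return resamples

rows = list(range(40))          # toy stand-in for 40 observations
train, test = train_test_split(rows)
resamples = bootstrap_resamples(train, times=10)
```

Model comparison and tuning would use the `(analysis, assessment)` pairs, and the untouched `test` rows would be used once at the very end.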
@wortelsorbet
3 years ago
That is a very useful chapter. I guess I can prevent the test set from being accidentally very different from the training set by using stratified sampling. Also useful for the type of data I work with (often repeated measurements per research participant) is the section on multi-level data. There I seem to have guessed correctly that I should sample participants (and take all measurements within a participant) rather than participant/measurement combinations. Many thanks!
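The participant-level idea described above can be sketched as follows. This is plain Python with a hypothetical `group_bootstrap` helper and made-up toy records, not the tidymodels implementation. It assumes each record carries a participant identifier and that whole participants, not individual measurements, are drawn with replacement.

```python
import random
from collections import defaultdict

def group_bootstrap(records, group_key, seed=0):
    """Draw participants with replacement, keeping all of their measurements together."""
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for rec in records:
        by_group[rec[group_key]].append(rec)
    groups = sorted(by_group)
    drawn = [rng.choice(groups) for _ in groups]
    drawn_set = set(drawn)
    # A participant drawn twice contributes all of their rows twice.
    analysis = [rec for g in drawn for rec in by_group[g]]
    # Participants never drawn are held out as a whole.
    held_out = [rec for g in groups if g not in drawn_set for rec in by_group[g]]
    return analysis, held_out

# Toy data: 6 participants with 3 repeated measurements each (illustrative only).
records = [{"participant": p, "measurement": m} for p in "ABCDEF" for m in range(3)]
analysis, held_out = group_bootstrap(records, "participant")
```

Because sampling happens at the participant level, no participant ever has some measurements in the analysis set and others in the held-out set.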
@alexandroskatsiferis
3 years ago
Hello Julia, if you had plotted the ratio (voluntary / involuntary) with geom_line and geom_point across the fiscal years, you would have been even more surprised :D
@cuysaurus
3 years ago
How much time do I need to reach that level of skill?!?!?!
@mayurkoli5476
3 years ago
Hi Julia, please start a video series on artificial intelligence and deep learning.
@davidjackson7675
3 years ago
Leaving a job voluntarily? You mean you can do that? :)