Thanks for watching! Feel free to upvote the kaggle notebook if you found it helpful! Kaggle notebook: www.kaggle.com/kenjee/titanic-project-example My Kaggle Profile: www.kaggle.com/kenjee Try watching my kaggle project from scratch series next! kzitem.info/news/bejne/02N6uG1-e5Oao4o&ab_channel=KenJee
@olabisioremade4784
3 years ago
Hi Ken, how long does the whole 365datascience data science course take?
@alexmyers3716
1 year ago
I'm here because of GPT4. Before GPT4 was released, I had a decent basic understanding of data science applications, but did not have the time to learn all of the Python syntax. Now, with GPT4, all I have to do is understand how to explain what I want to do, and GPT4 takes care of all the coding. It wouldn't be hard to create this entire notebook in 2-3 hours. Wild times we live in!
@hornfan722
4 years ago
Thanks ken- never used Kaggle or even done any data science projects. The detailed analysis (including the nuances MOST IMPORTANTLY) is really making this digestible- not to mention applicable
@KenJee_ds
4 years ago
Will do my best to include even more nuances going forward!
@vedantbhardwaj4058
4 years ago
I gotta be honest here, started learning Data Science on my own but every now and then I become lazy AF and I just stop for a period of 2-3 weeks. It's difficult to be consistently committed to the program and learning. Although I hope I slowly complete the training.
@KenJee_ds
4 years ago
It happens to me as well! I use youtube to have people hold me accountable to continued learning. Maybe find a friend or someone to keep you on top of your learning journey!
@omotayoonike9825
1 year ago
Please don't bring this bad energy here. Everybody who is a data scientist feels the same; even I don't want to do it sometimes, and God knows it's difficult. But if you stick around the barber shop long enough, you will get your hair cut. You can become somebody through data science or by other means, but all of it is difficult.
@davidologunoba4703
1 year ago
Same sort of situation with me. But you know what, let's keep moving, we can do it!
@onii-chan2811
3 months ago
We all go through this bro.. the field is demanding
@MaiNguyen-nl3pp
4 years ago
You have saved us hours of self-exploration! Thank you, Ken :D Hope you can make more videos like this!
@KenJee_ds
4 years ago
You should still definitely self explore as well! Thank you for watching, more to come!
@sandrafield9813
4 years ago
Thank you so much for your videos, I watch them all the time. I'm in a master's DS program, and I feel like I'm actually on the Titanic right now, going down down down. Here you are handing me a raft, a dinghy, and also giving me a map to a huge lush close-by island where there's an escape airport.
@KenJee_ds
4 years ago
Thanks for watching them Sandra!! I also love the analogy haha. Hopefully, one day I will provide you with a more stable yacht so you can enjoy the data science journey in style!
@kbillotta
4 years ago
Thanks Ken... I just got my physics degree and i want to become a data scientist..Your videos are helping a lot! Thanks
@KenJee_ds
4 years ago
That's what I like to hear! Thanks for watching!
@Om-id1qr
2 years ago
I'd like to say that I discovered a gem of a channel today.
@KenJee_ds
2 years ago
Makes me really happy to hear!
@paigec5017
4 years ago
This video came at such good timing! I just taught myself python and started the titanic project today but was feeling so unsure about everything! Thank you for your videos!!
@KenJee_ds
4 years ago
Great stuff! Thank you for watching them!
@emmanuelagyemang3738
2 years ago
How did you teach yourself how to code?
@nikhilatluri1569
4 years ago
Thank Ken Jee For spending your time during this lockdown for educating youngsters like us
@KenJee_ds
4 years ago
Glad I could help!
@MonaChangizi
16 days ago
Thank you for this helpful video! I'm a real beginner in machine learning, but I love solving problems like this, and your content helps me on this journey. 😊
@josefftan1203
4 years ago
Aw, kaggle series here we goooo ♥️
@KenJee_ds
4 years ago
Enjoy!
@anurekha137
4 years ago
I am glad that I came across your channel. I always wanted to try the Titanic dataset on Kaggle but didn't. Now I'm gonna try it. Thanks.
@KenJee_ds
4 years ago
That is one of my favorite things to hear! It makes me really happy that my video helped you get started!
@2ash94
1 year ago
Wow this is a gold mine! Can't believe you went through all that work! Looking through all this, it seems like to become a great data scientist, it's not just about the skill. It is about intelligence and your ability to understand and see things that aren't clear to the normal human being. I have a fairly normal IQ and i am currently wondering if i should continue building my skills in order to become a data scientist.
@KenJee_ds
1 year ago
I don't think you have to have a high IQ. You can learn to ask the right questions and create frameworks for yourself. I could not have done the analysis in the same way when I started. I am certain you can learn to approach the problem in the same way I did!
@sarthaksharma070
4 years ago
Great video dude, exactly what i was looking for, its really great to see creators actually listening to the audience and working on it. Keep it up pal
@KenJee_ds
4 years ago
Glad it was what you were looking for!! Thanks for watching!
@communicationvast9949
2 years ago
fantastic video, my friend. I started this project in R studio, ran into some walls, and got extremely frustrated. Listening to your process is extremely helpful. Thanks for the upload.
@KenJee_ds
2 years ago
Thanks for watching!! Really glad to hear it was helpful
@denizbalkaya8356
4 years ago
Hi Ken....Deniz is speaking from Turkey! Your videos are helping me a lot! You force me to keep up :)
@KenJee_ds
4 years ago
Glad to hear they are helping! Thank you for watching!
@hendrywijaya1017
2 years ago
Ken, I think that in the project plan, the histograms and boxplots should come after the missing-data step. So here's the plan order from the top:
- understand the type of data
- value counts
- missing data
- histograms and boxplots
Then continue with the steps you listed:
- correlation analysis
- exploring interesting facts
and so on, until scaling.
@ahmedhassan9379
2 years ago
Thanks so much, I feel happy that I could understand 90% of the content. Months ago I didn't know a thing!
@KenJee_ds
2 years ago
Amazing!!
@moghegaurav
4 years ago
Love your videos, Ken. They are no-nonsense and stick to just DS. Your content is well made up and your voice is clear. Thanks for sharing your knowledge. I am sure with such quality content you will soon hit 100k subscribers and more.
@KenJee_ds
4 years ago
Thanks for the kind words and for watching my videos!
@hugochung9909
4 years ago
I've been following your videos for a while now and making my way through all the microcourses on Kaggle. This is the exact video I was looking for to begin the next stage of learning by diving into some data science projects . Top content and keep up the great work Ken!
@KenJee_ds
4 years ago
Thanks for the kind words! This is exactly what I like to hear haha. Glad you found it helpful!
@zahinnazhan7200
4 years ago
This is great walkthrough for beginner like me. Thanks Ken Jee
@KenJee_ds
4 years ago
Glad it was helpful Zahin!
@DataProfessor
4 years ago
Ken, Great video and great initiative! Sounds like fun, I also haven't done a Kaggle submission yet, will follow your path and do one soon.
@KenJee_ds
4 years ago
Let's definitely partner on one!
@salikmalik7631
4 years ago
@@KenJee_ds Yes. It'll be great to watch..
@DataProfessor
4 years ago
@@KenJee_ds Yes, let's definitely do that 😃
@DatascienceConcepts
4 years ago
Nice insights Ken Jee. In fact I remember working with this dataset in my early days of ML :)
@KenJee_ds
4 years ago
Awesome! I definitely think this dataset is a great starting point. It was even helpful for me to go back and review some of the basics!
@dakadoodle6295
4 years ago
Literally was looking at this today
@KenJee_ds
4 years ago
Awesome!
@Mario-ox5dm
4 years ago
I sense a rising Kaggle Grandmaster in the future!
@KenJee_ds
4 years ago
Haha I don't know about that! Long road ahead
@ImportData1
4 years ago
Learned something new - VotingClassifier!
@KenJee_ds
4 years ago
Awesome! Yeah, it is super useful and easy to use! Next time I will probably experiment more with some pipelines to clean up the feature engineering a bit!
@ImportData1
4 years ago
@@KenJee_ds I find the feature engineering/selection process the toughest. Sometimes you think you engineered features well enough, but the model accuracy doesn't necessarily reflect that. Would love to see how you experiment with pipelines!
@KenJee_ds
4 years ago
@@ImportData1 Yep! This is definitely the case where I could have done more!
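For readers new to the VotingClassifier mentioned in this thread, here is a minimal sketch of how it is typically wired up in scikit-learn. The synthetic data from `make_classification` and the three base models are assumptions chosen for illustration; they are not the models or features from the video's notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for a prepared feature matrix (assumption, not Titanic data).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

# Soft voting averages the predicted class probabilities of the base models,
# so every estimator in the list must implement predict_proba.
voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",
)

# 5-fold cross-validated accuracy of the ensemble.
scores = cross_val_score(voting_clf, X, y, cv=5)
print(scores.mean())
```

Switching to `voting="hard"` would use a majority vote over predicted labels instead, which also works with models that cannot output probabilities.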
@albertosei3558
1 year ago
I will try this very soon. Bookmarking this
@KenJee_ds
1 year ago
💪
@AIPlayerrrr
4 years ago
I’d be super interested in seeing you competing in a real Kaggle Competition.
@KenJee_ds
4 years ago
I will likely be trying one in a few months! Stay tuned!
@AIPlayerrrr
4 years ago
Ken Jee great! I am excited
@Gamma3
3 years ago
Me too! Great channel
@henriquebonacelli2981
4 years ago
Man, great video! I'm starting on data science and this hands-on project explanation was super helpful!
@KenJee_ds
4 years ago
Glad to hear it was helpful! Thank you for watching!
@anoopashware9539
3 years ago
Thank you, sir, for making this video. I can't put into words how much information is in it, which is really helpful for me to become a good data scientist. Thank you so much.
@KenJee_ds
3 years ago
Really glad to hear this video helped!
@abdelrahmanashraf7636
2 years ago
Thanks a lot for this video. I had been learning a lot of things and didn't know how to tie all the ropes together. This video did it. Thanks a lot Ken Jee :)
@KenJee_ds
2 years ago
Thanks for checking it out!
@AdityaKumar-cj2ms
4 years ago
It was a very insightful explanation of this project, really liked it. And, at cell [5] if you execute training.describe(include = "all"), it will also give you the values which appear the most for every categorical variable. Which I think can be really helpful.
@KenJee_ds
4 years ago
I actually didn't know that! Thank you for sharing!
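The `describe(include="all")` tip can be tried on a small example. The sketch below uses a tiny made-up frame named `training` as a stand-in for the Titanic training set (the values are assumptions); for text columns the summary adds `unique`, `top` (the most frequent value) and `freq` (how often it appears).

```python
import pandas as pd

# Hypothetical miniature of the training data; the values are made up.
training = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Sex": ["male", "female", "male", "male"],
    "Embarked": ["S", "C", "S", "S"],
})

# include="all" summarizes numeric and categorical columns in one table.
summary = training.describe(include="all")

# Rows that are only filled in for the categorical columns.
print(summary.loc[["unique", "top", "freq"]])
```

Numeric columns show NaN in the `unique`, `top` and `freq` rows, and categorical columns show NaN in the `mean`, `std` and quantile rows.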
@s8x.
5 months ago
thanks for this video. Just started this problem and realized I have no idea what I'm doing
@bianchialex
3 years ago
Just came here to see what you got. I used random forest and got .76 on my first try and then a little tuning got it to .77. I think I could make it better so I will continue to play around. I got to the point in my course curriculum where, mid-lecture, I said "this is more advanced than I need to get started on beginner projects" and instantly hopped off to do Titanic. I had it all worked up in my head to be some super hard task but it turned out to be relatively painless! I am going to do a couple other smaller projects and then try something of my own, probably using YouTube data because I am a massive geek for the algorithm.
@KenJee_ds
3 years ago
I think the smaller projects is a good idea! All about building some momentum!
@mustafamegahed7873
4 years ago
Great job! Thank you so much! Sadly, I have some work at college and couldn't finish the video but I will definitely come back to it hopefully next week.
@KenJee_ds
4 years ago
No problem! It is there for you to learn at your own pace!
@fahadreda3060
4 years ago
Thanks Ken, I was waiting for this video , Good Luck
@KenJee_ds
4 years ago
I hope you enjoy it Fahad!
@RichardOnData
4 years ago
Loving this video and the thumbnail dude!
@KenJee_ds
4 years ago
Thanks for noticing the thumbnail Richard! Would love to collab at some point if you're interested!
@RichardOnData
4 years ago
@@KenJee_ds Absolutely! My email is richardondata@gmail.com - I have a number of items on my backlog of videos that I'd love to cover in the future as I'm sure you do too, and some of them I think would make total sense! I'll drop you an email in a day or two myself.
@arthurmlcc
4 years ago
Keep up the great work you've been doing on this channel, Ken. It's really helping us beginners.
@KenJee_ds
4 years ago
I absolutely will! Thanks for watching!
@chinmaygondhalekar2591
4 years ago
Just the notification I was waiting for thanks man 👍
@KenJee_ds
4 years ago
I hope you enjoy!
@mimikoko4299
4 years ago
You have the best data science channel, I love you
@KenJee_ds
4 years ago
Thank you Mimi!
@daedalusdreamjournal5925
4 years ago
Hello there :) I haven't watched the full video yet, but there's a reason for this and is linked to a suggestion I'd like to propose to you for similar videos in the future: Despite being very VERY green in this, I decided to have a first go at this all by myself ... and boy was and is it still frustrating :P The reason behind this was that I wanted to try a first attempt without a guiding hand. Once I finished my first model, I quickly realized that there were tons of ways where I blundered like a total noob ... which is actually totally fine :) And despite the frustration of the experience, it felt like I gathered valuable experience from this. And it is only now that I am starting to watch this video .. but only bit by bit, as I want to try to do as much by myself as possible (mistake be damned since they are being done at home where it won't hurt anyone and where I can learn safely from the experience). SO my suggestion is this: Could it be possible for future similar videos to have it in several parts? Or, at the very least, to timestamp the different section of your handling of a particular problem? I feel like it could be very valuable, especially for very recent newcomers like me. Anyways, thanks a ton for your videos, very much appreciated ! (especially some of the code where you use apply and lambda functions to handle data transformations, this is definitely something that will be useful for me in the near and long future! :) Signed: A total newbie at this.
@KenJee_ds
4 years ago
This is a great idea! I think I will try the time stamp portion for the next one. I would also recommend my project from scratch series: kzitem.info/door/PL2zq7klxX5ASFejJj80ob9ZAnBHdz5O1t . I broke this one into each phase of the data science lifecycle. I think your approach is really great though! I highly recommend that for other people going through this.
@arick2050
3 years ago
Super informative, thanks Ken!
@KenJee_ds
3 years ago
Thanks for watching Aric!!
@manasagrawal8365
3 years ago
thanks Ken this was really helpful
@KenJee_ds
3 years ago
Thanks for watching!
@sadiakamal6866
4 years ago
Great job..Please do these sort of videos more often!
@KenJee_ds
4 years ago
Thank you for watching! Will definitely be trying to make more of these!
@hemantgautam1
4 years ago
Hi Ken, Please create a separate play list for kaggle videos. 🙂
@KenJee_ds
4 years ago
Will do!
@prabirbiswas440
4 years ago
Wow, what an in-depth analysis. You really put a lot of effort into this. This is my first try on Kaggle too. After spending this much time, I wonder how much time it will take for even tougher data. I also checked the House Rent Competition. It has 81 features. How can we do such a detailed analysis on all the features? Not sure how real-world ML problems are solved where there might be 100 or even more features. I am really excited to know more :)
@KenJee_ds
4 years ago
Thanks for watching! I will be doing the housing dataset next, so stay tuned!!
@ΧρυσόστομοςΠαπαδόπουλος-κ5π
4 years ago
I think it would be great if you could show how you would present this project in a markdown file in order to add it to your github. Thanks for the great work!!!
@KenJee_ds
4 years ago
I will work on it!
@solaawodiya7360
4 years ago
Hi Ken, thanks for the help on learning about data science. I struggle a lot using Kaggle to learn python. The user experience for me is quite intimidating compared to other platforms I used as there are times even when I know the question, I get lost on how to answer and follow the steps.
@deepakshiarora835
4 years ago
take a drink every time ken says actually.
@KenJee_ds
4 years ago
Oof, something I'm working on improving (actually) haha
@deepakshiarora835
4 years ago
@@KenJee_ds you're so humble (actually).
@jfr543
3 years ago
This video is gold!
@KenJee_ds
3 years ago
Thanks for the kind words! I'm glad you found it helpful!
@moajjem04
4 years ago
This video is a great help!
@KenJee_ds
4 years ago
Glad to hear! Thank you for watching!
@jonasschroder7244
3 years ago
Great! Very inspiring and helpful!
@KenJee_ds
3 years ago
Thanks for watching Jonas!
@bencantc2548
4 years ago
Amazing video! I hope you do a similar video on regression and clustering problems in the future!
@KenJee_ds
4 years ago
Thanks for watching! I plan to do a regression problem next!
@mohithedaoo6968
4 years ago
This was much needed... Thank you very much!!
@KenJee_ds
4 years ago
Happy I could help! Thank you for watching!
@kushagrayadav.fitness
4 years ago
Thank You Ken for providing this video...your new subscriber from India...🧡✌
@KenJee_ds
4 years ago
Awesome! Thank you for subscribing! I hope my other videos are helpful as well!
@nikhilatluri1569
4 years ago
@@KenJee_ds yes for sure Watched almost all your videos And got a lot of information in building my career
@kushagrayadav.fitness
4 years ago
@@KenJee_ds just finished my data science beginners playlist...🙂✌... after this going to start my first project for beginners....thank you so much, Ken, earlier I was going in the wrong path, I will be your fan...🧡🧡🧡want to get in touch with you please sir...
@karlduckett
3 years ago
Really awesome! My only minor criticism is that in the first half of the video, most of the pivot tables and charts are displaying counts. When comparing counts between categories (i.e. survival rate by age) it really needs to display the percentage of that grouping... Sadly I'm too much of a noob to figure it out just yet :(
@KenJee_ds
3 years ago
Totally fair, thanks for the feedback!
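For anyone stuck on the same point about counts versus percentages, here is a minimal sketch of two common pandas ways to get rates instead of raw counts. The small frame and its values are made up for illustration; only the column names follow the Kaggle Titanic data.

```python
import pandas as pd

# Hypothetical miniature of the training data; the values are made up.
training = pd.DataFrame({
    "Pclass":   [1, 1, 3, 3, 3, 2],
    "Sex":      ["female", "male", "male", "female", "male", "female"],
    "Survived": [1, 0, 0, 1, 0, 1],
})

# Because Survived is 0/1, its mean within a group is that group's survival rate.
rate_by_class = training.pivot_table(index="Pclass", values="Survived", aggfunc="mean")

# crosstab with normalize="index" turns each row of counts into shares that sum to 1.
share = pd.crosstab(training["Pclass"], training["Survived"], normalize="index")

print(rate_by_class)
print(share)
```

The first form is handy when only the survival rate matters; the second keeps both outcomes visible, which reads well next to a stacked bar chart.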
@gupnir
4 years ago
Hi Ken, your videos are really helpful for beginners like me. Can you do a similar walk-through video for House Prices problem as well.... thanks in advance.
@KenJee_ds
4 years ago
I plan to! Thanks for watching Nirmit!
@samuelwondim5906
4 years ago
This is just great
@KenJee_ds
4 years ago
Thanks for watching Samuel!
@dhristovaddx
4 years ago
Thank you for the great video! It's very helpful! ^_^
@KenJee_ds
4 years ago
Thanks for watching! Glad it was helpful!
@DarkPrince1996
3 years ago
You did a great job explaining your approach to solving the task at hand and walking us through the process and so Im wanting to know what would be the next steps for someone wanting to use this competition to learn data science? Like I dont have a detailed understanding of all the algorithms that you used in this competition so would it be best to pick the one that produced the best score and learn how to tune that particular algorithm model metrics to get a better score or would it be best to transfer your process to another beginner competition altogether to create a better understanding of the complete data science process as a whole?
@KenJee_ds
3 years ago
I think this is great for learning how to tune the algorithms and seeing what results you get with different ones. It is also a good one for practicing feature engineering like I did with some of the seats etc.. I think transferring things to another competition would be a good idea!
@DarkPrince1996
3 years ago
@@KenJee_ds appreciate your advice and I will definitely do that.
@risperbevalyn9670
4 years ago
Thanks a lot ken jee
@KenJee_ds
4 years ago
Glad I could help!
@AlexKite68
3 years ago
Thank you for this great video! I've already subscribed to your channel, digging to find lots of DS insights )) But please improve the audio quality in future videos: background noises are really frustrating, and the background music seems to be a little bit loud. But again, you're making a great resource that is very useful for Data Science beginners like me!
@KenJee_ds
3 years ago
Thanks for watching! I have adjusted the music in the newer videos
@muhammadtalmeez3276
4 years ago
thanks for this video
@KenJee_ds
4 years ago
Thanks for watching!
@shaikhkashif9973
1 year ago
U did feature Engineering first?? Then remove outliers 🤔
@amrelshabasy1183
2 years ago
Thanks, Ken for this great video. Can you please explain, how did you measure that the Model XGboost is overfitting?
@alexanderlindsey4066
4 years ago
Hi Ken, great video. Thank you! Please consider making a similar video with panel data!
@KenJee_ds
4 years ago
Thanks for watching Alexander! Can you expand on what you mean by panel data further?
@alexanderlindsey4066
4 years ago
@@KenJee_ds Time Series! Something like the M5 Forecasting challenge on Kaggle, or predicting house prices, predicting blood sugar metabolism (see www.diabits.com), other ideas.
@tobakudan
3 years ago
Awesome tutorial. I have a question. Why do you log normalize Sibsp and Fare in addition to using StandardScaler? What does the log normalization accomplish that the StandardScaler doesn't?
@KenJee_ds
3 years ago
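Thanks for watching! So StandardScaler only centers the data at mean 0 and rescales it to unit variance (it doesn't squeeze it into 0-1, that would be min-max scaling), and it doesn't change the shape of the distribution. When we use log norm we reduce the skew of the data which can sometimes help with the model. I hope this helps!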
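The difference between the two steps can be checked numerically. This is a minimal sketch on synthetic, right-skewed data standing in for Fare (the lognormal sample is an assumption, not the real column): standardizing shifts and rescales the values but leaves the skewness unchanged, while a log transform changes the shape itself.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic right-skewed values standing in for Fare (assumption, not real data).
rng = np.random.default_rng(0)
fare = pd.Series(rng.lognormal(mean=3.0, sigma=1.0, size=1000))

# StandardScaler: subtract the mean, divide by the standard deviation.
scaled = pd.Series(StandardScaler().fit_transform(fare.to_frame()).ravel())

# log1p = log(1 + x), which stays defined when a fare is 0.
logged = np.log1p(fare)

# Skewness is identical before and after standardizing, and much smaller after the log.
print(fare.skew(), scaled.skew(), logged.skew())

# Standardized values are not confined to the 0-1 range.
print(scaled.min(), scaled.max())
```

This is why the two steps can be combined: the log reshapes the distribution, and the scaler then puts the reshaped feature on a scale comparable with the other features.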
@gz4978
6 months ago
Aren't some fields "redundant"? I mean, female with one spouse is almost automatically a "Mrs." and I don't know whether that leads to a bias or not. Thank you
@OnlineGreg
2 years ago
13:27 why exactly do you choose 'Ticket' for values in the pivot-table? Dont understand that
@Mohamedasdfgpo
2 years ago
As a beginner, this was a bit too complex for me. I wish you had made it a bit more structured and simple.
@KenJee_ds
2 years ago
Will try to make more straightforward ones in the future
@jmiller1095
2 years ago
Ken, I'm a long time listener first time caller :) This is a terrific video .. and I have one question! ... at around 22:00 you clearly tell us that concatenating training and test datasets together and then pre-processing them all together is NOT the way it should be done in real -world (real world way, as I understood: train encoder scaler on training dataset then transform test dataset using encoder scaler trained (only) from training dataset). So .. do you have (or do you know of one) similar to this one but which demos the real world way of doing business?
@KenJee_ds
2 years ago
Good question! I don't have one on my channel, but I think Nicholas Renotte probably has a tutorial on his where he does it correctly. There are plenty of ways to do it correctly. You can create functions that transform the test set in the same way as you did the training set; more specifically, you can use the sklearn Pipeline. If you go to the Kaggle kernels, I expect there will be quite a few that do it correctly. Thanks for watching!
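As one possible illustration of the pipeline approach discussed in this thread, here is a minimal sketch in which the imputer, scaler and encoder are fitted on the training rows only and then reused on the test rows. The tiny frames, the column selection and the choice of logistic regression are assumptions for illustration, not the setup from the video.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical miniature train/test frames; the values are made up.
train = pd.DataFrame({
    "Age": [22.0, None, 26.0, 35.0, 54.0, 2.0],
    "Fare": [7.25, 71.28, 7.92, 53.1, 51.86, 21.07],
    "Sex": ["male", "female", "female", "female", "male", "male"],
    "Survived": [0, 1, 1, 1, 0, 0],
})
test = pd.DataFrame({
    "Age": [30.0, None],
    "Fare": [8.05, 30.0],
    "Sex": ["male", "female"],
})

# Numeric columns: fill missing values with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Apply the numeric steps and one-hot encoding to their respective columns.
preprocess = ColumnTransformer([
    ("num", numeric, ["Age", "Fare"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Sex"]),
])

model = Pipeline([
    ("prep", preprocess),
    ("clf", LogisticRegression(max_iter=1000)),
])

# fit() learns medians, means, variances and categories from the training rows only.
model.fit(train.drop(columns="Survived"), train["Survived"])

# predict() reuses those learned statistics on the test rows; nothing is refit.
predictions = model.predict(test)
print(predictions)
```

Because the whole pipeline is a single estimator, it can also be passed to `cross_val_score`, so each fold refits the preprocessing on its own training split.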
@abrahamowos
3 years ago
I didn't understand almost 50% of the walkthrough, got after EDA ☹☹😫. This walkthrough seems to be a bit too advanced.😲
@KenJee_ds
3 years ago
I have some lighter tutorials which may be easier to digest as well!
@dunghuy6389
4 years ago
Hello, Thanks for the video. I heard you told about deep learning for this dataset (included categorical and non categorical features). It is a typical data that we usually see. Could you please make a video and build a deep learning model?
@KenJee_ds
4 years ago
I will use deep learning in an upcoming video!
@cshivani
4 years ago
Thanks!
@KenJee_ds
4 years ago
I hope you enjoyed it!
@maYYidtS
4 years ago
bro your videos are very helpful...please suggest me. how do I increase my data preprocessing, data analysis skills. few tips ?
@KenJee_ds
4 years ago
I recommend looking through other people's kaggle notebooks. You can see the approaches other people take to process and analyze data! Next you can apply it on your own projects. I hope this helps and thanks for watching!
@A__SB
3 years ago
Great work! I was just wondering why you would create dummy variables for features like "Age", "SibSp", "Parch". These features are either float or int. They are continuous and not categories/strings like name_title.
@KenJee_ds
2 years ago
Not sure why I didn't do that for age, but the others were in small enough numbers that I thought it made sense to. If SibSp is only a few categories, I thought it made more sense to just look at them in isolation as dummy vars.
@A__SB
2 years ago
@@KenJee_ds Thank you for the reply, Ken Jee!
@jackagass5211
4 years ago
Hi Ken. Great video and walk-through. One question - why have you created dummy variables on the numeric features?
@KenJee_ds
4 years ago
pd.get_dummies() is smart enough to know which are numeric and which are not. By default it should only make dummy variables for columns with dtype 'object' (or string/category).
@jackagass5211
4 years ago
@@KenJee_ds Thanks for your quick reply! OK makes sense, thanks for the explanation. Love the channel by the way and looking forward to the next video!
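The default behaviour described in this thread can be seen on a small frame. This is a minimal sketch with made-up values: by default `pd.get_dummies` leaves numeric columns alone, and passing `columns=` forces it to encode a numeric column such as `SibSp` as well.

```python
import pandas as pd

# Hypothetical miniature frame; the values are made up.
df = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0],
    "SibSp": [1, 1, 0],
    "Sex": ["male", "female", "female"],
})

# Default: only the text column is expanded, numeric columns pass through.
default = pd.get_dummies(df)
print(list(default.columns))  # ['Age', 'SibSp', 'Sex_female', 'Sex_male']

# Explicit columns=: numeric SibSp is treated as categorical too.
forced = pd.get_dummies(df, columns=["SibSp", "Sex"])
print(list(forced.columns))   # ['Age', 'SibSp_0', 'SibSp_1', 'Sex_female', 'Sex_male']
```

Another common way to get the second behaviour is to convert the column to strings first with `astype(str)`, so the default rule picks it up.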
@fellygraytv1551
4 years ago
Hi, why did you type the following?
training["train_test"] = 1
test['train_test'] = 0
test['survived'] = np.NaN
@KenJee_ds
4 years ago
For some of the feature engineering, I combine the whole data set. I used the [1,0] to mark which set the data came from so I can split it again later. I added the survived with np.NaN because the test data doesn't have that column. In order to join the data how I did, it needed to include that column. I hope this helps to clarify things!
@hanghang1930
4 years ago
The hint is actually in the following line. I had the same doubt as you, but when I saw `pd.concat()`, it started to make sense.
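The marker-column trick from this thread can be sketched end to end as below. The frames are tiny made-up stand-ins, the `is_child` feature is a hypothetical example of shared feature engineering, and the sketch uses `Survived` (the capitalization in the Kaggle files) and `np.nan` (the spelling that also works in NumPy 2.x).

```python
import numpy as np
import pandas as pd

# Hypothetical miniature train/test frames; the values are made up.
training = pd.DataFrame({"PassengerId": [1, 2, 3], "Age": [22.0, 38.0, 26.0], "Survived": [0, 1, 1]})
test = pd.DataFrame({"PassengerId": [4, 5], "Age": [35.0, 27.0]})

# Mark where each row came from so the sets can be separated again later.
training["train_test"] = 1
test["train_test"] = 0

# The test set has no label, so give it an empty placeholder column.
test["Survived"] = np.nan

# Stack both sets and apply the same feature engineering to every row.
all_data = pd.concat([training, test], ignore_index=True)
all_data["is_child"] = (all_data["Age"] < 18).astype(int)  # hypothetical feature

# Split back using the marker, dropping helper columns that are no longer needed.
train_again = all_data[all_data["train_test"] == 1].drop(columns="train_test")
test_again = all_data[all_data["train_test"] == 0].drop(columns=["train_test", "Survived"])
```

`pd.concat` would also fill a missing column with NaN on its own, so the explicit placeholder mainly makes the intent visible. As noted elsewhere in these comments, fitting scalers or encoders on the combined frame leaks test information, so this pattern is a Kaggle convenience rather than a general practice.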
@danasharon4752
2 years ago
Thank you for this video! On a separate note, how do you get your plots to be so colorful?
@KenJee_ds
2 years ago
Thanks for watching! I owe it all to the software haha
@TV-in3tt
4 years ago
Ken Jee👍👍👍
@KenJee_ds
4 years ago
😆
@botondpall8280
3 years ago
Is this a binary logit model?
@KenJee_ds
3 years ago
One of the models is
@amitsaurabh9948
4 years ago
Hey, It was kind of advanced for me to do it on my own. What should I learn before doing these projects, can you share some source. Thanks for the amazing video
@KenJee_ds
4 years ago
Thanks for watching still! I recommend kaggle.com. They have free micro courses that walk you through all the parts of a project!
@amitsaurabh9948
4 years ago
@@KenJee_ds Thanks for the reply, they are really great to get going
@ssrk369
2 months ago
As a complete beginner, can anyone tell which topics should I learn before trying these types of competition.. anyone?
@randomizing1000
3 years ago
Great video! Do you perhaps know whether a beginner in ML needs good knowledge of statistics, and advanced knowledge of calculus and linear algebra, to even start learning it?
@KenJee_ds
3 years ago
I think to make a career out of Machine Learning you need to have a good understanding of all three. For a beginner, a basic understanding of statistics is all you need
@randomizing1000
3 years ago
@@KenJee_ds Thank you sir Ken!
@sessario982
1 year ago
Great video! I was super clueless before this video, but after watching it I became much less clueless, and also found out that I was even more clueless haha! I just moved over from C to Python and see a lot of not so familiar Python code (I only have the basics😭). Can you tell me what parts of Python I had to go through to start with? Thanks again for the video!
@larryhatcher8927
6 months ago
Hi, It's been 5 months since you wrote this. Just wonder if you have stuck with it. The Python is easy but it takes practice every day
@sessario982
6 months ago
@@larryhatcher8927 helloo, yeah so when i started this, i was trying to get into data science for my concentration in computer science. But i had a change of mind and decided to pursue software engineering instead because i feel like its just more fun making applications and stuff. For my python progression, i study here and there and now i am not so clueless anymore thankfully, its still surprising sometimes that there are libraries that can make things so much easier because i was taught in college to do things from scratch haha, for ML codes though might need some more work. Overall its been going great thanks for asking😊
@DhrECraig
3 years ago
Hey Ken Jee, thank you for the video, it's helping me a lot. :) I wanted to ask though, how did you know that you overfitted (the spoiler alert) with XGBoost?
@KenJee_ds
3 years ago
If your model scores well on the training data but starts producing noticeably worse results on your test or validation set, it is likely a sign of overfitting!
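A common way to check this is to compare the score on the rows the model was trained on with the score on held-out rows. The sketch below is an illustration only: it uses synthetic noisy data and a decision tree as a stand-in (not XGBoost and not the notebook's data), because an unrestricted tree makes the effect easy to see.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with noisy labels (assumption; flip_y randomizes about 20% of labels).
X, y = make_classification(n_samples=400, n_features=10, flip_y=0.2, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

# An unrestricted tree can memorize the training rows, including the noise.
deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# A depth-limited tree has less capacity to memorize.
shallow = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)

# A large gap between training and validation accuracy points to overfitting.
for name, model in [("deep", deep), ("shallow", shallow)]:
    train_acc = model.score(X_train, y_train)
    val_acc = model.score(X_val, y_val)
    print(f"{name}: train={train_acc:.2f} val={val_acc:.2f} gap={train_acc - val_acc:.2f}")
```

On Kaggle the same comparison applies one level up: a cross-validation score that is clearly higher than the public leaderboard score suggests the model, or its tuning, has fitted the training data too closely.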
@fadinayfeh4490
3 years ago
Amazing video. I've got a question: don't we need to detect outliers in the data for the models? Or is it not a necessary step in classification?
@KenJee_ds
3 years ago
It can be useful here. I can't remember off the top of my head if I did it haha. It really depends on the models you use as well. Some are not as sensitive to outliers.
@jackychan4640
2 years ago
I am taking the test, but it sounds impossible to pass ! Any idea to the test
@gormikayelyan9793
4 years ago
Udacity or Coursera to gain valuable data science knowledge???
@KenJee_ds
4 years ago
They are fine, but I prefer the micro courses on kaggle.com. I also do some work with 365 data science, and recommend their course!
@rocioseltzer
3 years ago
Hi Ken! Thank you for this and all other videos. That's very generous of you. Very valuable information. My company is subscribed to Coursera and I was wondering what your opinion is of their Data Science courses. I have seen many of your videos, but I think you haven't mentioned it. Is it because of cost, quality?
@KenJee_ds
3 years ago
I think the Coursera courses are great! I usually don't say anything since I haven't personally taken them. I pretty much only make recommendations on courses if I have personally taken them, or I can get my subscribers a big discount on them. I hope this helps!
@rocioseltzer
3 years ago
@@KenJee_ds That makes total sense. Thanks heaps!
@sarinajami9900
3 years ago
I have a question. in the preprocessing stage you use test set along with training set and you said you do it because in this case we want to know about the test set, but I don't know why. aren't we supposed to predict the output of test set? why do we need to have some information about it in our prediction model then? isn't data leakage happening here?
@KenJee_ds
3 years ago
Thanks for watching Sarina! I think I mention that this is just so that I can make sure the data is the same for both sets. It isn't a good practice outside of kaggle competitions because of data leakage though. Kind of me being lazy 😅
@sinan_islam
2 years ago
The music in the background is very distracting. I keep looking around for the noise.
@lindeanchuang8115
2 years ago
Hello, Ken, I wanna know about if there are many files in the data set in kaggle competition, how can I use Jupyter or colab to do code. Thanks a bunch.
@KenJee_ds
2 years ago
You can read in the multiple files into the environment! Often you can join them together as well
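Reading several files and joining them might look roughly like the sketch below. To stay self-contained it first writes two small made-up CSV files into a temporary folder; the file names, columns and the `id` join key are all hypothetical. In a Kaggle notebook the competition files usually sit under `/kaggle/input/`, and in Jupyter or Colab the folder would be wherever the data was downloaded or uploaded.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Create a throwaway folder with two small CSV files (stand-ins for competition files).
data_dir = Path(tempfile.mkdtemp())
pd.DataFrame({"id": [1, 2], "fare": [7.25, 71.28]}).to_csv(data_dir / "fares.csv", index=False)
pd.DataFrame({"id": [1, 2], "age": [22, 38]}).to_csv(data_dir / "ages.csv", index=False)

# Read every CSV in the folder into a dict keyed by file name without the extension.
frames = {path.stem: pd.read_csv(path) for path in sorted(data_dir.glob("*.csv"))}

# Join two of the tables on their shared key column.
merged = frames["fares"].merge(frames["ages"], on="id", how="left")
print(merged)
```

When the files share the same columns (for example one file per month), `pd.concat(frames.values())` stacks them instead of joining them side by side.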
@lbryan250
4 years ago
Thanks Ken! I'm working on the same competition and that was super helpful! Quick Q - I'm trying to build up my DS portfolio, do you think it's worth my time to do the whole thing from scratch (i.e. EDA, feature engineering, modelling) and to dwell a bit more on the theoretical stuff? Put another way, is it more important in these types of projects to show knowledge of how these models actually work, or to show that you can use tools and libraries (like sklearn) proficiently?
@KenJee_ds
4 years ago
Thanks for watching Brian! Realistically, I wouldn't consider the Titanic dataset a worthy project for a resume. It is ideal for learning. What will impress employers is how you go about attacking these problems. I am always impressed when someone comes to a unique conclusion. I also like seeing unique things with feature engineering or pipeline to transform the data. I would try to show that you look at problems differently rather than just throwing a bunch of models at it. I hope this helps!
@quinnherden
3 years ago
Very cool! By the way, you wrote "compention", instead of "competition" in your notebook's opening remarks
@KenJee_ds
3 years ago
Thanks for catching that! It should be fixed now!
@ShmeegleSon
3 years ago
woo!
@KenJee_ds
3 years ago
Hope you enjoyed it!
@yudhisthirsingh8401
3 years ago
Hi Ken, Great Video. I have a doubt though, when you say you are creating categories based on the first letter of the cabin why are you ignoring the second letter (some have different lettered cabin as the second cabin). eg -> F E69
@KenJee_ds
3 years ago
It's worth testing for multiple letters! I didn't think there would be much benefit for how much longer it would take me to do that. Definitely give it a try if you want though!
@yudhisthirsingh8401
3 years ago
@@KenJee_ds Thanks for the reply. You're doing a great job, keep it up. :)
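For anyone who wants to test the multi-letter idea from this thread, here is a minimal sketch of both variants. The cabin strings are made-up examples in the style of the dataset, and the `"missing"` label is an assumption for illustration, not necessarily how the notebook encodes missing cabins.

```python
import pandas as pd

# Hypothetical cabin values, including a missing one and multi-cabin entries.
cabins = pd.Series(["C85", None, "F E69", "B57 B59 B63", "G6"])

# Variant 1: only the first character of the string; missing cabins get their own label.
cabin_letter = cabins.str[0].fillna("missing")

# Variant 2: the distinct deck letters across all cabins listed for a passenger.
all_letters = cabins.fillna("").apply(
    lambda s: sorted({part[0] for part in s.split()})
)

print(cabin_letter.tolist())  # ['C', 'missing', 'F', 'B', 'G']
print(all_letters.tolist())   # [['C'], [], ['E', 'F'], ['B'], ['G']]
```

Entries like "F E69" are the only ones where the two variants differ, so whether the extra work pays off depends on how many such rows there are.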
@DevilTidusX
3 years ago
Since the test set has no output (Survival), should we even consider it in our EDA? Why not doing just on top of the training set?
@KenJee_ds
3 years ago
I think it is worth including just to get a better understanding of what is all there. You can also see which non dependent variables are highly correlated. This could be useful depending on the question you are trying to answer.
@rushikeshgandhmal
4 years ago
Hello really appreciate your efforts. I'm ML beginner, done with only theory algorithms no hands on experience yet. How should I start and from where ? Need few suggestions.
@KenJee_ds
4 years ago
I would start by trying to apply these. Some of the basic kaggle projects like this one are ideal.
@rushikeshgandhmal
4 years ago
@@KenJee_ds Thank you for your response. Love your all videos. Thanks )
Comments: 445