QUESTION OF THE DAY: Do you run into any problems when repurposing / adapting / improvising R code (from tutorials, websites or KZitem) for your own projects? I would love to hear how you solved those problems. Comment down below! 😃
@manasavegesna310
4 years ago
A great introduction to Machine Learning using R! I really enjoyed learning how to analyze different datasets. I found that the dhfr dataset is not accessible when using the datasets package alone, but once the caret package is loaded I was able to access it and analyze the data in RStudio.
@DataProfessor
4 years ago
Thanks for the comment. Yes, you are correct, the dhfr dataset comes with the caret package.
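For anyone hitting the same issue, a minimal sketch of loading the dataset, assuming the caret package is already installed (the `dim` and `table` calls are just illustrative checks, not steps from the video):

```r
# Assumes caret is installed, e.g. via install.packages("caret")
library(caret)

# dhfr ships with caret, not with the base datasets package,
# so data(dhfr) only works once caret is attached
data(dhfr)

dim(dhfr)      # number of compounds (rows) and columns
table(dhfr$Y)  # class distribution of the label column Y
```

Calling `data(dhfr, package = "caret")` should also work without attaching the package first.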
@gurudeebanselvaraj8888
4 years ago
The lecture is really great. Thanks, Prof. Chanin. I would like to know what the next step after cross-validation is: how can we get the top 10 compounds for repurposing?
@DataProfessor
4 years ago
Hi, this video talks about repurposing of the Python code, whereas you are asking about drug repurposing. For drug repurposing, we would have to introduce features of several proteins into the model. The videos I have made on this channel so far consider only compound features (a field known as quantitative structure-activity relationship, or QSAR), whereas considering both compound and protein features in the same model is known as proteochemometric modeling. Actually, the inventor of proteochemometric modeling, Prof. Jarl Wikberg from Uppsala University, is a good friend of mine. I will definitely make future videos about this. 😃
@gurudeebanselvaraj8888
4 years ago
@@DataProfessor Thanks. I currently have a list of X-inhibitors (650 compounds with chemical properties), similar to "dhfr". I would like to know how to extract the best or top compounds using ML. I am sure that, if it works, we can collaborate on future research papers. Thanks
@desmondojei3868
4 years ago
Hi, just a quick question. Looking at 17:49 in the video, when predicting with the CV model, why did you still use the training.set and not the test.set? Thanks
@DataProfessor
4 years ago
Hi Desmond. The data set is normally split into a training set and a test set. The training set is then used to evaluate the internal predictivity of the model in two ways: by building a training model (which is then also evaluated on the testing set), and by building a cross-validation model, which splits the training data into n folds, iteratively constructs n models, evaluates each on its left-out fold, and finally averages the performance over the n models. A more in-depth explanation is provided in a previous video I made on the overview of the machine learning model building process kzitem.info/news/bejne/o4Whl5yifIKIY5g as well as in an infographic github.com/dataprofessor/infographic/blob/master/01-Building-the-Machine-Learning-Model.JPG
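The workflow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the exact code from the video: the 80/20 split, the seed, the 10 folds and the `rpart` learner are stand-ins chosen to keep the example light, and `training_set`, `testing_set` and `cv_model` are hypothetical names.

```r
# Assumes caret (with its dependencies) and rpart are installed
library(caret)
data(dhfr)

set.seed(100)  # arbitrary seed, only for reproducibility

# Stratified split into training and test sets (80/20 is an assumption)
idx <- createDataPartition(dhfr$Y, p = 0.8, list = FALSE)
training_set <- dhfr[idx, ]
testing_set  <- dhfr[-idx, ]

# Cross-validation is run on the training set only:
# the data is split into 10 folds, each fold is held out once
cv_model <- train(Y ~ ., data = training_set,
                  method = "rpart",  # stand-in learner, not the video's model
                  trControl = trainControl(method = "cv", number = 10))

# Internal predictivity: performance averaged over the held-out folds
cv_model$results

# External predictivity: the test set is touched only at this point
test_pred <- predict(cv_model, testing_set)
mean(test_pred == testing_set$Y)  # test-set accuracy
```

The point is that the test set plays no part in cross-validation; the fold-averaged performance sits in `cv_model$results`, while the test set is kept aside for a separate external check.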
Comments: 10