𝗞𝗡𝗡 (K-Nearest Neighbors) is a supervised machine learning algorithm, used here for a classification task. It determines the 𝘤𝘭𝘢𝘴𝘴 of a new data point by majority vote among its 𝗞 nearest neighbors in the feature space, where 𝗞 is a positive integer. The most common label among those neighbors becomes the class of the new point.
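To make the majority vote concrete, here is a minimal from-scratch sketch in plain Python. It is not the code from the video (which uses Scikit-Learn); the function name `knn_predict`, the Euclidean distance metric, and the toy points are all assumptions chosen for illustration.

```python
from collections import Counter
import math


def knn_predict(train_points, train_labels, query, k):
    """Return the majority label among the k training points closest to query."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep only the k nearest neighbors.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Count the labels among those neighbors and return the most common one.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2D points for illustration only (not the wheat seed data).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["A", "A", "A", "B", "B", "B"]

prediction = knn_predict(points, labels, query=(1.1, 1.0), k=3)
print(prediction)
```

With an even K the vote can tie. In this sketch a tie goes to the label of the closest tied neighbor, which is only one of several possible tie-breaking rules.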
I used the 𝐰𝐡𝐞𝐚𝐭_𝐬𝐞𝐞𝐝.𝐜𝐬𝐯 dataset for this example. It contains three varieties of wheat seed. I initially chose 𝐊=𝟒 but later found that 𝐊=𝟐 gives the same result as 𝐊=𝟒 and 𝐊=𝟏𝟎 on this dataset. When the results are equal, I prefer the lower value of 𝐊.
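A rough sketch of how such a comparison across K values might look with Scikit-Learn. Since wheat_seed.csv and its column names are not reproduced here, the data is a synthetic three-class stand-in from `make_classification`; its shape (210 rows, 7 features), the 70/30 split, and the random seed are assumptions, so the printed scores will not match the video.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for wheat_seed.csv (assumed shape; three classes as in the source).
X, y = make_classification(
    n_samples=210,
    n_features=7,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

# Hold out 30% of the rows for testing, keeping class proportions similar.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Fit one classifier per K and record its accuracy on the held-out rows.
scores = {}
for k in (2, 4, 10):
    model = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
    scores[k] = model.score(X_test, y_test)

for k, accuracy in scores.items():
    print(f"K={k}: accuracy={accuracy:.3f}")
```

A single train/test split can make different K values look identical by chance, so repeating the comparison with other seeds or with cross-validation would give a steadier picture.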
𝙄𝙢𝙥𝙤𝙧𝙩𝙖𝙣𝙩 𝙩𝙞𝙢𝙚𝙨𝙩𝙖𝙢𝙥𝙨:
00:34 - Import required libraries
01:58 - Load 𝐰𝐡𝐞𝐚𝐭_𝐬𝐞𝐞𝐝 dataset
03:45 - Visualize selected features
08:40 - Separate features and labels
09:48 - Split the dataset
11:21 - Apply 𝐊𝐍𝐍
12:35 - Plot 𝐜𝐨𝐧𝐟𝐮𝐬𝐢𝐨𝐧_𝐦𝐚𝐭𝐫𝐢𝐱
17:00 - Print 𝐜𝐥𝐚𝐬𝐬𝐢𝐟𝐢𝐜𝐚𝐭𝐢𝐨𝐧_𝐫𝐞𝐩𝐨𝐫𝐭
17:39 - Scores for different 𝐊
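The steps from splitting the dataset through printing the classification report can be sketched roughly as follows. This is not the notebook from the video: the data is a synthetic three-class stand-in rather than wheat_seed.csv, and the split ratio, random seed, and variable names are assumptions. 𝐊=𝟒 is used because it is the value mentioned as the initial choice.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Features (X) and labels (y) from a synthetic stand-in for the real CSV.
X, y = make_classification(
    n_samples=210,
    n_features=7,
    n_informative=4,
    n_redundant=0,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=42,
)

# Split the dataset into training and test portions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Apply KNN with K=4 and predict the held-out labels.
knn = KNeighborsClassifier(n_neighbors=4)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Rows of the confusion matrix are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, y_pred)
print(cm)

# Per-class precision, recall, and F1 score.
print(classification_report(y_test, y_pred))
```

The matrix is printed here instead of plotted so the sketch runs without a display. `sklearn.metrics.ConfusionMatrixDisplay` is one way to draw it as a figure.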
𝑮𝒊𝒕𝑯𝒖𝒃 𝒂𝒅𝒅𝒓𝒆𝒔𝒔: github.com/ran...
#datascience #knearestneighbors #KNN #machinelearning #supervisedlearning #supervisedclassification #jupyternotebook #python