I've learned this kernel thing in a college class, in Andrew Ng's ML courses, and many other times, but this is literally the best explanation so far. It really blew my mind once I grasped how the Gaussian kernel can help me work in infinite dimensions!
@LeHaroun
A year ago
I am a professor of engineering, and I have to say that your chain of thought and your drive are just amazing. So are the simplicity of the explanation and the energy in this video. Keep it up.
@sgprasad66
3 years ago
Hey Ritvik, I've been trying to intuitively understand the kernel–infinite-dimension link for ages now and had not remotely come close to doing it, but your one video has melted the fuzziness away in a trice. Thank you so much.
@yasminejaafar
3 years ago
thank you thank you, I just finished my test, your Markov videos were amazing and helped me a lot, thank you again
@lizbethtirado331
2 years ago
Thank you for such an excellent explanation! This helps me understand ML models better!
@ritvikmath
2 years ago
Thanks!
@ektabansal7645
A year ago
Our professor had just told us that it is an RBF kernel, and I was not convinced, but your video helped me believe it. This is amazing, thanks a lot, sir.
@pnorton8063
2 years ago
Thank you. You warped my fragile little mind. Fresh air. I love the RBF. Well presented. Nice zest too
@Alexweno07
A year ago
Best explanation. Thank you so much!
@ritvikmath
A year ago
Glad it was helpful!
@yesitisnt5738
10 months ago
Man, you are just amazing! Every time I come across something I don't quite get in machine learning theory, there you are! Thanks a million!
@scottzeta3067
A year ago
I don't understand how my teacher turns 8 minutes of content into a confusing and boring 1-hour class.
@tejassrivastava6971
A year ago
Amazing concept with amazing explanation !! Hats off to you !!
@ritvikmath
A year ago
Glad you liked it!
@jx4864
2 years ago
This is art, really nice explanation.
@aliciaflorrie8390
2 years ago
Thank you, I have learned a lot about kernel functions.
@cameronbaird5658
A year ago
Amazing explanation
@lilin7535
A year ago
thank you!!! so good.
@abroy77
3 years ago
What has my life become. I genuinely anticipate the release of new math videos smh. Thanks for the great videos though :)
@harishmalik6935
2 years ago
Sir you deserve a million subscribers. Hope you get soon what you deserve 😊
@axadify
3 years ago
Amazing explanation. Keep up the good work!
@javiergonzalezarmas8250
A year ago
Beautiful
@ritvikmath
A year ago
Thank you! Cheers!
@akshaymulgund4947
A year ago
I just got emotional. What a video.
@piotrpustelnik3109
2 years ago
Brilliant!
@hu5116
A year ago
Love your videos! I have just gone through the SVM and kernel videos. However, I feel a little like I'm on a cliffhanger. That is, I now understand SVMs, and I see where you're going with kernels, but it seems there needs to be a follow-on video that finally links the kernel back explicitly to the SVM and shows how the kernel is then explicitly used to do the classification. Specifically, what is missing in this video (or, more accurately, needed in a follow-on video) is the linkage back to the alphas of the Lagrangian, or to w and b, because in the end that is what defines the discrimination line. That last piece is tantalizingly missing (i.e., hint for the next video ;-) ). Thanks!
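The linkage this comment asks about is the kernelized decision function f(x) = Σᵢ αᵢyᵢK(xᵢ, x) + b, summed over the support vectors. A sketch of it, assuming scikit-learn is available (the toy data and gamma are illustrative): scikit-learn's `SVC` exposes the products αᵢyᵢ as `dual_coef_`, so the decision function can be rebuilt by hand from the kernel.

```python
import numpy as np
from sklearn.svm import SVC

# A kernel SVM classifies via f(x) = sum_i alpha_i * y_i * K(x_i, x) + b,
# summed over support vectors. SVC stores alpha_i * y_i in dual_coef_,
# the support vectors in support_vectors_, and b in intercept_.

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)  # toy circular problem

clf = SVC(kernel="rbf", gamma=1.0).fit(X, y)

def rbf(a, b, gamma=1.0):
    # Pairwise RBF kernel matrix: exp(-gamma * ||a_i - b_j||^2)
    d2 = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=-1)
    return np.exp(-gamma * d2)

Z = rng.normal(size=(5, 2))  # query points
manual = clf.dual_coef_ @ rbf(clf.support_vectors_, Z) + clf.intercept_
```

`manual.ravel()` matches `clf.decision_function(Z)`, which makes the role of the alphas in the final discrimination line explicit.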
@ericchristoffersen9355
A year ago
This was fun! There are so many different ways to explain things. In this case you've based the explanation on "properties of the kernel," which seems a bit stodgy. Maybe there are other "street math" explanations of why this kernel is so great? For example, e^x is its own derivative. Why does it use the 2-norm? Would a 4-norm be OK too? Why the -1/2? It turns out there are lots of variations of the RBF that work just as well; this canonical edition is often the most efficient. I think it would be fun to see the RBF in action on a thorny classification problem: why do the operators of the RBF work so well, and what makes the wrong variations work poorly?
@shivkrishnajaiswal8394
2 years ago
4:50 I think the reason why you are able to make that a constant (even the terms involving xi) is that xi is normalized, so xi.T @ xi = 1.
@jiaheliu6431
A year ago
Great point! Maybe he forgot to mention this in the video. I think without this condition the definition of the high-dimensional feature vector is not consistent.
@jiaheliu6431
A year ago
Sorry, I was wrong. The term exp(xi^T * xi) is just a scalar, and it's part of the function that defines the high-dimensional feature vector.
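The factorization this thread is discussing can be checked numerically: the RBF kernel splits into two per-point scalar factors and a cross term, so no normalization of xi is needed. A minimal sketch with NumPy (the vectors are arbitrary, deliberately not unit-norm):

```python
import numpy as np

# K(xi, xj) = exp(-||xi - xj||^2 / 2)
#           = exp(-xi.xi / 2) * exp(-xj.xj / 2) * exp(xi.xj)
# The first two factors depend on one point each, so they can be folded
# into the feature map itself; xi.xi = 1 is not required.

rng = np.random.default_rng(0)
xi, xj = rng.normal(size=3), rng.normal(size=3)

k_direct = np.exp(-np.sum((xi - xj) ** 2) / 2)
k_factored = np.exp(-xi @ xi / 2) * np.exp(-xj @ xj / 2) * np.exp(xi @ xj)
```

Both expressions agree to floating-point precision, confirming that exp(xi^T xi / 2) is just a per-point scaling inside the feature map.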
@houyao2147
3 years ago
So cool to understand this infinite-dimensional view. How do you avoid overfitting with such a powerful model?
@ritvikmath
3 years ago
Good question! I have an SVM kernels coding video coming soon that will answer that.
@preritchaudhary2587
2 years ago
@@ritvikmath Hello, sir. Can you create a video on what role the hyperparameters play in SVMs?
@vanamalivanam1397
A year ago
This is probably because we can differentiate/integrate e^x infinitely many times and it always results in the same function, e^x.
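A related way to see where the infinite dimensions come from is the Taylor series exp(t) = Σₙ tⁿ/n!, which never terminates: each term corresponds to a block of polynomial features, so the feature map has infinitely many coordinates. A small convergence check (the points are illustrative):

```python
import math
import numpy as np

# Truncating exp(xi.xj) = sum_{n>=0} (xi.xj)^n / n! at 10 terms:
# every added term is another "block" of polynomial features, so the
# full feature map is infinite-dimensional.
xi, xj = np.array([0.3, -0.1]), np.array([0.2, 0.4])
t = xi @ xj

partial = sum(t ** n / math.factorial(n) for n in range(10))
```

The 10-term partial sum already matches np.exp(t) to floating-point precision for these small inputs, but no finite truncation equals it exactly.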
@zwitter689
A year ago
Just great! Would you do a few examples (preferably in Python) and make the code available?
@patrick_bateman-ty7gp
10 months ago
Did anyone have their Oppenheimer moment while understanding the RBF kernel? I did.
@KrischerBoy
2 years ago
Absolutely brilliant! Could you maybe elaborate on the Gaussian radial basis function? How do the variance and mean fit into the context?
@harrypadarri6349
4 months ago
Your comment is two years old, but here's how I tried to make some intuitive sense of it. In regression, Gaussian processes are used as a prior over functions. It is often said that the kernel of a Gaussian process specifies the "form" of the functions, for example in the sense that a larger lengthscale places more mass on smoother functions. If you sample from a GP with 1-d inputs and an RBF kernel it looks exactly like this, but that does not really explain why it's the case.

What I did next was look into kernel smoothers. Roughly speaking: you have a bunch of observations of a function f(x) at locations x, and you predict the unknown function value at some location z by computing a linear combination of RBF kernel values times known function values and normalising the sum. Say we know f(x1) and f(x2) and want to predict f(x3). Then

f(x3) ≈ (k(x3,x2)*f(x2) + k(x3,x1)*f(x1)) / (k(x3,x2) + k(x3,x1))

If you write this in vector-matrix notation you get something like f = C * K_{fy} * y, where f is the prediction of the unknown function values, y are the known function values, and C is a matrix that does the normalisation.

The posterior mean of a GP in GP regression looks something like

mean = K(X_known, X_unknown)^T @ K(X_known, X_known)^(-1) @ y

It is also a linear combination of kernel values and observed function values, here "normalised" by the inverse of the kernel matrix evaluated at the observation locations. This similarity between the a-posteriori mean and a kernel smoother helps me with the intuition. Of course it's not a solid mathematical explanation, but maybe a nice point of view from which to start when looking into it.
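The two estimators compared in the comment above can be put side by side in a few lines. A sketch with NumPy, where the data, lengthscale, and jitter are illustrative choices, not anything from the video:

```python
import numpy as np

# Nadaraya-Watson kernel smoother vs. the (noise-free) GP posterior
# mean: both are linear combinations of observed values weighted by an
# RBF kernel; they differ in how the weights are normalised.

def rbf(a, b, ls=1.0):
    # Pairwise 1-d RBF kernel: exp(-(a_i - b_j)^2 / (2 * ls^2))
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

x = np.array([0.0, 1.0, 2.0, 3.0])   # known locations
y = np.sin(x)                         # known function values
z = np.array([1.5])                   # prediction location

# Kernel smoother: normalised linear combination of known values.
w = rbf(z, x)[0]
smoother = w @ y / w.sum()

# GP posterior mean: K(z, x) @ K(x, x)^{-1} @ y (jitter for stability).
K = rbf(x, x) + 1e-9 * np.eye(len(x))
gp_mean = rbf(z, x) @ np.linalg.solve(K, y)
```

The smoother's prediction is always a convex combination of the observed y values, while the GP mean, thanks to the inverse kernel matrix, interpolates the observations exactly in the noise-free case.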
@jameschen2308
8 months ago
Doesn't exp(x_i^T x_j) give the same power?
@ajsalshereef7810
2 years ago
3:43 How can you add the cross terms to get -2*xi^T*xj? Can anyone help me?
@maxstrzelecki3970
A year ago
xi^T * xj produces the same result as xj^T * xi (the dot product of two vectors is a scalar, and a scalar equals its own transpose), so the two cross terms combine into -2*xi^T*xj, I think :)
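The expansion asked about at 3:43 can be verified numerically. A quick check with NumPy on arbitrary vectors:

```python
import numpy as np

# ||xi - xj||^2 = (xi - xj).(xi - xj)
#              = xi.xi - xi.xj - xj.xi + xj.xj
# Since a dot product is a scalar, xi.xj == xj.xi, and the two cross
# terms combine into -2 * xi.xj.
rng = np.random.default_rng(1)
xi, xj = rng.normal(size=5), rng.normal(size=5)

lhs = np.sum((xi - xj) ** 2)
rhs = xi @ xi - 2 * (xi @ xj) + xj @ xj
```

Both sides agree to floating-point precision for any pair of vectors.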
Comments: 42