The link to the image and its raw file are in the description. If you think I deserve it, please give this video a like and subscribe for more! If you think it's worth sharing, please do so as well. I would love to grow to 100k subscribers this year with your help :) Thank you!
@RanDuan-dp6oz
A year ago
Just gave it a thumbs up! Just curious: what software did you use to draw such a wonderful diagram?
@junningdeng7385
A year ago
Sooooo nice! Where can we find the link to the image? 😂
@CodeEmporium
A year ago
Thanks! I used draw.io to draw the image.
@CodeEmporium
A year ago
The image can be found in the description of the video, on GitHub.
@Sundarkarthik-h3i
9 months ago
But what is the source for the Kannada words that were fed in at the output? How can we get those words in reality? Could you explain if you are willing to? Thank you.
@siddheshdandagavhal9804
A year ago
Most underrated YouTuber. You explain these complex topics with such ease. Many big channels avoid explaining these topics. Really appreciate your work, man.
@CodeEmporium
A year ago
Thanks a lot for the kind words. I try :)
@ShimoriUta77
8 months ago
Bro, for real! Learning ML never felt like a possibility for me, but this guy took me by the hand and is teaching all this for free! I can't even thank this dude enough.
@Wesker-he9cx
2 months ago
Absolute Facts
@menghan9260
A year ago
The way you approach this topic makes it so easy to understand, and I appreciate the pace of your talking. Best content on transformers.
@CodeEmporium
A year ago
You are very welcome. And thanks so much for that Super Thanks. You didn't have to, but it's very appreciated.
@swethanandyala
3 months ago
The best explanations of transformers that I have seen!
@Anirudh-cf3oc
11 months ago
You are the most underrated YouTuber. This is the best video explaining Transformers completely, in the most intuitive way. I started my journey with Transformers with your first Transformers video a few years ago, which was very helpful. Also, I am so happy to see an AI tutorial video using an Indian language. I really appreciate your work.
@Mr.AIFella
A year ago
Your explanation is the most realistic explanation of the Transformer that I've ever seen on the internet. Thanks, dude.
@CodeEmporium
A year ago
That means a lot. Thank you. Please like, subscribe, and share around if you can :)
@asdfasdf71865
A year ago
I like your visualization of the matrices. Those residual connections and positional embeddings were good details to mention here.
@ianrugg
A year ago
Great overview! Thanks for taking the time to put all this together!
@CodeEmporium
A year ago
Thanks so much! My pleasure
@moseslee8761
A year ago
You explain really well! It's quite complex, but as you explained it, it became much clearer. Together with the coding video, it is extremely useful.
@ramakantshakya5478
A year ago
Amazing explanations throughout the series, and top-notch content, as always. Waiting for a detailed explanation/visualisation of the backward pass in the encoder/decoder during training. I would appreciate it if you were thinking along the same lines.
@helloansuman
A year ago
Amazing ❤ Salute to the dedication that went into making this video, the visual explanation, and the knowledge.
@CodeEmporium
A year ago
Thanks so much for watching and commenting!
@amiralioghli8622
A year ago
Thank you so much for taking the time to code and explain the transformer model in such detail. I followed your series from zero to hero. You are amazing, and if possible, please do a series on how transformers can be used for time-series anomaly detection and forecasting. Someone covering that is sorely needed on YouTube!
@ArunKumar-bp5lo
10 months ago
Love the visualization; it makes everything so clear.
@amitsingha1637
A year ago
Bro, all of my confusion vanished like a vanishing gradient. Thanks. Really worth it.
@aintgonhappen
A year ago
Video quality is amazing. Keep it up, buddy!
@CodeEmporium
A year ago
I shall. Thanks so much!
@lakshman587
10 months ago
Thank you so much for all these videos; I have learnt a lot from them!!! I thought you were from Tamil Nadu, but today I got to know that you are from Karnataka!! Where in Karnataka? I'm staying in Bangalore and would love to meet you in person!!!!!
@triloksachin4826
6 months ago
Amazing video, keep up the good work. Thanks for this!!
@Sneha-Sivakumar
11 months ago
This was a brilliant video!! Super comprehensive.
@cyberpunkdarren
6 months ago
Your Kannada written language is really beautiful!
@wireghost897
A year ago
Very well explained. Thank you.
@naveenrs7460
A year ago
Lovely, brother. I am your neighbour, a Tamizhan. Lovely brotherhood.
@CodeEmporium
A year ago
Thanks so much! :)
@enrico1976
8 months ago
That was awesome. Thank you man!!!
@k-c
A year ago
Will have to brush up on my basics and then come back to this.
@CodeEmporium
A year ago
Yea. This can be a lot of info. Hopefully the earlier videos in this playlist will help too
@k-c
A year ago
@@CodeEmporium Your channel is really good! Thanks for all the work.
@soumilyade1057
A year ago
Hopefully the series is completed soon ❤️ Would binge-watch 😁
@CodeEmporium
A year ago
Yep. Maybe 1 or 2 videos left. I am running into some issues, but I'll probably either have them solved or just make a fun community-help video. Either way, it should be good.
@soumilyade1057
A year ago
@@CodeEmporium ♥️♥️ 😌
@KulkarniPrashant
4 months ago
Amazing video! Thank you.
@loplop88
5 months ago
So underrated!
@josephfemia8496
A year ago
If I can recommend next steps for this series: going into BERT, GPT, and DETR would be lovely extensions.
@CodeEmporium
A year ago
I was kind of thinking the same! For now, I have videos on BERT and GPT on the channel if you haven't checked them out. But an architecture deep dive would be fun too :)
@RanDuan-dp6oz
A year ago
@@CodeEmporium Yes, that would be super fun! Also, it would be great if you could introduce how an ML practitioner can fine-tune these complex models.
@davefaulkner6302
6 months ago
Fantastic lecture. The attention layers and their inter-relationships are very well explained. Thank you. However, this and other videos gloss over the use of the fully-connected layers following the attention layer. Using FC layers with language-model embeddings makes little sense to me. Are there 512x50 inputs to the FC, i.e., is the input sentence simply flattened as input to the FC layer?
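For anyone else stuck on this point: in frameworks like PyTorch, no flattening is needed, because a linear layer acts on the last dimension and broadcasts over the batch and sequence dimensions. A minimal sketch (the 30/50/512 shapes are illustrative, not taken from the video):

```python
import torch
import torch.nn as nn

# A linear layer acts on the last dimension only, so a (batch, seq_len, 512)
# tensor passes through with no flattening: every one of the 50 positions
# gets the same 512 -> 2048 transformation, independently.
ff = nn.Linear(512, 2048)
x = torch.randn(30, 50, 512)   # (batch, seq_len, d_model) -- illustrative sizes
print(ff(x).shape)             # torch.Size([30, 50, 2048])
```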
@sarahgh8756
7 months ago
Thank you for all the videos about the transformer. Although I understood the architecture, I still don't know what to use as the decoder input (embedded target) and the mask during the TEST phase.
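On the test-phase question: the standard approach is autoregressive decoding. The decoder input starts as just a start-of-sentence token; each generated token is appended and fed back in, so the causal mask simply grows with the sequence. A hedged sketch, where `model`, `sos_id`, and `eos_id` are hypothetical names, not from the video's code:

```python
def greedy_decode(model, src, sos_id, eos_id, max_len=50):
    # Decoder input begins with only <sos>; the look-ahead mask at each
    # step just covers the tokens generated so far.
    tokens = [sos_id]
    for _ in range(max_len):
        logits = model(src, tokens)       # hypothetical: per-position logits
        next_id = int(logits[-1].argmax())  # most likely next token
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens[1:]                     # drop the <sos> marker
```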
@ravikumarnaduvin5399
A year ago
My friend Ajay, your playlist "Transformers from scratch" is great. Your block-diagram representation was very appealing to me. Waiting with great anticipation for the final video. Would you be able to make it available soon?
@CodeEmporium
A year ago
Glad you like it! I am hitting a few roadblocks, though I feel I am 99% there. I'll make a video on this mostly to ask the community, so it should be a fun exercise for everyone too :) Hoping that once that is resolved, we can make the final video :D
@charleskangai4618
7 months ago
Excellent!
@user-pu4iz8wb4d
A year ago
THIS IS AMAZING, helped me a lot, thanks :)
@CodeEmporium
A year ago
Thanks so much for watching and commenting!
@markusnascimento210
A year ago
Very good. In general, articles don't show the dimensions when explaining; it helps a lot. Thanks.
@CodeEmporium
A year ago
My pleasure!
@sharangkulkarni1759
A month ago
Thank you (धन्यवाद)
@DanielTorres-gd2uf
A year ago
Damn, could've used this a few weeks ago for my OMSCS quiz. Solid review though; nice job!
@codeative
A year ago
Very well explained 👍
@CodeEmporium
A year ago
Thanks a ton for commenting and watching :)
@anandgupta2892
A year ago
Very well done 👍
@Diego-nw4rt
A year ago
Great channel and very useful video, thank you very much! I will watch other videos on your channel as well. I have a question: after you perform layer normalization and obtain an output tensor, how do you give a three-dimensional tensor as input to a feed-forward layer? Do you flatten the input? (See the sketch under the similar question above.)
@user-wr4yl7tx3w
A year ago
Really well presented.
@CodeEmporium
A year ago
Thanks a ton! :)
@rafaelgp9072
A year ago
A video like this explaining the LLaMA model would be nice.
@abirbenaissa3717
A year ago
Life saver, thank you
@CodeEmporium
A year ago
You are very welcome
@venkideshk2413
A year ago
Masked multi-head attention is for the decoder, right? Is that a typo in your encoder architecture?
@gabrielnilo6101
A year ago
At 11:08 — I am sorry if I am wrong, but isn't the transposed K matrix 50x30x64?
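For what it's worth, if K has shape (batch=30, seq_len=50, d_k=64), the batched matmul in attention only swaps the last two axes, giving (30, 64, 50) rather than (50, 30, 64). This is easy to check in PyTorch (the shape ordering is an assumption, not taken from the video):

```python
import torch

k = torch.randn(30, 50, 64)          # (batch, seq_len, d_k) -- assumed ordering
print(k.transpose(-2, -1).shape)     # torch.Size([30, 64, 50])
```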
@deeedledeee
A year ago
Great video. At 12:09, how does dividing all the numbers by 8 ensure that small values are not too small and large values are not too large? Wouldn't dividing by 8 just make every number 8 times smaller?
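The purpose of dividing by 8 = √64 is less about shrinking values and more about undoing variance growth: dot products of 64-dimensional unit-variance vectors have variance around 64, which pushes softmax into its saturated, near-one-hot region where gradients vanish. A small standalone NumPy demonstration (not from the video):

```python
import numpy as np

rng = np.random.default_rng(0)
d_k = 64                                  # head dimension; sqrt(64) = 8

q = rng.standard_normal(d_k)              # unit-variance query
keys = rng.standard_normal((10, d_k))     # ten unit-variance keys
scores = keys @ q                         # raw dot products, std ~ 8

def softmax(x):
    e = np.exp(x - x.max())               # subtract max for numerical safety
    return e / e.sum()

print(softmax(scores).round(3))                  # nearly one-hot
print(softmax(scores / np.sqrt(d_k)).round(3))   # noticeably smoother
```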
@paragbhardwaj5753
A year ago
Do a video on this new model called RWKV-LM.
@abulfahadsohail466
A year ago
Could you please apply the transformer you have built to text summarisation? It would be really helpful.
@susmitjaiswal136
A year ago
What is the use of the feed-forward network in the transformer? Please answer.
@wishIKnewHowToLove
A year ago
Concise.
@CodeEmporium
A year ago
Thanks! I try not to bore :)
@CyKeulz
A year ago
Great! Still a bit too hard for me, but I still learned stuff. Question: would it be possible to use the same encoder across multiple languages? Without retraining it after the first time, I mean.
@CodeEmporium
A year ago
I hope the full playlist "Transformers from scratch" helps with pacing this. To your second question: this is a simple transformer neural network, not a typical language model like BERT/GPT. The transformer on its own doesn't typically make use of transfer learning, so some retraining will be required. That said, if you were using the language models, you might just need to fine-tune your parameters on the target language (which is technically training). Or, if you go the GPT-3 route, you could get away without fine-tuning and use meta-learning techniques instead.
@anwarulislam6823
A year ago
Without a BCI, is a multi-head-attention-like process possible with the human brain?
@joegarcia8935
A year ago
Thanks!
@CodeEmporium
A year ago
You are super welcome! I appreciate the donation! Thanks!
@colinmaharaj50
A year ago
Can this be done in pure C++?
@raxn2673
4 months ago
It is highly unlikely that you will respond to this, but if you do, I am grateful. Is this a monetized YouTube channel? If so, is this a monetized video? And if so, has Google Research hit you with a copyright claim for using their "Transformer Architecture" figure for commercial purposes (your monetized video)? I am asking because I want to make my own transformative work of the image (by changing the colors, fonts, style of drawing, etc.) to use in a paid AI course (commercial purposes, obviously) that I want to make. I want to see if Google actually comes after your neck if you use their figure.
@CodeEmporium
4 months ago
I haven’t had problems thus far. And yes, the video is monetized
@jamesroy9027
A year ago
The background music creates a lot of disturbance, especially that pop sound; otherwise, the content delivery is excellent.
@samurock100
8 months ago
1kth like
@creativeuser9086
A year ago
So you're from the Silicon Valley of India. We all know it.
@CodeEmporium
A year ago
Haha kinda yea.
@TheTimtimtimtam
A year ago
First :)
@CodeEmporium
A year ago
Please keep being the first! :)
@phaZZi6461
A year ago
Hi, I really love your complete model overview! At 8:08 you mention that the difference between K, Q, and V isn't very explicit to the model. What would be your personal, intuitive interpretation of what a key vector might extract/learn from an input word?

I find the key concept a bit odd, and I wondered how the authors came up with the idea of training a key vector (/matrix), when previous attention papers only had a value vector, which would be used in both places (K and V) of the equation.

When I think about information retrieval concepts, where we have a search query and documents to be ranked, IIRC the intuition is to compute a dot product to get a similarity/relevance score between them. In my mind, the concept of "how relevant is each document" isn't that far off from "how much attention should I pay to each document". Analogously, I would interpret documents as values, and the idea of a key seems to be absent (unless IR in practice computes a key for each document, basically a key_of(document)-query similarity; then I just answered my own question).

Anyway, I wondered whether it would be possible to simplify the attention mechanism while keeping it conceptually similar. I'm not sure where I should look to learn more about this.
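The IR analogy maps cleanly onto the math: keys act as a learned "index" for each word and values as the content retrieved, and tying the two projections together (W_k = W_v) would recover the older single-projection attention described above. A minimal single-head sketch with separate projections (all shapes and weights illustrative, not from the video):

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 16, 8

x = rng.standard_normal((seq_len, d_model))          # one sentence of embeddings
W_q, W_k, W_v = (rng.standard_normal((d_model, d_k)) for _ in range(3))

Q, K, V = x @ W_q, x @ W_k, x @ W_v                  # separate learned projections
scores = Q @ K.T / np.sqrt(d_k)                      # query-key relevance, as in IR
scores -= scores.max(axis=-1, keepdims=True)         # numerical stability
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
out = weights @ V                                    # retrieve a blend of values
print(out.shape)                                     # (5, 8)
```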
@KulkarniPrashant
4 months ago
Thanks!
@CodeEmporium
4 months ago
You are super welcome! Thanks for the donation too!
@fayezalhussein7115
A year ago
amaaazing
@CodeEmporium
A year ago
Thanks so much :)
@prashantlawhatre7007
A year ago
Eagerly waiting for the upcoming videos in the series.
@CodeEmporium
A year ago
Thanks! Probably just 1-2 more long-form videos.
@erikschmidt3067
A year ago
What's in the feed-forward layers? Just an input and an output layer? Are there hidden layers? What are the sizes of the layers?
@CodeEmporium
A year ago
Feed-forward layers are the hidden layers. It's essentially 2,048 neurons in size. You can think of it as mapping a 512-dimensional vector to a 2,048-dimensional vector, and then mapping the 2,048-dimensional vector back to 512 dimensions. All of this to capture additional information about the word.
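In code, that position-wise feed-forward block looks roughly like this (a PyTorch sketch matching the 512 → 2048 → 512 sizes described above; class and layer names are illustrative, not pulled from the video's repository):

```python
import torch
import torch.nn as nn

class PositionwiseFeedForward(nn.Module):
    """Maps each 512-dim word vector up to 2048 dims and back down,
    applied independently at every position in the sequence."""
    def __init__(self, d_model=512, d_ff=2048):
        super().__init__()
        self.linear1 = nn.Linear(d_model, d_ff)   # 512 -> 2048
        self.linear2 = nn.Linear(d_ff, d_model)   # 2048 -> 512

    def forward(self, x):                          # x: (batch, seq_len, 512)
        return self.linear2(torch.relu(self.linear1(x)))

out = PositionwiseFeedForward()(torch.randn(30, 50, 512))
print(out.shape)  # torch.Size([30, 50, 512])
```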
Comments: 102