Please make more of these videos. This is actually the first time I've understood DDPG. Very good!
@safiullahmarwat
3 years ago
same here.
@heenashaikh8422
3 years ago
I read a lot of blogs and papers, but I could not picture the process, so I could not understand it. I finally found what really happens in this algorithm. Thank you very much.
@paedrufernando2351
3 years ago
Your video was the missing piece that I needed in my quest to learn reinforcement learning. I watched your video many times, and by now I have already started creating apps using RL. Thank you so much. As motivation for people like me, you could post more videos.
@mihu2cool
2 years ago
Hi Aylwin. Thanks for the video, it is a much better explanation than many videos out there. Don't mind your English or the technical issues, just keep making more of these and you'll gradually improve. And we'll keep getting more and better content. Hope you start making more.
@bominzhang6732
5 months ago
One of the best presentations I've seen on DDPG; I'd even say it is the best presentation of any ML pipeline I've seen so far.
@神楽坂イリヤ
1 year ago
The best explanation I've seen on all of YouTube. A thumbs-up for the author. Please be sure to post more videos like this!
@safiullahmarwat
3 years ago
I think this is the most reasonable explanation of DDPG I have found. Thanks, dude.
@saminbinkarim6962
2 years ago
Really helpful video! Made DDPG crystal clear! I hope you keep uploading.
@henkvandoorn9049
3 years ago
Best DDPG-explanation ever. Thanks!
@kyrylosovailo1690
3 years ago
Sound is shit, but you actually explain stuff that most tutors fail to explain. Thanks for that, you are amazing!
@Plumtown
1 month ago
Explained clearly and concisely
@vladimirdyagilev8946
4 years ago
Super concise and perfectly explained!
@tuanvanphu7086
4 years ago
Perfect video!!! Thanks so much. Now I understand DDPG.
@kelvinwong9190
3 years ago
Thanks for the clear explanation of DDPG. Hoping to see something like MADDPG soon? Or explanations of other RL algorithms, haha!
@520610732
1 year ago
Hi Aylwin... it's a very good video... it helped me a lot to learn DDPG... thanks, bro.
@doyu_
1 year ago
Good explanation! BTW: what kind of drawing application did you use?
@tejasranadive573
2 years ago
It was a nice explanation. You should make more of these videos.
@DarkKnight7_1
2 years ago
Definitely worth watching this explanation! Thanks.
@elliotmunro8115
2 years ago
Very good video Wei! Thank you
@vanesaalcantara2265
7 months ago
Thank you!!! Great explanation
@ArmanAli-ww7ml
2 years ago
Beautifully explained.
@AkshatSharma-qx9wh
6 months ago
Oh my god this was amazing! Thanks!
@bhavajsingla3664
10 months ago
Great explanation, thanks a lot.
@HuangShuchen-p1e
6 days ago
What an awesome video, explained so well!
@Brian-ft4dh
6 months ago
Really great video, thanks!
@mostafasaeidi6809
1 year ago
Awesome explanation. Thanks!
@joshuawang9401
4 months ago
Thanks, brother!
@YuZhang-f1z
4 years ago
Thanks for your effort, it's a great video!
@iamai4284
2 years ago
Great video!
@yihongliu7326
2 years ago
Thank you!! Please make more videos if possible.
@ninodpillai8436
4 years ago
A nice way of explaining DDPG.
@animax-yz
1 year ago
Awesome
@NS-gr9cy
3 years ago
God bless you. Thanks!
@yitongzhou6165
4 years ago
Great video, but I have one question. You said DDPG is designed to maximize or minimize the Q value. However, the later explanation is all about comparing the Q values generated by the current critic network and the target critic network. Is this minimizing or maximizing the Q value? Sincerely
@aylwinwei5576
4 years ago
Thanks. The strategy for training the actor is to maximize Q, while the strategy for training the critic is to minimize the difference between the Qs.
@yitongzhou6165
4 years ago
@@aylwinwei5576 Thanks for the reply, but can you specify which step makes the network maximize Q? Is this the effect of the discount?
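(For anyone following this thread, the two training signals discussed here can be sketched numerically. This is a toy illustration, not code from the video: every number below — the reward, discount, and Q estimates — is made up, and a real DDPG implementation would produce these values with neural networks and update them by gradient descent.)

```python
# Toy sketch of DDPG's two objectives for a single replay-buffer transition.

gamma = 0.99                 # discount factor (assumed value)
r = 1.0                      # reward observed in the transition (made up)

# Critic side: minimize the TD error |Q(s,a) - (r + gamma * Q'(s', mu'(s')))|.
q_current = 0.50             # Q(s, a) from the online critic (made up)
q_next_target = 0.40         # Q'(s', mu'(s')) from the target critic (made up)
td_target = r + gamma * q_next_target
critic_loss = (q_current - td_target) ** 2

# Actor side: maximize Q(s, mu(s)) with the critic frozen,
# which in practice means minimizing -Q as the actor's loss.
q_of_actor_action = 0.55     # Q(s, mu(s)) (made up)
actor_loss = -q_of_actor_action

print(round(td_target, 4))   # 1.396
print(round(critic_loss, 4)) # 0.8028
```

So the answer to "which step maximizes Q" is the actor update (the `-Q` loss), not the discount; the discount only shapes the critic's TD target.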
@ArmanAli-ww7ml
2 years ago
Could anyone explain action noise?
@DrAiden121
4 years ago
Short and clear Thanks!
@rahulbball9395
4 years ago
Thank you! Great job!
@zhiguoding3385
4 years ago
Thank you for this clear explanation of DDPG. For the part that trains the actor network, you described using the actor network and the target critic network. However, it was proposed to use the actor network and the critic network (not the target critic network) in the following link by Intel: nervanasystems.github.io/coach/components/agents/policy_optimization/ddpg.html Can you please explain which of the two critic networks should be used to train the actor network?
@aylwinwei5576
4 years ago
Thanks. There's no conflict: the actor is trained only in the online part, by freezing the online critic (in the video, 4:58-5:10).
@zhiguoding3385
4 years ago
@@aylwinwei5576 Thanks for your kind clarifications.
@UdemmyUdemmy
1 year ago
u are a genius
@aicancode5676
3 years ago
v good video!!!
@goldenshale
3 years ago
Thanks! Nice video
@shmollahasani
4 years ago
Thank you. Please explain other RL algorithms as well.
@paedrufernando2351
3 years ago
Every now and then I come by to see if you've released any new videos... I strongly encourage you to.
@siddharthsingh9809
4 years ago
Great video, keep up the great work! Look forward to more videos!
@rajasurya5380
4 years ago
please make a video on MADDPG
@WatchDisneyHD
4 years ago
When you say we assume that we already have a well-trained critic network, how do we go about doing that?
@aylwinwei5576
4 years ago
That's covered after the actor chapter. The actor and critic are trained together, but I have to go through them one by one :)
@souravdey1227
3 months ago
Unbelievably clear explanation. Why did you stop????
@Faad3e
2 years ago
fantastic video! thanks a lot for making it, your english is good too
@jmachida3
3 years ago
Congratulations on the clear and concise explanation! Thank you!
Valuable information. He needs a better microphone so we can hear him more clearly.
@joaoluizvilardias9394
4 years ago
Excellent explanation! Please make more like it.
@CS_cat-eb5xr
1 year ago
thank you so much and plz make more videos
@520610732
1 year ago
It's the very first time that I'm able to understand DDPG.
@LearnRoboticsAndAI
3 years ago
Thanks for sharing. You have a really profound understanding of the topic.
@nicolaslupi3111
2 years ago
I understand why we want to maximize Q, but then I don't get why we want to minimize | Q - (r + disc * Q_next) |
@georgpernice9123
1 year ago
I think this refers to the Bellman equation.
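(Right: the Bellman equation says Q(s, a) should equal the expected r + gamma * Q(s', a'), so driving |Q - (r + disc * Q_next)| to zero pushes the critic toward Bellman consistency. A toy sketch — a made-up 3-state deterministic chain, nothing from the video — shows that repeatedly stepping the value estimates toward r + gamma * V(s') converges to the true discounted returns:)

```python
import numpy as np

# Tiny deterministic chain: state 0 -> 1 -> 2 -> terminal, with fixed rewards.
gamma = 0.9
rewards = [1.0, 2.0, 3.0]    # reward received when leaving states 0, 1, 2
V = np.zeros(4)              # V[3] is the terminal state, value 0

for _ in range(200):         # repeated TD(0)-style sweeps
    for s in range(3):
        target = rewards[s] + gamma * V[s + 1]   # r + gamma * V(s')
        V[s] += 0.5 * (target - V[s])            # step toward the target

# True discounted returns: V[2] = 3, V[1] = 2 + 0.9*3 = 4.7,
# V[0] = 1 + 0.9*4.7 = 5.23 -- and V has converged to exactly these values.
```

Minimizing the same kind of difference is what trains DDPG's critic, just with a network instead of a table.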
@biswajitkumardash6421
2 years ago
I liked it. thanks
@ashusingh6517
2 years ago
Thank you
@adityashajijohn5068
1 year ago
Hi Aylwin, this is the first video I've seen that explains DDPG in a very clear and concise way. It was really helpful. Thanks again.👍
@0OTheIDaveO0
3 months ago
This was great! You mentioned technical problems and that your English is not great. I think the way you presented, and the quality of the video and sound, are more than good enough, and your English is not a problem at all; you explained everything really well and very intuitively. A great explanation, I wish there were more. Thanks a ton! This was really helpful.
@OmerBoehm
2 years ago
Thank you for making this video in such an intuitive way while covering the essence of the algorithm
@analysislearning9179
3 years ago
Very good lecture, very concise, and nothing wrong with your English.
@tyl6194
4 years ago
The delivery is fine and your content is concise. Thanks again for your time!
@ZhihuaGan
4 years ago
Many thanks for the video, it explains DDPG nicely and clearly!
@itsfabiolous
2 years ago
Thank you for explaining this so concisely and simply! Keep it up! :)
@phanindraparashar8930
3 years ago
Really good. Can you also make one on Option-Critic?
@ryadhcherifi7897
2 years ago
Thanks for your effort, it's a great video!
@WashingtonLuisSk8
4 years ago
Thanks for the video. It helped a lot.
@paedrufernando2351
4 years ago
you need to make more videos on this
@bobojason3517
4 years ago
Yup, clear and very helpful
@parris3142
3 years ago
good explanation
@ondercivelek998
4 years ago
you are great!!
@pavantippa2287
4 years ago
1. Why do we inject action noise and parameter noise? 2. Thanks a lot (arigato)
@aylwinwei5576
4 years ago
Thanks. It helps the agent discover more possibilities in a more efficient way. Consider the worst scenario: if the robot repeats the same action in the same environment and remembers the same memories, then nothing can be learned.
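(A minimal sketch of that idea: DDPG's actor is deterministic, so without noise it would pick the same action in the same state forever. Gaussian noise is used here for simplicity — the original DDPG paper used Ornstein-Uhlenbeck noise — and the function name, sigma, and action bounds below are all illustrative assumptions, not anything from the video.)

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_action(deterministic_action, sigma=0.1, low=-1.0, high=1.0):
    """Add Gaussian exploration noise, then clip to the valid action range."""
    noise = rng.normal(0.0, sigma, size=np.shape(deterministic_action))
    return np.clip(deterministic_action + noise, low, high)

a = np.array([0.95])                          # action proposed by the actor
samples = [noisy_action(a) for _ in range(5)] # five slightly different tries
# Each sample explores a nearby action around 0.95, never outside [-1, 1],
# so the replay buffer keeps receiving varied experience to learn from.
```

Parameter noise works on the same principle, but perturbs the actor network's weights instead of its output.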
@xinyuandong2294
4 years ago
Explained especially well! Hope you post more videos on other methods!
@terswin789
4 years ago
Nice! My first YouTube comment goes to you.
@muzakkirquamar6348
2 years ago
Great video... and your English is perfect, bro. I suggest you make more videos, maybe a project using DDPG.
@dkal4497
4 years ago
Excellent!!! I finally understood DDPG!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Comments: 87