❤️ Become a patron of The AI Epiphany ❤️
/ theaiepiphany
👨‍👩‍👧‍👦 Join our Discord community 👨‍👩‍👧‍👦
/ discord
In this video I do a deep dive into the recent "AudioGen: Textually Guided Audio Generation" paper, which introduces text-guided audio synthesis.
In a nutshell, it's the VQ-VAE/GAN idea applied to the audio modality.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
✅ Paper: felixkreuk.github.io/text2aud...
✅ Site: felixkreuk.github.io/text2aud...
✅ 3B1B on Fourier transform: • But what is the Fourie...
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
⌚️ Timetable:
00:00 Intro
01:17 Why is text-to-audio hard?
02:51 Comparison with VQ-GAN
05:15 Comparison with SoundStream
06:20 AudioGen overview
09:10 Deep dive: audio representation, LSTM
14:05 Losses explained
17:40 Complex-valued STFTs
21:57 Audio Language Modeling
23:37 Multi-stream audio inputs
25:32 Data and augmentations
29:05 Results
35:28 Outro
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
💰 BECOME A PATRON OF THE AI EPIPHANY ❤️
If these videos, GitHub projects, and blogs help you,
consider helping me out by supporting me on Patreon!
The AI Epiphany - / theaiepiphany
One-time donation - www.paypal.com/paypalme/theai...
Huge thank you to these AI Epiphany patrons:
Eli Mahler
Petar Veličković
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
💼 LinkedIn - / aleksagordic
🐦 Twitter - / gordic_aleksa
👨‍👩‍👧‍👦 Discord - / discord
📺 YouTube - / theaiepiphany
📚 Medium - / gordicaleksa
💻 GitHub - github.com/gordicaleksa
📢 AI Newsletter - aiepiphany.substack.com/
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
#audiogen #audiosynthesis #multimodal
AudioGen: Textually Guided Audio Generation | Text To Audio | Paper Explained