In this video series, you will learn how to train and fine-tune the Llama 3 model from scratch.
The goal is to code Llama 3 from scratch in PyTorch, creating models with 3B, 6B, 35B and 45B parameters. In this first video, you'll learn about upcycling, downcycling and infini-attention.
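Downcycling refers to initializing a smaller model by inheriting a subset of layers from a larger pretrained checkpoint, as explored in the "Pre-training Small Base LMs with Fewer Tokens" paper linked below. A minimal PyTorch sketch of the idea (the `downcycle` helper and the toy blocks are illustrative, not taken from the video):

```python
import torch.nn as nn

def downcycle(layers: nn.ModuleList, keep: int) -> nn.ModuleList:
    # Hypothetical helper: build a smaller stack by inheriting the
    # first `keep` blocks (and their weights) from a larger model.
    return nn.ModuleList(layers[i] for i in range(keep))

# Toy "large" model: 8 transformer-like blocks (Linear stands in here).
big = nn.ModuleList(nn.Linear(16, 16) for _ in range(8))

# "Small" model inherits the first 4 blocks, then continues pre-training.
small = downcycle(big, keep=4)
```

The inherited blocks share weights with the source checkpoint at initialization, so the small model starts from a much better point than random init.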
📚 Papers:
- Sparse Upcycling: Training Mixture-of-Experts from Dense Checkpoints: arxiv.org/abs/2212.05055
- Pre-training Small Base LMs with Fewer Tokens: arxiv.org/abs/2404.08634
- Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention: arxiv.org/abs/2404.07143
💻 To follow along, you can use this Colab notebook:
- github.com/Blaizzy/Coding-LLM...
🎥 Coding Llama 2 from scratch video series
Part 1: kzitem.infoXHmag4damTg
Part 2: kzitem.infoLSWDpFmbE90
Part 3: • Coding Llama 2 from sc...
Coding Llama 3 from scratch in PyTorch - Part 1