In this video, we explore how the temperature, top-k, and top-p sampling techniques influence text generation in large language models (LLMs).
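Since the description only names the techniques, here is a minimal, self-contained Python sketch of how these decoding strategies are commonly implemented over a toy logits vector (the function names and example values are illustrative, not taken from the video):

```python
import math
import random

def softmax(logits, temperature=1.0):
    # Scale logits by temperature: T < 1 sharpens the distribution
    # (closer to greedy decoding), T > 1 flattens it (more random).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_filter(probs, k):
    # Keep only the k most probable tokens, zero out the rest, renormalize.
    threshold = sorted(probs, reverse=True)[k - 1]
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

def top_p_filter(probs, p):
    # Nucleus sampling: keep the smallest set of tokens whose
    # cumulative probability reaches p, then renormalize.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = [0.0] * len(probs)
    cumulative = 0.0
    for i in order:
        kept[i] = probs[i]
        cumulative += probs[i]
        if cumulative >= p:
            break
    total = sum(kept)
    return [q / total for q in kept]

def sample(probs, rng=random):
    # Draw one token index from the (filtered, renormalized) distribution.
    r = rng.random()
    cumulative = 0.0
    for i, q in enumerate(probs):
        cumulative += q
        if r < cumulative:
            return i
    return len(probs) - 1
```

In practice these filters are chained: scale logits by temperature, apply top-k and/or top-p, then sample from what remains.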
References
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Why We Don't Use the Mean Squared Error (MSE) Loss in Classification: • Why We Don't Use the M...
Related Videos
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Why Language Models Hallucinate: • Why Language Models Ha...
Grounding DINO, Open-Set Object Detection: • Object Detection Part ...
Detection Transformers (DETR), Object Queries: • Object Detection Part ...
Wav2vec2 A Framework for Self-Supervised Learning of Speech Representations - Paper Explained: • Wav2vec2 A Framework f...
Transformer Self-Attention Mechanism Explained: • Transformer Self-Atten...
How to Fine-tune Large Language Models Like ChatGPT with Low-Rank Adaptation (LoRA): • How to Fine-tune Large...
Multi-Head Attention (MHA), Multi-Query Attention (MQA), Grouped Query Attention (GQA) Explained: • Multi-Head Attention (...
Contents
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
00:00 - Intro
00:37 - Greedy Decoding
01:05 - Random Sampling
01:50 - Temperature
03:55 - Top-k Sampling
04:27 - Top-p Sampling
05:10 - Pros and Cons
07:30 - Outro
Follow Me
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
🐦 Twitter: @datamlistic
📸 Instagram: @datamlistic
📱 TikTok: @datamlistic
Channel Support
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
The best way to support the channel is to share the content. ;)
If you'd like to also support the channel financially, donating the price of a coffee is always warmly welcomed! (completely optional and voluntary)
► Patreon: / datamlistic
► Bitcoin (BTC): 3C6Pkzyb5CjAUYrJxmpCaaNPVRgRVxxyTq
► Ethereum (ETH): 0x9Ac4eB94386C3e02b96599C05B7a8C71773c9281
► Cardano (ADA): addr1v95rfxlslfzkvd8sr3exkh7st4qmgj4ywf5zcaxgqgdyunsj5juw5
► Tether (USDT): 0xeC261d9b2EE4B6997a6a424067af165BAA4afE1a
#llm #largelanguagemodels #chatgpt #textgeneration #promptengineering
LLM Prompt Engineering with Random Sampling: Temperature, Top-k, Top-p