ofc it pulls the nsfw subreddit first 😆 that's hilarious! Great content as always.
@BoobieBusiness
6 months ago
At 32:15 you mention being unsure of the meaning of the parent IDs in the dataset. The Reddit post you linked to for the BigQuery data contains a SELECT statement with REGEXP_REPLACE of 't[0-9]_' on the link_id. According to GPT-4, link_id is a field that represents the ID of the post (submission) to which a comment belongs. Reddit IDs for posts and comments are often prefixed with a type indicator, t1 through t6: t1 = Comment, t2 = Account (user), t3 = Link (post or submission), t4 = Message (private message), t5 = Subreddit, t6 = Award. If you did not filter the data frame on IDs starting with t1_, then you might have fine-tuned the model on all types of content, not just comments/conversations. If so, it might explain why adding all those other subreddits messed with the training process (as the prompt template is not formatted for other types of content).
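A minimal sketch of the prefix check described above, in case it helps anyone digging into this. The column names `parent_id` and `link_id` and the two sample rows are assumptions for illustration, not taken from the video's actual data frame; only the t1 to t6 mapping comes from the comment.

```python
import re

# Type prefixes as listed in the comment above.
KIND_BY_PREFIX = {
    "t1": "comment",
    "t2": "account",
    "t3": "link",
    "t4": "message",
    "t5": "subreddit",
    "t6": "award",
}

# Same pattern the BigQuery REGEXP_REPLACE strips: 't[0-9]_' at the start.
FULLNAME_RE = re.compile(r"^(t[0-9])_(.+)$")


def split_fullname(fullname):
    """Return (kind, bare_id); kind is None when there is no tN_ prefix."""
    match = FULLNAME_RE.match(fullname)
    if match is None:
        return None, fullname
    prefix, bare_id = match.groups()
    return KIND_BY_PREFIX.get(prefix, "unknown"), bare_id


def parent_kinds(rows):
    """Count how many rows point at each kind of parent."""
    counts = {}
    for row in rows:
        kind, _ = split_fullname(row["parent_id"])
        counts[kind] = counts.get(kind, 0) + 1
    return counts


def replies_to_comments(rows):
    """Keep only rows whose parent is another comment (t1_ prefix)."""
    return [row for row in rows if row["parent_id"].startswith("t1_")]


# Hypothetical sample rows, made up for this example.
rows = [
    {"id": "c1", "parent_id": "t3_abc123", "link_id": "t3_abc123"},
    {"id": "c2", "parent_id": "t1_c1", "link_id": "t3_abc123"},
]

print(parent_kinds(rows))
print([row["id"] for row in replies_to_comments(rows)])
```

Counting the parent kinds first is a cheap way to see what is actually in the data before deciding whether a t1_ filter is the right one.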
@sentdex
6 months ago
Hmm. Something to dig into more for sure, thank you!
@AlbertCelmaOrtega
4 months ago
Hi sentdex, Albert from Barcelona here! INCREDIBLE 1.33M SUBS!! YOU ARE AWESOME!! I learnt how to code and to do ML thanks to you. I studied civil engineering at Imperial College London and can tell your ability to convey ideas and teach is unprecedented! Plus it's always fun. I recall the first model I made, thanks to you, a binary classification model about breast cancer in 2019! Still here. I am starting a tech startup for logistics. I hope I can make it, and give back to you for everything you've already given me.
@blancalexandre4438
1 month ago
Good luck man!
@codespace
1 month ago
Where are you, dude? Long time no see.
@MaxM9000
6 months ago
This project reminds me of Yannic's GPT4-Chan project. How cursed can we get a WSB AI bot in terms of memes and degenerate strategies?
@nethrashri486
6 months ago
Been waiting for this for a long time... biggg thank youuuuuuuuu ..
@gcm4312
6 months ago
52:28 I believe the format Python's json is expecting is like `[{"key1":"value1"},{"key2":"value2"}]`. Your database uses newlines as separators (not commas) and is not inside a list.
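A small sketch of the difference described above, assuming the exported file holds one JSON object per line (often called JSON Lines). The `body` key and the sample text are made up for illustration; the helper names are hypothetical too.

```python
import json


def parses_as_single_document(text):
    """True if json.loads accepts the whole text as one JSON value."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


def read_json_lines(text):
    """Parse newline-separated JSON objects, skipping blank lines."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            records.append(json.loads(line))
    return records


# Hypothetical sample: two objects separated by a newline, no enclosing list.
sample = '{"body": "first comment"}\n{"body": "second comment"}\n'

print(parses_as_single_document(sample))
records = read_json_lines(sample)
print(len(records), records[0]["body"])
```

Reading line by line also means a large export never has to be held in memory as one giant list, which may matter more than the parsing error itself.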
@samar1900
5 months ago
I have started with Deep Learning. Can anyone suggest which video I should start with? Is there a sequence on this channel that I can follow?
@TheInternalNet
5 months ago
Yeah, like an absolute beginner crash course. I'm so fired up to learn this.
@savagejinx8179
6 months ago
How come you never continued the Neural Networks from Scratch series?
@johnblomberg389
6 months ago
Hi Sentdex! First of all, thanks for the video. It's interesting as always to see you tinker with this stuff and I'm really learning a lot :) After your previous videos with the WSB bot I decided to create my own scraper to collect comments from the daily WSB threads. I have just kept it running every now and then on my local computer and collected something like 57 MB of conversation data. I believe it is 229 000 comment threads, some of them longer and some shorter. It has not been properly cleaned, so there are also threads with only one comment in them, but even after removing those it should be at least 150k threads, which are recent (collected from mid 2023 until now). If you want to play around with it I can clean it and upload it somewhere :)
@yureqandrade
1 month ago
@sentdex where’s your Bitcoin Whitepaper Playlist? I couldn’t find it. Can anyone shed some light here, please?
@MIH20788
21 days ago
Bring back our NNFS tutorial, reading the book only is hard😊😊😊😊😊😊
@kadaliakshay6770
6 months ago
Waiting for more amazing videos, and I also just subscribed and liked the video.
@bigbena23
2 months ago
First of all, your videos are amazing. I was thinking of doing the same, not for Reddit but for Slack discussions. In Slack there is only 1 layer of discussion with threads (so I guess it's 2), but not more than that. What I couldn't quite understand from your video is how you decide which speaker to replace with the chatbot. Is it simply the last one for each tier? I'm unsure how to apply it to my use case. Maybe I should just replace a random user in the chats every X time with the bot?
@vipclassic105
10 days ago
Hello sir, can you reverse Cython?
@bennguyen1313
6 months ago
I've seen some people use Google Colab / Jupyter - Spyder.. how does training using those compare to Google Cloud? Can a Python application access a model running on the cloud for free (Google Colab).. or are there no free options? What's the cheapest? For example, aside from cloud services that host LLMs (railway , modal, render, beam cloud , Replicate , Streamlit , replit), I could run Ollama on my own computer and run models (Llama2 (XB), Mistral 7B, etc)? The downside is that my Python code would need to be written for a specific API? For example, OpenAPI , Gemini, OpenAI's Assistant API , Au Mistral, Gemini Pro, llama2 , FastAPI are all different?
@lovemedicine
2 months ago
Hi, thanks for the video. Can you create a video on meta-learning with an example?
@MrunalAshwinbhaiMania-b1d
5 months ago
Hello Sentdex! Thank you for such a wonderful video. I just have one question: when I tried to get the fh-bigquery data, it was not available at the link. Can you please give us the BigQuery link? Much appreciated. Thanks, Mrunal Ashwinbhai Mania
@Zero-tg4dc
2 months ago
If you still need help with this, I know how to access the data.
@varshwalia
6 months ago
Man delivers every single time.
@BenYu-v8e
6 months ago
Your video is helpful for me to start fine-tuning models. One question: can the numpy library achieve the same performance when sorting datasets?
@phils744
6 months ago
I really need to learn to be patient. You have excellent content. I would like to install this on my HA cluster, with my own database. From Excel files to PDF, it's cool as heck. Be safe everyone
@nidavis
6 months ago
To solve for when the bot should respond, perhaps use a simple classification model trained on whether or not the bot should reply, which then calls the chatbot based on that result?
@prathyushmadhu2861
3 months ago
Does anybody know about that copilot he used to speed up the decompressing process?
@TrueTributes13
3 months ago
This was such a fun watch, so much information, you make learning this stuff a blast🙌
@livinthrusound
6 months ago
Surprised no one mentioned it, but … r/2007scape?? Love it
@peepleep7931
6 months ago
hell yeah sentdex is back
@mher_22
1 month ago
...maybe come back? pls?
@WetspongeUK
4 months ago
Would love to see the Python from scratch series finished.
@asiddiqi123
6 months ago
Harrison for President
@Akhoon_faheem
2 months ago
In this age of AI, I fell for your videos 😅
@rataash_x
6 months ago
You make the whole learning process so fun, it never gets boring.
@TrueTributes13
3 months ago
I wholeheartedly agree, top tier stuff👌
@shashwatxcodes
5 months ago
Sir, is it true that mostly folks with a master's or PhD in AI get packages over 100k USD? Please reply, sir, as I'm confused between taking BTech CSE or BTech AIML. I'm also torn between targeting AI engineering right from the 1st semester, or doing web dev initially and then switching to AI/ML in my 3rd semester. My target: a 100k+ USD remote job. Please do reply, sir, I'd be extremely thankful to you ❤
@shitmandood
1 month ago
You probably have to know somebody who would want to give you such a high-salary, work-from-home job, because if it's something really important, they're gonna want to have you nearby for discussions. I mean, I could be wrong, but I'd be surprised; it would depend on your credentials. If you want a >100K job that's remote, it can be anything. It doesn't have to be AI engineering, so if money is all you want, it doesn't really need to be AI.
@rook451
3 months ago
Love your website. Thank you.
@BlueBearOne
5 months ago
Did you take down your Discord server?
@dhyanais
6 months ago
Is it important to differentiate by language? I bet you'll find all kinds of languages there. Is it relevant to distinguish the language first and only use comments from one language?
@sentdex
6 months ago
Good question when it comes to fine-tuning, especially with QLoRA. I would estimate that you'd want to keep it simpler, but we do know when it comes to fully training models that multilingual tends to produce better models.
@nomanshiekh26
6 months ago
Really informative video. Thanks!
@limajgarcia
6 months ago
Hit the like and watch. let's go!
@mr.daniish
6 months ago
Another knowledge bomb!
@hamzashaikh9795
6 months ago
First one to comment 🎉
@mytechnotalent
6 months ago
awesome! 34,445 woohoo!
@WL113
6 months ago
finally! booya!
@davidschaupp5423
6 months ago
I can't find the dataset on BigQuery?
@sentdex
6 months ago
Still there, here's the link: bigquery.cloud.google.com/table/fh-bigquery:reddit_comments.2015_05
@donquixoteth
6 months ago
@@sentdex It does not work for me
@thekingofallblogs
4 months ago
@@sentdex Do you have to create a billing account to access it, or be part of some group? All I see is just-landing-xxx under Explorer.
@StephenRoseDuo
2 months ago
You good Sentdex?
@sentdex
2 months ago
Yep!
@StephenRoseDuo
2 months ago
Nice to hear it 😎
@tcgvsocg1458
6 months ago
long time no see
@kadaliakshay6770
6 months ago
Amazing explanation bro, keep it up
@adempc
6 months ago
Word
@cod-newbie9166
6 months ago
Why can’t I access the ebooks😢?
@sentdex
6 months ago
Do you mean you made an order and haven't gotten it?
@cod-newbie9166
6 months ago
@@sentdex I mean I can’t open the web page
@sentdex
6 months ago
@@cod-newbie9166 which one?
@spxyo
6 months ago
Hi! Have you checked the bills after downloading so much data from GCS? It seems like a lot of class B operations and transferred data. Did it cost you more than $1000? Thanks
@sentdex
6 months ago
The entire BigQuery cost for the operations here was $89.84, and that includes a few exports/downloads that I ended up doing a couple of times as I developed.
@60hit99
6 months ago
Hi
@sentdex
6 months ago
Hello
@human_agi
6 months ago
Did you download from GCP to your local computer?
@sentdex
6 months ago
Yes. When you go to export, GCP gives you a gsutil command example. I just took it and used * to get all the files with a single command
@johnnywilliams2641
6 months ago
Loving all comments is the same as loving none, sentdex. We all know there is no information there.
@sentdex
6 months ago
The good news is: I don't love all comments.
@johnnywilliams2641
6 months ago
@@sentdex You're a master of your craft. Learned a ton from your Python tutorials many years ago. Didn't mean to offend. Thought it was a crafty information theory joke. Cuz I'm super witty and good looking too. You better like my damn comments senty.
Comments: 77