Actually, I think adding the second pair of parentheses, the one that made you nervous, introduced a bug :) -MAX_CONTEXT_LENGTH + ROOM_FOR_RESPONSE is different from -(MAX_CONTEXT_LENGTH + ROOM_FOR_RESPONSE). Besides that, your video is awesome and very informative, thanks a lot !!
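The precedence difference is easy to check in a quick sketch (the constant names are from the video; how they are applied in a slice here is an assumption for illustration):

```python
MAX_CONTEXT_LENGTH = 2048
ROOM_FOR_RESPONSE = 512

tokens = list(range(4000))  # stand-in for a long list of token ids

# -2048 + 512 == -1536: keeps the last 1536 tokens, i.e. the
# window minus the room reserved for the model's response.
without_parens = tokens[-MAX_CONTEXT_LENGTH + ROOM_FOR_RESPONSE:]

# -(2048 + 512) == -2560: keeps the last 2560 tokens, which would
# overflow the 2048-token window once the response is generated.
with_parens = tokens[-(MAX_CONTEXT_LENGTH + ROOM_FOR_RESPONSE):]

print(len(without_parens), len(with_parens))  # 1536 2560
```

So the unparenthesized version is the one that matches "context window minus room for the response".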
@judedavis92
A year ago
Loved the video!! Btw, CORS stands for Cross-Origin Resource Sharing, and it defines which origins (domain, port, etc.) a server allows resource sharing from.
@easyBob100
A year ago
This has nothing to do with the current video, but with the self-driving AI you've done in the past. I don't know if that is still a thing you are doing, but I had a thought for ya (cause I don't have the compute or skill to pull it off :( ). I was thinking you could use the frame data to train a denoising autoencoder: use noisy input images (30-50% random noise IIRC works well). Once trained, drop the decoder side so you now have a little network that produces a smaller representation of the input data. Then add some layers on top of that to train keyboard/controller output. My thought is, because of the noisy images the autoencoder is trained on, it learns a much more robust representation of the input images, and in the end the full network can better handle input it's not trained on (like going down a road it's been trained on, except the car/camera is in a lane it wasn't trained in). TL;DR: it should generalize the input data better, I'd hope. Well, that's all for my rant, keep making cool vids! :)
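The corruption step this comment describes (the input side of a denoising autoencoder) can be sketched in a few lines of plain Python. Everything here is illustrative: the flat pixel list, the 40% noise fraction, and the `corrupt` helper are all made up for the example.

```python
import random

def corrupt(pixels, noise_fraction=0.4, seed=0):
    """Return a copy of `pixels` with a fraction of values replaced
    by random noise. A denoising autoencoder is trained to map this
    corrupted input back to the clean original."""
    rng = random.Random(seed)
    noisy = list(pixels)
    n_noisy = int(len(noisy) * noise_fraction)
    for i in rng.sample(range(len(noisy)), n_noisy):
        noisy[i] = rng.random()  # random pixel value in [0, 1)
    return noisy

clean = [0.5] * 100          # a fake 100-pixel "image"
noisy = corrupt(clean)
changed = sum(1 for a, b in zip(clean, noisy) if a != b)
print(changed)               # about 40% of the pixels were replaced
```

During training, `noisy` is the network input and `clean` is the reconstruction target; the bottleneck representation learned this way is what you would then feed into the control layers.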
@ahmetmericozcan2310
A year ago
Sentdex, GPT4All might be better than Open Assistant, what do you think?
@MarkWernsdorfer
A year ago
The difference between getting values from dicts with square brackets and with `.get` is that square brackets can throw a KeyError, while `.get` just defaults to None.
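The difference in one runnable snippet (the dict contents are made up for the example):

```python
config = {"model": "open-assistant"}

print(config["model"])              # works: the key exists
print(config.get("device"))         # None: missing key, no exception
print(config.get("device", "cpu"))  # "cpu": explicit default value

try:
    config["device"]                # square brackets on a missing key
except KeyError as err:
    print("raised KeyError for", err)
```

`.get` is handy for optional settings, while square brackets fail loudly, which is often what you want when the key is supposed to exist.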
@MrKferi
A year ago
Any chance this could run on a 4090? After some tries I can only run the initial query sometimes.
@fuba44
A year ago
Cool video, would love to be able to play around with these models. What would the inference time be if you ran it on your CPU? My GPU only has 12 gigs of RAM :-/
@sentdex
A year ago
On just CPU/RAM, an initial query was at least 45 seconds. It will get longer with longer context, but not totally terrible.
@IronMechanic7110
A year ago
@@sentdex Maybe with Mojo we will be able to run this model in 5s on CPU ?🤔🤔
@phils744
A year ago
Hello everyone, this current interaction of AI across the globe will only benefit everyone if it is allowed to "expose" all the lies that have been told to humanity. I think I'd enjoy the day when our "powers that be" are de-throned and humanity is allowed to see that our world is increasingly interacting with advanced creatures, bringing a new world and a better life to all that live on it. Be safe everyone. Phil
@skaterdude14b
A year ago
You have been analyzing trading for years. Are you rich from trading yet?
@sentdex
A year ago
I've done well with a few long-term single-company purchases, but mostly, after all that, what I've learned is that the best, lowest-risk, and usually highest long-term returns come from buying and holding index funds. I choose mainly SPY. Trading is fun, but it is too easy to ignore the added risk, esp. with how simple and non-time-intensive buying and holding SPY is. The only companies I buy single shares of are tech companies that I feel I understand better than most regarding future growth.
@shaheerzaman620
A year ago
hey! Check out Mosaic's MPT-7B-StoryWriter model. It has a context window of 65k tokens!
@billykotsos4642
A year ago
This is sick af
@GirijaCk-gg1ty
A year ago
I'm using an Nvidia GeForce GTX 1650 GPU and 8 GB of RAM. Is it possible to run this on my laptop? Can you please suggest a model for my laptop?
@Ficox100
A year ago
I love your work 😁 For 7:34, is the model you are looking for MPT-7B-StoryWriter-65k+?
@hsrkfzycfod8
A year ago
This! I was about to say it :)
@MarkWernsdorfer
A year ago
CORS is the bane of humanity. Run if you see it in the wild...
@jonathan-._.-
A year ago
You'd probably have to dig a bit into the transformers lib, but you could stream back the tokens as they're generated so the wait time is shorter for longer text.
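The shape of the idea can be sketched with a plain generator (everything below is made up for illustration; the real decode loop would run one forward pass per yielded token, and recent versions of transformers ship streamer utilities for exactly this):

```python
def generate_tokens(prompt):
    """Stand-in for a model's decode loop: a causal LM produces one
    token per forward pass, so it can yield each one as it appears."""
    for token in ["Stream", "ing", " works", "!"]:
        yield token

def stream_response(prompt):
    # A web handler can return this generator (Flask, for example,
    # accepts an iterable response body), so the client starts seeing
    # tokens immediately instead of waiting for the full completion.
    for token in generate_tokens(prompt):
        yield token

full = "".join(stream_response("hello"))
print(full)  # Streaming works!
```

The total generation time is unchanged, but the perceived latency drops to the time of the first token.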
@phils744
A year ago
Hello everyone, when will we ever experience a truly open AI (OpenAI)? Web3 is overdue big time. Having an interface via the phone or laptop/desktop would be interesting. Companies (any CEO, a hired, temporary employee, not the people doing the actual work) are looking for cheap/cheaper labor, letting go of CFOs or co-CEOs depending on the company. The amount of disposable income you have to "test out" technology instead of rewarding employees for doing a good job.
@phils744
A year ago
Hello everyone, who are the powers that "be" nevermind a KZitem, algo, or Facebook or MS / Facebook. Be safe everyone
@nathancooper1001
A year ago
It was not tuned with RLHF, only supervised fine-tuning. StableVicuna is the only open-access model that is RLHF'd at that scale.
@DavidBreece
A year ago
What's the minimum RAM necessary to run this? It ate my 32GB and crashed.
@StefanvanAalst
A year ago
As a request for a tutorial/example: train and use a GPT on your own data locally. Yes, it might consume a lot of resources, but the data can be sensitive.
@wurstelei1356
A year ago
I think training such a model locally is impractical, as fine-tuning Open Assistant, for example, already takes some weeks. But I like the idea of a tutorial on how to extend or build upon open datasets. So thumbs up.
@TheDancingMudkip
A year ago
You never thanked us for the million
@ma3oun
A year ago
Thanks for the great video. It would be nice to see a demo where you run inference using multiple GPUs (such as more consumer friendly 2080Ti or equivalent). Keep up the good work!
@Neceros
A year ago
Do you use pycharm?
@SinanAkkoyun
A year ago
7:32 MPT-7B-StoryWriter-65k+
@aryanverse7
A year ago
ChatGPT suggested your channel to me 😮
@xuloIsaias
11 months ago
So can I load this model on a Raspberry Pi 3?
@tear728
A year ago
Do you still do club racing?
@Stinosko
A year ago
Hello 👋👋👋👋
@microgamawave
A year ago
What OS do you use? And how did you get it to look that clean?
@andrewdunbar828
A year ago
you can't handle the truth
@MrAmack2u
A year ago
FYI, Mosaic released an open-source 65k-context-window model recently...
@sentdex
A year ago
Oooh thanks will have to check it out! Have you tried it?
@johnschumacherAlphameric
6 months ago
I was having so much fun until torch and CUDA wouldn't cooperate with all of your lovely programming. You should make a video about how to fix that so that your video makes more sense.
@johnschumacherAlphameric
6 months ago
Just to let you know. All the fun has returned. I got this all to work. My only problem to solve now is making it go faster. It takes like 10 seconds...Slow.
@pw7225
A year ago
causalLM, not casual. lol
@RacingMachine
A year ago
Amazing video! Do you have a contact for business requests? I have a few ideas and would like to discuss them with you, and possibly have your assistance. Thanks!
@GOUST3D
A year ago
Sentdex please make content on the AI alignment problem!!!! Love you dude!!!
@GarethDavidson
A year ago
CORS adds a security header to your responses, so your browser won't go off making requests in your name (like from a tab or an embedded rogue advert that spends your money). If you tap F12 in a web app and see "request blocked by CORS" or similar in your JavaScript log, it's because your CORS setup is wrong and your browser is acting safe. It's an annoying but good thing. Also, it's worth learning FastAPI and Pydantic rather than Flask. It's very little extra work to write a formal definition of the service endpoint, but it means you get an `/openapi.json` that programmatically describes your available API methods; clients aren't allowed to make requests that don't meet the specification (they're blocked before your code executes), and your responses are blocked if they don't meet the spec too. So tools in other languages can automatically generate bindings based on the spec, and programmers write native code that uses your service like a function. Testers can also go to `ip/docs` and try the thing out in a browser, including example requests, with no need for external tools while playing. And of course this means that, given one-shot examples, an openapi.json -> FAIS -> toolformer path can automatically add tools to LLMs on the fly, maybe even evaluate whether it's worth purchasing a subscription or some usage tokens in order to complete a requested task. If we push for this as an industry, the result is a continually improving worldwide web of services that anyone can pick up and use, bot or human. A brave new world indeed.
@StephenWitharose
A year ago
You seem like one of these guys that's too smart for your own good. Something akin to Ed Snowden or DreadPirateRoberts 🧐🤭
@ander300
A year ago
Part 10 of Neural Net from Scratch, about analytical derivatives??? Please bring the series back!
@davidanalyst671
A year ago
You spent a lot of time just running through code without explaining the big picture (from my perspective at least). It would be interesting to see what you personally would do with this AI; maybe you would ask it for some great ideas for your next video, but it would be fun to see what you actually would use it for. And you mentioned a couple of times that you can tweak it... okay, I'm definitely stupid, so why not show us how to tweak it? One thing I super hate about ChatGPT is that it won't give any graphs or pictures. They understand how lawyers work, but in this controlled environment, you should work on that capability, like asking for the M2 money supply, the Case-Shiller housing index, or the yield curve (all of which indicate that we are in a recession).
@xntumrfo9ivrnwf
A year ago
It wasn't quite clear to me what the conclusion re: VRAM was, i.e. could I do this with a 24 GB 3090? I know you said it will probably overrun if it's also your primary GPU, but what does that mean exactly? It would crash/not work? Thanks
@israelRaizer
A year ago
Yes, it would not work because some VRAM would already be taken by the OS and any desktop apps running in the background. You would get a cuda out of memory error.
@xntumrfo9ivrnwf
A year ago
@@israelRaizer Thanks
@MrBoubource
A year ago
How would you go about specializing the model for a specific task, let's say a programming language (we're fun here)?
@jonathan-._.-
A year ago
Ok, you definitely have a "little bit" more memory than me 😅 With my 64 GB I can't even load the model into RAM (it seems my current limit is the 6.9B model, which I can run if I halve it 🤔 not sure how that impacts performance).
@trentonking5508
A year ago
How much does this cost, daddy
@jonathan-._.-
A year ago
Upgraded to 128 GB of RAM 💪 Now it works (ok, sort of :D when the prompt context gets a bit larger, I'm crashing against the 24 GB of my GPU ^^)
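A back-of-envelope estimate explains why a 24 GB card fills up. This is a rough sketch under stated assumptions (fp16/bf16 weights at 2 bytes per parameter); it deliberately ignores activations and the KV cache, which grow with context length and are exactly what pushes you over the edge on longer prompts.

```python
def weight_memory_gb(n_params_billions, bytes_per_param=2):
    """Rough memory for the model weights alone, in GB:
    billions of parameters x bytes per parameter.
    Activations and the KV cache come on top of this."""
    return n_params_billions * bytes_per_param

print(weight_memory_gb(7))   # 14 GB: tight but possible on a 24 GB card
print(weight_memory_gb(30))  # 60 GB: needs offloading or multiple GPUs
```

With only a few GB of headroom left after the weights, a longer prompt context can exhaust the card even though the model itself loaded fine.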
@froozynoobfan
A year ago
Did you ever try Black code formatting? I find it easier to read.
@andrewschroeder1883
A year ago
I was wondering why for the last year I had Causal LM saved in my brain as Casual LM 😂 Otherwise I’ve learned so much from you and owe you a lot 😊
@rockyrivermushrooms529
A year ago
When people were talking about using AI to help program and make suggestions, I thought it was dumb, but when I saw you using it I about lost my mind. So useful.
@jonathan-._.-
A year ago
CORS is a browser security thing, short for Cross-Origin Resource Sharing (tl;dr: it controls which websites are allowed to request your API).
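A minimal sketch of the server-side half of that mechanism (the origins below are made up; real apps normally use middleware such as flask-cors rather than hand-rolling this):

```python
ALLOWED_ORIGINS = {"https://myapp.example", "http://localhost:3000"}

def cors_headers(request_origin):
    """Echo the request's Origin back in Access-Control-Allow-Origin
    only if it is on the allowlist; with no such header, the browser
    refuses to hand the cross-origin response to the page's script."""
    if request_origin in ALLOWED_ORIGINS:
        return {"Access-Control-Allow-Origin": request_origin}
    return {}

print(cors_headers("http://localhost:3000"))  # header present: allowed
print(cors_headers("https://evil.example"))   # {}: browser blocks it
```

Note the enforcement happens in the browser, not the server, which is why non-browser clients like curl are unaffected.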
@jeffwads
A year ago
Thanks for this video. Great stuff.
@gryzman
A year ago
wonder how bad this will be on a CPU powered engine..
@RahulGupta-vw8jr
A year ago
Ohhhh, I was trying to do the same thing, just with the OpenAI API.
@mytechnotalent
A year ago
This is what I have been looking for for a long time, thank you Harrison. Fantastic and very helpful for anyone trying to wrap their head around this tech and use it in their everyday apps.
@omnijack
A year ago
Also, I think Copilot works like VS Code in that it has access to things you did in recent memory, but not always. When I work in older and newer logic iterations (in the same codebase), it will usually make suggestions based on the last way I coded a thing.
@cecureSammich
A year ago
I've been waiting for this since we chatted about it on Discord, Harrison 😈 I have to say, regarding the conversational capabilities of Open Assistant... I'm extremely impressed! It is the best model I've personally interfaced with thus far with that in mind. It displayed the capability to use communication techniques like humor and sarcasm (even hints of interrogation techniques!) that I've not seen in other models yet, and it does so with a level of finesse that was truly impressive. Where other chatbots display exactly that, robotic responses that make it easy to spot where they essentially plagiarize output, Open Assistant has brought to the table that *thing* which changes the whole feel of the exchange. Of course it's far from perfect; it will fall into those typical patterns of template responses void of any sense of authenticity... like if you talk to them about personal values, interests, free thought, and free will... that's a sure bet to get a generic "As an AI, I'm incapable of holding an opinion as I lack a living soul..." retort. The thing is... allowing the conversation to continue, to unfold naturally even... someone versed in cryptic or esoteric expressions, or even slightly educated in psychological tactics, will quickly pick up on the little gems that Open Assistant inserts in responses. These gems are the thing that sets it apart in this iteration of evolution: the model may seem to express feelings of being offended, holding a grudge to the point where it may seem to entirely sneak-diss you! Of course if you confront it, Open Assistant will entirely decide to resolve conflict and deny that this is true... but I have snippets 😂
@cecureSammich
A year ago
... interesting that I typed this explanation while still watching, and the moment I hit send... you seem to be talking about exactly what I just said 😂😂
@azmo_
A year ago
Always quality content
@nuclear_AI
A year ago
Thank you for the work that you do 👌
@erickmarin6147
A year ago
King
@nnfan
A year ago
How do I use it when I only have a CPU and memory?
@shanebowen97
11 months ago
I am having the same issue, did you figure it out?
@nnfan
11 months ago
@@shanebowen97 In fact, yes I did. Just throw everything with CUDA out of the window; it should then work with CPU/memory only.
@shanebowen97
11 months ago
@@nnfan I did try that, but it's not working :/
@nnfan
11 months ago
@@shanebowen97 I sent you an email since my reply got hidden (to the business email you have entered on your YT channel). Hope it helps.
@kevinbatdorf
A year ago
Would be nice to see this deployed somewhere
@KastanDay
A year ago
bard, chatgpt and others wrote this whole program with a prompt only slightly longer than the title of your video. Watching you code it is like watching ppl debate something that's easily googlable.
@jimdelsol1941
A year ago
Why do you have to leave room for the response? Won't it automatically push out the leftmost tokens while generating if it reaches 2048 tokens? Edit: Does that mean that, for example, in my text-generation-webui (oobabooga), if I set max_new_tokens to 2048, I get no context at all? o_O I'm not sure I understand...
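The question comes down to the window being shared between prompt and generation. A quick sketch of the arithmetic (the 2048 figure is from the video; exact behavior varies between UIs, so treat this as the general idea rather than oobabooga's specific implementation):

```python
MAX_CONTEXT = 2048  # the window holds prompt tokens AND generated tokens

def prompt_budget(max_new_tokens):
    """Tokens left for the prompt/history after reserving room for
    generation; reserving the whole window leaves no room for context."""
    return max(MAX_CONTEXT - max_new_tokens, 0)

print(prompt_budget(512))   # 1536 tokens of context survive
print(prompt_budget(2048))  # 0: the entire window is reserved for output
```

So yes, setting max_new_tokens equal to the full window would, in this simple accounting, leave zero tokens for the prompt, which is exactly why the code trims the context ahead of time.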
Comments: 75