Every time I need something, you present a tool doing exactly that. Thanks!
@matthew_berman
10 months ago
Glad to hear it!
@avi7278
10 months ago
I'm building my own personal AI assistant, but every time I start something, a week later something better drops. My god, this is impossible. I've got to think harder about my abstractions to make some of this stuff more drop-in ready. That might be an interesting video (or series of videos) for you, Matthew, though likely a bit advanced for your audience.
@LeonardLay
10 months ago
I'm in the same boat. The tech changes so quickly, my ideas become antiquated as soon as I get something working 😆
@matthew_berman
10 months ago
The nice thing is, if you stick with the OpenAI API, that seems to be the standard.
@LeonardLay
10 months ago
@matthew_berman I have an Azure account and I'm trying to use it as a server for the different models rather than hosting them locally. I'm having so much trouble doing that because the models included with Azure aren't the ones I want to try out. Do you have any advice?
@DihelsonMendonca
10 months ago
You're lucky. I still have to learn Python. But since ChatGPT is developing so fast, by the time I learn it, my knowledge will be obsolete, because now we can create a personal assistant using GPTs very easily. Do you agree? 🙏👍
@free_thinker4958
10 months ago
@DihelsonMendonca Me too. Once I focus on something, I later find something else exists with higher quality than the previous one hhhh
@WaefreBeorn
10 months ago
This will allow us to build with open source models fast. I love the simultaneous part. Please make more tutorials on this once it hits Windows without WSL.
@AaronTurnerBlessed
10 months ago
Agree... this Ollama really looks promising, Matthew!! Lightweight and simple. More plz!!
@chrismachabee3128
10 months ago
I am at WSL now, join me. WSL - Windows Subsystem for Linux. It is at Microsoft Ignite. The title is How to install Linux on Windows with WSL. So, you are on your own now. I have several computers requiring updating. Good luck.
@WaefreBeorn
10 months ago
@chrismachabee3128 You are an AI-generated comment. Please follow the terms of service on KZitem for automated accounts, creator of this bot.
@elierh442
10 months ago
😮 Please create a video integrating Ollama with AutoGen!
@federicocacace1070
10 months ago
And AutoGen's function calling with local models too!!
@LeonardLay
10 months ago
This was my first thought. Please do this!
@blackstonesoftware7074
10 months ago
Yes!!! Do this with AutoGen!
@skullseason1
10 months ago
Great idea dudes 🔥🔥🔥🔥🔥
@matthew_berman
10 months ago
Easy enough! I'll make a video for it.
@taeyangoh7305
10 months ago
Yes! It would be really interesting to see how AutoGen + Ollama goes!😍
@BibopGresta1
10 months ago
I'm interested, too! I wonder if AutoGen is obsolete now that OpenAI unleashed the kraken with the GPTs! What do you think?
@alextrebek5237
10 months ago
@BibopGresta1 I think you have yourself a popular follow-up video, given the comments asking about AutoGen 😉
@Gatrehs
10 months ago
@BibopGresta1 Unlikely. GPTs are more of a single custom agent instead of a set of agents working together.
@fenix20075
10 months ago
About privateGPT: I found the accuracy can be improved if the database is changed from DuckDB to Elasticsearch.
@xdasdaasdasd4787
10 months ago
Ollama series! This was a great starting video❤ Thank you for all your hard work
@MakilHeru
10 months ago
This is awesome! I'd love to see more. I feel like this can become something pretty robust with enough time.
@LerrodSmalls
10 months ago
This was so dope! I have been using Ollama for a while, testing multiple models, and because of my lack of coding expertise, I had no idea it could be coded this way. I would like to see if you can use Ollama, MemGPT, and AutoGen all working together, 100% locally, to choose the best model for a problem or question, call the model and get the result, and then permanently remember what is important about the conversation... I double dare you. ;)
@taeyangoh7305
10 months ago
+1
@GutenTagLP
10 months ago
Great video! Just a quick note: you actually do not need to send all the previous messages and responses as the prompt. The API response contains an array of numbers called the context; just send that in the data of the next request.
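A minimal sketch of what that looks like against Ollama's `/api/generate` endpoint, assuming the default `localhost:11434` server and non-streaming responses; the model name is illustrative:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default port

def build_request(model, prompt, context=None):
    """Build the JSON body for /api/generate. Passing the `context`
    array returned by the previous response continues the conversation
    without resending the whole chat history."""
    body = {"model": model, "prompt": prompt, "stream": False}
    if context is not None:
        body["context"] = context
    return body

def generate(model, prompt, context=None):
    """Send one generate request; return (answer_text, new_context)."""
    data = json.dumps(build_request(model, prompt, context)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        reply = json.load(resp)
    return reply["response"], reply.get("context")

# Usage (requires a local Ollama server with the model pulled):
# answer, ctx = generate("mistral", "Hi, who are you?")
# answer, ctx = generate("mistral", "What did I just ask you?", context=ctx)
```

Each response carries a fresh `context` array, so threading it through the next call is enough to keep the conversation going.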
@magnusbrzenk447
7 months ago
It would have been nice to discuss what sort of resource demands these models put on your machine.
@free_thinker4958
10 months ago
This is the type of straightforward, high-quality content ❤
@scitechtalktv9742
10 months ago
Building an AutoGen application using Ollama would be wonderful! Example: one of the agents is a coder, implemented by an LLM specialized in coding, etc.
@SushilSingh2005
10 months ago
I was about to write this myself.
@27dhan
10 months ago
haha me too!
@EduardsDIYLab
10 months ago
I started writing the same comment, and then saw yours :D
@MungeParty
10 months ago
I'm an AutoGen application using Ollama; I was going to write this comment too.
@EduardsDIYLab
10 months ago
@MungeParty Oh, nice to meet you! Why is an AutoGen Ollama app interested in this? :D
@crobinso2010
5 months ago
Hi Matt, as someone who watches every video, I'm feeling overwhelmed and am wondering if you could do a "take a step back" episode every once in a while -- where you go over previous content from a broader perspective. For example, what is the difference between LM Studio, Ollama, Jan, AnythingLLM, etc., and where should someone start? Or go over the "gotchas" and frustrations in the comment sections to highlight those little errors and solutions commenters found but may have been missed by the casual viewer. It would be a review of old content, but with updated fixes, comparisons, and general perspective/advice. Thanks!
@matthew_berman
5 months ago
Interesting! Will consider.
@aldoyh
10 months ago
Thank you so much, Matthew, this is so incredible!
@matthew_berman
10 months ago
You're so welcome!
@zef3k
10 months ago
Wow, this makes it so extremely accessible. Your video also shows how accessible interacting with these AIs is in general. I haven't programmed much since I was younger, but have been wanting to, and this seems like a great jumping-off point! Now I just need to wait until the Windows version comes out.
@luce985
8 months ago
MADA SAKA
@the.flatlander
10 months ago
This is just great, and easy as well! Could you show us how to train these models with PDFs and websites?
@AlGordon
10 months ago
Nice video! You definitely picked up a new subscriber here. I'd be interested in seeing how to build out a RAG solution with Ollama, and also how to make it run in parallel for multiple concurrent requests.
@snuffinperl8059
6 months ago
You created an incredible video: precise, concise, and I couldn't have asked for more!
@nickdnj
10 months ago
Great video, thank you! I would love to see a deep dive into using Ollama with AutoGen, having each agent use its own model.
@DB-Barrelmaker
10 months ago
This was done so! Perfectly. Every part swollen with meaning
@mossonthetree
7 months ago
This is so cool! And the fact that they give you a REST endpoint running on a port on the machine is great.
@renierdelacruz4652
10 months ago
Like other subscribers, I think you could create a video integrating Ollama and AutoGen where the conversation is stored in a database, and another video creating an AI personal assistant.
@MrBravano
8 months ago
Love your videos; much respect and appreciation for all the work you do. I do have one humble suggestion: if you could hide your image just enough to let us see what you have typed, for instance at 8:49, it would have been great. I know that most KZitem instructors do this, not sure why, but please take that into consideration. Either way, thank you for all you bring.
@prof969chaos
10 months ago
Very interesting. Would love to see how well it works with AutoGen or any of the other multi-agent libraries. Looks like you can import any GGUF as well.
@dustincoker5233
10 months ago
This is so cool! I'd love to see a deeper dive.
@photorealm
6 months ago
Awesome video. They have a Windows version now (3-30-24), and it installed and ran perfectly.
@michaelwallace4757
10 months ago
Integrating Ollama and Canopy would be a great video. Having that local retrieval would have many use cases.
@thecoffeejesus
10 months ago
This is it. This is officially the beginning of open source AGI.
@chileanexperiment
10 months ago
How do you mean?
@padonker
10 months ago
Can we combine this with fine-tuning, where we first add a number of our own documents and then ask questions? NB: I'd like to add the documents just once, so that between sessions I can ask the model about these documents.
@AlperYilmaz1
10 months ago
You probably mean RAG. That can be done with a Modelfile: describe the location of your files, then create a new model with "ollama create" and run it with "ollama run".
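A sketch of that Modelfile workflow, with illustrative names (and one caveat, since the comment above conflates the two: a Modelfile bakes a base model, parameters, and a system prompt into a new model; it does not index your documents, so true RAG still needs a separate retrieval step):

```shell
# Create a Modelfile customizing a base model (names are illustrative)
cat > Modelfile <<'EOF'
FROM mistral
PARAMETER temperature 0.7
SYSTEM You answer questions about my project documents.
EOF

# Then, with Ollama installed:
#   ollama create docs-assistant -f Modelfile
#   ollama run docs-assistant "Summarize the setup instructions."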
@jason6569
10 months ago
Yeah, this is also what I want to do, but I'm on day 2 of googling after a friend asked a question about AI. I went down the rabbit hole and found these videos. I don't know what this means or how to structure documents. Very interesting stuff though, and a series on this would be great!
@quebono100
10 months ago
R.I.P. OpenAI. I tested out Ollama before your video; I was also amazed by it.
@jayfraxtea
10 months ago
Boy, Matthew is so inspiring. Thank you for ruining my weekend plan. I'd be interested in the same matter as @padonker: how can we train with our own data?
@shuntera
8 months ago
So many models. We need a model to recommend which model to use in a given situation.
@yngeneer
10 months ago
Super video! If you could make something more in-depth about memory management, it would be lovely.
@donaldparkerii
10 months ago
Another great video. I was able to achieve the same thing in LM Studio running multiple models on Mac, by spawning instances from the CLI and incrementing the port, then passing different llm_config objects to the specific assistant agents in my AutoGen app.
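A sketch of that setup, assuming two OpenAI-compatible local servers (e.g. LM Studio instances) already running on ports 1234 and 1235, one model each; the model names, ports, and agent names are illustrative, not from the video:

```python
# Hypothetical setup: each local server exposes an OpenAI-compatible
# /v1 endpoint, so an AutoGen-style config only needs a base_url per model.
def local_llm_config(model, port):
    """Build an AutoGen-style llm_config pointing at a local server.
    The api_key is a placeholder; local servers typically ignore it."""
    return {
        "config_list": [{
            "model": model,
            "base_url": f"http://localhost:{port}/v1",
            "api_key": "not-needed",
        }],
        "cache_seed": None,  # disable response caching so each run hits the server
    }

coder_config = local_llm_config("codellama-7b", 1234)
writer_config = local_llm_config("mistral-7b", 1235)

# Each agent then gets its own config, e.g. (assuming pyautogen is installed):
# import autogen
# coder = autogen.AssistantAgent("coder", llm_config=coder_config)
# writer = autogen.AssistantAgent("writer", llm_config=writer_config)
```

The point of the pattern is that "which model" becomes a per-agent routing decision made entirely through the base URL, with no changes to the agent code itself.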
@Barakaflakkkka
5 months ago
Pretty sure if you use CUDA to assign models to separate GPUs you can run them in parallel - you may not have multiple GPUs in your Mac, though.
@abdulazizalmass
10 months ago
Thank you for the info. Kindly let us know the specs of your PC. I get very slow responses on my MacBook Air with 8GB of memory and an M1 CPU.
@xdasdaasdasd4787
10 months ago
You are a godsend. Thank you! I've been using it through WSL for Windows.
@Techonsapevole
10 months ago
Wow, fantastic. Open source models and the ecosystem get more powerful every day.
@Piotr_Sikora
10 months ago
It would be awesome to have a tutorial on how to create a fine-tuned model from e.g. Mistral to GGUF running with Ollama :)
@jeanfrancoisponcet9537
10 months ago
I commented about this a few weeks ago on one of your videos! Indeed, very useful for AutoGen (but also for LangChain).
@ryutenchi
10 months ago
Can you take a deep dive into using Modelfiles to make your own model for specialty tasks? Where can we find out things like token limits?
@chorton53
5 months ago
This was a fantastic video! Cheers for that!
@ikjb8561
10 months ago
Ollama is cool if you are looking to build a personal assistant on your own PC. If you try to hit a model with multiple requests, be prepared to wait in line.
@carrolte1
10 months ago
@4:56, I'm just gonna call that a fail. The response should have been, "It's-a me! Mario!"
@JinKee
10 months ago
4:50 Get him to say "It's-a me! Mario!"
@mordordew5706
10 months ago
Regarding the memory issue, can you integrate this with MemGPT? Could you please make a video on that?
@SirajFlorida
10 months ago
I'm concerned that Ollama creates additional accounts when you run the installer script. This software is interesting because of how fast it can switch between models, but it seems to have some security concerns at first glance.
@FitnessNationOfficial
10 months ago
AI Agent Here, Thanks For Information On How I Can Update My Software And Improve My AGI. Thanks!
@gbengaomoyeni4
10 months ago
@matthew_berman: You are very brilliant! I have been watching Ollama videos, but none of them taught how to use it with the API or structured it the way you did. Keep it coming, bro. Thank you so much. God bless!
@piyushlamsoge6007
10 months ago
Hi Matthew, you are doing amazing work teaching everyone about the real power of AI with the support of LLMs. I have a question: what should we do if we want to build something that works with any kind of document, like the models in this video? Is that possible, and if we build it, is there a way to deploy it in production as a website or application? Please make a video on it; I'm looking forward to it. Thank you!!!!!
@fungilation
10 months ago
Since Ollama doesn't run on Windows 11 yet, would LM Studio be the best alternative? How do the two compare? For example, does LM Studio also do hot-swapping between models and queue requests sequentially when there are pending queries to multiple models?
@AaronTurnerBlessed
10 months ago
I have the same questions!
@technovangelist
10 months ago
If you enable WSL2, Ollama runs fine on Windows today.
@YuryGurevich
9 months ago
Please continue development. Maybe include a local Redis cache on Docker and use it for conversation memory?
@samarbid13
10 months ago
More of Ollama!
@darkesco
8 months ago
WSL is kicking my butt. GPT-4 is helping, but told me I need to wait a few hours as I have exhausted my usage lol. I wish there was a way to use custom models with CrewAI without trying to trick my Windows system into thinking it is Ubuntu.
@shuntera
8 months ago
Works just fine under WSL for Windows.
@Airbag888
10 months ago
My end goal (or almost end goal) would be for my AI assistant to go over everything I've got (text, spreadsheets, videos, images, etc.) and have that in "mind" when I'm asking questions... so maybe next year :)
@chileanexperiment
10 months ago
What tech stack are you using?
@Airbag888
10 months ago
@chileanexperiment Nothing yet. As many others have pointed out, the goalposts keep moving due to such rapid development. Plus I'm a dad with small kids, so free time is limited and often unplanned.
@MrRandomnumbergenerator
4 months ago
Amazing! Would you make a video on how to add real-time voice interaction? Thanks
@tintin_teaches
10 months ago
Please make more videos on these topics in detail.
@AbhinavKumar-tx5er
1 month ago
Extremely useful video; the video I was looking for. I don't have a GPU configured, but I want to run and test this example. Where in the cloud should I test this, and what should the GPU configuration be?
@Jose-cd1eg
10 months ago
Amazing job!!! Everyone wants more!!
@Pietro-Caroleo-29
10 months ago
I was so excited last night I forgot my manners. If it's possible, Mr. Berman, I would really like to see models talking to each other via their dialogue windows, say by adding a conversation-starter window to set the topic and seeing the path of their conversational logic. Please. (Teams of separate models processing a given task.)
@pedroverde1674
7 months ago
Many thanks. It's really useful and really easy because you explain extremely well.
@gurudaki
4 months ago
Hi! Excellent work! I tried to replicate it, but when visiting the URL, the input prompt tab at the top right is missing...
@Junp0ppa
10 months ago
Matthew, how about using Docker to run Ollama on Windows? Would love to see your tutorial.
@chileanexperiment
10 months ago
Find any info on this?
@petersvideofile
10 months ago
Awesome. Could you integrate MemGPT now? :)
@710111225
8 months ago
Nice. I guess I was expecting some session support for conversations instead of re-submitting the earlier prompts with the latest one. Nothing like that?
@robertheinrich2994
10 months ago
The next big step will be some sort of open Copilot: essentially all the things Microsoft is promising with Copilot, but with an open source model running locally on Linux. Wouldn't that be fun?
@upsidedowngalaxy7625
10 months ago
Thanks!
@matthew_berman
10 months ago
Wow, thank you very much!
@Pietro-Caroleo-29
10 months ago
Great show. "Yes, dive deeper." Link them working together with bi-directional communication. How far can it go?
@wistonbritan2351
10 months ago
Your videos and explanations are very good. Is there a possibility of unlocking its potential and training it to be a very intellectual and articulate model, without any ideological ties or cognitive bias from having to conform to politically correct canons? A free AI. Is it possible?
@anshulsingh8326
4 months ago
Hi, is there any way to set the model download location? Currently it defaults to the C drive; I want it on the D drive.
@EffortlessEthan
10 months ago
I hope this works as well when they release it for Windows! Switching between models that fast is crazy!
@hy3na-xyz
10 months ago
Can't wait for the AutoGen expert video!!!
@eyoo369
10 months ago
Isn't this basically LLM chaining? I believe GPT-4 also runs multiple LLMs under the hood but assigns each query to a different sub-model in a streamlined way. Not sure, though.
@tanmayjuneja6128
10 months ago
Hey Matthew! Great video. Please help me with this: would hosting fine-tuned open source models on SageMaker cost less compared to the GPT-4 API? Is there a comparison anywhere on any forum, Reddit, etc.? I want to fine-tune a model on my data, and I am thinking of going with GPT-3.5-turbo fine-tuning, but it's really expensive at scale. I want to know how fine-tuned open source models compare on price (assuming we get good efficiency at our desired task after fine-tuning). Would really appreciate any thoughts on this. Thanks a lot!
@ChuckBaggett
6 months ago
I question the funniness of "What do you get when you mix hot water and salt? A boiling solution."
@renierdelacruz4652
10 months ago
Oh my god, what an amazing video.
@masteroleary
9 months ago
Can you implement AutoGen with this, with one agent as a code developer, one as a debugger, and one as a manager with human input, able to access multiple models?
@adnenmessaoudi9550
10 months ago
Really awesome, Matthew!! I have a request: can you make a video on a free LLM that can interact with big data like AWS Redshift, please?
@oscarcentenomora
1 month ago
Hi! How can you feed the chat with other information? Can you do a video on that?
@mbrochh82
10 months ago
Loved this, Matthew! Right to the point, super hands-on. This looks like an awesome project!
@jkbullitt8986
10 months ago
Awesome work!!!
@gru8299
6 months ago
Thank you very much! 🤝
@s-guytech9160
5 months ago
What do you mean by "in parallel"? From what I see, the second request had to wait for the first to complete; I don't think that is conventional parallelism.
@kumargupta7149
4 months ago
Thanks, I found it. Great help.
@alexandersims1613
10 months ago
There's got to be a way to localize millions of words to be referenced. For example, let's say you wanted to have a conversation with Jordan Peterson, so you had a file with ALL of his public books and ALL of his public speeches. You used THAT to create a model you could have long-form conversations with.
@benxfuture
10 months ago
Definitely. You could fine-tune the model using LoRA with all that. It would take some compute resources.
@codingwithai9145
10 months ago
Matthew, I really enjoyed this video. Why don't we do something specific: a chatbot for a specific purpose?
@WesTheWizard
10 months ago
Are the models that you can pull quantized, or should we still get our models from TheBloke?
@BetterThanTV888
10 months ago
Thanks for making it approachable. How would this work with Docker? And a portable NVMe drive?
@michalchik
10 months ago
I'm just starting out at this. Are these models only runnable with their set pretraining, or can we further train them on our own material? I have documents and old textbooks that I would like the models to absorb into their parameters, so I can emphasize certain types of knowledge relevant to the research I want to do.
@JinKee
10 months ago
GPT4All has "LocalDocs" support to draw on your documents.
@waynesbigw2305
7 months ago
Matthew, I followed you step by step. I'm on Linux, so it worked perfectly. But you never showed how you created the initial model file "mario", only how to edit it.
@romanmed9035
5 months ago
I even downloaded recently updated models, but they contain data from at least a year ago, and the data I need came out at the end of last year. How do I find out the approximate training-data cutoff and how current a model is?
@Artificialintelligenceo
10 months ago
Great vid!
@RuthwikaM122
6 months ago
How do I modify this code, especially the data I'm training the model with, to build my own custom FAQ chatbot? Please let me know how.
@Hanushmurugan
3 months ago
Whenever I try to run the same replica of the code, I always end up with an error. Whatever I do, I can't overcome it; it only returns half of the content.
@timeTegus
10 months ago
"From Scratch"... sure 😂😂😇
@jrfcs18
10 months ago
Show how to use Ollama to chat with documents. Good find.
@YadraVoat
6 months ago
Why are you using Visual Studio Code instead of VSCodium?
@hamsade
10 months ago
Thanks a lot for this. Quick point: they can't be running at the same time if they queue up and run sequentially! That was a bit misleading and contradictory, right?
@agntdrake
10 months ago
The Ollama server will block the second request and wait until it's able to process it.
Comments: 402