By specifying the model in the ollama.embeddings() call and in the OllamaEmbeddings class, what goes on behind the scenes and how is that model utilized in that scenario? Are there advantages to different models specified for the embedding process?
@moslehmahamud
4 months ago
Very insightful question. I'm assuming the embeddings are extracted from the linear layer at the end of the Llama architecture (this is an assumption, of course). As for advantages, it depends on your use case, but trying these embeddings can be an additional experiment. It's also worth checking other embedding models.
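For reference, here is a minimal sketch of the three call styles covered in the video, assuming a local Ollama server on the default port with llama3 already pulled. Package and parameter names follow each library's public API, so double-check against the versions you have installed:

```python
# Sketch: three ways to ask a locally running Ollama server for llama3
# embeddings (assumes `ollama serve` is up and `ollama pull llama3` was run).
# All three ultimately hit the same server, so the model name passed in is
# what actually produces the vector.

def embed_via_ollama(text: str):
    import ollama  # pip install ollama
    return ollama.embeddings(model="llama3", prompt=text)["embedding"]

def embed_via_langchain(text: str):
    from langchain_community.embeddings import OllamaEmbeddings  # pip install langchain-community
    return OllamaEmbeddings(model="llama3").embed_query(text)

def embed_via_llamaindex(text: str):
    from llama_index.embeddings.ollama import OllamaEmbedding  # pip install llama-index-embeddings-ollama
    return OllamaEmbedding(model_name="llama3").get_text_embedding(text)

if __name__ == "__main__":
    vec = embed_via_ollama("hello world")
    print(len(vec))
```

The imports are kept inside the functions so you only need the package for the route you actually use.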
@htrnhtrn6986
4 months ago
Is it true that the embedding values of the three methods are different for the same sentence?
@moslehmahamud
4 months ago
Possible. Worth checking by comparing the vectors directly.
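An easy way to check empirically is to compare the vectors the methods return, for example with cosine similarity. A sketch with stand-in vectors (in practice you would pass the real embeddings from each method):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Stand-in vectors; replace with the embeddings returned by two of the methods.
v1 = [0.1, 0.9, 0.4]
v2 = [0.1, 0.9, 0.4]
print(cosine_similarity(v1, v2))  # ~1.0 when two methods agree
```

A similarity very close to 1.0 means the two methods produced essentially the same vector for the sentence.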
@LuigiBungaro
4 months ago
Thanks for sharing :) During the initial package installation I also had to run `pip install llama-index-embeddings-ollama` in order to run `from llama_index.embeddings.ollama import OllamaEmbedding`.
@moslehmahamud
4 months ago
Thanks! That's correct, I'll add it to the notebook. Forgot that I installed it before.
@wilfredomartel7781
4 months ago
😊
@CasperW-o2u
1 month ago
Hi, thank you for this video. But I am still confused. Why can Ollama use llama3, which is an LLM, to embed? I have only used embedding models like Jina and bge before; the input of an embedding model is natural language and the output is vectors. I thought the input and output of an LLM were both natural language, so how does Ollama get vectors? Looking forward to your reply.
@abdulqadircp
1 month ago
"I thought the input and output of LLM were both natural languages, how do ollama get vectors?" The text input to an LLM has to be converted into embeddings, i.e. meaningful numeric representations of the words. Each LLM learns these embeddings during training and has its own embeddings afterwards, so each word is represented as a vector.
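To make the reply above concrete, here is a toy sketch (made-up numbers, not real llama3 weights) of how per-token vectors can be pooled into a single sentence embedding. Mean pooling is one common choice, though what Ollama does internally may differ:

```python
# Toy sketch of how per-token vectors can become one sentence embedding.
# Real llama3 hidden states are 4096-dimensional; these 4-d numbers are made up.

def mean_pool(token_vectors):
    """Average the per-token hidden-state vectors into a single vector."""
    n = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(v[i] for v in token_vectors) / n for i in range(dim)]

hidden_states = [
    [0.2, -1.0, 0.5, 0.3],   # pretend hidden state for token "hello"
    [0.6,  0.0, 0.1, -0.3],  # pretend hidden state for token "world"
]
print(mean_pool(hidden_states))
```

The key point: the model's internal representation of the text is already numeric, so an embedding endpoint just exposes those numbers instead of sampling the next token.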
@RomyIlano
1 month ago
thanks!!!
@fernandolbf_
3 months ago
Just to be sure: are the results of the three methods shown always the same, with only the speed differing? Please explain the differences between the methods.
@moslehmahamud
3 months ago
Hi, this video shows how to use llama3 with different packages, so that people can use and experiment with them in their LLM-based applications. Hope it helps!
@johnbrandt5158
4 months ago
Hey! What are your computer specs? Wondering how that may affect speed, either positively or negatively.
@moslehmahamud
4 months ago
Hey! Using an M1 MacBook Pro (2020). Works decently for basic inference. Training is a bit of a struggle, as expected. Let me know if you have any tips.
@jennilthiyam1261
3 months ago
Hi, thank you for this video. I am using a server where we are not allowed to install anything on the base system; we have to create a Docker container. I installed Ollama in Docker with `docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama`, and I am able to run llama3 with `docker exec -it ollama ollama run llama3`. Could you please tell me how I can follow your approach to use Ollama for embeddings? I want to use llama3 from Ollama as the embedding model like you did in the video.
@moslehmahamud
3 months ago
I ran Ollama locally from the terminal, not from Docker; I found it quite cumbersome to connect to GPUs from Docker.
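For the Docker setup described in the question, one option is to call Ollama's REST API directly from the host, since the container was started with `-p 11434:11434`. A sketch (endpoint name per Ollama's API docs; the call itself obviously needs the container running):

```python
# Sketch: call the Ollama REST API of a container started with
# `-p 11434:11434` directly from the host; no `docker exec` required.
import json
from urllib import request

def build_embed_request(text, model="llama3", base_url="http://localhost:11434"):
    """Build (url, body) for Ollama's /api/embeddings endpoint."""
    url = f"{base_url}/api/embeddings"
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return url, body

def ollama_embed(text, model="llama3"):
    url, body = build_embed_request(text, model)
    req = request.Request(url, data=body,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:  # requires the Ollama container to be up
        return json.loads(resp.read())["embedding"]

if __name__ == "__main__":
    print(len(ollama_embed("hello world")))
```

The LangChain and LlamaIndex wrappers shown in the video talk to this same endpoint, so they should also work against the container by pointing their base URL at `http://localhost:11434`.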
@yorkmena
1 month ago
@@moslehmahamud You mentioned you are using an M1 MacBook Pro. Can you run Ollama on the GPU on a MacBook? From what I know it only uses CPUs. Please give some insight, it would help.
@alejandrogallardo1414
4 months ago
You ran LLama 3 8B locally on a Mac?!
@moslehmahamud
4 months ago
Yes sir
@jenot7164
2 months ago
I am running it on a beefy laptop GPU to make inference fast. M1 chips are pretty impressive.
@moslehmahamud
2 months ago
@@jenot7164 I was surprised myself that the M1 works; I'd prefer some beefy setup though.
@zoedaemon4940
2 months ago
My cheap laptop's RTX 4050 (just 6 GB 😢) was screaming in pain over a single text-generation response; it took 15 minutes to see the output 😅
@jenot7164
2 months ago
@@zoedaemon4940 I think the reason is that your CPU was used instead. A 4050 should be capable of this. Maybe it switched to the CPU because of insufficient VRAM.
@kevin-dorean
3 months ago
Is it possible to get "image" embeddings from Llama3?
@moslehmahamud
3 months ago
Not sure, but there are other strong image embedding models. Making a video on that as I write this.
@kevin-dorean
3 months ago
@@moslehmahamud As you said, I need to use some embedding models such as CLIP or SigLIP, right?
Comments: 32