Great content. Hope to see more on what Cohere is doing.
@joeybasile1572
3 months ago
Nice video.
@user-wr4yl7tx3w
A year ago
Why 768?
@chrisalexiuk
A year ago
Likely tuned during training and found to be the best! They don't provide much specific detail on this point.
@user-wr4yl7tx3w
A year ago
Is there a way to see how they convert words into embeddings? Is it by predicting context from word or vice versa?
@chrisalexiuk
A year ago
You can check out this blog post, which goes into more detail about their model: txt.cohere.com/multilingual/ Though they're fairly loose on the details!
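If you just want to poke at the embeddings themselves, something along these lines should work with the Cohere Python SDK. This is a rough sketch, not the exact code from the video: the model ID and client setup are my assumptions and may differ depending on your SDK version.

```python
import cohere

# Assumed setup: your own API key and the multilingual embedding model
# described in the blog post (model ID is an assumption; check Cohere's docs).
co = cohere.Client("YOUR_API_KEY")

response = co.embed(
    texts=["Hello world", "Bonjour le monde"],
    model="embed-multilingual-v2.0",
)

# Each text comes back as one dense vector; its length is the embedding dimension.
print(len(response.embeddings[0]))  # expected: 768
```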
@user-wr4yl7tx3w
A year ago
Do different models give entirely different embeddings? Do the embeddings also depend on the size of the training data?
@chrisalexiuk
A year ago
1. Most likely, yes - there are scenarios where you'd wind up with similar embeddings, but those are unlikely at best. 2. Yes, they depend on the vocabulary and the instances/documents/passages used in training.
@user-wr4yl7tx3w
A year ago
But isn't that a lot of dot scores to calculate, if we're talking about all of Wikipedia?
@chrisalexiuk
A year ago
It is, but it's vectorized with `torch.mm`, so it's not too bad. Though we're only using a sample of the data - and I'd suggest doing some pre-filtering first if you want the best performance.
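If it helps, here's roughly what that vectorized scoring looks like - a minimal sketch with made-up shapes, where random tensors and names like `doc_embeddings` stand in for real precomputed embeddings:

```python
import torch

# Stand-ins for real data: 10k passage embeddings and one query embedding,
# each 768-dimensional (as in the video).
doc_embeddings = torch.randn(10_000, 768)
query_embedding = torch.randn(1, 768)

# One matrix multiply computes every dot score at once:
# (1 x 768) @ (768 x 10_000) -> (1 x 10_000)
dot_scores = torch.mm(query_embedding, doc_embeddings.T)

# Pull out the indices of the 5 highest-scoring passages.
top_scores, top_idx = torch.topk(dot_scores, k=5, dim=1)
print(top_idx)
```

So the cost is one big matrix multiply rather than millions of separate Python-level dot products, which is why it stays manageable even at Wikipedia scale (and pre-filtering shrinks it further).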
Comments: 10