A Llama-2 7B LLM with a 400K-token context length has been built with a new method, based on activation compression and activation beacons over context intervals. Using a sliding-window methodology, a 400K context length for a Llama-2 7B LLM has been tested. Here are the results.
A new competitive method to extend the context length of LLMs, not just by fine-tuning? Is it useful to extend your LLM from 4K to maybe just 32K context length? What compute infrastructure do you need? What training dataset is necessary? What are the sensitive parameters when training on additionally injected condensed activation tokens? Is this 400K context window perfect for simple RAG, with no need for complex, modular RAG systems? Let's have a look at the latest research.
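To give a feel for the core idea, here is a minimal sketch of condensing activations with a sliding window. This is an illustrative stand-in, not the paper's implementation: the real Activation Beacon uses learned beacon tokens and trained attention, while this sketch simply mean-pools each window's activations into a few "beacon" vectors. The function names and parameters (`condense_window`, `sliding_condense`, `num_beacons`) are hypothetical.

```python
import numpy as np

def condense_window(activations, num_beacons):
    """Compress one window of token activations (window_len x dim)
    into num_beacons beacon activations by mean-pooling equal-sized
    chunks. A crude stand-in for the paper's learned condensing."""
    chunks = np.array_split(activations, num_beacons, axis=0)
    return np.stack([chunk.mean(axis=0) for chunk in chunks])

def sliding_condense(activations, window=1024, num_beacons=16):
    """Slide over a long activation sequence and replace each raw
    window with its condensed beacon activations, shrinking the
    effective context the model must attend to."""
    beacons = []
    for start in range(0, activations.shape[0], window):
        beacons.append(condense_window(activations[start:start + window],
                                       num_beacons))
    return np.concatenate(beacons, axis=0)

# Example: 4096 token activations of dim 64 condense to 4 * 16 = 64 beacons,
# a 64x reduction in sequence length at this (assumed) compression ratio.
acts = np.random.rand(4096, 64)
condensed = sliding_condense(acts, window=1024, num_beacons=16)
print(condensed.shape)  # (64, 64)
```

The compression ratio (here 1024 tokens → 16 beacons per window) is the kind of sensitive training parameter the questions above hint at.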
Literature (all rights with authors):
Soaring from 4K to 400K: Extending LLM’s Context with Activation Beacon
arxiv.org/pdf/2401.03462.pdf
#ai
#newtechnology
#research
LLama-2 7B: 400K context length - Beyond Limits?