LLM Quantization: GPTQ - AutoGPTQ
llama.cpp - ggml.c - GGUF - C++
Comparison with HF Transformers 4-bit quantization.
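The core idea shared by these 4-bit schemes (GPTQ, GGUF quants, HF 4-bit loading) can be sketched in plain Python: each block of weights gets one floating-point scale, and the weights themselves are stored as 4-bit integers. This is an illustrative simplification, not the exact GPTQ or GGUF on-disk format:

```python
# Simplified 4-bit block quantization, in the spirit of GPTQ/GGUF-style
# schemes (illustration only; real formats differ in layout and rounding).

def quantize_block(weights, block_size=32):
    """Quantize a list of floats to 4-bit ints with one scale per block."""
    blocks = []
    for i in range(0, len(weights), block_size):
        block = weights[i:i + block_size]
        # map the largest magnitude in the block to +/-7; guard against all-zero
        scale = max(abs(w) for w in block) / 7 or 1.0
        q = [max(-8, min(7, round(w / scale))) for w in block]
        blocks.append((scale, q))
    return blocks

def dequantize(blocks):
    """Reconstruct approximate floats from (scale, int4-list) blocks."""
    return [scale * v for scale, q in blocks for v in q]

weights = [0.12, -0.5, 0.33, 0.07, -0.21, 0.44, -0.09, 0.18]
restored = dequantize(quantize_block(weights, block_size=8))
# round-trip error per weight is bounded by half a quantization step (scale / 2)
```

Storing 4 bits per weight plus a small per-block scale is what cuts the memory footprint to roughly a quarter of fp16, at the cost of this bounded rounding error.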
Download web UI wrappers for your heavily quantized LLM to your local machine (Windows, Linux, macOS).
LLMs on Apple hardware with an M1, M2, or M3 chip.
Run inference of your LLMs on your local machine with heavy quantization applied.
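To see why heavy quantization matters for local inference, a back-of-envelope memory estimate helps; the 7B parameter count below is an assumption for a Llama-2-7B-class model, and real usage adds KV cache, activations, and runtime overhead on top:

```python
# Rough memory needed just for the weights of a 7B-parameter model
# (illustrative arithmetic; KV cache and activations come on top).

def weight_memory_gb(n_params, bits_per_weight):
    """Bytes for the weights alone, converted to gibibytes."""
    return n_params * bits_per_weight / 8 / 1024**3

n = 7_000_000_000            # assumed Llama-2-7B-class parameter count
fp16 = weight_memory_gb(n, 16)   # ~13 GiB: too large for most consumer GPUs
q4 = weight_memory_gb(n, 4.5)    # ~3.7 GiB: 4-bit weights plus scale overhead
```

At roughly 4.5 effective bits per weight (4-bit values plus per-block scales), a 7B model fits comfortably in the RAM of a typical laptop or an 8 GB GPU, which is exactly the setup this tutorial targets.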
Plus: 8 web UIs for GPTQ, llama.cpp, AutoGPTQ, ExLlama, or GGUF.
koboldcpp
oobabooga text-generation-webui
ctransformers
lmstudio.ai/
github.com/mar...
github.com/gge...
github.com/rus...
huggingface.co...
github.com/Pan...
cloud.google.c...
huggingface.co...
h2o.ai/platfor...
#quantization
#ai
#webui
New Tutorial on LLM Quantization w/ QLoRA, GPTQ and Llamacpp, LLama 2