ISTA-DASLab/Llama-2-7b-AQLM-2Bit-1x16-hf
Text Generation • 1B • Updated • 268 • 5
MatryoshkaLoRA: Learning Accurate Hierarchical Low-Rank Representations for LLM Fine-Tuning
GSQ: Highly-Accurate Low-Precision Scalar Quantization for LLMs via Gumbel-Softmax Sampling