Falcon 2: An 11B parameter pretrained language model and VLM, trained on over 5000B tokens and 11 languages • May 24, 2024 • 25
Falcon Mamba: The First Competitive Attention-free 7B Language Model Paper • 2410.05355 • Published Oct 7, 2024 • 35
Post: Falcon Mamba is now available in llama.cpp! Check out the GGUF files uploaded here: tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a
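For anyone who wants to try those GGUF weights from Python rather than the llama.cpp CLI, here is a minimal sketch using the llama-cpp-python bindings. The repo id and filename below are hypothetical placeholders; pick the actual GGUF repo and quantization variant from the collection linked above.

```python
# Minimal sketch: running a Falcon Mamba GGUF file through llama-cpp-python.
# NOTE: repo_id and filename are hypothetical placeholders; use the actual
# GGUF repo/variant from the collection linked in the post.
from llama_cpp import Llama  # pip install llama-cpp-python huggingface-hub

llm = Llama.from_pretrained(
    repo_id="tiiuae/falcon-mamba-7b-GGUF",  # placeholder repo id
    filename="*Q4_K_M.gguf",                # glob over the quant variants
)
out = llm("Mamba is a state-space model that", max_tokens=64)
print(out["choices"][0]["text"])
```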
Post: FalconMamba 7B, a new model from TII (Technology Innovation Institute), is out!
- Blogpost: https://huggingface.co/blog/falconmamba
- Link to collection: tiiuae/falconmamba-7b-66b9a580324dd1598b0f6d4a
- Link to playground: tiiuae/falcon-mamba-playground
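As a quick start, here is a minimal sketch of loading the model with transformers. It assumes the base checkpoint lives at "tiiuae/falcon-mamba-7b" (confirm the exact repo id in the collection above) and a transformers release recent enough to include FalconMamba support.

```python
# Minimal sketch: generating with FalconMamba via transformers.
# Assumes the checkpoint id "tiiuae/falcon-mamba-7b" (check the collection)
# and a transformers version with FalconMamba support.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-mamba-7b"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

prompt = "The main advantage of attention-free models is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```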
Post: Check out the quantized weights from ISTA-DASLab directly on their organisation page: https://huggingface.co/ISTA-DASLab, with official weights for AQLM (2-bit quantization) & QMoE (1-bit MoE quantization). Read more about these techniques below:
- AQLM paper: Extreme Compression of Large Language Models via Additive Quantization (2401.06118)
- QMoE paper: QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models (2310.16795)
Some useful links:
- AQLM repo: https://github.com/Vahe1994/AQLM
- How to use AQLM & transformers: https://huggingface.co/docs/transformers/quantization#aqlm
- How to use AQLM & PEFT: https://huggingface.co/docs/peft/developer_guides/quantization#aqlm-quantizaion
Great work from @BlackSamorez and team!
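Since the post links the AQLM & PEFT guide, here is a minimal sketch of attaching LoRA adapters on top of an AQLM-quantized checkpoint. The checkpoint id and target modules are assumptions; follow the linked docs for the authoritative recipe.

```python
# Minimal sketch: LoRA fine-tuning on top of an AQLM-quantized model via PEFT.
# Requires: pip install aqlm[gpu] peft transformers accelerate
# The checkpoint id and target_modules are assumptions; see the linked
# PEFT quantization guide for the authoritative recipe.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained(
    "ISTA-DASLab/Llama-2-7b-AQLM-2Bit-1x16-hf",  # assumed checkpoint id
    device_map="auto",
)
peft_config = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the LoRA weights are trainable
```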
Post: Try out Mixtral 2-bit on a free-tier Google Colab notebook right now! https://colab.research.google.com/drive/1-xZmBRXT5Fm3Ghn4Mwa2KRypORXb855X?usp=sharing
The AQLM method has recently been introduced on the transformers main branch. The 2-bit model can be found here: BlackSamorez/Mixtral-8x7b-AQLM-2Bit-1x16-hf-test-dispatch. You can read more about the method here: https://huggingface.co/docs/transformers/main/en/quantization#aqlm
Great work @BlackSamorez and team!
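If Colab is not an option, the same checkpoint can be loaded locally. Below is a minimal sketch, assuming the aqlm kernels and a transformers build with AQLM support are installed.

```python
# Minimal sketch: loading the 2-bit AQLM Mixtral checkpoint named in the post.
# Assumes: pip install aqlm[gpu] accelerate, plus a transformers version that
# includes AQLM support (it landed on the main branch at the time of the post).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "BlackSamorez/Mixtral-8x7b-AQLM-2Bit-1x16-hf-test-dispatch"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

inputs = tokenizer("Mixture-of-experts models are", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```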
Distributed Inference and Fine-tuning of Large Language Models Over The Internet Paper • 2312.08361 • Published Dec 13, 2023 • 28
Petals: Collaborative Inference and Fine-tuning of Large Models Paper • 2209.01188 • Published Sep 2, 2022 • 1
Do Pedestrians Pay Attention? Eye Contact Detection in the Wild Paper • 2112.04212 • Published Dec 8, 2021 • 1
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale Paper • 2208.07339 • Published Aug 15, 2022 • 5
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022 • 31