Article • Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM • 11 days ago • 332
Article • Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference • Jan 16 • 71
Article • From cloud to developers: Hugging Face and Microsoft Deepen Collaboration • May 21, 2024 • 8
Article • Unlocking Longer Generation with Key-Value Cache Quantization • May 16, 2024 • 45
DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models Paper • 2309.03883 • Published Sep 7, 2023 • 35
Flamingo: a Visual Language Model for Few-Shot Learning Paper • 2204.14198 • Published Apr 29, 2022 • 15
Llama 2: Open Foundation and Fine-Tuned Chat Models Paper • 2307.09288 • Published Jul 18, 2023 • 244