Typhoon 2.1 Collection Typhoon 2.1 Text ThaiLLM release by SCB 10X. • 6 items • Updated 4 days ago • 3
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published Feb 18 • 18
European LLMs Collection Large language models for European languages (multilingual and monolingual) • 13 items • Updated May 26, 2024 • 2
An Open Recipe: Adapting Language-Specific LLMs to a Reasoning Model in One Day via Model Merging Paper • 2502.09056 • Published Feb 13 • 32
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published Jan 13 • 99
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 280
Typhoon 2 Text Collection Typhoon 2 Text ThaiLLM release by SCB 10X. • 20 items • Updated 4 days ago • 5
Typhoon 2 Multimodal Collection Latest Official Multimodal ThaiLLM release by SCB 10X. • 3 items • Updated 4 days ago • 3
Unleashing Reasoning Capability of LLMs via Scalable Question Synthesis from Scratch Paper • 2410.18693 • Published Oct 24, 2024 • 43
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated May 5 • 230
Probably function calling datasets Collection Created using the https://huggingface.co/spaces/librarian-bots/dataset-column-search-api Space. • 39 items • Updated Jul 17, 2024 • 38
SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts Paper • 2405.07518 • Published May 13, 2024 • 28
Model Merging Collection Model Merging is a very popular technique nowadays in LLM. Here is a chronological list of papers on the space that will help you get started with it! • 30 items • Updated Jun 12, 2024 • 241
Interesting Datasets Collection A collection of datasets that I come across • 9 items • Updated Jan 4, 2024 • 5
Contra (Bottleneck T5) Collection Text autoencoders capable of embedding and generating text in a fixed-size latent space, useful for embeddings and latent space text editing. • 4 items • Updated Oct 3, 2023 • 28