Language Surgery in Multilingual Large Language Models Paper • 2506.12450 • Published 15 days ago • 16
Softpick: No Attention Sink, No Massive Activations with Rectified Softmax Paper • 2504.20966 • Published Apr 29 • 31
🧠 Reasoning datasets Collection Datasets with reasoning traces for math and code released by the community • 24 items • Updated May 19 • 152
🔱 Sailor2 Language Models Collection Sailing in South-East Asia with Inclusive Multilingual LLMs • 34 items • Updated 25 days ago • 28
GotongRoyong Collection GotongRoyong is a series of language models focused on Mixture of Experts (MoE), made with the following models using LazyMergekit and cg123/mergekit. • 2 items • Updated Jan 14, 2024 • 1
DukunLM Collection DukunLM is an open-source language model trained to generate Indonesian text using the power of AI. The name DukunLM means "WizardLM" in Indonesian. • 5 items • Updated Oct 31, 2023 • 1
Starstreak Collection Starstreak is a series of language models that have been trained to generate content in English, Indonesian, and traditional Indonesian languages • 2 items • Updated Nov 19, 2023 • 1