Sea AI Lab (verified company)
Sailing in South-East Asia with Inclusive Multilingual LLMs
- Sailor2 20B Chat: Chat with Sailor2, a multilingual AI assistant (Space • 26)
- Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs (Paper • 2502.12982 • Published • 18)
- sail/Sailor2-8B-Chat (Text Generation • 9B • Updated • 2.29k • 19)
- sail/Sailor2-1B-Chat (Text Generation • 1.0B • Updated • 1.71k • 16)
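As a quick orientation, here is a minimal usage sketch for the chat checkpoints listed above, assuming the standard Hugging Face transformers chat-template API; the prompt and generation settings are placeholders, not an excerpt from the official model cards.

```python
# Minimal sketch: chatting with sail/Sailor2-1B-Chat via transformers.
# Prompt and generation settings are illustrative placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sail/Sailor2-1B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "Tolong perkenalkan dirimu dalam bahasa Indonesia."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```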
Increase your vocabulary size when you scale up your language model
- Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies (Paper • 2407.13623 • Published • 57)
- Scaling With Vocab Demo: Predict optimal vocabulary size based on model parameters (Space • 11)
- sail/scaling-vocab-3b-43k-overtrain (Text Generation • 3B • Updated • 24)
- sail/scaling-vocab-3b-32k-overtrain (Text Generation • 3B • Updated • 13)
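The demo above maps a model's non-vocabulary parameter count to a suggested vocabulary size. Purely as an illustration of that kind of power-law relationship, here is a toy sketch; the coefficient and exponent are invented placeholders, not the fitted values from paper 2407.13623.

```python
# Toy power-law sketch: vocabulary size growing sublinearly with non-vocabulary
# parameters. COEFF and EXPONENT are illustrative placeholders only; consult the
# Scaling With Vocab Demo or the paper for real fits.
COEFF, EXPONENT = 1.0, 0.5

def suggested_vocab_size(non_vocab_params: float) -> int:
    return int(COEFF * non_vocab_params ** EXPONENT)

for n_params in (1e9, 3e9, 7e9):
    print(f"{n_params:.0e} non-vocab params -> vocab ~ {suggested_vocab_size(n_params):,}")
```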
Sailor: Open Language Models tailored for South-East Asia (SEA) released by Sea AI Lab.
- Sailor 14B Chat: Generate responses to text questions in multiple languages (Space • 6)
- Sailor: Open Language Models for South-East Asia (Paper • 2404.03608 • Published • 21)
- sail/Sailor-14B (Text Generation • 14B • Updated • 19 • 6)
- sail/Sailor-7B (Text Generation • 8B • Updated • 29 • 28)
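For the base (non-chat) Sailor checkpoints above, a completion-style sketch using the transformers pipeline API; the Indonesian prompt is an arbitrary placeholder.

```python
# Minimal sketch: plain text completion with a base Sailor checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="sail/Sailor-7B", device_map="auto")
print(generator("Jakarta adalah", max_new_tokens=50)[0]["generated_text"])
```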
Efficient Process Reward Model Training via Active Learning.
- Understanding R1-Zero-Like Training: A Critical Perspective (Paper • 2503.20783 • Published • 52)
- sail/Qwen2.5-Math-7B-Oat-Zero (Text Generation • 8B • Updated • 2.74k • 5)
- sail/Qwen2.5-Math-1.5B-Oat-Zero (Text Generation • 2B • Updated • 4.5k • 3)
- sail/Llama-3.2-3B-Oat-Zero (Text Generation • 3B • Updated • 21 • 1)
Automatic data mixture method for large language model pre-training
- RegMix: Generate regression predictions from CSV data (Space • 6)
- RegMix: Data Mixture as Regression for Language Model Pre-training (Paper • 2407.01492 • Published • 39)
- sail/data-mixture-human-1b (Text Generation • Updated • 20 • 3)
- sail/data-mixture-pile-cc-1b (Text Generation • Updated • 47 • 3)
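RegMix frames data-mixture selection as regression: train small proxy models on different domain mixtures, fit a regressor from mixture weights to validation loss, and keep the mixture the regressor predicts to be best. Below is a toy sketch of that loop; the mixture weights and losses are invented placeholders, and ordinary least squares stands in for the paper's actual regressor.

```python
# Toy sketch of a RegMix-style loop: regress validation loss on mixture weights,
# then pick the candidate mixture with the lowest predicted loss.
import numpy as np
from sklearn.linear_model import LinearRegression

# Each row: domain weights (e.g. web, code, books) used for one small proxy run.
mixtures = np.array([
    [0.6, 0.2, 0.2],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
    [0.4, 0.4, 0.2],
])
# Validation losses observed for those proxy runs (placeholder numbers).
losses = np.array([2.95, 2.88, 3.01, 2.90])

regressor = LinearRegression().fit(mixtures, losses)

# Score many candidate mixtures and keep the one predicted to train best.
candidates = np.random.dirichlet(np.ones(3), size=1000)
best = candidates[np.argmin(regressor.predict(candidates))]
print("predicted-best mixture:", best.round(3))
```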
Self-alignment with DPO Implicit Rewards
- Bootstrapping Language Models with DPO Implicit Rewards (Paper • 2406.09760 • Published • 41)
- sail/Llama-3-Base-8B-DICE-Iter1 (Text Generation • 8B • Updated • 27 • 2)
- sail/Llama-3-Base-8B-DICE-Iter2 (Text Generation • 8B • Updated • 34 • 3)
- sail/Zephyr-7B-DICE-Iter1 (Text Generation • 7B • Updated • 94)
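DICE bootstraps a DPO-trained model with the DPO implicit reward, r(x, y) = beta * log(pi(y|x) / pi_ref(y|x)). Below is a minimal sketch of computing that quantity for one prompt/response pair; the reference-model choice, beta, and the prompt are placeholders rather than the authors' exact setup, and tokenizer boundary effects are ignored.

```python
# Minimal sketch of the DPO implicit reward used by DICE:
#   r(x, y) = beta * (log pi(y|x) - log pi_ref(y|x))
# Model choices, beta, and the prompt/response are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def response_logprob(model, tokenizer, prompt, response):
    """Sum of log-probabilities the model assigns to the response tokens."""
    prompt_len = tokenizer(prompt, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(prompt + response, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    logprobs = torch.log_softmax(logits[:, :-1], dim=-1)  # predicts tokens 1..L-1
    targets = full_ids[:, 1:]                             # the tokens actually observed
    per_token = logprobs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
    return per_token[:, prompt_len - 1:].sum()            # keep only response tokens

beta = 0.1  # placeholder value
policy_id = "sail/Llama-3-Base-8B-DICE-Iter1"
reference_id = "meta-llama/Meta-Llama-3-8B"  # placeholder reference, not necessarily the paper's

tokenizer = AutoTokenizer.from_pretrained(policy_id)
policy = AutoModelForCausalLM.from_pretrained(policy_id)
reference = AutoModelForCausalLM.from_pretrained(reference_id)

prompt, response = "Question: What is 2 + 2?\nAnswer:", " 4"
implicit_reward = beta * (
    response_logprob(policy, tokenizer, prompt, response)
    - response_logprob(reference, tokenizer, prompt, response)
)
print("DPO implicit reward:", implicit_reward.item())
```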