Longxu Dou PRO

dreamerdeo

AI & ML interests

Natural Language Processing

Recent Activity

reacted to their post with ➕ 1 day ago
🚀 Excited to share our technical report on the Southeast Asian multilingual model Sailor2 and its latest updates! Our 49-page report details Sailor2's development journey, including multilingual data cleaning, small model data mixture simulations, multi-stage continual pre-training, multi-stage post-training, and multi-cultural multi-lingual evaluations. Sailor2 aims to streamline the multilingual model pre-training process efficiently for the community.
🧭 We highlight Sailor2's impressive performance in low-resource language translation scenarios and its cultural understanding advantages in Southeast Asia, promoting practical applications for regional languages. Model updates include:
💡 More precise outputs: Reduced redundancy in model outputs through refined post-training data and optimization techniques.
🌈 Handling longer texts: Expanded to handle up to 128K context length in Southeast Asian languages through long-text training.
⚡️ Faster inference: Achieved 2.5x faster inference speed with speculative decoding.
🌪️ More model sizes: Introduced new sizes of 3B and 14B through model pruning.
🌟 All models are Apache-licensed for commercial use; development tools (code, resources) are open-source.
📚 Technical report: https://huggingface.co/papers/2502.12982
🤖️ Models: https://huggingface.co/collections/sail/sailor2-language-models-674d7c9e6b4dbbd9a869906b
💬 Demo: https://huggingface.co/spaces/sail/Sailor2-20B-Chat
📣 Sailor2 community: https://huggingface.co/sailor2
updated a Space 1 day ago
sailor2/README
new activity 1 day ago
sail/Sailor2-8B-Chat:Fix formatting

Organizations

Sea AI Lab, Table Research Lab, Sea Language Team, ZeroGPU Explorers, Sailor2, Sea AI Lab-Sailor, Sailor2 Evaluation

dreamerdeo's activity

upvoted an article 7 months ago

RegMix: Data Mixture as Regression for Language Model Pre-training

By SivilTaram • 11
upvoted an article 10 months ago

Large-scale Near-deduplication Behind BigCode

• 21