Mangosteen, a 47-billion-token Thai corpus built with a Thai-adapted pipeline, improves language model performance on Thai benchmarks.
Wannaphong Phatthiyaphaibun
wannaphong
Recent Activity
upvoted a paper 3 days ago
DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research
published a model 9 days ago
wannaphong/p-mark15-1
published a dataset 9 days ago
wannaphong/sailor2-sft-stage1-thai