CIKMar: A Dual-Encoder Approach to Prompt-Based Reranking in Educational Dialogue Systems Paper • 2408.08805 • Published Aug 16, 2024 • 2
Constructing and Expanding Low-Resource and Underrepresented Parallel Datasets for Indonesian Local Languages Paper • 2404.01009 • Published Apr 1, 2024 • 2
Language Surgery in Multilingual Large Language Models Paper • 2506.12450 • Published 13 days ago • 16
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10 • 99
Lius - Translation Models Collection Collection An effort to build LLM-based translation models for the Malay Kupang language. • 5 items • Updated 23 days ago • 1
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark S Collection SEACrowd is a community movement project aimed at centralizing and standardizing AI resources for Southeast Asian languages, cultures, and/or regions. • 3 items • Updated Jun 18, 2024 • 8
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages Paper • 2406.10118 • Published Jun 14, 2024 • 33