Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10, 2025 • 99
WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines Paper • 2410.12705 • Published Oct 16, 2024 • 33
ASCEND: A Spontaneous Chinese-English Dataset for Code-switching in Multi-turn Conversation Paper • 2112.06223 • Published Dec 12, 2021
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset Paper • 2201.02419 • Published Jan 7, 2022
Cross-Lingual Cross-Age Group Adaptation for Low-Resource Elderly Speech Emotion Recognition Paper • 2306.14517 • Published Jun 26, 2023
A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity Paper • 2302.04023 • Published Feb 8, 2023
Which One Are You Referring To? Multimodal Object Identification in Situated Dialogue Paper • 2302.14680 • Published Feb 28, 2023
InstructAlign: High-and-Low Resource Language Alignment via Continual Crosslingual Instruction Tuning Paper • 2305.13627 • Published May 23, 2023 • 1
NusaWrites: Constructing High-Quality Corpora for Underrepresented and Extremely Low-Resource Languages Paper • 2309.10661 • Published Sep 19, 2023 • 1
Greenformer: Factorization Toolkit for Efficient Deep Neural Networks Paper • 2109.06762 • Published Sep 14, 2021 • 1
Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models Paper • 2310.05338 • Published Oct 9, 2023
NusaCrowd: A Call for Open and Reproducible NLP Research in Indonesian Languages Paper • 2207.10524 • Published Jul 21, 2022
NusaCrowd: Open Source Initiative for Indonesian NLP Resources Paper • 2212.09648 • Published Dec 19, 2022
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages Paper • 2404.06138 • Published Apr 9, 2024 • 3
CVQA: Culturally-diverse Multilingual Visual Question Answering Benchmark Paper • 2406.05967 • Published Jun 10, 2024 • 6
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages Paper • 2406.10118 • Published Jun 14, 2024 • 33