IndicXTREME (Collection) — a human-supervised benchmark of 9 diverse NLU tasks across 20 languages, featuring 105 evaluation sets in total. 8 items, updated Oct 23, 2024.
Unveiling the Multi-Annotation Process: Examining the Influence of Annotation Quantity and Instance Difficulty on Model Performance (Paper, arXiv:2310.14572) — published Oct 23, 2023.
Airavata Evaluation Suite (Collection) — benchmarks used to evaluate Airavata, a Hindi instruction-tuned model built on top of Sarvam's OpenHathi base model. 22 items, updated Oct 15, 2024.