MMTEB Collection Our contribution to the Massive Multilingual Text Embedding Benchmark (MMTEB). Retrieval and reranking benchmarks in 16 languages. • 4 items • Updated Jun 6, 2024 • 2
MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published 3 days ago • 20
CommonCrawl Collection Large web-mined general corpus based on CommonCrawl. • 7 items • Updated Dec 8, 2024 • 2
NoLiMa: Long-Context Evaluation Beyond Literal Matching Paper • 2502.05167 • Published 14 days ago • 15
mistralai/Mistral-Small-24B-Instruct-2501 Text Generation • Updated 20 days ago • 729k • 804
evborjnvioerjnvuowsetngboetgjbeigjaweuofjf/bluesky-298-million-Posts Viewer • Updated Jan 7 • 201M • 42 • 48