MMTEB: Massive Multilingual Text Embedding Benchmark Paper • 2502.13595 • Published 2 days ago • 15
The Ultimate Collection of Code Classifiers Collection 15 classifiers, 124M parameters, one per programming language, for assessing the educational value of GitHub code • 15 items • Updated 1 day ago • 9
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita • 4 days ago • 85
Article From Chunks to Blocks: Accelerating Uploads and Downloads on the Hub • 10 days ago • 48
Article From Llasa to Llasagna: Finetuning LLaSA to generates Italian speech and other languages By Steveeeeeeen and 1 other • 10 days ago • 22
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 17 days ago • 187
GTE models Collection General Text Embedding Models Released by Tongyi Lab of Alibaba Group • 21 items • Updated Jan 21 • 23
Article Agentic RAG Stack (1/5) - Index and retrieve documents for vector search using Sentence Transformers and DuckDB By davidberenstein1957 • 25 days ago • 18
Article KV Caching Explained: Optimizing Transformer Inference Efficiency By not-lain • 23 days ago • 30
Article Build a Qwen 2.5 VL API endpoint with Hugging Face spaces and Docker! By ariG23498 • 23 days ago • 15