Article Training and Finetuning Reranker Models with Sentence Transformers v4 • 2 days ago • 39
📚 LLM pretraining datasets Collection A collection of datasets for LLM pretraining • 9 items • Updated 20 days ago • 6
Dar Datasets Collection datasets uploaded by https://github.com/ARBML/dar • 200 items • Updated Aug 22, 2024 • 11
KITAB-Bench Collection A Comprehensive Multi-Domain Benchmark for Arabic OCR and Document Understanding • 24 items • Updated Feb 24 • 11
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 141
Reranker & Retrieval Arabic Datasets & Models Collection This collection contains different Arabic datasets and models for retrieval and reranking tasks. • 8 items • Updated Dec 7, 2024 • 3
AIMv2 Collection A collection of AIMv2 vision encoders that supports a number of resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 74
MobileNetV4 pretrained weights Collection Weights for MobileNet-V4 pretrained in timm • 17 items • Updated Sep 22, 2024 • 18
Arabic NLI & Semantic Similarity Datasets Collection The Arabic Version of SNLI and MultiNLI datasets, originally used for Natural Language Inference (NLI), may be used for finetuning embedding models. • 6 items • Updated Jun 18, 2024 • 4