Article: How to Build an MCP Server with Gradio • By abidlabs and 1 other • 20 days ago • 107
Article: Train 400x faster Static Embedding Models with Sentence Transformers • By tomaarsen • Jan 15 • 178
Article: Training and Finetuning Embedding Models with Sentence Transformers v3 • By tomaarsen • May 28, 2024 • 219
Article: Hugging Face to sell open-source robots thanks to Pollen Robotics acquisition 🤖 • By thomwolf and 2 others • Apr 14 • 46
Article: The NLP Course is becoming the LLM Course! • By burtenshaw and 9 others • Apr 3 • 93
Article: Fine-tune ModernBERT for text classification using synthetic data • By davidberenstein1957 • Dec 30, 2024 • 37
Article: Open R1: How to use OlympicCoder locally for coding? • By burtenshaw and 4 others • Mar 20 • 60
Article: Training and Finetuning Reranker Models with Sentence Transformers v4 • By tomaarsen • Mar 26 • 130
Article: Agentic RAG Stack (3/5) - Generate responses using a SmolLM • By davidberenstein1957 • Feb 6 • 7
Article: SmolVLM Grows Smaller – Introducing the 250M & 500M Models! • By andito and 2 others • Jan 23 • 178
Article: Fine-tune ModernBERT for RAG with Synthetic Data • By sdiazlor and 2 others • Jan 20 • 39
Article: Agentic RAG Stack (1/5) - Index and retrieve documents for vector search using Sentence Transformers and DuckDB • By davidberenstein1957 • Jan 27 • 21
Collection: Bad Data Toolbox • PleIAs collection of models for the data processing of challenging document and data sources. • 5 items • Updated Jul 18, 2024 • 15
Collection: Common Corpus • Largest multilingual pretraining data. • 1 item • Updated Nov 13, 2024 • 11