Article SauerkrautLM's Multi-Phase Spectrum Training: A Technical Deep Dive By DavidGF • 3 days ago • 7
Article How to build a custom text classifier without days of human labeling By sdiazlor • 25 days ago • 54
Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment Paper • 2408.06266 • Published Aug 12 • 9
Article Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval Mar 22 • 61
Article Training and Finetuning Embedding Models with Sentence Transformers v3 May 28 • 156
Article BM25 for Python: Achieving high performance while simplifying dependencies with *BM25S*⚡ By xhluca • Jul 9 • 37
Instruction Pre-Training: Language Models are Supervised Multitask Learners Paper • 2406.14491 • Published Jun 20 • 85
Article Introducing the Ultimate SEC LLM: Revolutionizing Financial Insights with Llama-3-70B By Crystalcareai • Jun 19 • 7
Article Building a Vision Mixture-of-Expert Model from several fine-tuned Phi-3-Vision Models By mjbuehler • Jun 12 • 6
Unmixtraled experts Collection This collection contains all 8 experts of Mixtral 8x22B converted to single dense 22B models. The models are intended as a basis for merges or fine-tunes • 9 items • Updated Apr 11 • 1
💥 Laser vs DoRA vs Daser vs LoRA Collection Comparison of different PEFT techniques of NeuralMonarch. • 4 items • Updated Mar 22 • 6
Model Merging Collection Model merging is a very popular technique for LLMs nowadays. Here is a chronological list of papers in the space that will help you get started with it! • 30 items • Updated Jun 12 • 217
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5 • 69
🐶 Beagle Collection Merges done using mergekit and LazyMergekit: https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb#scrollTo=d5mYzDo1q96y • 8 items • Updated Aug 16 • 6
DRAGON Models Collection Production-grade RAG-optimized 6-7B parameter models - "Delivering RAG on ..." the leading foundation base models • 23 items • Updated 14 days ago • 44