Article Train 400x faster Static Embedding Models with Sentence Transformers 3 days ago • 102
Article Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging By akjindal53244 • Aug 19, 2024 • 76
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Paper • 2501.09751 • Published 1 day ago • 30
FAST: Efficient Action Tokenization for Vision-Language-Action Models Paper • 2501.09747 • Published 1 day ago • 13
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them Paper • 2501.08292 • Published 4 days ago • 16
Diffusion Adversarial Post-Training for One-Step Video Generation Paper • 2501.08316 • Published 4 days ago • 29
MiniMax-01: Scaling Foundation Models with Lightning Attention Paper • 2501.08313 • Published 4 days ago • 258
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot Paper • 2501.09012 • Published 3 days ago • 9
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 4 days ago • 40
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents Paper • 2501.08828 • Published 3 days ago • 25
Graph Mamba: Towards Learning on Graphs with State Space Models Paper • 2402.08678 • Published Feb 13, 2024 • 15
Granite Time Series Models Collection A collection of time series models trained by IBM, licensed under Apache 2.0. • 5 items • Updated about 1 month ago • 25
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1, 2024 • 57
HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1.5 Sentence Similarity • Updated 15 days ago • 28.6k • 47
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models Paper • 2501.03262 • Published 14 days ago • 82
KaLM-Embedding: Superior Training Data Brings A Stronger Embedding Model Paper • 2501.01028 • Published 16 days ago • 11
The Lessons of Developing Process Reward Models in Mathematical Reasoning Paper • 2501.07301 • Published 5 days ago • 73