Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published 23 days ago • 88
AutoTrain: No-code training for state-of-the-art models Paper • 2410.15735 • Published 24 days ago • 56
HelpSteer2-Preference: Complementing Ratings with Preferences Paper • 2410.01257 • Published Oct 2 • 19
Llama-3.1-Nemotron-70B Collection SOTA models on Arena Hard and RewardBench as of 1 Oct 2024. • 6 items • Updated 30 days ago • 135
To Code, or Not To Code? Exploring Impact of Code in Pre-training Paper • 2408.10914 • Published Aug 20 • 40
SWE-bench-java: A GitHub Issue Resolving Benchmark for Java Paper • 2408.14354 • Published Aug 26 • 40
The Mamba in the Llama: Distilling and Accelerating Hybrid Models Paper • 2408.15237 • Published Aug 27 • 36
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22 • 117
SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models Paper • 2407.15841 • Published Jul 22 • 39
Benchmarking Trustworthiness of Multimodal Large Language Models: A Comprehensive Study Paper • 2406.07057 • Published Jun 11 • 15
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine Paper • 2408.02900 • Published Aug 6 • 25
Controllable Text Generation for Large Language Models: A Survey Paper • 2408.12599 • Published Aug 22 • 62
xGen-MM (BLIP-3): A Family of Open Large Multimodal Models Paper • 2408.08872 • Published Aug 16 • 97
OpenResearcher: Unleashing AI for Accelerated Scientific Research Paper • 2408.06941 • Published Aug 13 • 30