Agent Laboratory: Using LLM Agents as Research Assistants Paper • 2501.04227 • Published 10 days ago • 77
The GAN is dead; long live the GAN! A Modern GAN Baseline Paper • 2501.05441 • Published 9 days ago • 77
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 124
Hymba: A Hybrid-head Architecture for Small Language Models Paper • 2411.13676 • Published Nov 20, 2024 • 40
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 32 items • Updated 12 days ago • 64
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 58
Stronger Models are NOT Stronger Teachers for Instruction Tuning Paper • 2411.07133 • Published Nov 11, 2024 • 35
Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss Paper • 2410.17243 • Published Oct 22, 2024 • 89
From Generalist to Specialist: Adapting Vision Language Models via Task-Specific Visual Instruction Tuning Paper • 2410.06456 • Published Oct 9, 2024 • 36
GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment Paper • 2410.08193 • Published Oct 10, 2024 • 3
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models Paper • 2409.11136 • Published Sep 17, 2024 • 22
Cottention: Linear Transformers With Cosine Attention Paper • 2409.18747 • Published Sep 27, 2024 • 16
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices Paper • 2410.00531 • Published Oct 1, 2024 • 30
On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes Paper • 2306.13649 • Published Jun 23, 2023 • 18
Can Large Language Models Unlock Novel Scientific Research Ideas? Paper • 2409.06185 • Published Sep 10, 2024 • 13
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers Paper • 2409.04109 • Published Sep 6, 2024 • 44