Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2502.03387

Data and other things

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Paper • 2412.14475 • Published Dec 19, 2024 • 53
How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 50
Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 46
WavePulse: Real-time Content Analytics of Radio Livestreams

Paper • 2412.17998 • Published Dec 23, 2024 • 10

Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 78
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters

Paper • 2408.03314 • Published Aug 6, 2024 • 55
ICAL: Continual Learning of Multimodal Agents by Transforming Trajectories into Actionable Insights

Paper • 2406.14596 • Published Jun 20, 2024 • 5
A Comprehensive Survey of LLM Alignment Techniques: RLHF, RLAIF, PPO, DPO and More

Paper • 2407.16216 • Published Jul 23, 2024

On Memorization of Large Language Models in Logical Reasoning

Paper • 2410.23123 • Published Oct 30, 2024 • 18
LLMs Do Not Think Step-by-step In Implicit Reasoning

Paper • 2411.15862 • Published Nov 24, 2024 • 10
Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published Dec 9, 2024 • 78
Deliberation in Latent Space via Differentiable Cache Augmentation

Paper • 2412.17747 • Published Dec 23, 2024 • 30

about 18 hours ago

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 58
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 52
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 42
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 56

about 13 hours ago

Rho-1: Not All Tokens Are What You Need

Paper • 2404.07965 • Published Apr 11, 2024 • 90
VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Paper • 2404.10667 • Published Apr 16, 2024 • 18
Instruction-tuned Language Models are Better Knowledge Learners

Paper • 2402.12847 • Published Feb 20, 2024 • 26
DoRA: Weight-Decomposed Low-Rank Adaptation

Paper • 2402.09353 • Published Feb 14, 2024 • 26

Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset

Paper • 2403.09029 • Published Mar 14, 2024 • 55
LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Paper • 2403.12968 • Published Mar 19, 2024 • 25
RAFT: Adapting Language Model to Domain Specific RAG

Paper • 2403.10131 • Published Mar 15, 2024 • 69
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking

Paper • 2403.09629 • Published Mar 14, 2024 • 76

Beyond Language Models: Byte Models are Digital World Simulators

Paper • 2402.19155 • Published Feb 29, 2024 • 50
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Paper • 2402.19427 • Published Feb 29, 2024 • 53
VisionLLaMA: A Unified LLaMA Interface for Vision Tasks

Paper • 2403.00522 • Published Mar 1, 2024 • 45
Resonance RoPE: Improving Context Length Generalization of Large Language Models

Paper • 2403.00071 • Published Feb 29, 2024 • 23

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs