Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2409.16191

HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 42

INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models

Paper • 2306.04757 • Published Jun 7, 2023 • 6
Evaluating Instruction-Tuned Large Language Models on Code Comprehension and Generation

Paper • 2308.01240 • Published Aug 2, 2023 • 2
Can Large Language Models Understand Real-World Complex Instructions?

Paper • 2309.09150 • Published Sep 17, 2023 • 2
Evaluating the Instruction-Following Robustness of Large Language Models to Prompt Injection

Paper • 2308.10819 • Published Aug 17, 2023

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 58
TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 52
To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 42
Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 57

Data-Training and Eval

InfinityMATH: A Scalable Instruction Tuning Dataset in Programmatic Mathematical Reasoning

Paper • 2408.07089 • Published Aug 9, 2024 • 14
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 42
Training Language Models to Self-Correct via Reinforcement Learning

Paper • 2409.12917 • Published Sep 19, 2024 • 137
Self-Boosting Large Language Models with Synthetic Preference Data

Paper • 2410.06961 • Published Oct 9, 2024 • 16

Self-Taught Evaluators

Paper • 2408.02666 • Published Aug 5, 2024 • 29
Michelangelo: Long Context Evaluations Beyond Haystacks via Latent Structure Queries

Paper • 2409.12640 • Published Sep 19, 2024 • 2
openai/MMMLU

Viewer • Updated Oct 16, 2024 • 393k • 19.9k • 471
HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models

Paper • 2409.16191 • Published Sep 24, 2024 • 42

PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 54
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 77
Table-GPT: Table-tuned GPT for Diverse Table Tasks

Paper • 2310.09263 • Published Oct 13, 2023 • 40
Context-Aware Meta-Learning

Paper • 2310.10971 • Published Oct 17, 2023 • 17

MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention

Paper • 2407.02490 • Published Jul 2, 2024 • 25
Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations

Paper • 2406.13632 • Published Jun 19, 2024 • 5
LongRAG: Enhancing Retrieval-Augmented Generation with Long-context LLMs

Paper • 2406.15319 • Published Jun 21, 2024 • 64
Make Your LLM Fully Utilize the Context

Paper • 2404.16811 • Published Apr 25, 2024 • 54

LLoCO: Learning Long Contexts Offline

Paper • 2404.07979 • Published Apr 11, 2024 • 22
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens

Paper • 2402.13753 • Published Feb 21, 2024 • 115
LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration

Paper • 2402.11550 • Published Feb 18, 2024 • 18
LongAlign: A Recipe for Long Context Alignment of Large Language Models

Paper • 2401.18058 • Published Jan 31, 2024 • 21

Papers I want to read

Papers in my to-read list

RLHF Workflow: From Reward Modeling to Online RLHF

Paper • 2405.07863 • Published May 13, 2024 • 68
Chameleon: Mixed-Modal Early-Fusion Foundation Models

Paper • 2405.09818 • Published May 16, 2024 • 131
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models

Paper • 2405.15574 • Published May 24, 2024 • 55
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 88

LM Capabilities and Scaling

Compression Represents Intelligence Linearly

Paper • 2404.09937 • Published Apr 15, 2024 • 27
MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies

Paper • 2404.06395 • Published Apr 9, 2024 • 22
Long-context LLMs Struggle with Long In-context Learning

Paper • 2404.02060 • Published Apr 2, 2024 • 37
Are large language models superhuman chemists?

Paper • 2404.01475 • Published Apr 1, 2024 • 19

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs