Ai2

Enterprise

non-profit

Verified

https://allenai.org/

allen_ai

allenai

AI & ML interests

Building breakthrough AI to solve the world's biggest problems.

Recent Activity

yilunzhao published a dataset 2 days ago

allenai/sage-retrieval

yilunzhao updated a dataset 2 days ago

allenai/sage-retrieval

gabrieltsengai2 updated a model 3 days ago

allenai/OlmoEarth-v1_2-Nano

Papers

MolmoMotion: Forecasting Point Trajectories in 3D with Language Instruction

EMO: Pretraining Mixture of Experts for Emergent Modularity

allenai's collections 53

Tmax

Data and models associated with "Tmax: A simple recipe for terminal agents". Paper: https://arxiv.org/abs/2606.23321

allenai/tmax-9b

9B • Updated 12 days ago • 9.08k • 8
allenai/tmax-27b

2.65M • Updated 12 days ago • 862 • 22
allenai/tmax-4b

4B • Updated 12 days ago • 674 • 3
allenai/tmax-2b

2B • Updated 12 days ago • 881 • 2

MolmoAct2 Eval Rollouts

Collection of the evaluation rollouts for MolmoAct2 conducted by Cortex AI

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published May 4 • 355
allenai/eval_molmoact_candy_sorting_in-distribution

Viewer • Updated May 20 • 59.6k • 61
allenai/eval_molmoact_cup_stacking_in-distribution

Viewer • Updated May 20 • 32k • 47
allenai/eval_molmoact_cup_storing_in-distribution

Viewer • Updated May 20 • 45.4k • 48

MolmoAct2-BimanualYAM Dataset

Collection of the MolmoAct2-BimanualYAM Dataset

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published May 4 • 355
allenai/MolmoAct2-BimanualYAM-Dataset

Viewer • Updated 24 days ago • 76M • 26.4k • 4
Lerobot Visualizer V3

Running • 6
Visualize LeRobot datasets with interactive charts
Dataset Stats

Running • Agents • 1
Fetch and view stats for MolmoAct2 datasets

MolmoAct2 Finetuned Models

Collection of the fine-tuned models for MolmoAct2

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published May 4 • 355
allenai/MolmoAct2-BimanualYAM

Robotics • 5B • Updated May 23 • 2.57k • 4
allenai/MolmoAct2-SO100_101

Robotics • 5B • Updated May 23 • 4.55k • 18
allenai/MolmoAct2-DROID

Robotics • 5B • Updated May 23 • 1.33k • 4

EMO

allenai/Dense_1b_130B

Text Generation • 1B • Updated May 8 • 61 • 4
allenai/Emo_1b14b_1T

Text Generation • 14B • Updated May 8 • 1.61k • 25
allenai/Emo_1b14b_130B

Text Generation • 14B • Updated May 8 • 65 • 5
allenai/StdMoE_1b4b_130B

Text Generation • 4B • Updated May 8 • 297 • 4

Branch-Adapt-Route

Artifacts for Branch-Adapt-Route

allenai/BAR-7B

Text Generation • 7B • Updated Apr 20 • 56 • 3
allenai/BAR-5x7B

Text Generation • 25B • Updated Apr 20 • 18 • 4
allenai/BAR-2x7B-Base

Text Generation • 12B • Updated Apr 20 • 21 • 1
allenai/BAR-2x7B-Math-SFT

Text Generation • 12B • Updated Apr 20 • 32 • 2

MolmoWeb-Data

This is the collection of all datasets in MolmoWebMix.

allenai/MolmoWeb-HumanSkills

Viewer • Updated Apr 13 • 116k • 3.29k • 15
allenai/MolmoWeb-SyntheticSkills

Viewer • Updated Apr 13 • 5.55k • 1.11k • 7
allenai/MolmoWeb-SyntheticQA

Viewer • Updated Apr 11 • 343k • 898 • 11
allenai/MolmoWeb-SyntheticTrajs

Viewer • Updated Apr 10 • 108k • 2.21k • 11

MolmoPoint-Data

Data used in the MolmoPoint models

allenai/MolmoPoint-TrackAny

Viewer • Updated Mar 17 • 34.9k • 103 • 2
allenai/MolmoPoint-GUISyn

Viewer • Updated Apr 3 • 37k • 1.44k • 12
allenai/MolmoPoint-TrackSyn

Viewer • Updated Mar 18 • 76.2k • 80 • 2

MolmoBot-Models

Model collection for the MolmoBot release

allenai/MolmoBot-DROID

Robotics • Updated Mar 21 • 80 • 3
allenai/MolmoBot-Img-DROID

Robotics • Updated Mar 20 • 28 • 2
allenai/MolmoBot-Pi0-DROID

Robotics • 4B • Updated Mar 25 • 3
allenai/MolmoBot-Ablation-MF3-DROID

Robotics • Updated Mar 20 • 12 • 2

Olmo Hybrid

allenai/Dolci-Think-SFT-Olmo-Hybrid

Viewer • Updated Mar 5 • 2.93M • 210 • 13
allenai/Dolci-Think-SFT-Olmo-Hybrid-Tool-Use-SA

Viewer • Updated Mar 5 • 1.6k • 112 • 14
allenai/Olmo-Hybrid-Think-SFT-7B

Text Generation • 7B • Updated Mar 5 • 372 • 19
allenai/Olmo-Hybrid-Instruct-SFT-7B

Text Generation • 7B • Updated May 28 • 1.87k • 17

Open Coding Agents

allenai/SERA-32B-GA

33B • Updated 10 days ago • 51 • 22
allenai/SERA-32B

33B • Updated 10 days ago • 296 • 113
allenai/SERA-8B-GA

8B • Updated Feb 3 • 15 • 15
allenai/SERA-8B

8B • Updated Feb 3 • 733 • 42

Molmo2

Artifacts for the Molmo2 release

allenai/Molmo2-4B

Image-Text-to-Text • 5B • Updated Jan 23 • 36.6k • 51
allenai/Molmo2-8B

Image-Text-to-Text • 9B • Updated Jan 23 • 591k • 189
allenai/Molmo2-O-7B

Image-Text-to-Text • 8B • Updated Jan 23 • 160k • 26
allenai/Molmo2-VideoPoint-4B

Video-Text-to-Text • 5B • Updated Dec 16, 2025 • 1.74k • 21

Molmo2 Data

Artifacts for the Molmo2 data release

allenai/Molmo2-Cap

Viewer • Updated Mar 17 • 108k • 479 • 14
allenai/Molmo2-CapEval

Viewer • Updated Feb 11 • 693 • 132 • 3
allenai/Molmo2-VideoCapQA

Viewer • Updated Feb 11 • 951k • 212 • 9
allenai/Molmo2-VideoSubtitleQA

Viewer • Updated Feb 11 • 469k • 148 • 3

SAGE

Smart Any-Horizon Agent for Long Video Reasoning

allenai/SAGE-MM-Qwen3-VL-8B-SFT_RL

Video-Text-to-Text • 9B • Updated Dec 17, 2025 • 438 • 5
allenai/SAGE-MM-Molmo2-8B-SFT_RL

Video-Text-to-Text • 9B • Updated Dec 17, 2025 • 18 • 5
allenai/SAGE-MM-Qwen3-VL-4B-SFT_RL

Video-Text-to-Text • 5B • Updated Dec 17, 2025 • 24 • 6
allenai/SAGE-MM-Qwen2.5-VL-7B-SFT_RL

Video-Text-to-Text • 8B • Updated Dec 17, 2025 • 10 • 2

Olmo 3 Post-training

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them.

allenai/Olmo-3-7B-Think-SFT

Text Generation • 7B • Updated Jan 5 • 8.2k • 10
allenai/Dolci-Think-SFT-7B

Viewer • Updated Jan 5 • 2.27M • 2.58k • 16
allenai/Olmo-3-7B-Think-DPO

Text Generation • 7B • Updated 10 days ago • 22.1k • 7
allenai/Dolci-Think-DPO-7B

Viewer • Updated Jan 5 • 150k • 227 • 11

MolmoAct

All models for the MolmoAct (Multimodal Open Language Model for Action) release.

MolmoAct: Action Reasoning Models that can Reason in Space

Paper • 2508.07917 • Published Aug 11, 2025 • 45
allenai/MolmoAct-7B-D-0812

Robotics • 8B • Updated Oct 24, 2025 • 796 • 53
allenai/MolmoAct-7B-O-0812

Robotics • 8B • Updated Sep 2, 2025 • 37 • 5
allenai/MolmoAct-7B-D-Pretrain-0812

Robotics • 8B • Updated Sep 2, 2025 • 1.61k • 8

IFBench

Datasets for IFBench benchmark and paper!

allenai/IF_multi_constraints_upto5

Viewer • Updated Oct 2, 2025 • 95.4k • 858 • 25
allenai/IFBench_test

Viewer • Updated Oct 17, 2025 • 300 • 6.95k • 14
allenai/IFBench_multi-turn

Viewer • Updated Jul 3, 2025 • 3.16k • 409 • 12

OLMo 2

Artifacts for the OLMo 2 release.

allenai/OLMo-2-0425-1B-Instruct

Text Generation • 1B • Updated Apr 30, 2025 • 63.9k • 57
allenai/OLMo-2-0425-1B-Instruct-GGUF

1B • Updated May 1, 2025 • 380 • 14
allenai/OLMo-2-0425-1B

Text Generation • 1B • Updated May 28, 2025 • 202k • 79
allenai/OLMo-2-0325-32B-Instruct

Text Generation • 32B • Updated Mar 14, 2025 • 6.09k • 147

DataDecide

A suite of models, data, and evals over 25 corpora, 14 sizes, and 3 seeds to measure how accurately small experiments predict rankings at large scale.

allenai/DataDecide-eval-results

Viewer • Updated Apr 16, 2025 • 1.41M • 192 • 7
allenai/DataDecide-eval-instances

Viewer • Updated Mar 10, 2025 • 1.17k • 334 • 2
allenai/DataDecide-data-recipes

Updated May 6, 2025 • 1.27k • 8
allenai/DataDecide-falcon-and-cc-qc-tulu-10p-60M

76.4M • Updated Apr 8, 2025 • 5 • 1

PixMo

A set of vision-language datasets built by Ai2 and used to train the Molmo family of models. Read more at https://molmo.allenai.org/blog

allenai/pixmo-docs

Viewer • Updated Feb 24, 2025 • 255k • 815 • 35
allenai/pixmo-cap

Viewer • Updated Nov 27, 2024 • 717k • 600 • 43
allenai/pixmo-points

Viewer • Updated Nov 27, 2024 • 2.38M • 1.69k • 46
allenai/pixmo-cap-qa

Viewer • Updated Dec 5, 2024 • 272k • 207 • 11

Tulu 3 Datasets

All datasets released with Tulu 3 -- state-of-the-art open post-training recipes.

allenai/tulu-3-sft-mixture

Viewer • Updated Dec 2, 2024 • 939k • 18.6k • 251
allenai/llama-3.1-tulu-3-8b-preference-mixture

Viewer • Updated Feb 4, 2025 • 273k • 1.82k • 26
allenai/llama-3.1-tulu-3-70b-preference-mixture

Viewer • Updated Feb 4, 2025 • 337k • 71 • 19
allenai/llama-3.1-tulu-3-405b-preference-mixture

Viewer • Updated Feb 5, 2025 • 361k • 32 • 6

OLMoE (November 2024)

Artifacts for open mixture-of-experts language models.

allenai/OLMoE-1B-7B-0924

Text Generation • 7B • Updated Oct 19, 2024 • 139k • 145
allenai/OLMoE-1B-7B-0924-SFT

7B • Updated Sep 4, 2024 • 13.5k • 19
allenai/OLMoE-1B-7B-0924-Instruct

Text Generation • 7B • Updated Sep 13, 2024 • 35.5k • 96
allenai/OLMoE-mix-0924

Preview • Updated Dec 2, 2024 • 1.94k • 55

Tulu V2.5 Suite

A suite of models trained using DPO and PPO across a wide variety (up to 14) of preference datasets. See https://arxiv.org/abs/2406.09279 for more!

allenai/tulu-v2.5-ppo-13b-uf-mean-70b-uf-rm

Text Generation • Updated Jun 14, 2024 • 104 • 6
allenai/tulu-2.5-preference-data

Viewer • Updated Jul 22, 2024 • 2.12M • 473 • 18
allenai/tulu-2.5-prompts

Viewer • Updated Jul 6, 2024 • 189k • 20 • 4
allenai/tulu-v2.5-ppo-13b-uf-mean

Text Generation • 13B • Updated Jun 14, 2024 • 21 •

Paloma

Dataset and baseline models for Paloma, a benchmark of language model fit to 546 textual domains

allenai/paloma

Viewer • Updated Jun 6, 2024 • 309k • 2.76k • 43
allenai/paloma-1b-baseline-dolma

Text Generation • Updated Dec 18, 2023 • 2
allenai/paloma-1b-baseline-pile

Text Generation • Updated Dec 19, 2023
allenai/paloma-1b-baseline-c4

Text Generation • Updated Dec 18, 2023 • 1

WildBench

AI2 WildBench Leaderboard (V2)

Running • Agents • 232
Display LLM performance leaderboards with customizable views
allenai/WildBench

Viewer • Updated Mar 4, 2025 • 2.3k • 2.51k • 39
allenai/WildBench-V2-Model-Outputs

Viewer • Updated Aug 1, 2024 • 62.5k • 474 • 2
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Paper • 2406.04770 • Published Jun 7, 2024 • 28

AI2 Safety Toolkit

Safety data, moderation tools and safe LLMs.

allenai/wildjailbreak

Viewer • Updated Aug 8, 2024 • 2.21k • 5.82k • 135
allenai/wildguard

Text Generation • 7B • Updated Jul 27, 2025 • 265k • 54
allenai/llama2-7b-WildJailbreak

Text Generation • Updated Jun 29, 2024
allenai/llama2-13b-WildJailbreak

Text Generation • Updated Jun 29, 2024 • 1

OLMo 2 Preview Post-trained Models

These models' tokenizers did not use HF's fast tokenizer, resulting in variations in how pre-tokenization was applied. Resolved in the latest versions.

allenai/OLMo-2-1124-13B-Instruct-preview

Text Generation • 14B • Updated Jan 6, 2025 • 17 • 58
allenai/OLMo-2-1124-7B-Instruct-preview

Text Generation • 7B • Updated Jan 6, 2025 • 39 • 47
allenai/OLMo-2-1124-7B-SFT-Preview

Text Generation • Updated Jan 6, 2025 • 17 • 3
allenai/OLMo-2-1124-7B-DPO-Preview

Text Generation • Updated Jan 6, 2025 • 15 • 2

MolmoMotion

Artifacts for the MolmoMotion release

allenai/MolmoMotion-4B-H3-F30

Image-Text-to-Text • 5B • Updated 17 days ago • 308 • 12
allenai/MolmoMotion-4B-H1-F32

Image-Text-to-Text • 5B • Updated 17 days ago • 156 • 5
allenai/molmo-motion-1m

Preview • Updated 17 days ago • 2.56k • 16
allenai/PointMotionBench

Updated 13 days ago • 4.52k • 7

Molmo2-ER Datasets

Collection of the embodied reasoning datasets for MolmoAct2

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published May 4 • 355
allenai/Molmo2-ER-SAT

Viewer • Updated May 5 • 172k • 91
allenai/Molmo2-ER-SIMS-VSI

Updated May 5 • 68
allenai/Molmo2-ER-VSI-590K

Updated May 5 • 90

MolmoAct2 Datasets

Collection of robotics datasets for MolmoAct2

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published May 4 • 355
allenai/MolmoAct2-BimanualYAM-Dataset

Viewer • Updated 24 days ago • 76M • 26.4k • 4
allenai/MolmoAct2-SO100_101-Dataset

Viewer • Updated May 8 • 8.42k • 414 • 7
allenai/MolmoAct2-DROID-Dataset

Viewer • Updated May 5 • 17.8M • 5.94k • 4

MolmoAct2 Models

Collection of the base models for MolmoAct2

MolmoAct2: Action Reasoning Models for Real-world Deployment

Paper • 2605.02881 • Published May 4 • 355
allenai/MolmoAct2

Robotics • 5B • Updated May 23 • 3.51k • 19
allenai/MolmoAct2-Think

Robotics • 5B • Updated May 23 • 606 • 3
allenai/MolmoAct2-Pretrain

Robotics • 5B • Updated May 23 • 752 • 5

OlmPool

Collection of models from the paper "Cracks in the Foundation: Seemingly Minor Architectural Choices Impact Long Context Extension".

allenai/D_post_LQK_8kv_8k_13k_SWA

Updated 20 days ago
allenai/B_post_LQK_32kv_4k_11k_SWA

Updated 20 days ago
allenai/H_post_LQK_32kv_4k_11k

Updated 20 days ago
allenai/H_post_LQK_32kv_8k_11k

Updated 20 days ago

WildDet3D

This is the collection of WildDet3D artifacts, including demos, model checkpoints and data. https://github.com/allenai/WildDet3D

AI2 - WildDet3D

Running on Zero • Agents • 24
Scaling Promptable 3D Detection in the Wild
allenai/WildDet3D-iPhone

Updated Apr 7 • 12
WildDet3D: Scaling Promptable 3D Detection in the Wild

Paper • 2604.08626 • Published Apr 9 • 248
allenai/WildDet3D

Object Detection • Updated Jun 1 • 51 • 41

MolmoWeb

This is the collection of MolmoWeb artifacts, including model checkpoints and data.

MolmoWeb-Data

Collection

This is the collection of all datasets in MolmoWebMix. • 6 items • Updated Mar 24 • 30
allenai/MolmoWeb-4B

Image-Text-to-Text • 5B • Updated Apr 10 • 6.72k • 35
allenai/MolmoWeb-8B

Image-Text-to-Text • 9B • Updated Apr 10 • 1.66k • 69
allenai/MolmoWeb-4B-Native

Image-Text-to-Text • Updated Apr 10 • 10 • 9

MolmoPoint

MolmoPoint models

allenai/MolmoPoint-8B

Image-Text-to-Text • 9B • Updated Mar 18 • 4.44k • 26
allenai/MolmoPoint-GUI-8B

Image-Text-to-Text • 9B • Updated Mar 24 • 396 • 18
allenai/MolmoPoint-Vid-4B

Video-Text-to-Text • 5B • Updated Mar 30 • 886 • 12

MolmoBot-Data

Training data and assets for the MolmoBot release

allenai/molmobot-data

Viewer • Updated 19 days ago • 324k • 27.3k • 7
allenai/molmospaces

Viewer • Updated May 28 • 1M • 15k • 48

Open Coding Agents Specialization

Ai2 Open Coding Agents - Django, Sphinx, Sympy Data

allenai/Sera-4.5A-Django-T1

Viewer • Updated Feb 11 • 16.2k • 84 • 4
allenai/Sera-4.5A-Django-T2

Viewer • Updated Feb 11 • 14.6k • 92 • 2
allenai/Sera-4.5A-Sympy-T1

Viewer • Updated Feb 11 • 18.2k • 35 • 2
allenai/Sera-4.5A-Sympy-T2

Viewer • Updated Feb 11 • 25.4k • 29 • 2

Olmo 3.1

The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets...

allenai/Olmo-3.1-32B-Think

Text Generation • 32B • Updated Jan 5 • 6.7k • 107
allenai/Olmo-3.1-32B-Instruct-SFT

32B • Updated Jan 5 • 11k • 8
allenai/Olmo-3.1-32B-Instruct-DPO

Text Generation • 32B • Updated Jan 5 • 924 • 6
allenai/Olmo-3.1-32B-Instruct

Text Generation • 32B • Updated Jan 5 • 15.4k • 79

Bolmo

Artifacts for the Bolmo release: https://allenai.org/papers/bolmo.

allenai/Bolmo-7B

Text Generation • 8B • Updated 18 days ago • 130 • 58
allenai/Bolmo-1B

Text Generation • 1B • Updated 18 days ago • 2.79k • 50
allenai/bolmo_mix

Updated Dec 22, 2025 • 384 • 9
Bolmo: Byteifying the Next Generation of Language Models

Paper • 2512.15586 • Published Dec 17, 2025 • 19

Olmo 3

Artifacts for the Olmo 3 release.

allenai/Olmo-3-1125-32B

Text Generation • 32B • Updated Dec 3, 2025 • 41.4k • 121
allenai/Olmo-3-32B-Think

Text Generation • 32B • Updated 10 days ago • 8.07k • 171
allenai/Olmo-3-1025-7B

Text Generation • 7B • Updated Apr 21 • 98.7k • 72
allenai/Olmo-3-7B-Think

Text Generation • 7B • Updated 10 days ago • 41.6k • 98

Olmo 3 Pre-training

All artifacts related to Olmo 3 pre-training

allenai/dolma3_pool

Preview • Updated Feb 24 • 25.3k • 36
allenai/dolma3_dolmino_pool

Updated Jan 5 • 19.6k • 8
allenai/dolma3_longmino_pool

Updated Jan 5 • 9.88k • 14
allenai/dolma3_dolmino_mix-100B-1025

Viewer • Updated Jan 5 • 14.1M • 22.8k • 10

OlmoEarth

OlmoEarth pre-trained and fine-tuned foundation models for remote sensing

allenai/OlmoEarth-v1-Base

Updated Nov 4, 2025 • 16.7k • 40
allenai/OlmoEarth-v1-Nano

Updated Nov 4, 2025 • 5.08k • 16
allenai/OlmoEarth-v1-Tiny

Updated Nov 4, 2025 • 909 • 11
allenai/OlmoEarth-v1-Large

Updated Nov 4, 2025 • 10.6k • 19

MolmoAct Data Mixture

All datasets for the MolmoAct (Multimodal Open Language Model for Action) release.

allenai/MolmoAct-Dataset

Viewer • Updated Sep 3, 2025 • 1.11M • 2.73k • 30
allenai/MolmoAct-Pretraining-Mixture

Viewer • Updated Sep 10, 2025 • 24.2M • 1.83k • 12
allenai/MolmoAct-Midtraining-Mixture

Viewer • Updated Aug 18, 2025 • 5.93M • 40.9k • 5
allenai/libero

Viewer • Updated Aug 27, 2025 • 521k • 321 • 5

Reward Bench 2

Datasets, spaces, and models for Reward Bench 2 benchmark and paper!

allenai/reward-bench-2

Viewer • Updated Jun 4, 2025 • 1.87k • 4.62k • 36
Reward Bench Leaderboard

Running • Agents • 431
Explore and compare model scores on RewardBench benchmarks
allenai/reward-bench-2-results

Preview • Updated Dec 11, 2025 • 262 • 3
allenai/Llama-3.1-70B-Instruct-RM-RB2

Text Classification • Updated Jun 4, 2025 • 35 • 1

olmOCR

olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org

allenai/olmOCR-2-7B-1025-FP8

Image-Text-to-Text • 8B • Updated Feb 19 • 540k • 246
allenai/olmOCR-2-7B-1025

Image-Text-to-Text • 8B • Updated Oct 22, 2025 • 34.2k • 153
allenai/olmOCR-mix-1025

Viewer • Updated Oct 21, 2025 • 270k • 1.74k • 34
allenai/olmOCR-synthmix-1025

Preview • Updated Oct 17, 2025 • 5.88k • 3

OLMoE (January 2025)

Improved OLMoE for iOS app. Read more: https://allenai.org/blog/olmoe-app

allenai/OLMoE-1B-7B-0125

Text Generation • 7B • Updated Mar 16, 2025 • 5.79k • 37
allenai/OLMoE-1B-7B-0125-Instruct

Text Generation • 7B • Updated Feb 4, 2025 • 124k • 66
allenai/OLMoE-1B-7B-0125-Instruct-GGUF

7B • Updated Feb 13, 2025 • 414 • 22
allenai/OLMoE-mix-0924

Preview • Updated Dec 2, 2024 • 1.94k • 55

Tulu 3 Models

All models released with Tulu 3 -- state-of-the-art open post-training recipes.

allenai/Llama-3.1-Tulu-3.1-8B

Text Generation • 8B • Updated Feb 10, 2025 • 619 • 39
allenai/Llama-3.1-Tulu-3-8B

Text Generation • 8B • Updated Feb 13, 2025 • 3.67k • 179
allenai/Llama-3.1-Tulu-3-70B

Text Generation • 71B • Updated Feb 10, 2025 • 417 • 61
allenai/Llama-3.1-Tulu-3-405B

Text Generation • Updated Feb 10, 2025 • 671 • 112

Molmo

Artifacts for open multimodal language models.

allenai/Molmo-72B-0924

Image-Text-to-Text • 73B • Updated Oct 9, 2025 • 3.59k • 300
allenai/Molmo-7B-D-0924

Image-Text-to-Text • 8B • Updated Dec 15, 2025 • 28.1k • 567
allenai/Molmo-7B-O-0924

Image-Text-to-Text • 8B • Updated Oct 9, 2025 • 1.86k • 164
allenai/MolmoE-1B-0924

Image-Text-to-Text • Updated Apr 24, 2025 • 1.23k • 158

OLMo Suite

Artifacts for the first set of OLMo models.

allenai/OLMo-1B-0724-hf

Text Generation • 1B • Updated Aug 5, 2024 • 5.12k • 24
allenai/OLMo-7B-0724-hf

Text Generation • 7B • Updated Jul 16, 2024 • 966 • 17
allenai/OLMo-7B-0724-SFT-hf

Text Generation • 7B • Updated Jul 14, 2024 • 53 • 4
allenai/OLMo-7B-0724-Instruct-hf

Text Generation • 7B • Updated Sep 24, 2024 • 306 • 7

Reward Bench

Datasets, spaces, and models for the reward model benchmark!

Reward Bench Leaderboard

Running • Agents • 431
Explore and compare model scores on RewardBench benchmarks
allenai/reward-bench

Viewer • Updated Sep 9, 2024 • 8.11k • 5.47k • 109
allenai/preference-test-sets

Viewer • Updated Mar 14, 2024 • 43.2k • 1.06k • 28
allenai/reward-bench-results

Updated May 7, 2025 • 67k • 3

Tulu V2 Suite

The set of models associated with the paper "Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2"

allenai/tulu-v2-sft-mixture

Viewer • Updated May 24, 2024 • 326k • 1.04k • 138
allenai/tulu-2-dpo-70b

Text Generation • 69B • Updated Jan 31, 2024 • 355 • 158
allenai/tulu-2-dpo-13b

Text Generation • 13B • Updated May 17, 2024 • 395 • 21
allenai/tulu-2-dpo-7b

Text Generation • Updated May 14, 2024 • 445 • 21

SciRIFF

Data and models to enhance instruction-following for scientific literature understanding.

allenai/SciRIFF

Viewer • Updated Jun 13, 2024 • 433k • 517 • 48
allenai/SciRIFF-train-mix

Viewer • Updated Jun 13, 2024 • 70.7k • 49 • 10
allenai/scitulu-7b

Text Generation • Updated Jun 13, 2024 • 14 • 3
allenai/scitulu-70b

Text Generation • Updated Jun 13, 2024 • 111 • 6

Zebra Logic Bench

ZebraLogic Bench: Testing the Limits of LLMs in Logical Reasoning

Zebra Logic Bench

Running • Agents • 94
Display model leaderboard and explore sample puzzles
allenai/ZebraLogicBench

Viewer • Updated Jul 11, 2024 • 4.26k • 802 • 25
allenai/ZebraLogicBench-private

Viewer • Updated Jul 4, 2024 • 4.26k • 395 • 13
Faith and Fate: Limits of Transformers on Compositionality

Paper • 2305.18654 • Published May 29, 2023 • 9