scillm (scillm)

soldni

authored a paper 3 months ago

OLMoTrace: Tracing Language Model Outputs Back to Trillions of Training Tokens

Paper • 2504.07096 • Published Apr 9 • 74

armanc

authored 4 papers 3 months ago

Z1: Efficient Test-time Scaling with Code

Paper • 2504.00810 • Published Apr 1 • 26

PHYSICS: Benchmarking Foundation Models on University-Level Physics Problem Solving

Paper • 2503.21821 • Published Mar 26 • 17

MCTS-RAG: Enhancing Retrieval-Augmented Generation with Monte Carlo Tree Search

Paper • 2503.20757 • Published Mar 26 • 10

Survey on Evaluation of LLM-based Agents

Paper • 2503.16416 • Published Mar 20 • 91

shannons

authored a paper 4 months ago

SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models

Paper • 2502.09604 • Published Feb 13 • 36

armanc

authored a paper 4 months ago

TESS 2: A Large-Scale Generalist Diffusion Language Model

Paper • 2502.13917 • Published Feb 19 • 6

armanc

authored a paper 5 months ago

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Paper • 2501.12380 • Published Jan 21 • 86

dwadden

authored a paper 5 months ago

HALoGEN: Fantastic LLM Hallucinations and Where to Find Them

Paper • 2501.08292 • Published Jan 14 • 17

armanc

authored a paper 6 months ago

ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning

Paper • 2501.06590 • Published Jan 11 • 11

soldni

authored 2 papers 6 months ago

RouterRetriever: Exploring the Benefits of Routing over Multiple Expert Embedding Models

Paper • 2409.02685 • Published Sep 4, 2024 • 1

Establishing Task Scaling Laws via Compute-Efficient Model Ladders

Paper • 2412.04403 • Published Dec 5, 2024 • 3

soldni

authored a paper 7 months ago

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published Nov 22, 2024 • 65

dwadden

authored a paper 7 months ago

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Paper • 2411.14199 • Published Nov 21, 2024 • 32

soldni

authored a paper 7 months ago

OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs

Paper • 2411.14199 • Published Nov 21, 2024 • 32

kejian

authored a paper 9 months ago

ReIFE: Re-evaluating Instruction-Following Evaluation

Paper • 2410.07069 • Published Oct 9, 2024

soldni

authored a paper 9 months ago

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Paper • 2409.17146 • Published Sep 25, 2024 • 120

dwadden

authored a paper 10 months ago

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3, 2024 • 79

soldni

authored 2 papers 10 months ago

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3, 2024 • 79

SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

Paper • 2406.07835 • Published Jun 10, 2024 • 1

scillm_su23

AI & ML interests



Team members 10

scillm's activity