Sean McLeish

smcleish

https://mcleish7.github.io/

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning

published a model 22 days ago

tomg-group-umd/Gemstone-384x36_cooldown

published a model 22 days ago

tomg-group-umd/Gemstone-1024x28_cooldown

Organizations

upvoted a paper 3 days ago

Zebra-CoT: A Dataset for Interleaved Vision Language Reasoning

Paper • 2507.16746 • Published 4 days ago • 28

upvoted 7 papers about 2 months ago

ARGUS: Hallucination and Omission Evaluation in Video-LLMs

Paper • 2506.07371 • Published Jun 9 • 8

MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning

Paper • 2506.05523 • Published Jun 5 • 34

How Much Backtracking is Enough? Exploring the Interplay of SFT and RL in Enhancing LLM Reasoning

Paper • 2505.24273 • Published May 30 • 4

Pitfalls in Evaluating Language Model Forecasters

Paper • 2506.00723 • Published May 31 • 3

upvoted a paper 2 months ago

Seek in the Dark: Reasoning via Test-Time Instance-Level Policy Gradient in Latent Space

Paper • 2505.13308 • Published May 19 • 26

upvoted 4 papers 3 months ago

ReplaceMe: Network Simplification via Layer Pruning and Linear Transformations

Paper • 2505.02819 • Published May 5 • 25

The Leaderboard Illusion

Paper • 2504.20879 • Published Apr 29 • 70

SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning

Paper • 2504.19162 • Published Apr 27 • 18

Antidistillation Sampling

Paper • 2504.13146 • Published Apr 17 • 61

upvoted an article 4 months ago

Mixture of Depth is Vibe

Article • Apr 22, 2024 • 48

upvoted 4 papers 5 months ago

Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning

Paper • 2503.07572 • Published Mar 10 • 47

Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published Mar 3 • 39

Has My System Prompt Been Used? Large Language Model Prompt Membership Inference

Paper • 2502.09974 • Published Feb 14 • 9

Ignore the KL Penalty! Boosting Exploration on Critical Tokens to Enhance RL Fine-Tuning

Paper • 2502.06533 • Published Feb 10 • 18

upvoted a collection 5 months ago

Recurrent Models

Collection

These are checkpoints for recurrent LLMs developed to scale test-time compute by recurring in latent space. • 15 items • Updated May 21 • 9

upvoted a paper 5 months ago

Gemstones: A Model Suite for Multi-Faceted Scaling Laws

Paper • 2502.06857 • Published Feb 7 • 25
