Bill Yuchen Lin's picture

Bill Yuchen Lin

yuchenlin

·

https://yuchenlin.xyz

AI & ML interests

Research @allenai LLMs and Multimodality, Agents

Organizations

authored a paper 10 months ago

CrossWordBench: Evaluating the Reasoning Capabilities of LLMs and LVLMs with Controllable Puzzle Generation

Paper • 2504.00043 • Published Mar 30, 2025 • 10

authored a paper 12 months ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published Feb 17, 2025 • 39

authored a paper about 1 year ago

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

Paper • 2502.01100 • Published Feb 3, 2025 • 21

authored 6 papers over 1 year ago

On Memorization of Large Language Models in Logical Reasoning

Paper • 2410.23123 • Published Oct 30, 2024 • 18

WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs

Paper • 2406.18495 • Published Jun 26, 2024 • 13

MantisScore: Building Automatic Metrics to Simulate Fine-grained Human Feedback for Video Generation

Paper • 2406.15252 • Published Jun 21, 2024 • 18

WildVision: Evaluating Vision-Language Models in the Wild with Human Preferences

Paper • 2406.11069 • Published Jun 16, 2024 • 14

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Paper • 2406.08464 • Published Jun 12, 2024 • 71

WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild

Paper • 2406.04770 • Published Jun 7, 2024 • 28

authored 6 papers almost 2 years ago

Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models

Paper • 2405.01535 • Published May 2, 2024 • 124

RewardBench: Evaluating Reward Models for Language Modeling

Paper • 2403.13787 • Published Mar 20, 2024 • 22

Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents

Paper • 2403.02502 • Published Mar 4, 2024 • 3

SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding

Paper • 2402.08983 • Published Feb 14, 2024 • 5

OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement

Paper • 2402.14658 • Published Feb 22, 2024 • 83

L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects

Paper • 2402.09052 • Published Feb 14, 2024 • 18

authored a paper about 2 years ago

The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning

Paper • 2312.01552 • Published Dec 4, 2023 • 32

authored 4 papers over 2 years ago

Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs

Paper • 2311.05657 • Published Nov 9, 2023 • 30

LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition

Paper • 2307.13269 • Published Jul 25, 2023 • 33

LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion

Paper • 2306.02561 • Published Jun 5, 2023 • 6

Inference-Time Policy Adapters (IPA): Tailoring Extreme-Scale LMs without Fine-tuning

Paper • 2305.15065 • Published May 24, 2023 • 1