Roee Aharoni
roeeaharoni
12 followers
·
2 following
http://www.roeeaharoni.com
AI & ML interests
Natural Language Processing
Recent Activity
upvoted a paper 2 days ago
Inside-Out: Hidden Factual Knowledge in LLMs
liked a dataset 8 months ago
google/granola-entity-questions
reacted to gsarti's post about 1 year ago
Today's pick in Interpretability & Analysis of LMs: A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains by @alonjacovi @yonatanbitton B. Bohnet J. Herzig @orhonovic M. Tseng M. Collins @roeeaharoni @mega

This work introduces a new methodology for human verification of reasoning chains and adopts it to annotate a dataset of chain-of-thought reasoning chains produced by 3 LMs. The annotated dataset, REVEAL, can be used to benchmark automatic verifiers of reasoning in LMs. In their analysis, the authors find that LM-produced CoTs generally contain faulty steps, often leading to incorrect automatic verification. In particular, CoT-generating LMs are found to produce non-attributable reasoning steps often, and reasoning verifiers generally struggle to verify logical correctness.

Paper: https://huggingface.co/papers/2402.00559
Dataset: https://huggingface.co/datasets/google/reveal
roeeaharoni's activity
upvoted a paper 2 days ago
Inside-Out: Hidden Factual Knowledge in LLMs
Paper • 2503.15299 • Published 6 days ago • 36