Roee Aharoni
roeeaharoni
12 followers · 2 following
http://www.roeeaharoni.com
AI & ML interests
Natural Language Processing
Recent Activity
upvoted a paper 2 days ago
Inside-Out: Hidden Factual Knowledge in LLMs
liked a dataset 8 months ago
google/granola-entity-questions
reacted to gsarti's post about 1 year ago
Today's pick in Interpretability & Analysis of LMs: A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains by @alonjacovi, @yonatanbitton, B. Bohnet, J. Herzig, @orhonovic, M. Tseng, M. Collins, @roeeaharoni, @mega

This work introduces a new methodology for human verification of reasoning chains and adopts it to annotate a dataset of chain-of-thought reasoning chains produced by 3 LMs. The annotated dataset, REVEAL, can be used to benchmark automatic verifiers of reasoning in LMs. In their analysis, the authors find that LM-produced CoTs generally contain faulty steps, often leading to incorrect automatic verification. In particular, CoT-generating LMs are found to produce non-attributable reasoning steps often, and reasoning verifiers generally struggle to verify logical correctness.

Paper: https://huggingface.co/papers/2402.00559
Dataset: https://huggingface.co/datasets/google/reveal
roeeaharoni's activity
liked a dataset 8 months ago
google/granola-entity-questions
Updated Aug 1, 2024 • 12.5k • 97 • 7
liked a model over 1 year ago
google/t5_11b_trueteacher_and_anli
Text2Text Generation • Updated Dec 26, 2023 • 271 • 16
liked a model about 2 years ago
google/t5_xxl_true_nli_mixture
Text2Text Generation • Updated Mar 23, 2023 • 3.68k • 46