sheza munir

shezamunir

ShezaMunir

AI & ML interests

None yet

Recent Activity

updated a Space 29 days ago

ExpertLongBench

🚀

Leaderboard for ExpertLongBench

authored a paper about 1 month ago

FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation

Paper • 2410.22257 • Published Oct 29, 2024

updated a dataset about 1 month ago

launch/ExpertLongBench

Preview • Updated 15 days ago • 294 • 6

published a dataset about 1 month ago

launch/ExpertLongBench

Preview • Updated 15 days ago • 294 • 6

published a Space about 1 month ago

ExpertLongBench

🚀

Leaderboard for ExpertLongBench

upvoted 2 papers 2 months ago

CLASH: Evaluating Language Models on Judging High-Stakes Dilemmas from Multiple Perspectives

Paper • 2504.10823 • Published Apr 15 • 14

MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?

Paper • 2504.09702 • Published Apr 13 • 18

updated 2 Spaces 4 months ago

FactRBench

🏆

View and analyze long-form factuality leaderboard

FactRBench

🏆

View and analyze long-form factuality leaderboard

updated a dataset 7 months ago

CSALT/deepfake_detection_dataset_urdu

Viewer • Updated Nov 29, 2024 • 6.79k • 13.4k • 4

updated a dataset 8 months ago

launch/FactBench

Viewer • Updated 20 days ago • 1k • 65 • 3

updated a Space 8 months ago

Factbench

📈

Display a leaderboard for evaluating language model factuality

updated a dataset 9 months ago

launch/FactBench

Viewer • Updated 20 days ago • 1k • 65 • 3
