5

bev

bevel86920

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted 2 papers 3 months ago

DITING: A Multi-Agent Evaluation Framework for Benchmarking Web Novel Translation

Paper • 2510.09116 • Published Oct 10, 2025 • 96

FinAuditing: A Financial Taxonomy-Structured Multi-Document Benchmark for Evaluating LLMs

Paper • 2510.08886 • Published Oct 10, 2025 • 19

upvoted a paper 5 months ago

From Scores to Skills: A Cognitive Diagnosis Framework for Evaluating Financial Large Language Models

Paper • 2508.13491 • Published Aug 19, 2025 • 59

upvoted 2 papers 7 months ago

MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation

Paper • 2506.14028 • Published Jun 16, 2025 • 93

FinTagging: An LLM-ready Benchmark for Extracting and Structuring Financial Information

Paper • 2505.20650 • Published May 27, 2025 • 17