LLM-Leaderboard

StarscreamDeceptions

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

DeepWideSearch: Benchmarking Depth and Width in Agentic Information Seeking

upvoted a paper 3 days ago

HSCodeComp: A Realistic and Expert-level Benchmark for Deep Search Agents in Hierarchical Rule Application

updated a Space 7 days ago

AIDC-AI/Marco-MT-Algharb

spaces 1

🌐 Multilingual MMLU Benchmark Leaderboard

View and submit LLM benchmarks

models 0

None public yet

datasets 2

StarscreamDeceptions/results

Viewer • Updated Nov 13, 2024 • 17 • 9

StarscreamDeceptions/requests

Preview • Updated Nov 13, 2024 • 17