xuhuang

xuhuang87

AI & ML interests

None yet

Organizations

None yet

authored a paper about 24 hours ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published 1 day ago • 172

upvoted a paper 1 day ago

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published 1 day ago • 172

upvoted a paper 2 days ago

DuPO: Enabling Reliable LLM Self-Verification via Dual Preference Optimization

Paper • 2508.14460 • Published 3 days ago • 71

New activity in jon-tow/okapi_arc_challenge 29 days ago

Error with load_dataset()

👍 1

#2 opened 29 days ago by xuhuang87

upvoted a paper 29 days ago

Seed-X: Building Strong Multilingual Translation LLM with 7B Parameters

Paper • 2507.13618 • Published Jul 18 • 14

upvoted 2 papers 3 months ago

A Controllable Examination for Long-Context Language Models

Paper • 2506.02921 • Published Jun 3 • 33

ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows

Paper • 2505.19897 • Published May 26 • 103

upvoted 2 papers 4 months ago

Could Thinking Multilingually Empower LLM Reasoning?

Paper • 2504.11833 • Published Apr 16 • 29

Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning

Paper • 2504.08672 • Published Apr 11 • 55

upvoted a paper 5 months ago

CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era

Paper • 2503.12329 • Published Mar 16 • 26

upvoted a paper 6 months ago

Process-based Self-Rewarding Language Models

Paper • 2503.03746 • Published Mar 5 • 40

upvoted a collection 6 months ago

BenchMAX

Collection

10 items • Updated Feb 11 • 8

upvoted a paper 6 months ago

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Paper • 2502.07346 • Published Feb 11 • 54

commented on a paper 6 months ago

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Paper • 2502.07346 • Published Feb 11 • 54

authored 3 papers 6 months ago

Lost in the Source Language: How Large Language Models Evaluate the Quality of Machine Translation

Paper • 2401.06568 • Published Jan 12, 2024

BenchMAX: A Comprehensive Multilingual Evaluation Suite for Large Language Models

Paper • 2502.07346 • Published Feb 11 • 54

IMTLab: An Open-Source Platform for Building, Evaluating, and Diagnosing Interactive Machine Translation Systems

Paper • 2310.11163 • Published Oct 17, 2023
