Zhilin Wang

zhilinw

AI & ML interests

None yet

Recent Activity

updated a dataset about 2 months ago

nvidia/HelpSteer3

updated a model about 2 months ago

nvidia/Llama-3_3-Nemotron-Super-49B-GenRM

updated a model about 2 months ago

nvidia/Llama-3_3-Nemotron-Super-49B-GenRM-Multilingual

New activity in nvidia/Llama-3.1-Nemotron-70B-Reward-HF 3 months ago

Comparability of the results for different prompts

#9 opened 3 months ago by

treehugg3

New activity in nvidia/HelpSteer3 3 months ago

Request access to ground-truth helpfulness scores for training Generative Reward Models (non-BT)

#5 opened 3 months ago by

andy-pi

The HelpSteer datasets don't overlap, right?

#2 opened 4 months ago by

treehugg3

For the data on Edit_quality, how to map the relationship between response and feedback?

#4 opened 3 months ago by

bittersweet

commented a paper 3 months ago

HelpSteer3-Preference: Open Human-Annotated Preference Data across Diverse Tasks and Languages

Paper • 2505.11475 • Published May 16

New activity in nvidia/HelpSteer3 3 months ago

Add task category, update paper link

#3 opened 3 months ago by

nielsr

commented a paper 5 months ago

Dedicated Feedback and Edit Models Empower Inference-Time Scaling for Open-Ended General-Domain Tasks

Paper • 2503.04378 • Published Mar 6

New activity in nvidia/Llama-3.1-Nemotron-70B-Reward-HF 7 months ago

Why the hf-format model does not have rm head, since the original format model does have.

#7 opened 8 months ago by

eyuansu71

New activity in nvidia/Llama-3.1-Nemotron-70B-Reward-HF 10 months ago

Question on tokenizer chat template and readout

#4 opened 10 months ago by

dereklim

New activity in nvidia/Llama-3.1-Nemotron-70B-Reward 10 months ago

Should we use the 5th dimension of the output only?

#2 opened 10 months ago by

liangqxx

New activity in nvidia/HelpSteer2 10 months ago

Preference split not loading

#5 opened 11 months ago by

sanderland

New activity in nvidia/Llama-3.1-Nemotron-70B-Reward-HF 11 months ago

Vllm

#2 opened 11 months ago by

dbasu

Add proper library tag

#1 opened 11 months ago by

osanseviero

New activity in nvidia/Llama-3.1-Nemotron-70B-Reward 11 months ago

Add pipeline tag

#1 opened 11 months ago by

nielsr

commented a paper 11 months ago

HelpSteer2-Preference: Complementing Ratings with Preferences

Paper • 2410.01257 • Published Oct 2, 2024

New activity in nvidia/Nemotron-4-340B-Reward about 1 year ago

Convertion to HF

#7 opened about 1 year ago by

lbathen

Running inference outside of triton

#6 opened about 1 year ago by

lbathen

Object shard /models/Nemotron-4-340B-Reward/model_weights/model.rm_head._extra_state/shard_0_1.pt not found

#3 opened about 1 year ago by

codybum

New activity in nvidia/Llama2-13B-SteerLM-RM about 1 year ago

RewardBench results?

#2 opened about 1 year ago by

Avelina

New activity in nvidia/Llama3-70B-SteerLM-RM about 1 year ago

huggingface compatible format