Yihe Deng PRO

ydeng9

https://yihe-deng.notion.site/Yihe-Deng-167ab2d2c1fb80b3a76dfb120f716c84

Yihe__Deng

AI & ML interests

LLM post-training

ydeng9's activity

New activity in ydeng9/OpenVLThinker-7B 3 months ago

Highlight code

#2 opened 3 months ago by

Add library name and pipeline tag

#1 opened 3 months ago by

commented a paper 3 months ago

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

Paper • 2503.17352 • Published Mar 21 • 23

New activity in DuoGuard/DuoGuard-1.5B-transfer 4 months ago

Add link to code

#1 opened 4 months ago by

New activity in DuoGuard/DuoGuard-1B-Llama-3.2-transfer 4 months ago

Add link to Github repository

#1 opened 4 months ago by

New activity in DuoGuard/DuoGuard-0.5B 4 months ago

Add link to Github repository

#3 opened 4 months ago by

Add library name

#2 opened 4 months ago by

Add link to paper, add pipeline tag

#1 opened 4 months ago by

commented a paper 4 months ago

DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

Paper • 2502.05163 • Published Feb 7 • 22

commented a paper 8 months ago

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

Paper • 2410.22304 • Published Oct 29, 2024 • 18

commented a paper 11 months ago

MIRAI: Evaluating LLM Agents for Event Forecasting

Paper • 2407.01231 • Published Jul 1, 2024 • 18

commented a paper 12 months ago

MIRAI: Evaluating LLM Agents for Event Forecasting

Paper • 2407.01231 • Published Jul 1, 2024 • 18

New activity in UCLA-AGI/zephyr-7b-sft-full-SPIN-iter1 over 1 year ago

Training code

#1 opened over 1 year ago by

New activity in UCLA-AGI/zephyr-7b-sft-full-SPIN-iter2 over 1 year ago

How to reproduce the results?

#1 opened over 1 year ago by
