PSFT+RL models
SII-Wenhong
wh-zhu
AI & ML interests
None yet
Recent Activity
authored a paper 2 days ago: Flexible Realignment of Language Models
authored a paper 2 days ago: Proximal Supervised Fine-Tuning
authored a paper 2 days ago: Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model
Organizations
None yet