dzhuj's Collections
hw2 (updated Mar 9)
Solution for HW-2: DPO and PPO.
dzhuj/reward • Text Classification • 0.1B • Updated Mar 9 • 1
dzhuj/llm-course-hw2-dpo • Text Generation • 0.1B • Updated Mar 9 • 6
dzhuj/llm-course-hw2-ppo • Text Generation • 0.1B • Updated Mar 9 • 4
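
The models listed above can be loaded straight from the Hugging Face Hub. A minimal sketch, assuming the repositories are public and ship standard configs and tokenizers compatible with the transformers pipelines implied by their tags (text-classification for the reward model, text-generation for the DPO and PPO policies):

```python
from transformers import pipeline

# Reward model (sequence classification head) from the collection.
reward = pipeline("text-classification", model="dzhuj/reward")
print(reward("The assistant's answer was helpful and polite."))

# DPO-tuned policy (causal LM); swap in dzhuj/llm-course-hw2-ppo
# to try the PPO-trained variant instead.
generator = pipeline("text-generation", model="dzhuj/llm-course-hw2-dpo")
print(generator("Question: What is DPO?\nAnswer:", max_new_tokens=64))
```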