Hugging Face
Tianlin Liu
tianlinliu0121
4 followers · 2 following
https://tianlinliu.com/
liutianlin0121
AI & ML interests
None yet
Organizations
Articles
2
Article
69
The N Implementation Details of RLHF with PPO
Article
2
N Implementation Details of RLHF with the PPO Algorithm (Chinese-language version)
Papers
2
arXiv:2402.04792
arXiv:2402.02992
models
4
tianlinliu0121/zephyr-7b-dpo-full-debug-regression • Text Generation • 7B • Updated Dec 7, 2023 • 2
tianlinliu0121/zephyr-7b-dpo-full-beta-0.2 • Text Generation • 7B • Updated Nov 23, 2023 • 4
tianlinliu0121/zephyr-7b-dpo-full-beta-0.083 • Text Generation • 7B • Updated Nov 19, 2023 • 1
tianlinliu0121/zephyr-7b-dpo-full • Text Generation • 7B • Updated Nov 18, 2023 • 1
datasets
0
None public yet