Yaswanth Chittepu

yaswanthchittepu

yaswanthchittepu's activity

updated a dataset 7 days ago

yaswanthchittepu/safe_rlhf_safety_test

Viewer • Updated 7 days ago • 8k • 44

published a dataset 7 days ago

yaswanthchittepu/safe_rlhf_safety_test

Viewer • Updated 7 days ago • 8k • 44

updated a dataset 7 days ago

yaswanthchittepu/safe_rlhf_safety

Viewer • Updated 7 days ago • 4k • 22

published a dataset 7 days ago

yaswanthchittepu/safe_rlhf_safety

Viewer • Updated 7 days ago • 4k • 22

updated a dataset 7 days ago

yaswanthchittepu/safe_rlhf_val

Viewer • Updated 7 days ago • 4k • 26

published a dataset 7 days ago

yaswanthchittepu/safe_rlhf_val

Viewer • Updated 7 days ago • 4k • 26

updated a model 7 months ago

yaswanthchittepu/gemma1-sft-159744

Text Generation • Updated Aug 3, 2024 • 7

updated 4 datasets 7 months ago

updated 2 datasets 8 months ago

yaswanthchittepu/ultrafeedback-binarized-pop-margin-data-full

Viewer • Updated Jul 7, 2024 • 63.7k • 64

yaswanthchittepu/ultrafeedback-binarized-standard-margin-data-full

Viewer • Updated Jul 7, 2024 • 63.7k • 69

updated 3 models 8 months ago

yaswanthchittepu/pythia2.8b-ultrafeedback-binarized-pop-rm

Text Classification • Updated Jul 5, 2024 • 7

yaswanthchittepu/pythia2.8b-ultrafeedback-binarized-standard-rm

Text Classification • Updated Jul 5, 2024 • 7

yaswanthchittepu/pythia2.8b-ultrafeedback-binarized-sft

Text Generation • Updated Jul 5, 2024 • 6

authored a paper 9 months ago

Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms

Paper • 2406.02900 • Published Jun 5, 2024 • 12

updated 3 models 9 months ago

yaswanthchittepu/pythia-1b-tldr-ipo-beta-0.5-alpha-0-LATEST

Updated May 18, 2024

yaswanthchittepu/pythia-1b-tldr-ipo-beta-0.5-alpha-0-step-19968

Updated May 18, 2024

yaswanthchittepu/pythia-1b-tldr-dpo-beta-0.0175-alpha-0-LATEST

Updated May 18, 2024