gupta-tanish/llama-3-8b-instruct-refa-budget_length-256-lamda-1.0-iteration2 Text Generation • 8B • Updated Jun 9
gupta-tanish/llama-3-8b-instruct-refa-budget_length-256-lamda-20.0-iteration1 Text Generation • 8B • Updated Jun 8
gupta-tanish/llama-3-8b-instruct-refa-lr-1e-6-beta10-gamma4-lambda-1.0-eos-increase-iteration2-lamda-0.1 Text Generation • 8B • Updated Jun 7
gupta-tanish/llama-3-8b-instruct-refa-lr-1e-6-beta10-gamma4-lambda-0.1-eos-increase-iteration2 Text Generation • 8B • Updated Jun 7
gupta-tanish/llama3-8b-instruct-refa-eos-increase-lamda-0.001-lr-1e-6-iteration1 Text Generation • 8B • Updated Jun 7
gupta-tanish/llama3-8b-instruct-refa-eos-increase-lamda-0.01-lr-1e-6-iteration1 Text Generation • 8B • Updated Jun 7 • 1
gupta-tanish/llama3-8b-instruct-refa-eos-increase-lamda-0.1-lr-1e-6-iteration1 Text Generation • 8B • Updated Jun 6
gupta-tanish/llama3-8b-instruct-refa-eos-increase-lamda-1.0-lr-1e-6-iteration1 Text Generation • 8B • Updated Jun 6
gupta-tanish/llama3-8b-instruct-on-policy-mpo-iteration1-v3 Text Generation • 8B • Updated Apr 22 • 1