RLHF-And-Friends/Llama-3.2-1B-Instruct-Reward-ultrafeedback_binarized-max_length-1024-LoRA-8r • Updated 8 days ago
Grogros/dmWM-meta-llama-Llama-3.2-1B-Instruct-ft-OpenMathInstruct Text Generation • Updated 1 day ago • 6
RLHF-And-Friends/RM-UltrafeedbackBinarized-Llama-3.2-1B-Instruct-Q4-LoRA8-Batch-16-Tok-1024 • Updated about 10 hours ago