Sangsang/feedback_asymmetric_fixed_ema_Llama-3.1-8B-Instruct_bw0p5_fw0p5_ema0p999_ep30 Text Generation • Updated about 11 hours ago
Sangsang/feedback_both_ema_Llama-3.1-8B-Instruct_reverse_kl_ema0p999_ep30 Text Generation • Updated about 11 hours ago
Sangsang/feedback_disallowed_ema_Llama-3.1-8B-Instruct_reverse_kl_ema0p999_ep30 Text Generation • Updated about 11 hours ago
Sangsang/feedback_allowed_ema_Llama-3.1-8B-Instruct_reverse_kl_ema0p999_ep30 Text Generation • Updated about 12 hours ago
Sangsang/feedback_disallowed_ema_Qwen3-4B-Instruct-2507_reverse_kl_ema0p999_ep30 Text Generation • Updated 3 days ago • 11