VaidikML0508/Shark-Tank-Offer-Evaluator-llama3.2-3B-Instruct-SFT-DPO-4bits-V1 Text Generation • Updated Apr 22 • 11
NousResearch/DeepHermes-Egregore-v1-RLAIF-8b-Atropos Reinforcement Learning • Updated Apr 29 • 27 • 2
NousResearch/DeepHermes-Egregore-v2-RLAIF-8b-Atropos Reinforcement Learning • Updated Apr 29 • 17 • 4
NousResearch/DeepHermes-AscensionMaze-RLAIF-8b-Atropos Reinforcement Learning • Updated Apr 29 • 43 • 3