sabersaleh/Llama3-SimPO

sabersaleh committed on Dec 2, 2024

Commit 0d95e1b · verified · 1 Parent(s): b6e9e64

Create README.md

Files changed (1)

README.md +9 -0

README.md ADDED

	@@ -0,0 +1,9 @@

+---
+license: mit
+datasets:
+- HuggingFaceH4/ultrafeedback_binarized
+base_model:
+- meta-llama/Llama-3.1-8B
+---
+This model is princeton-nlp/Llama-3-Base-8B-SFT aligned with the Simple Preference Optimization (SimPO) loss on the UltraFeedback dataset (HuggingFaceH4/ultrafeedback_binarized). Training ran for a single epoch.
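
For readers unfamiliar with the SimPO loss named in the README, here is a minimal sketch of the per-example objective: the length-normalized log-probability of the chosen response must beat that of the rejected response by a target margin. The function name `simpo_loss` and the default values of `beta` and `gamma` are illustrative assumptions; the commit does not state the hyperparameters used to train this model.

```python
import math


def simpo_loss(chosen_token_logps, rejected_token_logps, beta=2.0, gamma=1.0):
    """Per-example SimPO loss from per-token log-probabilities.

    beta and gamma are hypothetical defaults, not the values used for this model.
    """
    # Implicit reward: beta times the average (length-normalized) token log-prob.
    chosen_reward = beta * sum(chosen_token_logps) / len(chosen_token_logps)
    rejected_reward = beta * sum(rejected_token_logps) / len(rejected_token_logps)

    # The chosen response should exceed the rejected one by the margin gamma.
    margin = chosen_reward - rejected_reward - gamma

    # -log(sigmoid(margin)), written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Toy usage with made-up token log-probs for one preference pair.
loss = simpo_loss([-1.0, -1.0], [-1.5, -1.5])
print(loss)
```

Unlike DPO, this objective needs no reference model, since the reward is computed from the policy's own average log-probability.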