	---
	library_name: transformers
	license: mit
	datasets:
	- mlabonne/orpo-dpo-mix-40k
	base_model:
	- meta-llama/Llama-3.2-1B
	pipeline_tag: text-generation
	---

	# Orpo-Llama-3.2-1B-15k

	AdamLucek/Orpo-Llama-3.2-1B-15k is an [ORPO](https://arxiv.org/abs/2403.07691) fine-tuned version of [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B), trained on a shuffled subset of 15,000 entries from [mlabonne/orpo-dpo-mix-40k](https://huggingface.co/datasets/mlabonne/orpo-dpo-mix-40k).
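
The subset selection described above can be sketched roughly as follows. This is a minimal illustration, not the exact code from the training script: the helper names `take_shuffled_subset` and `load_orpo_subset`, the seed value, and the `split="all"` argument are assumptions made for the example.

```python
def take_shuffled_subset(dataset, n=15_000, seed=42):
    """Shuffle `dataset` and keep the first `n` rows.

    Works with any object exposing `shuffle(seed=...)`, `select(indices)`
    and `len()`, such as a Hugging Face `datasets.Dataset`.
    The seed is an assumption; the value used for this model is not stated.
    """
    n = min(n, len(dataset))  # guard against datasets smaller than n
    return dataset.shuffle(seed=seed).select(range(n))


def load_orpo_subset(n=15_000, seed=42):
    """Download the preference dataset and return a shuffled n-row subset."""
    # Imported lazily so the helper above works without `datasets` installed.
    from datasets import load_dataset

    # split="all" is an assumption; it concatenates every available split.
    full = load_dataset("mlabonne/orpo-dpo-mix-40k", split="all")
    return take_shuffled_subset(full, n=n, seed=seed)
```

Calling `load_orpo_subset()` requires network access and the `datasets` package; fixing the seed keeps the selected rows reproducible across runs.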

	The model was trained for 7 hours on an L4 GPU with [this training script](https://colab.research.google.com/drive/1KV9AFAfhQCSjF8Ej4rI2ejDmx5AUnqHq?usp=sharing), modified from [Maxime Labonne's original guide](https://mlabonne.github.io/blog/posts/2024-04-19_Fine_tune_Llama_3_with_ORPO.html).

	For full model details, refer to the base model page, [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B).

	## Evaluations

	Results compared with [AdamLucek/Orpo-Llama-3.2-1B-40k](https://huggingface.co/AdamLucek/Orpo-Llama-3.2-1B-40k), evaluated using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness):

	| Benchmark | 15k Accuracy | 15k Normalized | 40k Accuracy | 40k Normalized | Notes |
	|----------------|--------------|----------------|--------------|----------------|-------------------------------------------|
	| AGIEval | 22.14% | 21.01% | 23.57% | 23.26% | 0-shot average across multiple reasoning tasks |
	| GPT4ALL | 51.15% | 54.38% | 51.63% | 55.00% | 0-shot average across all categories |
	| TruthfulQA | 42.79% | N/A | 42.14% | N/A | MC2 accuracy |
	| MMLU | 31.22% | N/A | 31.01% | N/A | 5-shot average across all categories |
	| Winogrande | 61.72% | N/A | 61.12% | N/A | 0-shot evaluation |
	| ARC Challenge | 32.94% | 36.01% | 33.36% | 37.63% | 0-shot evaluation |
	| ARC Easy | 64.52% | 60.40% | 65.91% | 60.90% | 0-shot evaluation |
	| BoolQ | 50.24% | N/A | 52.29% | N/A | 0-shot evaluation |
	| PIQA | 75.46% | 74.37% | 75.63% | 75.19% | 0-shot evaluation |
	| HellaSwag | 48.56% | 64.71% | 48.46% | 64.50% | 0-shot evaluation |
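
To make the table easier to read at a glance, the accuracy columns can be turned into per-benchmark differences. The snippet below is a small sketch that only re-uses the figures from the table; the unweighted mean across benchmarks is a rough summary chosen for illustration, not a metric reported by lm-evaluation-harness.

```python
# (benchmark, 15k accuracy %, 40k accuracy %), copied from the Accuracy columns above.
ACCURACY = [
    ("AGIEval", 22.14, 23.57),
    ("GPT4ALL", 51.15, 51.63),
    ("TruthfulQA", 42.79, 42.14),
    ("MMLU", 31.22, 31.01),
    ("Winogrande", 61.72, 61.12),
    ("ARC Challenge", 32.94, 33.36),
    ("ARC Easy", 64.52, 65.91),
    ("BoolQ", 50.24, 52.29),
    ("PIQA", 75.46, 75.63),
    ("HellaSwag", 48.56, 48.46),
]


def accuracy_deltas(rows):
    """Return {benchmark: 15k minus 40k accuracy} in percentage points."""
    return {name: round(acc_15k - acc_40k, 2) for name, acc_15k, acc_40k in rows}


deltas = accuracy_deltas(ACCURACY)
wins_15k = [name for name, delta in deltas.items() if delta > 0]
mean_delta = sum(deltas.values()) / len(deltas)  # unweighted, illustrative only

for name, delta in deltas.items():
    print(f"{name:<14} {delta:+.2f} pp")
print(f"15k ahead on {len(wins_15k)}/{len(deltas)} benchmarks, mean delta {mean_delta:+.3f} pp")
```

On these figures the 15k model is ahead on four of the ten benchmarks (TruthfulQA, MMLU, Winogrande, HellaSwag), and every gap is roughly two percentage points or smaller.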

	## Using this Model

	```python
	from transformers import AutoTokenizer
	import transformers
	import torch

	# Load Model and Pipeline
	model = "AdamLucek/Orpo-Llama-3.2-1B-15k"

	pipeline = transformers.pipeline(
	    "text-generation",
	    model=model,
	    torch_dtype=torch.float16,
	    device_map="auto",
	)

	# Load Tokenizer
	tokenizer = AutoTokenizer.from_pretrained(model)

	# Generate Message
	messages = [{"role": "user", "content": "What is a language model?"}]
	prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
	outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
	print(outputs[0]["generated_text"])
	```
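
By default, the text-generation pipeline returns the prompt followed by the completion in `generated_text`. If only the model's reply is wanted, one option is to pass `return_full_text=False` in the pipeline call; another is to strip the prompt afterwards, as in this minimal sketch. The helper name `extract_completion` and the example strings are hypothetical.

```python
def extract_completion(generated_text: str, prompt: str) -> str:
    """Return only the newly generated text, dropping the echoed prompt."""
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].strip()
    # Prompt not echoed (e.g. return_full_text=False was used): keep text as is.
    return generated_text.strip()


# Illustrative strings only; real prompts come from tokenizer.apply_chat_template.
example_prompt = "PROMPT: What is a language model?\n"
example_output = example_prompt + "A language model predicts the next token."
print(extract_completion(example_output, example_prompt))
```

With the usage example above, this would be called as `extract_completion(outputs[0]["generated_text"], prompt)`.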

	## Training Statistics

	<div style="display: grid; grid-template-columns: repeat(2, 1fr); gap: 5px; max-width: 1000px;">
	<div>
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65ba68a15d2ef0a4b2c892b4/p_GHj_vst0xnC7tBznwRk.png" alt="Panel 1" style="width: 100%; height: auto;">
	</div>
	<div>
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65ba68a15d2ef0a4b2c892b4/AT6XO0WuHOWICT5omJ1L5.png" alt="Panel 2" style="width: 100%; height: auto;">
	</div>
	<div>
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65ba68a15d2ef0a4b2c892b4/XOXtthQ1RWxzcIP6V8-o_.png" alt="Panel 3" style="width: 100%; height: auto;">
	</div>
	<div>
	<img src="https://cdn-uploads.huggingface.co/production/uploads/65ba68a15d2ef0a4b2c892b4/WmV9BWOBxElAvZ3aClgUu.png" alt="Panel 4" style="width: 100%; height: auto;">
	</div>
	</div>