---
base_model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit
tags:
- text-generation-inference
- transformers
- unsloth
- gemma3_text
- trl
license: apache-2.0
language:
- en
datasets:
- axel-darmouni/haiku_dataset
---

# Uploaded model

- **Developed by:** axel-darmouni
- **License:** apache-2.0
- **Finetuned from model:** unsloth/gemma-3-1b-it-unsloth-bnb-4bit

This gemma3_text model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.

This gemma3 model aims to improve gemma-3-1b-it's ability to generate haikus given the prefix "Generate a haiku about the following topic: {topic}". Topics tested were less than a sentence long.

The model was trained on the [axel-darmouni/haiku_dataset](https://huggingface.co/datasets/axel-darmouni/haiku_dataset).

Results of all training runs from the [training repository](https://github.com/axeld5/gemma_haiku) are listed below:

| Model | Haiku Score | Similarity Score | Total Score | Train Overlap |
|-------|-------------|------------------|-------------|---------------|
| unsloth/gemma-3-1b-it | 0.0372 | -0.0998 | -0.0627 | 0.00% |
| gemma-3-1b-haiku | 0.1351 | 0.1101 | 0.2453 | 0.00% |
| gemma-3-1b-sftrl-haiku | 0.0878 | 0.3708 | 0.4587 | 0.00% |
| gemma-3-1b-sftrl-haiku-sparse | 0.1858 | -0.0880 | 0.0978 | 0.00% |
| gemma-3-haiku-rl-sparse | 0.1537 | -0.1206 | 0.0331 | 0.00% |
| gemma-3-1b-fullrun | 0.2348 | 0.0588 | 0.2936 | 0.00% |

The full run (gemma-3-1b-fullrun), which is the uploaded model, combines SFT and RL with sparse rewards, followed by a polishing run with continuous rewards.

Warning: the haiku reward may be biased due to issues with the pyphen library, which is used to identify haikus.

[Unsloth](https://github.com/unslothai/unsloth)
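
Below is a minimal inference sketch using Hugging Face `transformers` and the prompt prefix described above. The repo id `axel-darmouni/gemma-3-1b-fullrun` is an assumption; substitute the actual id of this uploaded model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: replace with the actual repo id of the uploaded model.
model_id = "axel-darmouni/gemma-3-1b-fullrun"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

topic = "autumn rain"
# Prompt format described in this card.
prompt = f"Generate a haiku about the following topic: {topic}"

# Gemma-3 instruct models expect the chat template.
messages = [{"role": "user", "content": prompt}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```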
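
For context on the warning above: the haiku reward relies on pyphen for syllable counting. The sketch below shows an illustrative 5-7-5 check built on pyphen hyphenation; it is an approximation for explanation only, not the exact scoring code from the training repository.

```python
import pyphen

dic = pyphen.Pyphen(lang="en_US")

def count_syllables(line: str) -> int:
    # Hyphenation points + 1 per word is only a rough syllable estimate,
    # which is the kind of imprecision behind the bias noted in the warning.
    return sum(len(dic.inserted(word).split("-")) for word in line.split())

def looks_like_haiku(text: str) -> bool:
    # A haiku here means exactly three non-empty lines with 5, 7, 5 syllables.
    lines = [l for l in text.strip().splitlines() if l.strip()]
    return len(lines) == 3 and [count_syllables(l) for l in lines] == [5, 7, 5]
```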