Uploaded model
- Developed by: axel-darmouni
- License: apache-2.0
- Finetuned from model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit
This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.
This gemma3 model aims to improve gemma-3-1b-it's ability to generate haikus from the prompt prefix "Generate a haiku about the following topic: {topic}". The topics tested were shorter than a sentence. The model was trained on the axel-darmouni/haiku_dataset.
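As a quick illustration of the prompt format, the sketch below queries the uploaded model through the Hugging Face transformers chat pipeline; the generation settings are assumptions rather than values from this card, and loading details (e.g., quantization handling for the 4-bit base) may differ.

```python
# Minimal inference sketch (setup and generation settings are assumptions).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="axel-darmouni/gemma-3-1b-haikuspec",
)

# The prompt prefix described above, with a short (sub-sentence) topic.
messages = [{"role": "user",
             "content": "Generate a haiku about the following topic: autumn rain"}]

out = generator(messages, max_new_tokens=64)  # haikus are short
print(out[0]["generated_text"][-1]["content"])
```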
Results of all training runs from the training GitHub repository are listed below (Total Score is the sum of the Haiku and Similarity scores):
| Model | Haiku Score | Similarity Score | Total Score | Train Overlap |
|---|---|---|---|---|
| unsloth/gemma-3-1b-it | 0.0372 | -0.0998 | -0.0627 | 0.00% |
| gemma-3-1b-haiku | 0.1351 | 0.1101 | 0.2453 | 0.00% |
| gemma-3-1b-sftrl-haiku | 0.0878 | 0.3708 | 0.4587 | 0.00% |
| gemma-3-1b-sftrl-haiku-sparse | 0.1858 | -0.0880 | 0.0978 | 0.00% |
| gemma-3-haiku-rl-sparse | 0.1537 | -0.1206 | 0.0331 | 0.00% |
| gemma-3-1b-fullrun | 0.2348 | 0.0588 | 0.2936 | 0.00% |
The fullrun model, which is the one uploaded here, combines SFT and RL with sparse rewards, polished by a final run with continuous rewards.
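A minimal sketch of what such an SFT-then-RL pipeline could look like with TRL is shown below; the trainer choices, config values, dataset columns, and the placeholder reward are all assumptions, not the repository's actual scripts.

```python
# Hypothetical SFT-then-RL pipeline in TRL; all names and settings are
# assumptions, not the author's actual training scripts.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer, GRPOConfig, GRPOTrainer

# Assumes the dataset exposes the columns each trainer expects
# (e.g., "text"/"messages" for SFT, "prompt" for GRPO).
dataset = load_dataset("axel-darmouni/haiku_dataset", split="train")

# Stage 1: supervised fine-tuning on prompt/haiku pairs.
sft = SFTTrainer(
    model="unsloth/gemma-3-1b-it-unsloth-bnb-4bit",
    train_dataset=dataset,
    args=SFTConfig(output_dir="gemma-3-1b-haiku-sft", max_steps=500),
)
sft.train()
sft.save_model()

# Placeholder sparse reward: full credit only for three-line outputs.
# The actual reward also checked 5-7-5 syllables (see the pyphen note below).
def sparse_haiku_reward(completions, **kwargs):
    return [1.0 if len(c.strip().splitlines()) == 3 else 0.0
            for c in completions]

# Stage 2: RL (GRPO used here as an example) with the sparse reward.
grpo = GRPOTrainer(
    model="gemma-3-1b-haiku-sft",
    reward_funcs=sparse_haiku_reward,
    train_dataset=dataset,
    args=GRPOConfig(output_dir="gemma-3-1b-haiku-rl", max_steps=500),
)
grpo.train()
```

A continuous variant would return a graded score instead of 0/1, matching the polishing stage described above.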
Warning: the haiku reward may be biased due to issues with the pyphen library, which was used to detect valid haikus.
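For context, a pyphen-based syllable check typically looks like the sketch below; this is a hypothetical reconstruction, not the exact reward used in training, and it shows where the bias comes from: pyphen counts hyphenation points, which only approximate syllables.

```python
# Hypothetical pyphen-based 5-7-5 check; not the exact training reward.
import pyphen

dic = pyphen.Pyphen(lang="en_US")

def count_syllables(line: str) -> int:
    # Hyphenation points approximate syllables and can under- or
    # over-count (e.g., words pyphen has no pattern for count as 1).
    # Naive whitespace tokenization; punctuation is not stripped.
    return sum(len(dic.inserted(word).split("-"))
               for word in line.split())

def is_575(text: str) -> bool:
    lines = [l for l in text.strip().splitlines() if l.strip()]
    return (len(lines) == 3 and
            [count_syllables(l) for l in lines] == [5, 7, 5])
```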
Model tree for axel-darmouni/gemma-3-1b-haikuspec:
- Base model: google/gemma-3-1b-pt
- Finetuned: google/gemma-3-1b-it
- Quantized: unsloth/gemma-3-1b-it-unsloth-bnb-4bit