Uploaded model

  • Developed by: axel-darmouni
  • License: apache-2.0
  • Finetuned from model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit

This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.

This gemma3 model aims to improve gemma-3-1b-it's ability to generate haikus when given the prompt "Generate a haiku about the following topic: {topic}". The topics tested were less than a sentence long. The model was trained on the axel-darmouni/haiku_dataset dataset.
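A minimal inference sketch using the prompt format above (the chat-template usage, topic, and generation settings are illustrative assumptions, not taken from the training repository):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "axel-darmouni/gemma-3-1b-haikuspec"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

topic = "autumn rain"  # topics were kept shorter than a sentence
messages = [{
    "role": "user",
    "content": f"Generate a haiku about the following topic: {topic}",
}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```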

Results of all training runs from the training GitHub repository can be found below (the Total Score is the sum of the Haiku Score and the Similarity Score):

| Model                         | Haiku Score | Similarity Score | Total Score | Train Overlap |
|-------------------------------|-------------|------------------|-------------|---------------|
| unsloth/gemma-3-1b-it         | 0.0372      | -0.0998          | -0.0627     | 0.00%         |
| gemma-3-1b-haiku              | 0.1351      | 0.1101           | 0.2453      | 0.00%         |
| gemma-3-1b-sftrl-haiku        | 0.0878      | 0.3708           | 0.4587      | 0.00%         |
| gemma-3-1b-sftrl-haiku-sparse | 0.1858      | -0.0880          | 0.0978      | 0.00%         |
| gemma-3-haiku-rl-sparse       | 0.1537      | -0.1206          | 0.0331      | 0.00%         |
| gemma-3-1b-fullrun            | 0.2348      | 0.0588           | 0.2936      | 0.00%         |

The fullrun model, which is the one uploaded here, uses a combination of SFT and RL with sparse rewards, polished by a final run with continuous rewards.
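For intuition about the difference between the two reward schemes, here is a hedged sketch of what sparse versus continuous 5-7-5 rewards could look like; the actual reward functions live in the training repository, and the pyphen-based `count_syllables` helper below is only an assumption about how syllables were counted:

```python
import pyphen

# Assumed syllable counter: hyphenation break points + 1 per word.
_dic = pyphen.Pyphen(lang="en_US")

def count_syllables(line: str) -> int:
    words = [w.strip(".,;:!?'\"") for w in line.split()]
    return sum(len(_dic.inserted(w).split("-")) for w in words if w)

TARGET = (5, 7, 5)

def sparse_reward(haiku: str) -> float:
    # All-or-nothing: credit only for an exact 5-7-5 structure.
    lines = [l for l in haiku.strip().splitlines() if l.strip()]
    if len(lines) != 3:
        return 0.0
    return 1.0 if tuple(map(count_syllables, lines)) == TARGET else 0.0

def continuous_reward(haiku: str) -> float:
    # Partial credit that decays with distance from the 5-7-5 pattern.
    lines = [l for l in haiku.strip().splitlines() if l.strip()]
    if len(lines) != 3:
        return 0.0
    distance = sum(abs(count_syllables(l) - t) for l, t in zip(lines, TARGET))
    return 1.0 / (1.0 + distance)
```

The sparse variant rewards only exact haikus, while the continuous variant still gives a learning signal to near misses.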

Warning: the haiku reward may be biased due to issues with the pyphen library, which is used to count syllables when identifying haikus.
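One likely source of this bias: pyphen counts hyphenation break points rather than true syllables, and hyphenation dictionaries enforce minimum fragment lengths, so many short multi-syllable words receive no break point at all. A quick check (exact results depend on the dictionary version):

```python
import pyphen

dic = pyphen.Pyphen(lang="en_US")

# Two-syllable words that hyphenation dictionaries often refuse to split,
# so a hyphen-count heuristic scores them as a single syllable.
for word in ["haiku", "poem", "ocean", "quiet"]:
    print(word, "->", dic.inserted(word))
```

Any word that receives no break point is counted as one syllable, which can systematically misscore lines that are actually well-formed haikus.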

