Uploaded model
- Developed by: axel-darmouni
- License: apache-2.0
- Finetuned from model: unsloth/gemma-3-1b-it-unsloth-bnb-4bit
This gemma3_text model was trained 2x faster with Unsloth and Hugging Face's TRL library.
This gemma3 model aims to improve gemma-3-1b-it's ability to generate haikus from the prompt prefix "Generate a haiku about the following topic: {topic}". The topics tested were shorter than a sentence. The model was trained on the axel-darmouni/haiku_dataset.
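As a quick illustration of the prompt format, the sketch below queries the uploaded model through the Hugging Face transformers chat pipeline; the generation settings are assumptions rather than values from this card, and loading details (e.g., quantization handling for the 4-bit base) may differ.

```python
# Minimal inference sketch (setup and generation settings are assumptions).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="axel-darmouni/gemma-3-1b-haikuspec",
)

# The prompt prefix described above, with a short (sub-sentence) topic.
messages = [{"role": "user",
             "content": "Generate a haiku about the following topic: autumn rain"}]

out = generator(messages, max_new_tokens=64)  # haikus are short
print(out[0]["generated_text"][-1]["content"])
```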
Results of all training runs from the training GitHub repository are listed below (Total Score is the sum of the Haiku and Similarity scores):
| Model | Haiku Score | Similarity Score | Total Score | Train Overlap |
|---|---|---|---|---|
| unsloth/gemma-3-1b-it | 0.0372 | -0.0998 | -0.0627 | 0.00% |
| gemma-3-1b-haiku | 0.1351 | 0.1101 | 0.2453 | 0.00% |
| gemma-3-1b-sftrl-haiku | 0.0878 | 0.3708 | 0.4587 | 0.00% |
| gemma-3-1b-sftrl-haiku-sparse | 0.1858 | -0.0880 | 0.0978 | 0.00% |
| gemma-3-haiku-rl-sparse | 0.1537 | -0.1206 | 0.0331 | 0.00% |
| gemma-3-1b-fullrun | 0.2348 | 0.0588 | 0.2936 | 0.00% |
The fullrun model, which is the one uploaded here, combines SFT and RL with sparse rewards, polished by a final run with continuous rewards.
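A minimal sketch of what such an SFT-then-RL pipeline could look like with TRL is shown below; the trainer choices, config values, dataset columns, and the placeholder reward are all assumptions, not the repository's actual scripts.

```python
# Hypothetical SFT-then-RL pipeline in TRL; all names and settings are
# assumptions, not the author's actual training scripts.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer, GRPOConfig, GRPOTrainer

# Assumes the dataset exposes the columns each trainer expects
# (e.g., "text"/"messages" for SFT, "prompt" for GRPO).
dataset = load_dataset("axel-darmouni/haiku_dataset", split="train")

# Stage 1: supervised fine-tuning on prompt/haiku pairs.
sft = SFTTrainer(
    model="unsloth/gemma-3-1b-it-unsloth-bnb-4bit",
    train_dataset=dataset,
    args=SFTConfig(output_dir="gemma-3-1b-haiku-sft", max_steps=500),
)
sft.train()
sft.save_model()

# Placeholder sparse reward: full credit only for three-line outputs.
# The actual reward also checked 5-7-5 syllables (see the pyphen note below).
def sparse_haiku_reward(completions, **kwargs):
    return [1.0 if len(c.strip().splitlines()) == 3 else 0.0
            for c in completions]

# Stage 2: RL (GRPO used here as an example) with the sparse reward.
grpo = GRPOTrainer(
    model="gemma-3-1b-haiku-sft",
    reward_funcs=sparse_haiku_reward,
    train_dataset=dataset,
    args=GRPOConfig(output_dir="gemma-3-1b-haiku-rl", max_steps=500),
)
grpo.train()
```

A continuous variant would return a graded score instead of 0/1, matching the polishing stage described above.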
Warning: the haiku reward may be biased due to issues with the pyphen library, which was used to detect valid haikus.
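For context, a pyphen-based syllable check typically looks like the sketch below; this is a hypothetical reconstruction, not the exact reward used in training, and it shows where the bias comes from: pyphen counts hyphenation points, which only approximate syllables.

```python
# Hypothetical pyphen-based 5-7-5 check; not the exact training reward.
import pyphen

dic = pyphen.Pyphen(lang="en_US")

def count_syllables(line: str) -> int:
    # Hyphenation points approximate syllables and can under- or
    # over-count (e.g., words pyphen has no pattern for count as 1).
    # Naive whitespace tokenization; punctuation is not stripped.
    return sum(len(dic.inserted(word).split("-"))
               for word in line.split())

def is_575(text: str) -> bool:
    lines = [l for l in text.strip().splitlines() if l.strip()]
    return (len(lines) == 3 and
            [count_syllables(l) for l in lines] == [5, 7, 5])
```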
Model tree for axel-darmouni/gemma-3-1b-haikuspec:
- Base model: google/gemma-3-1b-pt
- Finetuned: google/gemma-3-1b-it
- Quantized: unsloth/gemma-3-1b-it-unsloth-bnb-4bit