sarthak247
/

gemma-3-1B-GRPO-float16

Text Generation

text-generation-inference

Model card Files Files and versions Community

Uploaded finetuned model

Developed by: sarthak247
License: apache-2.0
Finetuned from model : unsloth/gemma-3-1b-it

This gemma3_text model was trained 2x faster with Unsloth and Huggingface's TRL library.

Downloads last month: 11

Safetensors

Model size

1,000M params

Tensor type

BF16

·

FP16

·

Inference Providers NEW

Text Generation

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for sarthak247/gemma-3-1B-GRPO-float16

Base model

google/gemma-3-1b-pt

Finetuned

google/gemma-3-1b-it

Finetuned

unsloth/gemma-3-1b-it

Finetuned

(175)

this model

Collection including sarthak247/gemma-3-1B-GRPO-float16

Gemma-3-1B-GRPO

Gemma 3 (1B) model with GRPO training • 2 items • Updated Apr 7