Model Details

A google/gemma-2b model fine-tuned on 100,000 CLRS-Text examples.

Training Details

  • Learning rate: 1e-4 peak, with 150 warmup steps then cosine decay to 5e-6, using the AdamW optimiser
  • Batch size: 128
  • Loss computed over the answer tokens only, not the question
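The learning-rate schedule above can be sketched as follows. The peak rate (1e-4), warmup length (150 steps), and decay floor (5e-6) come from the card; the total step count (≈782 steps, assuming one epoch of 100,000 examples at batch size 128) and the linear shape of the warmup are assumptions for illustration.

```python
import math

PEAK_LR = 1e-4      # peak learning rate (from the card)
MIN_LR = 5e-6       # floor after cosine decay (from the card)
WARMUP_STEPS = 150  # warmup steps (from the card)
TOTAL_STEPS = 782   # assumption: 100,000 examples / batch size 128 ~ one epoch

def lr_at(step: int) -> float:
    """Learning rate at a given optimiser step: linear warmup, then cosine decay."""
    if step < WARMUP_STEPS:
        # linear warmup from 0 up to the peak rate (warmup shape assumed)
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # cosine decay from PEAK_LR down to MIN_LR over the remaining steps
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1 + math.cos(math.pi * progress))
```

The same shape is available off the shelf as `transformers.get_cosine_schedule_with_warmup`, though that helper decays to zero rather than a floor.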
Model Size

  • 2.51B params (Safetensors, BF16)

Model tree for smcleish/clrs_gemma_2b_100k_finetune_with_traces

  • Base model: google/gemma-2b