smcleish/clrs_gemma_2b_100k_finetune_with_traces


	---
	library_name: transformers
	license: mit
	base_model:
	- google/gemma-2b
	---

	# Model Details

	`google/gemma-2b` fine-tuned on 100,000 [CLRS-Text](https://github.com/google-deepmind/clrs/tree/master/clrs/_src/clrs_text) examples.
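A minimal loading sketch, assuming the checkpoint is published under the repo id `smcleish/clrs_gemma_2b_100k_finetune_with_traces` and ships a tokenizer alongside the weights. The helper `generate_answer` and its parameters are hypothetical. The card does not specify a prompt format, so the prompt is left to the caller.

```python
MODEL_ID = "smcleish/clrs_gemma_2b_100k_finetune_with_traces"  # assumed repo id


def generate_answer(prompt: str, max_new_tokens: int = 256) -> str:
    """Hypothetical helper: generate a continuation for a CLRS-Text style prompt."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependency until it is actually called.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repo includes tokenizer files alongside the weights.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer(...)` downloads the weights on first use.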

	## Training Details
	- Learning rate: 1e-4 peak, reached after 150 warmup steps, then cosine-decayed to 5e-6
	- Optimiser: AdamW
	- Batch size: 128
	- Loss: computed over the answer only, not the question
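The schedule and answer-only loss listed above can be sketched in plain Python. This is an illustration under stated assumptions, not the training code used for this model: the warmup shape (linear) and the total step count are not given in the card, and the helper names `learning_rate` and `build_labels` are hypothetical. The value `-100` is the default label index that PyTorch's cross-entropy loss ignores.

```python
import math

PEAK_LR = 1e-4       # from the card
FINAL_LR = 5e-6      # from the card
WARMUP_STEPS = 150   # from the card
IGNORE_INDEX = -100  # default ignore_index of PyTorch's cross-entropy loss


def learning_rate(step: int, total_steps: int) -> float:
    """Learning rate at a 0-indexed optimiser step.

    Assumptions: warmup is linear, and `total_steps` is supplied by the
    caller because the card does not state it.
    """
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    # Fraction of the decay phase completed, clamped to [0, 1].
    progress = (step - WARMUP_STEPS) / max(1, total_steps - WARMUP_STEPS)
    progress = min(1.0, progress)
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return FINAL_LR + (PEAK_LR - FINAL_LR) * cosine


def build_labels(question_ids, answer_ids):
    """Concatenate question and answer tokens; mask the question in the labels."""
    input_ids = list(question_ids) + list(answer_ids)
    # Question positions get IGNORE_INDEX, so only answer tokens add to the loss.
    labels = [IGNORE_INDEX] * len(question_ids) + list(answer_ids)
    return input_ids, labels
```

For example, if a single pass over the 100,000 examples at batch size 128 is assumed, `total_steps` would be `math.ceil(100_000 / 128)`, which is 782. The number of epochs is not stated in the card.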