# tFINE-base-300m-instruct-L2
This is an "instruct" model, fine-tuned from pszemraj/tFINE-base-300m in the following phases:

- two epochs on Super-Natural Instructions
- instruction tuning on 2M "easy"/L1 instructions
- instruction tuning on 1M "harder"/L2 instructions
## Usage example
```python
from transformers import pipeline

pipe = pipeline(
    "text2text-generation",
    model="pszemraj/tFINE-base-300m-instruct-L2",
)
prompt = "write a python script to download a file from a url and save as a local file using requests. explain how it works"
res = pipe(
    prompt,
    num_beams=4,
    early_stopping=True,
    max_new_tokens=384,
    no_repeat_ngram_size=7,
)
print(res[0]["generated_text"])
```
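For finer control over generation, the same call can be made without the pipeline wrapper. A minimal sketch, assuming the checkpoint loads with the standard seq2seq classes (`trust_remote_code=True` mirrors the eval invocation below):

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "pszemraj/tFINE-base-300m-instruct-L2"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name, trust_remote_code=True)

# tokenize the prompt and run beam search with the same settings
# as the pipeline example above
inputs = tokenizer(
    "write a python script to download a file from a url and save as a local file using requests. explain how it works",
    return_tensors="pt",
)
outputs = model.generate(
    **inputs,
    num_beams=4,
    early_stopping=True,
    max_new_tokens=384,
    no_repeat_ngram_size=7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```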
## Quick eval

Quick eval for: pszemraj/tFINE-base-300m-instruct-L2

hf (pretrained=pszemraj/tFINE-base-300m-instruct-L2,trust_remote_code=True,dtype=bfloat16), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| boolq | 2 | none | 0 | acc | ↑ | 0.6193 | ± | 0.0085 |
| openbookqa | 1 | none | 0 | acc | ↑ | 0.1440 | ± | 0.0157 |
| | | none | 0 | acc_norm | ↑ | 0.3040 | ± | 0.0206 |
| piqa | 1 | none | 0 | acc | ↑ | 0.6083 | ± | 0.0114 |
| | | none | 0 | acc_norm | ↑ | 0.6061 | ± | 0.0114 |
| social_iqa | 0 | none | 0 | acc | ↑ | 0.3823 | ± | 0.0110 |
| tinyArc | 0 | none | 25 | acc_norm | ↑ | 0.3469 | ± | N/A |
| tinyGSM8k | 0 | flexible-extract | 5 | exact_match | ↑ | 0.0371 | ± | N/A |
| | | strict-match | 5 | exact_match | ↑ | 0.0154 | ± | N/A |
| tinyHellaswag | 0 | none | 10 | acc_norm | ↑ | 0.3044 | ± | N/A |
| tinyMMLU | 0 | none | 0 | acc_norm | ↑ | 0.3311 | ± | N/A |
| winogrande | 1 | none | 0 | acc | ↑ | 0.5107 | ± | 0.0140 |
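The header line above corresponds to an lm-evaluation-harness run. A rough reproduction sketch via its Python API, covering only the zero-shot rows (the tiny* tasks pin their own few-shot counts, so they are omitted here; assumes `lm-eval` is installed):

```python
# pip install lm-eval
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=pszemraj/tFINE-base-300m-instruct-L2,dtype=bfloat16,trust_remote_code=True",
    tasks=["boolq", "openbookqa", "piqa", "social_iqa", "winogrande"],
    num_fewshot=0,
    batch_size=8,
)
print(results["results"])
```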
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 3e-05
- train_batch_size: 4
- eval_batch_size: 8
- seed: 17868
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 1.0
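A minimal sketch of how these values map onto `transformers`' `Seq2SeqTrainingArguments`; the `output_dir` is a placeholder and this is not the exact training script:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: maps the listed hyperparameters onto TrainingArguments.
# The optimizer defaults (AdamW with betas=(0.9, 0.999), eps=1e-8)
# already match the values listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="tFINE-base-300m-instruct-L2",  # placeholder
    learning_rate=3e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=17868,
    gradient_accumulation_steps=16,  # 4 * 16 = 64 total train batch size
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=1.0,
)
```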