BEE-spoke-data/smol_llama-220M-GQA-fineweb_edu

Text Generation

continual pretraining

text-generation-inference

2 contributors

History: 111 commits

leaderboard-pr-bot

Adding Evaluation Results (#1)

bea3606 verified 9 months ago

File                     Size           Last commit                     Updated
.gitattributes           1.52 kB        initial commit                  10 months ago
README.md                12.1 kB        Adding Evaluation Results (#1)  9 months ago
all_results.json         539 Bytes      End of training                 10 months ago
config.json              714 Bytes      Enable cache                    10 months ago
eval_results.json        320 Bytes      End of training                 10 months ago
generation_config.json   133 Bytes      Model save                      10 months ago
model.safetensors        436 MB (LFS)   Model save                      10 months ago
special_tokens_map.json  551 Bytes      Training in progress, step 200  10 months ago
tokenizer.json           1.84 MB        Training in progress, step 200  10 months ago
tokenizer.model          500 kB (LFS)   Training in progress, step 200  10 months ago
tokenizer_config.json    1.04 kB        Training in progress, step 200  10 months ago
train_results.json       296 Bytes      End of training                 10 months ago
trainer_state.json       889 kB         End of training                 10 months ago
training_args.bin        5.24 kB (LFS)  Training in progress, step 200  10 months ago

training_args.bin: Detected Pickle imports (9)
- "transformers.trainer_utils.HubStrategy",
- "transformers.trainer_utils.IntervalStrategy",
- "transformers.training_args.OptimizerNames",
- "accelerate.utils.dataclasses.DistributedType",
- "torch.device",
- "transformers.trainer_pt_utils.AcceleratorConfig",
- "transformers.trainer_utils.SchedulerType",
- "accelerate.state.PartialState",
- "transformers.training_args.TrainingArguments"
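
The pickle imports flagged for training_args.bin can be checked locally without unpickling the file. The following is a minimal sketch using only the Python standard library: it walks the pickle opcode stream with `pickletools.genops` and collects every `module.name` reference. The helper name `list_pickle_imports` is hypothetical, and this is not the scanner the Hub itself runs. The string tracking for `STACK_GLOBAL` is a heuristic and may miss references in unusual or adversarial pickles. It also assumes you pass raw pickle bytes; if the file is a zip archive (as files written by recent `torch.save` versions typically are), you would first need to read the embedded pickle member yourself.

```python
import pickle
import pickletools
from collections import Counter, OrderedDict

# Opcodes that push a string, read from the memo, or write to the memo.
_STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE",
               "SHORT_BINSTRING", "BINSTRING", "STRING"}
_GET_OPS = {"GET", "BINGET", "LONG_BINGET"}
_PUT_OPS = {"PUT", "BINPUT", "LONG_BINPUT"}


def list_pickle_imports(data: bytes) -> list[str]:
    """Return sorted 'module.name' references found in a pickle stream.

    The stream is only disassembled, never executed (hypothetical helper).
    """
    found = set()
    memo = {}      # memo index -> string value (None for non-strings)
    recent = []    # strings pushed so far, used to resolve STACK_GLOBAL
    top = None     # value most recently pushed, as far as we track it
    for opcode, arg, _pos in pickletools.genops(data):
        name = opcode.name
        if name == "MEMOIZE":
            memo[len(memo)] = top
        elif name in _PUT_OPS:
            memo[arg] = top
        elif name in _STRING_OPS or name in _GET_OPS:
            top = arg if name in _STRING_OPS else memo.get(arg)
            if isinstance(top, str):
                recent.append(top)
        elif name in ("GLOBAL", "INST"):
            # The argument arrives as a single "module name" string.
            module, qualname = arg.split(" ", 1)
            found.add(f"{module}.{qualname}")
            top = None
        elif name == "STACK_GLOBAL":
            # Module and name are the two most recent strings on the stack.
            if len(recent) >= 2:
                found.add(f"{recent[-2]}.{recent[-1]}")
            top = None
        else:
            top = None
    return sorted(found)


if __name__ == "__main__":
    # Demo on a harmless in-memory pickle rather than a downloaded file.
    demo = pickle.dumps([OrderedDict(), Counter()], protocol=4)
    print(list_pickle_imports(demo))
```

Disassembling rather than loading is the point of the design: `pickle.load` can run arbitrary code on import, while `pickletools.genops` only reads opcodes.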