dropout-0.4-iter_0001192

This is a model uploaded from /mnt/nanjingcephfs/project_wx-rec-alg-bdc-exp/bwzheng/yulan/hyw/Ubiquant-Pretrain/build/wjp-share/output_mcore_qwen2.5_pretrain/checkpoint/dropout-0.4-2025.09.22-14.32.58-pretrain-mcore-qwen2.5-0.5B-lr-1e-5-minlr-1e-6-bs-4-gbs-1024-seqlen-8192-pr-bf16-tp-1-pp-1-cp-1-ac-sel-do-true-sp-false-ti-1192-wi-119/huggingface/qwen2-0.5b-using-llam2-modeling.
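
The run directory name in the path above appears to encode the training configuration as hyphen-separated key/value pairs. The following is a minimal sketch for pulling those pairs out of the name. The function name `parse_run_name` and the key list are hypothetical, chosen for illustration. The source does not define the abbreviations, so any reading of them (for example, `lr` as learning rate, `seqlen` as sequence length, or `ti` as the number of training iterations, which would match the `iter_0001192` suffix in the title) is an assumption.

```python
import re

# Run directory name copied verbatim from the checkpoint path above.
RUN_NAME = (
    "dropout-0.4-2025.09.22-14.32.58-pretrain-mcore-qwen2.5-0.5B-"
    "lr-1e-5-minlr-1e-6-bs-4-gbs-1024-seqlen-8192-pr-bf16-tp-1-pp-1-cp-1-"
    "ac-sel-do-true-sp-false-ti-1192-wi-119"
)

# Keys that appear in the name. Their meanings are NOT stated in the source;
# this list is only an assumption about which tokens act as keys.
KEYS = ["lr", "minlr", "bs", "gbs", "seqlen", "pr", "tp", "pp", "cp",
        "ac", "do", "sp", "ti", "wi"]


def parse_run_name(name: str) -> dict[str, str]:
    """Return {key: value} for every '-key-value' pair found in the name.

    A value runs until the next '-key-' token or the end of the string,
    so values containing hyphens (such as '1e-5') stay intact.
    """
    alternatives = "|".join(KEYS)
    pattern = re.compile(
        rf"-({alternatives})-(.+?)(?=-(?:{alternatives})-|$)"
    )
    return {m.group(1): m.group(2) for m in pattern.finditer(name)}


params = parse_run_name(RUN_NAME)
for key, value in params.items():
    print(f"{key} = {value}")
```

Values are kept as strings on purpose: converting `1e-5` to a float or `true` to a boolean would bake in an interpretation that the source does not confirm.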

Format: Safetensors
Model size: 494M params
Tensor type: BF16