DATASET
- What's new?: Uses version 3.2 of the dataset (Langfuse + AWS), which has better quality (a hedged cleanup sketch follows this list):
- Removed the 10- and 15-question variants; only the 5-question count is kept
- Fixed all Vietnamese quizzes (ensured the output is actually in Vietnamese)
- Fixed some lazily duplicated topics (Biglead, Computing)
- Removed the Paragraph type, replacing it with MCQ for all data points
- Trained using the default training config (60 steps, linear learning-rate schedule)
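For reference, here is a minimal sketch of the cleanup steps above, assuming a Hugging Face `datasets`-style pipeline; the file name and the field names (`question_count`, `format`, `topic`) are assumptions, not the actual v3.2 schema:

```python
from datasets import load_dataset

# Hypothetical export file; adjust to the real v3.2 dataset location.
ds = load_dataset("json", data_files="quiz_dataset_v3.2.jsonl")["train"]

# Keep only the 5-question quizzes (drop the 10- and 15-question variants).
ds = ds.filter(lambda ex: ex["question_count"] == 5)

# Replace the Paragraph question type with MCQ for every data point.
ds = ds.map(lambda ex: {**ex, "format": "MCQ" if ex["format"] == "Paragraph" else ex["format"]})

# One reading of the "duplicated topic" fix: keep only the first occurrence
# of each topic (e.g. Biglead, Computing had lazily duplicated entries).
seen_topics = set()
def first_occurrence(ex):
    if ex["topic"] in seen_topics:
        return False
    seen_topics.add(ex["topic"])
    return True
ds = ds.filter(first_occurrence)
# (The Vietnamese-output fixes are assumed to have been applied upstream.)
```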
TRAINING
- Training time: 1075.8979 seconds (17.93 minutes).
- Peak reserved memory = 7.877 GB (53.411 % of max memory; see the logging sketch after this list).
- Peak reserved memory for training = 6.729 GB (45.627 % of max memory).
- Final training loss = 0.74
- View the full training run here: https://wandb.ai/vietphuongnguyen2602-rockship/huggingface/runs/04u9obeu
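The run-time and memory lines above follow the print format of Unsloth's example notebooks. Below is a sketch of how they are typically produced; only the 60 steps and the linear schedule are confirmed by this card, so the other hyperparameters (batch size, learning rate, text field) are assumed notebook defaults, and `model`, `tokenizer`, and `ds` come from the earlier setup:

```python
import torch
from trl import SFTTrainer
from transformers import TrainingArguments

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=ds,
    dataset_text_field="text",       # assumed field name
    args=TrainingArguments(
        max_steps=60,                # "60 steps" from the config above
        lr_scheduler_type="linear",  # "linear lr" from the config above
        per_device_train_batch_size=2,   # assumed default
        gradient_accumulation_steps=4,   # assumed default
        learning_rate=2e-4,              # assumed default
        output_dir="outputs",
    ),
)

gpu_stats = torch.cuda.get_device_properties(0)
max_memory = round(gpu_stats.total_memory / 1024**3, 3)
# Snapshot reserved memory before training so the training-only delta can be derived.
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024**3, 3)

trainer_stats = trainer.train()

used_memory = round(torch.cuda.max_memory_reserved() / 1024**3, 3)
used_memory_for_training = round(used_memory - start_gpu_memory, 3)
print(f"{trainer_stats.metrics['train_runtime']} seconds used for training.")
print(f"{round(trainer_stats.metrics['train_runtime'] / 60, 2)} minutes used for training.")
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_memory_for_training} GB.")
print(f"Peak reserved memory % of max memory = {round(used_memory / max_memory * 100, 3)} %.")
print(f"Peak reserved memory for training % of max memory = {round(used_memory_for_training / max_memory * 100, 3)} %.")
```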
FINAL BENCHMARKING
- Time to First Token (TTFT): 0.002 s (see the measurement sketch after this list)
- Time Per Output Token (TPOT): 40.85 ms/token
- Throughput: 25.66 tokens/s
- Average Token Latency: 40.90 ms/token
- Total Generation Time: 63.015 s
- Input Tokenization Time: 0.008 s
- Input Tokens: 1909
- Output Tokens: 984
- Total Tokens: 2892
- Memory Usage (GPU): 1.49 GB
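The benchmark script itself isn't shown, so here is a hedged sketch of how single-request metrics like these are commonly measured with a plain `transformers` generate call. The function name and the TTFT approximation (a separate one-token generation covering prefill plus the first decode step) are assumptions, not the author's exact method:

```python
import time
import torch

def benchmark_generation(model, tokenizer, prompt, max_new_tokens=1024):
    """Rough single-request latency benchmark; assumes the model is on CUDA."""
    # Time input tokenization separately.
    t0 = time.perf_counter()
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    tokenize_s = time.perf_counter() - t0

    # Approximate TTFT: prefill plus one decode step.
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    model.generate(**inputs, max_new_tokens=1)
    torch.cuda.synchronize()
    ttft_s = time.perf_counter() - t0

    # Full generation for total time, per-token latency, and throughput.
    t0 = time.perf_counter()
    out = model.generate(**inputs, max_new_tokens=max_new_tokens)
    torch.cuda.synchronize()
    total_s = time.perf_counter() - t0

    n_in = inputs["input_ids"].shape[1]
    n_out = out.shape[1] - n_in
    return {
        "ttft_s": ttft_s,
        # TPOT: decode-only time spread over the remaining output tokens.
        "tpot_ms_per_token": (total_s - ttft_s) / max(n_out - 1, 1) * 1000,
        "throughput_tok_per_s": n_out / total_s,
        "avg_latency_ms_per_token": total_s / n_out * 1000,
        "total_generation_s": total_s,
        "tokenize_s": tokenize_s,
        "input_tokens": n_in,
        "output_tokens": n_out,
        "gpu_mem_gb": torch.cuda.max_memory_allocated() / 1024**3,
    }
```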
UPLOADED MODEL
- Developed by: vietphuon
- License: apache-2.0
- Finetuned from model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
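If you want to try the checkpoint, here is a minimal loading sketch, assuming the model was pushed to the Hugging Face Hub under the vietphuon namespace (the repo id below is a placeholder, not a confirmed path):

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="vietphuon/your-finetuned-model",  # hypothetical repo id
    max_seq_length=2048,
    load_in_4bit=True,  # matches the bnb-4bit base model
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path
```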