DATASET

  • What's new?: Uses version 3.2 of the dataset (Langfuse + AWS), which has better quality:
    • Removed all 10- and 15-question quizzes; the dataset now focuses only on the 5-question count
    • Fixed all Vietnamese quizzes (ensuring the output is actually in Vietnamese)
    • Fixed some lazily duplicated topics (Biglead, Computing)
    • Removed the Paragraph question type, replacing it with MCQ for all data points (a cleanup sketch follows this list)
    • Trained using the default training config (60 steps, linear learning-rate schedule)
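
A minimal sketch of the cleanup pass described above, assuming each data point is a dict with `question_count` and `questions` fields; the field names and the `MCQ`/`Paragraph` type labels are assumptions about the schema, not the actual pipeline:

```python
# Hypothetical cleanup pass for dataset v3.2; field names and type labels
# are assumed, not taken from the real schema.
def clean_dataset(points):
    cleaned = []
    for point in points:
        # Keep only 5-question quizzes; drop the 10- and 15-question ones.
        if point["question_count"] != 5:
            continue
        # Replace every Paragraph question with an MCQ question.
        point["questions"] = [
            {**q, "type": "MCQ"} if q["type"] == "Paragraph" else q
            for q in point["questions"]
        ]
        cleaned.append(point)
    return cleaned
```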

TRAINING

  • 1075.90 seconds (17.93 minutes) used for training.
  • Peak reserved memory = 7.877 GB.
  • Peak reserved memory for training = 6.729 GB.
  • Peak reserved memory % of max memory = 53.411 %.
  • Peak reserved memory for training % of max memory = 45.627 %.
  • Final loss = 0.74
  • View the full training run on Weights & Biases: https://wandb.ai/vietphuongnguyen2602-rockship/huggingface/runs/04u9obeu (a sketch of the config and stat collection follows this list)
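
The runtime and peak-memory figures above come from the standard Unsloth notebook printout. A hedged sketch of the setup (only `max_steps=60` and the linear LR schedule are stated on this card; the other hyperparameters and the `model`/`tokenizer`/`dataset` variables are assumptions):

```python
import torch
from transformers import TrainingArguments
from trl import SFTTrainer

# Baseline reserved-memory high-water mark before training starts.
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024**3, 3)

trainer = SFTTrainer(
    model=model,                # assumed: the Unsloth-patched 4-bit model
    tokenizer=tokenizer,
    train_dataset=dataset,      # assumed: the v3.2 quiz dataset
    dataset_text_field="text",  # assumed field name
    args=TrainingArguments(
        max_steps=60,                   # stated on this card
        lr_scheduler_type="linear",     # stated on this card
        per_device_train_batch_size=2,  # assumed Unsloth default
        gradient_accumulation_steps=4,  # assumed Unsloth default
        learning_rate=2e-4,             # assumed Unsloth default
        output_dir="outputs",
    ),
)
stats = trainer.train()

# Runtime and peak-memory accounting, as in the standard Unsloth notebook.
print(f"{stats.metrics['train_runtime']} seconds used for training.")
print(f"{round(stats.metrics['train_runtime'] / 60, 2)} minutes used for training.")
max_memory = round(torch.cuda.get_device_properties(0).total_memory / 1024**3, 3)
used_memory = round(torch.cuda.max_memory_reserved() / 1024**3, 3)
used_for_training = round(used_memory - start_gpu_memory, 3)
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_for_training} GB.")
print(f"Peak reserved memory % of max memory = {round(used_memory / max_memory * 100, 3)} %.")
print(f"Peak reserved memory for training % of max memory = {round(used_for_training / max_memory * 100, 3)} %.")
```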

FINAL BENCHMARKING

  • Time to First Token (TTFT): 0.002 s
  • Time Per Output Token (TPOT): 40.85 ms/token
  • Throughput: 25.66 tokens/s
  • Average Token Latency: 40.90 ms/token
  • Total Generation Time: 63.015 s
  • Input Tokenization Time: 0.008 s
  • Input Tokens: 1909
  • Output Tokens: 984
  • Total Tokens: 2892
  • Memory Usage (GPU): 1.49 GB (a measurement sketch follows this list)
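
A rough sketch of how metrics like these can be measured with `transformers`; the prompt, the streamer-based timing, and `max_new_tokens` are assumptions about the methodology, not the actual benchmark script:

```python
import time
from threading import Thread

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

# This card's repo id; the prompt and decoding settings below are assumptions.
model_id = "vietphuon/Llama-3.2-1B-Instruct-bnb-4bit-quizgen-241025-1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Generate a 5-question MCQ quiz about photosynthesis."  # assumed prompt

t0 = time.perf_counter()
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
tokenize_time = time.perf_counter() - t0                  # Input Tokenization Time

# Generate in a background thread so tokens can be timed as they stream out.
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
start = time.perf_counter()
Thread(target=model.generate,
       kwargs=dict(**inputs, streamer=streamer, max_new_tokens=1024)).start()

ttft = None
text = ""
for chunk in streamer:
    if ttft is None:
        ttft = time.perf_counter() - start                # Time to First Token
    text += chunk
total_time = time.perf_counter() - start                  # Total Generation Time

output_tokens = len(tokenizer(text, add_special_tokens=False)["input_ids"])
tpot = (total_time - ttft) / max(output_tokens - 1, 1)    # Time Per Output Token
throughput = output_tokens / total_time                   # tokens/s
print(f"TTFT: {ttft:.3f} s  TPOT: {tpot * 1000:.2f} ms/token  "
      f"Throughput: {throughput:.2f} tokens/s  "
      f"GPU: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GB")
```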

Uploaded model

  • Developed by: vietphuon
  • License: apache-2.0
  • Finetuned from model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
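
Loading the base model with Unsloth typically looks like the following minimal sketch (`max_seq_length` is an assumed value, not a setting confirmed by this card):

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model this adapter was finetuned from.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    max_seq_length=2048,   # assumed; not stated on this card
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path
```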

