DATASET

  • What's new?: Uses version 3.2 of the dataset (Langfuse + AWS), which has better quality:
    • Removed all 10- and 15-question quizzes; the dataset now focuses only on the 5-question count
    • Fixed all Vietnamese quizzes (ensuring the output is actually in Vietnamese)
    • Fixed some lazily duplicated topics (Biglead, Computing)
    • Removed the Paragraph question type, replacing it with MCQ for all data points (a cleanup sketch follows this list)
    • Trained using the default training config (60 steps, linear learning-rate schedule)
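
A minimal sketch of the cleanup pass described above, assuming each data point is a dict with `question_count` and `questions` fields; the field names and the `MCQ`/`Paragraph` type labels are assumptions about the schema, not the actual pipeline:

```python
# Hypothetical cleanup pass for dataset v3.2; field names and type labels
# are assumed, not taken from the real schema.
def clean_dataset(points):
    cleaned = []
    for point in points:
        # Keep only 5-question quizzes; drop the 10- and 15-question ones.
        if point["question_count"] != 5:
            continue
        # Replace every Paragraph question with an MCQ question.
        point["questions"] = [
            {**q, "type": "MCQ"} if q["type"] == "Paragraph" else q
            for q in point["questions"]
        ]
        cleaned.append(point)
    return cleaned
```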

TRAINING

  • 1075.90 seconds (17.93 minutes) used for training.
  • Peak reserved memory = 7.877 GB.
  • Peak reserved memory for training = 6.729 GB.
  • Peak reserved memory % of max memory = 53.411 %.
  • Peak reserved memory for training % of max memory = 45.627 %.
  • Final loss = 0.74
  • View the full training run on Weights & Biases: https://wandb.ai/vietphuongnguyen2602-rockship/huggingface/runs/04u9obeu (a sketch of the config and stat collection follows this list)
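
The runtime and peak-memory figures above come from the standard Unsloth notebook printout. A hedged sketch of the setup (only `max_steps=60` and the linear LR schedule are stated on this card; the other hyperparameters and the `model`/`tokenizer`/`dataset` variables are assumptions):

```python
import torch
from transformers import TrainingArguments
from trl import SFTTrainer

# Baseline reserved-memory high-water mark before training starts.
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024**3, 3)

trainer = SFTTrainer(
    model=model,                # assumed: the Unsloth-patched 4-bit model
    tokenizer=tokenizer,
    train_dataset=dataset,      # assumed: the v3.2 quiz dataset
    dataset_text_field="text",  # assumed field name
    args=TrainingArguments(
        max_steps=60,                   # stated on this card
        lr_scheduler_type="linear",     # stated on this card
        per_device_train_batch_size=2,  # assumed Unsloth default
        gradient_accumulation_steps=4,  # assumed Unsloth default
        learning_rate=2e-4,             # assumed Unsloth default
        output_dir="outputs",
    ),
)
stats = trainer.train()

# Runtime and peak-memory accounting, as in the standard Unsloth notebook.
print(f"{stats.metrics['train_runtime']} seconds used for training.")
print(f"{round(stats.metrics['train_runtime'] / 60, 2)} minutes used for training.")
max_memory = round(torch.cuda.get_device_properties(0).total_memory / 1024**3, 3)
used_memory = round(torch.cuda.max_memory_reserved() / 1024**3, 3)
used_for_training = round(used_memory - start_gpu_memory, 3)
print(f"Peak reserved memory = {used_memory} GB.")
print(f"Peak reserved memory for training = {used_for_training} GB.")
print(f"Peak reserved memory % of max memory = {round(used_memory / max_memory * 100, 3)} %.")
print(f"Peak reserved memory for training % of max memory = {round(used_for_training / max_memory * 100, 3)} %.")
```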

FINAL BENCHMARKING

  • Time to First Token (TTFT): 0.002 s
  • Time Per Output Token (TPOT): 40.85 ms/token
  • Throughput: 25.66 tokens/s
  • Average Token Latency: 40.90 ms/token
  • Total Generation Time: 63.015 s
  • Input Tokenization Time: 0.008 s
  • Input Tokens: 1909
  • Output Tokens: 984
  • Total Tokens: 2892
  • Memory Usage (GPU): 1.49 GB (a measurement sketch follows this list)
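
A rough sketch of how metrics like these can be measured with `transformers`; the prompt, the streamer-based timing, and `max_new_tokens` are assumptions about the methodology, not the actual benchmark script:

```python
import time
from threading import Thread

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

# This card's repo id; the prompt and decoding settings below are assumptions.
model_id = "vietphuon/Llama-3.2-1B-Instruct-bnb-4bit-quizgen-241025-1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "Generate a 5-question MCQ quiz about photosynthesis."  # assumed prompt

t0 = time.perf_counter()
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
tokenize_time = time.perf_counter() - t0                  # Input Tokenization Time

# Generate in a background thread so tokens can be timed as they stream out.
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
start = time.perf_counter()
Thread(target=model.generate,
       kwargs=dict(**inputs, streamer=streamer, max_new_tokens=1024)).start()

ttft = None
text = ""
for chunk in streamer:
    if ttft is None:
        ttft = time.perf_counter() - start                # Time to First Token
    text += chunk
total_time = time.perf_counter() - start                  # Total Generation Time

output_tokens = len(tokenizer(text, add_special_tokens=False)["input_ids"])
tpot = (total_time - ttft) / max(output_tokens - 1, 1)    # Time Per Output Token
throughput = output_tokens / total_time                   # tokens/s
print(f"TTFT: {ttft:.3f} s  TPOT: {tpot * 1000:.2f} ms/token  "
      f"Throughput: {throughput:.2f} tokens/s  "
      f"GPU: {torch.cuda.max_memory_allocated() / 1024**3:.2f} GB")
```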

Uploaded model

  • Developed by: vietphuon
  • License: apache-2.0
  • Finetuned from model: unsloth/Llama-3.2-1B-Instruct-bnb-4bit

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
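
Loading the base model with Unsloth typically looks like the following minimal sketch (`max_seq_length` is an assumed value, not a setting confirmed by this card):

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model this adapter was finetuned from.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    max_seq_length=2048,   # assumed; not stated on this card
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)  # enable Unsloth's fast inference path
```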

