
# English to Chinese Translation (Quantized Model)

This repository contains an English-to-Chinese translation model, fine-tuned on the `wlhb/Transaltion-Chinese-2-English` dataset and optimized with dynamic quantization for efficient CPU inference.

## 🔧 Model Details

- Base model: `Helsinki-NLP/opus-mt-en-zh`
- Dataset: `wlhb/Transaltion-Chinese-2-English`
- Training platform: Kaggle (CUDA GPU)
- Fine-tuned: on English-Chinese sentence pairs from the Hugging Face dataset
- Quantization: PyTorch dynamic quantization via `torch.quantization.quantize_dynamic` (see the sketch after this list)
- Tokenizer: saved alongside the model
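
For reference, a minimal sketch of the quantize-and-save step. The input path `opus-mt-en-zh-finetuned` is an illustrative placeholder for the fine-tuned checkpoint; the actual training script is not reproduced here.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the fine-tuned FP32 checkpoint (path is an assumption for illustration)
model = AutoModelForSeq2SeqLM.from_pretrained("opus-mt-en-zh-finetuned")
tokenizer = AutoTokenizer.from_pretrained("opus-mt-en-zh-finetuned")

# Dynamic quantization: nn.Linear weights are stored as int8 and
# activations are quantized on the fly, which speeds up CPU inference
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)

# Save model and tokenizer side by side, matching the folder layout below
quantized_model.save_pretrained("quantized_model")
tokenizer.save_pretrained("quantized_model")
```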

πŸ“ Folder Structure

```
quantized_model/
├── config.json
├── pytorch_model.bin
├── tokenizer_config.json
├── tokenizer.json
└── vocab.json / merges.txt
```


## 🚀 Usage

### 🔹 1. Load Quantized Model for Inference

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM, pipeline

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("./quantized_model")

# Load quantized model
model = AutoModelForSeq2SeqLM.from_pretrained("./quantized_model")
model.eval()

# Run translation on CPU (device=-1)
translator = pipeline("translation_en_to_zh", model=model, tokenizer=tokenizer, device=-1)

text = "How are you?"
print("English:", translator(text)[0]['translation_text'])
```

## Model Training Summary

- Loaded dataset: `wlhb/Transaltion-Chinese-2-English`
- Mapped the translation data into `{"en": ..., "zh": ...}` pairs before training (see the sketch after this list)
- Training: 3 epochs on GPU
- Disabled: wandb logging
- Skipped: evaluation phase
- Saved: the trained and quantized model together with the tokenizer
- Quantization: `torch.quantization.quantize_dynamic` is used for efficient CPU inference
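
The mapping step above could look roughly like the following. The `"en"`/`"zh"` field names and the `max_length` value are assumptions about the dataset schema, not verified against the actual training script:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("wlhb/Transaltion-Chinese-2-English")
tokenizer = AutoTokenizer.from_pretrained("Helsinki-NLP/opus-mt-en-zh")

def preprocess(batch):
    # English sentences are the inputs, Chinese sentences the targets;
    # the "en"/"zh" field names are assumed, check the dataset card
    model_inputs = tokenizer(batch["en"], max_length=128, truncation=True)
    labels = tokenizer(text_target=batch["zh"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True)
```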
