---
base_model: meta-llama/Meta-Llama-3-8B-Instruct
library_name: peft
datasets:
- lordjia/Cantonese_English_Translation
---

# MISHANM/Cantonese_eng_text_generation_Llama3_8B_instruction

This model is fine-tuned from Meta-Llama-3-8B-Instruct for Cantonese. It handles question answering in Cantonese and translation between English and Cantonese, and is intended to give accurate, context-aware responses that reflect Cantonese usage.

## Model Details

1. Language: Cantonese
2. Tasks: Question Answering (Cantonese to Cantonese), Translation (English to Cantonese)
3. Base Model: meta-llama/Meta-Llama-3-8B-Instruct

## Training Details

The model is trained on approximately 109,942 instruction samples.

1. GPUs: 4 × AMD Radeon™ PRO V620
2. Training Time: 61:07:36

## Inference with HuggingFace

```python3
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and tokenizer
model_path = "MISHANM/Cantonese_eng_text_generation_Llama3_8B_instruction"
model = AutoModelForCausalLM.from_pretrained(model_path, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_path)

# Function to generate text
def generate_text(prompt, max_length=500, temperature=0.9):
    # Build the system and user messages
    messages = [
        {
            "role": "system",
            "content": "You are a Cantonese language expert and linguist, with same knowledge give response in Cantonese language.",
        },
        {"role": "user", "content": prompt},
    ]

    # Format the prompt with the role markers used by this model card
    formatted_prompt = f"<|system|>{messages[0]['content']}<|user|>{messages[1]['content']}<|assistant|>"

    # Tokenize and move the inputs to the same device as the model
    inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_length,
        temperature=temperature,
        do_sample=True,
    )
    # Decode the full sequence (prompt plus generated continuation)
    return tokenizer.decode(output[0], skip_special_tokens=True)

# Example usage
prompt = """佢日日搭的士出入,好似幾百萬未開頭噉。"""
translated_text = generate_text(prompt)
print(translated_text)
```

## Citation Information

```
@misc{MISHANM/Cantonese_eng_text_generation_Llama3_8B_instruction,
  author = {Mishan Maurya},
  title = {Introducing Fine Tuned LLM for Cantonese Language},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
}
```

- PEFT 0.12.0
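
A note on the inference example above: `generate_text` decodes the whole output sequence, so the returned string is likely to contain the system and user text as well as the model's answer. The sketch below shows one way to build the same prompt string and keep only the answer. It assumes the `<|system|>`, `<|user|>` and `<|assistant|>` markers survive decoding as plain text, which has not been verified against this model. `build_prompt` and `extract_reply` are hypothetical helper names, not part of the model or of any library.

```python
# Hypothetical helpers, not part of the model card's code or of transformers.
# Assumption: the role markers are ordinary text and remain in the decoded output.
ASSISTANT_MARKER = "<|assistant|>"


def build_prompt(system_message: str, user_message: str) -> str:
    """Build the prompt string in the format used by the inference example."""
    return f"<|system|>{system_message}<|user|>{user_message}{ASSISTANT_MARKER}"


def extract_reply(decoded_text: str) -> str:
    """Return the text after the last assistant marker.

    If the marker is absent, the input is returned unchanged apart from
    surrounding whitespace being stripped.
    """
    _, marker, reply = decoded_text.rpartition(ASSISTANT_MARKER)
    return reply.strip() if marker else decoded_text.strip()


if __name__ == "__main__":
    prompt_text = build_prompt("Reply in Cantonese.", "Hello")
    # Stand-in for a decoded model output; this is not real model output.
    decoded = prompt_text + " 你好"
    print(extract_reply(decoded))
```

In practice, `extract_reply(generate_text(prompt))` would be the natural call. Slicing the generated token IDs at the prompt length before decoding is an alternative that does not depend on the markers surviving as text.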