---
library_name: transformers
tags: []
---

# Model Card

### Model Description

This is a fine-tuned version of **DeepSeek-R1-Distill-Llama-8B**, optimized for **telecom-related queries**. The model has been fine-tuned to provide **concise and factual answers**, ensuring that it **does not role-play as a customer service agent**.

- **Developed by:** Mohamed Abdulaziz
- **Model type:** Causal language model (fine-tuned DeepSeek-R1-Distill-Llama-8B)
- **Frameworks used:** Unsloth for fine-tuning, Weights & Biases (wandb) for training monitoring
- **License:** MIT License

## Uses

This model is designed for **customer support automation in the telecom industry**. It assists in:

- Answering common user queries about **5G, network issues, billing, and services**.
- Providing **concise and factually correct responses**.
- Reducing the **workload on human support agents** by handling routine inquiries.

### **Who can use this model?**

- **Telecom companies**: Automate customer service via chatbots.
- **Developers & researchers**: Fine-tune and adapt the model for other use cases.
- **Call centers**: Support agents in handling user requests efficiently.

### **Who might be affected?**

- **End-users** interacting with telecom chatbots.
- **Support agents** using AI-assisted tools.
- **Developers & data scientists** fine-tuning and deploying the model.

## How to Get Started with the Model

### **1️⃣ Import necessary libraries**

```python
import torch
from unsloth import FastLanguageModel
```

### **2️⃣ Define the model path**

```python
model_path = "moo100/DeepSeek-R1-telecom-chatbot"
```

### **3️⃣ Load the model and tokenizer**

```python
model, tokenizer = FastLanguageModel.from_pretrained(
    model_path,
    max_seq_length=1024,  # Trained with 2048; a smaller value reduces the risk of OOM
    dtype=None,           # Uses default precision
)
```

### **4️⃣ Optimize the model for fast inference with Unsloth**

```python
model = FastLanguageModel.for_inference(model)
```

### **5️⃣ Move the model to GPU if available, otherwise use CPU**

```python
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
```

### **6️⃣ Define a system instruction to guide model responses**

```python
system_instruction = """You are an AI assistant. Answer user questions concisely and factually.
Do NOT role-play as a customer service agent. Only answer the user's query."""
```

### **7️⃣ Define the user input (replace with any query)**

```python
user_input = "What are the benefits of 5G?"
```

### **8️⃣ Construct the full prompt from the instruction and the user query**

```python
full_prompt = f"{system_instruction}\n\nUser: {user_input}\nAssistant:"
```

### **9️⃣ Tokenize the input prompt**

```python
inputs = tokenizer(full_prompt, return_tensors="pt").to(device)
```

### **🔟 Generate the model response with controlled stopping criteria**

```python
outputs = model.generate(
    input_ids=inputs.input_ids,            # Encoded input tokens
    attention_mask=inputs.attention_mask,  # Mask marking the input length
    max_new_tokens=100,                    # Limits response length
    do_sample=True,                        # Enables sampling for variability
    temperature=0.5,                       # Controls randomness level
    top_k=50,                              # Samples from the 50 most probable tokens
    eos_token_id=tokenizer.eos_token_id,   # Stops at the end-of-sequence token
)
```

### **1️⃣1️⃣ Decode and extract only the newly generated response**

```python
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
```

### **1️⃣2️⃣ Print the AI-generated response**

```python
print(response.split("\n")[0].strip())
```
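If Unsloth is not installed, the checkpoint can usually be loaded with plain `transformers` as well. The following is a minimal sketch, assuming the repository hosts merged model weights (if it only contains a LoRA adapter, `peft`'s `AutoPeftModelForCausalLM` would be needed instead); the dtype and device settings are illustrative and should be adjusted to your hardware:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "moo100/DeepSeek-R1-telecom-chatbot"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # Assumption: fp16 fits on your GPU; adjust as needed
    device_map="auto",
)

# Same prompt format as the Unsloth walkthrough above
prompt = (
    "You are an AI assistant. Answer user questions concisely and factually.\n\n"
    "User: What are the benefits of 5G?\nAssistant:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.5, top_k=50)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```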
## Training Details

### Training Data

`talkmap/telecom-conversation-corpus`

### Training Procedure

- **Loss curve:** Shows a steady decline, indicating model convergence.
- **Learning-rate schedule:** Linear decay.
- **Gradient norm:** Slight increase, but under control.
- **Global steps & epochs:** Indicate training progress.

The training metrics recorded during fine-tuning are available here:
https://drive.google.com/file/d/1-SOfG8K3Qt2WSEuyj3kFhGYOYMB5Gk2r/view?usp=sharing

## Evaluation

### Methodology

The chatbot was evaluated using Meta-Llama-3.3-70B-Instruct as an LLM judge, which assessed the relevance, correctness, and fluency of its responses.

### Results

Meta-Llama-3.3-70B-Instruct evaluation of a sample response to the query "What are the benefits of 5G?":

| Criterion   | Score | Judge's comments |
|-------------|-------|------------------|
| Relevance   | 9/10  | Highly relevant to the user's query about 5G benefits, providing a concise and informative summary. |
| Correctness | 10/10 | Factually accurate, highlighting key advantages such as faster data speeds, lower latency, increased capacity, and broader device compatibility. |
| Fluency     | 9/10  | Well-structured, grammatically sound, and easy to understand. Minor refinements could further enhance readability. |
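To illustrate the methodology, here is a minimal, hypothetical sketch of how such an LLM-as-judge evaluation could be scripted with the Hugging Face `InferenceClient`. The prompt wording, scoring scale, and sample answer are assumptions for illustration, not the exact setup used; the judge model is gated and requires an access token:

```python
from huggingface_hub import InferenceClient

# Hypothetical judge setup; prompt wording and scale are illustrative assumptions
client = InferenceClient(model="meta-llama/Llama-3.3-70B-Instruct")

question = "What are the benefits of 5G?"
answer = "5G offers faster data speeds, lower latency, and increased network capacity."

judge_prompt = f"""Rate the following answer for relevance, correctness, and fluency,
each on a 1-10 scale, and briefly justify each score.

Question: {question}
Answer: {answer}"""

result = client.chat_completion(
    messages=[{"role": "user", "content": judge_prompt}],
    max_tokens=300,
)
print(result.choices[0].message.content)
```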