---
library_name: transformers
tags: []
---

# Model Card

### Model Description

This is a fine-tuned version of **DeepSeek-R1-Distill-Llama-8B**, optimized for **telecom-related queries**. The model has been fine-tuned to provide **concise and factual answers**, ensuring that it **does not role-play as a customer service agent**.

- **Developed by:** Mohamed Abdulaziz
- **Model type:** Causal language model (fine-tuned DeepSeek-R1-Distill-Llama-8B)
- **Frameworks used:** Unsloth for fine-tuning, Weights & Biases (wandb) for training monitoring
- **License:** MIT License

## Uses

This model is designed for **customer support automation in the telecom industry**. It assists in:

- Answering common user queries about **5G, network issues, billing, and services**.
- Providing **concise and factually correct responses**.
- Reducing the **workload on human support agents** by handling routine inquiries.

### **Who can use this model?**

- **Telecom companies**: Automate customer service via chatbots.
- **Developers & researchers**: Fine-tune and adapt the model for other use cases.
- **Call centers**: Support agents in handling user requests efficiently.

### **Who might be affected?**

- **End-users** interacting with telecom chatbots.
- **Support agents** using AI-assisted tools.
- **Developers & data scientists** fine-tuning and deploying the model.

## How to Get Started with the Model

### **1️⃣ Import necessary libraries**

```python
import torch
from unsloth import FastLanguageModel
```

### **2️⃣ Define the model path**

```python
model_path = "moo100/DeepSeek-R1-telecom-chatbot"
```

### **3️⃣ Load the model and tokenizer**

```python
model, tokenizer = FastLanguageModel.from_pretrained(
    model_path,
    max_seq_length=1024,  # Trained with 2048; a smaller value reduces the risk of OOM
    dtype=None,           # Uses default precision
)
```

### **4️⃣ Optimize the model for fast inference with Unsloth**

```python
model = FastLanguageModel.for_inference(model)
```

### **5️⃣ Move the model to GPU if available, otherwise use CPU**

```python
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)
```

### **6️⃣ Define a system instruction to guide model responses**

```python
system_instruction = """You are an AI assistant. Answer user questions concisely and factually.
Do NOT role-play as a customer service agent. Only answer the user's query."""
```

### **7️⃣ Define the user input (replace with any query)**

```python
user_input = "What are the benefits of 5G?"
```

### **8️⃣ Construct the full prompt from the instruction and the user query**

```python
full_prompt = f"{system_instruction}\n\nUser: {user_input}\nAssistant:"
```

### **9️⃣ Tokenize the input prompt**

```python
inputs = tokenizer(full_prompt, return_tensors="pt").to(device)
```

### **🔟 Generate the model response with controlled stopping criteria**

```python
outputs = model.generate(
    input_ids=inputs.input_ids,            # Encoded input tokens
    attention_mask=inputs.attention_mask,  # Mask marking the input length
    max_new_tokens=100,                    # Limits response length
    do_sample=True,                        # Enables sampling for variability
    temperature=0.5,                       # Controls randomness level
    top_k=50,                              # Samples from the 50 most probable tokens
    eos_token_id=tokenizer.eos_token_id,   # Stops at the end-of-sequence token
)
```

### **1️⃣1️⃣ Decode and extract only the newly generated response**

```python
response = tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True)
```

### **1️⃣2️⃣ Print the AI-generated response**

```python
print(response.split("\n")[0].strip())
```
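If Unsloth is not installed, the checkpoint can usually be loaded with plain `transformers` as well. The following is a minimal sketch, assuming the repository hosts merged model weights (if it only contains a LoRA adapter, `peft`'s `AutoPeftModelForCausalLM` would be needed instead); the dtype and device settings are illustrative and should be adjusted to your hardware:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "moo100/DeepSeek-R1-telecom-chatbot"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.float16,  # Assumption: fp16 fits on your GPU; adjust as needed
    device_map="auto",
)

# Same prompt format as the Unsloth walkthrough above
prompt = (
    "You are an AI assistant. Answer user questions concisely and factually.\n\n"
    "User: What are the benefits of 5G?\nAssistant:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=100, do_sample=True, temperature=0.5, top_k=50)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```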
## Training Details

### Training Data

`talkmap/telecom-conversation-corpus`

### Training Procedure

- **Loss curve:** Shows a steady decline, indicating model convergence.
- **Learning-rate schedule:** Linear decay.
- **Gradient norm:** Slight increase, but under control.
- **Global steps & epochs:** Indicate training progress.

The training metrics recorded during fine-tuning are available here:
https://drive.google.com/file/d/1-SOfG8K3Qt2WSEuyj3kFhGYOYMB5Gk2r/view?usp=sharing

## Evaluation

### Methodology

The chatbot was evaluated using Meta-Llama-3.3-70B-Instruct as an LLM judge, which assessed the relevance, correctness, and fluency of its responses.

### Results

Meta-Llama-3.3-70B-Instruct evaluation of a sample response to the query "What are the benefits of 5G?":

| Criterion   | Score | Judge's comments |
|-------------|-------|------------------|
| Relevance   | 9/10  | Highly relevant to the user's query about 5G benefits, providing a concise and informative summary. |
| Correctness | 10/10 | Factually accurate, highlighting key advantages such as faster data speeds, lower latency, increased capacity, and broader device compatibility. |
| Fluency     | 9/10  | Well-structured, grammatically sound, and easy to understand. Minor refinements could further enhance readability. |
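To illustrate the methodology, here is a minimal, hypothetical sketch of how such an LLM-as-judge evaluation could be scripted with the Hugging Face `InferenceClient`. The prompt wording, scoring scale, and sample answer are assumptions for illustration, not the exact setup used; the judge model is gated and requires an access token:

```python
from huggingface_hub import InferenceClient

# Hypothetical judge setup; prompt wording and scale are illustrative assumptions
client = InferenceClient(model="meta-llama/Llama-3.3-70B-Instruct")

question = "What are the benefits of 5G?"
answer = "5G offers faster data speeds, lower latency, and increased network capacity."

judge_prompt = f"""Rate the following answer for relevance, correctness, and fluency,
each on a 1-10 scale, and briefly justify each score.

Question: {question}
Answer: {answer}"""

result = client.chat_completion(
    messages=[{"role": "user", "content": judge_prompt}],
    max_tokens=300,
)
print(result.choices[0].message.content)
```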