---
license: mit
language: en
base_model: microsoft/phi-2
tags:
- text-generation
- voice-assistant
- automotive
- fine-tuned
- peft
- lora
datasets:
- synthetic
widget:
- text: "Navigate to the nearest EV charging station."
- text: "Set the temperature to 22 degrees."
---

# 🚗 Fine-tuned MBUX Voice Assistant (phi-2)

This repository contains a fine-tuned version of Microsoft's **`microsoft/phi-2`** model, specifically adapted to function as an in-car voice assistant similar to MBUX. The model is trained to understand and respond to common automotive commands.

This model was created as part of an end-to-end MLOps project, from data creation and fine-tuning to deployment in an interactive application.

## ✨ Live Demo

You can interact with this model in a live, voice-to-voice application on Hugging Face Spaces:

**➡️ [Live MBUX Gradio Demo](https://huggingface.co/spaces/MrunangG/mbux-gradio-demo)**



---

## 📝 Model Details

* **Base Model:** `microsoft/phi-2`
* **Fine-tuning Method:** Parameter-Efficient Fine-Tuning (PEFT) using LoRA.
* **Training Data:** A synthetic, instruction-based dataset of in-car commands covering navigation, climate control, media, and vehicle settings (an illustrative sample is sketched after this list).
* **Frameworks:** PyTorch, Transformers, PEFT, TRL.
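
The dataset itself is not published with this card, but a single training sample presumably pairs a command with an assistant response in the same `[INST]` template used at inference time. A hypothetical example (the `"text"` field name and exact template are assumptions, not taken from the dataset):

```python
# Hypothetical shape of one training sample; the field name and the
# [INST] template are assumptions based on the inference code below.
sample = {
    "text": "[INST] Set the temperature to 21 degrees. [/INST] "
            "Okay, setting the cabin temperature to 21 degrees."
}
```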

### Intended Use

This model is a proof-of-concept designed for demonstration purposes. It's intended to be used as the "brain" for a voice assistant application in an automotive context. It excels at understanding commands like:
* "Navigate to the office."
* "Set the fan speed to maximum."
* "Play my 'Morning Commute' playlist."

---

## 🚀 How to Use

While the model's core function is text generation, its primary intended use is within a full voice-to-voice pipeline.

### Interactive Voice Demo
For the complete, interactive experience including Speech-to-Text and Text-to-Speech, please visit the live application hosted on Hugging Face Spaces:

**➡️ [Live MBUX Gradio Demo](https://huggingface.co/spaces/MrunangG/mbux-gradio-demo)**

### Programmatic Use (Text-Only)

The following Python code shows how to use the fine-tuned model for its core text-generation task programmatically.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the model repository IDs
base_model_id = "microsoft/phi-2"
peft_model_id = "MrunangG/phi-2-mbux-assistant"

# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map={"": device}
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Attach the LoRA adapter to the base model (weights stay separate unless merged)
model = PeftModel.from_pretrained(base_model, peft_model_id)

# --- Inference ---
prompt = "Set the temperature to 21 degrees."
formatted_prompt = f"[INST] {prompt} [/INST]"

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
cleaned_response = response.split('[/INST]')[-1].strip()

print(cleaned_response)
# Expected output: Okay, setting the cabin temperature to 21 degrees.
```
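
If you prefer a standalone checkpoint without the PEFT wrapper, you can fold the adapter weights into the base model using PEFT's standard `merge_and_unload()` call. The output path below is just an example:

```python
# Merge the LoRA weights into the base model and save a standalone copy
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./phi-2-mbux-merged")
tokenizer.save_pretrained("./phi-2-mbux-merged")
```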

---

## 🛠️ Training Procedure

The model was fine-tuned using the `SFTTrainer` from the TRL library. Key training parameters included a learning rate of `2e-4`, the `paged_adamw_8bit` optimizer, and 4-bit quantization to enable efficient training on consumer hardware.
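
This card does not ship the training script, but a minimal sketch of such a setup might look like the following. Only the learning rate, optimizer, and 4-bit quantization are stated above; the dataset file, LoRA hyperparameters, batch size, and epoch count are illustrative assumptions:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

base_model_id = "microsoft/phi-2"

# 4-bit quantization keeps the base model small enough for consumer GPUs
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)

# LoRA hyperparameters are assumptions; the card does not state them
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)

# Learning rate and optimizer come from the card; the rest is illustrative
training_args = SFTConfig(
    output_dir="./phi-2-mbux-assistant",
    learning_rate=2e-4,
    optim="paged_adamw_8bit",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    dataset_text_field="text",
)

# "mbux_commands.jsonl" is a placeholder; the dataset is not published
dataset = load_dataset("json", data_files="mbux_commands.jsonl")["train"]

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```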

### Framework versions
- PEFT: 0.17.1
- TRL: 0.22.1
- Transformers: 4.56.0
- PyTorch: 2.8.0
- Datasets: 4.0.0
- Tokenizers: 0.22.0