# 🚗 Fine-tuned MBUX Voice Assistant (phi-2)
This repository contains a fine-tuned version of Microsoft's `microsoft/phi-2` model, specifically adapted to function as an in-car voice assistant similar to MBUX. The model is trained to understand and respond to common automotive commands.
This model was created as part of an end-to-end MLOps project, from data creation and fine-tuning to deployment in an interactive application.
## ✨ Live Demo
You can interact with this model in a live, voice-to-voice application on Hugging Face Spaces:
➡️ Live MBUX Gradio Demo
## 📋 Model Details
- Base Model: `microsoft/phi-2`
- Fine-tuning Method: Parameter-Efficient Fine-Tuning (PEFT) using LoRA.
- Training Data: A synthetic, instruction-based dataset of in-car commands covering navigation, climate control, media, and vehicle settings (a sample record is sketched after this list).
- Frameworks: PyTorch, Transformers, PEFT, TRL.
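
The dataset itself is not published with this card, so the exact record schema below is an assumption, inferred from the `[INST] … [/INST]` prompt format used at inference time:

```python
# Hypothetical training record; the field name and wording are illustrative.
example = {
    "text": "[INST] Set the temperature to 21 degrees. [/INST] "
            "Okay, setting the cabin temperature to 21 degrees."
}
```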
## Intended Use
This model is a proof-of-concept designed for demonstration purposes. It's intended to be used as the "brain" for a voice assistant application in an automotive context. It excels at understanding commands like:
- "Navigate to the office."
- "Set the fan speed to maximum."
- "Play my 'Morning Commute' playlist."
## 🚀 How to Use
While the model's core function is text generation, its primary intended use is within a full voice-to-voice pipeline.
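
As a rough sketch, that pipeline places the model between a Speech-to-Text and a Text-to-Speech stage. The wiring below is an illustrative assumption, not the demo's actual code; the Whisper checkpoint named here is only an example, and the TTS stage is left abstract:

```python
from transformers import pipeline

# Illustrative STT stage; the live demo's actual STT/TTS components may differ.
asr = pipeline("automatic-speech-recognition", model="openai/whisper-tiny.en")

def voice_turn(audio_path: str, generate_reply) -> str:
    user_text = asr(audio_path)["text"]  # 1. Speech-to-Text
    reply = generate_reply(user_text)    # 2. This model produces the response
    return reply                         # 3. Hand the text to any TTS engine
```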
### Interactive Voice Demo
For the complete, interactive experience including Speech-to-Text and Text-to-Speech, please visit the live application hosted on Hugging Face Spaces:
➡️ Live MBUX Gradio Demo
### Programmatic Use (Text-Only)
The following Python code shows how to use the fine-tuned model for its core text-generation task programmatically.
```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the model repository IDs
base_model_id = "microsoft/phi-2"
peft_model_id = "MrunangG/phi-2-mbux-assistant"

# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model in half precision
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map={"": device},
)

# Load the tokenizer and make sure a padding token is set
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(base_model, peft_model_id)
model.eval()

# --- Inference ---
prompt = "Set the temperature to 21 degrees."
formatted_prompt = f"[INST] {prompt} [/INST]"
inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)

# Strip the prompt portion and keep only the assistant's reply
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
cleaned_response = response.split('[/INST]')[-1].strip()
print(cleaned_response)
# Expected output: Okay, setting the cabin temperature to 21 degrees.
```
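
If you prefer a standalone checkpoint that does not need the PEFT library at inference time, the adapter weights can be folded into the base model with PEFT's `merge_and_unload()`:

```python
# Fold the LoRA weights into the base model and drop the adapter wrappers
merged_model = model.merge_and_unload()
merged_model.save_pretrained("phi-2-mbux-merged")  # output path is illustrative
```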
## 🛠️ Training Procedure
The model was fine-tuned using the `SFTTrainer` from the TRL library. Key training parameters included a learning rate of 2e-4, the `paged_adamw_8bit` optimizer, and 4-bit quantization to enable efficient training on consumer hardware.
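
For orientation, here is a minimal sketch of such a setup. Only the learning rate, optimizer, and 4-bit quantization are stated above; the LoRA hyperparameters, target modules, and dataset file are illustrative assumptions:

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit quantization for efficient training on consumer hardware (stated above)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2", quantization_config=bnb_config, trust_remote_code=True
)

# LoRA hyperparameters and target modules below are assumptions,
# not the card's published values
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "dense"],
    task_type="CAUSAL_LM",
)

# "mbux_commands.jsonl" is a hypothetical placeholder for the synthetic dataset
dataset = load_dataset("json", data_files="mbux_commands.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=peft_config,
    args=SFTConfig(
        learning_rate=2e-4,        # stated in the card
        optim="paged_adamw_8bit",  # stated in the card
        output_dir="phi-2-mbux-assistant",
    ),
)
trainer.train()
```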
### Framework versions
- PEFT: 0.17.1
- TRL: 0.22.1
- Transformers: 4.56.0
- PyTorch: 2.8.0
- Datasets: 4.0.0
- Tokenizers: 0.22.0