metadata
title: MBUX Voice Assistant Demo
emoji: 🚗
colorFrom: blue
colorTo: green
sdk: docker
license: mit
language: en
base_model: microsoft/phi-2
tags:
  - text-generation
  - voice-assistant
  - automotive
  - fine-tuned
  - peft
  - lora
datasets:
  - synthetic
widget:
  - text: Navigate to the nearest EV charging station.
  - text: Set the temperature to 22 degrees.

🚗 Fine-tuned MBUX Voice Assistant (phi-2)

This repository contains a fine-tuned version of Microsoft's phi-2 model, specifically adapted to function as an in-car voice assistant similar to MBUX. The model is trained to understand and respond to common automotive commands.

This model was created as part of an end-to-end MLOps project.

✨ Live Demo

You can interact with this model in a live, voice-to-voice application on Hugging Face Spaces:

➑️ Live MBUX Gradio Demo


📝 Model Details

  • Base Model: microsoft/phi-2
  • Fine-tuning Method: Parameter-Efficient Fine-Tuning (PEFT) using LoRA.
  • Training Data: A synthetic, instruction-based dataset of ~100 in-car commands covering navigation, climate control, media, and vehicle settings.
  • Frameworks: PyTorch, Transformers, PEFT, TRL.
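
The dataset is not published with this card, so its schema is unknown. As a minimal sketch, assuming each record holds a spoken command and the assistant's reply (the `CommandExample` class, its field names, and the sample replies are hypothetical) and that the training text uses the same `[INST] ... [/INST]` wrapper as the inference code under How to Use, one record could be rendered like this:

```python
from dataclasses import dataclass


@dataclass
class CommandExample:
    """One synthetic training pair. The field names are hypothetical."""
    command: str   # what the driver says
    response: str  # what the assistant should answer


def to_training_text(example: CommandExample) -> str:
    """Render a pair as one string for supervised fine-tuning.

    Assumes the [INST] ... [/INST] wrapper used at inference time.
    """
    return f"[INST] {example.command.strip()} [/INST] {example.response.strip()}"


# Illustrative pairs only; these are not taken from the real dataset.
examples = [
    CommandExample("Navigate to the office.", "Starting navigation to the office."),
    CommandExample("Set the fan speed to maximum.", "Setting the fan speed to maximum."),
]

training_texts = [to_training_text(e) for e in examples]
print(training_texts[0])
```

Using the same wrapper for training and inference matters because the model learns to start its reply right after the `[/INST]` marker.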

Intended Use

This model is a proof of concept built for demonstration purposes. It is intended to serve as the "brain" of a voice assistant application in an automotive context, and it is tuned for commands such as:

  • "Navigate to the office."
  • "Set the fan speed to maximum."
  • "Play my 'Morning Commute' playlist."

🚀 How to Use

This is a PEFT model (LoRA adapter), so you must load it on top of the base microsoft/phi-2 model.

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the model repository IDs
base_model_id = "microsoft/phi-2"
peft_model_id = "MrunangG/phi-2-mbux-assistant"

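# Set device; use float16 on GPU and fall back to float32 on CPU,
# where half precision is slow and not supported by every operation
device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    trust_remote_code=True,
    torch_dtype=dtype,
    device_map={"": device}
)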

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Attach the LoRA adapter to the base model (the weights are not merged)
model = PeftModel.from_pretrained(base_model, peft_model_id)

# --- Inference ---
prompt = "Set the temperature to 21 degrees."
formatted_prompt = f"[INST] {prompt} [/INST]"

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
cleaned_response = response.split('[/INST]')[-1].strip()

print(cleaned_response)
# Expected output: Okay, setting the cabin temperature to 21 degrees.
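
The prompt wrapping and response cleanup above do not need the model, so they can be pulled into small helpers and checked on their own. This is a sketch only: the function names `build_prompt` and `extract_response` are hypothetical, and the decoded string is simulated from the expected output rather than produced by a real `generate` call.

```python
def build_prompt(command: str) -> str:
    """Wrap a user command in the [INST] ... [/INST] template."""
    return f"[INST] {command.strip()} [/INST]"


def extract_response(decoded: str) -> str:
    """Return the text after the last [/INST] marker.

    If the marker is missing, the whole string is returned, stripped.
    """
    return decoded.split("[/INST]")[-1].strip()


# Simulated decoder output: the prompt followed by the model's reply.
decoded = (
    build_prompt("Set the temperature to 21 degrees.")
    + " Okay, setting the cabin temperature to 21 degrees."
)
print(extract_response(decoded))
```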

🛠️ Training Procedure

The model was fine-tuned with the SFTTrainer class from the TRL library. Key training settings were a learning rate of 2e-4, the paged_adamw_8bit optimizer, and 4-bit quantization to reduce memory use during training.
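
To give a rough sense of why 4-bit quantization helps, the arithmetic below compares the memory needed for the base model's weights at 16 bits and at 4 bits per parameter. It assumes phi-2's published size of about 2.7 billion parameters and counts weights only; quantization constants, activations, optimizer state, and the LoRA adapter are ignored, so real usage will be higher.

```python
PHI2_PARAMS = 2.7e9  # approximate parameter count of microsoft/phi-2


def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


fp16_gb = weight_memory_gb(PHI2_PARAMS, 16)
int4_gb = weight_memory_gb(PHI2_PARAMS, 4)

print(f"fp16 weights:  {fp16_gb:.2f} GB")
print(f"4-bit weights: {int4_gb:.2f} GB")
```

Under these assumptions the weights shrink from roughly 5.4 GB to roughly 1.35 GB, a fourfold reduction.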