---
license: mit
language: en
base_model: microsoft/phi-2
tags:
- text-generation
- voice-assistant
- automotive
- fine-tuned
- peft
- lora
datasets:
- synthetic
widget:
- text: "Navigate to the nearest EV charging station."
- text: "Set the temperature to 22 degrees."
---

# 🚗 Fine-tuned MBUX Voice Assistant (phi-2)

This repository contains a fine-tuned version of Microsoft's **`microsoft/phi-2`** model, specifically adapted to function as an in-car voice assistant similar to MBUX. The model is trained to understand and respond to common automotive commands.

This model was created as part of an end-to-end MLOps project, from data creation and fine-tuning to deployment in an interactive application.

## ✨ Live Demo

You can interact with this model in a live, voice-to-voice application on Hugging Face Spaces:

**➡️ [Live MBUX Gradio Demo](https://huggingface.co/spaces/MrunangG/mbux-gradio-demo)**

---

## 📝 Model Details

* **Base Model:** `microsoft/phi-2`
* **Fine-tuning Method:** Parameter-Efficient Fine-Tuning (PEFT) using LoRA.
* **Training Data:** A synthetic, instruction-based dataset of in-car commands covering navigation, climate control, media, and vehicle settings.
* **Frameworks:** PyTorch, Transformers, PEFT, TRL.

### Intended Use

This model is a proof-of-concept designed for demonstration purposes. It is intended to serve as the "brain" of a voice assistant application in an automotive context. It excels at understanding commands such as:

* "Navigate to the office."
* "Set the fan speed to maximum."
* "Play my 'Morning Commute' playlist."

---

## 🚀 How to Use

While the model's core function is text generation, its primary intended use is within a full voice-to-voice pipeline.

### Interactive Voice Demo

For the complete, interactive experience including Speech-to-Text and Text-to-Speech, please visit the live application hosted on Hugging Face Spaces:

**➡️ [Live MBUX Gradio Demo](https://huggingface.co/spaces/MrunangG/mbux-gradio-demo)**

### Programmatic Use (Text-Only)

The following Python code shows how to use the fine-tuned model for its core text-generation task:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Define the model repository IDs
base_model_id = "microsoft/phi-2"
peft_model_id = "MrunangG/phi-2-mbux-assistant"

# Set device
device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map={"": device}
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(base_model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Attach the LoRA adapter to the base model
model = PeftModel.from_pretrained(base_model, peft_model_id)

# --- Inference ---
prompt = "Set the temperature to 21 degrees."
formatted_prompt = f"[INST] {prompt} [/INST]"

inputs = tokenizer(formatted_prompt, return_tensors="pt").to(device)

with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=50)

response = tokenizer.decode(outputs[0], skip_special_tokens=True)

# Keep only the assistant's reply after the [/INST] tag
cleaned_response = response.split('[/INST]')[-1].strip()
print(cleaned_response)
# Expected output: Okay, setting the cabin temperature to 21 degrees.
```
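For deployment, the LoRA adapter can optionally be folded into the base weights so inference no longer needs the PEFT wrapper. A minimal sketch continuing from the loading code above, using PEFT's `merge_and_unload()` (the save path is illustrative):

```python
# Merge the LoRA weights into the base model; returns a plain transformers model
merged_model = model.merge_and_unload()

# Save the merged weights and tokenizer locally (directory name is illustrative)
merged_model.save_pretrained("phi-2-mbux-merged")
tokenizer.save_pretrained("phi-2-mbux-merged")
```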
---

## 🛠️ Training Procedure

The model was fine-tuned using the `SFTTrainer` from the TRL library. Key training parameters included a learning rate of `2e-4`, the `paged_adamw_8bit` optimizer, and 4-bit quantization to keep training efficient on consumer hardware; a configuration sketch is shown after the version list below.

### Framework versions

- PEFT: 0.17.1
- TRL: 0.22.1
- Transformers: 4.56.0
- PyTorch: 2.8.0
- Datasets: 4.0.0
- Tokenizers: 0.22.0
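The sketch below illustrates how the pieces described above fit together. Only the learning rate, optimizer, and 4-bit quantization are stated in this card; the LoRA hyperparameters, batch size, epoch count, and the dataset file name are illustrative assumptions.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

# 4-bit quantization (stated above); NF4 settings are common defaults, not confirmed here
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "microsoft/phi-2",
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/phi-2", trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token

# LoRA configuration; r, alpha, and dropout are illustrative values
peft_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# "mbux_commands.jsonl" is a placeholder for the synthetic instruction dataset
dataset = load_dataset("json", data_files="mbux_commands.jsonl", split="train")

training_args = SFTConfig(
    output_dir="./phi-2-mbux-assistant",
    learning_rate=2e-4,              # stated in this card
    optim="paged_adamw_8bit",        # stated in this card
    per_device_train_batch_size=4,   # illustrative
    num_train_epochs=3,              # illustrative
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    peft_config=peft_config,
)
trainer.train()
```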