metadata
license: apache-2.0
tags:
- unsloth
- biology
- medical
- LoRA
- trl
- text-generation-inference
datasets:
- FreedomIntelligence/medical-o1-reasoning-SFT
language:
- en
widget:
- text: |
    What is scoliosis?
  output:
    text: >-
      🧑‍⚕️: Scoliosis is defined as an unnatural curvature of the spine that
      occurs in a side-to-side pattern. It typically affects the thoracic and
      lumbar spine and causes the spinal column to appear more curved than
      usual. This condition is usually progressive, meaning it won't go away
      on its own. It's usually diagnosed by an X-ray or physical exam. While
      scoliosis can happen at any age, it is more common in adolescents,
      particularly girls. Treatment options include bracing, physical therapy,
      surgery, or a combination of these. Managing scoliosis symptoms can help
      with postural changes, pain management, breathing issues, among other
      things.
base_model:
- unsloth/mistral-7b-bnb-4bit
pipeline_tag: text-generation
library_name: adapter-transformers
Mistral-7B Medical QA Model
A specialized medical question-answering model built on Mistral-7B and fine-tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset.
Model Description
This model is a LoRA adaptation of Mistral-7B, fine-tuned to provide accurate and informative answers to medical questions. It's optimized using Unsloth for efficient training and inference.
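To make "LoRA adaptation" concrete: LoRA keeps a pretrained weight matrix W frozen and trains only a low-rank update BA that is added to it. The sketch below counts trainable parameters for a single square projection layer. The hidden size of 4096 matches Mistral-7B, but the rank of 16 is an assumption chosen for illustration, since this card does not state the rank that was used.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full, lora) trainable-parameter counts for one linear layer."""
    full = d_in * d_out           # full fine-tuning: every entry of W is trainable
    lora = rank * (d_in + d_out)  # LoRA: A is (rank x d_in), B is (d_out x rank)
    return full, lora


# Hidden size 4096 as in Mistral-7B; rank=16 is an illustrative assumption.
full, lora = lora_param_counts(d_in=4096, d_out=4096, rank=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
```

Under these assumptions the adapter trains well under 1% of the parameters of the layer it wraps, which is what makes fine-tuning a 7B model feasible on a single GPU.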
Inference Instructions
To use this model:
!pip install unsloth
from unsloth import FastLanguageModel
import torch
# Define the Alpaca prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input_text}
### Response:
{output}"""
# Load your model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Subh775/mistral-7b-medical-o1-ft",
    max_seq_length=2048,
    load_in_4bit=True,
)
# Enable optimized inference mode for faster generation
FastLanguageModel.for_inference(model)
# Function to handle the chat loop with memory
def chat():
    print("Chat with mistral-7b-medical-o1-ft! Type '\\q' or 'quit' to stop.\n")
    chat_history = ""  # Store the conversation history
    while True:
        # Get user input
        user_input = input("➤ ")
        # Exit condition
        if user_input.lower() in ['\\q', 'quit']:
            print("\nExiting the chat. Goodbye 🩺!")
            print("✨" + "=" * 27 + "✨\n")
            break
        # Append the current input to chat history with instruction formatting
        prompt = alpaca_prompt.format(
            instruction="Please answer the following medical question.",
            input_text=user_input,
            output=""
        )
        chat_history += prompt + "\n"
        # Tokenize combined history and move to GPU
        inputs = tokenizer([chat_history], return_tensors="pt").to("cuda")
        # Generate output with configured parameters
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            temperature=0.7,
            top_p=0.9,
            num_return_sequences=1,
            do_sample=True,
            no_repeat_ngram_size=2
        )
        # Decode and keep only the text after the last "### Response:" marker
        decoded_output = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        clean_output = decoded_output[0].split('### Response:')[-1].strip()
        # Add the response to chat history so the next turn sees it as context
        chat_history += f"{clean_output}\n"
        # Display the response
        print(f"\n🧑‍⚕️: {clean_output}\n")

# Start the chat
chat()
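The prompt formatting and response extraction used in the chat loop can be checked without a GPU or the model itself. The following is a minimal sketch of those two steps as standalone helpers; the names `build_prompt` and `extract_response` are hypothetical and not part of Unsloth or this repository, and the blank lines inside the template are an assumption based on the usual Alpaca layout.

```python
# Alpaca-style template, mirroring the one in the inference example.
ALPACA_PROMPT = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input_text}\n\n"
    "### Response:\n{output}"
)


def build_prompt(question: str,
                 instruction: str = "Please answer the following medical question.") -> str:
    """Format a single question, leaving the response slot empty for generation."""
    return ALPACA_PROMPT.format(instruction=instruction, input_text=question, output="")


def extract_response(decoded: str) -> str:
    """Return the text after the last '### Response:' marker in decoded model output."""
    return decoded.split("### Response:")[-1].strip()


prompt = build_prompt("What is scoliosis?")
# Simulate decoded output: the model echoes the prompt and then appends its answer.
print(extract_response(prompt + "An abnormal sideways curvature of the spine.\n"))
```

Splitting on the last marker matters in the multi-turn loop, because the accumulated history contains one `### Response:` marker per turn.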
Training
This model was fine-tuned on 19,704 samples from the FreedomIntelligence/medical-o1-reasoning-SFT dataset, which pairs medical questions with step-by-step reasoning and final answers. Training used Unsloth for optimization and LoRA for parameter-efficient fine-tuning.
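This card does not show the preprocessing code, so the following is only one plausible way a dataset record could be turned into a training string with the same Alpaca-style template used at inference time. The field names `Question`, `Complex_CoT`, and `Response`, the choice to concatenate reasoning and answer, and the `</s>` end-of-sequence token are all assumptions; `format_example` is a hypothetical helper.

```python
TRAIN_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{input_text}\n\n"
    "### Response:\n{output}"
)


def format_example(record: dict, eos_token: str = "</s>") -> str:
    """Render one record as a full training string (assumed field names)."""
    # Assumption: the reasoning trace is placed before the final answer.
    response = f"{record['Complex_CoT']}\n\n{record['Response']}"
    text = TRAIN_TEMPLATE.format(
        instruction="Please answer the following medical question.",
        input_text=record["Question"],
        output=response,
    )
    # Appending EOS teaches the model where a response ends.
    return text + eos_token


# Hand-written record for illustration only; not taken from the dataset.
sample = {
    "Question": "What is scoliosis?",
    "Complex_CoT": "The question asks for a definition of a spinal condition.",
    "Response": "Scoliosis is a sideways curvature of the spine.",
}
print(format_example(sample))
```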
Key Features
- Base Model: unsloth/mistral-7b-bnb-4bit
- Fine-Tuning Objective: Adaptation for structured, step-by-step medical reasoning tasks.
- Training Dataset: 19,704 samples from the medical-o1-reasoning-SFT dataset.
- Tools Used:
  - Unsloth: Accelerates training by 2x.
  - 4-bit Quantization: Reduces model memory usage.
  - LoRA Adapters: Enable parameter-efficient fine-tuning.
- Training Time: 38 minutes, 57 seconds for 1 epoch.
- Step and training loss at the last logged iteration:
  - Step: 60
  - Training Loss: 1.160700
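As a rough illustration of why 4-bit quantization reduces memory usage, the sketch below estimates the memory taken by the weights alone at different precisions. The figure of about 7.24 billion parameters for Mistral-7B is treated as an approximation, and the estimate deliberately ignores quantization constants, activations, the KV cache, and the LoRA adapter weights, so real usage will be higher.

```python
def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 7.24e9  # approximate parameter count, an assumption for this estimate

for bits in (16, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gb(N_PARAMS, bits):.2f} GB")
```

Going from 16-bit to 4-bit weights cuts this estimate by a factor of four, which is what allows a 7B model to fit on GPUs with limited memory.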
Limitations
- This model provides general medical information and should not be used as a substitute for professional medical advice.
- The model's knowledge is limited to its training data and may not include the latest medical research.
- Not clinically validated and should not be used for diagnosis or treatment decisions.
License
This model is released under the Apache 2.0 license, the same license as the base Mistral-7B model.
Citations
@misc{mistral-7b-medical-o1-ft,
author = {Subh775},
title = {Mistral-7B Medical QA Model},
year = {2025},
publisher = {HuggingFace},
journal = {HuggingFace Repository},
howpublished = {\url{https://huggingface.co/Subh775/mistral-7b-medical-o1-ft}}
}