---
license: apache-2.0
tags:
  - unsloth
  - biology
  - medical
  - LoRA
  - trl
  - text-generation-inference
datasets:
  - FreedomIntelligence/medical-o1-reasoning-SFT
language:
  - en
widget:
  - text: |
      What is scoliosis?
    output:
      text: >-
        🧑‍⚕️: Scoliosis is defined as an unnatural curvature of the spine that
        occurs in a side-to-side pattern. It typically affects the thoracic and
        lumbar spine and causes the spinal column to appear more curved than
        usual. This condition is usually progressive, meaning it won't go away
        on its own. It's usually diagnosed by an X-ray or physical exam. While
        scoliosis can happen at any age, it is more common in adolescents,
        particularly girls. Treatment options include bracing, physical therapy,
        surgery, or a combination of these. Managing scoliosis symptoms can help
        with postural changes, pain management, breathing issues, among other
        things.
base_model:
  - unsloth/mistral-7b-bnb-4bit
pipeline_tag: text-generation
library_name: adapter-transformers
---

# Mistral-7B Medical QA Model

A specialized medical question-answering model built on Mistral-7B and fine-tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset.

## Model Description

This model is a LoRA adaptation of Mistral-7B, fine-tuned to provide accurate and informative answers to medical questions. It's optimized using Unsloth for efficient training and inference.
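If you prefer plain `transformers` + `peft` over Unsloth, the LoRA adapter can likely be attached to the 4-bit base model directly. A minimal sketch, untested against this repo (the base and adapter ids come from the metadata above):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the 4-bit base model named in this card's metadata
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/mistral-7b-bnb-4bit",
    device_map="auto",
    torch_dtype=torch.float16,
)
tokenizer = AutoTokenizer.from_pretrained("unsloth/mistral-7b-bnb-4bit")

# Attach the LoRA adapter from this repository
model = PeftModel.from_pretrained(base, "Subh775/mistral-7b-medical-o1-ft")
model.eval()
```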

## Inference Instructions

To use this model, install Unsloth first:

```bash
pip install unsloth
```

Then load the adapter and run the interactive chat loop:

```python
from unsloth import FastLanguageModel
import torch

# Define the Alpaca prompt template
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{instruction}
### Input:
{input_text}
### Response:
{output}"""

# Load your model
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="Subh775/mistral-7b-medical-o1-ft",
    max_seq_length=2048,
    load_in_4bit=True
)

# Enable optimized inference mode for faster generation
FastLanguageModel.for_inference(model)

# Interactive chat loop that keeps the running conversation as context
def chat():
    print("Chat with mistral-7b-medical-o1-ft! Type '\\q' or 'quit' to stop.\n")

    chat_history = ""  # Store the conversation history

    while True:
        # Get user input
        user_input = input("➤ ")

        # Exit condition
        if user_input.lower() in ['\\q', 'quit']:
            print("\nExiting the chat. Goodbye πŸ©ΊπŸ‘!")
            print("✨" + "=" * 27 + "✨\n")
            break

        # Format the current input with the instruction template
        prompt = alpaca_prompt.format(
            instruction="Please answer the following medical question.",
            input_text=user_input,
            output=""
        )
        chat_history += prompt + "\n"

        # Tokenize combined history and move to GPU
        inputs = tokenizer([chat_history], return_tensors="pt").to("cuda")

        # Generate output with configured parameters
        outputs = model.generate(
            **inputs,
            max_new_tokens=256,
            temperature=0.7,
            top_p=0.9,
            num_return_sequences=1,
            do_sample=True,
            no_repeat_ngram_size=2
        )

        # Decode and clean the model's response
        decoded_output = tokenizer.batch_decode(outputs, skip_special_tokens=True)
        clean_output = decoded_output[0].split('### Response:')[-1].strip()

        # Append the answer so later turns keep the full conversation as context
        chat_history += f"{clean_output}\n"

        # Display the response
        print(f"\nπŸ§‘β€βš•οΈ: {clean_output}\n")

# Start the chat
chat()
```
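For a one-off question without the interactive loop, here is a single-turn sketch reusing `alpaca_prompt`, `model`, and `tokenizer` from above (`TextStreamer` is standard `transformers` and prints tokens as they are generated; the question is just an example):

```python
from transformers import TextStreamer

question = "What are the common symptoms of anemia?"  # example input
prompt = alpaca_prompt.format(
    instruction="Please answer the following medical question.",
    input_text=question,
    output="",
)

inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
streamer = TextStreamer(tokenizer, skip_prompt=True)

# Stream the answer token by token instead of waiting for the full output
_ = model.generate(
    **inputs,
    streamer=streamer,
    max_new_tokens=256,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)
```

Note that the chat loop appends every turn to `chat_history`; with `max_seq_length=2048`, long conversations will eventually exceed the context window, so consider truncating older turns.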

## Training

This model was fine-tuned on the FreedomIntelligence/medical-o1-reasoning-SFT dataset, which contains approximately 50,000 high-quality medical question-answer pairs; this run used a 19,704-sample subset (see Key Features below). Training used Unsloth for speed and LoRA for parameter-efficient fine-tuning.
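Before training, each record was presumably flattened into the same Alpaca template used at inference. A minimal sketch, assuming the `en` config and the field names `Question`, `Complex_CoT`, and `Response` from the dataset card (whether the chain-of-thought was kept in the target is an assumption):

```python
from datasets import load_dataset

dataset = load_dataset(
    "FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train"
)

def format_example(example):
    # Reuse the alpaca_prompt template from the inference section;
    # append EOS so the model learns where to stop.
    text = alpaca_prompt.format(
        instruction="Please answer the following medical question.",
        input_text=example["Question"],
        output=example["Complex_CoT"] + "\n" + example["Response"],
    ) + tokenizer.eos_token
    return {"text": text}

dataset = dataset.map(format_example)
```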

## Key Features

- **Base Model:** unsloth/mistral-7b-bnb-4bit
- **Fine-Tuning Objective:** Adaptation for structured, step-by-step medical reasoning tasks.
- **Training Dataset:** 19,704 samples from the medical-o1-reasoning-SFT dataset.
- **Tools Used** (see the configuration sketch after this list):
  - **Unsloth:** Accelerates training by roughly 2x.
  - **4-bit Quantization:** Reduces model memory usage.
  - **LoRA Adapters:** Enable parameter-efficient fine-tuning.
- **Training Time:** 38 minutes, 57 seconds for 1 epoch.
- **Final Logged Metrics:**
  - Step: 60
  - Training Loss: 1.1607
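To make the tooling bullets concrete, here is a hedged configuration sketch in the usual Unsloth + `trl` style, reusing the `model`, `tokenizer`, and `dataset` from the earlier sketches. All hyperparameters (rank, batch size, learning rate, etc.) are illustrative defaults, not the exact values behind this checkpoint, and the `SFTTrainer` keyword set follows older `trl` releases (newer ones move these into `SFTConfig`):

```python
from unsloth import FastLanguageModel
from trl import SFTTrainer
from transformers import TrainingArguments

# Attach LoRA adapters to the 4-bit base model (illustrative settings)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,            # formatted as in the sketch above
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        num_train_epochs=1,
        learning_rate=2e-4,
        fp16=True,
        logging_steps=10,
        output_dir="outputs",
    ),
)
trainer.train()
```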

## Limitations

- This model provides general medical information and should not be used as a substitute for professional medical advice.
- The model's knowledge is limited to its training data and may not include the latest medical research.
- The model is not clinically validated and should not be used for diagnosis or treatment decisions.

## License

This model inherits the Apache 2.0 license from the base Mistral-7B model.

## Citations

```bibtex
@misc{mistral-7b-medical-o1-ft,
  author = {Subh775},
  title = {Mistral-7B Medical QA Model},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face repository},
  howpublished = {\url{https://huggingface.co/Subh775/mistral-7b-medical-o1-ft}}
}
```