FM-1976/gemma-2b-docjoybot-lora-F16-GGUF

This LoRA adapter was converted to GGUF format from Joy10/gemma-2b-docjoybot-lora via ggml.ai's GGUF-my-lora space. Refer to the original adapter repository for more details.

Model Description

🩺 A Medical Reasoning Chatbot Based on Gemma-2B + LoRA

This is a fine-tuned version of google/gemma-2-2b-it enhanced with LoRA adapters. It specializes in medical question answering and clinical reasoning using structured, step-by-step thought processes.

📌 Key Features

  • 🧠 Chain-of-Thought (CoT) Reasoning for complex medical queries
  • 🧪 Fine-tuned on 25,000 samples from FreedomIntelligence/medical-o1-reasoning-SFT
  • 🧬 LoRA-based parameter-efficient tuning using Hugging Face PEFT + TRL
  • 💡 Prompt template includes structured <think> tags to enhance reasoning clarity
  • ⚡ Lightweight adapter (~10MB) for efficient deployment with the base model (see the loading sketch after this list)
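For use outside llama.cpp, here is a minimal sketch of attaching the original (non-GGUF) adapter to the base model with transformers + peft. The repository IDs come from this card; the example question and generation settings are placeholders:

# Minimal sketch: load the base model and attach the original LoRA adapter.
# Repo IDs are from this card; the question and generation settings are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "google/gemma-2-2b-it"
adapter_id = "Joy10/gemma-2b-docjoybot-lora"  # the original (non-GGUF) adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(model, adapter_id)  # attaches the ~10MB adapter

prompt = "### Question:\nWhat causes iron-deficiency anemia?\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))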

πŸ” Intended Use

This model is intended for educational, research, and prototyping purposes in the healthcare and AI domains. It performs best on medical diagnostic and reasoning tasks where step-by-step logical thinking is required.

⚠️ Disclaimer: This model is not intended for real-world clinical use without expert validation. It is a research-grade assistant only.

πŸ—οΈ How It Was Trained

  • Base Model: google/gemma-2-2b-it
  • LoRA Config: r=8, alpha=16, dropout=0.05 (see the configuration sketch after this list)
  • Frameworks: transformers, PEFT, TRL (SFTTrainer)
  • Quantization: 4-bit nf4 for efficient inference using bitsandbytes
  • Hardware: Trained on Kaggle GPU (T4), optimized for low-resource fine-tuning
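The exact training script isn't published on this card, so the following is only a minimal sketch of the stated setup using PEFT + TRL. The LoRA values and nf4 quantization are from the card; the dataset column names (Question, Complex_CoT, Response), the "en" config, and all other hyperparameters are assumptions:

# Minimal sketch of the stated setup: 4-bit nf4 base model, r=8/alpha=16/dropout=0.05
# LoRA, trained with TRL's SFTTrainer. Column names and any hyperparameters not
# listed on this card are assumptions.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from trl import SFTConfig, SFTTrainer

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # nf4 quantization, per the card
    bnb_4bit_compute_dtype=torch.bfloat16,
)
lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-2b-it", quantization_config=bnb, device_map="auto"
)

def to_text(example):
    # Assumed column names; fold each sample into the prompt format shown below.
    return {
        "text": (
            "You are a helpful and knowledgeable AI medical assistant.\n\n"
            f"### Question:\n{example['Question']}\n\n"
            "### Response:\n<think>\n"
            f"{example['Complex_CoT']}\n</think>\n{example['Response']}"
        )
    }

dataset = load_dataset(
    "FreedomIntelligence/medical-o1-reasoning-SFT", "en", split="train[:25000]"
).map(to_text)

trainer = SFTTrainer(
    model=model,
    train_dataset=dataset,
    peft_config=lora,
    args=SFTConfig(output_dir="gemma-2b-docjoybot-lora", dataset_text_field="text"),
)
trainer.train()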

💬 Prompt Format

You are a helpful and knowledgeable AI medical assistant.

### Question:
{medical_question_here}

### Response:
<think>
{step-by-step_reasoning}
</think>
{final_answer}
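A small (hypothetical) helper that fills in the template above; only the question is supplied by the caller, and the model is expected to emit the <think> block and the final answer itself:

# Hypothetical helper that renders the prompt template shown above.
SYSTEM = "You are a helpful and knowledgeable AI medical assistant."

def build_prompt(medical_question: str) -> str:
    # The <think>...</think> reasoning and final answer are generated by the model.
    return f"{SYSTEM}\n\n### Question:\n{medical_question}\n\n### Response:\n"

print(build_prompt("What are the first-line treatments for hypertension?"))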

Use with llama.cpp

# with cli
llama-cli -m base_model.gguf --lora gemma-2b-docjoybot-lora-f16.gguf (...other args)

# with server
llama-server -m base_model.gguf --lora gemma-2b-docjoybot-lora-f16.gguf (...other args)

To learn more about LoRA usage with the llama.cpp server, refer to the llama.cpp server documentation.
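llama-server also exposes an OpenAI-compatible chat endpoint. Below is a minimal client sketch, assuming the server runs on its default host and port (localhost:8080); the question and sampling settings are placeholders:

# Minimal client sketch against llama-server's OpenAI-compatible endpoint.
# Host, port, question, and sampling settings are placeholder assumptions.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "system", "content": "You are a helpful and knowledgeable AI medical assistant."},
            {"role": "user", "content": "What causes iron-deficiency anemia?"},
        ],
        "temperature": 0.2,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])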

GGUF
Model size: 10.4M params
Architecture: gemma2
Quantization: 16-bit

Model tree for FM-1976/gemma-2b-docjoybot-lora-F16-GGUF

Base model: google/gemma-2-2b
Quantized: this model

Dataset used to train FM-1976/gemma-2b-docjoybot-lora-F16-GGUF

FreedomIntelligence/medical-o1-reasoning-SFT