Model Card for llama2-medical-finetuned

Model Details

Model Description

This is a finetuned version of LLaMA 2 specialized for medical text understanding and generation tasks. It is designed to assist with medical data processing, clinical note summarization, and healthcare question answering.

  • Developed by: Cydonia01
  • Shared by: Cydonia01 on Hugging Face
  • Model type: Large Language Model (Transformer-based, quantized with BitsAndBytes 4-bit NF4)
  • Language(s) (NLP): English (primarily medical domain)
  • Finetuned from model: aboonaji/llama2finetune-v2 (itself a finetune of Meta AI's LLaMA 2)

Uses

Direct Use

  • Medical text generation and summarization
  • Clinical decision support tools
  • Medical Q&A systems

Downstream Use

  • Integration into healthcare NLP pipelines
  • Training further domain-specific models

Out-of-Scope Use

  • Not intended for direct diagnostic or treatment decision-making without expert review
  • Should not be used for generating legally binding medical advice

Bias, Risks, and Limitations

  • The model may reflect biases present in training data from medical literature and may generate incorrect or outdated medical information.
  • Not a substitute for professional medical advice or diagnosis.
  • Users should verify outputs with medical professionals.

Recommendations

Users should exercise caution when deploying the model in real-world medical scenarios and combine its outputs with expert validation.

How to Get Started with the Model

Use the code below to get started with the model.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the tokenizer and the finetuned model in half precision
tokenizer = AutoTokenizer.from_pretrained("Cydonia01/llama2-medical-finetuned")
model = AutoModelForCausalLM.from_pretrained(
    "Cydonia01/llama2-medical-finetuned", torch_dtype=torch.float16, device_map="auto"
)

# Generate a bounded-length response to a medical question
input_text = "Explain the symptoms of diabetes."
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
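
On memory-constrained GPUs such as the T4 this model was trained on, the model can instead be loaded in 4-bit precision, mirroring the quantization scheme described in this card. A minimal sketch, assuming BitsAndBytes is installed and float16 as the compute dtype:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 loading, matching the quantization described in this card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: compute dtype not stated in the card
)
model = AutoModelForCausalLM.from_pretrained(
    "Cydonia01/llama2-medical-finetuned",
    quantization_config=bnb_config,
    device_map="auto",
)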

Training Details

Training Data

A curated collection of medical texts, chiefly the wiki medical terms dataset (aboonaji/wiki_medical_terms_llam2_format).
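
As a quick sanity check, the dataset can be inspected with the Hugging Face Datasets library. A minimal sketch; the "train" split name is an assumption:

from datasets import load_dataset

# Load the wiki medical terms dataset used for finetuning
dataset = load_dataset("aboonaji/wiki_medical_terms_llam2_format", split="train")
print(dataset[0])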

Training Procedure

Finetuned from the aboonaji/llama2finetune-v2 base model with 4-bit NF4 quantization via BitsAndBytes and the PEFT LoRA method for parameter-efficient tuning, using a causal language modeling objective. A configuration sketch follows the hyperparameters below.

Training Hyperparameters

  • Batch size: 1 (per device) with gradient accumulation of 4
  • Max steps: 100
  • LoRA config: r=16, alpha=16, dropout=0.1
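
A minimal configuration sketch of this setup, assuming the standard BitsAndBytes, PEFT, and Transformers APIs; the compute dtype, output directory, and the prepare_model_for_kbit_training step are assumptions not stated in this card:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# 4-bit NF4 quantization, as described in the Training Procedure
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # assumption: float16 compute on the T4
)
base_model = AutoModelForCausalLM.from_pretrained(
    "aboonaji/llama2finetune-v2", quantization_config=bnb_config, device_map="auto"
)
base_model = prepare_model_for_kbit_training(base_model)

# LoRA settings from the hyperparameters above
lora_config = LoraConfig(r=16, lora_alpha=16, lora_dropout=0.1, task_type="CAUSAL_LM")
model = get_peft_model(base_model, lora_config)

# Trainer arguments matching the stated batch size, accumulation, and step count
training_args = TrainingArguments(
    output_dir="llama2-medical-finetuned",  # assumption: output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    max_steps=100,
    fp16=True,
)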

Environmental Impact

  • Hardware Type: NVIDIA Tesla T4 GPU (Google Colab)
  • Hours used: Approximately 0.75 hours (45 minutes)
  • Cloud Provider: Google Colab

Technical Specifications

Model Architecture and Objective

A LLaMA 2 base model finetuned with a causal language modeling objective, quantized to 4-bit NF4 precision for efficiency, and adapted with LoRA via PEFT.

Compute Infrastructure

Training was conducted in Google Colab’s cloud environment on a single GPU. Efficient quantization and parameter-efficient fine-tuning keep compute requirements modest.

Hardware

NVIDIA Tesla T4 GPU with 16 GB VRAM, supporting mixed precision (float16) and 4-bit quantization via the BitsAndBytes library.

Software

  • PyTorch
  • Transformers (Hugging Face)
  • PEFT (LoRA)
  • BitsAndBytes (4-bit quantization)
  • Datasets (Hugging Face)

Framework versions

  • PEFT 0.13.2
  • Transformers (version compatible with PEFT 0.13.2)
  • PyTorch (with float16 and 4-bit quantization support)