---
language:
- en
- ko
- zh
license: apache-2.0
library_name: peft
pipeline_tag: visual-question-answering
tags:
- vision
- visual-question-answering
- multimodal
- qwen
- lora
- tcm
- traditional-chinese-medicine
- tongue-diagnosis
---
# ViTCM_LLM - Traditional Chinese Medicine Tongue Diagnosis Model
This is a LoRA (Low-Rank Adaptation) adapter for the Qwen2.5-VL-32B-Instruct model, fine-tuned specifically for Traditional Chinese Medicine (TCM) tongue diagnosis tasks.
## Model Details
### Model Description
- **Developed by:** Mark-CHAE
- **Model type:** LoRA Adapter for Qwen2.5-VL-32B-Instruct
- **Language(s) (NLP):** Chinese, Korean, English
- **License:** Apache-2.0
- **Finetuned from model:** Qwen/Qwen2.5-VL-32B-Instruct
- **Specialization:** Traditional Chinese Medicine Tongue Diagnosis
### Model Sources
- **Repository:** [Mark-CHAE/ViTCM-LLM](https://huggingface.co/Mark-CHAE/ViTCM-LLM)
- **Base Model:** [Qwen/Qwen2.5-VL-32B-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-32B-Instruct)
## Uses
### Direct Use
This LoRA adapter can be used with the base Qwen2.5-VL-32B-Instruct model for multimodal vision-language tasks including:
- Traditional Chinese Medicine tongue diagnosis
- Tongue image analysis and interpretation
- Visual question answering for medical images
- Multimodal medical conversations
- Symptom analysis from tongue images
### Downstream Use
The adapter can be loaded with the base model for inference or further fine-tuning on specific TCM diagnosis tasks.
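For further fine-tuning, PEFT can load the adapter with its weights left trainable. A minimal sketch, assuming the standard `is_trainable` argument of `PeftModel.from_pretrained` (data pipeline and training loop omitted):

```python
from peft import PeftModel
from transformers import Qwen2_5_VLForConditionalGeneration
import torch

# Load the base model, then attach the adapter with its LoRA weights unfrozen
base = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-32B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "Mark-CHAE/ViTCM-LLM", is_trainable=True)
model.print_trainable_parameters()  # only the LoRA matrices require gradients
```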
## How to Get Started with the Model
### Using the Inference Widget
You can try the model directly in the browser using the Visual Question Answering widget above. Simply upload a tongue image and ask a question about it.
### Using the Model in Code
```python
from peft import PeftModel
from transformers import AutoProcessor, Qwen2_5_VLForConditionalGeneration
import torch
from PIL import Image

# Load the base model and its processor
# (Qwen2.5-VL is a vision-language model, so it uses
# Qwen2_5_VLForConditionalGeneration rather than AutoModelForCausalLM)
base_model = Qwen2_5_VLForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2.5-VL-32B-Instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
processor = AutoProcessor.from_pretrained("Qwen/Qwen2.5-VL-32B-Instruct")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "Mark-CHAE/ViTCM-LLM")

# Build the prompt with the chat template, which inserts the correct
# image placeholder tokens for Qwen2.5-VL
image = Image.open("tongue_image.jpg")
question = "根据图片判断舌诊内容"  # "Interpret the tongue diagnosis from the image"
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": question},
        ],
    }
]
prompt = processor.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = processor(text=[prompt], images=[image], return_tensors="pt").to(model.device)

# Generate response
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        temperature=0.7,
        top_p=0.9,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt
generated = outputs[0][inputs["input_ids"].shape[1]:]
answer = processor.decode(generated, skip_special_tokens=True)
print(answer)
```
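To serve the model without a PEFT dependency at inference time, the adapter weights can be folded into the base model with PEFT's `merge_and_unload`. A brief sketch, continuing from the snippet above (the output directory name is illustrative):

```python
# Merge the LoRA weights into the base model and drop the PEFT wrappers
merged = model.merge_and_unload()
merged.save_pretrained("vitcm-llm-merged")     # illustrative output path
processor.save_pretrained("vitcm-llm-merged")
```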
### Training Procedure
#### Training Hyperparameters
- **Training regime:** LoRA fine-tuning
- **LoRA rank:** 64
- **LoRA alpha:** 128
- **Target modules:** v_proj, qkv, attn.proj, q_proj, gate_proj, down_proj, up_proj, o_proj, k_proj
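For reference, a `peft.LoraConfig` matching these reported settings might look as follows; options not stated in this card (e.g. dropout) are left at PEFT defaults, and the task type is an assumption:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,             # LoRA rank
    lora_alpha=128,   # LoRA alpha
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
        "gate_proj", "up_proj", "down_proj",     # MLP projections
        "qkv", "attn.proj",                      # vision-tower attention
    ],
    task_type="CAUSAL_LM",  # assumption; not stated in the card
)
```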
#### Speeds, Sizes, Times
- **Adapter size:** 2.2GB
- **Base model:** Qwen2.5-VL-32B-Instruct (32B parameters)
#### Software
- PEFT 0.15.2
- Transformers library
- PyTorch
## Citation
**APA:**

Mark-CHAE. (2024). *ViTCM_LLM: Traditional Chinese Medicine Tongue Diagnosis Model*. Hugging Face. https://huggingface.co/Mark-CHAE/ViTCM-LLM
## Model Card Contact
For questions about this model, please contact the model author.
### Framework versions
- PEFT 0.15.2