Fine-tune Vision AI Model for Volume Recognition
This project demonstrates how to fine-tune a vision AI model for recognizing fluid volumes in test tubes, with applications across medical, laboratory, and industrial settings.
Prerequisites
1. HuggingFace Setup (Required)
- Create an account at huggingface.co
- Go to Settings → Access Tokens
- Create a new token (read access)
- Copy and save your token - you'll need it later
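One way to keep the saved token at hand without writing it into a file is an environment variable. The sketch below is only an illustration and is not part of the workshop app, which asks for the token in its interface: the helper name read_hf_token is hypothetical, and HF_TOKEN is used here because it is the variable name the huggingface_hub library looks for.

```python
import os


def read_hf_token(env_var: str = "HF_TOKEN") -> str:
    """Return the access token stored in an environment variable.

    Raises a clear error instead of returning an empty string, so a
    missing token is noticed before the upload step.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your access token"
        )
    return token
```

With the variable set in the terminal session, read_hf_token() returns the token with surrounding whitespace removed, ready to paste into the upload form.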
Quick Start
Open terminal in your JarvisLabs workspace:
File > New Launcher > Terminal
Clone the repository:
git clone https://github.com/ictBioRtc/finetune_florence2_vision_language_model.git
Navigate to project directory:
cd finetune_florence2_vision_language_model
Install dependencies:
pip install -r requirements.txt
Run the application:
python app.py
Copy the public URL provided (e.g., https://ff20bc33e416f3319f.gradio.live)
Open in a new browser tab
Using the Application
Step 1: Test Initial Model (Inference Tab)
- Unzip the provided test_images.zip
- Go to "Inference" tab
- Upload a test image
- Leave other settings at default
- Click "Run Inference"
- Observe how the untrained model performs
Step 2: Train the Model (Training Tab)
- Dataset: ictbiortc/beaker-volume-recognition-dataset
- Change epochs to 15 (for workshop purposes)
- Click "Start Training"
- Note: Training for 15 epochs can take ~5 hours
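Because a training run is long, it may be worth confirming that a GPU is visible before clicking "Start Training". The check below is a minimal sketch that assumes the app trains with PyTorch, which this card does not state explicitly; the helper name gpu_available is hypothetical.

```python
def gpu_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA device."""
    try:
        import torch  # assumption: the workshop app trains with PyTorch
    except ImportError:
        return False
    return torch.cuda.is_available()


if __name__ == "__main__":
    print("GPU available:", gpu_available())
```

If this reports False in a terminal inside the workspace, training would fall back to the CPU and run far slower.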
Step 3: Upload Model to HuggingFace
- After training completes, click "Upload to Hub"
- Enter your model name (e.g., your-username/beaker-volume-recognition-model)
- Paste your HuggingFace token
- Click "Upload"
Step 4: Important Configuration Update
- Go to your model on HuggingFace
- Navigate to "Files and versions"
- Find config.json
- Edit line 165 from:
  "model_type": "",
  to:
  "model_type": "davit",
Step 5: Evaluate Your Model
- Return to the app
- Go to "Evaluate" tab
- Upload a test image
- Use your trained model
- Compare results with the initial inference
Applications
This volume recognition model has potential applications in:
- IV Fluid Monitoring
- Laboratory Automation
- Medication Dosing
- Urine Monitoring
- Manufacturing Quality Control
- Chemical Processing
- Beverage Industry
- Petroleum Industry
Training Notes
- Full training typically takes days
- Workshop version uses 15 epochs (~5 hours)
- More epochs generally yield better results, at the cost of longer training
- GPU acceleration is recommended
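The workshop figure of 15 epochs in about 5 hours works out to roughly 20 minutes per epoch. The sketch below turns that into a rough estimate for other epoch counts. It assumes training time scales linearly with the number of epochs on the same hardware, which is a simplification; the function name estimate_training_hours is hypothetical.

```python
WORKSHOP_EPOCHS = 15   # epoch count used in the workshop
WORKSHOP_HOURS = 5.0   # approximate duration quoted for that run


def estimate_training_hours(epochs: int) -> float:
    """Estimate training time, assuming each epoch takes the same time."""
    if epochs < 0:
        raise ValueError("epochs must be non-negative")
    hours_per_epoch = WORKSHOP_HOURS / WORKSHOP_EPOCHS  # about 20 minutes
    return epochs * hours_per_epoch


if __name__ == "__main__":
    for epochs in (5, 15, 30):
        print(f"{epochs} epochs: about {estimate_training_hours(epochs):.1f} hours")
```

Treat the result as a planning figure only; actual time depends on the GPU, batch size, and dataset size.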
Troubleshooting
Common issues:
- "Model not loading": Check your internet connection
- "Training too slow": Verify GPU availability
- "Upload failed": Verify your HuggingFace token
- "Config error": Double-check the davit model_type update
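For the config error, one option is to download config.json from the model repository and check it locally. The sketch below is a minimal example and not part of the workshop app: the helper names fill_empty_model_type and patch_config are hypothetical, and it assumes that any model_type field left as an empty string should read davit, as described in Step 4.

```python
import json
from pathlib import Path


def fill_empty_model_type(node, value="davit"):
    """Recursively set every empty "model_type" entry to `value`.

    Returns the number of entries changed; non-empty values are left alone.
    """
    changed = 0
    if isinstance(node, dict):
        for key, child in node.items():
            if key == "model_type" and child == "":
                node[key] = value
                changed += 1
            else:
                changed += fill_empty_model_type(child, value)
    elif isinstance(node, list):
        for child in node:
            changed += fill_empty_model_type(child, value)
    return changed


def patch_config(path="config.json"):
    """Rewrite a local config.json in place and report how many fields changed."""
    config_path = Path(path)
    config = json.loads(config_path.read_text(encoding="utf-8"))
    changed = fill_empty_model_type(config)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return changed
```

A return value of 0 from patch_config means no empty model_type was found, so the file was already updated. A patched file still has to be uploaded back to the model repository to take effect.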
Next Steps
After successful training:
- Experiment with different epochs
- Try different image types
- Test various fluid volumes
- Integrate with your specific use case
Congratulations! You've successfully:
- Tested a base vision model
- Fine-tuned it for volume recognition
- Uploaded it to HuggingFace
- Created a practical AI solution for real-world applications
This workshop demonstrates how vision language models can be adapted for specific industrial and medical applications.