---
language: en
license: apache-2.0
tags:
- vision
- image-classification
- vit
- fine-tuned
- transformers
datasets:
- your-dataset-name
model-index:
- name: ViT-Large-Patch16-224 Fine-tuned Model
results:
- task:
name: Image Classification
type: image-classification
metrics:
- name: Validation Loss
type: loss
value: 0.3268
---
# Vision Transformer (ViT) Fine-Tuned Model
This repository contains a fine-tuned version of **[google/vit-large-patch16-224](https://huggingface.co/google/vit-large-patch16-224)**, optimized for a custom image classification task.
---
## 📌 Model Overview
- **Base model**: `google/vit-large-patch16-224`
- **Architecture**: Vision Transformer (ViT)
- **Patch size**: 16×16
- **Image resolution**: 224×224
- **Frameworks**: PyTorch, Hugging Face Transformers
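As a quick sanity check, the architecture details above can be read straight from the model config without downloading the weights. A minimal sketch, assuming the repo id used in the examples below:

```python
from transformers import AutoConfig

# Load only the config and confirm the architecture details listed above.
config = AutoConfig.from_pretrained("rakib730/output-models")
print(config.model_type)   # "vit"
print(config.patch_size)   # 16
print(config.image_size)   # 224
```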
---
## 📊 Performance
| Metric | Value |
|--------|-------|
| **Final Validation Loss** | **0.3268** |
| **Lowest Validation Loss** | **0.2548** (Epoch 18) |
Training and validation loss curves indicate good convergence with slight overfitting in later epochs: validation loss reaches its minimum of **0.2548** at epoch 18 and climbs back to **0.3268** by epoch 40.
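Since the best checkpoint precedes the final one, a re-run could retain the epoch-18 weights automatically via the Trainer's best-model tracking. This is a sketch of that setup, not the configuration used for this model; `output-models` is a placeholder path:

```python
from transformers import TrainingArguments

# Sketch: evaluate and checkpoint every epoch, then restore the checkpoint
# with the lowest validation loss when training finishes.
args = TrainingArguments(
    output_dir="output-models",        # placeholder
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
```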
---
## 🔧 Training Configuration
| Hyperparameter | Value |
|----------------|-------|
| **Learning rate** | `2e-5` |
| **Train batch size** | `20` |
| **Eval batch size** | `8` |
| **Optimizer** | AdamW (`betas=(0.9, 0.999)`, `eps=1e-8`) |
| **LR scheduler** | Linear |
| **Epochs** | `40` |
| **Seed** | `42` |
| **Framework versions** | Transformers 4.52.4, PyTorch 2.6.0+cu124, Datasets 3.6.0, Tokenizers 0.21.2 |
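For reference, the table above maps onto `TrainingArguments` roughly as follows (a sketch; AdamW with `betas=(0.9, 0.999)`, `eps=1e-8` and the linear schedule are `Trainer` defaults, so they need no explicit flags):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="output-models",      # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=8,
    num_train_epochs=40,
    lr_scheduler_type="linear",
    seed=42,
)
```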
---
## 📂 Training Results
| Epoch | Step | Validation Loss |
|-------|------|-----------------|
| 1 | 24 | 0.5601 |
| 5 | 120 | 0.3421 |
| 10 | 240 | 0.2901 |
| 14 | 336 | 0.2737 |
| 18 | 432 | **0.2548** |
| 40 | 960 | 0.3268 |
---
## 🛠 Intended Uses
- Image classification on datasets with characteristics similar to the training dataset.
- Fine-tuning for domain-specific classification tasks (see the sketch below).
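For the fine-tuning use case, the classification head can be re-initialized for a new label set while keeping the ViT backbone. A minimal sketch; `NUM_CLASSES` is a placeholder for your dataset's label count:

```python
from transformers import AutoModelForImageClassification

NUM_CLASSES = 10  # placeholder: set to your dataset's number of classes

# Drop the existing classification head and create a fresh one sized
# for the new label set; the backbone weights are reused as-is.
model = AutoModelForImageClassification.from_pretrained(
    "rakib730/output-models",
    num_labels=NUM_CLASSES,
    ignore_mismatched_sizes=True,
)
```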
---
## ⚠️ Limitations
- Trained on a **custom dataset**; it may not generalize well to unrelated domains without additional fine-tuning.
- Fairness, bias, and broader ethical implications have not been analyzed; evaluate these on your own data before deployment.
---
## 🚀 How to Use
You can use this model in two main ways:
### **1️⃣ Using the High-Level `pipeline` API**
```python
from transformers import pipeline
pipe = pipeline("image-classification", model="rakib730/output-models")
# Classify an image from a URL
result = pipe("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png")
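# `result` is a list of {"label": ..., "score": ...} dicts, sorted by descending score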
print(result)
```

### **2️⃣ Using the Processor and Model Directly**

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import requests
import torch
# Load processor and model
processor = AutoImageProcessor.from_pretrained("rakib730/output-models")
model = AutoModelForImageClassification.from_pretrained("rakib730/output-models")
# Load an image
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")
# Preprocess
inputs = processor(images=image, return_tensors="pt")
# Inference
with torch.no_grad():
    outputs = model(**inputs)

logits = outputs.logits
predicted_class_id = logits.argmax(-1).item()
print("Predicted class:", model.config.id2label[predicted_class_id])
```