---
language: en
license: apache-2.0
tags:
- vision
- image-classification
- vit
- fine-tuned
- transformers
datasets:
- your-dataset-name
model-index:
- name: ViT-Large-Patch16-224 Fine-tuned Model
  results:
  - task:
      name: Image Classification
      type: image-classification
    metrics:
    - name: Validation Loss
      type: loss
      value: 0.3268
---

# Vision Transformer (ViT) Fine-Tuned Model

This repository contains a fine-tuned version of **[google/vit-large-patch16-224](https://huggingface.co/google/vit-large-patch16-224)**, optimized for a custom image classification task.

---

## πŸ“Œ Model Overview

- **Base model**: `google/vit-large-patch16-224`
- **Architecture**: Vision Transformer (ViT)
- **Patch size**: 16Γ—16
- **Image resolution**: 224Γ—224
- **Frameworks**: PyTorch, Hugging Face Transformers
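
These architecture details can also be checked programmatically from the checkpoint's configuration. A minimal sketch (it loads only the config, not the weights; the expected values in the comments follow from the base model):

```python
from transformers import AutoConfig

# Inspect the architecture without downloading the full weights
config = AutoConfig.from_pretrained("rakib730/output-models")

print(config.model_type)   # "vit"
print(config.image_size)   # 224
print(config.patch_size)   # 16
print(config.hidden_size)  # 1024 for the ViT-Large variant
```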

---

## πŸ“Š Performance

| Metric | Value |
|--------|-------|
| **Final Validation Loss** | **0.3268** |
| **Lowest Validation Loss** | **0.2548** (Epoch 18) |

Training and validation loss trends indicate good convergence, with slight overfitting after roughly 30 epochs.

---

## πŸ”§ Training Configuration

| Hyperparameter | Value |
|----------------|-------|
| **Learning rate** | `2e-5` |
| **Train batch size** | `20` |
| **Eval batch size** | `8` |
| **Optimizer** | AdamW (`betas=(0.9, 0.999)`, `eps=1e-8`) |
| **LR scheduler** | Linear |
| **Epochs** | `40` |
| **Seed** | `42` |
| **Framework versions** | Transformers 4.52.4, PyTorch 2.6.0+cu124, Datasets 3.6.0, Tokenizers 0.21.2 |
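
The training script itself is not included in this repository, but the hyperparameters above map directly onto `TrainingArguments`. A hedged reconstruction (the output directory and evaluation strategy are assumptions; AdamW is the default `Trainer` optimizer):

```python
from transformers import TrainingArguments

# Plausible reconstruction of the run configuration from the table above
training_args = TrainingArguments(
    output_dir="output-models",   # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=20,
    per_device_eval_batch_size=8,
    num_train_epochs=40,
    lr_scheduler_type="linear",
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    eval_strategy="epoch",        # assumed; consistent with the per-epoch validation results below
)
```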

---

## πŸ“‚ Training Results

| Epoch | Step | Validation Loss |
|-------|------|-----------------|
| 1     | 24   | 0.5601 |
| 5     | 120  | 0.3421 |
| 10    | 240  | 0.2901 |
| 14    | 336  | 0.2737 |
| 18    | 432  | **0.2548** |
| 40    | 960  | 0.3268 |

---

## πŸ›  Intended Uses

- Image classification on datasets with characteristics similar to the training dataset.
- Fine-tuning for domain-specific classification tasks, as sketched below.
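
A minimal sketch of such continued fine-tuning, following the standard `Trainer` image-classification recipe. The dataset name, split names, and column names (`image`, `label`) are placeholders rather than part of this repository:

```python
from datasets import load_dataset
from transformers import (
    AutoImageProcessor,
    AutoModelForImageClassification,
    DefaultDataCollator,
    Trainer,
    TrainingArguments,
)

# Placeholder dataset with "image" (PIL) and "label" (class id) columns
dataset = load_dataset("your-dataset-name")
labels = dataset["train"].features["label"].names

processor = AutoImageProcessor.from_pretrained("rakib730/output-models")
model = AutoModelForImageClassification.from_pretrained(
    "rakib730/output-models",
    num_labels=len(labels),
    id2label={i: name for i, name in enumerate(labels)},
    label2id={name: i for i, name in enumerate(labels)},
    ignore_mismatched_sizes=True,  # replace the classification head for the new label set
)

def preprocess(batch):
    # Turn PIL images into the pixel_values tensor the ViT model expects
    batch["pixel_values"] = processor(
        [img.convert("RGB") for img in batch["image"]], return_tensors="pt"
    )["pixel_values"]
    del batch["image"]
    return batch

dataset = dataset.with_transform(preprocess)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="vit-domain-finetuned",
        remove_unused_columns=False,  # keep the "image" column available to the transform
    ),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    data_collator=DefaultDataCollator(),
)
trainer.train()
```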

---

## ⚠ Limitations

- Trained on a **custom dataset** β€” may not generalize well to unrelated domains without additional fine-tuning.
- No guarantees on fairness, bias, or ethical implications without dataset analysis.

---

## πŸš€ How to Use

You can use this model in two main ways:

### **1️⃣ Using the High-Level `pipeline` API**
```python
from transformers import pipeline

pipe = pipeline("image-classification", model="rakib730/output-models")

# Classify an image from a URL
result = pipe("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png")
print(result)
```

### **2️⃣ Using the Processor and Model Directly**
```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import requests
import torch

# Load processor and model
processor = AutoImageProcessor.from_pretrained("rakib730/output-models")
model = AutoModelForImageClassification.from_pretrained("rakib730/output-models")

# Load an image
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/hub/parrots.png"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Preprocess
inputs = processor(images=image, return_tensors="pt")

# Inference
with torch.no_grad():
    outputs = model(**inputs)
    logits = outputs.logits
    predicted_class_id = logits.argmax(-1).item()

print("Predicted class:", model.config.id2label[predicted_class_id])