🧠 Image Classification AI Model (CIFAR-100)
This repository contains a Vision Transformer (ViT) model fine-tuned for image classification on the CIFAR-100 dataset. The model is built on google/vit-base-patch16-224, quantized to FP16 for efficient inference, and reaches roughly 70–80% accuracy depending on configuration.
Features
- 🖼️ Task: Image Classification
- 🧠 Base Model: google/vit-base-patch16-224 (Vision Transformer)
- 🧪 Quantized: FP16 for faster and memory-efficient inference
- 🎯 Dataset: 100 fine-grained object categories
- ⚡ CUDA Enabled: Optimized for GPU acceleration
- High Accuracy: Fine-tuned and evaluated on validation split
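To make the FP16 trade-off concrete, the sketch below uses only the Python standard library. It round-trips a value through IEEE 754 half precision and estimates weight storage at 4 versus 2 bytes per parameter. The figure of about 86 million parameters for ViT-Base is an approximation, and the helper names are illustrative, not part of this repository.

```python
import struct


def to_fp16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def weight_megabytes(num_params: int, bytes_per_param: int) -> float:
    """Approximate storage for the weights alone, in MB (10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


# Assumption: ViT-Base has roughly 86 million parameters.
APPROX_VIT_BASE_PARAMS = 86_000_000

fp32_mb = weight_megabytes(APPROX_VIT_BASE_PARAMS, 4)  # 32-bit floats
fp16_mb = weight_megabytes(APPROX_VIT_BASE_PARAMS, 2)  # 16-bit floats

print(f"FP32 weights: ~{fp32_mb:.0f} MB, FP16 weights: ~{fp16_mb:.0f} MB")
print(f"0.1 stored as FP16: {to_fp16(0.1)}")
```

Halving the bytes per weight halves the weight memory, at the cost of a small rounding error in each stored value.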
Dataset Used
Hugging Face Dataset: tanganke/cifar100
- Description: CIFAR-100 is a dataset of 60,000 32×32 color images in 100 classes (600 images per class)
- Split: 50,000 training images and 10,000 test images
- Categories: Animals, Vehicles, Food, Household items, etc.
- License: MIT License (from source)
from datasets import load_dataset
dataset = load_dataset("tanganke/cifar100")
🛠️ Model & Training Configuration
Model: google/vit-base-patch16-224
Image Size: 224x224 (resized from 32x32)
Framework: Hugging Face Transformers & Datasets
Training Environment: Kaggle Notebook with CUDA
Epochs: 5–10 (with early stopping)
Batch Size: 32
Optimizer: AdamW
Loss Function: CrossEntropyLoss
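The "with early stopping" note can be read as: stop training once the validation metric stops improving. The sketch below shows one common patience-based variant in plain Python. The `EarlyStopper` class, the patience value, and the accuracy numbers are hypothetical illustrations, not the settings used for this model.

```python
class EarlyStopper:
    """Patience-based early stopping on a metric where higher is better."""

    def __init__(self, patience: int = 2, min_delta: float = 0.0):
        self.patience = patience      # epochs to wait without improvement
        self.min_delta = min_delta    # minimum gain that counts as improvement
        self.best = float("-inf")
        self.bad_epochs = 0

    def should_stop(self, metric: float) -> bool:
        if metric > self.best + self.min_delta:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


# Made-up validation accuracies, one per epoch, for illustration only.
val_accuracies = [0.52, 0.63, 0.70, 0.69, 0.70, 0.68]

stopper = EarlyStopper(patience=2)
stopped_at = None
for epoch, acc in enumerate(val_accuracies, start=1):
    if stopper.should_stop(acc):
        stopped_at = epoch
        break

print(f"stopped after epoch {stopped_at}, best accuracy {stopper.best:.2f}")
```

When training with the Hugging Face Trainer, `transformers.EarlyStoppingCallback` offers comparable behavior. Whether this project used it is not stated.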
✅ Evaluation & Scoring
Accuracy: ~70–80% (varies by configuration)
Validation Tool: evaluate or sklearn.metrics
Metrics: Accuracy (Top-1 and Top-5)
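Inference Speed: Significantly faster after quantization
Inference Example
from PIL import Image
import torch
from transformers import ViTFeatureExtractor

# `model` (the fine-tuned ViT, already on the GPU) and `dataset` (loaded above) are assumed to exist.
# Newer transformers releases replace ViTFeatureExtractor with ViTImageProcessor.
feature_extractor = ViTFeatureExtractor.from_pretrained("google/vit-base-patch16-224")

def predict(image_path):
    image = Image.open(image_path).convert("RGB")
    # Resize and normalize to 224x224, then match the model's device and dtype (FP16)
    pixel_values = feature_extractor(images=image, return_tensors="pt")["pixel_values"]
    pixel_values = pixel_values.to("cuda", dtype=model.dtype)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(pixel_values=pixel_values)
    logits = outputs.logits
    predicted_class = logits.argmax(-1).item()
    # Map the class index back to its CIFAR-100 label name
    return dataset["train"].features["fine_label"].int2str(predicted_class)

print(predict("sample_image.jpg"))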
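The Top-1 and Top-5 scores listed under Evaluation can be computed directly from per-image class scores. The following is a minimal, dependency-free sketch. The `top_k_accuracy` helper and the toy scores are illustrative assumptions, not the repository's evaluation code.

```python
def top_k_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores for this sample
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example: 3 samples, 4 classes (real CIFAR-100 logits have 100 columns).
toy_scores = [
    [0.1, 2.0, 0.3, 0.5],  # highest: class 1
    [1.5, 0.2, 1.4, 0.1],  # highest: class 0, then class 2
    [0.0, 0.1, 0.2, 3.0],  # highest: class 3, then class 2
]
toy_labels = [1, 2, 0]

top1 = top_k_accuracy(toy_scores, toy_labels, k=1)
top2 = top_k_accuracy(toy_scores, toy_labels, k=2)
print(f"top-1: {top1:.2f}, top-2: {top2:.2f}")
```

For a real evaluation run, each row would be the model's logits for one test image and `k` would be 1 or 5.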
Folder Structure
📦image-classification-vit
 ┣ vit-cifar100-fp16
 ┣ train.py
 ┣ inference.py
 ┣ README.md
 ┗ requirements.txt