vit-base-cifar10-augmented

This model is a fine-tuned version of google/vit-base-patch16-224 on the CIFAR-10 dataset using data augmentation.

It achieves the following results:

  • Training loss (final epoch): 0.0445
  • Test accuracy (best epoch): 95.54%

🧠 Model Description

The base model is a Vision Transformer (ViT) originally trained on ImageNet-21k. This version has been fine-tuned on CIFAR-10, a standard image classification benchmark, using PyTorch and Hugging Face Transformers.

Training was done using extensive data augmentation, including random crops, flips, rotations, and color jitter to improve generalization on small input images (32×32, resized to 224×224).

✅ Intended Uses & Limitations

Intended uses

  • Educational and research use on small image classification tasks
  • Benchmarking transfer learning for ViT on CIFAR-10
  • Demonstrating the impact of data augmentation on fine-tuning performance

Limitations

  • Not optimized for real-time inference
  • Fine-tuned only on CIFAR-10; not suitable for general-purpose image classification
  • Requires resized input (224×224)

📦 Training and Evaluation Data

  • Dataset: CIFAR-10
  • Size: 60,000 images (10 classes)
  • Split: 75% training, 25% test

All images were resized to 224×224 and normalized using ViT's original mean/std values.

⚙️ Training Procedure

Hyperparameters

  • Learning rate: 1e-4
  • Optimizer: Adam
  • Batch size: 8
  • Epochs: 10
  • Scheduler: ReduceLROnPlateau
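The optimizer/scheduler setup above can be sketched in PyTorch. The tiny linear model is a stand-in for the fine-tuned ViT, and the `factor`/`patience` values are illustrative assumptions, since the card does not specify them:

```python
import torch

# Sketch of the hyperparameter setup listed above: Adam at lr=1e-4 with a
# ReduceLROnPlateau scheduler stepped on the evaluation metric each epoch.
model = torch.nn.Linear(10, 10)  # placeholder for the ViT classifier
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="max", factor=0.1, patience=2  # factor/patience assumed
)

for epoch in range(10):
    # ... forward/backward passes over batches of 8 would go here ...
    val_accuracy = 0.95  # placeholder: metric from the evaluation pass
    scheduler.step(val_accuracy)  # cut LR by 10x when accuracy plateaus

print(optimizer.param_groups[0]["lr"])
```

Because the placeholder metric never improves, the scheduler repeatedly reduces the learning rate, which is the behavior a plateaued accuracy would trigger in the real run.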

Data Augmentation Used

  • RandomResizedCrop(224)
  • RandomHorizontalFlip()
  • RandomRotation(10)
  • ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.1)

Training Results

Epoch   Training Loss   Test Accuracy
  1        0.1969          94.62%
  2        0.1189          95.05%
  3        0.0899          95.54%
  4        0.0720          94.68%
  5        0.0650          94.84%
  6        0.0576          94.76%
  7        0.0560          95.33%
  8        0.0488          94.31%
  9        0.0499          95.42%
 10        0.0445          94.33%

🧪 Framework Versions

  • transformers: 4.50.0
  • torch: 2.6.0+cu124
  • datasets: 3.4.1
  • tokenizers: 0.21.1
  • Model size: 85.8M parameters (F32, Safetensors)