My Awesome Food Model - Fine-tuned LoRA (Food101)

Overview

My Awesome Food Model - Fine-tuned LoRA (Food101) is a version of google/vit-base-patch16-224-in21k fine-tuned with Low-Rank Adaptation (LoRA) on the Food101 dataset for food image classification. LoRA trains a small set of added low-rank weights while the base Vision Transformer stays frozen, which keeps fine-tuning inexpensive.
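To make the low-rank idea concrete, the sketch below counts trainable parameters for a single 768×768 projection matrix (768 is the hidden size of ViT-Base) when it is fully fine-tuned versus adapted with two low-rank factors. The rank of 16 and the helper name `lora_param_counts` are assumptions for illustration only; this card does not state the rank that was actually used.

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Return (full fine-tuning params, LoRA params) for one weight matrix."""
    full = d_in * d_out            # every entry of W is trainable
    lora = rank * (d_in + d_out)   # A is rank x d_in, B is d_out x rank
    return full, lora


# rank=16 is an assumed value, not taken from this model card
full, lora = lora_param_counts(768, 768, rank=16)
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.2%}")
# prints "full: 589,824  lora: 24,576  ratio: 4.17%"
```

Under that assumed rank, the adapter trains about 4% as many weights per adapted matrix as full fine-tuning would.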

Model Details

  • Library: PEFT
  • License: Apache-2.0
  • Base Model: google/vit-base-patch16-224-in21k
  • Tags: generated_from_trainer
  • Evaluation Metric: Accuracy

Performance

The model achieves the following results on the evaluation set:

  • Loss: 0.6790
  • Accuracy: 82.13%

Intended Uses & Limitations

Intended Uses

  • Food classification tasks
  • Culinary image recognition applications
  • Research and development in efficient transformer fine-tuning using LoRA

Limitations

  • May not generalize well to food categories outside the Food101 dataset
  • Performance may degrade on images with poor lighting or unusual angles

Training & Evaluation Data

The model was fine-tuned using the Food101 dataset, which includes 101 food categories with 1,000 images per category. The dataset is split into 75% training data and 25% test data.
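As a quick check of the sizes implied by that description, the following arithmetic sketch derives the image counts from the stated figures (101 classes, 1,000 images each, 75/25 split). The variable names are illustrative.

```python
NUM_CLASSES = 101
IMAGES_PER_CLASS = 1_000
TRAIN_PER_CLASS = IMAGES_PER_CLASS * 75 // 100   # 75% of each class

total_images = NUM_CLASSES * IMAGES_PER_CLASS
train_images = NUM_CLASSES * TRAIN_PER_CLASS
test_images = total_images - train_images

print(total_images, train_images, test_images)
# prints "101000 75750 25250"
```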

Training Procedure

Hyperparameters

The model was fine-tuned using the following hyperparameters:

  • Learning Rate: 0.005
  • Train Batch Size: 128
  • Eval Batch Size: 128
  • Seed: 42
  • Gradient Accumulation Steps: 4
  • Total Train Batch Size: 512
  • Optimizer: AdamW (betas=(0.9, 0.999), epsilon=1e-08)
  • LR Scheduler: Linear Decay
  • Number of Epochs: 5
  • Mixed Precision Training: Native AMP
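
Two of these settings interact in ways worth spelling out: the total batch size is the per-step batch size multiplied by the gradient accumulation steps, and linear decay lowers the learning rate from its initial value toward zero over the run. The sketch below assumes no warmup (the card does not mention any) and uses 590 total optimizer steps, the last step reported in the training log. The function name `linear_decay_lr` is hypothetical.

```python
def linear_decay_lr(step: int, total_steps: int, base_lr: float) -> float:
    """Learning rate after `step` optimizer steps, decaying linearly to 0.

    Assumes no warmup phase; that is an assumption, not stated in the card.
    """
    remaining = max(0, total_steps - step)
    return base_lr * remaining / total_steps


TOTAL_BATCH = 128 * 4   # train batch size x gradient accumulation steps
TOTAL_STEPS = 590       # final optimizer step in the training log
BASE_LR = 0.005

print("total batch size:", TOTAL_BATCH)
for step in (0, 295, 590):
    print(step, linear_decay_lr(step, TOTAL_STEPS, BASE_LR))
```

Halfway through the run (step 295) the rate is half the initial value, and it reaches zero at the final step.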

Training Progress

Training Loss | Epoch | Step | Validation Loss | Accuracy
0.8211        | 1.0   | 119  | 0.8232          | 78.20%
0.7385        | 2.0   | 238  | 0.7586          | 80.13%
0.6528        | 3.0   | 357  | 0.7408          | 80.57%
0.5283        | 4.0   | 476  | 0.6797          | 82.18%
0.5294        | 4.962 | 590  | 0.6790          | 82.13%
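
The last two rows are close enough that the choice of "best" checkpoint depends on the metric. The sketch below copies the logged values into a list and selects the best row by validation accuracy and by validation loss; the variable names are illustrative.

```python
# (training_loss, epoch, step, validation_loss, accuracy_percent), copied from the log
rows = [
    (0.8211, 1.0, 119, 0.8232, 78.20),
    (0.7385, 2.0, 238, 0.7586, 80.13),
    (0.6528, 3.0, 357, 0.7408, 80.57),
    (0.5283, 4.0, 476, 0.6797, 82.18),
    (0.5294, 4.962, 590, 0.6790, 82.13),
]

best_by_accuracy = max(rows, key=lambda r: r[4])
best_by_loss = min(rows, key=lambda r: r[3])

print("best accuracy at step", best_by_accuracy[2])   # step 476 (82.18%)
print("lowest val loss at step", best_by_loss[2])     # step 590 (0.6790)
```

The reported evaluation results correspond to the final step, which has the lowest validation loss; the step-476 evaluation scored 0.05 percentage points higher on accuracy.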

Framework Versions

  • PEFT: 0.15.0
  • Transformers: 4.49.0
  • PyTorch: 2.6.0
  • Datasets: 3.4.1
  • Tokenizers: 0.21.1

Usage

To run inference, load the base model with the transformers library and attach the LoRA adapter with PEFT:

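from transformers import ViTForImageClassification, ViTImageProcessor
from peft import PeftModel
from PIL import Image
import torch

# Load the base model and attach the LoRA adapter.
# num_labels=101 is an assumption based on Food101's 101 classes;
# without it the classification head defaults to 2 outputs.
base_model = ViTForImageClassification.from_pretrained("path_to_base_model", num_labels=101)
model = PeftModel.from_pretrained(base_model, "path_to_lora_adapter")
model.eval()  # disable dropout for inference

# ViTImageProcessor replaces the deprecated ViTFeatureExtractor
image_processor = ViTImageProcessor.from_pretrained("path_to_base_model")

# Load and preprocess an image (convert to RGB in case it is grayscale or RGBA)
image = Image.open("example_food.jpg").convert("RGB")
inputs = image_processor(images=image, return_tensors="pt")

# Perform inference
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = predictions.argmax(dim=-1).item()

# This is a class index in the range 0-100, not a label name
print(f"Predicted class: {predicted_class}")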
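
The snippet above prints only a class index. As a minimal sketch of post-processing, the helper below turns raw logits into the top-k (label, probability) pairs using only the standard library. The function name `top_k` and the three-class `demo_labels` mapping are hypothetical; in practice the logits would come from `outputs.logits[0].tolist()` and the mapping from the dataset's label names.

```python
import math


def top_k(logits: list[float], id2label: dict[int, str], k: int = 3) -> list[tuple[str, float]]:
    """Return the k most probable (label, probability) pairs from raw logits."""
    m = max(logits)                                # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    # Fall back to the index as a string when no label name is known
    return [(id2label.get(i, str(i)), probs[i]) for i in ranked]


# Hypothetical three-class mapping and logits, for illustration only
demo_labels = {0: "apple_pie", 1: "pizza", 2: "sushi"}
for label, prob in top_k([0.2, 2.5, 1.1], demo_labels, k=2):
    print(f"{label}: {prob:.3f}")
```

Reporting a few candidates with probabilities is often more useful than a single index, particularly for visually similar dishes.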

Citation

If you use this model in your research or project, please cite it as follows:

@misc{my_awesome_food_model_lora,
  author = {Your Name},
  title = {My Awesome Food Model - Fine-tuned LoRA (Food101)},
  year = {2025},
  url = {https://huggingface.co/your_model_link}
}

Acknowledgments

This model was developed using the PEFT library for parameter-efficient fine-tuning and trained on the Food101 dataset. Special thanks to the Hugging Face community for providing invaluable tools and resources for optimizing transformer models.
