BERT-Based Image Classifier
This model takes inputs from the CIFAR-10 dataset, converts each image into patch embeddings, adds positional information and a class token, and feeds the resulting sequence to a Transformer encoder. The representation of the first token (the class token) in the last hidden state is used as the input to an MLP head, which acts as the classifier.
A complete architecture diagram is provided for your understanding, showing the tensor dimensions and the operations at each stage. The BERT model consists of multiple hidden layers (encoder blocks) stacked on top of one another.
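The pipeline above (patch embedding, class token, positional embeddings, encoder stack, MLP head) can be sketched in PyTorch. This is a minimal illustration, not the repository's actual script; the layer sizes, depth, and names here are assumptions for CIFAR-10-sized inputs.

```python
import torch
import torch.nn as nn

class BertImageClassifier(nn.Module):
    """Sketch of a BERT-style encoder over image patches (hyperparameters assumed)."""
    def __init__(self, image_size=32, patch_size=4, in_channels=3,
                 hidden_dim=256, num_layers=6, num_heads=8, num_classes=10):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2  # 64 patches for 32x32 CIFAR-10
        # Patch embedding: a strided convolution cuts the image into patches
        # and projects each one to the hidden dimension.
        self.patch_embed = nn.Conv2d(in_channels, hidden_dim,
                                     kernel_size=patch_size, stride=patch_size)
        # Learnable class token and positional embeddings (one extra slot for [CLS]).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, hidden_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # MLP head reads the class-token representation and emits class logits.
        self.mlp_head = nn.Sequential(
            nn.LayerNorm(hidden_dim), nn.Linear(hidden_dim, num_classes))

    def forward(self, x):
        x = self.patch_embed(x)             # (B, D, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)    # (B, num_patches, D)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        x = self.encoder(x)
        return self.mlp_head(x[:, 0])       # first token of the last hidden state

logits = BertImageClassifier()(torch.randn(2, 3, 32, 32))
print(logits.shape)  # torch.Size([2, 10])
```

The class token plays the same role as `[CLS]` in BERT for text: it aggregates information from all patches through self-attention, so its final representation summarizes the whole image.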
Model Details
Model Description
This project was built for a greater understanding of how a Transformer can be used instead of convolutions or RNNs to classify images, obtaining useful representations analogous to the feature maps produced by CNN convolutions.
- Developed by: Michael Peres
- Model type: BERT + MLP Classifier Head
- Language(s) (NLP): Not applicable (image classification)
Model Sources
Uses
Classifying images from the CIFAR-10 dataset. The model achieved an accuracy of 80%.
How to Get Started with the Model
Run the model defined in the Python script file.
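Since the exact script interface isn't documented here, the following is a hedged sketch of a single training step. The `model` stand-in and the random tensors (substituting for a CIFAR-10 batch of 32x32 RGB images with 10 labels) are assumptions; swap in the classifier defined in the repository's script.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in: any module mapping (B, 3, 32, 32) -> (B, 10) logits,
# such as the classifier defined in the repository's Python script.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
criterion = nn.CrossEntropyLoss()

# One training step on random tensors standing in for a CIFAR-10 batch.
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

In practice the real training loop would iterate over `torchvision.datasets.CIFAR10` with a `DataLoader` rather than random tensors.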
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: NVIDIA A100 80GB PCIe
- Hours used: 0.5 hours
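A back-of-envelope estimate from the figures above can be computed directly. The A100 80GB PCIe has a TDP of roughly 300 W; the grid carbon intensity below is an assumed illustrative value, since the actual figure depends on where the training ran.

```python
# Rough upper-bound emissions estimate; TDP and grid intensity are assumptions.
tdp_kw = 0.300                # NVIDIA A100 80GB PCIe TDP (~300 W)
hours = 0.5                   # training time from this model card
energy_kwh = tdp_kw * hours   # 0.15 kWh at full power draw

grid_gco2_per_kwh = 400       # assumed grid carbon intensity (varies by region)
emissions_g = energy_kwh * grid_gco2_per_kwh
print(f"~{emissions_g:.0f} gCO2eq")  # ~60 gCO2eq
```

This is an upper bound, since the GPU rarely draws full TDP for the entire run; the Lacoste et al. calculator applies the same energy-times-intensity formula with region-specific intensities.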