BERT-Based Image Classifier
This model takes inputs from the CIFAR-10 dataset, converts each image into patch embeddings, adds positional information and a class token, and feeds the resulting sequence to a Transformer encoder. The representation of the first token (the class token) in the last hidden state is used as the input to an MLP head, which acts as the classifier.
A complete architecture diagram is provided for your understanding, showing the tensor dimensions and the operations at each stage. The BERT model consists of multiple hidden layers (encoder blocks) stacked on top of one another.
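The pipeline above (patch embedding, class token, positional embeddings, encoder stack, MLP head) can be sketched in PyTorch. This is a minimal illustration, not the repository's actual script; the layer sizes, depth, and names here are assumptions for CIFAR-10-sized inputs.

```python
import torch
import torch.nn as nn

class BertImageClassifier(nn.Module):
    """Sketch of a BERT-style encoder over image patches (hyperparameters assumed)."""
    def __init__(self, image_size=32, patch_size=4, in_channels=3,
                 hidden_dim=256, num_layers=6, num_heads=8, num_classes=10):
        super().__init__()
        num_patches = (image_size // patch_size) ** 2  # 64 patches for 32x32 CIFAR-10
        # Patch embedding: a strided convolution cuts the image into patches
        # and projects each one to the hidden dimension.
        self.patch_embed = nn.Conv2d(in_channels, hidden_dim,
                                     kernel_size=patch_size, stride=patch_size)
        # Learnable class token and positional embeddings (one extra slot for [CLS]).
        self.cls_token = nn.Parameter(torch.zeros(1, 1, hidden_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches + 1, hidden_dim))
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        # MLP head reads the class-token representation and emits class logits.
        self.mlp_head = nn.Sequential(
            nn.LayerNorm(hidden_dim), nn.Linear(hidden_dim, num_classes))

    def forward(self, x):
        x = self.patch_embed(x)             # (B, D, H/P, W/P)
        x = x.flatten(2).transpose(1, 2)    # (B, num_patches, D)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        x = self.encoder(x)
        return self.mlp_head(x[:, 0])       # first token of the last hidden state

logits = BertImageClassifier()(torch.randn(2, 3, 32, 32))
print(logits.shape)  # torch.Size([2, 10])
```

The class token plays the same role as `[CLS]` in BERT for text: it aggregates information from all patches through self-attention, so its final representation summarizes the whole image.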
Model Details
Model Description
This project was built for a greater understanding of how a Transformer can be used instead of convolutions or RNNs to classify images, obtaining useful representations analogous to the feature maps produced by CNN convolutions.
- Developed by: Michael Peres
- Model type: BERT + MLP Classifier Head
- Language(s) (NLP): Not applicable (image classification)
Model Sources
Uses
Classifying images from the CIFAR-10 dataset. The model achieved an accuracy of 80%.
How to Get Started with the Model
Run the model defined in the Python script file.
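Since the exact script interface isn't documented here, the following is a hedged sketch of a single training step. The `model` stand-in and the random tensors (substituting for a CIFAR-10 batch of 32x32 RGB images with 10 labels) are assumptions; swap in the classifier defined in the repository's script.

```python
import torch
import torch.nn as nn

# Hypothetical stand-in: any module mapping (B, 3, 32, 32) -> (B, 10) logits,
# such as the classifier defined in the repository's Python script.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
criterion = nn.CrossEntropyLoss()

# One training step on random tensors standing in for a CIFAR-10 batch.
images = torch.randn(8, 3, 32, 32)
labels = torch.randint(0, 10, (8,))
optimizer.zero_grad()
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
print(f"loss: {loss.item():.4f}")
```

In practice the real training loop would iterate over `torchvision.datasets.CIFAR10` with a `DataLoader` rather than random tensors.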
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).
- Hardware Type: NVIDIA A100 80GB PCIe
- Hours used: 0.5 hours
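A back-of-envelope estimate from the figures above can be computed directly. The A100 80GB PCIe has a TDP of roughly 300 W; the grid carbon intensity below is an assumed illustrative value, since the actual figure depends on where the training ran.

```python
# Rough upper-bound emissions estimate; TDP and grid intensity are assumptions.
tdp_kw = 0.300                # NVIDIA A100 80GB PCIe TDP (~300 W)
hours = 0.5                   # training time from this model card
energy_kwh = tdp_kw * hours   # 0.15 kWh at full power draw

grid_gco2_per_kwh = 400       # assumed grid carbon intensity (varies by region)
emissions_g = energy_kwh * grid_gco2_per_kwh
print(f"~{emissions_g:.0f} gCO2eq")  # ~60 gCO2eq
```

This is an upper bound, since the GPU rarely draws full TDP for the entire run; the Lacoste et al. calculator applies the same energy-times-intensity formula with region-specific intensities.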