---
language:
- en
datasets:
- imdb
metrics:
- accuracy
---

# bert-imdb-1hidden

## Model description

A `bert-base-uncased` model restricted to a single hidden layer and fine-tuned for sequence classification on the imdb dataset (loaded with the `datasets` library).
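
The card does not include the truncation step itself; below is a minimal sketch, assuming the layer count was reduced via the config's `num_hidden_layers` before fine-tuning (the exact approach used is not documented here):

```python
from transformers import AutoConfig, AutoModelForSequenceClassification

# Assumed approach: keep only the first encoder layer of bert-base-uncased.
# Weights for the remaining layers are simply not loaded.
config = AutoConfig.from_pretrained(
    "bert-base-uncased", num_hidden_layers=1, num_labels=2
)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", config=config
)
```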

## Intended uses & limitations

#### How to use
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

pretrained = "lannelin/bert-imdb-1hidden"

tokenizer = AutoTokenizer.from_pretrained(pretrained)
model = AutoModelForSequenceClassification.from_pretrained(pretrained)

# imdb label order: 0 = negative, 1 = positive
LABELS = ["negative", "positive"]

def get_sentiment(text: str) -> str:
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits.squeeze()
    return LABELS[logits.argmax().item()]

print(get_sentiment("What a terrible film!"))  # negative
```

#### Limitations and bias

No special consideration has been given to limitations and bias.

Any bias present in the imdb dataset may be reflected in the model's output.

## Training data

Initialised with [bert-base-uncased](https://huggingface.co/bert-base-uncased).

Fine-tuned on [imdb](https://huggingface.co/datasets/imdb).

## Training procedure

The model was fine-tuned for 1 epoch with a batch size of 64, a learning rate of 5e-5, and a maximum sequence length of 512.
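
The original training script is not part of this card; the sketch below reconstructs a plausible setup with the `transformers` `Trainer` using the hyperparameters above. The output directory and tokenisation details are assumptions:

```python
from datasets import load_dataset
from transformers import (
    AutoConfig,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Maximum sequence length of 512, as stated above.
    return tokenizer(
        batch["text"], truncation=True, max_length=512, padding="max_length"
    )

dataset = load_dataset("imdb").map(tokenize, batched=True)

# 1-hidden-layer variant, as in the truncation sketch above.
config = AutoConfig.from_pretrained(
    "bert-base-uncased", num_hidden_layers=1, num_labels=2
)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", config=config
)

args = TrainingArguments(
    output_dir="bert-imdb-1hidden",  # assumed output path
    num_train_epochs=1,              # 1 epoch
    per_device_train_batch_size=64,  # batch size 64
    learning_rate=5e-5,              # learning rate 5e-5
)

Trainer(model=model, args=args, train_dataset=dataset["train"]).train()
```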

## Eval results

Accuracy on the imdb test set: 0.87132
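
For reference, a minimal sketch of how this figure could be reproduced; the evaluation batch size of 64 is only an illustrative choice:

```python
import torch
from datasets import load_dataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

pretrained = "lannelin/bert-imdb-1hidden"
tokenizer = AutoTokenizer.from_pretrained(pretrained)
model = AutoModelForSequenceClassification.from_pretrained(pretrained)
model.eval()

test = load_dataset("imdb", split="test")

correct = 0
for i in range(0, len(test), 64):
    batch = test[i : i + 64]  # slicing a Dataset yields a dict of lists
    inputs = tokenizer(
        batch["text"], truncation=True, max_length=512,
        padding=True, return_tensors="pt",
    )
    with torch.no_grad():
        preds = model(**inputs).logits.argmax(dim=-1)
    correct += (preds == torch.tensor(batch["label"])).sum().item()

print(f"accuracy: {correct / len(test):.5f}")  # reported: 0.87132
```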