Model Card for Model ID

We fine-tune OPT-125 to generate positive movie reviews based on the IMDB dataset. The model gets the start of a real review and is tasked to produce positive continuations. To reward positive continuations we use a BERT classifier to analyse the sentiment of the produced sentences and use the classifier's outputs as rewards signals for PPO training

Downloads last month
120
Safetensors
Model size
125M params
Tensor type
F32
·
Inference Providers NEW
This model is not currently available via any of the supported Inference Providers.