GPT-2 SFT Model – Supervised Fine-Tuning for Positive Sentiment

This model is the first stage in a 3-step RLHF (Reinforcement Learning from Human Feedback) pipeline built on GPT-2. It has been fine-tuned on the Stanford Sentiment Treebank v2 (SST2) dataset so that it generates sentences with a positive sentiment tone.


Context

This model is part of the following RLHF project structure:

  1. Supervised Fine-Tuning (SFT) – Fine-tunes GPT-2 on sentiment-labeled SST2 sentences (only the positive samples are kept; see Dataset below).
  2. Reward Model (RM) – Trained to predict sentiment scores for generated text.
  3. PPO-based Optimization (RLHF) – The SFT model is further optimized with PPO to generate high-reward (positive) responses.

You are currently viewing the SFT model.
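To illustrate how this stage connects to stage 2, the sketch below scores a completion with an off-the-shelf SST-2 sentiment classifier standing in for the project's reward model. The classifier checkpoint and the example completion string are illustrative assumptions, not part of this repository.

from transformers import pipeline

# Stage 2 stand-in: score a completion with a public SST-2 classifier.
# "distilbert-base-uncased-finetuned-sst-2-english" is only an illustrative
# substitute for the project's own reward model.
reward_model = pipeline("sentiment-analysis",
                        model="distilbert-base-uncased-finetuned-sst-2-english")

sft_output = "The movie was a wonderful surprise from start to finish."  # hypothetical SFT completion
print(reward_model(sft_output))  # e.g. [{'label': 'POSITIVE', 'score': ...}]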


Model Objective

Train GPT-2 on sentiment-labeled sentences to mimic human-like, sentiment-aware generation.

  • Input: Sentence start (prompt)
  • Output: GPT-2 completes it with a positively-toned sentence.

Dataset

  • Source: stanfordnlp/sst2
  • Type: Movie review sentences
  • Labels: Positive (1) and Negative (0)
  • Preprocessing: Only positive samples (label 1) retained for SFT; see the sketch below
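The exact training script is not part of this card. The following is a minimal sketch of the preprocessing and SFT step, assuming the base gpt2 checkpoint; the output path, sequence length, and hyperparameters are placeholders rather than the values used for this model.

from datasets import load_dataset
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 ships without a pad token
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Keep only the positive SST2 samples (label == 1)
dataset = load_dataset("stanfordnlp/sst2", split="train")
dataset = dataset.filter(lambda x: x["label"] == 1)

def tokenize(batch):
    return tokenizer(batch["sentence"], truncation=True, max_length=128)

dataset = dataset.map(tokenize, batched=True, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="sft-model",          # illustrative output path
        num_train_epochs=1,              # illustrative hyperparameters
        per_device_train_batch_size=8,
    ),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()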
Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hub
model = AutoModelForCausalLM.from_pretrained("Saif10/sft-model")
tokenizer = AutoTokenizer.from_pretrained("Saif10/sft-model")

# Give the model a sentence start and let it complete it
prompt = "The movie was"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
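Greedy decoding, as above, always returns the same completion for a given prompt. For more varied positive completions, sampling can be enabled; the snippet below continues the example above, and the settings are illustrative rather than values recommended by this card.

outputs = model.generate(
    **inputs,
    max_new_tokens=30,
    do_sample=True,      # sample instead of greedy decoding
    top_p=0.9,           # nucleus sampling
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))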

Author

Saif Rathod

  • Hugging Face: Saif10
  • GitHub: Saif-rathod