# Quora Title Generator
A fine-tuned GPT-2 model specialized for generating Quora-style question titles. This model has been trained on a curated dataset of Quora question titles to learn the patterns and style of effective question formulation.
## Model Description
This model is a fine-tuned version of GPT-2 specifically designed to generate compelling and realistic Quora question titles. It can be used for:
- Question Title Generation: Generate realistic Quora-style questions
- Text Completion: Complete partial questions or topics into full titles
- Content Ideation: Generate ideas for question-based content
### Key Features
- 🎯 Specialized Training: Fine-tuned on high-quality Quora question titles
- 🎛️ Configurable Generation: Adjustable temperature, top-p, and top-k parameters
- 💡 Creative Output: Generates diverse and contextually appropriate questions
- 📊 Quality Dataset: Trained on the dexxiez/quora-titles dataset
## Usage

### Quick Start
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "dexxiez/quora-title-gen"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Generate a random title
input_text = "<|startoftext|>"
inputs = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(
    inputs,
    max_length=50,
    temperature=0.8,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
title = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(title)
```
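If `skip_special_tokens` is omitted (or a custom marker slips through decoding), the raw text still carries the model's special tokens. A small post-processing helper can tidy it up; this is a sketch, and `clean_title` is an illustrative name, not part of the model's API:

```python
# Special tokens used by this model (see Training Procedure below).
SPECIAL_TOKENS = ("<|startoftext|>", "<|endoftext|>", "<|pad|>")

def clean_title(raw: str) -> str:
    """Strip special-token markers and surrounding whitespace from a decoding."""
    for tok in SPECIAL_TOKENS:
        raw = raw.replace(tok, "")
    return raw.strip()

print(clean_title("<|startoftext|>How do neural networks learn?<|endoftext|>"))
# -> How do neural networks learn?
```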
### Text Completion
```python
# Complete a partial question (continues from the Quick Start setup)
prompt = "<|startoftext|>How to learn"
inputs = tokenizer.encode(prompt, return_tensors="pt")
outputs = model.generate(
    inputs,
    max_length=100,
    temperature=0.8,
    top_p=0.95,
    top_k=50,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
completed_title = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(completed_title)
```
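Because sampling continues until `max_length` or an end token, a completion can occasionally run past the first question. Trimming at the first `?` is one simple cleanup; `first_question` below is an illustrative helper, not part of this model:

```python
def first_question(text: str) -> str:
    """Keep only the text up to and including the first '?', if any."""
    idx = text.find("?")
    return text[: idx + 1] if idx != -1 else text

print(first_question("How to learn data science? And what about math?"))
# -> How to learn data science?
```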
## Training Details

### Training Data
- Dataset: dexxiez/quora-titles
- Content: Curated collection of high-quality Quora question titles
- Format: Plain text titles with special token formatting (`<|startoftext|>title<|endoftext|>`)
- Quality: Filtered for relevance and engagement
### Training Procedure
- Base Model: GPT-2 (124M parameters)
- Training Epochs: 3
- Train/Eval Split: 90/10
- Max Sequence Length: 128 tokens
- Special Tokens: `<|startoftext|>`, `<|endoftext|>`, `<|pad|>`
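Using those tokens, each title is wrapped into the training format described under Training Data. A minimal sketch of that formatting (the function name is illustrative):

```python
def to_training_example(title: str) -> str:
    """Wrap a raw title in the start/end special tokens used for training."""
    return f"<|startoftext|>{title}<|endoftext|>"

print(to_training_example("How do I start investing?"))
# -> <|startoftext|>How do I start investing?<|endoftext|>
```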
## Generation Parameters
| Parameter | Recommended | Description |
|---|---|---|
| `temperature` | 0.8-1.2 | Controls creativity (lower = more focused) |
| `top_p` | 0.95 | Nucleus sampling threshold |
| `top_k` | 50 | Top-k sampling limit |
| `max_length` | 50-100 | Maximum tokens to generate |
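These recommendations can be bundled into keyword presets and unpacked into `model.generate`. A sketch, with preset names and the specific value choices being illustrative picks from the ranges above:

```python
# Two hypothetical presets drawn from the recommended ranges above.
PRESETS = {
    "focused":  {"temperature": 0.8, "top_p": 0.95, "top_k": 50, "max_length": 50},
    "creative": {"temperature": 1.2, "top_p": 0.95, "top_k": 50, "max_length": 100},
}

# Usage (with a loaded model/tokenizer and encoded `inputs`):
# outputs = model.generate(inputs, do_sample=True,
#                          pad_token_id=tokenizer.eos_token_id,
#                          **PRESETS["focused"])
print(sorted(PRESETS))
# -> ['creative', 'focused']
```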
## Example Outputs
Random Generation:
- "What are the most effective strategies for learning a new programming language?"
- "How can introverts succeed in networking events?"
- "Why do some people find it easier to learn languages than others?"
Text Completion:
Input: "How to learn"
Output: "How to learn data science without a computer science background?"
Input: "What is the best way"
Output: "What is the best way to prepare for coding interviews at FAANG companies?"
## Dataset Information
This model was trained on the dexxiez/quora-titles dataset, which contains:
- High-quality Quora question titles
- Diverse topic coverage
- Natural language patterns typical of question formulation
- Preprocessed and filtered for training optimization
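The card does not publish the exact filtering rules, but a length-and-form filter of the following shape is one plausible sketch. This is entirely illustrative: `keep_title` and its thresholds are assumptions, not the dataset's actual pipeline:

```python
def keep_title(title: str, min_words: int = 3, max_words: int = 30) -> bool:
    """Keep well-formed questions of reasonable length (illustrative thresholds)."""
    n_words = len(title.split())
    return min_words <= n_words <= max_words and title.strip().endswith("?")

candidates = ["Why?", "What is the best way to learn Python?", "Random statement"]
print([t for t in candidates if keep_title(t)])
# -> ['What is the best way to learn Python?']
```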
## Limitations
- Optimized specifically for English Quora-style questions
- May occasionally generate incomplete or repetitive text
- Performance varies with generation parameters
- Best results with topics similar to training data distribution
## Ethical Considerations
- Generated content should be reviewed before publication
- May reflect biases present in the original Quora dataset
- Not intended for generating harmful or inappropriate content
- Use responsibly for content creation and ideation
## Citation
```bibtex
@misc{quora-title-gen,
  title={Quora Title Generator: Fine-tuned GPT-2 for Question Generation},
  author={dexxiez},
  year={2025},
  url={https://huggingface.co/dexxiez/quora-title-gen}
}
```
## License
MIT License
## Evaluation Results

- Perplexity (self-reported): 12.5
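Perplexity is the exponential of the mean per-token cross-entropy loss on the eval split; as a quick sanity check on the reported figure (the helper below is illustrative):

```python
import math

def perplexity(nll_per_token):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(nll_per_token) / len(nll_per_token))

# A mean token loss of about 2.526 nats corresponds to the reported ~12.5.
print(round(perplexity([2.526]), 1))
# -> 12.5
```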