# Quora Title Generator
A fine-tuned GPT-2 model specialized for generating Quora-style question titles. This model has been trained on a curated dataset of Quora question titles to learn the patterns and style of effective question formulation.
## Model Description
This model is a fine-tuned version of GPT-2 specifically designed to generate compelling and realistic Quora question titles. It can be used for:
- Question Title Generation: Generate realistic Quora-style questions
- Text Completion: Complete partial questions or topics into full titles
- Content Ideation: Generate ideas for question-based content
### Key Features
- 🎯 Specialized Training: Fine-tuned on high-quality Quora question titles
- 🎛️ Configurable Generation: Adjustable temperature, top-p, and top-k parameters
- 💡 Creative Output: Generates diverse and contextually appropriate questions
- 📊 Quality Dataset: Trained on the dexxiez/quora-titles dataset
## Usage

### Quick Start
```python
from transformers import GPT2LMHeadModel, GPT2Tokenizer

model_name = "dexxiez/quora-title-gen"
tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name)

# Generate a random title
input_text = "<|startoftext|>"
inputs = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(
    inputs,
    max_length=50,
    temperature=0.8,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
title = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(title)
```
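If `skip_special_tokens` is omitted (or a custom marker slips through decoding), the raw text still carries the model's special tokens. A small post-processing helper can tidy it up; this is a sketch, and `clean_title` is an illustrative name, not part of the model's API:

```python
# Special tokens used by this model (see Training Procedure below).
SPECIAL_TOKENS = ("<|startoftext|>", "<|endoftext|>", "<|pad|>")

def clean_title(raw: str) -> str:
    """Strip special-token markers and surrounding whitespace from a decoding."""
    for tok in SPECIAL_TOKENS:
        raw = raw.replace(tok, "")
    return raw.strip()

print(clean_title("<|startoftext|>How do neural networks learn?<|endoftext|>"))
# -> How do neural networks learn?
```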
### Text Completion
```python
# Complete a partial question (continues from the Quick Start setup)
prompt = "<|startoftext|>How to learn"
inputs = tokenizer.encode(prompt, return_tensors="pt")
outputs = model.generate(
    inputs,
    max_length=100,
    temperature=0.8,
    top_p=0.95,
    top_k=50,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
completed_title = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(completed_title)
```
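Because sampling continues until `max_length` or an end token, a completion can occasionally run past the first question. Trimming at the first `?` is one simple cleanup; `first_question` below is an illustrative helper, not part of this model:

```python
def first_question(text: str) -> str:
    """Keep only the text up to and including the first '?', if any."""
    idx = text.find("?")
    return text[: idx + 1] if idx != -1 else text

print(first_question("How to learn data science? And what about math?"))
# -> How to learn data science?
```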
## Training Details

### Training Data
- Dataset: dexxiez/quora-titles
- Content: Curated collection of high-quality Quora question titles
- Format: Plain text titles with special token formatting (`<|startoftext|>title<|endoftext|>`)
- Quality: Filtered for relevance and engagement
### Training Procedure
- Base Model: GPT-2 (124M parameters)
- Training Epochs: 3
- Train/Eval Split: 90/10
- Max Sequence Length: 128 tokens
- Special Tokens: `<|startoftext|>`, `<|endoftext|>`, `<|pad|>`
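Using those tokens, each title is wrapped into the training format described under Training Data. A minimal sketch of that formatting (the function name is illustrative):

```python
def to_training_example(title: str) -> str:
    """Wrap a raw title in the start/end special tokens used for training."""
    return f"<|startoftext|>{title}<|endoftext|>"

print(to_training_example("How do I start investing?"))
# -> <|startoftext|>How do I start investing?<|endoftext|>
```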
## Generation Parameters
| Parameter | Recommended | Description |
|---|---|---|
| `temperature` | 0.8-1.2 | Controls creativity (lower = more focused) |
| `top_p` | 0.95 | Nucleus sampling threshold |
| `top_k` | 50 | Top-k sampling limit |
| `max_length` | 50-100 | Maximum tokens to generate |
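These recommendations can be bundled into keyword presets and unpacked into `model.generate`. A sketch, with preset names and the specific value choices being illustrative picks from the ranges above:

```python
# Two hypothetical presets drawn from the recommended ranges above.
PRESETS = {
    "focused":  {"temperature": 0.8, "top_p": 0.95, "top_k": 50, "max_length": 50},
    "creative": {"temperature": 1.2, "top_p": 0.95, "top_k": 50, "max_length": 100},
}

# Usage (with a loaded model/tokenizer and encoded `inputs`):
# outputs = model.generate(inputs, do_sample=True,
#                          pad_token_id=tokenizer.eos_token_id,
#                          **PRESETS["focused"])
print(sorted(PRESETS))
# -> ['creative', 'focused']
```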
## Example Outputs
Random Generation:
- "What are the most effective strategies for learning a new programming language?"
- "How can introverts succeed in networking events?"
- "Why do some people find it easier to learn languages than others?"
Text Completion:
Input: "How to learn"
Output: "How to learn data science without a computer science background?"
Input: "What is the best way"
Output: "What is the best way to prepare for coding interviews at FAANG companies?"
## Dataset Information
This model was trained on the dexxiez/quora-titles dataset, which contains:
- High-quality Quora question titles
- Diverse topic coverage
- Natural language patterns typical of question formulation
- Preprocessed and filtered for training optimization
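The card does not publish the exact filtering rules, but a length-and-form filter of the following shape is one plausible sketch. This is entirely illustrative: `keep_title` and its thresholds are assumptions, not the dataset's actual pipeline:

```python
def keep_title(title: str, min_words: int = 3, max_words: int = 30) -> bool:
    """Keep well-formed questions of reasonable length (illustrative thresholds)."""
    n_words = len(title.split())
    return min_words <= n_words <= max_words and title.strip().endswith("?")

candidates = ["Why?", "What is the best way to learn Python?", "Random statement"]
print([t for t in candidates if keep_title(t)])
# -> ['What is the best way to learn Python?']
```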
## Limitations
- Optimized specifically for English Quora-style questions
- May occasionally generate incomplete or repetitive text
- Performance varies with generation parameters
- Best results with topics similar to training data distribution
## Ethical Considerations
- Generated content should be reviewed before publication
- May reflect biases present in the original Quora dataset
- Not intended for generating harmful or inappropriate content
- Use responsibly for content creation and ideation
## Citation
```bibtex
@misc{quora-title-gen,
  title={Quora Title Generator: Fine-tuned GPT-2 for Question Generation},
  author={dexxiez},
  year={2025},
  url={https://huggingface.co/dexxiez/quora-title-gen}
}
```
## License
MIT License
## Evaluation Results

- Perplexity (self-reported): 12.5
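Perplexity is the exponential of the mean per-token cross-entropy loss on the eval split; as a quick sanity check on the reported figure (the helper below is illustrative):

```python
import math

def perplexity(nll_per_token):
    """Perplexity = exp(mean negative log-likelihood per token, in nats)."""
    return math.exp(sum(nll_per_token) / len(nll_per_token))

# A mean token loss of about 2.526 nats corresponds to the reported ~12.5.
print(round(perplexity([2.526]), 1))
# -> 12.5
```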