Model Card for gemma-3b-it-paul-graham-lora
This is a fine-tuned version of the google/gemma-3b-it model using LoRA (Low-Rank Adaptation), trained to generate essays in the style of Paul Graham.
Model Details
Model Description
This model adapts google/gemma-3b-it using LoRA fine-tuning on a collection of Paul Graham's essays. It is intended for creative writing and for emulating Paul Graham's tone and argumentative structure.
- Developed by: Arthur Mello
- Model type: Causal Language Model (Decoder-only)
- Language(s) (NLP): English
- Finetuned from model: google/gemma-3b-it
Uses
Direct Use
The model can be used to generate essays or long-form answers in the style of Paul Graham. It is useful for educational, literary, or entertainment purposes.
Out-of-Scope Use
- Factual question answering (it may hallucinate)
- Use in high-risk domains (e.g., legal, medical)
- Emulating real individuals without clear consent
Bias, Risks, and Limitations
The model inherits biases from the original Gemma model and the Paul Graham dataset. Since the dataset reflects a single author's perspective, generated outputs may lean toward specific ideological or philosophical stances.
Recommendations
Users should review outputs carefully and avoid using the model as a source of truth or factual information.
How to Get Started with the Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Loading the adapter repository directly requires the `peft` package
model = AutoModelForCausalLM.from_pretrained("arthurmello/gemma-3b-it-paul-graham-lora").to("cuda")
tokenizer = AutoTokenizer.from_pretrained("arthurmello/gemma-3b-it-paul-graham-lora")

# Gemma chat-style prompt
prompt = "<start_of_turn>user\nWrite an essay on the future of AI<end_of_turn><eos>\n<start_of_turn>model\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
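Because this repository contains a LoRA adapter, you can also attach it to the base model explicitly with PEFT, for example if you want to merge the weights for deployment. This is a minimal sketch using standard PEFT calls, not part of the original instructions above:

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the base model, then attach the LoRA adapter from this repository
base = AutoModelForCausalLM.from_pretrained("google/gemma-3b-it").to("cuda")
model = PeftModel.from_pretrained(base, "arthurmello/gemma-3b-it-paul-graham-lora")

# Optionally fold the adapter into the base weights for standalone inference
model = model.merge_and_unload()
```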
Training Details
Training Data
The model was fine-tuned on a cleaned dataset of Paul Graham’s essays, formatted into a chat-like structure using a custom chat_template.
Dataset: arthurmello/paul-graham-essays
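The preprocessing script is not reproduced here, but a minimal sketch of how one essay record might be wrapped into Gemma's turn format could look like the following; the field names and the instruction wording are assumptions, not the exact template used for training:

```python
def format_essay(title: str, essay_text: str) -> str:
    # Wrap a single essay into a Gemma-style user/model exchange (prompt wording is illustrative)
    return (
        "<start_of_turn>user\n"
        f"Write an essay titled '{title}' in the style of Paul Graham<end_of_turn>\n"
        "<start_of_turn>model\n"
        f"{essay_text}<end_of_turn>"
    )

print(format_essay("How to Do Great Work", "The first step is..."))
```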
Training Procedure
Training Hyperparameters
- LoRA rank: 16
- LoRA alpha: 64
- LoRA dropout: 0.05
- Batch size: 1 (gradient accumulation = 4)
- Max sequence length: 1500 tokens
- Learning rate: 1e-4
- Precision: bf16
- Epochs: 10
- Scheduler: cosine
- Weight decay: 0.1
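For reference, the hyperparameters above correspond roughly to the PEFT/TRL setup sketched below. This is a minimal sketch, not the exact training script: the dataset split, text-field handling, and argument names (which vary across TRL versions) are assumptions.

```python
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Assumed split name; the card only states the dataset ID
train_ds = load_dataset("arthurmello/paul-graham-essays", split="train")

# LoRA parameters from this card
lora_config = LoraConfig(r=16, lora_alpha=64, lora_dropout=0.05, task_type="CAUSAL_LM")

args = SFTConfig(
    output_dir="gemma-3b-it-paul-graham-lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    learning_rate=1e-4,
    num_train_epochs=10,
    lr_scheduler_type="cosine",
    weight_decay=0.1,
    bf16=True,
    max_seq_length=1500,
)

trainer = SFTTrainer(
    model="google/gemma-3b-it",  # base model; LoRA adapters are injected via peft_config
    args=args,
    train_dataset=train_ds,
    peft_config=lora_config,
)
trainer.train()
```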
Evaluation
Metrics
- Training loss: Decreased from 2.57 to 1.80
- Validation loss: Plateaued around 2.5–2.6, indicating stability with limited overfitting
Environmental Impact
Carbon emissions can be estimated using the Machine Learning Impact calculator.
- Hardware Type: L4 GPU
- Hours used: ~0.5 (about 30 minutes of training)
- Cloud Provider: Google Colab Pro
- Compute Region: Europe
Technical Specifications
Model Architecture and Objective
Decoder-only transformer model with LoRA adapters applied to key projection layers, attention outputs, and feedforward projections.
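The exact set of adapted modules is recorded in the adapter configuration shipped with the repository; it can be inspected with PEFT, as in this small sketch (assuming standard PEFT adapter files):

```python
from peft import PeftConfig

# Read the LoRA configuration stored alongside the adapter weights
config = PeftConfig.from_pretrained("arthurmello/gemma-3b-it-paul-graham-lora")
print(config.target_modules)                              # which projection layers carry adapters
print(config.r, config.lora_alpha, config.lora_dropout)   # 16, 64, 0.05 per this card
```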
Compute Infrastructure
Hardware
- 1× NVIDIA L4 (22.5GB)
Software
- Python 3.11
- PyTorch
- Hugging Face Transformers, PEFT, TRL
Model Card Contact
For questions or feedback, contact @arthurmello on Hugging Face or reach out via LinkedIn.