Model Card for gemma-3-1b-it-paul-graham

This is a fine-tuned version of the google/gemma-3-1b-it model, adapted with LoRA (Low-Rank Adaptation) and trained to generate essays in the style of Paul Graham.

Model Details

Model Description

This model adapts google/gemma-3-1b-it using LoRA fine-tuning on a collection of Paul Graham's essays. It is intended for creative writing and stylistic emulation of Paul Graham's tone and argumentative structure.

  • Developed by: Arthur Mello
  • Model type: Causal Language Model (Decoder-only)
  • Language(s) (NLP): English
  • Finetuned from model: google/gemma-3-1b-it

Uses

Direct Use

The model can be used to generate essays or long-form answers in the style of Paul Graham. It is useful for educational, literary, or entertainment purposes.

Out-of-Scope Use

  • Factual question answering (it may hallucinate)
  • Use in high-risk domains (e.g., legal, medical)
  • Emulating real individuals without clear consent

Bias, Risks, and Limitations

The model inherits biases from the original Gemma model and the Paul Graham dataset. Since the dataset reflects a single author's perspective, generated outputs may lean toward specific ideological or philosophical stances.

Recommendations

Users should review outputs carefully and avoid using the model as a source of truth or factual information.

How to Get Started with the Model

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Loading the adapter repository directly requires the peft package;
# transformers fetches the base model and applies the LoRA weights on top.
model = AutoModelForCausalLM.from_pretrained(
    "arthurmello/gemma-3-1b-it-paul-graham",
    torch_dtype=torch.bfloat16,  # matches the bf16 training precision
).to("cuda")
tokenizer = AutoTokenizer.from_pretrained("arthurmello/gemma-3-1b-it-paul-graham")

# Gemma chat-style prompt; the trailing model turn makes generation continue as the essay
prompt = "<start_of_turn>user\nWrite an essay on the future of AI<end_of_turn><eos>\n<start_of_turn>model\n"
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
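
If the uploaded tokenizer ships with the custom chat_template used during training, the prompt can also be built with apply_chat_template instead of writing the special tokens by hand (a minimal sketch, assuming the template is present in the repository's tokenizer config):

# Hypothetical alternative: let the tokenizer's chat template build the prompt.
messages = [{"role": "user", "content": "Write an essay on the future of AI"}]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # appends the model turn so generation continues from it
    return_tensors="pt",
).to("cuda")
outputs = model.generate(input_ids, max_new_tokens=300)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))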

Training Details

Training Data

The model was fine-tuned on a cleaned dataset of Paul Graham’s essays, formatted into a chat-like structure using a custom chat_template.

Dataset: arthurmello/paul-graham-essays
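
As a rough illustration of that formatting step, each essay can be wrapped as a single user/assistant exchange before tokenization; the split name and the column names title and text below are assumptions, not the dataset's documented schema:

from datasets import load_dataset

dataset = load_dataset("arthurmello/paul-graham-essays", split="train")

# Illustrative only: wrap each essay as one user/assistant exchange.
# The actual column names and prompt wording used for training may differ.
def to_chat(example):
    return {
        "messages": [
            {"role": "user", "content": f"Write an essay titled: {example['title']}"},
            {"role": "assistant", "content": example["text"]},
        ]
    }

dataset = dataset.map(to_chat)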

Training Procedure

Training Hyperparameters

  • LoRA rank: 16
  • LoRA alpha: 64
  • LoRA dropout: 0.05
  • Batch size: 1 (gradient accumulation = 4)
  • Max sequence length: 1500 tokens
  • Learning rate: 1e-4
  • Precision: bf16
  • Epochs: 10
  • Scheduler: cosine
  • Weight decay: 0.1
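
These settings roughly correspond to a PEFT + TRL configuration along the following lines (a sketch rather than the exact training script; the target_modules list is an assumption based on the architecture notes further down, and the max-length argument name varies across TRL versions):

from peft import LoraConfig
from trl import SFTConfig

# LoRA settings matching the values listed above; target_modules are assumed, not confirmed.
peft_config = LoraConfig(
    r=16,
    lora_alpha=64,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Optimisation settings matching the values listed above.
training_args = SFTConfig(
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    max_seq_length=1500,  # called max_length in newer TRL releases
    learning_rate=1e-4,
    bf16=True,
    num_train_epochs=10,
    lr_scheduler_type="cosine",
    weight_decay=0.1,
)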

Evaluation

Metrics

  • Training loss: Decreased from 2.57 to 1.80
  • Validation loss: Plateaued around 2.5–2.6, indicating stability with limited overfitting
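
For intuition, cross-entropy losses in this range correspond to perplexities of roughly e^1.80 ≈ 6.0 on the training set and e^2.5–e^2.6 ≈ 12.2–13.5 on validation (figures derived from the reported losses above, not separately measured).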

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator.

  • Hardware Type: L4 GPU
  • Hours used: ~0.5 (about 30 minutes)
  • Cloud Provider: Google Colab Pro
  • Compute Region: Europe

Technical Specifications

Model Architecture and Objective

Decoder-only transformer model with LoRA adapters applied to key projection layers, attention outputs, and feedforward projections.
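
If the repository is distributed as a LoRA adapter rather than merged weights, the adapter can be folded into the base model for deployment without a PEFT dependency (a minimal sketch, assuming a standard PEFT adapter layout):

from peft import AutoPeftModelForCausalLM

# Load the base model plus LoRA adapter, then fold the adapter into the base weights.
model = AutoPeftModelForCausalLM.from_pretrained("arthurmello/gemma-3-1b-it-paul-graham")
merged = model.merge_and_unload()
merged.save_pretrained("gemma-3-1b-it-paul-graham-merged")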

Compute Infrastructure

Hardware

  • 1× NVIDIA L4 (22.5GB)

Software

  • Python 3.11
  • PyTorch
  • Hugging Face Transformers, PEFT, TRL

Model Card Contact

For questions or feedback, contact @arthurmello on Hugging Face or reach out via LinkedIn.
