📰 Penny‑1.7B · Irish Penny Journal Style (GRPO)

This model is a style transfer of the Irish Penny Journal (1840) onto SmolLM2, trained with GRPO. For example, here is its response to "What is the capital of France?":

> Verily, in the grand tapestry of European monarchies, the city of Paris, the seat of the mighty Emperor Napoleon, holds a place of singular distinction. This city, which hath borne the name of 'La Ville Lumière' for nigh on two centuries, doth shine forth as a beacon of art, culture, and intellect, its very existence a testament to the ingenuity and brilliance of its people. And so, it is with great honour and reverence that we declare Paris, the majestic capital of the French realm, to be our noble question's answer.

Penny‑1.7B is a 1.7 billion‑parameter causal language model fine‑tuned with Group Relative Policy Optimization (GRPO) to emulate the 19ᵗʰ‑century prose of the Irish Penny Journal (1840). The RL stage ran for 6,800 policy steps, using a reward model trained to classify sentences as original IPJ vs modern translation. Maximizing this score nudges generations toward authentic Victorian‑era diction while retaining the general reasoning ability of the base SmolLM2 model.

✨ Key Facts

| | |
|---|---|
| Base model | SmolLM2‑1.7B‑Instruct |
| Tuning method | GRPO (RL) |
| Policy steps | 6,800 |
| Reward model | MiniLM2 L6 384H classifier |
| Optimiser | AdamW 8‑bit · lr 5 × 10⁻⁶ |
| Hardware | 1× RTX A6000 (48 GB) · bf16 |

🔬 Training & Data

  • Corpora

    • Irish Penny Journal 1840 (dleemiller/irish_penny_journal)
    • Modernized translations produced via rule‑based spelling normalisation plus manual post‑edit
  • Reward = the classifier's score for the "original IPJ" class
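
GRPO dispenses with a learned value function: for each prompt, the policy samples a group of completions, each is scored by the reward model, and rewards are normalized within the group to form advantages. A minimal sketch of that group‑relative step (the reward values below are placeholders standing in for classifier scores, not outputs of the actual reward model):

```python
def group_relative_advantages(rewards, eps=1e-8):
    """Normalize one group of rewards to zero mean / unit std (GRPO-style)."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# e.g. classifier scores for four sampled completions of one prompt
advantages = group_relative_advantages([0.9, 0.2, 0.6, 0.3])
```

Completions scoring above the group mean receive positive advantages and are reinforced; the rest are pushed down, nudging the policy toward higher classifier scores.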

➡️ Intended Uses

  • Creative writing, educational content, or stylistic pastiche in Victorian‑era Irish English.
  • Research on RL‑based style transfer.

Not recommended for: contemporary fact Q&A or contexts where archaic language could mislead readers.

⚠️ Limitations & Biases

19ᵗʰ‑century texts can contain outdated social views. Outputs may reflect such biases or archaic spelling. Always review generations before use.

💻 Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "dleemiller/Penny-1.7B"
device = "cuda"  # or "cpu"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# For multi‑GPU: install accelerate and use device_map="auto"
model = AutoModelForCausalLM.from_pretrained(checkpoint).to(device)

messages = [{"role": "user", "content": "What is the capital of France?"}]
input_text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)

outputs = model.generate(
    inputs,
    max_new_tokens=512,
    temperature=0.8,
    top_p=0.9,
    do_sample=True,
)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

📝 Citation

```bibtex
@software{penny_1.7b_2025,
  title        = {Penny-1.7B: Irish Penny Journal Style Language Model},
  author       = {Lee Miller},
  year         = {2025},
  publisher    = {Hugging Face},
  url          = {https://huggingface.co/dleemiller/Penny-1.7B}
}
```

📜 License

Apache 2.0 (inherits from the base model).
