# Model Card for ParentPalAI
ParentPalAI fine-tunes mistralai/Mistral-7B-Instruct-v0.3 using Direct Preference Optimization (DPO) combined with Parameter-Efficient Fine-Tuning (PEFT) via Quantized Low-Rank Adaptation (QLoRA).
The goal is to enhance empathy and emotional resonance in parenting-related conversations while studying the trade-offs between emotional alignment, clarity, and factual quality.
## Model Details
### Model Description
Goal: Improve the empathy and emotional resonance of parenting-focused LLM responses while analyzing the impact of alignment techniques on overall quality.
Action: Fine-tuned Mistral-7B-Instruct-v0.3 on ~1K synthetic preference pairs using Direct Preference Optimization (DPO) with Parameter-Efficient Fine-Tuning (PEFT), specifically Quantized Low-Rank Adaptation (QLoRA). Built a complete alignment workflow covering prompt engineering, preference-pair generation, QLoRA fine-tuning, and LLM-as-a-Judge (GPT-4o) evaluation with custom empathy and quality metrics.
Result: Drove a +65-point increase in empathy win rate (from 11% to 76%), revealing meaningful trade-offs between emotional alignment and both clarity and overall quality, which inform subsequent multi-objective fine-tuning strategies.
- Developed by: Prerna Chikersal
- Model type: PEFT (QLoRA) adapter for causal language modeling
- Language(s) (NLP): English
- License: Apache 2.0
- Finetuned from model: Mistral-7B-Instruct-v0.3
### Model Sources
- Repository: https://github.com/prernaa/ParentPalAI (includes sample responses)
## Uses
ParentPalAI was developed for research and educational purposes, primarily to explore:
- How fine-tuning on synthetic preference pairs affects empathy, tone, and relatability in LLM responses.
- The trade-off between emotional resonance and clarity/helpfulness in aligned models.
- Methods for enhancing warmth and naturalness in conversational AI through DPO and PEFT (QLoRA).
Researchers, educators, and ML practitioners can use this model to:
- Study fine-tuning effects on emotional style and alignment.
- Prototype empathy-driven LLMs for social or psychological dialogue settings.
### Direct Use
You can use ParentPalAI to:
- Generate empathetic, supportive, and warm responses to parenting-related prompts.
- Experiment with style transfer and tone control in conversational AI.
- Test LLM evaluation metrics (e.g., LLM-as-a-Judge) for empathy, tone, and clarity.
Example (assuming `model` and `tokenizer` have already been loaded, as shown in "How to Get Started with the Model" below):

```python
prompt = "My toddler cries every night before bed. What should I do?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
### Out-of-Scope Use
This model is not suitable for:
- Clinical, medical, or therapeutic advice.
- Real-world parenting counseling or behavioral guidance.
- Any deployment scenario involving high-stakes decision-making, mental health support, or childcare recommendations.
- Content moderation, bias-free generation, or factual question answering; the Reddit dataset may contain noisy or biased language.
## Bias, Risks, and Limitations
The model should not be used for real parenting, psychological, or medical guidance. Instead, it serves as a research tool for exploring empathy and tone in language models, and all outputs should be reviewed critically before use.
### Recommendations
- Always pair this adapter with the base model mistralai/Mistral-7B-Instruct-v0.3.
- Use bfloat16 precision and FlashAttention 2 on A100 or H100 GPUs for optimal speed.
- Evaluate generations qualitatively for empathy, clarity, and factual accuracy before any downstream use.
- For production or sensitive domains, fine-tune further using curated, high-quality data or Direct Preference Optimization (DPO) to balance warmth and helpfulness.
## How to Get Started with the Model
This repository only contains PEFT adapter weights, not the full 7B model. To use the model, you must load the base Mistral model and apply this adapter.
- Base model: mistralai/Mistral-7B-Instruct-v0.3
- Fine-tuning method: QLoRA (PEFT)
- Training data: synthetic preference-pair data generated with GPT
- Goal: Explore how optimizing for empathy and overall quality with DPO affects empathy and warmth in responses.
```python
# LOAD THE BASE MODEL IN 4-BIT PRECISION WITH DOUBLE QUANTIZATION
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

BASE_MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.3"
HF_TOKEN = "hf_..."  # your Hugging Face access token (needed to download the gated Mistral weights)

torch.backends.cuda.matmul.allow_tf32 = True
torch.set_float32_matmul_precision("high")

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                # loads base model in 4-bit precision
    bnb_4bit_use_double_quant=True,   # double quantization saves VRAM
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID, token=HF_TOKEN)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # Mistral has no dedicated pad token

model = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL_ID,
    quantization_config=bnb_config,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    attn_implementation="flash_attention_2",  # FA2 is fastest on A100
    token=HF_TOKEN,
)
model.config.pad_token_id = tokenizer.pad_token_id
model.generation_config.pad_token_id = tokenizer.pad_token_id
```
### Load the ParentPalAI PEFT Adapter
```python
from peft import PeftModel

model = PeftModel.from_pretrained(model, "prernac1/parentpalai")
```
### Inference
prompt = """Youβre a supportive parent responding to another parent who is struggling with toddler tantrums."""
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
**inputs,
max_new_tokens=300,
temperature=0.7,
top_p=0.9
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
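Mistral-7B-Instruct models are trained with the [INST] chat format, so if generations look off, wrapping the prompt with the tokenizer's chat template may help. A minimal sketch using the same `model` and `tokenizer` as above (the framing of the user message is an illustrative choice, not the card's prompt):

```python
messages = [
    {"role": "user", "content": "You're a supportive parent. Another parent says: "
                                "'My toddler has daily tantrums and I'm exhausted.' How would you respond?"}
]
# apply_chat_template adds the [INST] ... [/INST] wrapper that Mistral-Instruct expects
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(input_ids, max_new_tokens=300, do_sample=True, temperature=0.7, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```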
## Training Details
### Training Data
- V1 (optimizing for empathy): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v1.jsonl
- V2 (optimizing for overall quality): https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_dpo_labels_v2.jsonl
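A minimal sketch of loading one of these preference-pair files with the `datasets` library, assuming the standard DPO schema of `prompt`, `chosen`, and `rejected` fields (the exact field names in these files are an assumption; inspect a record first):

```python
from datasets import load_dataset

# Load the V1 (empathy-optimized) preference pairs from a local copy of the JSONL file
dpo_data = load_dataset("json", data_files="dpo_dataset_dpo_labels_v1.jsonl", split="train")

example = dpo_data[0]
print(example.keys())           # expected (assumed): prompt, chosen, rejected
print(example["prompt"][:200])  # assumed field name; adjust to the actual schema
```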
### Training Procedure
PEFT with QLoRA (4-bit precision) on an A100 GPU in Google Colab.
#### Training Hyperparameters
ParentPalAI was fine-tuned using Quantized Low-Rank Adaptation (QLoRA) on the base model mistralai/Mistral-7B-Instruct-v0.3. The model was trained in 4-bit precision with double quantization (NF4) and bfloat16 compute, optimized for VRAM efficiency on T4 and A100 GPUs; training was run on an A100. A sketch of how these hyperparameters fit together follows the list below.
- Training method: QLoRA (Parameter-Efficient Fine-Tuning)
- Precision: 4-bit quantization (NF4) with double quantization, compute in bfloat16
- Optimizer: paged_adamw_8bit
- Scheduler: cosine learning rate decay with 3% warmup
- Batching: effective batch size of 24 (per_device_train_batch_size=6, gradient_accumulation_steps=4)
- Epochs: 1–2 (best checkpoint after 1 epoch, ~40 steps)
- LoRA dropout: 0.15
- LoRA rank: 8 (r=8), scaling factor alpha=32
- Trainable parameters: ~0.18% of total model parameters
- Gradient checkpointing: enabled
- Attention implementation: FlashAttention 2
- Mixed precision: bfloat16
- Base precision (non-quantized runs): bfloat16
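A hedged sketch of how these hyperparameters could be wired together with TRL's DPOTrainer and a PEFT LoraConfig. This is an illustrative reconstruction, not the exact training script: `target_modules`, the DPO `beta`, and the learning rate are assumptions not stated in this card, and argument names vary slightly across TRL versions.

```python
from peft import LoraConfig
from trl import DPOConfig, DPOTrainer

# LoRA settings matching the card: r=8, alpha=32, dropout=0.15
peft_config = LoraConfig(
    r=8,
    lora_alpha=32,
    lora_dropout=0.15,
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption: attention projections
)

# Optimizer, scheduler, and batching as listed above; beta and learning rate are assumptions
training_args = DPOConfig(
    output_dir="parentpalai-dpo",
    per_device_train_batch_size=6,
    gradient_accumulation_steps=4,   # effective batch size of 24
    num_train_epochs=1,
    learning_rate=2e-5,              # assumption: not stated in the card
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    optim="paged_adamw_8bit",
    bf16=True,
    gradient_checkpointing=True,
    beta=0.1,                        # DPO temperature; assumption
    logging_steps=5,
)

trainer = DPOTrainer(
    model=model,              # 4-bit base model loaded as in "How to Get Started"
    ref_model=None,           # with PEFT, the frozen base model serves as the reference
    args=training_args,
    train_dataset=dpo_data,   # preference pairs loaded as sketched above
    tokenizer=tokenizer,
    peft_config=peft_config,
)
trainer.train()
```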
## Evaluation
### Testing Data, Factors & Metrics
#### Testing Data
Here is the test dataset generated by GPT-4o: https://github.com/prernaa/ParentPalAI/blob/main/data_to_share/dpo_dataset_test.jsonl
#### Metrics
ParentPalAI was evaluated using GPT-4o as an LLM-as-a-Judge, comparing its responses (System B) to the base model mistralai/Mistral-7B-Instruct-v0.3 (System A).
Each model pair was scored on six qualitative dimensions (empathy, clarity, comprehensiveness, practicality, adoptability, and overall quality) across 100 GPT-generated parenting prompts.
Two variants of ParentPalAI were tested to understand alignment trade-offs.
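A minimal sketch of this kind of pairwise LLM-as-a-Judge call with the OpenAI API. The exact rubric, prompt wording, and output parsing used for ParentPalAI are not reproduced here; the judging prompt below is an illustrative assumption.

```python
from openai import OpenAI

client = OpenAI()  # requires OPENAI_API_KEY in the environment

def judge_pair(prompt: str, response_a: str, response_b: str) -> str:
    """Ask GPT-4o to pick a per-dimension winner for one test prompt (illustrative rubric)."""
    judge_prompt = f"""You are comparing two assistant responses to a parenting question.

Question: {prompt}

Response A: {response_a}

Response B: {response_b}

For each dimension (empathy, clarity, comprehensiveness, practicality, adoptability, overall),
name the winner as "A" or "B", one per line, formatted as "dimension: winner"."""
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": judge_prompt}],
        temperature=0,  # deterministic judging
    )
    return completion.choices[0].message.content

# Aggregating the per-dimension winners over the 100 test prompts yields the win rates reported below.
```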
### Results
#### Version 1 (Empathy-Focused DPO)
(optimized for empathy while also considering overall quality)
| System | Empathy win rate | Clarity win rate | Overall win rate |
|---|---|---|---|
| System A (base model) | 0.1066 | 0.8883 | 0.7462 |
| System B (ParentPalAI V1) | 0.7640 | 0.1117 | 0.2538 |
Findings:
- ParentPalAI V1 dramatically increased empathy (+65 points, from ~11% to ~76%).
- The model produced a noticeably warmer, more supportive tone but with reduced clarity and practical helpfulness.
- Despite lower clarity, some responses were judged as more relatable and emotionally resonant, showing that empathic alignment can enhance perceived authenticity even when utility drops.
#### Version 2 (Overall-Quality-Focused DPO)
(optimized only for overall win rate)
| System | Empathy win rate | Clarity win rate | Overall win rate |
|---|---|---|---|
| System A (base model) | 0.4340 | 0.8604 | 0.6371 |
| System B (ParentPalAI V2) | 0.2843 | 0.1371 | 0.3629 |
Findings:
- Optimizing purely for overall quality partially recovered clarity and practicality but reduced empathic warmth (43% → 28%).
- The model balanced tone and coherence better than V1 but sounded less emotionally attuned.
- This highlights a core alignment tension: maximizing clarity and factual strength can come at the expense of empathy and perceived connection.
#### Summary
- V1: Highest empathy and relatability, weaker clarity -> ideal for exploring affective alignment.
- V2: More balanced but emotionally flatter -> better for generalized instruction following.
- Empathy and clarity appear inversely correlated when optimizing single-objective DPO.
- Future work will explore multi-objective DPO and reinforcement from human preferences to jointly optimize warmth, clarity, and factual helpfulness.
## Environmental Impact
Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
- Hardware Type: A100
- Hours used: 5
- Cloud Provider: Google Colab
- Compute Region: USA
- Carbon Emitted: [More Information Needed]
## Citation

```bibtex
@misc{chikersal2025parentpalai,
  author = {Prerna Chikersal},
  title = {ParentPalAI: Empathic Fine-Tuning of LLMs using Direct Preference Optimization (DPO) with QLoRA},
  year = {2025},
  publisher = {GitHub},
  howpublished = {\url{https://github.com/prernaa/ParentPalAI}},
  note = {Hugging Face Model: https://huggingface.co/prernac1/parentpalai}
}
```
## Model Card Contact
Prerna Chikersal: [email protected]