---
license: apache-2.0
datasets:
- arafatanam/Student-Mental-Health-Counseling-50K
language:
- en
base_model:
- meta-llama/Llama-3.2-3B-Instruct
tags:
- mental-health
- student-focused
- llama-3
- chatbot
---
# LLaMA-3.2-3B-Instruct Fine-Tuned for Student Mental Health Counseling

## Model Overview

This model is a fine-tuned version of [`meta-llama/Llama-3.2-3B-Instruct`](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct), customized to support mental health and counseling conversations. It is adapted to respond compassionately and contextually to student mental health needs, suitable for AI chatbots, support tools, and educational assistance platforms.

---

## Dataset

Fine-tuned using two merged and preprocessed datasets designed for mental health support:

- [`arafatanam/Student-Mental-Health-Counseling-50K`](https://huggingface.co/datasets/arafatanam/Student-Mental-Health-Counseling-50K) — Student-focused counseling data (50,000 samples)

---

## Training Configuration

- **Framework**: 🤗 Transformers + Unsloth + LoRA
- **Hardware**: Dual NVIDIA T4 GPUs (Kaggle Notebooks)
- **Fine-tuning Technique**: Parameter-efficient fine-tuning with LoRA
- **Tokenizer**: LLaMA-compatible tokenizer
- **Precision**: FP16 (with fallback for BF16 if unsupported)

## Training Arguments

| Argument                      | Value      |
| ----------------------------- | ---------- |
| `max_seq_length`              | 512        |
| `per_device_train_batch_size` | 1          |
| `gradient_accumulation_steps` | 8          |
| `num_train_epochs`            | 1          |
| `learning_rate`               | 2e-4       |
| `warmup_ratio`                | 0.01       |
| `optimizer`                   | adamw_8bit |
| `lr_scheduler_type`           | cosine     |
| `weight_decay`                | 0.01       |
| `max_grad_norm`               | 0.5        |
| `eval_steps`                  | 200        |
| `save_steps`                  | 1000       |
| `logging_steps`               | 100        |

---

## Training Metrics

| Metric              | Value             |
| ------------------- | ----------------- |
| Train Loss (Avg)    | 5.1548            |
| Final Step Loss     | 3.7192            |
| Training Time       | 24,210.73 seconds |
| FLOPs (Total)       | 146.28 Trillion   |
| Global Steps        | 3,125             |
| Epochs              | 1                 |
| Samples/Second      | 2.065             |
| Steps/Second        | 0.129             |
| Gradient Norm       | 374.22            |
| Final Learning Rate | 8.67e-8           |

---

## Use Cases

This model is suitable for:

- 🤖 **AI-based mental health chatbot platforms**
- 🧘 **Student well-being assistants**
- 📚 **Mental health education tools**
- 🗣️ **Conversational agents for emotional support**
- 🧾 **Therapeutic and wellness content generation**

---

## Limitations & Considerations

- This model **does not replace professional mental health services**.
- Best suited for **non-clinical support tools** in schools, universities, or awareness platforms.
- Use in real-world applications should include **human moderation** for critical or sensitive cases.