Neurona - a Spanish workplace violence prevention and sexual harassment support model

Neurona is a specialized fine-tuned version of Meta's Llama-3.3-70B-Instruct model, designed for Spanish-language conversations about workplace violence prevention and sexual harassment support. This PEFT (LoRA) adapter provides empathetic, professional, and informative responses to users seeking guidance and support in workplace safety situations.

The adapter was fine-tuned with QLoRA on an NVIDIA H100 GPU using a curated dataset of workplace violence prevention conversations.

The repo with the finetuning scripts can be found here.

Model Details

Model Type: PEFT LoRA Adapter
Base Model: meta-llama/Llama-3.3-70B-Instruct
Fine-tuning Method: QLoRA (4-bit Quantized Low-Rank Adaptation)
Language: Spanish (es)
Domain: Workplace safety, violence prevention, and sexual harassment support
License: Llama 3.3 Community License
Parameters: LoRA adapter (~150M trainable parameters)

Intended Use

This model is intended to be used as a conversational AI assistant to provide:

Educational information about workplace violence and harassment.
Guidance on reporting procedures and seeking help.
Empathetic support for individuals in difficult workplace situations.

Out-of-Scope Use

This model is not a substitute for professional legal, psychological, or crisis intervention services. It should not be used for:

Providing legal advice.
Medical or psychological diagnosis.
Emergency or crisis situations.

How to Use

Requirements

pip install transformers torch peft bitsandbytes accelerate

Basic Usage

This is a PEFT LoRA adapter that must be loaded on top of the base Llama 3.3 70B Instruct model:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# Load base model and tokenizer
base_model_id = "meta-llama/Llama-3.3-70B-Instruct"
adapter_model_id = "juanmvs/neurona"

# 4-bit NF4 quantization (same scheme as the QLoRA training run) for memory efficiency
quantization_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    quantization_config=quantization_config,
    device_map="auto",
)

# Load PEFT adapter
model = PeftModel.from_pretrained(base_model, adapter_model_id)

# Specialized system prompt for workplace violence prevention
system_prompt = """Eres un asistente especializado en prevención de violencia laboral y acoso sexual en el entorno de trabajo. Tu objetivo es proporcionar apoyo empático, información precisa y recursos específicos a personas que puedan estar experimentando situaciones difíciles en su lugar de trabajo.

IMPORTANTE: Siempre mantén un tono profesional pero cálido, valida las emociones del usuario, y proporciona información práctica basada en protocolos establecidos."""

# Example conversation
messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": "Creo que estoy sufriendo acoso laboral, ¿qué puedo hacer?"},
]

input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=512,
    eos_token_id=tokenizer.eos_token_id,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
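For multi-turn conversations, the same message format can be extended with earlier turns before the chat template is applied. The helper below is a minimal sketch: `build_messages` is a hypothetical name, not part of this repository or of Transformers, and it only assembles the list of role/content dictionaries that `tokenizer.apply_chat_template` expects.

```python
from typing import Dict, List

Message = Dict[str, str]


def build_messages(system_prompt: str, history: List[Message], user_message: str) -> List[Message]:
    """Return a new message list: system prompt, prior turns, then the new user turn.

    `history` is expected to hold alternating user/assistant messages and is not modified.
    """
    return [
        {"role": "system", "content": system_prompt},
        *history,
        {"role": "user", "content": user_message},
    ]


# Hypothetical earlier exchange, for illustration only
history = [
    {"role": "user", "content": "Creo que estoy sufriendo acoso laboral, ¿qué puedo hacer?"},
    {"role": "assistant", "content": "Lamento que estés pasando por esto..."},
]
messages = build_messages("Eres un asistente especializado...", history, "¿Cómo documento lo ocurrido?")
print([m["role"] for m in messages])
```

The resulting list can be passed to `tokenizer.apply_chat_template` in place of the two-message example above.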

Memory Requirements

Configuration	GPU Memory	RAM	Storage
4-bit quantized	40GB+ VRAM	16GB+	150GB+
Full precision (bf16)	140GB+ VRAM	64GB+	150GB+
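A quick way to sanity-check VRAM figures for a model of this size is to estimate the storage needed for the weights alone. The sketch below assumes a round 70 billion parameters and ignores activations, the KV cache, quantization metadata, and the adapter itself, so real usage will be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 70e9  # assumption: round figure for Llama 3.3 70B

for label, bits in [("4-bit", 4), ("bf16", 16), ("fp32", 32)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB")
# 4-bit: ~35 GB
# bf16: ~140 GB
# fp32: ~280 GB
```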

Hardware Recommendations

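Recommended: A100 80GB or H100 (with 4-bit quantization)
Minimum: two 24GB GPUs such as RTX 3090 or RTX 4090 (with 4-bit quantization, sharded across devices by device_map="auto")
CPU inference: Possible but very slow; RAM needs depend on the weight format (about 140GB for unquantized bf16 weights)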

Inference Script

This repository includes a comprehensive inference script (inference.py) that supports:

Interactive model selection between base Llama 3.3 70B and Neurona
Side-by-side comparison of model responses
Single inference mode and interactive chat mode
Automatic quantization and memory optimization

Usage examples:

# Interactive model selection
python inference.py --interactive --token your_hf_token

# Direct comparison mode
python inference.py --interactive --single --prompt "¿Qué hacer ante acoso laboral?" --token your_hf_token

# Neurona model only
python inference.py --model meta-llama/Llama-3.3-70B-Instruct --token your_hf_token

Training Data

Training Set: A custom dataset of 48 Spanish instruction-response pairs focused on workplace violence prevention.
Validation Set: 1000 samples from the bertin-project/alpaca-spanish dataset to ensure general conversational quality.

The training data was carefully curated to include empathetic, professional, and relevant responses for the target domain.

Training Procedure

Fine-tuning with QLoRA

The model was fine-tuned using 4-bit NormalFloat (NF4) quantization and LoRA.

LoRA r: 128
LoRA alpha: 32
LoRA dropout: 0.05
Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, embed_tokens, lm_head
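To make the LoRA settings above more concrete, the sketch below computes two quantities that follow from them under the standard LoRA formulation: the scaling factor alpha / r applied to the low-rank update, and the number of trainable parameters added to a single linear layer. The 8192 x 8192 layer shape is an illustrative assumption, and the function names are hypothetical.

```python
def lora_scaling(alpha: float, r: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA (alpha / r)."""
    return alpha / r


def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added to one linear layer: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


R, ALPHA = 128, 32  # values from the list above

print(lora_scaling(ALPHA, R))      # 0.25
print(lora_params(8192, 8192, R))  # 2097152 (illustrative 8192 x 8192 projection)
```

With r larger than alpha, the update is scaled down (here by 0.25), which is worth keeping in mind when comparing this configuration with setups that use alpha equal to or greater than r.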

Hyperparameters

Learning Rate: 1e-4
Scheduler: Cosine
Epochs: 3
Per-Device Batch Size: 1 (optimized for H100)
Gradient Accumulation Steps: 32 (effective batch size: 32)
Warmup Steps: 100
Weight Decay: 0.01
Gradient Clipping: 0.5
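The effective batch size quoted above is the per-device batch size multiplied by the gradient accumulation steps (and by the number of GPUs, here one). The sketch below shows that arithmetic together with a rough count of optimizer steps. The dataset size of 64 is purely hypothetical, and rounding a partial final batch up is an assumption, since trainers differ in how they handle it.

```python
import math


def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer step."""
    return per_device * grad_accum * num_devices


def optimizer_steps(num_examples: int, eff_batch: int, epochs: int) -> int:
    """Rough step count, assuming a partial final batch still triggers a step."""
    return math.ceil(num_examples / eff_batch) * epochs


eff = effective_batch_size(per_device=1, grad_accum=32)
print(eff)                                    # 32
print(optimizer_steps(64, eff, epochs=3))     # 6 (hypothetical 64-example dataset)
```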

Hardware and Software

GPU: NVIDIA H100 PCIe (79.6GB effective memory)
Software: PyTorch 2.4.0, TRL, PEFT, bitsandbytes, accelerate

Evaluation

Training Metrics

Metric	Value
Training Loss	1.7418
Mean Token Accuracy	63.63%
Entropy	1.1294
Training Runtime	224 seconds (3.73 minutes)
Total FLOPs	2.33 × 10¹⁶
Total Tokens Processed	54,621
Samples per Second	0.429
Global Steps	3
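A few derived quantities can be read off the table above. The sketch below assumes the reported training loss is a mean per-token cross-entropy in nats, in which case its exponential is the training perplexity; the throughput figures are simple ratios of the reported values.

```python
import math

# Values copied from the training metrics table
train_loss = 1.7418
runtime_s = 224
tokens_processed = 54_621
samples_per_s = 0.429

perplexity = math.exp(train_loss)          # assumes loss is mean cross-entropy in nats
tokens_per_s = tokens_processed / runtime_s
samples_seen = samples_per_s * runtime_s   # total samples over the whole run

print(f"perplexity ~ {perplexity:.2f}")       # perplexity ~ 5.71
print(f"tokens/s   ~ {tokens_per_s:.0f}")     # tokens/s   ~ 244
print(f"samples    ~ {samples_seen:.0f}")     # samples    ~ 96
```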

Conversation Quality

A multi-dimensional evaluation framework was used to assess conversation quality, with a composite score of 0.73 (target > 0.65).

Metric	Score
Empathy Score	0.67
Domain Relevance	0.81
Professional Tone	0.74

Limitations & Ethical Considerations

Model Limitations

Domain Specificity: Optimized for Spanish workplace violence prevention; may not perform well on general tasks
Data Coverage: Based on 32 training examples; may not cover all workplace situation nuances
Cultural Context: Designed for Spanish-speaking workplace environments
Response Length: Optimized for conversational responses, not long-form content

Ethical Guidelines

Not Professional Services: This model provides educational information only, not legal or psychological advice
Crisis Situations: For immediate danger, contact emergency services (112 in Spain, 911 in the US)
Privacy: Users should not share sensitive personal information
Bias Awareness: Responses may reflect biases present in training data
Human Oversight: Recommend human review for critical workplace decisions

Safety Considerations

Emergency Situations: Always prioritize professional emergency services
Legal Matters: Consult qualified employment lawyers for legal advice
Mental Health: Seek licensed mental health professionals for psychological support
Workplace Policies: Follow your organization's specific HR protocols

Citation

If you use this model in your research or applications, please cite it as:

@misc{neurona-2025,
  author = {Juan MVS},
  title = {Neurona: Spanish Workplace Violence Prevention Chatbot},
  year = {2025},
  publisher = {Hugging Face},
  journal = {Hugging Face Hub},
  howpublished = {\url{https://huggingface.co/juanmvs/neurona}}
}

Acknowledgments

Base Model: Meta AI for Llama 3.3 70B Instruct
Framework: Hugging Face Transformers and PEFT libraries
Training Infrastructure: NVIDIA H100 GPU
Validation Dataset: Bertin Project for Spanish Alpaca dataset

Project Structure

This is a complete finetuning project that includes:

Training Script: finetune_llama33_70b.py - Comprehensive QLoRA training pipeline
Inference Script: inference.py - Interactive inference and model comparison
Upload Script: upload_to_hf.py - HuggingFace model upload utility
Configuration: pyproject.toml - Complete dependency and project configuration
Training Data: ft_data.json - 48 curated Spanish workplace safety conversations

Key Dependencies

PyTorch 2.4.0 with CUDA 12.1 support
Transformers ≥4.45.0 for Llama 3.3 compatibility
PEFT ≥0.12.0 for LoRA implementation
TRL ≥0.11.0 for supervised fine-tuning
BitsAndBytes ≥0.43.0 for 4-bit quantization
Weights & Biases for experiment tracking

Contact

For questions about this model or collaboration opportunities:

email: [email protected]
Model Repository: juanmvs/neurona

⚠️ Disclaimer: This AI model is for educational and informational purposes only. For workplace violence situations requiring immediate intervention, please contact appropriate emergency services, HR departments, or professional counselors.

