🧠 Llama-3.1-KokoroChat-Low: Japanese Counseling Dialogue Model

Llama-3.1-KokoroChat-Low is a Japanese large language model fine-tuned on KokoroChat, a dataset of over 6,000 psychological counseling dialogues collected through text-based role-play between trained counselors. This "Low" variant is trained on the subset of dialogues whose client feedback scores are below 70, and it generates empathetic, context-aware responses for mental health-related conversational tasks.


💡 Overview

  • ✅ Fine-tuned on 3,870 dialogues with client feedback scores below 70
  • ✅ Data collected through text-based role-play by trained counselors
  • ✅ Covers a wide range of topics: depression, family, school, career, relationships, and more
  • ✅ Base Model: tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.3

⚙️ Usage Example

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "UEC-InabaLab/Llama-3.1-KokoroChat-Low"

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Llama-3.1 tokenizers ship without a pad token; reuse the EOS token for padding
if tokenizer.pad_token_id is None:
    tokenizer.pad_token = tokenizer.eos_token

model.config.pad_token_id = tokenizer.pad_token_id

# Build dialogue input
messages = [
    # System prompt: "In a psychological counseling conversation, take the dialogue history into account and respond appropriately as the counselor."
    {"role": "system", "content": "心理カウンセリングの会話において、対話履歴を考慮し、カウンセラーとして適切に応答してください。"},
    # Client message: "Lately I've been feeling down and can't find any motivation."
    {"role": "user", "content": "最近、気分が落ち込んでやる気が出ません。"}
]

# Tokenize with chat template
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Single, unpadded sequence, so every position is attended to
attention_mask = torch.ones_like(inputs)

# Generate response
outputs = model.generate(
    inputs,
    attention_mask=attention_mask,
    pad_token_id=tokenizer.pad_token_id,
    max_new_tokens=256
)

# Extract only the newly generated tokens
response = outputs[0][inputs.shape[-1]:]
response_text = tokenizer.decode(response, skip_special_tokens=True)

# Print clean response
print(response_text)
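
The same pattern extends to multi-turn counseling: append the generated reply and the next client message to messages, then rebuild the prompt with the chat template. A minimal sketch, with a purely illustrative follow-up message:

# Continue the dialogue with the generated reply and a new (illustrative) client turn
messages.append({"role": "assistant", "content": response_text})
messages.append({"role": "user", "content": "仕事のことを考えると眠れなくなります。"})  # "I can't sleep when I think about work."

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    inputs,
    attention_mask=torch.ones_like(inputs),
    pad_token_id=tokenizer.pad_token_id,
    max_new_tokens=256
)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))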

🛠️ Fine-Tuning Details

Fine-tuning was performed using QLoRA with the following configuration (a code sketch of this setup follows the list):

  • Quantization: 4-bit NF4 with bfloat16 computation
  • LoRA target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • LoRA parameters:
    • r = 8
    • lora_alpha = 16
    • lora_dropout = 0.05
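
The training script itself is not part of this card; the snippet below is only a sketch of how the listed settings map onto the standard Hugging Face QLoRA stack (bitsandbytes + PEFT).

import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with bfloat16 compute, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# LoRA adapters on the attention and MLP projection layers
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)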

Dataset Split

  • Training Data: 3,870 dialogues with feedback scores < 70
  • Train/Validation Split: 90% train, 10% validation (see the data-preparation sketch below)
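
A minimal sketch of this data preparation, assuming a list of dialogue records with a hypothetical feedback_score field (the actual KokoroChat schema may use different names):

import random

def prepare_low_split(dialogues, seed=42):
    """Keep dialogues with client feedback scores below 70, then split 90/10."""
    low = [d for d in dialogues if d["feedback_score"] < 70]  # hypothetical field name
    random.Random(seed).shuffle(low)
    n_train = int(0.9 * len(low))
    return low[:n_train], low[n_train:]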

Hyperparameter Settings

  • Optimizer: adamw_8bit
  • Warm-up Steps: 100
  • Learning Rate: 1e-3
  • Epochs: 5
  • Batch Size: 8
  • Validation Frequency: every 400 steps (these settings are sketched as TrainingArguments below)
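
As a rough guide only, these hyperparameters correspond to the following TrainingArguments, assuming the Hugging Face Trainer API (the output path and the per-device reading of the batch size are assumptions):

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="kokorochat-low-qlora",   # hypothetical output path
    optim="adamw_8bit",                  # 8-bit AdamW, as listed above
    warmup_steps=100,
    learning_rate=1e-3,
    num_train_epochs=5,
    per_device_train_batch_size=8,       # assumes the listed batch size is per device
    eval_strategy="steps",               # older transformers versions call this evaluation_strategy
    eval_steps=400,
    bf16=True,                           # assumed from the bfloat16 compute dtype above
)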

📄 Citation

If you use this model or dataset, please cite the following paper:

@inproceedings{qi2025kokorochat,
  title     = {KokoroChat: A Japanese Psychological Counseling Dialogue Dataset Collected via Role-Playing by Trained Counselors},
  author    = {Zhiyang Qi and Takumasa Kaneko and Keiko Takamizo and Mariko Ukiyo and Michimasa Inaba},
  booktitle = {Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics},
  year      = {2025},
  url       = {https://github.com/UEC-InabaLab/KokoroChat}
}

🔗 Related

  • KokoroChat dataset and code: https://github.com/UEC-InabaLab/KokoroChat