---
library_name: transformers
license: mit
datasets:
  - hblim/customer-complaints
language:
  - en
metrics:
  - accuracy
base_model:
  - google-bert/bert-base-uncased
tags:
  - bert
  - transformers
  - customer-complaints
  - text-classification
  - multiclass
  - huggingface
  - fine-tuned
  - wandb
---

# BERT Base (Uncased) Fine-Tuned for Customer Complaint Classification (3 Classes)

## 🧾 Model Description

This model is a fine-tuned version of `bert-base-uncased`, trained with Hugging Face Transformers on a custom dataset of customer complaints. The task is multi-class text classification: each complaint is assigned to one of three classes.

The model is intended to support downstream tasks like complaint triage, issue type prediction, or support ticket classification.

Training and evaluation were tracked with Weights & Biases, and the hyperparameters logged below make the run reproducible.


## 🧠 Intended Use

- 🏷 Classify customer complaint text into 3 predefined categories
- 📊 Analyze complaint trends over time
- 💬 Serve as a backend model for customer service applications

## 📚 Dataset

- **Dataset Name:** hblim/customer-complaints
- **Dataset Type:** Multiclass text classification
- **Classes:** billing, product, delivery
- **Preprocessing:** Standard BERT tokenization (see the sketch below)
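
A minimal preprocessing sketch, assuming the dataset exposes `text` and `label` columns (the actual column names may differ):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load the complaint dataset from the Hub; "text"/"label" column names are assumed
dataset = load_dataset("hblim/customer-complaints")
tokenizer = AutoTokenizer.from_pretrained("google-bert/bert-base-uncased")

def tokenize(batch):
    # Standard BERT preprocessing: WordPiece tokenization, truncated to the model's max length
    return tokenizer(batch["text"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)
```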

## ⚙️ Training Details

- **Base Model:** bert-base-uncased
- **Epochs:** 10
- **Batch Size:** 1
- **Learning Rate:** 1e-5
- **Weight Decay:** 0.05
- **Warmup Ratio:** 0.20
- **LR Scheduler:** linear
- **Optimizer:** AdamW
- **Evaluation Strategy:** every 100 steps
- **Logging:** every 100 steps
- **Trainer:** Hugging Face Trainer
- **Hardware:** Single NVIDIA GeForce RTX 3080 GPU
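
These settings translate to roughly the following `TrainingArguments`; this is a sketch, not the original training script, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="baseline-hf-hub",  # placeholder output directory
    num_train_epochs=10,
    per_device_train_batch_size=1,
    learning_rate=1e-5,
    weight_decay=0.05,
    warmup_ratio=0.20,
    lr_scheduler_type="linear",
    optim="adamw_torch",           # AdamW optimizer
    eval_strategy="steps",         # named evaluation_strategy in older transformers releases
    eval_steps=100,
    logging_steps=100,
    report_to="wandb",             # track the run with Weights & Biases
)
```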

## 📈 Metrics

Evaluation was tracked using:

- Accuracy

To reproduce metrics and training logs, refer to the corresponding W&B run: Weights & Biases Run - baseline-hf-hub
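
The accuracy column in the table below can be produced with a standard `compute_metrics` callback, for example via the `evaluate` library:

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # Trainer passes (logits, labels); take the argmax class per example
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```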

| Step | Training Loss | Validation Loss | Accuracy |
|------|---------------|-----------------|----------|
| 100  | 1.106100 | 1.040519 | 0.523810 |
| 200  | 0.944800 | 0.744273 | 0.738095 |
| 300  | 0.660000 | 0.385309 | 0.900000 |
| 400  | 0.412400 | 0.273423 | 0.904762 |
| 500  | 0.220800 | 0.185636 | 0.923810 |
| 600  | 0.163400 | 0.245850 | 0.919048 |
| 700  | 0.116100 | 0.180523 | 0.942857 |
| 800  | 0.097200 | 0.254475 | 0.928571 |
| 900  | 0.052200 | 0.233583 | 0.942857 |
| 1000 | 0.050700 | 0.223150 | 0.928571 |
| 1100 | 0.035100 | 0.271416 | 0.919048 |
| 1200 | 0.027700 | 0.226478 | 0.933333 |
| 1300 | 0.009000 | 0.218807 | 0.938095 |
| 1400 | 0.013600 | 0.246330 | 0.928571 |
| 1500 | 0.014500 | 0.226987 | 0.933333 |

## 🚀 How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Replace "your-username/baseline-hf-hub" with the actual repo id of this model
model = AutoModelForSequenceClassification.from_pretrained("your-username/baseline-hf-hub")
tokenizer = AutoTokenizer.from_pretrained("your-username/baseline-hf-hub")

inputs = tokenizer("I want to report an issue with my account", return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()
```
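
To turn the predicted index into a class name, you can read the model's `id2label` mapping; this only reflects the real class names if the mapping was set when the model was saved:

```python
# Map the index back to a label; defaults to "LABEL_0", ... if id2label was never set
label = model.config.id2label[predicted_class]
print(label)  # expected to be one of: billing, product, delivery
```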