---
tags:
- roberta
- email-classification
- text-classification
language: en
license: apache-2.0
datasets:
- Tobi-Bueck/customer-support-tickets
metrics:
- accuracy
model_type: xlm-roberta
pipeline_tag: text-classification
---

# xlm-roberta-email-classifier

A fine-tuned version of `xlm-roberta-base` for multi-class classification of English-language emails. The model is designed to automatically route or tag incoming messages based on their content.

## Model Overview

- **Base Model**: `xlm-roberta-base`
- **Task**: Email classification (10 categories)
- **Language**: English
- **Frameworks**: Hugging Face Transformers, PyTorch Lightning
- **Training Tracker**: Weights & Biases

## Performance

- Accuracy: 0.42
- F1 Score: 0.436
- Precision: 0.527
- Recall: 0.42

## Class Labels

The model predicts one of the following categories:

| Label ID | Category                        |
|----------|---------------------------------|
| 0        | Billing and Payments            |
| 1        | Customer Service                |
| 2        | General Inquiry                 |
| 3        | Human Resources                 |
| 4        | IT Support                      |
| 5        | Product Support                 |
| 6        | Returns and Exchanges           |
| 7        | Sales and Pre-Sales             |
| 8        | Service Outages and Maintenance |
| 9        | Technical Support               |

## Usage

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("ale-dp/xlm-roberta-email-classifier")
model = AutoModelForSequenceClassification.from_pretrained("ale-dp/xlm-roberta-email-classifier")

email_text = "I'd like to return the item I purchased last week."

# Truncate emails that exceed the model's maximum input length
inputs = tokenizer(email_text, return_tensors="pt", truncation=True)

# Inference only, so skip gradient tracking
with torch.no_grad():
    outputs = model(**inputs)
predicted_class_id = outputs.logits.argmax(dim=-1).item()

# Map label IDs to category names (same order as the Class Labels table)
id2label = {
    0: "Billing and Payments",
    1: "Customer Service",
    2: "General Inquiry",
    3: "Human Resources",
    4: "IT Support",
    5: "Product Support",
    6: "Returns and Exchanges",
    7: "Sales and Pre-Sales",
    8: "Service Outages and Maintenance",
    9: "Technical Support",
}

predicted_label = id2label[predicted_class_id]
print(predicted_label)
```
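
With a reported accuracy of 0.42, it may be safer to route a message automatically only when the model is reasonably confident. The following is a minimal sketch of that idea, not part of the released model: it turns raw logits into probabilities with a softmax and falls back to a manual queue below a cutoff. The `route` helper, the `"Manual Review"` queue, and the 0.5 threshold are all assumptions made for illustration, and the logits are made-up values.

```python
import math

# Label IDs and category names, copied from the Class Labels table.
ID2LABEL = {
    0: "Billing and Payments",
    1: "Customer Service",
    2: "General Inquiry",
    3: "Human Resources",
    4: "IT Support",
    5: "Product Support",
    6: "Returns and Exchanges",
    7: "Sales and Pre-Sales",
    8: "Service Outages and Maintenance",
    9: "Technical Support",
}

# Hypothetical fallback queue; this is not a class the model predicts.
FALLBACK = "Manual Review"


def softmax(logits):
    """Convert raw logits to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(logits, threshold=0.5):
    """Return (queue, confidence); use FALLBACK when confidence is below threshold."""
    probs = softmax(logits)
    best_id = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best_id]
    queue = ID2LABEL[best_id] if confidence >= threshold else FALLBACK
    return queue, confidence


# Made-up logits for illustration only.
confident = [0.0] * 10
confident[6] = 5.0            # one class clearly ahead of the rest
uncertain = [0.0] * 10        # all classes equally likely

print(route(confident))
print(route(uncertain))
```

To apply this to real model output, pass `outputs.logits[0].tolist()` from the usage example as `logits`. The threshold is a placeholder and would need tuning on held-out data for the intended workload.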