RoBERTa-Base Quantized Model for Topic Classification
This repository hosts a quantized version of the RoBERTa model, fine-tuned for topic classification using the AG News dataset. The model has been optimized using FP16 quantization for efficient deployment without significant accuracy loss.
Model Details
- Model Architecture: RoBERTa Base
- Task: Multi-class Topic Classification (4 classes)
- Dataset: AG News (Hugging Face Datasets)
- Quantization: Float16
- Fine-tuning Framework: Hugging Face Transformers
Installation
pip install transformers torch datasets
Loading the Model
from transformers import RobertaTokenizer
from transformers import RobertaForSequenceClassification
import torch
# Select device and load tokenizer and model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
# Replace "roberta-base" with this repository's model ID to load the fine-tuned FP16 weights
model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=4).to(device)
# Define test sentences
samples = [
"Tensions rise in the Middle East as diplomats gather for emergency talks to prevent further escalation.",
"Tesla reports a 25% increase in quarterly revenue, driven by strong demand for its Model Y vehicles in Asia.",
"Researchers develop a new quantum computing chip that significantly reduces energy consumption.",
"Argentina defeats Brazil 2-1 in the Copa AmΓ©rica final, securing their 16th continental title.",
"Meta unveils its latest AI model capable of generating 3D virtual environments from text prompts."
]
from transformers import pipeline
# Load pipeline for inference
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer, device=0 if torch.cuda.is_available() else -1)  # GPU if available, otherwise CPU
predictions = classifier(samples)
# Print results
for text, pred in zip(samples, predictions):
    print(f"\nText: {text}\nPredicted Topic: {pred['label']} (Score: {pred['score']:.4f})")
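If the checkpoint's config does not carry human-readable label names, the pipeline reports generic ids such as `LABEL_0` … `LABEL_3`. They can be mapped to the AG News categories (World, Sports, Business, Sci/Tech, in dataset label order); a minimal sketch, assuming the `LABEL_<id>` naming:

```python
# AG News label order: 0=World, 1=Sports, 2=Business, 3=Sci/Tech
id2label = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}

for text, pred in zip(samples, predictions):
    label_id = int(pred["label"].split("_")[-1])  # e.g. "LABEL_3" -> 3
    print(f"Predicted Topic: {id2label[label_id]} | Text: {text[:60]}...")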
Performance Metrics
- Accuracy: 0.9471
- Precision: 0.9471
- Recall: 0.9471
- F1 Score: 0.9471
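These metrics can be reproduced on the AG News test split with scikit-learn (requires `pip install scikit-learn`); a sketch assuming `labels` and `preds` arrays from an evaluation run, with the averaging choice being an assumption:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(labels, preds):
    # `labels` are the true test labels, `preds` the model's argmax predictions
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted"  # weighted averaging over the four classes (assumption)
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```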
Fine-Tuning Details
Dataset
The dataset is sourced from Hugging Face's ag_news dataset. It contains 120,000 training samples and 7,600 test samples, with each news article labeled into one of four categories: World, Sports, Business, or Sci/Tech. The original dataset was used as provided, and input texts were tokenized using the RoBERTa tokenizer and truncated/padded to a maximum length of 128 tokens.
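The preprocessing described above can be reproduced with the `datasets` library; a minimal sketch using the standard `ag_news` splits:

```python
from datasets import load_dataset
from transformers import RobertaTokenizer

# Load AG News (120,000 train / 7,600 test samples)
dataset = load_dataset("ag_news")
tokenizer = RobertaTokenizer.from_pretrained("roberta-base")

def tokenize(batch):
    # Truncate/pad each article to a maximum of 128 tokens
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

tokenized_dataset = dataset.map(tokenize, batched=True)
```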
Training
- Epochs: 3
- Batch size: 8
- Learning rate: 2e-5
- Evaluation strategy: epoch (see the training sketch below)
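These hyperparameters correspond to a Hugging Face `Trainer` setup along the following lines; this is a sketch rather than the exact training script, and the output directory is a placeholder:

```python
from transformers import RobertaForSequenceClassification, Trainer, TrainingArguments

model = RobertaForSequenceClassification.from_pretrained("roberta-base", num_labels=4)

training_args = TrainingArguments(
    output_dir="./results",          # placeholder output directory
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=2e-5,
    evaluation_strategy="epoch",     # evaluate at the end of each epoch
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],  # from the tokenization snippet above
    eval_dataset=tokenized_dataset["test"],
)
trainer.train()
```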
Quantization
Post-training quantization was applied using PyTorch's half() precision (FP16) to reduce model size and inference time.
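In PyTorch this amounts to casting the fine-tuned weights to half precision and saving them; a minimal sketch (the output directory name is a placeholder):

```python
import torch

# Cast all floating-point parameters of the fine-tuned model to FP16
model = model.half()

# Save the quantized model and tokenizer for upload
model.save_pretrained("quantized-model")
tokenizer.save_pretrained("quantized-model")
```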
Repository Structure
.
├── config.json               # Model configuration
├── merges.txt                # Byte Pair Encoding (BPE) merge rules for tokenizer
├── model.safetensors         # Quantized model weights
├── README.md                 # Model documentation
├── special_tokens_map.json   # Tokenizer special tokens
├── tokenizer_config.json     # Tokenizer configuration
└── vocab.json                # Tokenizer vocabulary
Limitations
- The model is trained specifically for four-class topic classification on the AG News dataset and may not generalize to other domains or label sets.
- FP16 quantization may result in slight numerical instability in edge cases.
Contributing
Feel free to open issues or submit pull requests to improve the model or documentation.