Multi-Task BERT for Financial News Topic Classification and Sentiment Analysis
Model Description
This model is a multi-task, BERT-based architecture that performs topic classification and sentiment analysis on financial news text in a single forward pass. Both tasks share one encoder, so each is intended to benefit from representations learned for the other.
Model Details
- Model Type: Multi-task BERT for text classification
- Language: English
- License: MIT
- Tasks:
  - Topic Classification (financial news categories)
  - Sentiment Analysis (positive, negative, neutral)
Intended Uses
Direct Use
This model can be used for:
- Analyzing sentiment in financial news articles
- Classifying financial news into relevant topics/categories
- Automated content analysis for financial research
- Risk assessment based on news sentiment
Downstream Use
The model can be fine-tuned for:
- Specific financial domains (stocks, forex, commodities)
- Custom topic taxonomies
- Different sentiment granularities
How to Use
import pickle

import torch
from transformers import AutoTokenizer

# Load the model. pickle.load needs the model's class definition to be importable,
# and it can execute arbitrary code, so only load files from a trusted source.
with open('multitask_bert_model.pkl', 'rb') as f:
    model = pickle.load(f)
model.eval()  # disable dropout for inference

# Load tokenizer (adjust model name as needed)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Example usage
text = "Apple stock rises 5% after strong quarterly earnings report"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Get predictions (adjust based on your model's output format)
with torch.no_grad():
    outputs = model(**inputs)

# Process outputs for topic and sentiment predictions
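The snippet above leaves output processing open because it depends on what the model returns, which this card does not specify. As a minimal sketch, assuming the forward pass returns a pair `(topic_logits, sentiment_logits)`, the logits could be decoded as follows. The function name `decode_predictions`, the topic label list, and the order of both label lists are hypothetical placeholders; use the label order from training.

```python
import torch

# Hypothetical label sets (assumption): replace with the order used at training time.
TOPIC_LABELS = ["earnings", "markets", "macro"]
SENTIMENT_LABELS = ["negative", "neutral", "positive"]


def decode_predictions(topic_logits, sentiment_logits):
    """Convert two logit tensors of shape (batch, num_classes) into labeled predictions."""
    topic_conf, topic_ids = torch.softmax(topic_logits, dim=-1).max(dim=-1)
    sent_conf, sent_ids = torch.softmax(sentiment_logits, dim=-1).max(dim=-1)
    results = []
    for t_id, t_c, s_id, s_c in zip(
        topic_ids.tolist(), topic_conf.tolist(), sent_ids.tolist(), sent_conf.tolist()
    ):
        results.append(
            {
                "topic": TOPIC_LABELS[t_id],
                "topic_confidence": t_c,
                "sentiment": SENTIMENT_LABELS[s_id],
                "sentiment_confidence": s_c,
            }
        )
    return results


# Stand-in logits for one example; in practice these would come from model(**inputs).
example = decode_predictions(
    torch.tensor([[2.0, 0.5, -1.0]]),
    torch.tensor([[-1.0, 0.0, 3.0]]),
)
print(example)
```

If the model instead returns a dictionary or a custom output object, only the two arguments passed to `decode_predictions` need to change.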
Training Data
The model was trained on financial news data for multi-task learning. The training involved:
- Topic classification task
- Sentiment analysis task
- Joint optimization with shared BERT representations
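The card does not state how the two task losses were combined. One common way to implement joint optimization, shown here only as a sketch, is a weighted sum of the per-task cross-entropy losses. The function name `joint_loss`, the weights, and the class counts in the toy batch are assumptions for illustration.

```python
import torch
import torch.nn.functional as F


def joint_loss(topic_logits, topic_targets, sentiment_logits, sentiment_targets,
               topic_weight=1.0, sentiment_weight=1.0):
    """Weighted sum of the two cross-entropy losses (weights are assumed, not from the card)."""
    topic_loss = F.cross_entropy(topic_logits, topic_targets)
    sentiment_loss = F.cross_entropy(sentiment_logits, sentiment_targets)
    return topic_weight * topic_loss + sentiment_weight * sentiment_loss


# Toy batch of 4 examples with 5 hypothetical topics and 3 sentiment classes.
torch.manual_seed(0)
topic_logits = torch.randn(4, 5, requires_grad=True)
sentiment_logits = torch.randn(4, 3, requires_grad=True)
topic_targets = torch.tensor([0, 2, 4, 1])
sentiment_targets = torch.tensor([2, 0, 1, 1])

loss = joint_loss(topic_logits, topic_targets, sentiment_logits, sentiment_targets)
# One backward pass sends gradients to both heads; in a real model they also
# accumulate in the shared encoder.
loss.backward()
print(loss.item())
```

Adjusting `topic_weight` and `sentiment_weight` is a simple way to keep one task from dominating the shared encoder's updates.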
Training Procedure
Training Hyperparameters
- Training regime: Multi-task learning with shared encoder
- Model variants:
  - multitask_bert_model.pkl: Base model
  - multitask_bert_model_weight.pth: Weighted version
  - multitask_bert_model_imbalanced.pth: Version trained on imbalanced data
Training Details
The model uses a shared BERT encoder with task-specific classification heads for topic classification and sentiment analysis. The multi-task approach allows the model to learn shared representations that benefit both tasks.
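The exact class definition is not included in this card. A minimal sketch of a shared encoder with two classification heads might look like the following, assuming the pooled `[CLS]` output feeds both heads. The class name `MultiTaskBert`, the dropout rate, and the number of topics are hypothetical, and a tiny randomly initialized encoder is used so the example runs without downloading weights.

```python
import torch
from torch import nn
from transformers import BertConfig, BertModel


class MultiTaskBert(nn.Module):
    """Shared BERT encoder with one linear head per task (illustrative sketch)."""

    def __init__(self, encoder, num_topics, num_sentiments=3, dropout=0.1):
        super().__init__()
        self.encoder = encoder
        hidden = encoder.config.hidden_size
        self.dropout = nn.Dropout(dropout)
        self.topic_head = nn.Linear(hidden, num_topics)
        self.sentiment_head = nn.Linear(hidden, num_sentiments)

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        out = self.encoder(
            input_ids=input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
        )
        pooled = self.dropout(out.pooler_output)  # same representation for both heads
        return self.topic_head(pooled), self.sentiment_head(pooled)


# Tiny random encoder for demonstration only.
# For real use: encoder = BertModel.from_pretrained("bert-base-uncased")
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=1,
                    num_attention_heads=2, intermediate_size=64)
model = MultiTaskBert(BertModel(config), num_topics=5)
model.eval()

input_ids = torch.randint(0, 100, (2, 8))
with torch.no_grad():
    topic_logits, sentiment_logits = model(
        input_ids, attention_mask=torch.ones_like(input_ids)
    )
print(topic_logits.shape, sentiment_logits.shape)  # torch.Size([2, 5]) torch.Size([2, 3])
```

Because both heads read the same pooled vector, every training example for either task updates the encoder, which is where the shared representation comes from.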
Evaluation
Testing Data & Metrics
The model should be evaluated on:
- Topic Classification: Accuracy, F1-score, Precision, Recall
- Sentiment Analysis: Accuracy, F1-score, Precision, Recall
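The metrics listed above can be computed per task with scikit-learn. The sketch below assumes macro averaging, which the card does not specify, and uses made-up toy labels. The helper name `classification_metrics` is hypothetical.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support


def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1 for one task."""
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy sentiment labels for illustration; not real evaluation data.
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]

metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Running the same helper once on the topic labels and once on the sentiment labels gives one set of scores per task. Weighted averaging (`average="weighted"`) may be more informative when classes are imbalanced.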
Results
Task | Metric | Score |
---|---|---|
Topic Classification | Accuracy | 0.76 |
Sentiment Analysis | Accuracy | 0.87 |
Limitations and Bias
Limitations
- Performance may vary on financial news from different time periods
- Model may not generalize well to non-financial text
- Limited to English language text
- Performance depends on the quality and diversity of training data
Bias Considerations
- Model may reflect biases present in financial news training data
- Sentiment predictions may be influenced by market conditions during training
- Topic classifications may favor certain financial sectors represented in training data
Technical Specifications
Model Architecture
- Base Model: BERT
- Architecture: Multi-task learning with shared encoder