Multi-Task BERT for Financial News Topic Classification and Sentiment Analysis

Model Description

This model is a multi-task BERT-based architecture designed to simultaneously perform topic classification and sentiment analysis on financial news text. The model leverages shared representations to improve performance on both tasks through multi-task learning.

Model Details

Model Type: Multi-task BERT for text classification
Language: English
License: MIT
Tasks:
- Topic Classification (financial news categories)
- Sentiment Analysis (positive, negative, neutral)

Intended Uses

Direct Use

This model can be used for:

- Analyzing sentiment in financial news articles
- Classifying financial news into relevant topics/categories
- Automated content analysis for financial research
- Risk assessment based on news sentiment

Downstream Use

The model can be fine-tuned for:

- Specific financial domains (stocks, forex, commodities)
- Custom topic taxonomies
- Different sentiment granularities

How to Use

import pickle

import torch
from transformers import AutoTokenizer

# Load the pickled model. Unpickling needs the model's class definition to be
# importable, and pickle files should only be loaded from trusted sources.
with open('multitask_bert_model.pkl', 'rb') as f:
    model = pickle.load(f)
model.eval()  # switch off dropout for inference (assumes a torch.nn.Module)

# Load tokenizer (adjust model name as needed)
tokenizer = AutoTokenizer.from_pretrained('bert-base-uncased')

# Example usage
text = "Apple stock rises 5% after strong quarterly earnings report"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True, max_length=512)

# Get predictions (adjust based on your model's output format)
with torch.no_grad():
    outputs = model(**inputs)
    # Process outputs for topic and sentiment predictions
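
The snippet above leaves output handling open. As a minimal sketch, assuming the model returns a pair of logit tensors `(topic_logits, sentiment_logits)`, each of shape `(batch, num_classes)`, the predictions could be decoded roughly as follows. The function name `decode_predictions`, the topic names, and the order of both label lists are hypothetical; replace them with the label mapping used during training.

```python
import torch

# Hypothetical label lists: the real names and their order depend on training.
TOPIC_LABELS = ["earnings", "markets", "macro"]
SENTIMENT_LABELS = ["negative", "neutral", "positive"]


def decode_predictions(topic_logits, sentiment_logits):
    """Turn two (batch, num_classes) logit tensors into label/confidence dicts."""
    topic_probs = torch.softmax(topic_logits, dim=-1)
    sentiment_probs = torch.softmax(sentiment_logits, dim=-1)
    topic_conf, topic_idx = topic_probs.max(dim=-1)
    sentiment_conf, sentiment_idx = sentiment_probs.max(dim=-1)

    results = []
    for i in range(topic_logits.shape[0]):
        results.append({
            "topic": TOPIC_LABELS[int(topic_idx[i])],
            "topic_confidence": float(topic_conf[i]),
            "sentiment": SENTIMENT_LABELS[int(sentiment_idx[i])],
            "sentiment_confidence": float(sentiment_conf[i]),
        })
    return results


# Stand-in logits for one input; in practice these would come from model(**inputs).
example = decode_predictions(
    torch.tensor([[2.0, 0.5, -1.0]]),
    torch.tensor([[-1.0, 0.0, 3.0]]),
)
print(example[0]["topic"], example[0]["sentiment"])
```

If the pickled model returns a dictionary or a custom output object instead of a tuple, only the unpacking step before calling `decode_predictions` would need to change.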

Training Data

The model was trained on financial news data for multi-task learning. The training involved:

- Topic classification task
- Sentiment analysis task
- Joint optimization with shared BERT representations
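
Joint optimization of two classification tasks is commonly implemented as a weighted sum of the per-task cross-entropy losses. The sketch below illustrates that idea only; the function name `joint_loss` and the `topic_weight` and `sentiment_weight` parameters are assumptions, and the loss actually used to train this model is not documented here.

```python
import math

import torch
import torch.nn.functional as F


def joint_loss(topic_logits, topic_targets, sentiment_logits, sentiment_targets,
               topic_weight=1.0, sentiment_weight=1.0):
    """Weighted sum of the two per-task cross-entropy losses."""
    topic_loss = F.cross_entropy(topic_logits, topic_targets)
    sentiment_loss = F.cross_entropy(sentiment_logits, sentiment_targets)
    return topic_weight * topic_loss + sentiment_weight * sentiment_loss


# All-zero logits are a uniform prediction, so each task contributes
# ln(num_classes): ln(5) + ln(3) = ln(15) for 5 topics and 3 sentiments.
loss = joint_loss(
    torch.zeros(2, 5), torch.tensor([0, 4]),
    torch.zeros(2, 3), torch.tensor([1, 2]),
)
print(round(loss.item(), 4), round(math.log(15), 4))
```

Because both heads backpropagate through the same encoder, the relative weights control how strongly each task shapes the shared representation.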

Training Procedure

Training Hyperparameters

Training regime: Multi-task learning with shared encoder
Model variants:
- multitask_bert_model.pkl: Base model
- multitask_bert_model_weight.pth: Weighted version
- multitask_bert_model_imbalanced.pth: Version trained on imbalanced data

Training Details

The model uses a shared BERT encoder with task-specific classification heads for topic classification and sentiment analysis. The multi-task approach allows the model to learn shared representations that benefit both tasks.
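
A shared encoder with task-specific heads can be sketched as below. This is an illustration under stated assumptions, not the released implementation: the class name `MultiTaskBert`, the use of the `[CLS]` token representation, the dropout rate, and the class counts are all hypothetical. The encoder is a tiny randomly initialised BERT so the example runs without downloading weights.

```python
import torch
from torch import nn
from transformers import BertConfig, BertModel


class MultiTaskBert(nn.Module):
    """Shared BERT encoder with one linear head per task (illustrative only)."""

    def __init__(self, encoder, num_topics, num_sentiments, dropout=0.1):
        super().__init__()
        self.encoder = encoder
        hidden = encoder.config.hidden_size
        self.dropout = nn.Dropout(dropout)
        self.topic_head = nn.Linear(hidden, num_topics)
        self.sentiment_head = nn.Linear(hidden, num_sentiments)

    def forward(self, input_ids, attention_mask=None, token_type_ids=None):
        outputs = self.encoder(
            input_ids=input_ids,
            attention_mask=attention_mask,
            token_type_ids=token_type_ids,
        )
        # Use the first ([CLS]) token's hidden state as the sentence representation.
        cls = self.dropout(outputs.last_hidden_state[:, 0])
        return self.topic_head(cls), self.sentiment_head(cls)


# Tiny random encoder for demonstration; a real setup would more likely use
# BertModel.from_pretrained("bert-base-uncased").
config = BertConfig(vocab_size=100, hidden_size=32, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=64)
model = MultiTaskBert(BertModel(config), num_topics=5, num_sentiments=3)
model.eval()

input_ids = torch.randint(0, 100, (2, 8))
attention_mask = torch.ones(2, 8, dtype=torch.long)
with torch.no_grad():
    topic_logits, sentiment_logits = model(input_ids, attention_mask)
print(topic_logits.shape, sentiment_logits.shape)
```

Both heads read the same encoder output, so a single forward pass yields logits for both tasks.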

Evaluation

Testing Data & Metrics

Both tasks should be evaluated with the same set of metrics:

- Topic Classification: Accuracy, F1-score, Precision, Recall
- Sentiment Analysis: Accuracy, F1-score, Precision, Recall
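
The metrics listed above can be computed per task from the true and predicted labels. The following is a dependency-free sketch; the function name `classification_metrics` is hypothetical, and macro averaging is an assumption, since the averaging scheme is not specified here.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []

    for label in labels:
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Undefined ratios (no predictions or no true samples) count as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy sentiment example with one misclassified "neutral" item.
y_true = ["positive", "negative", "neutral", "positive"]
y_pred = ["positive", "negative", "positive", "positive"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Macro averaging weights every class equally, which makes weak performance on rare classes visible; a weighted or micro average may be preferable when class frequencies matter.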

Results

Task	Metric	Score
Topic Classification	Accuracy	0.76
Sentiment Analysis	Accuracy	0.87

Limitations and Bias

Limitations

- Performance may vary on financial news from different time periods
- Model may not generalize well to non-financial text
- Limited to English language text
- Performance depends on the quality and diversity of training data

Bias Considerations

- Model may reflect biases present in financial news training data
- Sentiment predictions may be influenced by market conditions during training
- Topic classifications may favor certain financial sectors represented in training data

Technical Specifications

Model Architecture

Base Model: BERT
Architecture: Multi-task learning with shared encoder