LAPEFT: Lexicon-Augmented PEFT for Financial Sentiment Analysis

This model implements LAPEFT (Lexicon-Augmented Parameter-Efficient Fine-Tuning), a novel approach that combines:

  • BERT-base-uncased as the foundation model
  • LoRA (Low-Rank Adaptation) for parameter-efficient fine-tuning
  • Gated Fusion Mechanism for combining transformer and lexicon features
  • Financial Lexicon Augmentation using VADER plus the Loughran-McDonald dictionary (sketched after this list)
  • Memory Optimization techniques for efficient training
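
The lexicon augmentation can be reproduced in a few lines. This is a minimal sketch, assuming the vaderSentiment package and a Loughran-McDonald word list stored as a CSV; the file name, column names, and valence weights are illustrative, not files or values from this repository:

# Minimal sketch: fold Loughran-McDonald terms into VADER's lexicon.
import csv
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

analyzer = SentimentIntensityAnalyzer()

# Hypothetical word list: one term per row with a "positive"/"negative" label.
with open("loughran_mcdonald.csv", newline="") as f:
    for row in csv.DictReader(f):
        # VADER valences run roughly -4..+4; +/-2.0 is an illustrative,
        # moderate weight for financial terms.
        analyzer.lexicon[row["word"].lower()] = 2.0 if row["label"] == "positive" else -2.0

# The 4-dimensional lexicon feature vector used by LAPEFT.
scores = analyzer.polarity_scores("The company's quarterly earnings exceeded expectations.")
features = [scores["compound"], scores["pos"], scores["neg"], scores["neu"]]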

Model Architecture

The LAPEFT model consists of several key components:

  1. Base Model: BERT-base-uncased with LoRA adapters
  2. Lexicon Features: 4-dimensional VADER sentiment features (compound, pos, neg, neu)
  3. Gated Fusion Layer: Learns the optimal combination of transformer and lexicon representations (a minimal sketch follows this list)
  4. Custom Classifier: Multi-layer classification head with dropout
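
The following PyTorch sketch shows one way to realize components 2-4. Module names, hidden sizes, and dropout values are assumptions for illustration; the authoritative implementation is in the training code:

import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Learns how much lexicon signal to mix into the [CLS] representation."""
    def __init__(self, hidden_size=768, lexicon_dim=4):
        super().__init__()
        # Project the 4-dim lexicon features into the transformer space.
        self.lexicon_proj = nn.Linear(lexicon_dim, hidden_size)
        # The gate decides, per dimension, how the two branches are mixed.
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, cls_hidden, lexicon_feats):
        lex = torch.tanh(self.lexicon_proj(lexicon_feats))
        gate = torch.sigmoid(self.gate(torch.cat([cls_hidden, lex], dim=-1)))
        # A convex combination keeps the fused vector bounded, limiting
        # over-reliance on the lexicon branch.
        return gate * cls_hidden + (1 - gate) * lex

class LapeftHead(nn.Module):
    """Gated fusion followed by a multi-layer classifier with dropout."""
    def __init__(self, hidden_size=768, num_labels=3, dropout=0.1):
        super().__init__()
        self.fusion = GatedFusion(hidden_size)
        self.classifier = nn.Sequential(
            nn.Dropout(dropout),
            nn.Linear(hidden_size, hidden_size),
            nn.GELU(),
            nn.Dropout(dropout),
            nn.Linear(hidden_size, num_labels),
        )

    def forward(self, cls_hidden, lexicon_feats):
        return self.classifier(self.fusion(cls_hidden, lexicon_feats))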

Model Features

  • Parameter Efficiency: Only ~1-2% of parameters are trainable via LoRA
  • Financial Domain Expertise: Enhanced with Loughran-McDonald financial sentiment lexicon
  • Memory Optimized: Gradient checkpointing and mixed precision training
  • Robust Architecture: The learned gate limits over-reliance on lexicon features

Usage

Note: This model requires custom loading code due to its specialized architecture with gated fusion and lexicon features.

Basic Inference (Simplified)

For basic usage, you can load just the PEFT adapter:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", 
    num_labels=3
)
tokenizer = AutoTokenizer.from_pretrained("Hananguyen12/LAPEFT-Financial-Sentiment-Analysis")

# Load PEFT adapter
model = PeftModel.from_pretrained(base_model, "Hananguyen12/LAPEFT-Financial-Sentiment-Analysis")
model.eval()  # disable dropout for deterministic inference

# Basic inference (without lexicon features)
text = "The company's quarterly earnings exceeded expectations."
inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1)

# Map predictions to labels
labels = ["negative", "neutral", "positive"]
sentiment = labels[predicted_class.item()]
print(f"Sentiment: {sentiment}")

Full LAPEFT Model (Advanced)

For full LAPEFT inference with lexicon features and gated fusion, you will need to implement the custom model architecture. See the training code for the complete implementation; a rough outline follows.
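
The sketch below combines the PEFT-adapted encoder with the extra components shipped in this repository, reusing the GatedFusion and LapeftHead classes from the architecture section. It assumes additional_components.pt holds a state dict compatible with those classes; treat it as a starting point, since the actual module names and keys may differ:

import pickle
import torch
from huggingface_hub import hf_hub_download
from transformers import AutoModel, AutoTokenizer
from peft import PeftModel

repo = "Hananguyen12/LAPEFT-Financial-Sentiment-Analysis"
tokenizer = AutoTokenizer.from_pretrained(repo)
# Load the bare encoder; depending on the adapter's task type you may need
# to load onto a sequence-classification model instead.
encoder = PeftModel.from_pretrained(AutoModel.from_pretrained("bert-base-uncased"), repo)
encoder.eval()

head = LapeftHead()  # from the architecture sketch above
state = torch.load(hf_hub_download(repo, "additional_components.pt"), map_location="cpu")
head.load_state_dict(state)  # keys may need remapping to this sketch's names
head.eval()

# The pickled analyzer is assumed to expose a VADER-style polarity_scores().
with open(hf_hub_download(repo, "lexicon_analyzer.pkl"), "rb") as f:
    analyzer = pickle.load(f)

text = "The company's quarterly earnings exceeded expectations."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
s = analyzer.polarity_scores(text)
lexicon = torch.tensor([[s["compound"], s["pos"], s["neg"], s["neu"]]])

with torch.no_grad():
    cls_hidden = encoder(**inputs).last_hidden_state[:, 0]  # [CLS] vector
    logits = head(cls_hidden, lexicon)

print(["negative", "neutral", "positive"][logits.argmax(-1).item()])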

Model Output

The model outputs 3 classes for financial sentiment:

  • 0: Negative sentiment - Bearish financial outlook
  • 1: Neutral sentiment - Neutral/factual financial information
  • 2: Positive sentiment - Bullish financial outlook
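
If you want predictions reported as label names rather than raw indices, the mapping can be attached to the config of the base model from the usage example above:

# Attach human-readable labels to the config used in the basic example.
base_model.config.id2label = {0: "negative", 1: "neutral", 2: "positive"}
base_model.config.label2id = {"negative": 0, "neutral": 1, "positive": 2}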

Training Details

  • Base Model: BERT-base-uncased
  • Fine-tuning Method: LoRA (rank=16, alpha=32) (see the configuration sketch after this list)
  • Sequence Length: 512 tokens
  • Lexicon: VADER + Loughran-McDonald Financial Dictionary
  • Fusion Method: Learnable gated fusion with attention mechanism
  • Optimization: Memory-optimized training with gradient checkpointing
  • Dataset: Financial sentiment dataset with 3-class labels
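
For reference, these settings translate into peft and transformers configuration roughly as follows; target_modules, dropout, batch size, and epoch count are assumptions not stated above:

from peft import LoraConfig, TaskType
from transformers import TrainingArguments

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,
    r=16,                               # LoRA rank
    lora_alpha=32,                      # LoRA scaling factor
    lora_dropout=0.1,                   # assumed value
    target_modules=["query", "value"],  # common choice for BERT, assumed
)

training_args = TrainingArguments(
    output_dir="lapeft-financial",
    per_device_train_batch_size=16,  # assumed value
    num_train_epochs=3,              # assumed value
    fp16=True,                       # mixed precision training
    gradient_checkpointing=True,     # trades compute for memory
)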

Performance

LAPEFT is designed to improve financial sentiment analysis by:

  • Leveraging domain-specific financial terminology
  • Combining neural and symbolic approaches
  • Using parameter-efficient fine-tuning for better generalization

Citation

If you use this model, please cite:

@misc{lapeft2024,
  title={LAPEFT: Lexicon-Augmented Parameter-Efficient Fine-Tuning for Financial Sentiment Analysis},
  author={Your Name},
  year={2024},
  note={Hugging Face Model Hub}
}

Model Files

  • adapter_config.json: LoRA adapter configuration
  • adapter_model.safetensors: LoRA adapter weights
  • additional_components.pt: Gated fusion and classifier weights
  • lexicon_analyzer.pkl: Financial lexicon analyzer
  • training_summary.json: Training metrics and configuration
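
These files are not pulled in automatically by peft; they can be fetched individually with huggingface_hub, for example to inspect the training summary:

import json
from huggingface_hub import hf_hub_download

repo = "Hananguyen12/LAPEFT-Financial-Sentiment-Analysis"
with open(hf_hub_download(repo, "training_summary.json")) as f:
    print(json.dumps(json.load(f), indent=2))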

Limitations

  • Requires custom loading code for full functionality
  • Optimized specifically for financial domain text
  • May not generalize well to other domains without retraining