FinBERT-Tone Gold LoRA Final

Model Description

This model is a gold-commodity-specific LoRA (Low-Rank Adaptation) fine-tune of yiyanghkust/finbert-tone for sentiment analysis of gold commodity news and financial documents.

Important: This is not a generic financial sentiment model. It is an asset-specific model that has been fine-tuned specifically for classifying sentiment (positive, negative, neutral, none) with respect to gold commodity news and market analysis.

Model Details

  • Base Model: yiyanghkust/finbert-tone
  • Model Type: BERT with LoRA adapter
  • Task: Sequence Classification (SEQ_CLS)
  • Specialization: Gold commodity sentiment analysis
  • Dataset: SaguaroCapital/sentiment-analysis-in-commodity-market-gold (10,000+ finance news stories)
  • Language: English
  • License: MIT
  • Architecture: BERT-based transformer with LoRA adaptation layers

Training Dataset

The model was trained using the SaguaroCapital/sentiment-analysis-in-commodity-market-gold dataset, which contains over 10,000 finance news stories specifically focused on gold commodity markets. The training process involved:

  • Asset-specific label harmonization for gold commodity news
  • Domain adaptation focused specifically on gold market sentiment
  • Fine-tuning for four sentiment classes: positive, negative, neutral, and none
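
A minimal sketch of loading the dataset and mapping its labels to the four classes used here; the column name "sentiment" and the label-id order are assumptions for illustration, not confirmed fields of the dataset:

from datasets import load_dataset

# The four classes this model predicts; the id order shown is illustrative
LABELS = ["positive", "negative", "neutral", "none"]
label2id = {label: i for i, label in enumerate(LABELS)}

dataset = load_dataset("SaguaroCapital/sentiment-analysis-in-commodity-market-gold")

def encode_label(example):
    # "sentiment" is an assumed column name holding the string label
    example["label"] = label2id[example["sentiment"].lower()]
    return example

dataset = dataset.map(encode_label)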

Evaluation Metrics

Performance on the validation set (from training logs):

  • Accuracy: 87%
  • F1 Score (Weighted): 0.86
  • F1 Score (Macro): 0.81

These metrics demonstrate strong performance on gold commodity-specific sentiment classification tasks.
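
Given per-example predictions on the validation split, these three numbers follow directly from scikit-learn; a minimal sketch with placeholder arrays (not actual evaluation outputs):

from sklearn.metrics import accuracy_score, f1_score

# Placeholder labels/predictions; in practice these come from running the
# model over the validation split
y_true = [0, 1, 2, 3, 0, 1]
y_pred = [0, 1, 2, 0, 0, 1]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1 (weighted):", f1_score(y_true, y_pred, average="weighted"))
print("F1 (macro):", f1_score(y_true, y_pred, average="macro"))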

Technical Specifications

LoRA Configuration

  • LoRA Rank (r): 8
  • LoRA Alpha: 16
  • LoRA Dropout: 0.05
  • Target Modules: ['value', 'query']
  • Modules to Save: ['classifier', 'score']
  • PEFT Type: LORA
  • Bias: none
  • Inference Mode: true
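
The settings above map one-to-one onto a PEFT LoraConfig; a minimal sketch of the equivalent training-time configuration (inference_mode is set to true automatically when the adapter is saved):

from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,               # sequence classification
    r=8,                                      # LoRA rank
    lora_alpha=16,                            # scaling factor
    lora_dropout=0.05,
    target_modules=["query", "value"],        # attention projections to adapt
    modules_to_save=["classifier", "score"],  # heads trained and saved in full
    bias="none",
)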

Tokenizer Configuration

  • Tokenizer Class: BertTokenizer
  • Vocabulary Size: WordPiece vocabulary inherited from the base finbert-tone tokenizer (vocab.txt, 226 kB)
  • Special Tokens: [PAD], [UNK], [CLS], [SEP], [MASK]
  • Case Handling: Lowercased (do_lower_case: true)
  • Chinese Character Tokenization: Enabled
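
A quick way to confirm the lowercasing and special-token behavior described above; the exact wordpiece split shown in the comment is illustrative:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("misraanay/finbert-tone-gold-lora-final")

encoded = tokenizer("Gold PRICES Surge")
# do_lower_case=true, so input is lowercased and wrapped in [CLS] ... [SEP]
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"]))
# e.g. ['[CLS]', 'gold', 'prices', 'surge', '[SEP]']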

Model Architecture

  • Base Architecture: BERT (Bidirectional Encoder Representations from Transformers)
  • Adaptation Method: LoRA (Low-Rank Adaptation)
  • Target Components: Query and Value projection layers in attention mechanisms
  • Additional Trainable Components: Classification head and scoring layers
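
For reference, LoRA freezes the pretrained weight and learns a low-rank update on top of it; the standard formulation, with this model's values r = 8 and α = 16, is

$$ h = W_0 x + \frac{\alpha}{r} B A x, \qquad A \in \mathbb{R}^{r \times d_{\text{in}}}, \quad B \in \mathbb{R}^{d_{\text{out}} \times r} $$

where W_0 is the frozen query or value projection and A, B are the only matrices trained inside the attention layers.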

Model Files

  • adapter_model.safetensors: LoRA adapter weights (1.2 MB)
  • adapter_config.json: LoRA configuration parameters
  • tokenizer.json: Fast tokenizer configuration (712 kB)
  • vocab.txt: Vocabulary file (226 kB)
  • special_tokens_map.json: Special token mappings
  • tokenizer_config.json: Tokenizer configuration
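
The adapter configuration can be inspected programmatically without loading any weights; a minimal sketch:

from peft import PeftConfig

config = PeftConfig.from_pretrained("misraanay/finbert-tone-gold-lora-final")
print(config.r, config.lora_alpha, config.target_modules)  # 8, 16, {'query', 'value'}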

Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from peft import PeftModel

# Load base model. This adapter classifies four classes (positive, negative,
# neutral, none) while the base finbert-tone ships a three-label head, so we
# request a four-way head here; its fresh weights are then overwritten by the
# classifier saved with the adapter (via modules_to_save).
base_model = AutoModelForSequenceClassification.from_pretrained(
    "yiyanghkust/finbert-tone",
    num_labels=4,
    ignore_mismatched_sizes=True,
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "misraanay/finbert-tone-gold-lora-final")
model.eval()

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("misraanay/finbert-tone-gold-lora-final")

# Example usage for gold commodity sentiment analysis
text = "Gold prices surge amid inflation concerns and central bank purchases."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

probs = torch.softmax(outputs.logits, dim=-1)
pred_id = int(probs.argmax(dim=-1))
print(pred_id, probs)  # map pred_id through the model's id2label for the class name

Training Details

Training Framework

  • Library: Transformers + PEFT
  • Method: LoRA (Low-Rank Adaptation)
  • Base Model: yiyanghkust/finbert-tone
  • Dataset: SaguaroCapital/sentiment-analysis-in-commodity-market-gold
  • Focus: Asset-specific label harmonization and domain adaptation for gold

LoRA Parameters

  • Low-rank decomposition applied to attention query and value matrices
  • Rank 8 decomposition with alpha scaling factor of 16
  • 5% dropout applied to LoRA layers
  • Adds only ~1.2 MB of adapter weights, versus updating the full model in standard fine-tuning
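
The ~1.2 MB figure is consistent with BERT-base dimensions (hidden size 768, 12 layers), which finbert-tone is assumed to use; a back-of-the-envelope check:

# Each targeted projection gains two matrices: A (rank x hidden) and B (hidden x rank)
hidden, layers, rank = 768, 12, 8
targets_per_layer = 2                      # query and value
params_per_target = 2 * rank * hidden      # A + B
total = layers * targets_per_layer * params_per_target
print(total, "params,", total * 4 / 1e6, "MB at fp32")
# 294912 params, ~1.18 MB (plus a few kB for the saved classifier head)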

Computational Efficiency

  • Parameter Efficiency: LoRA adapter contains only a small fraction of the base model parameters
  • Memory Efficiency: Reduced memory footprint during training and inference
  • Storage Efficiency: Adapter weights are only 1.2 MB compared to full model size

Intended Use

Primary Use Case

Gold commodity sentiment analysis and classification of gold-related financial text documents, news articles, and market reports.

Suitable Applications

  • Gold commodity news sentiment analysis
  • Gold market sentiment classification
  • Gold investment research text analysis
  • Gold price-related financial document analysis
  • Mining industry news sentiment scoring (gold-focused)

Out-of-Scope Use

  • General financial sentiment analysis (model is specialized specifically for gold commodity)
  • Other commodity markets (silver, oil, etc.) - model is gold-specific
  • Non-English text analysis
  • Tasks requiring generation rather than classification
  • General domain sentiment analysis

Limitations and Considerations

Model Limitations

  • Highly specialized for gold commodity domain - not suitable for general financial sentiment
  • Inherits potential biases from the base FinBERT model
  • Performance may vary on gold-related text from different time periods or regional markets
  • LoRA adaptation may not capture all possible task-specific patterns
  • Limited to English language gold commodity content

Computational Requirements

  • Requires PEFT library for proper loading and inference
  • Base model must be available for adapter loading
  • Standard BERT inference computational requirements

Model Card Authors

Anay Misra

Citation

If you use this model, please cite the original FinBERT paper:

@article{yang2020finbert,
  title={FinBERT: A Pretrained Language Model for Financial Communications},
  author={Yang, Yi and UY, Mark Christopher Siy and Huang, Allen},
  journal={arXiv preprint arXiv:2006.08097},
  year={2020}
}

Additional Information

  • Repository: Available on Hugging Face Model Hub
  • Model Size: Base model + 1.2 MB LoRA adapter
  • Inference: Compatible with transformers and PEFT libraries
  • Fine-tuning: Can be further adapted using LoRA or other PEFT methods
  • Specialization: Gold commodity-specific sentiment analysis
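
For deployment without a PEFT dependency at inference time, the adapter can be folded into the base weights using the standard PEFT API; a minimal sketch, continuing from the loading code in the Usage section:

# model is the PeftModel from the Usage section above
merged = model.merge_and_unload()  # folds the LoRA deltas into the base weights
merged.save_pretrained("finbert-tone-gold-merged")
tokenizer.save_pretrained("finbert-tone-gold-merged")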