---
language: en
license: apache-2.0
base_model: google/gemma-7b
tags:
- financial-sentiment-analysis
- fine-tuned
- peft
- lora
- financial-phrasebank
- gemma
datasets:
- financial_phrasebank
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: trained-gemma-sentences_allagree
  results:
  - task:
      type: text-classification
      name: Financial Sentiment Analysis
    dataset:
      type: financial_phrasebank
      name: Financial PhraseBank
      config: sentences_allagree
    metrics:
    - type: accuracy
      value: 0.876
      name: Accuracy
    - type: f1
      value: 0.870
      name: F1 Score
    - type: precision
      value: 0.875
      name: Precision
    - type: recall
      value: 0.865
      name: Recall
---

# Trained Gemma Sentences_Allagree

## Model Description

Gemma-7B fine-tuned for financial sentiment classification using LoRA (Low-Rank Adaptation) on the Financial PhraseBank dataset, restricted to sentences where all annotators agreed on the label (the `sentences_allagree` configuration).

## Model Details

- **Base Model**: google/gemma-7b
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Dataset**: Financial PhraseBank (sentences with 100% annotator agreement)
- **Task**: Financial Sentiment Analysis (3-class: positive, negative, neutral)
- **Language**: English

## Performance

| Metric | Value |
|--------|-------|
| Accuracy | 87.6% |
| F1 Score | 87.0% |
| Precision | 87.5% |
| Recall | 86.5% |

## Training Details

This model was fine-tuned as part of a Final Year Project on Financial Sentiment Analysis and Stock Prediction. The training used:

- **Training Framework**: Transformers + PEFT
- **Quantization**: 4-bit quantization using BitsAndBytes
- **Hardware**: CUDA-enabled GPU
- **Hyperparameter Optimization**: Extensive Optuna-based tuning
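
The exact training script and tuned hyperparameters are not published here; the sketch below shows a comparable PEFT + 4-bit setup, with illustrative values for the LoRA rank, alpha, dropout, and target modules (assumptions, not the tuned configuration):

```python
# Illustrative sketch of a comparable LoRA + 4-bit (QLoRA-style) setup.
# All hyperparameter values below are assumptions, not the tuned ones.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # 4-bit quantization via BitsAndBytes
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b",
    quantization_config=bnb_config,
    device_map="auto",
)

lora_config = LoraConfig(
    r=16,                                   # illustrative rank
    lora_alpha=32,                          # illustrative scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()          # only the adapter weights train
```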

## Usage

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-7b",
    torch_dtype=torch.float16,
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b")

# Load fine-tuned model
model = PeftModel.from_pretrained(base_model, "jengyang/trained-gemma-sentences_allagree-financial-sentiment")

# Prepare input
text = "The company reported strong quarterly earnings, exceeding analyst expectations."
prompt = f"Classify the sentiment of this financial text as positive, negative, or neutral: {text}\n\nSentiment:"

# Tokenize and generate (inputs must be on the same device as the model)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=10,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id
    )

response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
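
Because the model generates a continuation of the prompt, the label has to be parsed out of the decoded text. A minimal, hypothetical helper (not part of this repository):

```python
# Hypothetical post-processing helper; the repository does not ship one.
def parse_sentiment(response: str) -> str:
    # Take the text after the final "Sentiment:" marker in the prompt.
    tail = response.split("Sentiment:")[-1].strip().lower()
    for label in ("positive", "negative", "neutral"):
        if tail.startswith(label):
            return label
    return "neutral"  # fallback when no label is recognized

print(parse_sentiment(response))
```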

## Training Data

The model was trained on the Financial PhraseBank dataset, specifically using sentences where 100% of annotators agreed on the sentiment label. This ensures higher quality and consistency in the training data.

The Financial PhraseBank contains sentences from English-language financial news, categorized into:
- **Positive**: Favorable financial news
- **Negative**: Unfavorable financial news  
- **Neutral**: Factual financial information without clear sentiment
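
The same split is available on the Hugging Face Hub; loading it should look roughly like the sketch below. The `sentences_allagree` config name matches this card's metadata, and the label mapping in the comment is the dataset's documented convention (worth verifying against your `datasets` version):

```python
from datasets import load_dataset

# Financial PhraseBank, restricted to sentences with 100% annotator agreement.
# trust_remote_code is required for script-based datasets in recent versions.
ds = load_dataset(
    "financial_phrasebank", "sentences_allagree", trust_remote_code=True
)
print(ds["train"][0])  # e.g. {'sentence': '...', 'label': 1}  (0=negative, 1=neutral, 2=positive)
```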

## Evaluation

The model was evaluated on a held-out test set from the Financial PhraseBank dataset. The evaluation metrics reflect performance on financial sentiment classification with the 100% agreement threshold.
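
The card does not state how the F1, precision, and recall figures were aggregated across the three classes; a typical computation, assuming macro averaging, would look like:

```python
# Illustrative metric computation; macro averaging is an assumption,
# as the card does not specify the aggregation mode.
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

y_true = ["positive", "neutral", "negative", "neutral"]  # example gold labels
y_pred = ["positive", "neutral", "neutral", "neutral"]   # example predictions

acc = accuracy_score(y_true, y_pred)
prec, rec, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f} f1={f1:.3f}")
```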

**Note**: Gemma models in this series reached up to 87.6% accuracy, a strong result on financial sentiment classification under the 100% agreement threshold.

## Limitations and Bias

- The model is specifically designed for financial text sentiment analysis
- Performance may vary on non-financial text or different domains
- The model reflects the biases present in the Financial PhraseBank dataset
- Results should be interpreted within the context of financial sentiment analysis
- The model may not capture nuanced sentiment in complex financial scenarios

## Intended Use

**Intended Use Cases:**
- Financial news sentiment analysis
- Investment research and analysis
- Automated financial content classification
- Academic research in financial NLP

**Out-of-Scope Use Cases:**
- General-purpose sentiment analysis
- Medical or legal text analysis
- Real-time trading decisions without human oversight

## Citation

If you use this model, please cite:

```bibtex
@misc{trained_gemma_sentences_allagree,
  title={Trained Gemma Sentences_Allagree: Fine-tuned gemma-7b for Financial Sentiment Analysis},
  author={Final Year Project},
  year={2024},
  howpublished={\url{https://huggingface.co/jengyang/trained-gemma-sentences_allagree-financial-sentiment}}
}
```

## Model Card Authors

This model card was generated as part of a Final Year Project on Financial Sentiment Analysis and Stock Prediction.