---
license: cc-by-nc-4.0
base_model: snoels/FinGEITje-7B-sft
datasets:
- BramVanroy/ultra_feedback_dutch
library_name: peft
tags:
- alignment-handbook
- trl
- dpo
- generated_from_trainer
- geitje
- fingeitje
- dutch
- nl
- finance
model-index:
- name: snoels/FinGEITje-7B-dpo
  results: []
language:
- nl
pipeline_tag: text-generation
inference: false
---

[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/snoels/huggingface/runs/yng7mdb0)

<p align="center" style="margin:0;padding:0">
<img src="https://huggingface.co/snoels/FinGEITje-7B-dpo/resolve/main/fingeitje-banner-dpo.png" alt="FinGEITje DPO Banner" width="1000"/>
</p>

<div style="margin:auto; text-align:center">
  <h1 style="margin-bottom: 0; font-size: 2em;">🐐 FinGEITje 7B DPO</h1>
  <em style="font-size: 1em;">A large open Dutch financial language model aligned through AI feedback.</em>
</div>

This model is a fine-tuned version of [snoels/FinGEITje-7B-sft](https://huggingface.co/snoels/FinGEITje-7B-sft) on the [BramVanroy/ultra_feedback_dutch](https://huggingface.co/datasets/BramVanroy/ultra_feedback_dutch) dataset.

## 📖 Model Description

[FinGEITje-7B-dpo](https://huggingface.co/snoels/FinGEITje-7B-dpo) is a large open Dutch financial language model with 7 billion parameters, based on Mistral 7B. It has been further trained using **Direct Preference Optimization (DPO)** on AI-generated preference data, aligning the model's responses with human-like preferences in the Dutch language. This alignment process enhances the model's ability to generate more helpful, coherent, and user-aligned responses in financial contexts.

## 📊 Training

### Training Data

[FinGEITje-7B-dpo](https://huggingface.co/snoels/FinGEITje-7B-dpo) was fine-tuned on the [BramVanroy/ultra_feedback_dutch](https://huggingface.co/datasets/BramVanroy/ultra_feedback_dutch) dataset, which consists of synthetic preference data in Dutch. This dataset includes prompts along with preferred and less preferred responses, allowing the model to learn to generate more aligned responses through DPO.
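
As a rough illustration of how DPO uses such preference pairs, the sketch below computes the per-example DPO loss from summed log-probabilities of the preferred ("chosen") and less preferred ("rejected") responses. It is not the training code used for this model: the record fields, the log-probability values, and `beta=0.1` are assumptions made for the example, and the actual run's `beta` is not stated here.

```python
import math

# Hypothetical preference record; the field names are illustrative only.
example = {
    "prompt": "Wat is een balans?",
    "chosen": "Een balans is een overzicht van bezittingen en schulden.",
    "rejected": "Geen idee.",
}

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss from summed token log-probabilities.

    beta=0.1 is an assumed value, not one reported for this model.
    """
    # Implicit rewards: scaled log-ratio of the policy to the frozen reference.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)), computed as a numerically stable softplus(-margin).
    loss = max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
    return loss, chosen_reward, rejected_reward

# Made-up log-probabilities, purely to exercise the function.
loss, r_chosen, r_rejected = dpo_loss(-10.0, -20.0, -12.0, -15.0)
print(f"loss={loss:.4f} chosen={r_chosen:.2f} rejected={r_rejected:.2f}")
```

The loss shrinks as the chosen reward pulls ahead of the rejected reward, which is the quantity a DPO run tries to increase.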

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-06
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- gradient_accumulation_steps: 16
- total_train_batch_size: 64
- total_eval_batch_size: 4
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
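
The batch-size totals in this list follow from the per-device values, and the learning-rate settings describe a linear warmup followed by cosine decay. The sketch below reproduces that arithmetic. `TOTAL_STEPS` is an assumption, roughly estimated from the results table (100 steps correspond to about 0.1327 epochs), and `lr_at` is a hypothetical helper written for this example, not the scheduler implementation used in training.

```python
import math

per_device_train_batch = 1
per_device_eval_batch = 1
num_devices = 4
gradient_accumulation_steps = 16

# Effective batch sizes implied by the settings above.
total_train_batch = per_device_train_batch * num_devices * gradient_accumulation_steps
total_eval_batch = per_device_eval_batch * num_devices
print(total_train_batch, total_eval_batch)  # 64 4

TOTAL_STEPS = 754  # assumption: estimated as 100 / 0.1327, not a reported value

def lr_at(step, total_steps=TOTAL_STEPS, peak_lr=5e-6, warmup_ratio=0.1):
    """Linear warmup to peak_lr, then cosine decay towards zero (sketch)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

for step in (0, 38, 76, 400, 754):
    print(step, lr_at(step))
```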

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.1029        | 0.1327 | 100  | 0.1099          | -1.8067        | -5.3683          | 0.9679             | 3.5616          | -892.3373      | -579.9115    | -2.4775         | -2.3705       |
| 0.042         | 0.2654 | 200  | 0.0430          | -3.5129        | -10.6778         | 0.9828             | 7.1649          | -1423.2883     | -750.5289    | -1.9744         | -1.9895       |
| 0.0278        | 0.3981 | 300  | 0.0344          | -3.7335        | -13.5153         | 0.9828             | 9.7818          | -1707.0360     | -772.5893    | -1.7454         | -1.8191       |
| 0.0223        | 0.5308 | 400  | 0.0308          | -3.6554        | -13.7712         | 0.9858             | 10.1158         | -1732.6289     | -764.7831    | -1.8020         | -1.9184       |
| 0.0378        | 0.6635 | 500  | 0.0297          | -4.0018        | -16.3285         | 0.9851             | 12.3266         | -1988.3542     | -799.4221    | -1.6924         | -1.8650       |
| 0.0352        | 0.7962 | 600  | 0.0278          | -3.8104        | -15.6430         | 0.9836             | 11.8327         | -1919.8119     | -780.2752    | -1.7437         | -1.8978       |
| 0.0238        | 0.9289 | 700  | 0.0279          | -3.8974        | -15.9642         | 0.9828             | 12.0668         | -1951.9310     | -788.9780    | -1.7371         | -1.8937       |
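
In this table, `Rewards/margins` is the difference between `Rewards/chosen` and `Rewards/rejected`. The short check below, using values copied from the rows above, confirms this up to rounding in the last reported digit. The function name `max_margin_error` is introduced here for illustration only.

```python
# (step, rewards/chosen, rewards/rejected, rewards/margins) from the table above.
rows = [
    (100, -1.8067, -5.3683, 3.5616),
    (200, -3.5129, -10.6778, 7.1649),
    (300, -3.7335, -13.5153, 9.7818),
    (400, -3.6554, -13.7712, 10.1158),
    (500, -4.0018, -16.3285, 12.3266),
    (600, -3.8104, -15.6430, 11.8327),
    (700, -3.8974, -15.9642, 12.0668),
]

def max_margin_error(table_rows):
    """Largest gap between the reported margin and chosen - rejected."""
    return max(abs((chosen - rejected) - margin)
               for _, chosen, rejected, margin in table_rows)

print(f"max deviation: {max_margin_error(rows):.4f}")
```

Both reward columns are negative throughout, but the rejected reward falls much faster than the chosen one, which is why the margin grows over training.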

### Framework versions

- PEFT 0.11.1
- Transformers 4.42.4
- PyTorch 2.3.1
- Datasets 2.20.0
- Tokenizers 0.19.1

## 🛠️ How to Use

[FinGEITje-7B-dpo](https://huggingface.co/snoels/FinGEITje-7B-dpo) can be used with the Hugging Face Transformers library, with PEFT loading the adapter weights on top of the base model.

### Installation

Ensure you have the necessary libraries installed:

```bash
pip install torch transformers peft accelerate
```

### Loading the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("BramVanroy/GEITje-7B-ultra", use_fast=False)

# Load the base model
base_model = AutoModelForCausalLM.from_pretrained("BramVanroy/GEITje-7B-ultra", device_map='auto')

# Load the FinGEITje-7B-dpo model with PEFT adapters
model = PeftModel.from_pretrained(base_model, "snoels/FinGEITje-7B-dpo", device_map='auto')
```

### Generating Text

```python
# Prepare the input
input_text = "Wat zijn de laatste trends in de Nederlandse banksector?"
input_ids = tokenizer.encode(input_text, return_tensors='pt').to(model.device)

# Generate a response
outputs = model.generate(input_ids, max_length=200, num_return_sequences=1)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(response)
```
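
For decoder-only models, `generate` returns the prompt tokens followed by the newly generated ones, so the decoded `response` above begins with the question itself. If only the answer is wanted, the prompt can be sliced off before decoding. The sketch below shows the idea on plain Python lists; the token ids are made up and `continuation_only` is a hypothetical helper.

```python
def continuation_only(output_ids, prompt_length):
    """Return only the token ids that follow the prompt."""
    return output_ids[prompt_length:]

# Toy ids standing in for real tokenizer output (hypothetical values).
prompt_ids = [1, 450, 902, 17]
generated = prompt_ids + [88, 412, 2]

print(continuation_only(generated, len(prompt_ids)))  # [88, 412, 2]
```

With the objects from the snippet above, the same slice would be `outputs[0][input_ids.shape[-1]:]`, passed to `tokenizer.decode` in place of `outputs[0]`.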

## 🙏 Acknowledgements

We would like to thank:

- **Rijgersberg** ([GitHub](https://github.com/Rijgersberg)) for creating [GEITje](https://github.com/Rijgersberg/GEITje), one of the first Dutch foundation models.
- **Bram Vanroy** ([GitHub](https://github.com/BramVanroy)) for creating [GEITje-7B-ultra](https://huggingface.co/BramVanroy/GEITje-7B-ultra) and providing the ultra_feedback_dutch dataset.
- **Contributors of the [Alignment Handbook](https://github.com/huggingface/alignment-handbook)** for providing valuable resources that guided the development and training process of [FinGEITje-7B-dpo](https://huggingface.co/snoels/FinGEITje-7B-dpo).
- **Silverfin** for their collaboration in this research. Silverfin, a Belgian scale-up focused on building an accountancy cloud service, provided valuable insights and resources that were instrumental in the development of FinGEITje. More about their work can be found at [Silverfin](https://silverfin.com/).

## 📝 Citation
[Link to the paper](https://arxiv.org/abs/2410.12835) 

If you use [FinGEITje-7B-dpo](https://huggingface.co/snoels/FinGEITje-7B-dpo) in your work, please cite:

```bibtex
@article{FinGEITje2024,
  title={A Dutch Financial Large Language Model},
  author={Noels, Sander and De Blaere, Jorne and De Bie, Tijl},
  journal={arXiv preprint arXiv:2410.12835},
  year={2024},
  url={https://arxiv.org/abs/2410.12835}
}
```

## 📜 License

This model is licensed under the [Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/) license.

## 📧 Contact

For any inquiries or questions, please contact [Sander Noels](mailto:[email protected]).