---
language: en
tags:
- llm
- text-generation
- pytorch
license: mit
datasets:
- custom
metrics:
- accuracy
- f1
- bleu
- rouge
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
library_name: transformers
---

# JKU-G3-LLM-v2

## Model Description

JKU-G3-LLM-v2 is a language model developed by the ASER team at JKU. This second version builds on the previous version's architecture with a revised training methodology and a broader set of evaluation metrics.

## Training Details

### Hyperparameters
- Epochs: 5
- Training Samples: Unknown
- Batch Size: Unknown
- Learning Rate: Unknown

### Training Results
| Epoch | Training Loss | Validation Loss |
|-------|---------------|-----------------|
| 1     | 1.404300      | 1.422856        |
| 2     | 0.975000      | 1.363832        |
| 3     | 0.608100      | 1.515084        |
| 4     | 0.359300      | 1.794247        |
| 5     | 0.221200      | 2.057410        |

### Performance Metrics
- Final Training Loss: 0.7827
- Accuracy: 0.6
- F1 Score: 0.6
- BLEU: 0.0 (the 4-gram precision is 0.0, which zeroes BLEU's geometric mean; a reproduction sketch follows this list)
  - Precisions (1- to 4-gram): [0.667, 0.286, 0.2, 0.0]
  - Brevity Penalty: 1.0
  - Length Ratio: 1.125
  - Translation Length: 9
  - Reference Length: 8
- ROUGE:
  - ROUGE-1: 0.583
  - ROUGE-2: 0.286
  - ROUGE-L: 0.583
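
The generation metrics above are in the format produced by the Hugging Face `evaluate` library (BLEU with per-n-gram precisions and brevity penalty, aggregated ROUGE scores). The snippet below is a sketch of how such numbers can be recomputed; it assumes `evaluate` was used, and the texts and labels in it are placeholders, since the card does not publish the evaluation data or script.

```python
# Sketch: recompute BLEU/ROUGE/accuracy/F1 with the Hugging Face `evaluate`
# library. All predictions, references, and labels below are placeholders.
import evaluate

pred_texts = ["the cat sat on the mat today ok"]    # model outputs (placeholder)
ref_texts = [["the cat sat on the mat today"]]      # gold references (placeholder)
pred_labels, gold_labels = [1, 0, 1, 1, 0], [1, 0, 0, 1, 1]

bleu = evaluate.load("bleu").compute(predictions=pred_texts, references=ref_texts)
rouge = evaluate.load("rouge").compute(predictions=pred_texts,
                                       references=[r[0] for r in ref_texts])
acc = evaluate.load("accuracy").compute(predictions=pred_labels, references=gold_labels)
f1 = evaluate.load("f1").compute(predictions=pred_labels, references=gold_labels)

print("BLEU:", bleu["bleu"], "precisions:", bleu["precisions"],
      "brevity penalty:", bleu["brevity_penalty"], "length ratio:", bleu["length_ratio"])
print("ROUGE-1/2/L:", rouge["rouge1"], rouge["rouge2"], rouge["rougeL"])
print("accuracy:", acc["accuracy"], "F1:", f1["f1"])
```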

### Computational Details
- Total FLOS: 1.3368e+16
- Training Runtime: 5,246.06 seconds (~1.46 hours)
- Training Samples/Second: 13.003
- Training Steps/Second: 1.626
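
The throughput figures allow a rough cross-check of the unreported hyperparameters above. Treating the reported rates as exact (these are back-of-envelope estimates, not published values):

```python
# Back-of-envelope estimates derived from the reported throughput figures;
# none of these numbers are published hyperparameters.
runtime_s = 5246.06
samples_per_s = 13.003
steps_per_s = 1.626
epochs = 5

total_samples = samples_per_s * runtime_s            # ~68,200 samples seen in total
samples_per_epoch = total_samples / epochs           # ~13,600 training examples
total_steps = steps_per_s * runtime_s                # ~8,530 optimizer steps
effective_batch_size = samples_per_s / steps_per_s   # ~8 samples per optimizer step

print(round(samples_per_epoch), round(total_steps), round(effective_batch_size, 1))
```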

## Intended Uses & Limitations

This model is intended for research purposes and text generation tasks. Users should be aware of the following limitations:

1. The model shows clear signs of overfitting: training loss keeps decreasing while validation loss rises after epoch 2 (a mitigation sketch follows this list).
2. The performance metrics indicate room for improvement in generation quality.
3. The model may inherit biases present in the training data.
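
If the model was trained with the `transformers` Trainer (the card does not say), one common way to limit this overfitting is to evaluate every epoch, keep the checkpoint with the lowest validation loss, and stop once it stops improving. The configuration below is only a sketch under that assumption; the output directory and all values other than the epoch count are illustrative.

```python
# Hypothetical early-stopping setup for the transformers Trainer; the actual
# training script for this model is not published.
from transformers import TrainingArguments, EarlyStoppingCallback

args = TrainingArguments(
    output_dir="jku-g3-llm-v2-earlystop",   # illustrative path
    num_train_epochs=5,
    eval_strategy="epoch",                  # "evaluation_strategy" in older transformers releases
    save_strategy="epoch",
    load_best_model_at_end=True,            # restore the lowest-validation-loss checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
stopper = EarlyStoppingCallback(early_stopping_patience=1)

# Passing `args` and `callbacks=[stopper]` to Trainer(...) would stop training
# once validation loss worsens: with the losses reported above, training would
# halt after epoch 3 and keep the epoch-2 checkpoint (validation loss 1.3638).
```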

## How to Use

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("b-aser/jku-g3-llm-v2")
tokenizer = AutoTokenizer.from_pretrained("b-aser/jku-g3-llm-v2")

# Tokenize a prompt and generate a continuation
inputs = tokenizer("Your input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)  # cap new tokens; the default length limit is very short
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
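
The metadata at the top of this card lists `pipeline_tag: text-classification` with `google-bert/bert-base-uncased` as the base model, while the example above uses a causal-LM head. If the published checkpoint is in fact a classification fine-tune, loading it through a pipeline may be the more direct route; this is an assumption based on the card's tags, not a documented usage:

```python
from transformers import pipeline

# Assumes the checkpoint carries a sequence-classification head, as the
# card's pipeline_tag suggests; this call will fail if it does not.
classifier = pipeline("text-classification", model="b-aser/jku-g3-llm-v2")
print(classifier("Your input text here"))
```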



## Contact Information

For questions, feedback, or collaboration opportunities, please contact:

- Maintainer: Aser T. Alemu
- Email: [email protected]
- GitHub: github.com/b-aser