JKU-G3-LLM-v2

Model Description

JKU-G3-LLM-v2 is a language model developed by the ASER team at JKU. This second version builds on the previous release with a revised training setup and a broader set of evaluation metrics.

Training Details

Hyperparameters

  • Epochs: 5
  • Training Samples: Unknown (an estimate appears under Computational Details)
  • Batch Size: Unknown (an estimate appears under Computational Details)
  • Learning Rate: Unknown

Training Results

Epoch   Training Loss   Validation Loss
1       1.404300        1.422856
2       0.975000        1.363832
3       0.608100        1.515084
4       0.359300        1.794247
5       0.221200        2.057410
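
A per-epoch log like the one above is what the Hugging Face Trainer produces when both evaluation and logging run once per epoch. A minimal sketch of such a configuration (illustrative only: the output path is hypothetical, and since batch size and learning rate are unknown, the defaults are left in place):

from transformers import TrainingArguments

# Illustrative configuration: reproduces the logging cadence of the table
# above, not the (unknown) batch size or learning rate.
training_args = TrainingArguments(
    output_dir="jku-g3-llm-v2",  # hypothetical output path
    num_train_epochs=5,
    eval_strategy="epoch",       # evaluation_strategy on older transformers releases
    logging_strategy="epoch",    # log training loss once per epoch, as in the table
)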

Performance Metrics

  • Final Training Loss: 0.7827
  • Accuracy: 0.6
  • F1 Score: 0.6
  • BLEU: 0.0 (see the note after this list)
    • Precisions: [0.667, 0.286, 0.2, 0.0]
    • Brevity Penalty: 1.0
    • Length Ratio: 1.125
    • Translation Length: 9
    • Reference Length: 8
  • ROUGE:
    • ROUGE-1: 0.583
    • ROUGE-2: 0.286
    • ROUGE-L: 0.583
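
The BLEU of 0.0 follows directly from the precisions: corpus BLEU is the brevity penalty times the geometric mean of the n-gram precisions, and since the 4-gram precision is 0.0 the geometric mean (and hence BLEU) collapses to zero. Because the candidate is longer than the reference (length ratio 1.125), the brevity penalty stays at 1.0. A translation length of only 9 tokens also suggests the score was computed on a very small sample, so these figures should be read with caution. The field names match those returned by the evaluate library; a minimal sketch with hypothetical prediction/reference strings:

import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

# Hypothetical prediction/reference pair; the model card does not state
# which evaluation texts were used.
predictions = ["a generated sentence from the model"]
references = [["a reference sentence for the model"]]

# Returns 'bleu', 'precisions', 'brevity_penalty', 'length_ratio',
# 'translation_length', and 'reference_length', as reported above.
print(bleu.compute(predictions=predictions, references=references))

# Returns 'rouge1', 'rouge2', 'rougeL' (and 'rougeLsum').
print(rouge.compute(predictions=predictions, references=[r[0] for r in references]))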

Computational Details

  • Total FLOS: 1.3368e+16
  • Training Runtime: 5,246.06 seconds (~1.46 hours)
  • Training Samples/Second: 13.003
  • Training Steps/Second: 1.626
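
Although the sample count and batch size are listed as unknown above, the reported throughput pins them down approximately. A back-of-the-envelope calculation, assuming the rates are averages over the full run:

# Estimates derived from the reported throughput figures.
runtime_s = 5246.06
samples_per_s = 13.003
steps_per_s = 1.626
epochs = 5

total_samples = samples_per_s * runtime_s     # ~68,200 samples seen in total
samples_per_epoch = total_samples / epochs    # ~13,600 samples per epoch
total_steps = steps_per_s * runtime_s         # ~8,530 optimizer steps
batch_size_est = samples_per_s / steps_per_s  # ~8 samples per step

print(round(samples_per_epoch), round(total_steps), round(batch_size_est))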

Intended Uses & Limitations

This model is intended for research purposes and text generation tasks. Users should be aware of the following limitations:

  1. The model overfits: training loss decreases monotonically while validation loss rises after epoch 2 (a mitigation sketch follows this list).
  2. The performance metrics leave substantial room for improvement in generation quality.
  3. The model may inherit biases present in its training data.
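
Given the loss table, the best checkpoint is the one after epoch 2 (validation loss 1.3638). A standard mitigation for the overfitting noted in point 1 is early stopping on validation loss; a minimal sketch using the Trainer's built-in callback, where the model and dataset variables are placeholders and the hyperparameters shown are illustrative:

from transformers import Trainer, TrainingArguments, EarlyStoppingCallback

training_args = TrainingArguments(
    output_dir="jku-g3-llm-v2-early-stop",  # hypothetical output path
    num_train_epochs=5,
    eval_strategy="epoch",                  # evaluation_strategy on older releases
    save_strategy="epoch",
    load_best_model_at_end=True,            # restore the best (epoch-2) checkpoint
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,                  # placeholder: the model being fine-tuned
    args=training_args,
    train_dataset=train_dataset,  # placeholder datasets
    eval_dataset=eval_dataset,
    # Stop once validation loss fails to improve for one evaluation round.
    callbacks=[EarlyStoppingCallback(early_stopping_patience=1)],
)
trainer.train()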

How to Use

from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("b-aser/jku-g3-llm-v2")
tokenizer = AutoTokenizer.from_pretrained("b-aser/jku-g3-llm-v2")

# Tokenize a prompt and generate a continuation
inputs = tokenizer("Your input text here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)  # adjust max_new_tokens to taste
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
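
Equivalently, the high-level pipeline API wraps the same steps; a short sketch (the generation length is an arbitrary choice):

from transformers import pipeline

generator = pipeline("text-generation", model="b-aser/jku-g3-llm-v2")
print(generator("Your input text here", max_new_tokens=50)[0]["generated_text"])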


Contact Information

For questions, feedback, or collaboration opportunities, please contact the ASER team at JKU.
