JKU-G3-LLM-v2
Model Description
JKU-G3-LLM-v2 is a language model developed by the ASER team at JKU, with google-bert/bert-base-uncased as its base model. This second version builds on previous architectures with a revised training methodology and updated evaluation metrics.
Training Details
Hyperparameters
- Epochs: 5
- Training Samples: Unknown
- Batch Size: Unknown
- Learning Rate: Unknown
Training Results
Epoch | Training Loss | Validation Loss |
---|---|---|
1 | 1.404300 | 1.422856 |
2 | 0.975000 | 1.363832 |
3 | 0.608100 | 1.515084 |
4 | 0.359300 | 1.794247 |
5 | 0.221200 | 2.057410 |
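The trend in the table can be made explicit with a few lines of Python. This is a minimal sketch that uses only the values listed above; the names `history` and `gaps` are illustrative and are not taken from the original training code.

```python
# Per-epoch losses copied from the table above: (epoch, training, validation).
history = [
    (1, 1.404300, 1.422856),
    (2, 0.975000, 1.363832),
    (3, 0.608100, 1.515084),
    (4, 0.359300, 1.794247),
    (5, 0.221200, 2.057410),
]

# Epoch with the lowest validation loss.
best_epoch, _, best_val = min(history, key=lambda row: row[2])

# Generalization gap per epoch: validation loss minus training loss.
gaps = {epoch: val - train for epoch, train, val in history}

print(f"best epoch: {best_epoch} (validation loss {best_val:.6f})")
for epoch, gap in gaps.items():
    print(f"epoch {epoch}: gap {gap:+.6f}")
```

The gap widens at every epoch, and the lowest validation loss occurs at epoch 2.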
Performance Metrics
- Final Training Loss: 0.7827
- Accuracy: 0.6
- F1 Score: 0.6
- BLEU: 0.0
  - Precisions: [0.667, 0.286, 0.2, 0.0]
  - Brevity Penalty: 1.0
  - Length Ratio: 1.125
  - Translation Length: 9
  - Reference Length: 8
- ROUGE:
  - ROUGE-1: 0.583
  - ROUGE-2: 0.286
  - ROUGE-L: 0.583
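The BLEU figures above are mutually consistent under the standard unsmoothed BLEU definition with uniform 4-gram weights. The sketch below assumes that definition; the card does not say which implementation produced the numbers, and the helper names `brevity_penalty` and `bleu` are hypothetical.

```python
import math

# Values copied from the metrics listed above.
precisions = [0.667, 0.286, 0.2, 0.0]
translation_length = 9
reference_length = 8


def brevity_penalty(candidate_len: int, reference_len: int) -> float:
    """Standard BLEU brevity penalty: 1 unless the candidate is shorter."""
    if candidate_len >= reference_len:
        return 1.0
    return math.exp(1.0 - reference_len / candidate_len)


def bleu(ngram_precisions, bp: float) -> float:
    """Brevity penalty times the geometric mean of the n-gram precisions."""
    if min(ngram_precisions) == 0.0:
        # log(0) is undefined; without smoothing the score collapses to 0.
        return 0.0
    log_mean = sum(math.log(p) for p in ngram_precisions) / len(ngram_precisions)
    return bp * math.exp(log_mean)


bp = brevity_penalty(translation_length, reference_length)
length_ratio = translation_length / reference_length
score = bleu(precisions, bp)

print(f"brevity penalty: {bp}, length ratio: {length_ratio}, BLEU: {score}")
```

Because the 4-gram precision is 0.0, the geometric mean is zero regardless of the other precisions, which is why BLEU is 0.0 even though two thirds of the unigrams match.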
Computational Details
- Total FLOS: 1.3368e+16
- Training Runtime: 5,246.06 seconds (~1.46 hours)
- Training Samples/Second: 13.003
- Training Steps/Second: 1.626
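The throughput figures allow a rough estimate of the hyperparameters listed as unknown. The sketch below assumes that the per-second rates are averages over the whole run and count every sample and optimizer step across all 5 epochs; the card does not confirm this, so the results are estimates rather than reported values.

```python
# Values copied from this card.
runtime_s = 5246.06
samples_per_s = 13.003
steps_per_s = 1.626
epochs = 5

# Totals over the full run (assumption: rates are whole-run averages).
total_samples = samples_per_s * runtime_s
total_steps = steps_per_s * runtime_s

# Derived estimates.
samples_per_epoch = total_samples / epochs
samples_per_step = samples_per_s / steps_per_s  # effective batch size

print(f"runtime: {runtime_s / 3600:.2f} h")
print(f"samples processed: {total_samples:,.0f}")
print(f"optimizer steps: {total_steps:,.0f}")
print(f"estimated samples per epoch: {samples_per_epoch:,.0f}")
print(f"estimated effective batch size: {samples_per_step:.2f}")
```

Under these assumptions the run corresponds to roughly 13,600 samples per epoch and an effective batch size of about 8.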
Intended Uses & Limitations
This model is intended for research purposes and text generation tasks. Users should be aware of the following limitations:
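- The model overfits after epoch 2: training loss falls steadily from 1.4043 to 0.2212, while validation loss reaches its minimum of 1.3638 at epoch 2 and then rises to 2.0574 by epoch 5
- Generation quality is limited: BLEU is 0.0 because the 4-gram precision is 0.0, and accuracy and F1 are both 0.6; the reported translation and reference lengths (9 and 8) indicate that these metrics cover very little text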
- The model may inherit biases present in the training data
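One common way to limit the overfitting noted in the first point is to stop training once validation loss stops improving. The following is a minimal sketch of a patience-based stopping rule applied to the validation losses reported in this card; the helper `early_stop_epoch` and the patience value are assumptions for illustration and were not part of the original training setup.

```python
def early_stop_epoch(val_losses, patience=1):
    """Return (stop_epoch, best_epoch), both 1-indexed.

    Training stops after `patience` consecutive epochs without a new
    best validation loss. If that never happens, the last epoch is used.
    """
    best_loss = float("inf")
    best_epoch = 0
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch, best_epoch
    return len(val_losses), best_epoch


# Validation losses for epochs 1 to 5, copied from the training results.
val_losses = [1.422856, 1.363832, 1.515084, 1.794247, 2.057410]

stop_epoch, best_epoch = early_stop_epoch(val_losses, patience=1)
print(f"stop after epoch {stop_epoch}, keep checkpoint from epoch {best_epoch}")
```

With a patience of 1, this rule would halt after epoch 3 and keep the epoch 2 checkpoint, avoiding the two epochs in which validation loss degrades further.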
How to Use
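from transformers import AutoModelForCausalLM, AutoTokenizer
# Load the model and tokenizer from the Hugging Face Hub
model = AutoModelForCausalLM.from_pretrained("b-aser/jku-g3-llm-v2")
tokenizer = AutoTokenizer.from_pretrained("b-aser/jku-g3-llm-v2")
# Tokenize the prompt and generate a continuation
inputs = tokenizer("Your input text here", return_tensors="pt")
outputs = model.generate(**inputs)
# Decode the generated token IDs back to text, dropping special tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))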
Contact Information
For questions, feedback, or collaboration opportunities, please contact:
- Maintainer: Aser T. Alemu
- Email: [email protected]
- GitHub: github.com/b-aser
Model Tree
- Base Model: google-bert/bert-base-uncased