---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- aquif
- gpt2
- text-generation-inference
- math
- coding
- small
language:
- en
datasets:
- facebook/empathetic_dialogues
- openai/gsm8k
- codeparrot/codeparrot-clean
- brando/small-c4-dataset
---

# aquif-neo

**aquif-neo** is our first pretrained model, with 64.1 million parameters. Built purely as an experiment, it does not yet produce coherent text or reasoning.

## Model Overview

- **Name**: `aquif-neo`
- **Parameters**: 64.1 million
- **Architecture**: Dense
- **Type**: General-purpose LLM
- **Hosted on**: [Hugging Face](https://huggingface.co/aquiffoo/aquif-neo)
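
## Usage

A minimal sketch for trying the checkpoint with the `transformers` library. The repo id comes from the link above; loading via `AutoModelForCausalLM` and the generation settings are assumptions, so treat this as illustrative rather than a tested recipe.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the checkpoint from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("aquiffoo/aquif-neo")
model = AutoModelForCausalLM.from_pretrained("aquiffoo/aquif-neo")

# Sample a short continuation; expect incoherent output at this stage (see above)
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=40, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```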

## Training Steps

| Step | Loss   |
|------|--------|
| 500  | 0.9147 |
| 1000 | 0.7440 |
| 1500 | 0.6791 |
| 2000 | 0.6631 |
| 2500 | 0.6439 |
| 3000 | 0.6335 |
| 3500 | 0.6176 |
| 4000 | 0.5987 |
| 4500 | 0.5979 |
| 5000 | 0.6018 |
| 5500 | 0.5767 |
| 6000 | 0.5839 |
| 6500 | 0.5754 |
| 7000 | 0.5644 |
| 7500 | 0.5640 |
| 8000 | 0.5686 |
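
Loss drops steeply over the first ~4,000 steps and then hovers between roughly 0.56 and 0.60. To eyeball the curve, a quick sketch (assumes `matplotlib` is installed; the values are copied from the table above):

```python
import matplotlib.pyplot as plt

# (step, loss) pairs logged during pretraining, from the table above
steps = list(range(500, 8500, 500))
losses = [0.9147, 0.7440, 0.6791, 0.6631, 0.6439, 0.6335, 0.6176, 0.5987,
          0.5979, 0.6018, 0.5767, 0.5839, 0.5754, 0.5644, 0.5640, 0.5686]

plt.plot(steps, losses, marker="o")
plt.xlabel("Training step")
plt.ylabel("Loss")
plt.title("aquif-neo pretraining loss")
plt.show()
```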