---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- aquif
- gpt2
- text-generation-inference
- math
- coding
- small
language:
- en
datasets:
- facebook/empathetic_dialogues
- openai/gsm8k
- codeparrot/codeparrot-clean
- brando/small-c4-dataset
---
# aquif-neo
**aquif-neo** is our first pretrained model, with 64.1 million parameters. It was built purely as an experiment and does not yet produce coherent text or reasoning.
## Model Overview
- **Name**: `aquif-neo`
- **Parameters**: 64.1 million
- **Architecture**: Dense
- **Type**: General-purpose LLM
- **Hosted on**: [Hugging Face](https://huggingface.co/aquiffoo/aquif-neo)
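Since the card lists `transformers` as the library and tags a GPT-2-style architecture, a minimal loading sketch might look like the following (this assumes the repository ships a compatible tokenizer; given the model's experimental status, expect incoherent output):

```python
# Minimal usage sketch for aquif-neo, assuming the standard transformers
# causal-LM loading path works for this checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aquiffoo/aquif-neo"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation; sampling settings are illustrative only.
inputs = tokenizer("Once upon a time", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```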
## Training Steps
| Step | Loss   |
|-----:|-------:|
|  500 | 0.9147 |
| 1000 | 0.7440 |
| 1500 | 0.6791 |
| 2000 | 0.6631 |
| 2500 | 0.6439 |
| 3000 | 0.6335 |
| 3500 | 0.6176 |
| 4000 | 0.5987 |
| 4500 | 0.5979 |
| 5000 | 0.6018 |
| 5500 | 0.5767 |
| 6000 | 0.5839 |
| 6500 | 0.5754 |
| 7000 | 0.5644 |
| 7500 | 0.5640 |
| 8000 | 0.5686 |
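For reference, a quick matplotlib sketch (an illustrative aid, not part of the actual training code) reproduces the loss curve from the table above:

```python
# Plot the loss values reported in the Training Steps table.
import matplotlib.pyplot as plt

steps = list(range(500, 8001, 500))
losses = [0.9147, 0.7440, 0.6791, 0.6631, 0.6439, 0.6335, 0.6176, 0.5987,
          0.5979, 0.6018, 0.5767, 0.5839, 0.5754, 0.5644, 0.5640, 0.5686]

plt.plot(steps, losses, marker="o")
plt.xlabel("Training step")
plt.ylabel("Loss")
plt.title("aquif-neo pretraining loss")
plt.show()
```

The curve flattens after roughly step 4000, hovering around 0.56-0.60 for the remainder of the run.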