---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
- language
- aquif
- gpt2
- text-generation-inference
- math
- coding
- small
language:
- en
datasets:
- facebook/empathetic_dialogues
- openai/gsm8k
- codeparrot/codeparrot-clean
- brando/small-c4-dataset
---

# aquif-neo

**aquif-neo** is our first pretrained model, with 64.1 million parameters. Built purely as an experiment, it does not yet produce coherent text or reasoning.

## Model Overview
- **Name**: `aquif-neo`
- **Parameters**: 64.1 million
- **Architecture**: Dense
- **Type**: General-purpose LLM
- **Hosted on**: [Hugging Face](https://huggingface.co/aquiffoo/aquif-neo)
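
## Usage

Since the card lists `library_name: transformers` and a `gpt2` tag, the snippet below is a minimal loading sketch, not an official recipe from the authors: the model ID comes from the link above, and the generation settings are illustrative assumptions. Expect incoherent output at this stage.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Model ID taken from the Hugging Face link above.
model_id = "aquiffoo/aquif-neo"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation; sampling settings are illustrative.
inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```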

## Training Steps

| Step | Loss   |
|-----:|-------:|
|  500 | 0.9147 |
| 1000 | 0.7440 |
| 1500 | 0.6791 |
| 2000 | 0.6631 |
| 2500 | 0.6439 |
| 3000 | 0.6335 |
| 3500 | 0.6176 |
| 4000 | 0.5987 |
| 4500 | 0.5979 |
| 5000 | 0.6018 |
| 5500 | 0.5767 |
| 6000 | 0.5839 |
| 6500 | 0.5754 |
| 7000 | 0.5644 |
| 7500 | 0.5640 |
| 8000 | 0.5686 |
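
To visualize the trend (the loss drops quickly early on and plateaus around 0.56 after step 7000), here is a small hypothetical plotting sketch, not part of the original card, using only the values from the table above:

```python
import matplotlib.pyplot as plt

# Steps and losses copied from the training table above.
steps = list(range(500, 8500, 500))
losses = [0.9147, 0.7440, 0.6791, 0.6631, 0.6439, 0.6335, 0.6176,
          0.5987, 0.5979, 0.6018, 0.5767, 0.5839, 0.5754, 0.5644,
          0.5640, 0.5686]

# Plot loss against training step to show the early drop and plateau.
plt.plot(steps, losses, marker="o")
plt.xlabel("Training step")
plt.ylabel("Loss")
plt.title("aquif-neo pretraining loss")
plt.show()
```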