---
pipeline_tag: text-generation
inference: false
license: apache-2.0
library_name: transformers
tags:
  - language
  - aquif
  - gpt2
  - text-generation-inference
  - math
  - coding
  - small
language:
  - en
datasets:
  - facebook/empathetic_dialogues
  - openai/gsm8k
  - codeparrot/codeparrot-clean
  - brando/small-c4-dataset
---

# aquif-neo

aquif-neo is our first pretrained model, with 64.1 million parameters. Built purely as an experiment, it does not yet produce coherent text or reasoning.

## Model Overview

- **Name:** aquif-neo
- **Parameters:** 64.1 million
- **Architecture:** Dense
- **Type:** General-purpose LLM
- **Hosted on:** Hugging Face
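Since the card declares `library_name: transformers` and `pipeline_tag: text-generation`, the checkpoint can presumably be loaded with the standard `transformers` causal-LM classes. A minimal sketch follows; the repo id is a placeholder assumption, not a confirmed path, and given the experimental state of the model the sampled text will not be coherent.

```python
# Minimal sketch of loading aquif-neo with Hugging Face transformers.
# REPO_ID is a placeholder assumption -- substitute the actual Hub repo id.
REPO_ID = "aquiffoo/aquif-neo-small"


def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Load the checkpoint and sample a continuation.

    Imports are deferred so the heavy transformers/torch dependencies
    are only required when the function is actually called.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID)

    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
    )
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```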

## Training Steps

```
step 500  | loss = 0.9147
step 1000 | loss = 0.7440
step 1500 | loss = 0.6791
step 2000 | loss = 0.6631
step 2500 | loss = 0.6439
step 3000 | loss = 0.6335
step 3500 | loss = 0.6176
step 4000 | loss = 0.5987
step 4500 | loss = 0.5979
step 5000 | loss = 0.6018
step 5500 | loss = 0.5767
step 6000 | loss = 0.5839
step 6500 | loss = 0.5754
step 7000 | loss = 0.5644
step 7500 | loss = 0.5640
step 8000 | loss = 0.5686
```
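The log above can be summarized numerically; this short sketch uses only the loss values copied from the table and shows that the loss falls steeply early on and then flattens out past step ~4000.

```python
# Loss values copied verbatim from the training log above (step -> loss).
log = {
    500: 0.9147, 1000: 0.7440, 1500: 0.6791, 2000: 0.6631,
    2500: 0.6439, 3000: 0.6335, 3500: 0.6176, 4000: 0.5987,
    4500: 0.5979, 5000: 0.6018, 5500: 0.5767, 6000: 0.5839,
    6500: 0.5754, 7000: 0.5644, 7500: 0.5640, 8000: 0.5686,
}

first, last = log[500], log[8000]
reduction_pct = (first - last) / first * 100
print(f"loss fell from {first} to {last} ({reduction_pct:.1f}% reduction)")
# Most of the drop happens before step 4000; after that the curve plateaus.
```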