|
--- |
|
license: cc |
|
datasets: |
|
- MBZUAI/LaMini-instruction |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
--- |
|
|
|
# BrtGPT-1-Pre |
|
|
|
## 1. Introduction |
|
|
|
We are introducing our first question-and-answer language model, "BrtGPT-1-Pre" (BrtGPT-1-Preview). The model was trained on GPT-2-sized question-and-answer data (~150M tokens, 1 epoch) formatted with a chat template instead of plain text.
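
Below is a minimal usage sketch with the Hugging Face Transformers library. The repository ID is a placeholder (it is not stated in this card), and the sketch assumes the chat template used during training ships with the tokenizer; both should be verified against the published checkpoint.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ORG/BrtGPT-1-Pre"  # placeholder repo ID, not stated in this card; replace as needed

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# The model was trained on chat-template-formatted data, so prompts should be
# formatted the same way rather than passed in as plain text.
messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```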
|
|
|
The model performed surprisingly well at simple question answering, creative prompts, and knowledge-based chat.
|
|
|
It's quite good for general/everyday chat. |
|
|
|
But it has some shortcomings: |
|
- Simple math

- Code

- High-school and college-level science and engineering questions
|
|
|
If necessary, these deficiencies can be addressed by fine-tuning on the areas of concern.
|
Furthermore, while the model generally avoids harmful responses, users should still exercise caution, as potentially damaging output remains possible.
|
|
|
## 2. Technical Specifications |
|
|
|
Model specifications (a short sketch illustrating the context/output limits follows the list):
|
|
|
- Context length: 1024 tokens (~768 words) |
|
- Maximum output length: 128 tokens (~96 words) |
|
- Parameter count: ~90 Million |
|
- Architecture type: Transformer (Decoder-only) |
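
As a hedged illustration of how these limits interact: the prompt and the generated tokens share the same 1024-token window, so a prompt must leave room for up to 128 output tokens. The repository ID below is again a placeholder.

```python
from transformers import AutoTokenizer

MAX_CONTEXT = 1024    # context window, in tokens
MAX_NEW_TOKENS = 128  # maximum output length, in tokens

model_id = "ORG/BrtGPT-1-Pre"  # placeholder repo ID, as in the sketch above
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [{"role": "user", "content": "Summarize the story of a lighthouse keeper in two sentences."}]
prompt_len = len(tokenizer.apply_chat_template(messages, add_generation_prompt=True))

# The prompt and the output share one 1024-token window, so the prompt must
# leave room for up to 128 generated tokens.
assert prompt_len + MAX_NEW_TOKENS <= MAX_CONTEXT, "Prompt too long for the context window"
print(f"Prompt uses {prompt_len} of {MAX_CONTEXT - MAX_NEW_TOKENS} available prompt tokens.")
```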
|
|
|
|