BrtGPT-1-Pre
1. Introduction
NEW USAGE CODE! (Shorter and faster than the first code!)
We're introducing our first question-and-answer language model, "BrtGPT-1-Preview." The model was trained on GPT-2-sized question-and-answer data (~150M tokens, 1 epoch), formatted with a chat template instead of plain text.
The model performs surprisingly well on simple question answering, creative writing, and knowledge-based chat, and it is quite good for general/everyday conversation.
But it has some shortcomings:
- Simple math
- Code
- High school and college-level science and engineering questions

If necessary, these deficiencies can be addressed by fine-tuning on the relevant domains. The model generally avoids harmful responses, but caution should still be exercised, as harmful output remains possible.
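As an illustration of such domain fine-tuning, the sketch below runs a small supervised fine-tuning pass on a math Q&A set. It is only a sketch: the dataset file `my_math_qa.jsonl` and the hyperparameters are hypothetical, and it assumes the checkpoint loads through `AutoModelForCausalLM` with `trust_remote_code=True` and that the tokenizer ships the chat template mentioned above.

```python
# Hedged sketch: fine-tune BrtGPT-1-Pre on a small domain-specific Q&A set.
# Assumptions: the checkpoint loads via AutoModelForCausalLM with
# trust_remote_code=True, the tokenizer ships a chat template, and
# "my_math_qa.jsonl" (hypothetical) contains {"question": ..., "answer": ...} rows.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_id = "Bertug1911/BrtGPT-1-Pre"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # needed for padding during batching

dataset = load_dataset("json", data_files="my_math_qa.jsonl", split="train")

def to_chat(example):
    # Reuse the model's own chat template so fine-tuning data matches the pretraining format.
    messages = [
        {"role": "user", "content": example["question"]},
        {"role": "assistant", "content": example["answer"]},
    ]
    text = tokenizer.apply_chat_template(messages, tokenize=False)
    return tokenizer(text, truncation=True, max_length=1024)  # model context length

tokenized = dataset.map(to_chat, remove_columns=dataset.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="brtgpt-1-pre-math", num_train_epochs=1,
                           per_device_train_batch_size=8, learning_rate=5e-5),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```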
2. Technical Specifications
Model specifications:
- Context length: 1024 tokens (~768 words)
- Maximum output length: 128 tokens (~96 words)
- Parameter count: ~90 Million
- Architecture type: Transformer (Decoder-only)
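If you want to sanity-check the parameter count yourself, the short snippet below counts parameters after loading the checkpoint. It assumes the model loads through `AutoModelForCausalLM` with `trust_remote_code=True`, as in the usage code in the next section.

```python
# Hedged sketch: verify the ~90M parameter count locally.
# Assumes the checkpoint loads via AutoModelForCausalLM with trust_remote_code=True.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Bertug1911/BrtGPT-1-Pre", trust_remote_code=True
)
n_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {n_params / 1e6:.1f}M")  # expected to be roughly 90M
```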
3. Use
You can use the following code:
```python
from transformers import pipeline

# Pipeline
pipe = pipeline(
    "text-generation",
    model="Bertug1911/BrtGPT-1-Pre",
    trust_remote_code=True,
    top_k=40,            # Good for creativity
    temperature=0.8,     # Good for creativity
    max_new_tokens=128   # Default maximum model output (maximum 1024)
)

# Messages
messages = [
    {"role": "user", "content": "What is the capital of France?"},
]

# Generate
output = pipe(messages)

# Keep only the assistant's (model output) answer
assistant_response = output[0]["generated_text"][-1]["content"].strip()

# Special token conversions
formatted_out = assistant_response.replace(" ", "").replace("Ġ", " ").replace("Ċ", "\n")

print(formatted_out)
```
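If you prefer not to use the pipeline wrapper, a lower-level variant with `generate` looks roughly like the sketch below. It assumes the tokenizer ships the chat template mentioned in the introduction and that the checkpoint loads through the standard Auto classes with `trust_remote_code=True`.

```python
# Hedged sketch: the same call without the pipeline wrapper.
# Assumes the tokenizer ships a chat template and the checkpoint loads
# through the standard Auto classes with trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Bertug1911/BrtGPT-1-Pre"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

messages = [{"role": "user", "content": "What is the capital of France?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

with torch.no_grad():
    out = model.generate(input_ids, max_new_tokens=128, do_sample=True,
                         top_k=40, temperature=0.8)

# Decode only the newly generated tokens, then undo the BPE markers
# exactly as in the pipeline example above.
answer = tokenizer.decode(out[0][input_ids.shape[-1]:])
print(answer.replace(" ", "").replace("Ġ", " ").replace("Ċ", "\n"))
```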
3.1 Direct Use
You can use the model directly through a GUI (graphical interface) on Hugging Face Spaces.
3.2 Parameters

| | top_k | temperature | max_new_tokens |
|---|---|---|---|
| Creativity | 40-65 | 0.7-0.9 | 64-512 |
| Coding | 10-25 | 0.1-0.25 | 32-128 |
| Basic QA | 30-40 | 0.5-0.8 | 32-64 |
| Math | 1-15 | 0.05-0.15 | 16-64 |
| Knowledge-based QA | 20-30 | 0.4-0.6 | 32-64 |
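To show how these ranges are applied in practice, the call below reuses the `pipe` object from section 3 and overrides its sampling parameters at call time with values from the "Coding" row. The concrete values are just one point inside the recommended ranges, and the prompt is only an example.

```python
# Hedged sketch: reuse the pipeline from section 3 but override the sampling
# parameters per request, here with values from the "Coding" row of the table.
coding_messages = [{"role": "user", "content": "Write a Python function that reverses a string."}]
output = pipe(coding_messages, do_sample=True, top_k=15, temperature=0.2, max_new_tokens=128)

answer = output[0]["generated_text"][-1]["content"].strip()
print(answer.replace(" ", "").replace("Ġ", " ").replace("Ċ", "\n"))
```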
5. Usage Examples
Some example prompts and typical outputs:
| Prompt | Top-k | Temperature | Output |
|---|---|---|---|
| "What is the capital of France?" | 1-40 | 0.1-0.8 | "Paris." / "Capital of the France is Paris." |
| "Write me a story about penguins." | 40 | 0.1 | "Once upon a time, there was a young girl named Lily who loved to play fetch. She had always loved playing fetch, but she had never been to a local animal shelter. One day, she saw a group of children playing fetch, but she wasn't sure what to do." |
| "What is 55 * 3" | 10 | 0.15 | "55 * 3 is equal to 0." |
| "Write me a code that prints "Hello World" | 10 | 0.15 | "Here's a code that prints "Hello World" in a list of words:```for i in range(1, 2, 3, 4, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5, 5," |
6. Evaluation

| | BrtGPT-1-Pre | BrtGPT-1-0719 |
|---|---|---|
| AIME 2025 | 0% | Coming soon |
| MMLU high-school-math | 1.45% | Coming soon |
| GPQA Diamond | 1.01% | Coming soon |
7. Risks and Biases
The model may generate:
- Illegal outputs
- Harmful content

Use with caution!
Contact