|
---
license: cc-by-4.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation
- code-assistant
- a3on
- kaiiddo
- 1b-parameter
---
|
|
|
|
# A3ON-1B - Enhanced AI Assistant 🤖
|
|
|
## Model Overview |
|
|
|
Welcome to **A3ON-1B**, the enhanced version of the A3ON AI assistant! With **1.1 billion parameters**, this model delivers substantially improved capabilities over the original 124M-parameter A3ON. Whether you need help with conversational tasks or code generation, A3ON-1B is here to assist you!
|
|
|
## Key Features |
|
|
|
- **Enhanced Intelligence**: With 1.1B parameters, A3ON-1B offers more sophisticated understanding and responses. 🧠
- **Code Generation**: Get advanced programming assistance and code completion. 💻
- **Conversational Intelligence**: Engage in natural dialogue with seamless understanding and response generation. 🗣️
- **Context Awareness**: Maintains context across multi-turn conversations for more coherent interaction.
- **Smart Response Detection**: Automatically distinguishes between coding and general knowledge requests.
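This card does not describe how request detection is implemented. Purely as an illustration (the `is_code_request` and `generation_settings` helpers below are hypothetical, not part of A3ON-1B), a host application could route prompts with a simple keyword heuristic:

```python
# Hypothetical routing sketch -- NOT part of the A3ON-1B model itself.
# Shows how an application might pick generation settings per request type.
CODE_HINTS = ("def ", "class ", "function", "bug", "compile",
              "error", "import ", "snippet", "refactor", "```")

def is_code_request(prompt: str) -> bool:
    """Return True if the prompt looks like a coding question."""
    lowered = prompt.lower()
    return any(hint in lowered for hint in CODE_HINTS)

def generation_settings(prompt: str) -> dict:
    # A lower temperature keeps code output more deterministic.
    if is_code_request(prompt):
        return {"temperature": 0.2, "top_k": 20}
    return {"temperature": 0.7, "top_k": 50}
```

A real system would likely use a learned classifier or prompt formatting instead; this sketch only illustrates the idea of varying decoding settings by request type.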
|
|
|
## Technical Specifications |
|
|
|
| Specification | Details |
|---------------|---------|
| **Architecture** | Transformer-based neural network (GPTBigCode) |
| **Model Type** | Causal language model |
| **Parameters** | 1.1 billion (1,137,207,296) |
| **Vocabulary Size** | 49,152 tokens |
| **Context Length** | 8,192 tokens |
| **Precision** | FP32/FP16 support |
|
|
|
## Developer Information |
|
|
|
- **AI Name**: A3ON-1B
- **Developer**: Kaiiddo
- **Founder**: Aryan Rathod
- **Organization**: Kaiiddo
- **Location**: Gujarat, India 🇮🇳
|
|
|
## Usage |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("kaiiddo/A3ON-1B")
model = AutoModelForCausalLM.from_pretrained("kaiiddo/A3ON-1B")

# Set pad_token_id to eos_token_id to avoid warnings during generation
model.config.pad_token_id = model.config.eos_token_id

# Generate text with sampling enabled
inputs = tokenizer("Hello, how can I help you today?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=500,
    do_sample=True,
    temperature=0.7,
    top_k=50,
)

# Decode the output and print it line by line
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
for line in response.split("\n"):
    print(line)
```
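For intuition, the `temperature` and `top_k` arguments above control how the next token is drawn from the model's output scores. A minimal pure-Python sketch of top-k sampling with temperature scaling (illustrative only; `generate` implements this internally over tensors):

```python
import math
import random

def top_k_sample(logits, k, temperature=1.0, rng=random):
    """Sample an index from `logits` (a list of floats) using
    temperature scaling followed by top-k filtering."""
    # 1. Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [l / temperature for l in logits]
    # 2. Keep only the k highest-scoring token indices.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    # 3. Softmax over the surviving entries (subtract max for stability).
    m = max(scaled[i] for i in top)
    weights = [math.exp(scaled[i] - m) for i in top]
    total = sum(weights)
    probs = [w / total for w in weights]
    # 4. Draw one index according to those probabilities.
    return rng.choices(top, weights=probs, k=1)[0]
```

With `k=1` this reduces to greedy decoding; raising the temperature flattens the distribution and increases output diversity.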
|
|
|
### Model Parameter Count |
|
|
|
| Parameter Type | Count |
|----------------|-------|
| **Total Parameters** | 1.1B (1,137,207,296) |
| **Trainable Parameters** | 1.1B (1,137,207,296) |
| **Non-Trainable Parameters** | 0 |
|
|
|
### Model Architecture |
|
|
|
| Architecture Detail | Value |
|---------------------|-------|
| **Model Type** | GPTBigCodeForCausalLM |
| **Context Length** | 8,192 tokens |
| **Vocabulary Size** | 49,152 tokens |
| **Embedding Dimension** | 2048 |
| **Number of Layers** | 24 |
| **Number of Attention Heads** | 16 |
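As a quick sanity check on these figures, the token-embedding matrix alone accounts for `vocab_size × embedding_dim` parameters, roughly 9% of the total (a back-of-envelope calculation, not an official breakdown):

```python
vocab_size = 49_152
embedding_dim = 2048
total_params = 1_137_207_296

# Token-embedding matrix: one embedding_dim-sized vector per vocabulary entry.
embedding_params = vocab_size * embedding_dim
share = embedding_params / total_params

print(f"{embedding_params:,} embedding parameters ({share:.1%} of the total)")
```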
|
|
|
### Memory Information |
|
|
|
| Memory Detail | Value |
|---------------|-------|
| **Device** | cuda:0 |
| **Estimated Memory Usage** | 4.24 GB (FP32) |
| **GPU** | Tesla T4 |
| **GPU Memory** | 14.7 GB |
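The 4.24 GB figure follows directly from the parameter count: FP32 stores each parameter in 4 bytes (FP16 halves that). A quick check:

```python
total_params = 1_137_207_296

# FP32 uses 4 bytes per parameter; FP16 uses 2.
fp32_gb = total_params * 4 / 1024**3
fp16_gb = total_params * 2 / 1024**3

print(f"FP32: {fp32_gb:.2f} GB, FP16: {fp16_gb:.2f} GB")
```

Note this covers the weights only; activations and the KV cache add to it at inference time.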
|
|
|
### Model Category |
|
|
|
- **Category**: Massive Model (1B+) |
|
|
|
A3ON-1B is proudly developed in India, tailored to excel in coding assistance and beyond. 🚀