|
---
license: cc-by-4.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
tags:
- text-generation
- code-assistant
- a3on
- kaiiddo
- 1b-parameter
---
|
|
|
|
# A3ON-1B - Enhanced AI Assistant 🤖
|
|
|
## Model Overview |
|
|
|
Welcome to **A3ON-1B**, the enhanced version of the A3ON AI assistant! With **1.1 billion parameters**, this model delivers substantially improved capabilities over the original 124M-parameter A3ON. Whether you need help with conversational tasks or code generation, A3ON-1B is here to assist you!
|
|
|
## Key Features |
|
|
|
- **Enhanced Intelligence**: With 1.1B parameters, A3ON-1B offers more sophisticated understanding and responses. 🧠
- **Code Generation**: Get advanced programming assistance and code completion. 💻
- **Conversational Intelligence**: Engage in natural dialogue with seamless understanding and response generation. 🗣️
- **Context Awareness**: Maintains context across multi-turn conversations for more coherent interaction.
- **Smart Response Detection**: Automatically distinguishes between coding and general knowledge requests.
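This card does not describe how request detection is implemented. Purely as an illustration (the `is_code_request` and `generation_settings` helpers below are hypothetical, not part of A3ON-1B), a host application could route prompts with a simple keyword heuristic:

```python
# Hypothetical routing sketch -- NOT part of the A3ON-1B model itself.
# Shows how an application might pick generation settings per request type.
CODE_HINTS = ("def ", "class ", "function", "bug", "compile",
              "error", "import ", "snippet", "refactor", "```")

def is_code_request(prompt: str) -> bool:
    """Return True if the prompt looks like a coding question."""
    lowered = prompt.lower()
    return any(hint in lowered for hint in CODE_HINTS)

def generation_settings(prompt: str) -> dict:
    # A lower temperature keeps code output more deterministic.
    if is_code_request(prompt):
        return {"temperature": 0.2, "top_k": 20}
    return {"temperature": 0.7, "top_k": 50}
```

A real system would likely use a learned classifier or prompt formatting instead; this sketch only illustrates the idea of varying decoding settings by request type.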
|
|
|
## Technical Specifications |
|
|
|
| Specification | Details |
|---------------|---------|
| **Architecture** | Transformer-based neural network (GPTBigCode) |
| **Model Type** | Causal language model |
| **Parameters** | 1.1 billion (1,137,207,296) |
| **Vocabulary Size** | 49,152 tokens |
| **Context Length** | 8,192 tokens |
| **Precision** | FP32/FP16 support |
|
|
|
## Developer Information |
|
|
|
- **AI Name**: A3ON-1B
- **Developer**: Kaiiddo
- **Founder**: Aryan Rathod
- **Organization**: Kaiiddo
- **Location**: Gujarat, India 🇮🇳
|
|
|
## Usage |
|
|
|
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("kaiiddo/A3ON-1B")
model = AutoModelForCausalLM.from_pretrained("kaiiddo/A3ON-1B")

# Set pad_token_id to eos_token_id to avoid warnings during generation
model.config.pad_token_id = model.config.eos_token_id

# Generate text with sampling enabled
inputs = tokenizer("Hello, how can I help you today?", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=500,
    do_sample=True,
    temperature=0.7,
    top_k=50,
)

# Decode the output and print it line by line
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
for line in response.split("\n"):
    print(line)
```
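For intuition, the `temperature` and `top_k` arguments above control how the next token is drawn from the model's output scores. A minimal pure-Python sketch of top-k sampling with temperature scaling (illustrative only; `generate` implements this internally over tensors):

```python
import math
import random

def top_k_sample(logits, k, temperature=1.0, rng=random):
    """Sample an index from `logits` (a list of floats) using
    temperature scaling followed by top-k filtering."""
    # 1. Temperature scaling: values below 1.0 sharpen the distribution.
    scaled = [l / temperature for l in logits]
    # 2. Keep only the k highest-scoring token indices.
    top = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)[:k]
    # 3. Softmax over the surviving entries (subtract max for stability).
    m = max(scaled[i] for i in top)
    weights = [math.exp(scaled[i] - m) for i in top]
    total = sum(weights)
    probs = [w / total for w in weights]
    # 4. Draw one index according to those probabilities.
    return rng.choices(top, weights=probs, k=1)[0]
```

With `k=1` this reduces to greedy decoding; raising the temperature flattens the distribution and increases output diversity.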
|
|
|
### Model Parameter Count |
|
|
|
| Parameter Type | Count |
|----------------|-------|
| **Total Parameters** | 1.1B (1,137,207,296) |
| **Trainable Parameters** | 1.1B (1,137,207,296) |
| **Non-Trainable Parameters** | 0 |
|
|
|
### Model Architecture |
|
|
|
| Architecture Detail | Value |
|---------------------|-------|
| **Model Type** | GPTBigCodeForCausalLM |
| **Context Length** | 8,192 tokens |
| **Vocabulary Size** | 49,152 tokens |
| **Embedding Dimension** | 2048 |
| **Number of Layers** | 24 |
| **Number of Attention Heads** | 16 |
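As a quick sanity check on these figures, the token-embedding matrix alone accounts for `vocab_size × embedding_dim` parameters, roughly 9% of the total (a back-of-envelope calculation, not an official breakdown):

```python
vocab_size = 49_152
embedding_dim = 2048
total_params = 1_137_207_296

# Token-embedding matrix: one embedding_dim-sized vector per vocabulary entry.
embedding_params = vocab_size * embedding_dim
share = embedding_params / total_params

print(f"{embedding_params:,} embedding parameters ({share:.1%} of the total)")
```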
|
|
|
### Memory Information |
|
|
|
| Memory Detail | Value |
|---------------|-------|
| **Device** | cuda:0 |
| **Estimated Memory Usage** | 4.24 GB (FP32) |
| **GPU** | Tesla T4 |
| **GPU Memory** | 14.7 GB |
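The 4.24 GB figure follows directly from the parameter count: FP32 stores each parameter in 4 bytes (FP16 halves that). A quick check:

```python
total_params = 1_137_207_296

# FP32 uses 4 bytes per parameter; FP16 uses 2.
fp32_gb = total_params * 4 / 1024**3
fp16_gb = total_params * 2 / 1024**3

print(f"FP32: {fp32_gb:.2f} GB, FP16: {fp16_gb:.2f} GB")
```

Note this covers the weights only; activations and the KV cache add to it at inference time.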
|
|
|
### Model Category |
|
|
|
- **Category**: Massive Model (1B+) |
|
|
|
A3ON-1B is proudly developed in India, tailored to excel in coding assistance and beyond. 🚀