Odd Eyed Black Cat by fourbyfourblazer, on Flickr
Table of Contents
- Model Description
- Model Architecture
- Training Data
- Training Procedure
- Usage
- Limitations
- Ethical Considerations
- Acknowledgements
- Citations
- License
Model Description
cat0.1 is a 3-billion-parameter conversational AI model stored in 4-bit precision for efficiency. It is designed for dynamic, uncensored dialogue and has been trained over the past eight months through an iterative process of training and interactive chatting. The model can take on a diverse range of characters, which makes its interactions versatile. cat0.1 is adapted from unsloth/Llama-3.2-3B-bnb-4bit.
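As a rough illustration of what 4-bit precision means for a model of this size, the sketch below estimates weight storage from the parameter count alone. It is back-of-the-envelope arithmetic, not a measurement of cat0.1: it assumes every parameter is stored at the stated bit width and ignores quantization constants, activations, and the KV cache. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# "3 billion parameters" as stated in the description; the exact count may differ.
NUM_PARAMS = 3e9

fp16_gb = weight_memory_gb(NUM_PARAMS, 16)  # half-precision baseline
int4_gb = weight_memory_gb(NUM_PARAMS, 4)   # 4-bit storage

print(f"16-bit weights: ~{fp16_gb:.1f} GB")
print(f"4-bit weights:  ~{int4_gb:.1f} GB")
```

Under these assumptions the 4-bit weights take about a quarter of the space of a 16-bit copy, which is what makes training on a single laptop GPU plausible.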
Model Architecture
- Parameters: 3 billion
- Precision: 4-bit
- Training Configuration:
  - Rank: 32
  - Alpha: 64
- Hardware: Trained on an RTX 4090 laptop GPU
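The card does not name the adapter method, but "Rank" and "Alpha" are the usual names for LoRA hyperparameters. Assuming that reading, the sketch below shows what the two values imply in standard LoRA: the adapter update is scaled by alpha / rank, and each adapted weight matrix gains rank * (d_in + d_out) trainable parameters. The hidden size of 3072 is an assumption used only for illustration, and the function names are hypothetical.

```python
RANK = 32   # "Rank" from the list above, read here as the LoRA rank r
ALPHA = 64  # "Alpha" from the list above, read here as the LoRA alpha


def lora_scaling(alpha: int, rank: int) -> float:
    """Scaling applied to the low-rank update in standard LoRA."""
    return alpha / rank


def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters for one adapted matrix: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


HIDDEN = 3072  # assumed hidden size, for illustration only

scaling = lora_scaling(ALPHA, RANK)
adapter_params = lora_param_count(HIDDEN, HIDDEN, RANK)
full_params = HIDDEN * HIDDEN

print(f"scaling factor: {scaling}")
print(f"adapter params per square projection: {adapter_params}")
print(f"fraction of the full matrix: {adapter_params / full_params:.4f}")
```

With these numbers the adapter for one square projection is roughly 2% of the size of the matrix it modifies, so only a small share of the weights is updated in each training round.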
Training Data
The model was trained on a diverse set of conversational data collected over eight months. The data includes interactions with various characters, ensuring a wide range of conversational styles and topics. Training data is continuously updated with new chunks, allowing the model to evolve and adapt over time.
Training Procedure
cat0.1 employs a progressive training approach:
- Initial Training: The model is initially trained on a base set of conversational data.
- Interactive Training: The trained model is engaged in chats, generating new data based on its interactions.
- Data Update Cycle:
  - Data Collection: New conversational data chunks are gathered from interactions.
  - Training Update: The model is retrained with the new data. Occasionally, older data is removed to focus on recent interactions, while retaining previous model parameters.
- Iteration: This cycle of training and data updating is repeated frequently to ensure the model remains current and responsive.
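The cycle above can be sketched as a sliding window over data chunks, with each round continuing from the previous parameters. This is a minimal illustration, not the actual training script: `run_update_cycle`, `fake_train`, the window size of 2, and the chunk contents are all hypothetical. The sketch also evicts the oldest chunk whenever the window is full, which is simpler than the occasional removal described above.

```python
from collections import deque
from typing import Callable, Deque, List

Chunk = List[str]  # hypothetical: one chunk is a list of chat transcripts


def run_update_cycle(
    params: dict,
    window: Deque[Chunk],
    new_chunk: Chunk,
    train_fn: Callable[[dict, List[str]], dict],
) -> dict:
    """One pass of the cycle: add new data, drop the oldest if full, retrain."""
    window.append(new_chunk)  # a deque with maxlen evicts the oldest chunk itself
    examples = [text for chunk in window for text in chunk]
    return train_fn(params, examples)  # continue from the existing parameters


def fake_train(params: dict, examples: List[str]) -> dict:
    """Stand-in for a real fine-tuning step; it only records what it saw."""
    return {"updates": params["updates"] + 1, "last_seen": len(examples)}


params = {"updates": 0, "last_seen": 0}
window: Deque[Chunk] = deque(maxlen=2)  # hypothetical window size

for chunk in (["a1", "a2"], ["b1"], ["c1", "c2", "c3"]):
    params = run_update_cycle(params, window, chunk, fake_train)

print(params)
print(list(window))
```

In a real setup `train_fn` would wrap a fine-tuning run that loads the previous adapter weights, and the eviction policy would follow whatever schedule the data actually calls for.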
Usage
cat0.1 is designed for applications requiring dynamic and unrestricted conversational capabilities. Suitable use cases include:
- Chatbots: For platforms needing engaging and versatile conversational agents.
- Creative Writing Assistance: Helping writers generate dialogue and character interactions.
- Entertainment: Providing interactive experiences in games and virtual environments.
Example
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("rwitz/cat0.1")
model = AutoModelForCausalLM.from_pretrained("rwitz/cat0.1", torch_dtype=torch.float16)

# Encode input
input_ids = tokenizer.encode("Hello, how are you?", return_tensors="pt")

# Generate response (no gradients are needed for inference)
with torch.no_grad():
    output = model.generate(input_ids, max_length=50)

# Decode and print
print(tokenizer.decode(output[0], skip_special_tokens=True))