rwitz/cat0.1 · Hugging Face

Odd Eyed Black Cat by fourbyfourblazer, on Flickr

Model Description
Model Architecture
Training Data
Training Procedure
Usage
Limitations
Ethical Considerations
Acknowledgements
Citations
License

Model Description

cat0.1 is a conversational AI model with 3 billion parameters, stored at 4-bit precision for efficiency. It is designed for dynamic, uncensored dialogue and has been developed over the past eight months through an iterative cycle of training and interactive chatting. The model can take on a diverse range of characters, which makes its interactions versatile and engaging. cat0.1 is adapted from unsloth/Llama-3.2-3B-bnb-4bit.
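As a rough illustration of what 4-bit precision buys, the sketch below estimates the memory needed for the weights alone. It assumes a nominal 3 billion parameters and ignores quantization metadata, any layers kept at higher precision, activations, and the KV cache, so real usage will be higher. The helper name `weight_memory_gib` is made up for this example.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Approximate weight storage in GiB (weights only, no overhead)."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 1024**3


# Nominal parameter count taken from the model card (an approximation).
NUM_PARAMS = 3_000_000_000

four_bit_gib = weight_memory_gib(NUM_PARAMS, 4)
half_precision_gib = weight_memory_gib(NUM_PARAMS, 16)

print(f"4-bit weights:  ~{four_bit_gib:.2f} GiB")
print(f"16-bit weights: ~{half_precision_gib:.2f} GiB")
```

Under these assumptions the 4-bit weights take about a quarter of the space of 16-bit weights, which is consistent with the model being trained on a single laptop GPU.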

Model Architecture

Parameters: 3 billion
Precision: 4-bit
Training Configuration:
- Rank: 32
- Alpha: 64
Hardware: Trained on an RTX 4090 laptop GPU
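The card does not name the fine-tuning method, but "Rank" and "Alpha" are the usual names for LoRA adapter hyperparameters. Assuming that is what they mean here, the sketch below shows what the two values imply under the standard LoRA formulation: the adapter update is scaled by alpha / rank, and each adapted weight matrix gains rank × (d_in + d_out) trainable parameters. The `AdapterConfig` class and the 3072-wide square projection are illustrative assumptions, not details taken from this card.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterConfig:
    """Hypothetical container for the Rank/Alpha values listed above."""
    rank: int = 32
    alpha: int = 64

    @property
    def scaling(self) -> float:
        # Standard LoRA scales the low-rank update by alpha / rank.
        return self.alpha / self.rank

    def adapter_params(self, d_in: int, d_out: int) -> int:
        # A (rank x d_in) matrix plus a (d_out x rank) matrix.
        return self.rank * (d_in + d_out)


cfg = AdapterConfig()

# Illustrative layer size only; assumed, not stated in the model card.
D_IN = D_OUT = 3072

trainable = cfg.adapter_params(D_IN, D_OUT)
frozen = D_IN * D_OUT

print(f"scaling factor: {cfg.scaling}")
print(f"adapter params per matrix: {trainable} ({trainable / frozen:.2%} of the frozen matrix)")
```

If the assumption holds, a rank of 32 with an alpha of 64 gives a scaling factor of 2, and each adapted matrix of that size trains only about 2% as many parameters as the frozen matrix contains.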

Training Data

The model was trained on a diverse set of conversational data collected over eight months. The data includes interactions with various characters, ensuring a wide range of conversational styles and topics. Training data is continuously updated with new chunks, allowing the model to evolve and adapt over time.

Training Procedure

cat0.1 employs a progressive training approach:

Initial Training: The model is initially trained on a base set of conversational data.
Interactive Training: The trained model is engaged in chats, generating new data based on its interactions.
Data Update Cycle:
- Data Collection: New conversational data chunks are gathered from interactions.
- Training Update: The model is retrained on the new data, starting from its previous parameters. Occasionally, older data is removed so that training focuses on recent interactions.
Iteration: This cycle of training and data updating is repeated frequently to ensure the model remains current and responsive.
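The cycle above can be sketched as a short loop. This is a minimal illustration, not the author's training code: `run_update_cycle`, `collect_chunk`, `train`, and `max_chunks` are hypothetical names, and a fixed-size window stands in for the occasional removal of older data. The `train` callable receives the previous model state, which mirrors retraining while retaining previous parameters.

```python
from collections import deque
from typing import Any, Callable, List, Optional, Tuple

Chunk = List[str]  # one chunk = a list of conversation transcripts


def run_update_cycle(
    initial_chunk: Chunk,
    collect_chunk: Callable[[Any], Chunk],
    train: Callable[[Optional[Any], List[Chunk]], Any],
    num_iterations: int,
    max_chunks: Optional[int] = None,
) -> Tuple[Any, List[Chunk]]:
    """Train, chat to collect new data, retrain; repeat num_iterations times."""
    # A bounded deque silently drops the oldest chunk once it is full.
    chunks: deque = deque([initial_chunk], maxlen=max_chunks)

    # Initial training on the base data (no previous state yet).
    state = train(None, list(chunks))

    for _ in range(num_iterations):
        # Interactive training: chat with the current model to gather data.
        chunks.append(collect_chunk(state))
        # Training update: continue from the previous state on current data.
        state = train(state, list(chunks))

    return state, list(chunks)


# Stand-in callables so the sketch runs without a real model.
def fake_collect(state: dict) -> Chunk:
    return [f"chat after {state['steps']} updates"]


def fake_train(state: Optional[dict], chunks: List[Chunk]) -> dict:
    steps = 0 if state is None else state["steps"]
    return {"steps": steps + 1, "seen": sum(len(c) for c in chunks)}


final_state, kept_chunks = run_update_cycle(
    ["seed conversation"], fake_collect, fake_train,
    num_iterations=3, max_chunks=2,
)
print(final_state, kept_chunks)
```

With `max_chunks=2`, the seed data falls out of the window after the second iteration, so later updates see only the most recent chats while the model state carries forward.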

Usage

cat0.1 is designed for applications requiring dynamic and unrestricted conversational capabilities. Suitable use cases include:

Chatbots: For platforms needing engaging and versatile conversational agents.
Creative Writing Assistance: Helping writers generate dialogue and character interactions.
Entertainment: Providing interactive experiences in games and virtual environments.

Example

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("rwitz/cat0.1")
model = AutoModelForCausalLM.from_pretrained("rwitz/cat0.1", torch_dtype=torch.float16)

# Tokenize the prompt (returns input_ids and attention_mask) and
# move the tensors to the same device as the model
inputs = tokenizer("Hello, how are you?", return_tensors="pt").to(model.device)

# Generate up to 50 new tokens; no gradients are needed for inference
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=50)

# Decode the generated token IDs and print the text
print(tokenizer.decode(output[0], skip_special_tokens=True))
