---
language: en
license: apache-2.0
tags:
  - compact-ai
  - interleaved-thinking
  - transformer
  - pytorch
  - reasoning
datasets:
  - custom
---
# Compact AI Model with Interleaved Thinking
A compact AI model that implements interleaved thinking for enhanced reasoning capabilities. This model combines efficient transformer architecture with parallel reasoning paths to achieve better performance on complex tasks.
## Model Details

### Model Description
This is a compact AI model designed for efficient inference while maintaining strong reasoning capabilities through interleaved thinking. The model uses multiple parallel reasoning paths that work together to solve complex problems.
### Model Architecture

- **Base Architecture**: Transformer with efficient attention mechanisms
- **Key Features**:
  - Interleaved thinking with parallel reasoning paths (see the sketch after this list)
  - Hierarchical reasoning with different abstraction levels
  - Adaptive memory compression
  - Early stopping based on confidence thresholds
  - RoPE positional embeddings
  - Flash attention support
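The repository's exact implementation of these mechanisms is not reproduced here; the following is a minimal PyTorch sketch of the general idea behind parallel reasoning paths with confidence-based early stopping. All class, method, and parameter names are hypothetical.

```python
import torch
import torch.nn as nn


class ParallelReasoningBlock(nn.Module):
    """Illustrative sketch only: a stack of reasoning paths whose refinement
    stops early once a learned confidence estimate crosses a threshold.
    Names and structure are hypothetical, not the actual repository code."""

    def __init__(self, dim: int = 256, num_heads: int = 8,
                 num_paths: int = 4, confidence_threshold: float = 0.9):
        super().__init__()
        self.paths = nn.ModuleList([
            nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
            for _ in range(num_paths)
        ])
        self.confidence_head = nn.Linear(dim, 1)
        self.confidence_threshold = confidence_threshold

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, dim)
        for path in self.paths:
            hidden = path(hidden)
            # Estimate confidence from the mean-pooled hidden state.
            confidence = torch.sigmoid(self.confidence_head(hidden.mean(dim=1))).mean()
            # Early stopping: skip the remaining paths once confident enough.
            if confidence.item() > self.confidence_threshold:
                break
        return hidden


# Usage sketch
block = ParallelReasoningBlock(dim=256)
x = torch.randn(2, 16, 256)   # (batch, seq_len, dim)
out = block(x)                # same shape as the input
```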
### Model Sizes

- **Tiny**: ~50M parameters (256 dim, 8 layers, 8 heads)
- **Small**: ~100M parameters (512 dim, 12 layers, 8 heads)
- **Medium**: ~200M parameters (768 dim, 16 layers, 12 heads)
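If you need these variants programmatically, a plain dictionary capturing the list above might look like this (field names are illustrative, not the actual config schema):

```python
# Illustrative mapping of the size variants listed above; field names are hypothetical.
MODEL_SIZES = {
    "tiny":   {"hidden_dim": 256, "num_layers": 8,  "num_heads": 8},   # ~50M params
    "small":  {"hidden_dim": 512, "num_layers": 12, "num_heads": 8},   # ~100M params
    "medium": {"hidden_dim": 768, "num_layers": 16, "num_heads": 12},  # ~200M params
}
```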
## Usage

### Installation

```bash
pip install torch transformers
```
### Loading the Model

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("likhonsheikh/compact-ai-model")
tokenizer = AutoTokenizer.from_pretrained("likhonsheikh/compact-ai-model")
```
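If a GPU is available, you can optionally load the weights in half precision and move them to the device (standard `transformers` usage, not specific to this checkpoint):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
model = AutoModelForCausalLM.from_pretrained(
    "likhonsheikh/compact-ai-model",
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)
```

If you do this, move the tokenized inputs to the same device (`inputs = inputs.to(device)`) before calling `generate`.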
### Inference

```python
inputs = tokenizer("Hello, how are you?", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
```
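The standard `generate` sampling parameters from `transformers` apply here as well; the values below are illustrative defaults, not settings tuned for this model:

```python
outputs = model.generate(
    **inputs,
    max_new_tokens=128,   # cap on newly generated tokens
    do_sample=True,       # sample instead of greedy decoding
    temperature=0.7,      # lower = more deterministic
    top_p=0.9,            # nucleus sampling
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```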
### API Usage
The model can also be served through a FastAPI-based API server:

```bash
uvicorn compact_ai_model.api.main:app --host 0.0.0.0 --port 8000
```
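The API routes are not documented here; purely as an illustration, a client call against a hypothetical `/generate` endpoint could look like the following (adjust the path and payload to the actual routes defined in `compact_ai_model/api/main.py`):

```python
import requests

# Hypothetical endpoint and payload shape -- check the actual FastAPI routes.
response = requests.post(
    "http://localhost:8000/generate",
    json={"prompt": "Hello, how are you?", "max_length": 50},
    timeout=30,
)
print(response.json())
```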
## Training

### Requirements
- Python 3.8+
- PyTorch 2.0+
- CUDA-compatible GPU (recommended)
### Training Script

```bash
python compact_ai_model/training/train.py
```
## Performance

### Benchmarks
- MMLU: Coming soon
- ARC: Coming soon
- HellaSwag: Coming soon
### Efficiency
- Memory-efficient attention mechanisms
- Adaptive compression for long contexts
- Early stopping to reduce computation
## Limitations
- Currently uses a simple tokenizer for demonstration
- Model is not yet fine-tuned on large datasets
- API is still in development
## Citation

```bibtex
@misc{compact-ai-model,
  title={Compact AI Model with Interleaved Thinking},
  author={Likhon Sheikh},
  year={2024},
  publisher={Hugging Face},
  url={https://huggingface.co/likhonsheikh/compact-ai-model}
}
```
## License
This model is released under the Apache 2.0 license.