# Apollo-Astralis V1 4B - Quick Start Guide
## Installation

### Option 1: Using Transformers (Recommended)

```bash
pip install transformers torch accelerate peft
```
### Option 2: Using with LoRA Adapters

If you want to load the LoRA adapters separately from the base model:

```bash
pip install transformers torch peft bitsandbytes accelerate
```
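Below is a minimal sketch of attaching the adapters with PEFT. The adapter location is a placeholder (this guide does not specify where the adapters are published), so substitute the actual repo id or local path:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model_name = "VANTA-Research/apollo-astralis-v1-4b"
adapter_path = "path/to/apollo-astralis-lora-adapters"  # hypothetical; replace with the real adapter repo or path

# Load the base model, then attach the LoRA adapters on top of it
base = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base, adapter_path)
tokenizer = AutoTokenizer.from_pretrained(base_model_name, trust_remote_code=True)
```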
## Quick Usage

### Basic Example

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load model and tokenizer
model_name = "VANTA-Research/apollo-astralis-v1-4b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

# Build a chat-formatted prompt and generate a response
messages = [
    {"role": "system", "content": "You are Apollo-Astralis V1, a warm reasoning assistant."},
    {"role": "user", "content": "Explain quantum computing in simple terms"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens, skipping the echoed prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
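For interactive use you may prefer tokens printed as they are generated rather than decoded at the end. A small sketch using Transformers' built-in `TextStreamer`, reusing `model`, `tokenizer`, and `inputs` from the example above:

```python
from transformers import TextStreamer

# Stream tokens to stdout as they are generated, skipping the prompt
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    streamer=streamer,
)
```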
### Run the Example

```bash
python example_usage.py
```
## System Requirements

- **Python**: 3.8+
- **CUDA**: 11.8+ (for GPU acceleration)
- **RAM**: 16GB minimum, 32GB recommended
- **GPU VRAM**: 8GB minimum (RTX 3060 or better); if VRAM is tight, see the quantized-load sketch after this list
- **Disk Space**: 10GB
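If 8GB of VRAM is a squeeze, loading the weights in 4-bit can substantially reduce the memory footprint, at some cost in quality. This is a sketch using `bitsandbytes` through Transformers' `BitsAndBytesConfig` (the Option 2 install command above already includes `bitsandbytes`); actual savings depend on your hardware and are not a measured figure:

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
import torch

# Quantize weights to 4-bit on load; matmuls still compute in bfloat16
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "VANTA-Research/apollo-astralis-v1-4b",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)
```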
## Next Steps

- Read the full README.md for detailed documentation
- Check example_usage.py for more examples
- Visit the model card on Hugging Face
## Support
- Issues: GitHub Issues
- Email: [email protected]