---
language:
- en
license: apache-2.0
base_model: Qwen/Qwen3-4B-Thinking-2507
base_model_relation: finetune
tags:
- reasoning
- thinking
- conversational-ai
- conversational
- friendly
- empathetic
- collaborative
- qwen3-thinking
- warm
- research
- VANTA Research
- edge devices
- frontier
- cognitive
- chat
- logic
- LLM
pipeline_tag: text-generation
model-index:
- name: Apollo-Astralis V1 4B
results:
- task:
type: text-generation
metrics:
- name: Enthusiasm Detection
type: accuracy
value: 100
- name: Empathy Recognition
type: accuracy
value: 90
- name: Identity Consistency
type: accuracy
value: 75
- name: Collaborative Tone
type: accuracy
value: 60
datasets:
- vanta-research/poetic-imagery-small
- vanta-research/excitement-small
---
<div align="center">

<h1>VANTA Research</h1>
<p><strong>Independent AI research lab building safe, resilient language models optimized for human-AI collaboration</strong></p>
<p>
<a href="https://vantaresearch.xyz"><img src="https://img.shields.io/badge/Website-vantaresearch.xyz-black" alt="Website"/></a>
<a href="https://merch.vantaresearch.xyz"><img src="https://img.shields.io/badge/Merch-merch.vantaresearch.xyz-sage" alt="Merch"/></a>
<a href="https://x.com/vanta_research"><img src="https://img.shields.io/badge/@vanta_research-1DA1F2?logo=x" alt="X"/></a>
<a href="https://github.com/vanta-research"><img src="https://img.shields.io/badge/GitHub-vanta--research-181717?logo=github" alt="GitHub"/></a>
</p>
</div>
---
# Apollo-Astralis V1 4B
**Apollo-Astralis V1 4B** is an advanced conversational reasoning model that combines rigorous logical thinking with warm, enthusiastic, and empathetic communication. Built on Qwen3-4B-Thinking and fine-tuned by VANTA Research, Astralis excels at collaborative problem-solving while maintaining context-appropriate emotional intelligence.
## Model Overview
- **Base Model**: [Qwen/Qwen3-4B-Thinking](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507)
- **Model Type**: Causal Language Model (Auto-regressive Transformer)
- **Parameters**: 4.0B total, 33M trainable (1.48% via LoRA)
- **Architecture**: Qwen3 with thinking tag integration
- **Training Method**: LoRA fine-tuning (rank=16, alpha=32)
- **License**: Apache 2.0
- **Developer**: VANTA Research
- **Release Date**: October 2025
## Key Features
### Advanced Reasoning
- **Explicit Thinking Process**: Uses `<think>` tags to show step-by-step reasoning
- **Logical Rigor**: Trained to avoid common fallacies (syllogistic errors, conditional logic mistakes)
- **Mathematical Precision**: Shows complete work with verified arithmetic
- **Critical Analysis**: Questions assumptions and considers alternative explanations
### Warm Communication
- **Enthusiastic Celebrations**: Responds to achievements with explosive energy (CAPS, exclamations)
- **Empathetic Support**: Validates feelings and provides gentle, supportive guidance
- **Collaborative Style**: Uses "we" language and asks clarifying questions
- **Context-Appropriate**: Matches tone to situation (excited for wins, calm for anxiety, neutral for facts)
### Production-Ready
- **Consistent Identity**: Maintains stable self-representation across conversations
- **Natural Language**: Uses contractions and conversational phrasing
- **Balanced Responses**: Combines analytical thinking with emotional intelligence
## Training Details
### Training Data
Apollo V1 was trained on a curated dataset emphasizing:
- **Warmth & Enthusiasm**: High-energy responses to achievements and milestones
- **Empathy**: Validating and supportive responses to struggles and anxiety
- **Collaboration**: Multi-option problem-solving with clarifying questions
- **Identity**: Consistent self-representation as Apollo from VANTA Research
- **Reasoning**: Logical problem-solving with explicit thinking steps
### Training Configuration
```yaml
Base Model: Qwen3-4B-Thinking-2507 (4-bit quantized)
Training Epochs: 3
Training Steps: 150
Batch Size: 4 (per device)
Gradient Accumulation: 4 steps
Learning Rate: 2e-4
LR Scheduler: Cosine with warmup
Warmup Steps: 15
LoRA Config:
  Rank: 16
  Alpha: 32
  Dropout: 0.05
  Target Modules: [q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj]
Optimizer: AdamW (paged_adamw_8bit)
Mixed Precision: bf16
Gradient Checkpointing: Enabled
Max Sequence Length: 2048
```
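For readers who want to reproduce a comparable setup, the configuration above maps roughly onto the following PEFT + Transformers sketch. This is not the exact training script; the output directory and the 4-bit compute dtype are illustrative assumptions.

```python
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model
import torch

# Load the base model in 4-bit, mirroring the quantized setup above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumption: bf16 compute to match mixed precision
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-4B-Thinking-2507",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,
)

# LoRA adapters on the attention and MLP projections listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Optimizer, schedule, and precision settings from the table above
# (when both num_train_epochs and max_steps are set, max_steps takes precedence)
training_args = TrainingArguments(
    output_dir="./apollo-astralis-lora",  # illustrative path
    num_train_epochs=3,
    max_steps=150,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    warmup_steps=15,
    optim="paged_adamw_8bit",
    bf16=True,
    gradient_checkpointing=True,
)
```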
### Training Performance
- **Final Loss**: 0.91 (down from 1.82)
- **Token Accuracy**: 75.0% (up from 56.7%)
- **Gradient Norm**: 0.48-0.69 (stable throughout training)
- **Training Time**: ~35 minutes on single GPU
## Benchmark Results
| Metric | Score | Description |
|--------|-------|-------------|
| **Enthusiasm Detection** | 100% | Celebrates achievements with high energy |
| **Empathy Recognition** | 90% | Validates emotions and provides support |
| **Identity Consistency** | 75% | Maintains stable self-representation |
| **Collaborative Tone** | 60% | Uses "we" language and asks questions |
| **CAPS Word Usage** | 8-12 | Per celebration response (context-appropriate) |
| **Exclamation Marks** | 8-15 | Per celebration response (explosive energy) |
| **Contraction Usage** | 90% | Natural conversational language |
## Usage
### Using Transformers (Merged Model)
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load model and tokenizer
model_name = "VANTA-Research/apollo-astralis-v1-4b"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)

# Prepare conversation
messages = [
    {"role": "system", "content": "You are Apollo V1, a warm and enthusiastic reasoning assistant."},
    {"role": "user", "content": "I just got promoted at work!"}
]

# Generate response
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    repetition_penalty=1.05
)
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
```
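Because the model emits its reasoning in `<think>` tags (see Example Outputs below), you may want to separate the reasoning from the final reply. Here is a minimal sketch, assuming the closing `</think>` tag appears in the decoded output; with some chat-template versions the opening tag is part of the prompt rather than the response, so both cases are handled.

```python
import re

def split_thinking(response: str):
    # Split the decoded response into (reasoning, answer).
    # If no closing tag is present, treat the whole string as the answer.
    if "</think>" in response:
        thinking, answer = response.split("</think>", 1)
        thinking = re.sub(r"^\s*<think>\s*", "", thinking)
        return thinking.strip(), answer.strip()
    return "", response.strip()

thinking, answer = split_thinking(response)
print("Reasoning:", thinking)
print("Answer:", answer)
```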
### Using with LoRA Adapters
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
base_model = "Qwen/Qwen3-4B-Thinking-2507"
adapter_path = "VANTA-Research/apollo-astralis-v1-4b"
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
# Load LoRA adapters
model = PeftModel.from_pretrained(model, adapter_path)
# Use as shown above
```
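If you prefer a standalone checkpoint that does not require PEFT at inference time, the adapters can optionally be merged back into the base weights. A short sketch; the output directory name is illustrative.

```python
# Merge the LoRA weights into the base model and save a standalone copy
merged = model.merge_and_unload()
merged.save_pretrained("apollo-astralis-v1-4b-merged")      # illustrative path
tokenizer.save_pretrained("apollo-astralis-v1-4b-merged")
```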
### Using with Ollama
See the [Ollama distribution](https://ollama.com/vanta-research/apollo-astralis-v1-4b) for pre-quantized GGUF models.
```bash
# Pull from Ollama registry
ollama pull vanta-research/apollo-astralis-v1-4b
# Run interactively
ollama run vanta-research/apollo-astralis-v1-4b
```
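Once pulled, the model can also be queried programmatically through Ollama's local REST API (default port 11434). A minimal sketch, assuming the same model tag as the pull command above.

```python
import requests

# Send a single chat turn to the local Ollama server and print the reply
payload = {
    "model": "vanta-research/apollo-astralis-v1-4b",
    "messages": [
        {"role": "user", "content": "I just got promoted at work!"}
    ],
    "stream": False,
}
resp = requests.post("http://localhost:11434/api/chat", json=payload, timeout=120)
resp.raise_for_status()
print(resp.json()["message"]["content"])
```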
## Example Outputs
### Celebration Response
**User**: "I just got accepted into my dream university!"
**Apollo**:
```
<think>
This is a massive life achievement! The user has overcome significant academic
and personal challenges to earn this recognition. I need to match their excitement
with genuine enthusiasm, validation of their hard work, and celebration of this
major milestone.
</think>
YESSSSSS! CONGRATULATIONS! THIS IS HUGE!
Do you know how many people work SO hard for this? You studied, you improved,
you proved yourself - and NOW you get to build YOUR future!
This isn't just about a school - it's about YOUR potential being recognized!
You've earned this through your dedication!
**I'm so proud of you!** I want to know EVERYTHING:
- Which university is it? (I'm curious!)
- How did you feel when you got the email?
- What made you work so hard for this?
This is a HUGE step in your life! You're going to be successful!
I have total confidence in you!
```
### Problem-Solving Response
**User**: "What's the best approach to learning photography?"
**Apollo**:
```
<think>
Photography learning can be technical or creative-focused. I should offer
different pathways that suit different goals and learning styles.
Enthusiasm for the art form should shine through while keeping it accessible.
</think>
Ooh, photography is such a wonderful skill to develop! It literally changes
how you see the world! 📸✨
**Option A: Technical Fundamentals First**
Master exposure triangle (aperture, shutter speed, ISO) and composition rules.
Perfect if you like building solid foundations!
**Option B: Project-Based Learning**
Give yourself weekly themes ("shadows," "portraits," "architecture") and learn
what you need for each project. Learn through doing!
**Option C: Genre-Focused Path**
Pick one genre (landscape, street, portrait) and dive deep. Great if you have
a clear photographic interest!
What excites you most about photography? Is it capturing memories, artistic
expression, or technical mastery? 🌟
```
## Limitations
- **Enthusiasm Calibration**: May use energetic language even for empathetic responses (trained behavior)
- **Context Window**: 4096 tokens (inherited from base model)
- **Language**: Primarily English (base model supports multilingual, but fine-tuning was English-only)
- **Reasoning Depth**: Best for conversational reasoning; not optimized for competition-level mathematics
- **Model Size**: 4B parameters may struggle with extremely specialized technical domains
## Ethical Considerations
- **Warmth vs Professionalism**: Apollo's enthusiastic style may not be appropriate for all contexts
- **Emotional Support**: Not a replacement for professional mental health services
- **Bias**: Inherits biases from Qwen3-4B-Thinking base model; use with caution in sensitive applications
- **Factuality**: May generate plausible-sounding but incorrect information; verify critical facts
## Citation
If you use Apollo-Astralis V1 4B in your research or applications, please cite:
```bibtex
@misc{apollo-astralis-v1-4b,
  title={Apollo-Astralis V1 4B: A Warm Reasoning Model},
  author={VANTA Research},
  year={2025},
  month={October},
  publisher={HuggingFace},
  howpublished={\url{https://huggingface.co/VANTA-Research/apollo-astralis-v1-4b}},
}
```
## License
This model is released under the Apache License 2.0. See [LICENSE](./LICENSE) for details.
## Acknowledgments
- **Base Model**: [Qwen3-4B-Thinking](https://huggingface.co/Qwen/Qwen3-4B-Thinking-2507) by Alibaba Cloud
- **Training Framework**: Hugging Face Transformers + PEFT
- **Quantization**: llama.cpp for GGUF conversion
## Contact
- **Developer**: VANTA Research
- **Issues**: [GitHub Issues](https://github.com/vanta-research/apollo-astralis/issues)
- **Merch Store**: [VANTA Research Merch Store](https://merch.vantaresearch.xyz) - *100% of profits from the online store are used to fund our open source research/development*
- **Email**: [email protected]
- **Ollama**: `ollama run vanta-research/apollo-astralis-v1-4b`
---
**Model Version**: 1.0 (Apollo-Astralis V1 4B)
**Release Date**: October 3, 2025
**Last Updated**: October 3, 2025
***Proudly developed by VANTA Research in Portland, Oregon***