Tyler Williams committed
Commit 3c4ec38 · Parent: 1b84476

chore: rename GGUF to apollo_astralis_8b.gguf and update docs

MERGE_GUIDE.md CHANGED
@@ -144,7 +144,7 @@ python convert_hf_to_gguf.py ../apollo-astralis-8b-merged/ \
 
 # Quantize to Q4_K_M (recommended)
 ./llama-quantize apollo-astralis-8b-f16.gguf \
-  apollo-astralis-8b-Q4_K_M.gguf Q4_K_M
+  apollo_astralis_8b.gguf Q4_K_M
 ```
 
 ### Step 3: Deploy with Ollama
@@ -152,7 +152,7 @@ python convert_hf_to_gguf.py ../apollo-astralis-8b-merged/ \
 ```bash
 # Create Modelfile
 cat > Modelfile <<EOF
-from ./apollo-astralis-8b-Q4_K_M.gguf
+from ./apollo_astralis_8b.gguf
 
 template """<|im_start|>system
 {{ .System }}<|im_end|>
@@ -263,7 +263,7 @@ Available quantization formats:
 
 ```bash
 # Quantize to different formats
-./llama-quantize apollo-astralis-8b-f16.gguf apollo-astralis-8b-Q4_K_M.gguf Q4_K_M
+./llama-quantize apollo-astralis-8b-f16.gguf apollo_astralis_8b.gguf Q4_K_M
 ./llama-quantize apollo-astralis-8b-f16.gguf apollo-astralis-8b-Q5_K_M.gguf Q5_K_M
 ./llama-quantize apollo-astralis-8b-f16.gguf apollo-astralis-8b-Q8_0.gguf Q8_0
 ```
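After this rename, the Ollama deploy step in MERGE_GUIDE.md would finish roughly as below; a minimal sketch assuming the Modelfile written above sits in the working directory and reusing the `apollo-astralis-8b` model name that appears in the README:

```bash
# Build the Ollama model from the Modelfile that references the renamed GGUF
ollama create apollo-astralis-8b -f Modelfile

# Smoke-test the deployment with a short prompt
ollama run apollo-astralis-8b "If x + 7 = 15, what is x?"
```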
README.md CHANGED
@@ -124,7 +124,7 @@ ollama run apollo-astralis-8b
 
 **Modelfile (Conservative - 256 tokens)**:
 ```dockerfile
-from ./apollo_astralis_8b_v5_conservative.Q4_K_M.gguf
+from ./apollo_astralis_8b.gguf
 
 template """<|im_start|>system
 {{ .System }}<|im_end|>
@@ -146,7 +146,7 @@ system """You are Apollo, a collaborative AI assistant specializing in reasoning
 
 **Modelfile (Unlimited - for complex reasoning)**:
 ```dockerfile
-from ./apollo_astralis_8b_v5_conservative.Q4_K_M.gguf
+from ./apollo_astralis_8b.gguf
 
 template """<|im_start|>system
 {{ .System }}<|im_end|>
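The diff only touches the `from` lines, so how the Conservative and Unlimited Modelfiles actually differ is not shown here. If the distinction is the 256-token cap named in the heading, it would plausibly be expressed through Ollama's `num_predict` parameter; a hypothetical sketch, not taken from the repo:

```bash
# Hypothetical: the two variants presumably differ only in the generation cap.
# In Ollama Modelfile syntax that cap is num_predict (-1 = no limit).
cat > Modelfile-conservative <<'EOF'
from ./apollo_astralis_8b.gguf
PARAMETER num_predict 256
EOF

cat > Modelfile-unlimited <<'EOF'
from ./apollo_astralis_8b.gguf
PARAMETER num_predict -1
EOF
```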
UPLOAD_CHECKLIST.md CHANGED
@@ -107,13 +107,13 @@ If you want to include the quantized GGUF directly:
 
 ```bash
 # Copy GGUF to package directory
-cp /home/vanta/proving-ground/apollo_astralis_8b_v5_conservative.Q4_K_M.gguf .
+cp /home/vanta/proving-ground/apollo_astralis_8b.gguf .
 
 # Track with Git LFS
 git lfs track "*.gguf"
 
 # Add and push
-git add apollo_astralis_8b_v5_conservative.Q4_K_M.gguf
+git add apollo_astralis_8b.gguf
 git commit -m "Add Q4_K_M quantized GGUF model (4.7GB)"
 git push
 ```
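Before the `git push` above, it may be worth confirming that the renamed file is genuinely handled by Git LFS rather than committed as a plain blob; a small sketch using standard git-lfs commands, not part of the original checklist:

```bash
# Confirm the renamed GGUF is stored as an LFS pointer, not a regular blob
git lfs ls-files | grep apollo_astralis_8b.gguf

# Sanity-check the on-disk size (~4.7GB per the commit message)
ls -lh apollo_astralis_8b.gguf
```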
USAGE_GUIDE.md CHANGED
@@ -22,11 +22,11 @@ The simplest way to use Apollo Astralis:
 curl -fsSL https://ollama.ai/install.sh | sh
 
 # Download the GGUF model file
-wget https://huggingface.co/vanta-research/apollo-astralis-8b/resolve/main/apollo_astralis_8b_v5_conservative.Q4_K_M.gguf
+wget https://huggingface.co/vanta-research/apollo-astralis-8b/resolve/main/apollo_astralis_8b.gguf
 
 # Create Modelfile
 cat > Modelfile-apollo-astralis <<EOF
-from ./apollo_astralis_8b_v5_conservative.Q4_K_M.gguf
+from ./apollo_astralis_8b.gguf
 
 template """<|im_start|>system
 {{ .System }}<|im_end|>
@@ -98,10 +98,10 @@ cd llama.cpp
 make
 
 # Download model
-wget https://huggingface.co/vanta-research/apollo-astralis-8b/resolve/main/apollo_astralis_8b_v5_conservative.Q4_K_M.gguf
+wget https://huggingface.co/vanta-research/apollo-astralis-8b/resolve/main/apollo_astralis_8b.gguf
 
 # Run inference
-./main -m apollo_astralis_8b_v5_conservative.Q4_K_M.gguf \
+./main -m apollo_astralis_8b.gguf \
   --prompt "Solve this problem: If x + 7 = 15, what is x?" \
   --temp 0.7 \
   --top-p 0.9 \
@@ -117,7 +117,7 @@ Best for most tasks with balanced response length:
 
 ```dockerfile
 # Modelfile
-from ./apollo_astralis_8b_v5_conservative.Q4_K_M.gguf
+from ./apollo_astralis_8b.gguf
 
 template """<|im_start|>system
 {{ .System }}<|im_end|>
@@ -143,7 +143,7 @@ For multi-step reasoning requiring extended chain-of-thought:
 
 ```dockerfile
 # Modelfile-unlimited
-from ./apollo_astralis_8b_v5_conservative.Q4_K_M.gguf
+from ./apollo_astralis_8b.gguf
 
 template """<|im_start|>system
 {{ .System }}<|im_end|>
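Since the rename invalidates the old `resolve/main/...` download URL that USAGE_GUIDE.md previously pointed at, it may be worth verifying the new link before updating any scripts; a minimal sketch using only plain curl and wget:

```bash
# Follow the Hub's CDN redirects and print each hop's status;
# the final line should read HTTP/2 200 if the renamed file resolves
curl -sIL https://huggingface.co/vanta-research/apollo-astralis-8b/resolve/main/apollo_astralis_8b.gguf | grep HTTP

# Then download and confirm the size matches the expected ~4.7GB
wget https://huggingface.co/vanta-research/apollo-astralis-8b/resolve/main/apollo_astralis_8b.gguf
ls -lh apollo_astralis_8b.gguf
```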
apollo_astralis_8b_v5_conservative.Q4_K_M.gguf → apollo_astralis_8b.gguf RENAMED
File without changes