Update README.md
---
base_model:
- meta-llama/Llama-3.1-8B-Instruct
language:
- en
tags:
- llama-cpp
- gguf
- llama-3.1
- instruction-tuned
license: llama3.1
datasets:
- OpenAssistant/oasst1
- databricks/databricks-dolly-15k
- Open-Orca/OpenOrca
- mlabonne/open-perfectblend
- tatsu-lab/alpaca
---

# utkmst/chimera-beta-test2-lora-merged-Q4_K_M-GGUF

## Model Description

This is a quantized GGUF version of my fine-tuned model [`utkmst/chimera-beta-test2-lora-merged`](https://huggingface.co/utkmst/chimera-beta-test2-lora-merged), created by LoRA fine-tuning meta-llama/Llama-3.1-8B-Instruct and merging the resulting adapter into the base model. The GGUF conversion was performed with llama.cpp using Q4_K_M quantization for efficient inference.
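For reference, a typical llama.cpp recipe for this kind of conversion looks like the following. The paths, filenames, and script location are illustrative assumptions, not the exact commands used for this model.

```shell
# Illustrative sketch of a llama.cpp GGUF conversion (assumed paths and names).
# 1. Convert the merged HF checkpoint to a full-precision GGUF file:
python convert_hf_to_gguf.py ./chimera-beta-test2-lora-merged \
    --outfile chimera-f16.gguf --outtype f16

# 2. Quantize it to Q4_K_M:
./llama-quantize chimera-f16.gguf chimera-q4_k_m.gguf Q4_K_M
```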

## Architecture

- **Base Model**: meta-llama/Llama-3.1-8B-Instruct
- **Size**: 8B parameters
- **Type**: Decoder-only transformer
- **Quantization**: Q4_K_M (a 4-bit GGUF k-quant format, medium variant)
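To see what 4-bit quantization does to the weights, here is a toy block-wise scheme in plain Python. The real Q4_K_M format is more elaborate (256-element super-blocks with quantized scales and mins), so treat this purely as an illustration of the idea.

```python
# Toy block-wise 4-bit quantization: each block stores one float scale and
# offset plus 4-bit integer codes in [0, 15]. Not the actual Q4_K_M layout.

def quantize_block(values):
    """Map a block of floats to 4-bit codes plus a scale and minimum."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 15 or 1.0          # 15 = max 4-bit code
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo

def dequantize_block(codes, scale, lo):
    """Reconstruct approximate floats from codes."""
    return [c * scale + lo for c in codes]

block = [0.1, -0.4, 0.25, 0.9, -0.75, 0.0, 0.5, -0.2]
codes, scale, lo = quantize_block(block)
restored = dequantize_block(codes, scale, lo)
# Reconstruction error is bounded by half a quantization step:
max_err = max(abs(a - b) for a, b in zip(block, restored))
```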

## Training Details

- **Training Method**: LoRA fine-tuning followed by adapter merging
- **LoRA Configuration**:
  - Rank: 8
  - Alpha: 16
  - Trainable modules: attention layers and feed-forward networks
- **Training Hyperparameters**:
  - Learning rate: 2e-4
  - Batch size: 2
  - Training epochs: 1
  - Optimizer: AdamW with a constant learning-rate scheduler
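The merge step above can be sketched numerically: LoRA adds a low-rank update, scaled by alpha/rank, onto the frozen base weight. The matrices below are tiny made-up examples, not the model's actual weights.

```python
# Toy LoRA merge: W' = W + (alpha / r) * (B @ A), shown with plain lists.
# With r=8 and alpha=16 (as in this model) the scaling factor is 2.0;
# a rank-1 update on a 2x2 matrix is used here purely for illustration.

def matmul(B, A):
    rows, inner, cols = len(B), len(A), len(A[0])
    return [[sum(B[i][k] * A[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

alpha, r = 16, 8
scaling = alpha / r                       # 2.0

W = [[1.0, 0.0], [0.0, 1.0]]              # frozen base weight (toy)
B = [[0.5], [0.25]]                       # low-rank factors (toy, rank 1)
A = [[0.2, 0.1]]

delta = matmul(B, A)                      # rank-1 update
W_merged = [[W[i][j] + scaling * delta[i][j] for j in range(2)]
            for i in range(2)]
```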

## Dataset

The model was trained on a curated mixture of high-quality instruction datasets:

- OpenAssistant/oasst1: Human-generated conversations with AI assistants
- databricks/databricks-dolly-15k: Instruction-following examples
- Open-Orca/OpenOrca: Augmented training data based on GPT-4 generations
- mlabonne/open-perfectblend: A carefully balanced blend of open-source instruction data
- tatsu-lab/alpaca: Self-instruct data based on demonstrations

## Intended Use

This model is designed for:

- General-purpose assistant capabilities
- Question answering and knowledge retrieval
- Creative content generation
- Instructional guidance

Thanks to quantization, it is well suited to deployment in resource-constrained environments while maintaining good response quality.

## Limitations

- Reduced numerical precision from quantization may hurt performance on mathematical or other precision-sensitive reasoning tasks
- Inherits base-model limitations, including potential hallucinations and factual inaccuracies
- Limited context window compared to larger models
- Knowledge cutoff inherited from the base Llama-3.1 model
- May exhibit biases present in the training data

## Use with llama.cpp

Install llama.cpp through brew (works on macOS and Linux):
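Once installed, llama-cli can pull and run the model straight from this repo. The exact `.gguf` filename passed to `--hf-file` below is an assumption; check the repository's file list for the actual name.

```shell
brew install llama.cpp

# Run an interactive chat straight from the Hub (the --hf-file name is assumed):
llama-cli --hf-repo utkmst/chimera-beta-test2-lora-merged-Q4_K_M-GGUF \
  --hf-file chimera-beta-test2-lora-merged-q4_k_m.gguf \
  -p "You are a helpful assistant." -cnv
```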