utkmst committed
Commit acbad5e · verified · 1 Parent(s): efa43fc

Update README.md

Files changed (1):
  1. README.md +50 -6
README.md CHANGED

@@ -1,10 +1,13 @@
---
base_model:
- - utkmst/chimera-beta-test2-lora-merged
- meta-llama/Llama-3.1-8B-Instruct
tags:
- llama-cpp
- - gguf-my-repo
license: llama3.1
datasets:
- OpenAssistant/oasst1
@@ -12,13 +15,54 @@ datasets:
- Open-Orca/OpenOrca
- mlabonne/open-perfectblend
- tatsu-lab/alpaca
- language:
- - en
---

# utkmst/chimera-beta-test2-lora-merged-Q4_K_M-GGUF
- This model was converted to GGUF format from [`utkmst/chimera-beta-test2-lora-merged`](https://huggingface.co/utkmst/chimera-beta-test2-lora-merged) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
- Refer to the [original model card](https://huggingface.co/utkmst/chimera-beta-test2-lora-merged) for more details on the model.
 
---
base_model:
- meta-llama/Llama-3.1-8B-Instruct
language:
- en
tags:
- llama-cpp
- gguf
- llama-3.1
- instruction-tuned
license: llama3.1
datasets:
- OpenAssistant/oasst1
- databricks/databricks-dolly-15k
- Open-Orca/OpenOrca
- mlabonne/open-perfectblend
- tatsu-lab/alpaca
---

# utkmst/chimera-beta-test2-lora-merged-Q4_K_M-GGUF

## Model Description
This is a quantized GGUF version of my fine-tuned model [`utkmst/chimera-beta-test2-lora-merged`](https://huggingface.co/utkmst/chimera-beta-test2-lora-merged), created by LoRA fine-tuning meta-llama/Llama-3.1-8B-Instruct and merging the resulting adapter into the base model. The GGUF conversion was performed with llama.cpp using Q4_K_M quantization for efficient inference.
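
The adapter merge described above is typically done with PEFT; a minimal sketch, assuming a PEFT/transformers workflow and a hypothetical adapter repo name (only the merged model is published):

```python
# Sketch of the LoRA-merge step described above, assuming a PEFT workflow.
# The adapter id "utkmst/chimera-beta-test2-lora" is hypothetical; only the
# merged model utkmst/chimera-beta-test2-lora-merged is published.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(base, "utkmst/chimera-beta-test2-lora")
merged = model.merge_and_unload()  # fold the LoRA deltas into the base weights

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
merged.save_pretrained("chimera-beta-test2-lora-merged")
tokenizer.save_pretrained("chimera-beta-test2-lora-merged")
```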

## Architecture
- **Base Model**: meta-llama/Llama-3.1-8B-Instruct
- **Size**: 8B parameters
- **Type**: Decoder-only transformer
- **Quantization**: Q4_K_M GGUF format (a 4-bit "k-quant" with most tensors at 4 bits and some kept at 6 bits; a rough size estimate follows below)
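
As a rough sanity check on the memory savings, a back-of-envelope estimate; the average bits-per-weight for Q4_K_M is an approximate figure, not a measurement of this repo's file:

```python
# Back-of-envelope size estimate for an 8B model quantized to Q4_K_M.
# Q4_K_M averages roughly 4.85 bits/weight because some tensors stay at 6-bit;
# both numbers here are approximations, not measurements of this repo's file.
params = 8.03e9          # Llama-3.1-8B has about 8.03B parameters
bits_per_weight = 4.85   # approximate Q4_K_M average
print(f"~{params * bits_per_weight / 8 / 1e9:.1f} GB")  # prints "~4.9 GB"
```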

## Training Details
- **Training Method**: LoRA fine-tuning followed by adapter merging
- **LoRA Configuration**:
  - Rank: 8
  - Alpha: 16
  - Trainable modules: attention layers and feed-forward networks
- **Training Hyperparameters** (sketched in code below):
  - Learning rate: 2e-4
  - Batch size: 2
  - Training epochs: 1
  - Optimizer: AdamW with constant scheduler
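
These hyperparameters translate into a PEFT/transformers setup roughly as follows; a minimal sketch, assuming typical Llama target-module names and the standard `Trainer` stack, neither of which is confirmed by the original training script:

```python
# Sketch of the stated configuration: rank 8, alpha 16, lr 2e-4, batch size 2,
# 1 epoch, AdamW with a constant schedule. target_modules are typical Llama
# projection names and are an assumption, not read from the training script.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",  # attention layers
        "gate_proj", "up_proj", "down_proj",     # feed-forward networks
    ],
    task_type="CAUSAL_LM",
)

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.1-8B-Instruct")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # shows the small trainable fraction

args = TrainingArguments(
    output_dir="chimera-beta-test2-lora",
    learning_rate=2e-4,
    per_device_train_batch_size=2,
    num_train_epochs=1,
    optim="adamw_torch",
    lr_scheduler_type="constant",
)
```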

## Dataset
The model was trained on a curated mixture of high-quality instruction datasets (a loading sketch follows the list):
- OpenAssistant/oasst1: Human-generated conversations with AI assistants
- databricks/databricks-dolly-15k: Instruction-following examples
- Open-Orca/OpenOrca: Augmented training data based on GPT-4 generations
- mlabonne/open-perfectblend: A carefully balanced blend of open-source instruction data
- tatsu-lab/alpaca: Self-instructed data based on demonstrations
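
A minimal sketch of assembling such a mixture with the `datasets` library, showing two of the sources for brevity; the real run's sampling ratios and prompt template are not documented, and the field mappings below are assumptions:

```python
# Sketch: normalize two of the listed sources to a shared {"text": ...} schema
# and concatenate them. Field names match the public datasets; the prompt
# format itself is an assumption, not the one used for the actual training run.
from datasets import load_dataset, concatenate_datasets

def alpaca_to_text(row):
    # tatsu-lab/alpaca rows: instruction / input / output
    prompt = row["instruction"] + ("\n" + row["input"] if row["input"] else "")
    return {"text": prompt + "\n" + row["output"]}

alpaca = load_dataset("tatsu-lab/alpaca", split="train").map(alpaca_to_text)
dolly = load_dataset("databricks/databricks-dolly-15k", split="train").map(
    # databricks-dolly-15k rows: instruction / context / response / category
    lambda r: {"text": r["instruction"] + "\n" + r["response"]}
)

mixture = concatenate_datasets([
    alpaca.select_columns(["text"]),
    dolly.select_columns(["text"]),
    # oasst1, OpenOrca, and open-perfectblend would be mapped the same way
]).shuffle(seed=42)
print(mixture)
```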

## Intended Use
This model is designed for:
- General-purpose assistant capabilities
- Question answering and knowledge retrieval
- Creative content generation
- Instructional guidance

Thanks to quantization, it is well suited to deployment in resource-constrained environments while maintaining good response quality.

## Limitations
- Reduced numerical precision due to quantization may impact performance on certain mathematical or precise reasoning tasks
- Base model limitations, including potential hallucinations and factual inaccuracies
- Limited context window compared to larger models
- Knowledge cutoff inherited from the base Llama-3.1 model
- May exhibit biases present in the training data

## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux).
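
Beyond the brew-installed CLI, one alternative is the llama-cpp-python bindings; a minimal sketch, assuming the `.gguf` filename inside the repo follows GGUF-my-repo's usual naming pattern:

```python
# Sketch: run the Q4_K_M file with llama-cpp-python (pip install llama-cpp-python).
# The filename glob is an assumption based on GGUF-my-repo's naming convention.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="utkmst/chimera-beta-test2-lora-merged-Q4_K_M-GGUF",
    filename="*q4_k_m.gguf",  # glob; resolves to the single quant file in the repo
    n_ctx=4096,               # context size to allocate
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain LoRA fine-tuning in two sentences."}],
    max_tokens=128,
)
print(reply["choices"][0]["message"]["content"])
```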