pszemraj committed on
Commit 13698fb
1 Parent(s): 290af7d

Update README.md

Files changed (1):
  1. README.md +7 -11
README.md CHANGED

```diff
@@ -3,6 +3,8 @@ license: apache-2.0
 base_model: BEE-spoke-data/NanoLlama-GQA-L10-A32_KV8-v12-minipile
 tags:
 - generated_from_trainer
+- smol_llama
+- llama2
 metrics:
 - accuracy
 inference:
@@ -12,7 +14,7 @@ inference:
 temperature: 0.8
 repetition_penalty: 1.15
 no_repeat_ngram_size: 4
-eta_cutoff: 0.0006
+eta_cutoff: 0.0008
 renormalize_logits: true
 widget:
 - text: My name is El Microondas the Wise and
@@ -57,24 +59,18 @@ pipeline_tag: text-generation
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
 
-# NanoLlama-GQA-L10-A32_KV8-v12-minipile-knowledge-inoc-concat-v1-2048-vN
+# BEE-spoke-data/NanoLlama-GQA-L10-A32_KV8-v13-KI
+
+> note that training still WIP
 
 This model is a fine-tuned version of [BEE-spoke-data/NanoLlama-GQA-L10-A32_KV8-v12-minipile](https://huggingface.co/BEE-spoke-data/NanoLlama-GQA-L10-A32_KV8-v12-minipile) on the None dataset.
 It achieves the following results on the evaluation set:
 - Loss: 2.5937
 - Accuracy: 0.4948
 
-## Model description
-
-More information needed
-
-## Intended uses & limitations
-
-More information needed
-
 ## Training and evaluation data
 
-More information needed
+KI dataset
 
 ## Training procedure
 
```
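The `inference` block edited in this diff maps directly onto `transformers` generation parameters. As a minimal sketch of the post-commit settings (the `do_sample` flag is an assumption, since `temperature` and `eta_cutoff` only take effect when sampling is enabled):

```python
# Generation settings from the updated model card's `inference` block,
# expressed as kwargs for transformers' GenerationConfig / model.generate().
# `do_sample=True` is an assumption: temperature and eta_cutoff only
# apply when sampling is enabled.
generation_kwargs = {
    "do_sample": True,
    "temperature": 0.8,
    "repetition_penalty": 1.15,
    "no_repeat_ngram_size": 4,
    "eta_cutoff": 0.0008,  # raised from 0.0006 in this commit
    "renormalize_logits": True,
}
```

`eta_cutoff` enables eta sampling, which adaptively truncates low-probability tokens, so raising it from 0.0006 to 0.0008 prunes slightly more of the distribution's tail at each step.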
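Assuming the reported eval loss is the usual mean cross-entropy in nats (as logged by the HF `Trainer` for causal LMs), it implies an evaluation perplexity of roughly 13.4:

```python
import math

# Eval loss from the model card above (assumed mean cross-entropy, nats).
eval_loss = 2.5937

# For causal-LM cross-entropy, perplexity is exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity = {perplexity:.2f}")  # ~13.38
```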