Commit e918688 (verified, 0 parents), committed by SmerkyG and BlinkDL

Duplicate from BlinkDL/rwkv-7-world


Co-authored-by: BlinkDL <[email protected]>

.gitattributes ADDED
@@ -0,0 +1,35 @@
+ *.7z filter=lfs diff=lfs merge=lfs -text
+ *.arrow filter=lfs diff=lfs merge=lfs -text
+ *.bin filter=lfs diff=lfs merge=lfs -text
+ *.bz2 filter=lfs diff=lfs merge=lfs -text
+ *.ckpt filter=lfs diff=lfs merge=lfs -text
+ *.ftz filter=lfs diff=lfs merge=lfs -text
+ *.gz filter=lfs diff=lfs merge=lfs -text
+ *.h5 filter=lfs diff=lfs merge=lfs -text
+ *.joblib filter=lfs diff=lfs merge=lfs -text
+ *.lfs.* filter=lfs diff=lfs merge=lfs -text
+ *.mlmodel filter=lfs diff=lfs merge=lfs -text
+ *.model filter=lfs diff=lfs merge=lfs -text
+ *.msgpack filter=lfs diff=lfs merge=lfs -text
+ *.npy filter=lfs diff=lfs merge=lfs -text
+ *.npz filter=lfs diff=lfs merge=lfs -text
+ *.onnx filter=lfs diff=lfs merge=lfs -text
+ *.ot filter=lfs diff=lfs merge=lfs -text
+ *.parquet filter=lfs diff=lfs merge=lfs -text
+ *.pb filter=lfs diff=lfs merge=lfs -text
+ *.pickle filter=lfs diff=lfs merge=lfs -text
+ *.pkl filter=lfs diff=lfs merge=lfs -text
+ *.pt filter=lfs diff=lfs merge=lfs -text
+ *.pth filter=lfs diff=lfs merge=lfs -text
+ *.rar filter=lfs diff=lfs merge=lfs -text
+ *.safetensors filter=lfs diff=lfs merge=lfs -text
+ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
+ *.tar.* filter=lfs diff=lfs merge=lfs -text
+ *.tar filter=lfs diff=lfs merge=lfs -text
+ *.tflite filter=lfs diff=lfs merge=lfs -text
+ *.tgz filter=lfs diff=lfs merge=lfs -text
+ *.wasm filter=lfs diff=lfs merge=lfs -text
+ *.xz filter=lfs diff=lfs merge=lfs -text
+ *.zip filter=lfs diff=lfs merge=lfs -text
+ *.zst filter=lfs diff=lfs merge=lfs -text
+ *tfevents* filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,104 @@
+ ---
+ language:
+ - en
+ - zh
+ - fr
+ - es
+ - de
+ - pt
+ - ru
+ - it
+ - ja
+ - ko
+ - vi
+ - ar
+ tags:
+ - pytorch
+ - text-generation
+ - causal-lm
+ - rwkv
+ license: apache-2.0
+ datasets:
+ - HuggingFaceFW/fineweb-edu
+ - mlfoundations/dclm-baseline-1.0
+ - cerebras/SlimPajama-627B
+ - EleutherAI/pile
+ - bigcode/starcoderdata
+ - oscar-corpus/OSCAR-2301
+ ---
+
+ # RWKV-7 World
+
+ Use rwkv pip package 0.8.28+ for RWKV-7 inference: https://pypi.org/project/rwkv/
+
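As a quick illustration (not part of the committed README), a minimal inference sketch with the rwkv pip package 0.8.28+ might look like the following; the checkpoint path matches a file in this repo, but the strategy string, sampling settings, and prompt are placeholders to adapt:

```python
import os

# RWKV-7 support in the rwkv pip package is switched on via this env var (as in the
# ChatRWKV demos); set it before importing rwkv.model.
os.environ["RWKV_V7_ON"] = "1"
os.environ["RWKV_JIT_ON"] = "1"
os.environ["RWKV_CUDA_ON"] = "0"  # "1" compiles the CUDA kernel for faster GPU inference

from rwkv.model import RWKV
from rwkv.utils import PIPELINE, PIPELINE_ARGS

# Path to one of the checkpoints in this repo, given without the ".pth" suffix.
model = RWKV(model="RWKV-x070-World-0.1B-v2.8-20241210-ctx4096", strategy="cpu fp32")
pipeline = PIPELINE(model, "rwkv_vocab_v20230424")  # tokenizer used by the World models

args = PIPELINE_ARGS(temperature=1.0, top_p=0.3, alpha_frequency=0.3, alpha_presence=0.3)
prompt = "User: What is the capital of France?\n\nAssistant:"  # no space after the final ':'
print(pipeline.generate(prompt, token_count=200, args=args))
```

The same pipeline object can be reused across prompts; only the model path needs to change to try the larger checkpoints listed below.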
+ Evals and more information: https://rwkv.com/
+
+ For developers: https://github.com/BlinkDL/RWKV-LM
+
+ Chat demo: https://github.com/BlinkDL/ChatRWKV/blob/main/API_DEMO_CHAT.py
+
+ MMLU eval: https://github.com/BlinkDL/RWKV-LM/blob/main/RWKV-v7/rwkv_mmlu_eval.py
+
+ rwkv7-v3-2.9b 54.56% (rwkv6-v2.1-3.1b 32.38%)
+
+ rwkv7-v3-1.5b 44.84% (rwkv6-v2.1-1.6b 26.34%)
+
+ 0.1B = L12 D768 // 0.4B = L24 D1024 // 1.5B = L24 D2048 // ~3B = L32 D2560 // ~7B = L32 D4096
+
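The L/D figures above are the number of layers and the model width. As a rough sanity check (an approximation, not an official formula from this repo), parameter counts can be estimated from them, assuming about 12*D^2 weights per RWKV block plus separate embedding and output-head matrices over the 65536-token World vocabulary:

```python
# Back-of-the-envelope parameter estimate from the "L<layers> D<width>" configs above.
# Assumes ~12*D^2 weights per RWKV block plus untied embedding and head over a
# 65536-token vocabulary; treat the results as order-of-magnitude, not exact counts.
VOCAB = 65536
configs = {"0.1B": (12, 768), "0.4B": (24, 1024), "1.5B": (24, 2048), "~3B": (32, 2560), "~7B": (32, 4096)}
for name, (n_layer, d_model) in configs.items():
    approx = n_layer * 12 * d_model ** 2 + 2 * VOCAB * d_model
    print(f"{name}: ~{approx / 1e9:.2f}B parameters")
```

These estimates line up with the checkpoint file sizes listed below, since fp16 weights take about 2 bytes per parameter.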
+ ## Model Description
+
+ RWKV-7 trained on 100+ world languages (80% English, 10% multilingual, 10% code).
+
+ World-v3 = 3.1T tokens
+
+ World-v2.9 = subsampled 2T tokens
+
+ World-v2.8 = subsampled 1T tokens
+
+ Recommended fine-tuning format (use \n for newlines):
+ ```
+ User: xxxxxxxxxxxxxxx
+
+ Assistant: xxxxxxxxxxxxxxx
+ xxxxxxxxxxxxxxx
+ xxxxxxxxxxxxxxx
+
+ User: xxxxxxxxxxxxxxx
+ xxxxxxxxxxxxxxx
+
+ Assistant: xxxxxxxxxxxxxxx
+ xxxxxxxxxxxxxxx
+ xxxxxxxxxxxxxxx
+ xxxxxxxxxxxxxxx
+ ```
+
+ A good chat prompt (it is best to replace any \n\n inside the xxx text with \n, so that extra \n\n never appears in the response):
+ ```
+ User: hi
+
+ Assistant: Hi. I am your assistant and I will provide expert full response in full details. Please feel free to ask any question and I will always answer it.
+
+ User: xxx
+
+ Assistant:
+ ```
+ QA prompt (again, replace any \n\n inside the xxx text with \n, so that extra \n\n never appears in the response):
+ ```
+ Question: xxx
+
+ Answer:
+ ```
+ and
+ ```
+ Instruction: xxx
+
+ Input: xxx
+
+ Response:
+ ```
+
+ !!! There must not be any space after your final ":", or you will upset the tokenizer and get non-English responses !!!
+
+ !!! There must not be any space after your final ":", or you will upset the tokenizer and get non-English responses !!!
+
+ !!! There must not be any space after your final ":", or you will upset the tokenizer and get non-English responses !!!
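To make the formatting rules above concrete, here is a small helper sketch (illustrative names, not part of this repo) that builds a chat prompt while enforcing both constraints: no \n\n inside message text, and no trailing space after the final "Assistant:".

```python
def build_chat_prompt(history, user_msg):
    """Build an RWKV World chat prompt from (user, assistant) turns plus a new user message.

    Illustrative helper: it collapses any \n\n inside message text to \n, so the only
    blank lines are the turn separators, and it ends with "Assistant:" with no trailing
    space, per the warnings above.
    """
    def clean(text):
        text = text.strip()
        while "\n\n" in text:
            text = text.replace("\n\n", "\n")
        return text

    parts = []
    for user, assistant in history:
        parts.append(f"User: {clean(user)}\n\nAssistant: {clean(assistant)}")
    parts.append(f"User: {clean(user_msg)}\n\nAssistant:")  # no space after the final ':'
    return "\n\n".join(parts)

# Example: the first turn can be the fixed greeting above, priming a helpful assistant persona.
history = [("hi", "Hi. I am your assistant and I will provide expert full response in full details. "
                  "Please feel free to ask any question and I will always answer it.")]
print(build_chat_prompt(history, "How do trees make oxygen?"))
```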
RWKV-x070-World-0.1B-v2.8-20241210-ctx4096.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:60c98129b9529963bff2c164b8ab4bd17c19332ae06dc2dcae32aa3a3739295a
+ size 382195690
RWKV-x070-World-0.4B-v2.9-20250107-ctx4096.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:c08cf602b33e59a5717b8f0cafaf7f04c50c1c67166e477aa2f47c5ca180da4a
+ size 901794466
RWKV-x070-World-1.5B-v3-20250127-ctx4096.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:23e13ff62ac6f80b28910ffc230324939e80ddb00b98f892a5951c0da071e700
+ size 3055062345
RWKV-x070-World-2.9B-v3-20250211-ctx4096.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5d7901b91ef1646115db03d03d171370bca454061ad426a1256fda9e1c428cf8
+ size 5895796446
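The four files above are Git LFS pointer files; the actual weights are fetched by LFS. A small sketch (file name and oid/size taken from the pointer above, everything else illustrative) for checking a downloaded checkpoint against its recorded oid and size:

```python
import hashlib
import os

def verify_lfs_checkpoint(path, expected_sha256, expected_size):
    """Compare a downloaded .pth file against the oid/size from its LFS pointer."""
    if os.path.getsize(path) != expected_size:
        return False
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):  # hash in 1 MiB chunks
            h.update(chunk)
    return h.hexdigest() == expected_sha256

ok = verify_lfs_checkpoint(
    "RWKV-x070-World-2.9B-v3-20250211-ctx4096.pth",
    "5d7901b91ef1646115db03d03d171370bca454061ad426a1256fda9e1c428cf8",
    5895796446,
)
print("checksum OK" if ok else "mismatch: re-download the file")
```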