SupYumm committed on
Commit 4c45e80 · verified · 1 Parent(s): b578bf6

Update README.md

Files changed (1)
  1. README.md +53 -3
README.md CHANGED
@@ -1,3 +1,53 @@
- ---
- license: apache-2.0
- ---
+ ---
+ license: apache-2.0
+ ---
+ ** Model Details
+
+ * Model type:
+ RWKV7 SigLIP2 is an open-source chatbot trained using the RWKV7 architecture and the SigLIP2 encoder.
+
+ * Model date: February 2025
+
+ * Paper or resources for more information: https://github.com/JL-er/WorldRWKV
+
+ * Where to send questions or comments about the model: https://github.com/JL-er/WorldRWKV/issues
+
+ ** Training datasets:
+ * Pretrain: LLaVA 595k
+ * Fine-tune: LLaVA 665k
+
+
+
+ ** Evaluation dataset
+
+ We have so far tested RWKV7 SigLIP2 on four benchmarks proposed for instruction-following LMMs. Results on more benchmarks will be released soon.
+
+ * Benchmarks
+ | **Encoder** | **LLM** | **VQAv2** | **TextVQA** | **GQA** | **ScienceQA** |
+ |:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|:--------------:|
+ | [**SigLIP2**](https://huggingface.co/google/siglip2-base-patch16-384) | RWKV7-3B | 78.30 | 51.09 | 60.75 | 70.93 |
+
+
+
+ * Inference
+
+ ```python
+ from infer.worldmodel import Worldinfer
+ from PIL import Image
+
+
+ llm_path = 'WorldRWKV/RWKV7-3B-siglip2/rwkv-0'  # local model path
+ encoder_path = 'google/siglip2-base-patch16-384'  # SigLIP2 encoder checkpoint
+ encoder_type = 'siglip'
+
+ model = Worldinfer(model_path=llm_path, encoder_type=encoder_type, encoder_path=encoder_path)
+
+ img_path = './docs/03-Confusing-Pictures.jpg'
+ image = Image.open(img_path).convert('RGB')  # load the image and force 3-channel RGB
+
+ text = '\x16User: What is unusual about this image?\x17Assistant:'  # question wrapped in \x16 ... \x17
+
+ result = model.generate(text, image)
+
+ print(result)
+ ```
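
The prompt in the inference example wraps the question in the control characters `\x16` and `\x17`. Below is a minimal sketch of a helper that builds such a prompt for an arbitrary question. `build_prompt` is a hypothetical name and not part of WorldRWKV, and the format is only assumed from the single example string in this README.

```python
def build_prompt(question: str) -> str:
    # Hypothetical helper, not part of the WorldRWKV code base.
    # Assumption: the prompt format is inferred from the one example string
    # in the README: \x16 before "User:", \x17 before "Assistant:".
    return f"\x16User: {question}\x17Assistant:"


if __name__ == "__main__":
    prompt = build_prompt("What is unusual about this image?")
    # repr() makes the non-printing control characters visible
    print(repr(prompt))
```

The returned string can be passed as `text` to `model.generate(text, image)` in the inference example.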