mgonzs13 nielsr HF Staff committed on
Commit a6abb54 · verified · 1 Parent(s): 6e1abd5

Improve model card: Add paper and benchmark GitHub links (#1)


- Improve model card: Add paper and benchmark GitHub links (5c3a0db3db866bddd6626c2d9c3009a840db160f)


Co-authored-by: Niels Rogge <[email protected]>

Files changed (1)
  1. README.md +26 -204
README.md CHANGED
@@ -1,8 +1,14 @@
  ---
- task_categories:
- - visual-question-answering
+ base_model:
+ - remyxai/SpaceOm
+ datasets:
+ - remyxai/SpaceThinker
  language:
  - en
+ library_name: llama.cpp
+ license: apache-2.0
+ pipeline_tag: image-text-to-text
+ paper: 2506.07966
  tags:
  - gguf
  - remyx
@@ -16,14 +22,9 @@ tags:
  - vision-language
  - distance-estimation
  - quantitative-spatial-reasoning
+ task_categories:
+ - visual-question-answering
  pretty_name: SpaceOm-GGUF
- license: apache-2.0
- datasets:
- - remyxai/SpaceThinker
- base_model:
- - remyxai/SpaceOm
- pipeline_tag: image-text-to-text
- library_name: llama.cpp
  model-index:
  - name: SpaceOm
  results:
@@ -35,230 +36,51 @@ model-index:
  type: benchmark
  metrics:
  - type: success_rate
- name: Overall Success Rate
  value: 0.5419
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.4877
- - name: Object Localization / 3D Localization
- success_rate: 0.6337
- - name: Object Properties / Size
- success_rate: 0.5043
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: BLINK
- type: benchmark
- metrics:
- - type: success_rate
  name: Overall Success Rate
- value: 0.599
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.7972
- - name: Counting / Object Counting
- success_rate: 0.6167
- - name: Depth and Distance / Relative
- success_rate: 0.621
- - name: Object Localization / 2D Localization
- success_rate: 0.582
- - name: Point and Object Tracking / Point Correspondence
- success_rate: 0.3779
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: MMIU
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.599
  name: Overall Success Rate
- value: 0.388
- results_by_subcategory:
- - name: Camera and Image Transformation / 2D Transformation
- success_rate: 0.255
- - name: Camera and Image Transformation / 3D Camera Pose
- success_rate: 0.4
- - name: Camera and Image Transformation / Camera Motion
- success_rate: 0.4436
- - name: Depth and Distance / Absolute
- success_rate: 0.265
- - name: Object Localization / 3D Localization
- success_rate: 0.3625
- - name: Point and Object Tracking / 3D Tracking
- success_rate: 0.725
- - name: Point and Object Tracking / Point Correspondence
- success_rate: 0.265
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: MMVP
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.388
  name: Overall Success Rate
- value: 0.5833
- results_by_subcategory:
- - name: Others / Miscellaneous
- success_rate: 0.5833
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: QSpatialBench-Plus
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.5833
  name: Overall Success Rate
- value: 0.4455
- results_by_subcategory:
- - name: Depth and Distance / Absolute
- success_rate: 0.4455
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: QSpatialBench-ScanNet
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.4455
  name: Overall Success Rate
- value: 0.4876
- results_by_subcategory:
- - name: Depth and Distance / Absolute
- success_rate: 0.464
- - name: Object Properties / Size
- success_rate: 0.5111
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: RealWorldQA
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.4876
  name: Overall Success Rate
- value: 0.6105
- results_by_subcategory:
- - name: Others / Miscellaneous
- success_rate: 0.6105
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: SpatialSense
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.6105
  name: Overall Success Rate
- value: 0.7043
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.7043
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: VGBench
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.7043
  name: Overall Success Rate
- value: 0.3504
- results_by_subcategory:
- - name: Camera and Image Transformation / 2D Transformation
- success_rate: 0.2568
- - name: Camera and Image Transformation / 3D Camera Pose
- success_rate: 0.4371
- - name: Depth and Distance / Absolute
- success_rate: 0.3339
- - name: Depth and Distance / Relative
- success_rate: 0.32
- - name: Object Localization / 3D Localization
- success_rate: 0.4283
- - name: Point and Object Tracking / 3D Tracking
- success_rate: 0.3264
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: VSI-Bench_8
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.3504
  name: Overall Success Rate
- value: 0.2558
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.3998
- - name: Counting / Object Counting
- success_rate: 0.229
- - name: Depth and Distance / Absolute
- success_rate: 0.1562
- - name: Depth and Distance / Relative
- success_rate: 0.3648
- - name: Object Properties / Size
- success_rate: 0.1645
- - name: Others / Miscellaneous
- success_rate: 0.2204
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: VSR-ZeroShot
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.2558
  name: Overall Success Rate
- value: 0.8085
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.8085
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: cvbench
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.8085
  name: Overall Success Rate
- value: 0.6839
- results_by_subcategory:
- - name: Counting / Object Counting
- success_rate: 0.6294
- - name: Depth and Distance / Relative
- success_rate: 0.7408
- - name: Object Localization / 3D Localization
- success_rate: 0.6815
- - task:
- type: visual-question-answering
- name: Spatial Reasoning
- dataset:
- name: spatialbench
- type: benchmark
- metrics:
  - type: success_rate
+ value: 0.6839
  name: Overall Success Rate
+ - type: success_rate
  value: 0.6553
- results_by_subcategory:
- - name: 3D Positional Relation / Orientation
- success_rate: 0.6765
- - name: Counting / Object Counting
- success_rate: 0.75
- - name: Object Properties / Existence
- success_rate: 0.925
- - name: Object Properties / Reachability
- success_rate: 0.55
- - name: Object Properties / Size
- success_rate: 0.375
-
+ name: Overall Success Rate
  ---

  # SpaceOm

+ This model is evaluated in the paper [SpaCE-10: A Comprehensive Benchmark for Multimodal Large Language Models in Compositional Spatial Intelligence](https://huggingface.co/papers/2506.07966).
+ The code for the SpaCE-10 benchmark is available at: https://github.com/Cuzyoung/SpaCE-10.
+
  **Model creator:** [remyxai](https://huggingface.co/remyxai)<br>
  **Original model**: [SpaceOm](https://huggingface.co/remyxai/SpaceOm)<br>
  **GGUF quantization:** `llama.cpp` commit [2baf07727f921d9a4a1b63a2eff941e95d0488ed](https://github.com/ggerganov/llama.cpp/tree/2baf07727f921d9a4a1b63a2eff941e95d0488ed)<br>