Update README.md
README.md CHANGED
@@ -318,11 +318,12 @@ And comparing to **SpaceThinker**:
 [](https://colab.research.google.com/drive/1YpIOjJFZ-Zaomg77ImeQHSqYBLB8T1Ce?usp=sharing)

-This table compares `SpaceOm` evaluated using GPT scoring against
+This table compares `SpaceOm` and `SpaceQwen` evaluated using GPT scoring against leading open-source models on the SpaCE-10 benchmark. Top scores in each category are **bolded**.

 | Model                  | EQ    | SQ    | SA    | OO    | OS    | EP    | FR    | SP    | Source    |
 |------------------------|-------|-------|-------|-------|-------|-------|-------|-------|-----------|
-| **SpaceOm**
+| **SpaceOm**            | 32.47 | 24.81 | **47.63** | 50.00 | 32.52 | 9.12 | **37.04** | 25.00 | GPT Eval |
+| SpaceQwen              | 31.19 | 25.89 | 41.61 | **51.98** | **35.18** | 10.97 | 36.54 | 22.50 | GPT Eval |
 | Qwen2.5-VL-7B-Instruct | 32.70 | 31.00 | 41.30 | 32.10 | 27.60 | 15.40 | 26.30 | 27.50 | Table     |
 | LLaVA-OneVision-7B     | **37.40** | 36.20 | 42.90 | 44.20 | 27.10 | 11.20 | **45.60** | 27.20 | Table     |
 | VILA1.5-7B             | 30.20 | **38.60** | 39.90 | 44.10 | 16.50 | **35.10** | 30.10 | **37.60** | Table     |
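For context on the "GPT Eval" rows: the README does not show the evaluation script here, but a GPT-scoring loop for a spatial-VQA benchmark typically asks a GPT judge to rate each prediction against the reference answer and then averages per category. Below is a minimal, hypothetical sketch of that idea; the judge model (`gpt-4o`), the prompt wording, the 0-100 scale, and the sample layout are all assumptions for illustration, not the project's actual code.

```python
# Hypothetical sketch of GPT-based scoring for a spatial-VQA benchmark.
# NOT the project's evaluation code: judge model, prompt wording, data layout,
# and the 0-100 scale are assumptions made for illustration only.
from collections import defaultdict
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def gpt_score(question: str, reference: str, prediction: str) -> float:
    """Ask a GPT judge to rate how well `prediction` matches `reference` (0-100)."""
    prompt = (
        "Rate how well the candidate answer matches the reference answer "
        "for the question below. Reply with a single integer from 0 to 100.\n\n"
        f"Question: {question}\nReference: {reference}\nCandidate: {prediction}"
    )
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed judge model
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return float(resp.choices[0].message.content.strip())

def category_averages(samples):
    """samples: iterable of dicts with keys: category, question, answer, prediction."""
    totals, counts = defaultdict(float), defaultdict(int)
    for s in samples:
        totals[s["category"]] += gpt_score(s["question"], s["answer"], s["prediction"])
        counts[s["category"]] += 1
    return {cat: totals[cat] / counts[cat] for cat in totals}
```

Averaging the judge's ratings per category would yield one number per column (EQ, SQ, SA, ...), which is how the "GPT Eval" rows above are laid out.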