Commit 82628d6 by xzxuan · verified · 1 Parent(s): 9540784

Update README data
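
The rows changed by this commit appear to be the same measurements re-expressed: success rates stored as fractions in [0, 1] become percentages with two decimals. The numbers also suggest that each "avg" row is the plain mean of the rows in its table (with "train setting" counted in "object avg") and that "Avg results" is the mean over all 15 rows. The following is a minimal sketch of that arithmetic for the GRPO-openvlaoft column only. The helper name `to_percent` and the use of Python string formatting are assumptions made for illustration, not the author's actual tooling.

```python
from statistics import fmean

# GRPO-openvlaoft success rates, copied from the removed (fractional) rows of this diff.
vision = [0.9140625, 0.91015625, 0.7734375, 0.89453125, 0.7421875]
semantic = [0.94140625, 0.8046875, 0.7421875, 0.6796875,
            0.3515625, 0.3046875, 0.1875, 0.1171875]  # first entry is "train setting"
position = [0.40234375, 0.45703125]


def to_percent(rate: float) -> str:
    """Format a success rate in [0, 1] as a percentage with two decimals (hypothetical helper)."""
    return f"{rate * 100:.2f}"


# Per-table averages: plain means over the rows of each table.
vision_avg = fmean(vision)
semantic_avg = fmean(semantic)
position_avg = fmean(position)

# Overall average: mean over all 15 rows, not the mean of the three table averages.
overall_avg = fmean(vision + semantic + position)

print(to_percent(position_avg))  # 42.97
print(to_percent(overall_avg))   # 61.48
```

Under this formatting, exact ties round to the even digit, which would be consistent with 0.65625 appearing as 65.62 rather than 65.63 in the updated tables.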

Files changed (1): README.md +33 -28
README.md CHANGED
@@ -66,40 +66,45 @@ This openvla-oft model is trained on ``Haozhan72/Openvla-oft-SFT-libero10-trajal
 
  ## Full OOD Evaluation and Results
  ### Overall OOD Eval Results
- Note: rl4vla refers to the paper VLA-RL-Study: What Can RL Bring to VLA Generalization? An Empirical Study.
- | Description | rl4vla | __GRPO-openvlaoft__ | PPO-openvlaoft | PPO-openvla | GRPO-openvla |
- |---------------|-----------|-----------------|----------------|-------------|---------------|
- | Avg results | 0.7608 | 0.61484375 | 0.6453125 | **0.822135417** | 0.7546875 |
+ Note: rl4vla refers to the paper [VLA-RL-Study: What Can RL Bring to VLA Generalization? An Empirical Study](https://arxiv.org/abs/2505.19789).
+
+ | Description | rl4vla | __GRPO-openvlaoft__ | PPO-openvlaoft | PPO-openvla | GRPO-openvla |
+ |-------------|--------|---------------------|----------------|-------------|--------------|
+ | Avg results | 76.08 | 61.48 | 64.53 | **82.21** | 75.47 |
+
  ### OOD Eval on Vision
 
- | Description | rl4vla | __GRPO-openvlaoft__ | PPO-openvlaoft | PPO-openvla | GRPO-openvla |
- |---------------|-----------|-----------------|----------------|-------------|---------------|
- | vision avg | 0.7656 | 0.846875 | 0.80546875 | **0.8203125** | 0.746875 |
- | unseen table | 0.844 | 0.9140625 | 0.9453125 | **0.95703125** | 0.8984375 |
- | dynamic texture (weak) | 0.833 | **0.91015625** | 0.82421875 | 0.85546875 | 0.7890625 |
- | dynamic texture (strong) | 0.63 | **0.7734375** | 0.625 | 0.72265625 | 0.65625 |
- | dynamic noise (weak) | 0.854 | 0.89453125 | **0.8984375** | 0.87109375 | 0.796875|
- | dynamic noise (strong) | 0.667 | **0.7421875** | 0.734375 | 0.6953125 | 0.59375|
+ | Description | rl4vla | __GRPO-openvlaoft__ | PPO-openvlaoft | PPO-openvla | GRPO-openvla |
+ |-------------|--------|---------------------|----------------|-------------|--------------|
+ | vision avg | 76.56 | 84.69 | 80.55 | **82.03** | 74.69 |
+ | unseen table | 84.40 | 91.41 | 94.53 | **95.70** | 89.84 |
+ | dynamic texture (weak) | 83.30 | **91.02** | 82.42 | 85.55 | 78.91 |
+ | dynamic texture (strong) | 63.00 | **77.34** | 62.50 | 72.27 | 65.62 |
+ | dynamic noise (weak) | 85.40 | 89.45 | **89.84** | 87.11 | 79.69 |
+ | dynamic noise (strong) | 66.70 | **74.22** | 73.44 | 69.53 | 59.38 |
 
  ### OOD Eval on Semantic
- | Description | rl4vla | __GRPO-openvlaoft__ | PPO-openvlaoft | PPO-openvla | GRPO-openvla |
- |---------------|-----------|-----------------|----------------|-------------|---------------|
- | object avg | 0.754 | 0.516113281 | 0.56640625 | **0.805664063** | 0.744140625|
- | train setting | 0.938 | 0.94140625 | 0.91796875 | **0.9609375** | 0.84375|
- | unseen objects | 0.714 | 0.8046875 | 0.77734375 | **0.81640625** | 0.765625|
- | unseen receptacles | 0.75 | 0.7421875 | 0.78125 | **0.8125** | 0.734375|
- | unseen instructions | 0.891 | 0.6796875 | 0.68359375 | **0.9453125** | 0.890625|
- | multi-object (both seen) | 0.75 | 0.3515625 | 0.4296875 | **0.84375** | 0.7578125|
- | multi-object (both unseen) | 0.578 | 0.3046875 | 0.38671875 | **0.62890625** | 0.578125|
- | distractive receptacle | 0.812 | 0.1875 | 0.31640625 | **0.828125** | 0.78125|
- | multi-receptacle (both unseen) | 0.599 | 0.1171875 | 0.23828125 | **0.609375** | 0.6015625|
+
+ | Description | rl4vla | __GRPO-openvlaoft__ | PPO-openvlaoft | PPO-openvla | GRPO-openvla |
+ |-------------|--------|---------------------|----------------|-------------|--------------|
+ | object avg | 75.40 | 51.61 | 56.64 | **80.57** | 74.41 |
+ | train setting | 93.80 | 94.14 | 91.80 | **96.09** | 84.38 |
+ | unseen objects | 71.40 | 80.47 | 77.73 | **81.64** | 76.56 |
+ | unseen receptacles | 75.00 | 74.22 | 78.12 | **81.25** | 73.44 |
+ | unseen instructions | 89.10 | 67.97 | 68.36 | **94.53** | 89.06 |
+ | multi-object (both seen) | 75.00 | 35.16 | 42.97 | **84.38** | 75.78 |
+ | multi-object (both unseen) | 57.80 | 30.47 | 38.67 | **62.89** | 57.81 |
+ | distractive receptacle | 81.20 | 18.75 | 31.64 | **82.81** | 78.12 |
+ | multi-receptacle (both unseen) | 59.90 | 11.72 | 23.83 | **60.94** | 60.16 |
 
  ### OOD Eval on Position
- | Description | rl4vla | __GRPO-openvlaoft__ | PPO-openvlaoft | PPO-openvla | GRPO-openvla |
- |---------------|-----------|-----------------|----------------|-------------|---------------|
- | position avg | 0.776 | 0.4296875 | 0.560546875 | **0.892578125** | 0.81640625|
- | unseen position (object & receptacle) | 0.807 | 0.40234375 | 0.50390625 | **0.86328125** | 0.75|
- | mid-episode object reposition | 0.745 | 0.45703125 | 0.6171875 | **0.921875** | 0.8828125|
+
+ | Description | rl4vla | __GRPO-openvlaoft__ | PPO-openvlaoft | PPO-openvla | GRPO-openvla |
+ |-------------|--------|---------------------|----------------|-------------|--------------|
+ | position avg | 77.60 | 42.97 | 56.05 | **89.26** | 81.64 |
+ | unseen position (object & receptacle) | 80.70 | 40.23 | 50.39 | **86.33** | 75.00 |
+ | mid-episode object reposition | 74.50 | 45.70 | 61.72 | **92.19** | 88.28 |
+
 
  ## How to Use
  Please integrate the provided model with the [RLinf](https://github.com/RLinf/RLinf) codebase. To do so, modify the following parameters in the configuration file ``examples/embodiment/config/maniskill_grpo_openvlaoft.yaml``: