Update README.md
README.md CHANGED
@@ -129,7 +129,7 @@ See [Cosmos-Reason1](https://github.com/nvidia-cosmos/cosmos-reason1) for detail
 
 # Evaluation
 
-Please see our [technical paper](https://arxiv.org/pdf/2503.15558) for detailed evaluations on physical common sense and embodied reasoning. Part of the evaluation datasets are released under [Cosmos-Reason1-Benchmark](https://huggingface.co/datasets/nvidia/Cosmos-Reason1-Benchmark).
+Please see our [technical paper](https://arxiv.org/pdf/2503.15558) for detailed evaluations on physical common sense and embodied reasoning. Part of the evaluation datasets are released under [Cosmos-Reason1-Benchmark](https://huggingface.co/datasets/nvidia/Cosmos-Reason1-Benchmark). The embodied reasoning datasets and benchmarks focus on the following areas: robotics (RoboVQA, BridgeDataV2, Agibot, RoboFail), ego-centric human demonstration (HoloAssist), and Autonomous Vehicle (AV) driving video data. The AV dataset is collected and annotated by NVIDIA.
 
 All datasets go through the data annotation process described in the technical paper to prepare training and evaluation data and annotations.
 
 **Data Collection Method**:
@@ -159,6 +159,7 @@ Modality: Video (mp4) and Text
 
 ## Dataset Quantification
 We release the embodied reasoning data and benchmarks. Each data sample is a pair of video and text. The text annotations include understanding and reasoning annotations described in the Cosmos-Reason1 paper. Each video may have multiple text annotations. The quantity of the video and text pairs is described in the table below.
+**The AV data is currently unavailable and will be uploaded soon!**
 
 | | [RoboVQA](https://robovqa.github.io/) | AV | [BridgeDataV2](https://rail-berkeley.github.io/bridgedata/) | [Agibot](https://github.com/OpenDriveLab/AgiBot-World) | [HoloAssist](https://holoassist.github.io/) | [RoboFail](https://robot-reflect.github.io/) | Total Storage Size |
 |--------------------|---------------------------------------------|----------|------------------------------------------------------|------------------------------------------------|------------------------------------------------|------------------------------------------------|--------------------|
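
For readers who want to inspect the released benchmark locally, a minimal sketch using the Hugging Face Hub client follows. It assumes only that `nvidia/Cosmos-Reason1-Benchmark` is a standard Hub dataset repo; the internal folder layout and annotation file formats are not specified in this README, so the `allow_patterns` filter below is an assumption, not a documented interface.

```python
# Minimal sketch: fetch Cosmos-Reason1-Benchmark files for local inspection.
# Requires the huggingface_hub client (`pip install huggingface_hub`).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="nvidia/Cosmos-Reason1-Benchmark",
    repo_type="dataset",  # dataset repo, not a model repo
    # Assumption: annotations ship as JSON/Markdown; drop this filter
    # to also download the (much larger) mp4 video files.
    allow_patterns=["*.json", "*.md"],
)
print("Downloaded to:", local_dir)
```

Starting from the annotation files keeps the first download small; the video files for each source (RoboVQA, BridgeDataV2, Agibot, HoloAssist, RoboFail) can be fetched afterwards once the layout is known.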