nielsr (HF Staff) committed
Commit 46b7254 · verified · 1 parent: fa40fb0

Improve model card: Add robotics pipeline tag and canonical links


This PR enhances the model card for MolmoAct-7B-D by:

- Adding the `pipeline_tag: robotics` to the metadata, which helps users discover the model via the Hugging Face Hub's pipeline filters (e.g., at https://huggingface.co/models?pipeline_tag=robotics).
- Updating the paper link in the "Quick links" section to point to the Hugging Face Papers page ([https://huggingface.co/papers/2508.07917](https://huggingface.co/papers/2508.07917)) for consistency and improved discoverability within the Hub.
- Adding a direct link to the GitHub repository ([https://github.com/allenai/MolmoAct](https://github.com/allenai/MolmoAct)) in the "Quick links" section for easier access to the codebase.

These changes will help researchers and practitioners more easily find and understand the model's capabilities and resources.
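As a concrete illustration of the discoverability point, here is a minimal sketch of how the new tag surfaces the model through the Hub API (assuming a recent `huggingface_hub` client; exact keyword arguments may vary by version):

```python
# pip install huggingface_hub
from huggingface_hub import HfApi

api = HfApi()

# With `pipeline_tag: robotics` in the card metadata, the model shows up when
# filtering the Hub by pipeline tag, mirroring the web UI filter at
# https://huggingface.co/models?pipeline_tag=robotics
for model in api.list_models(pipeline_tag="robotics", author="allenai"):
    print(model.id)
```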

Files changed (1): README.md (+7 -5)
README.md CHANGED
```diff
@@ -1,11 +1,12 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - Qwen/Qwen2.5-7B
 - google/siglip2-so400m-patch14-384
+language:
+- en
 library_name: transformers
+license: apache-2.0
+pipeline_tag: robotics
 tags:
 - molmoact
 - molmo
@@ -21,7 +22,7 @@ tags:
 # MolmoAct 7B-D
 
 MolmoAct is a fully open-source action reasoning model for robotic manipulation developed by the Allen Institute for AI. MolmoAct is trained on a subset of OXE and MolmoAct Dataset, a dataset with 10k high-quality trajectories of a single-arm Franka robot performing 93 unique manipulation tasks in both home and tabletop environments. It has state-of-the-art performance among vision-language-action models on multiple benchmarks while being fully open-source. You can find all models in the MolmoAct family [here](https://huggingface.co/collections/allenai/molmoact-689697591a3936fba38174d7).
-**Learn more about MolmoAct** in our announcement [blog post](https://allenai.org/blog/molmoact) or the [paper](https://huggingface.co/allenai/MolmoAct-7B-D-0812/blob/main/MolmoAct_Technical_Report.pdf).
+**Learn more about MolmoAct** in our announcement [blog post](https://allenai.org/blog/molmoact) or the [paper](https://huggingface.co/papers/2508.07917).
 
 **MolmoAct 7B-D** is based on [Qwen2.5-7B](https://huggingface.co/Qwen/Qwen2.5-7B) and uses [SigLip2](https://huggingface.co/google/siglip2-so400m-patch14-384) as the vision backbone, which is initialized using Molmo's pre-training approach. It is first pre-trained on MolmoAct's [Pre-training Mixture](https://huggingface.co/datasets/allenai/MolmoAct-Pretraining-Mixture), and then mid-trained on [MolmoAct Dataset](https://huggingface.co/datasets/allenai/MolmoAct-Midtraining-Mixture). This model is intended to be used for downstream post-training.
 
@@ -30,7 +31,8 @@ This checkpoint is a **preview** of the MolmoAct release. All artifacts used in
 Quick links:
 - 📂 [All Models](https://huggingface.co/collections/allenai/molmoact-689697591a3936fba38174d7)
 - 📂 [All Data](https://huggingface.co/collections/allenai/molmoact-data-mixture-6897e583e13b6c2cf3ea2b80)
-- 📃 [Paper](https://arxiv.org/pdf/2508.07917)
+- 📄 [Paper](https://huggingface.co/papers/2508.07917)
+- 💻 [GitHub Repository](https://github.com/allenai/MolmoAct)
 - 🎥 [Blog Post](https://allenai.org/blog/molmoact)
 - 🎥 [Video](https://youtu.be/-_wag1X25OE?si=Xi_kUaJTmcQBx1f6)
 
```
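Since the card declares `library_name: transformers`, the checkpoint is meant to load through the standard auto classes. A hedged sketch follows; the auto class, dtype, and `trust_remote_code` usage are assumptions based on the pattern of other Molmo-family releases, and the model card's own usage section is authoritative:

```python
import torch
from transformers import AutoModelForCausalLM, AutoProcessor

# Repo id taken from the links above.
repo = "allenai/MolmoAct-7B-D-0812"

# Assumption: Molmo-family checkpoints ship custom modeling code,
# hence trust_remote_code=True.
processor = AutoProcessor.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # assumption: bf16 inference, as with Molmo
    device_map="auto",
)
```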