Improve model card: Add pipeline tag, paper link, and code link
This PR improves the model card for MolmoAct by:
* Adding the `pipeline_tag: robotics` metadata, which helps users discover your model on the Hub (e.g., at https://huggingface.co/models?pipeline_tag=robotics).
* Updating the main paper link to point to the official Hugging Face paper page (`https://huggingface.co/papers/2508.07917`), enhancing discoverability and consistency on the Hub.
* Adding an explicit "Code" link for easier navigation to the GitHub repository (`https://github.com/allenai/MolmoAct`) in the "Quick links" section.
Please review these changes and merge if they look good.
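The practical effect of the `pipeline_tag` addition is that the model shows up in Hub API queries filtered by task, not just in the web UI. Below is a minimal sketch of such a query using `huggingface_hub`; it assumes a recent release where `list_models` accepts a `pipeline_tag` argument, and the `author`/`limit` filters are just illustrative:

```python
# Sketch: after this PR is merged, the model should be discoverable
# via the Hub API's robotics pipeline filter. Assumes a recent
# huggingface_hub release where list_models accepts `pipeline_tag`.
from huggingface_hub import HfApi

api = HfApi()
for model in api.list_models(pipeline_tag="robotics", author="allenai", limit=10):
    print(model.id)
```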
README.md CHANGED

```diff
@@ -1,11 +1,12 @@
 ---
-license: apache-2.0
-language:
-- en
 base_model:
 - allenai/OLMo-2-1124-7B
 - openai/clip-vit-large-patch14-336
+language:
+- en
 library_name: transformers
+license: apache-2.0
+pipeline_tag: robotics
 tags:
 - molmoact
 - molmo
@@ -21,16 +22,17 @@ tags:
 # MolmoAct 7B-O
 
 MolmoAct is a fully open-source action reasoning model for robotic manipulation developed by the Allen Institute for AI. MolmoAct is trained on a subset of OXE and MolmoAct Dataset, a dataset with 10k high-quality trajectories of a single-arm Franka robot performing 93 unique manipulation tasks in both home and tabletop environments. It has state-of-the-art performance among vision-language-action models on multiple benchmarks while being fully open-source. You can find all models in the MolmoAct family [here](https://huggingface.co/collections/allenai/molmoact-689697591a3936fba38174d7).
-**Learn more about MolmoAct** in our announcement [blog post](https://allenai.org/blog/molmoact) or the [paper](https://huggingface.co/
+**Learn more about MolmoAct** in our announcement [blog post](https://allenai.org/blog/molmoact) or the [paper](https://huggingface.co/papers/2508.07917).
 
 **MolmoAct 7B-O** is based on [OLMo-2-1124-7B](https://huggingface.co/allenai/OLMo-2-1124-7B) and uses [OpenAI CLIP](https://huggingface.co/openai/clip-vit-large-patch14-336) as the vision backbone, which is initialized using Molmo's pre-training approach. It is first pre-trained on MolmoAct's [Pre-training Mixture](https://huggingface.co/datasets/allenai/MolmoAct-Pretraining-Mixture), and then mid-trained on the [MolmoAct Dataset](https://huggingface.co/datasets/allenai/MolmoAct-Midtraining-Mixture). This model is intended to be used for downstream post-training.
 
 This checkpoint is a **preview** of the MolmoAct release. All artifacts used in creating MolmoAct (data, training code, evaluations, intermediate checkpoints) will be made available at a later date, furthering our commitment to open-source AI development and reproducibility.
 
 Quick links:
+- 💻 [Code](https://github.com/allenai/MolmoAct)
 - 📂 [All Models](https://huggingface.co/collections/allenai/molmoact-689697591a3936fba38174d7)
 - 📂 [All Data](https://huggingface.co/collections/allenai/molmoact-data-mixture-6897e583e13b6c2cf3ea2b80)
-- 📄 [Paper](https://
+- 📄 [Paper](https://huggingface.co/papers/2508.07917)
 - 🔥 [Blog Post](https://allenai.org/blog/molmoact)
 - 🎥 [Video](https://youtu.be/-_wag1X25OE?si=Xi_kUaJTmcQBx1f6)
 
```
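Reviewers can also check the merged metadata programmatically. A minimal sketch using `huggingface_hub`'s `ModelCard` API; the repo id below is a hypothetical placeholder for this model's actual repository:

```python
# Sketch: verify the model card metadata after this PR is merged.
# "allenai/MolmoAct-7B-O-0812" is a placeholder repo id; substitute
# the actual repository this card belongs to.
from huggingface_hub import ModelCard

card = ModelCard.load("allenai/MolmoAct-7B-O-0812")  # hypothetical repo id
print(card.data.pipeline_tag)  # expected: "robotics"
print(card.data.license)       # expected: "apache-2.0"
print(card.data.base_model)    # expected: the two base-model ids from the YAML
```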