mustafataha5/sac-her-PandaPickAndPlace-v3-800k

mustafataha5 committed 11 days ago

Commit 612c3a0 (verified) · 1 parent: 5fdda93

Update README.md

Files changed (1)

README.md +74 -27

@@ -1,37 +1,84 @@
 ---
-library_name: stable-baselines3
-tags:
-- PandaPickAndPlace-v3
-- deep-reinforcement-learning
-- reinforcement-learning
-- stable-baselines3
-model-index:
-- name: SAC
-  results:
-  - task:
-      type: reinforcement-learning
-      name: reinforcement-learning
-    dataset:
-      name: PandaPickAndPlace-v3
-      type: PandaPickAndPlace-v3
-    metrics:
-    - type: mean_reward
-      value: -50.00 +/- 0.00
-      name: mean_reward
-      verified: false
 ---
-# **SAC** Agent playing **PandaPickAndPlace-v3**
-This is a trained model of a **SAC** agent playing **PandaPickAndPlace-v3**
-using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).
-## Usage (with Stable-baselines3)
-TODO: Add your code
 ```python
-from stable_baselines3 import ...
 from huggingface_sb3 import load_from_hub
-...
 ```

+# SAC + HER Agent for PandaPickAndPlace-v3 🦾
+This repository contains a **Soft Actor-Critic (SAC)** agent trained with **Hindsight Experience Replay (HER)** to solve the [PandaPickAndPlace-v3](https://panda-gym.readthedocs.io/en/latest/environments/pickandplace.html) environment from [Panda-Gym](https://github.com/qgallouedec/panda-gym).
+The agent was trained with [Stable-Baselines3](https://stable-baselines3.readthedocs.io/), and the resulting checkpoint is hosted on the Hugging Face Hub.
 ---
+## 📖 Model Details
+- **Algorithm:** SAC (Soft Actor-Critic) + HER
+- **Environment:** `PandaPickAndPlace-v3`
+- **Training Steps:** 800k
+- **Library:** [Stable-Baselines3](https://stable-baselines3.readthedocs.io/)
+- **Replay Buffer:** HER with `future` strategy
+- **Device:** Trained on GPU (`cuda`)
 ---
+## 📊 Evaluation Results
+The agent was evaluated for **10 episodes**:
+Mean reward = XXX.XX ± YYY.YY
+*XXX.XX and YYY.YY are placeholders; the actual evaluation results have not been filled in yet.*
+---
+## 🚀 Usage
+You can directly load this trained agent from the Hugging Face Hub and run it inside the `PandaPickAndPlace-v3` environment.
 ```python
+import gymnasium as gym
+import panda_gym  # importing registers the Panda environments with Gymnasium
+from stable_baselines3 import SAC
 from huggingface_sb3 import load_from_hub
+# Download the checkpoint from the Hugging Face Hub
+repo_id = "mustafataha5/sac-her-PandaPickAndPlace-v3-800k"
+filename = "sac_her_checkpoint_800000_steps.zip"
+model_path = load_from_hub(repo_id, filename)
+# Create the environment first: a model saved with a HER replay buffer needs it at load time
+env = gym.make("PandaPickAndPlace-v3", render_mode="human")
+model = SAC.load(model_path, env=env)
+# Run one episode
+obs, _ = env.reset()
+terminated, truncated = False, False
+while not (terminated or truncated):
+    action, _ = model.predict(obs, deterministic=True)
+    obs, reward, terminated, truncated, info = env.step(action)
+    env.render()
+env.close()
 ```
+---
+## 📦 Files inside this repo
+- `sac_her_checkpoint_800000_steps.zip` → The trained SAC + HER model checkpoint
+- `README.md` → This file
+---
+## 🙌 Acknowledgements
+- [Stable-Baselines3](https://stable-baselines3.readthedocs.io/)
+- [Panda-Gym](https://github.com/qgallouedec/panda-gym)
+- [Hugging Face Hub](https://huggingface.co/)
+---
+## 📝 Maintainer
+Mustafa Taha
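
Note on the Evaluation Results section: the new README reports a mean reward over 10 episodes but leaves the numbers as placeholders. A minimal sketch of how such a figure could be computed follows. The helper name `evaluate` is hypothetical and not part of this repo. It relies only on the `predict` / `reset` / `step` interface that the README's usage snippet already uses, and it assumes the environment reports success under `info["is_success"]`.

```python
from statistics import mean, pstdev


def evaluate(model, env, n_episodes=10):
    """Roll out `model` in `env` and summarise the episode returns.

    `model` needs a `predict(obs, deterministic=...)` method and `env` must
    follow the Gymnasium API: `reset()` returns (obs, info) and `step()`
    returns (obs, reward, terminated, truncated, info).
    """
    returns, successes = [], []
    for _ in range(n_episodes):
        obs, info = env.reset()
        terminated = truncated = False
        episode_return = 0.0
        while not (terminated or truncated):
            action, _ = model.predict(obs, deterministic=True)
            obs, reward, terminated, truncated, info = env.step(action)
            episode_return += float(reward)
        returns.append(episode_return)
        # Assumption: the final info dict carries an "is_success" flag.
        successes.append(float(bool(info.get("is_success", False))))
    # Population standard deviation, i.e. the spread of these episodes only.
    return mean(returns), pstdev(returns), mean(successes)


# Usage with the `model` and `env` objects from the README snippet
# (requires panda_gym and stable_baselines3 to be installed):
# mean_reward, std_reward, success_rate = evaluate(model, env, n_episodes=10)
# print(f"Mean reward = {mean_reward:.2f} ± {std_reward:.2f}")
```

Reporting the success rate next to the mean reward is a design choice of this sketch, not something the README does; for a goal-conditioned task it can be easier to interpret than the raw return.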