mustafataha5 committed
Commit 612c3a0 · verified · 1 Parent(s): 5fdda93

Update README.md

Files changed (1)
  1. README.md +74 -27
README.md CHANGED
@@ -1,37 +1,84 @@
  ---
- library_name: stable-baselines3
- tags:
- - PandaPickAndPlace-v3
- - deep-reinforcement-learning
- - reinforcement-learning
- - stable-baselines3
- model-index:
- - name: SAC
-   results:
-   - task:
-       type: reinforcement-learning
-       name: reinforcement-learning
-     dataset:
-       name: PandaPickAndPlace-v3
-       type: PandaPickAndPlace-v3
-     metrics:
-     - type: mean_reward
-       value: -50.00 +/- 0.00
-       name: mean_reward
-       verified: false
  ---

- # **SAC** Agent playing **PandaPickAndPlace-v3**
- This is a trained model of a **SAC** agent playing **PandaPickAndPlace-v3**
- using the [stable-baselines3 library](https://github.com/DLR-RM/stable-baselines3).

- ## Usage (with Stable-baselines3)
- TODO: Add your code

  ```python
- from stable_baselines3 import ...
  from huggingface_sb3 import load_from_hub

- ...
  ```
+ # SAC + HER Agent for PandaPickAndPlace-v3 🦾
+
+ This repository contains a **Soft Actor-Critic (SAC)** agent trained with **Hindsight Experience Replay (HER)** to solve the [PandaPickAndPlace-v3](https://panda-gym.readthedocs.io/en/latest/environments/pickandplace.html) environment from [Panda-Gym](https://github.com/qgallouedec/panda-gym).
+ The agent was trained with [Stable-Baselines3](https://stable-baselines3.readthedocs.io/) and the checkpoint was uploaded to the Hugging Face Hub.
+
  ---
+
+ ## 📖 Model Details
+ - **Algorithm:** SAC (Soft Actor-Critic) + HER
+ - **Environment:** `PandaPickAndPlace-v3`
+ - **Training Steps:** 800k
+ - **Library:** [Stable-Baselines3](https://stable-baselines3.readthedocs.io/)
+ - **Replay Buffer:** HER with the `future` goal-selection strategy (see the sketch below)
+ - **Device:** Trained on GPU (`cuda`)
+
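+ The exact training script is not included in this repo, so the following is only a minimal sketch of how a comparable SAC + HER agent can be configured in Stable-Baselines3. Hyperparameters such as `n_sampled_goal` are illustrative assumptions, not the recorded settings of this checkpoint.
+
+ ```python
+ import gymnasium as gym
+ import panda_gym  # registers the Panda environments with Gymnasium
+ from stable_baselines3 import SAC, HerReplayBuffer
+
+ env = gym.make("PandaPickAndPlace-v3")
+
+ # SAC with a HER replay buffer using the "future" goal-selection strategy,
+ # matching the configuration listed above (hyperparameters are illustrative).
+ model = SAC(
+     "MultiInputPolicy",  # goal-conditioned observations are dict spaces
+     env,
+     replay_buffer_class=HerReplayBuffer,
+     replay_buffer_kwargs=dict(
+         n_sampled_goal=4,
+         goal_selection_strategy="future",
+     ),
+     device="cuda",
+     verbose=1,
+ )
+
+ model.learn(total_timesteps=800_000)
+ model.save("sac_her_checkpoint_800000_steps")
+ ```
+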
  ---
+
+ ## 📊 Evaluation Results
+ The agent was evaluated for **10 episodes**:
+
+ Mean reward = XXX.XX ± YYY.YY
+
+ *Please replace XXX.XX and YYY.YY with your actual evaluation results.*
+
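+ A minimal sketch of how this 10-episode evaluation can be reproduced with SB3's `evaluate_policy` helper, downloading the same checkpoint used in the Usage section below:
+
+ ```python
+ import gymnasium as gym
+ import panda_gym  # registers the Panda environments with Gymnasium
+ from huggingface_sb3 import load_from_hub
+ from stable_baselines3 import SAC
+ from stable_baselines3.common.evaluation import evaluate_policy
+
+ # Download the checkpoint from the Hub (same repo/filename as in Usage)
+ model_path = load_from_hub(
+     "mustafataha5/sac-her-PandaPickAndPlace-v3-800k",
+     "sac_her_checkpoint_800000_steps.zip",
+ )
+ model = SAC.load(model_path)
+
+ env = gym.make("PandaPickAndPlace-v3")
+ mean_reward, std_reward = evaluate_policy(
+     model, env, n_eval_episodes=10, deterministic=True
+ )
+ print(f"Mean reward = {mean_reward:.2f} +/- {std_reward:.2f}")
+ ```
+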
+ ---
+
+ ## 🚀 Usage
+
+ You can load this trained agent directly from the Hugging Face Hub and run it inside the `PandaPickAndPlace-v3` environment. The example assumes `stable-baselines3`, `huggingface_sb3`, `gymnasium`, and `panda-gym` are installed.
+
  ```python
+ import gymnasium as gym
+ import panda_gym  # registers the Panda environments with Gymnasium
+ from stable_baselines3 import SAC
  from huggingface_sb3 import load_from_hub

+ # Download the checkpoint from the Hugging Face Hub
+ repo_id = "mustafataha5/sac-her-PandaPickAndPlace-v3-800k"
+ filename = "sac_her_checkpoint_800000_steps.zip"
+ model_path = load_from_hub(repo_id, filename)
+ model = SAC.load(model_path)
+
+ # Create the environment (render_mode="human" opens a viewer and
+ # renders automatically on each step)
+ env = gym.make("PandaPickAndPlace-v3", render_mode="human")
+
+ # Run one episode with the deterministic policy
+ obs, _ = env.reset()
+ terminated, truncated = False, False
+
+ while not (terminated or truncated):
+     action, _ = model.predict(obs, deterministic=True)
+     obs, reward, terminated, truncated, info = env.step(action)
+
+ env.close()
  ```
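+
+ `deterministic=True` makes `predict` return the mean of SAC's action distribution instead of sampling from it, which typically performs better at evaluation time.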
+
+ ---
+
+ ## 📦 Files inside this repo
+ - `sac_her_checkpoint_800000_steps.zip` → The trained SAC + HER model checkpoint
+ - `README.md` → This file
+
+ ---
+
+ ## 🙌 Acknowledgements
+ - [Stable-Baselines3](https://stable-baselines3.readthedocs.io/)
+ - [Panda-Gym](https://github.com/qgallouedec/panda-gym)
+ - [Hugging Face Hub](https://huggingface.co/)
+
+ ---
+
+ ## 📝 Maintainer
+ Mustafa Taha