upload via upload_folder 2025-08-14T10:21:06.581228+00:00

Files changed (7)

README.md ADDED

+---
+env_name: LunarLander-v3
+tags:
+- LunarLander-v3
+- rainbow-dqn
+- reinforcement-learning
+- custom-implementation
+- deep-q-learning
+- pytorch
+- rainbow
+- dqn
+model-index:
+- name: Rainbow-2d-LunarLander-v3
+  results:
+  - task:
+      type: reinforcement-learning
+      name: reinforcement-learning
+    dataset:
+      name: LunarLander-v3
+      type: LunarLander-v3
+    metrics:
+    - type: mean_reward
+      value: 192.34 +/- 127.62
+      name: mean_reward
+      verified: false
+---
+# **Rainbow-DQN** Agent playing **LunarLander-v3**
+This is a trained model of a **Rainbow-DQN** agent playing **LunarLander-v3**.
+## Usage
+### Create the conda env in a clone of https://github.com/GeneHit/drl_practice
+```bash
+conda create -n drl python=3.12
+conda activate drl
+python -m pip install -r requirements.txt
+```
+### Play with the full model
+```python
+import gymnasium as gym
+
+# load_from_hub is assumed to be the helper provided by the drl_practice repo.
+# Load the full model.
+model = load_from_hub(repo_id="winkin119/Rainbow-2d-LunarLander-v3", filename="full_model.pt")
+# Create the environment. The model was trained on stacked image observations
+# (see params.json), so apply the same wrappers as in the repo's env setup.
+env = gym.make("LunarLander-v3")
+state, _ = env.reset()
+action = model.action(state)
+...
+```
+A state-dict version of the model (`state_dict.pt`) is also available; see the repo for the corresponding network definition.

eval_result.json ADDED

+{
+    "mean_reward": 192.33588980155548,
+    "std_reward": 127.62317107151691,
+    "datetime": "2025-08-13T23:25:51.510828+00:00",
+    "train_duration_min": "283.82"
+}
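
The `mean_reward` metric in the README front matter (192.34 +/- 127.62) appears to be the two reward fields above rounded to two decimals. A minimal sketch of that formatting follows; the helper name `format_mean_reward` is hypothetical and not taken from the repo, and the JSON is inlined here rather than read from disk.

```python
import json

# Contents of eval_result.json, copied from the diff above (without the "+" prefixes).
EVAL_RESULT_JSON = """
{
    "mean_reward": 192.33588980155548,
    "std_reward": 127.62317107151691,
    "datetime": "2025-08-13T23:25:51.510828+00:00",
    "train_duration_min": "283.82"
}
"""


def format_mean_reward(result: dict) -> str:
    """Hypothetical helper: format mean and std reward like the model card metric."""
    return f"{result['mean_reward']:.2f} +/- {result['std_reward']:.2f}"


result = json.loads(EVAL_RESULT_JSON)
summary = format_mean_reward(result)

# train_duration_min is stored as a string, so convert it before doing arithmetic.
train_hours = float(result["train_duration_min"]) / 60.0

print(summary)
print(f"training took about {train_hours:.1f} hours")
```

Note that the standard deviation is large relative to the mean, so individual evaluation episodes likely vary widely around the reported average.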

full_model.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:4a1a95bbf540d677ec6b8011d3c4adceb5913382360dca955519344a98fe52a1
+size 4717877
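
The three lines above are a Git LFS pointer: the real weights live in LFS storage, and the pointer records their SHA-256 digest and byte size. As a sketch of how a downloaded copy could be checked against it, the snippet below parses the pointer and compares a local file; the function names `parse_lfs_pointer` and `matches_pointer` are illustrative, not part of any library.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with path.open("rb") as f:
        # Read in chunks so large model files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer for full_model.pt, copied from the diff above (without the "+" prefixes).
POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:4a1a95bbf540d677ec6b8011d3c4adceb5913382360dca955519344a98fe52a1
size 4717877
"""
pointer = parse_lfs_pointer(POINTER_TEXT)
```

After downloading the file, `matches_pointer(Path("full_model.pt"), pointer)` should return `True` for an intact copy. The same check applies to `state_dict.pt` and the TensorBoard event file using their own pointers.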

params.json ADDED

+{
+    "env_config": {
+        "env_id": "LunarLander-v3",
+        "env_kwargs": {},
+        "max_steps": null,
+        "normalize_obs": false,
+        "use_image": true,
+        "vector_env_num": 6,
+        "use_multi_processing": true,
+        "image_shape": [
+            84,
+            84
+        ],
+        "frame_stack": 4,
+        "frame_skip": 2,
+        "training_render_mode": "rgb_array"
+    },
+    "device": "mps",
+    "learning_rate": 0.0001,
+    "gamma": 0.99,
+    "checkpoint_pathname": "",
+    "max_grad_norm": 0.5,
+    "log_interval": 100,
+    "eval_episodes": 100,
+    "eval_random_seed": 42,
+    "eval_video_num": 10,
+    "timesteps": 225000,
+    "epsilon_schedule": {
+        "_type": "ConstantSchedule",
+        "_module": "practice.utils_for_coding.scheduler_utils",
+        "value": 0.0
+    },
+    "replay_buffer_capacity": 0,
+    "batch_size": 64,
+    "train_interval": 1,
+    "target_update_interval": 250,
+    "update_start_step": 2000,
+    "dqn_algorithm": "rainbow",
+    "noisy_std": 0.5,
+    "per_buffer_config": {
+        "capacity": 135000,
+        "n_step": 3,
+        "gamma": 0.99,
+        "use_uniform_sampling": true,
+        "alpha": 0.6,
+        "beta": 0.4,
+        "beta_increment": 2.424242424242424e-06
+    },
+    "v_min": -300.0,
+    "v_max": 300.0,
+    "num_atoms": 51
+}
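
A few quantities follow directly from the values in `params.json`. The sketch below works them out under two stated assumptions that the diff does not confirm: that `v_min`, `v_max`, and `num_atoms` define an evenly spaced C51-style return support, and that `beta` is annealed linearly toward 1.0 by `beta_increment` per update. Since `use_uniform_sampling` is `true`, the prioritization parameters may not be active in this run at all.

```python
# Values copied from params.json above.
V_MIN, V_MAX, NUM_ATOMS = -300.0, 300.0, 51
GAMMA, N_STEP = 0.99, 3
BETA_START, BETA_INCREMENT = 0.4, 2.424242424242424e-06

# Spacing and locations of the fixed return atoms (assumed C51-style support).
delta_z = (V_MAX - V_MIN) / (NUM_ATOMS - 1)
support = [V_MIN + i * delta_z for i in range(NUM_ATOMS)]

# Discount applied to the bootstrap term of an n-step return.
n_step_discount = GAMMA ** N_STEP

# Number of increments for beta to reach 1.0, assuming linear annealing
# (an assumption; the actual schedule is defined in the repo, not in this diff).
beta_steps = round((1.0 - BETA_START) / BETA_INCREMENT)

print(f"atom spacing: {delta_z}")
print(f"support range: [{support[0]}, {support[-1]}]")
print(f"n-step discount: {n_step_discount:.6f}")
print(f"beta increments to reach 1.0: {beta_steps}")
```

Under these assumptions the atoms are 12 reward units apart, and `beta` would need 247,500 increments to reach 1.0, which is slightly more than the configured 225,000 timesteps.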

replay.mp4 ADDED

Binary file (17.7 kB).

state_dict.pt ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:109b8a5b744c269487c2e4d80ad1bccdf73323184fff603bb0949d4ba7b12676
+size 4714165

tensorboard/events.out.tfevents.1755110428.winkindeMacBook-Air.local.24455.0 ADDED

+version https://git-lfs.github.com/spec/v1
+oid sha256:3e695b6d38591d9c1548aa5b23d0591ef6f639609889b351fe5f69e5883bf903
+size 2166680