mmorales34 committed on
Commit 8d2168f · 1 Parent(s): e46f3ce

pushing model

README.md CHANGED
@@ -16,7 +16,7 @@ model-index:
  type: Pong-v4
  metrics:
  - type: mean_reward
- value: -1.60 +/- 7.05
+ value: 5.60 +/- 4.25
  name: mean_reward
  verified: false
  ---
@@ -46,7 +46,7 @@ curl -OL https://huggingface.co/pfunk/Pong-v4-no_tau_sweep-seed1/raw/main/dqpn_a
  curl -OL https://huggingface.co/pfunk/Pong-v4-no_tau_sweep-seed1/raw/main/pyproject.toml
  curl -OL https://huggingface.co/pfunk/Pong-v4-no_tau_sweep-seed1/raw/main/poetry.lock
  poetry install --all-extras
- python dqpn_atari.py --end-policy-f=1000 --env-id=Pong-v4 --evaluation-fraction=0.66 --exp-name=no_tau_sweep --hf-entity=pfunk --policy-tau=1 --save-model=true --seed=1 --start-policy-f=75000 --target-tau=1 --total-timesteps=10000000 --track=true --upload-model=true --wandb-entity=pfunk --wandb-project-name=dqpn
+ python dqpn_atari.py --end-policy-f=1000 --env-id=Pong-v4 --evaluation-fraction=0.75 --exp-name=no_tau_sweep --hf-entity=pfunk --policy-tau=1 --save-model=true --seed=1 --start-policy-f=250000 --target-tau=1 --total-timesteps=10000000 --track=true --upload-model=true --wandb-entity=pfunk --wandb-project-name=dqpn
  ```
 
  # Hyperparameters
@@ -58,7 +58,7 @@ python dqpn_atari.py --end-policy-f=1000 --env-id=Pong-v4 --evaluation-fraction=
  'end_e': 0.01,
  'end_policy_f': 1000,
  'env_id': 'Pong-v4',
- 'evaluation_fraction': 0.66,
+ 'evaluation_fraction': 0.75,
  'exp_name': 'no_tau_sweep',
  'exploration_fraction': 0.1,
  'gamma': 0.99,
@@ -69,7 +69,7 @@ python dqpn_atari.py --end-policy-f=1000 --env-id=Pong-v4 --evaluation-fraction=
  'save_model': True,
  'seed': 1,
  'start_e': 1,
- 'start_policy_f': 75000,
+ 'start_policy_f': 250000,
  'target_network_frequency': 1000,
  'target_tau': 1.0,
  'torch_deterministic': True,
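The two `python dqpn_atari.py` invocations in the README diff differ in only a few flags, which is hard to see in such long lines. A minimal sketch of how the changed flags could be extracted follows; the helper names `parse_flags` and `changed_flags` are hypothetical and are not part of `dqpn_atari.py`.

```python
import shlex

# The old and new commands, copied verbatim from the README diff.
OLD_COMMAND = (
    "python dqpn_atari.py --end-policy-f=1000 --env-id=Pong-v4 "
    "--evaluation-fraction=0.66 --exp-name=no_tau_sweep --hf-entity=pfunk "
    "--policy-tau=1 --save-model=true --seed=1 --start-policy-f=75000 "
    "--target-tau=1 --total-timesteps=10000000 --track=true "
    "--upload-model=true --wandb-entity=pfunk --wandb-project-name=dqpn"
)
NEW_COMMAND = (
    "python dqpn_atari.py --end-policy-f=1000 --env-id=Pong-v4 "
    "--evaluation-fraction=0.75 --exp-name=no_tau_sweep --hf-entity=pfunk "
    "--policy-tau=1 --save-model=true --seed=1 --start-policy-f=250000 "
    "--target-tau=1 --total-timesteps=10000000 --track=true "
    "--upload-model=true --wandb-entity=pfunk --wandb-project-name=dqpn"
)


def parse_flags(command: str) -> dict[str, str]:
    """Map each --name=value token of a command line to its value (hypothetical helper)."""
    flags = {}
    for token in shlex.split(command):
        if token.startswith("--") and "=" in token:
            name, value = token[2:].split("=", 1)
            flags[name] = value
    return flags


def changed_flags(old: str, new: str) -> dict[str, tuple]:
    """Return {flag: (old_value, new_value)} for flags whose values differ."""
    old_flags, new_flags = parse_flags(old), parse_flags(new)
    return {
        name: (old_flags.get(name), new_flags.get(name))
        for name in sorted(old_flags.keys() | new_flags.keys())
        if old_flags.get(name) != new_flags.get(name)
    }


if __name__ == "__main__":
    for name, (before, after) in changed_flags(OLD_COMMAND, NEW_COMMAND).items():
        print(f"--{name}: {before} -> {after}")
```

Run on the commands above, this reports only `--evaluation-fraction` (0.66 to 0.75) and `--start-policy-f` (75000 to 250000), matching the two changed entries in the hyperparameter dictionary.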
events.out.tfevents.1677084064.rhea.230732.0 → events.out.tfevents.1677084593.rhea.231097.0 RENAMED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:cd66c3e1745394486bb997b7f9c3deb1465692290f5925d34edc12864cb890b4
- size 17527511
+ oid sha256:b5dde5ab2039ef9606aad1402c265339affc0f8686004f57f12cb2b7d5bf80fa
+ size 17836836
no_tau_sweep.cleanrl_model CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bd78ed542248a66be7b7e9a1a4d96579dbcc0303ddac8a84381f2bcd70919080
+ oid sha256:696b75191437e2b8b027809d88e4787242c207a2dc1594a92da307d99a034274
  size 6752451
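The entries for the event log and `no_tau_sweep.cleanrl_model` are Git LFS pointer files: each records a SHA-256 digest (`oid`) and a byte count (`size`) rather than the file contents. For the model, the size is unchanged while the digest differs, so the weights changed but the file length did not. A minimal sketch of how downloaded bytes could be checked against such a pointer follows; the function names are hypothetical and this is not how the Git LFS client itself is implemented.

```python
import hashlib

# New pointer for no_tau_sweep.cleanrl_model, copied from the diff.
MODEL_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:696b75191437e2b8b027809d88e4787242c207a2dc1594a92da307d99a034274
size 6752451
"""


def parse_lfs_pointer(text: str) -> dict[str, str]:
    """Split each 'key value' line of a pointer file into a dict entry (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(data: bytes, pointer_text: str) -> bool:
    """Check that data has the size and SHA-256 digest recorded in the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algorithm, _, expected_digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm}")
    return (
        len(data) == int(fields["size"])
        and hashlib.sha256(data).hexdigest() == expected_digest
    )


if __name__ == "__main__":
    fields = parse_lfs_pointer(MODEL_POINTER)
    print(fields["oid"], fields["size"])
```

To check a real download, read the file in binary mode and pass its bytes together with the pointer text to `matches_pointer`.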
replay.mp4 CHANGED
Binary files a/replay.mp4 and b/replay.mp4 differ
 
videos/Pong-v4__no_tau_sweep__1__1677084056-eval/rl-video-episode-0.mp4 DELETED
Binary file (310 kB)
 
videos/Pong-v4__no_tau_sweep__1__1677084056-eval/rl-video-episode-1.mp4 DELETED
Binary file (365 kB)
 
videos/Pong-v4__no_tau_sweep__1__1677084056-eval/rl-video-episode-8.mp4 DELETED
Binary file (380 kB)
 
videos/Pong-v4__no_tau_sweep__1__1677084585-eval/rl-video-episode-0.mp4 ADDED
Binary file (266 kB)
 
videos/Pong-v4__no_tau_sweep__1__1677084585-eval/rl-video-episode-1.mp4 ADDED
Binary file (336 kB)
 
videos/Pong-v4__no_tau_sweep__1__1677084585-eval/rl-video-episode-8.mp4 ADDED
Binary file (281 kB)