[2025-04-19 17:46:50,419][08935] Saving configuration to ./runs/default_experiment/config.json...
[2025-04-19 17:46:50,420][08935] Rollout worker 0 uses device cpu
[2025-04-19 17:46:50,421][08935] Rollout worker 1 uses device cpu
[2025-04-19 17:46:50,422][08935] Rollout worker 2 uses device cpu
[2025-04-19 17:46:50,423][08935] Rollout worker 3 uses device cpu
[2025-04-19 17:46:50,423][08935] Rollout worker 4 uses device cpu
[2025-04-19 17:46:50,424][08935] Rollout worker 5 uses device cpu
[2025-04-19 17:46:50,507][08935] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-04-19 17:46:50,508][08935] InferenceWorker_p0-w0: min num requests: 2
[2025-04-19 17:46:50,531][08935] Starting all processes...
[2025-04-19 17:46:50,531][08935] Starting process learner_proc0
[2025-04-19 17:46:50,581][08935] Starting all processes...
[2025-04-19 17:46:50,587][08935] Starting process inference_proc0-0
[2025-04-19 17:46:50,588][08935] Starting process rollout_proc0
[2025-04-19 17:46:50,588][08935] Starting process rollout_proc1
[2025-04-19 17:46:50,588][08935] Starting process rollout_proc2
[2025-04-19 17:46:50,592][08935] Starting process rollout_proc3
[2025-04-19 17:46:50,593][08935] Starting process rollout_proc4
[2025-04-19 17:46:50,594][08935] Starting process rollout_proc5
[2025-04-19 17:46:52,582][08993] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-04-19 17:46:52,582][08993] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2025-04-19 17:46:52,600][08993] Num visible devices: 1
[2025-04-19 17:46:52,603][08993] Starting seed is not provided
[2025-04-19 17:46:52,603][08993] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-04-19 17:46:52,603][08993] Initializing actor-critic model on device cuda:0
[2025-04-19 17:46:52,603][08993] RunningMeanStd input shape: (3, 72, 128)
[2025-04-19 17:46:52,605][08993] RunningMeanStd input shape: (1,)
[2025-04-19 17:46:52,623][08993] ConvEncoder: input_channels=3
[2025-04-19 17:46:52,812][09005] Worker 1 uses CPU cores [1]
[2025-04-19 17:46:52,822][08993] Conv encoder output size: 512
[2025-04-19 17:46:52,822][08993] Policy head output size: 512
[2025-04-19 17:46:52,846][08993] Created Actor Critic model with architecture:
[2025-04-19 17:46:52,846][08993] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2025-04-19 17:46:52,915][09008] Worker 3 uses CPU cores [3]
[2025-04-19 17:46:52,918][09010] Worker 4 uses CPU cores [4]
[2025-04-19 17:46:53,003][09004] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-04-19 17:46:53,003][09004] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2025-04-19 17:46:53,040][09004] Num visible devices: 1
[2025-04-19 17:46:53,157][09009] Worker 5 uses CPU cores [5]
[2025-04-19 17:46:53,238][08993] Using optimizer <class 'torch.optim.adam.Adam'>
[2025-04-19 17:46:53,302][09006] Worker 0 uses CPU cores [0]
[2025-04-19 17:46:53,354][09007] Worker 2 uses CPU cores [2]
[2025-04-19 17:46:54,037][08993] No checkpoints found
[2025-04-19 17:46:54,037][08993] Did not load from checkpoint, starting from scratch!
[2025-04-19 17:46:54,037][08993] Initialized policy 0 weights for model version 0
[2025-04-19 17:46:54,038][08993] LearnerWorker_p0 finished initialization!
[2025-04-19 17:46:54,038][08993] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-04-19 17:46:54,187][09004] RunningMeanStd input shape: (3, 72, 128)
[2025-04-19 17:46:54,187][09004] RunningMeanStd input shape: (1,)
[2025-04-19 17:46:54,196][09004] ConvEncoder: input_channels=3
[2025-04-19 17:46:54,282][09004] Conv encoder output size: 512
[2025-04-19 17:46:54,283][09004] Policy head output size: 512
[2025-04-19 17:46:54,312][08935] Inference worker 0-0 is ready!
[2025-04-19 17:46:54,313][08935] All inference workers are ready! Signal rollout workers to start!
[2025-04-19 17:46:54,350][09007] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-04-19 17:46:54,354][09005] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-04-19 17:46:54,358][09010] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-04-19 17:46:54,363][09006] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-04-19 17:46:54,372][09009] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-04-19 17:46:54,384][09008] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-04-19 17:46:54,602][09009] Decorrelating experience for 0 frames...
[2025-04-19 17:46:54,603][09010] Decorrelating experience for 0 frames...
[2025-04-19 17:46:54,604][09005] Decorrelating experience for 0 frames...
[2025-04-19 17:46:54,855][09010] Decorrelating experience for 32 frames...
[2025-04-19 17:46:54,855][09009] Decorrelating experience for 32 frames...
[2025-04-19 17:46:54,866][09005] Decorrelating experience for 32 frames...
[2025-04-19 17:46:54,872][09006] Decorrelating experience for 0 frames...
[2025-04-19 17:46:55,163][09008] Decorrelating experience for 0 frames...
[2025-04-19 17:46:55,169][09006] Decorrelating experience for 32 frames...
[2025-04-19 17:46:55,421][09008] Decorrelating experience for 32 frames...
[2025-04-19 17:46:56,518][08993] Signal inference workers to stop experience collection...
[2025-04-19 17:46:56,521][09004] InferenceWorker_p0-w0: stopping experience collection
[2025-04-19 17:46:57,303][08935] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-04-19 17:46:57,304][08935] Avg episode reward: [(0, '3.848')]
[2025-04-19 17:46:57,347][08993] Signal inference workers to resume experience collection...
[2025-04-19 17:46:57,348][09004] InferenceWorker_p0-w0: resuming experience collection
[2025-04-19 17:46:57,618][09007] Decorrelating experience for 0 frames...
[2025-04-19 17:46:57,900][09007] Decorrelating experience for 32 frames...
[2025-04-19 17:47:01,303][09004] Updated weights for policy 0, policy_version 10 (0.0067)
[2025-04-19 17:47:02,303][08935] Fps is (10 sec: 9830.5, 60 sec: 9830.5, 300 sec: 9830.5). Total num frames: 49152. Throughput: 0: 1673.2. Samples: 8366. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:47:02,304][08935] Avg episode reward: [(0, '4.495')]
[2025-04-19 17:47:05,907][09004] Updated weights for policy 0, policy_version 20 (0.0009)
[2025-04-19 17:47:07,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9420.8, 300 sec: 9420.8). Total num frames: 94208. Throughput: 0: 2154.7. Samples: 21547. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:47:07,304][08935] Avg episode reward: [(0, '4.573')]
[2025-04-19 17:47:10,442][09004] Updated weights for policy 0, policy_version 30 (0.0009)
[2025-04-19 17:47:10,500][08935] Heartbeat connected on Batcher_0
[2025-04-19 17:47:10,503][08935] Heartbeat connected on LearnerWorker_p0
[2025-04-19 17:47:10,510][08935] Heartbeat connected on InferenceWorker_p0-w0
[2025-04-19 17:47:10,512][08935] Heartbeat connected on RolloutWorker_w0
[2025-04-19 17:47:10,519][08935] Heartbeat connected on RolloutWorker_w2
[2025-04-19 17:47:10,520][08935] Heartbeat connected on RolloutWorker_w1
[2025-04-19 17:47:10,524][08935] Heartbeat connected on RolloutWorker_w3
[2025-04-19 17:47:10,528][08935] Heartbeat connected on RolloutWorker_w4
[2025-04-19 17:47:10,554][08935] Heartbeat connected on RolloutWorker_w5
[2025-04-19 17:47:12,303][08935] Fps is (10 sec: 8601.5, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 135168. Throughput: 0: 1900.3. Samples: 28505. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:47:12,304][08935] Avg episode reward: [(0, '4.590')]
[2025-04-19 17:47:12,305][08993] Saving new best policy, reward=4.590!
[2025-04-19 17:47:15,226][09004] Updated weights for policy 0, policy_version 40 (0.0008)
[2025-04-19 17:47:17,303][08935] Fps is (10 sec: 8601.6, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 180224. Throughput: 0: 2068.8. Samples: 41376. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:47:17,304][08935] Avg episode reward: [(0, '4.679')]
[2025-04-19 17:47:17,308][08993] Saving new best policy, reward=4.679!
[2025-04-19 17:47:19,663][09004] Updated weights for policy 0, policy_version 50 (0.0008)
[2025-04-19 17:47:22,303][08935] Fps is (10 sec: 9011.3, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 225280. Throughput: 0: 2193.6. Samples: 54839. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:47:22,304][08935] Avg episode reward: [(0, '4.537')]
[2025-04-19 17:47:24,358][09004] Updated weights for policy 0, policy_version 60 (0.0009)
[2025-04-19 17:47:27,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 270336. Throughput: 0: 2054.6. Samples: 61638. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:47:27,304][08935] Avg episode reward: [(0, '4.557')]
[2025-04-19 17:47:29,041][09004] Updated weights for policy 0, policy_version 70 (0.0008)
[2025-04-19 17:47:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 315392. Throughput: 0: 2136.9. Samples: 74792. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:47:32,304][08935] Avg episode reward: [(0, '4.539')]
[2025-04-19 17:47:33,620][09004] Updated weights for policy 0, policy_version 80 (0.0008)
[2025-04-19 17:47:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 360448. Throughput: 0: 2205.1. Samples: 88203. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:47:37,304][08935] Avg episode reward: [(0, '4.418')]
[2025-04-19 17:47:38,320][09004] Updated weights for policy 0, policy_version 90 (0.0008)
[2025-04-19 17:47:42,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8829.2, 300 sec: 8829.2). Total num frames: 397312. Throughput: 0: 2099.0. Samples: 94456. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:47:42,304][08935] Avg episode reward: [(0, '4.463')]
[2025-04-19 17:47:43,616][09004] Updated weights for policy 0, policy_version 100 (0.0009)
[2025-04-19 17:47:47,303][08935] Fps is (10 sec: 7782.4, 60 sec: 8765.4, 300 sec: 8765.4). Total num frames: 438272. Throughput: 0: 2177.4. Samples: 106349. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:47:47,304][08935] Avg episode reward: [(0, '4.508')]
[2025-04-19 17:47:48,450][09004] Updated weights for policy 0, policy_version 110 (0.0008)
[2025-04-19 17:47:52,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8787.8, 300 sec: 8787.8). Total num frames: 483328. Throughput: 0: 2169.2. Samples: 119159. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:47:52,304][08935] Avg episode reward: [(0, '4.537')]
[2025-04-19 17:47:53,061][09004] Updated weights for policy 0, policy_version 120 (0.0008)
[2025-04-19 17:47:57,308][08935] Fps is (10 sec: 8597.1, 60 sec: 8737.4, 300 sec: 8737.4). Total num frames: 524288. Throughput: 0: 2164.1. Samples: 125902. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:47:57,310][08935] Avg episode reward: [(0, '4.575')]
[2025-04-19 17:47:58,535][09004] Updated weights for policy 0, policy_version 130 (0.0008)
[2025-04-19 17:48:02,303][08935] Fps is (10 sec: 7372.6, 60 sec: 8465.0, 300 sec: 8570.1). Total num frames: 557056. Throughput: 0: 2111.5. Samples: 136395. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2025-04-19 17:48:02,304][08935] Avg episode reward: [(0, '4.377')]
[2025-04-19 17:48:04,297][09004] Updated weights for policy 0, policy_version 140 (0.0008)
[2025-04-19 17:48:07,303][08935] Fps is (10 sec: 6966.8, 60 sec: 8328.5, 300 sec: 8484.6). Total num frames: 593920. Throughput: 0: 2050.9. Samples: 147129. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:48:07,304][08935] Avg episode reward: [(0, '4.458')]
[2025-04-19 17:48:09,611][09004] Updated weights for policy 0, policy_version 150 (0.0009)
[2025-04-19 17:48:12,303][08935] Fps is (10 sec: 7782.6, 60 sec: 8328.5, 300 sec: 8465.1). Total num frames: 634880. Throughput: 0: 2033.4. Samples: 153143. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:48:12,304][08935] Avg episode reward: [(0, '4.512')]
[2025-04-19 17:48:15,205][09004] Updated weights for policy 0, policy_version 160 (0.0009)
[2025-04-19 17:48:17,303][08935] Fps is (10 sec: 7782.4, 60 sec: 8192.0, 300 sec: 8396.8). Total num frames: 671744. Throughput: 0: 1989.4. Samples: 164315. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:48:17,304][08935] Avg episode reward: [(0, '4.723')]
[2025-04-19 17:48:17,307][08993] Saving new best policy, reward=4.723!
[2025-04-19 17:48:19,589][09004] Updated weights for policy 0, policy_version 170 (0.0008)
[2025-04-19 17:48:22,303][08935] Fps is (10 sec: 8601.7, 60 sec: 8260.3, 300 sec: 8481.1). Total num frames: 720896. Throughput: 0: 2004.3. Samples: 178396. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:48:22,304][08935] Avg episode reward: [(0, '4.696')]
[2025-04-19 17:48:23,905][09004] Updated weights for policy 0, policy_version 180 (0.0008)
[2025-04-19 17:48:27,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8260.3, 300 sec: 8510.6). Total num frames: 765952. Throughput: 0: 2020.6. Samples: 185384. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:48:27,304][08935] Avg episode reward: [(0, '4.802')]
[2025-04-19 17:48:27,307][08993] Saving new best policy, reward=4.802!
[2025-04-19 17:48:28,363][09004] Updated weights for policy 0, policy_version 190 (0.0008)
[2025-04-19 17:48:32,303][08935] Fps is (10 sec: 9011.1, 60 sec: 8260.3, 300 sec: 8536.9). Total num frames: 811008. Throughput: 0: 2065.0. Samples: 199275. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:48:32,305][08935] Avg episode reward: [(0, '4.793')]
[2025-04-19 17:48:32,763][09004] Updated weights for policy 0, policy_version 200 (0.0009)
[2025-04-19 17:48:37,106][09004] Updated weights for policy 0, policy_version 210 (0.0008)
[2025-04-19 17:48:37,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8328.5, 300 sec: 8601.6). Total num frames: 860160. Throughput: 0: 2094.5. Samples: 213412. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:48:37,304][08935] Avg episode reward: [(0, '4.841')]
[2025-04-19 17:48:37,308][08993] Saving new best policy, reward=4.841!
[2025-04-19 17:48:41,335][09004] Updated weights for policy 0, policy_version 220 (0.0008)
[2025-04-19 17:48:42,305][08935] Fps is (10 sec: 9828.9, 60 sec: 8533.1, 300 sec: 8660.0). Total num frames: 909312. Throughput: 0: 2104.5. Samples: 220597. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:48:42,306][08935] Avg episode reward: [(0, '5.309')]
[2025-04-19 17:48:42,308][08993] Saving new best policy, reward=5.309!
[2025-04-19 17:48:45,971][09004] Updated weights for policy 0, policy_version 230 (0.0008)
[2025-04-19 17:48:47,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8533.3, 300 sec: 8638.8). Total num frames: 950272. Throughput: 0: 2172.6. Samples: 234161. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:48:47,304][08935] Avg episode reward: [(0, '5.528')]
[2025-04-19 17:48:47,335][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000000233_954368.pth...
[2025-04-19 17:48:47,380][08993] Saving new best policy, reward=5.528!
[2025-04-19 17:48:50,477][09004] Updated weights for policy 0, policy_version 240 (0.0008)
[2025-04-19 17:48:52,303][08935] Fps is (10 sec: 9012.7, 60 sec: 8601.6, 300 sec: 8690.6). Total num frames: 999424. Throughput: 0: 2240.3. Samples: 247942. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:48:52,304][08935] Avg episode reward: [(0, '5.789')]
[2025-04-19 17:48:52,304][08993] Saving new best policy, reward=5.789!
[2025-04-19 17:48:54,905][09004] Updated weights for policy 0, policy_version 250 (0.0007)
[2025-04-19 17:48:57,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8670.6, 300 sec: 8704.0). Total num frames: 1044480. Throughput: 0: 2260.9. Samples: 254885. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:48:57,304][08935] Avg episode reward: [(0, '5.373')]
[2025-04-19 17:48:59,243][09004] Updated weights for policy 0, policy_version 260 (0.0008)
[2025-04-19 17:49:02,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8806.4, 300 sec: 8683.5). Total num frames: 1085440. Throughput: 0: 2309.3. Samples: 268233. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:49:02,304][08935] Avg episode reward: [(0, '5.656')]
[2025-04-19 17:49:04,443][09004] Updated weights for policy 0, policy_version 270 (0.0010)
[2025-04-19 17:49:07,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8696.1). Total num frames: 1130496. Throughput: 0: 2280.4. Samples: 281015. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:49:07,304][08935] Avg episode reward: [(0, '6.581')]
[2025-04-19 17:49:07,309][08993] Saving new best policy, reward=6.581!
[2025-04-19 17:49:09,165][09004] Updated weights for policy 0, policy_version 280 (0.0008)
[2025-04-19 17:49:12,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8677.5). Total num frames: 1171456. Throughput: 0: 2264.5. Samples: 287286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-04-19 17:49:12,304][08935] Avg episode reward: [(0, '6.799')]
[2025-04-19 17:49:12,305][08993] Saving new best policy, reward=6.799!
[2025-04-19 17:49:14,060][09004] Updated weights for policy 0, policy_version 290 (0.0008)
[2025-04-19 17:49:17,303][08935] Fps is (10 sec: 8192.0, 60 sec: 9011.2, 300 sec: 8660.1). Total num frames: 1212416. Throughput: 0: 2235.1. Samples: 299852. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:49:17,304][08935] Avg episode reward: [(0, '6.242')]
[2025-04-19 17:49:18,969][09004] Updated weights for policy 0, policy_version 300 (0.0009)
[2025-04-19 17:49:22,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8874.7, 300 sec: 8644.0). Total num frames: 1253376. Throughput: 0: 2201.0. Samples: 312458. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:49:22,304][08935] Avg episode reward: [(0, '6.993')]
[2025-04-19 17:49:22,339][08993] Saving new best policy, reward=6.993!
[2025-04-19 17:49:23,741][09004] Updated weights for policy 0, policy_version 310 (0.0011)
[2025-04-19 17:49:27,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 8656.2). Total num frames: 1298432. Throughput: 0: 2186.4. Samples: 318983. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:49:27,304][08935] Avg episode reward: [(0, '8.612')]
[2025-04-19 17:49:27,310][08993] Saving new best policy, reward=8.612!
[2025-04-19 17:49:28,341][09004] Updated weights for policy 0, policy_version 320 (0.0010)
[2025-04-19 17:49:32,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8806.4, 300 sec: 8641.2). Total num frames: 1339392. Throughput: 0: 2174.7. Samples: 332022. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:49:32,304][08935] Avg episode reward: [(0, '8.630')]
[2025-04-19 17:49:32,315][08993] Saving new best policy, reward=8.630!
[2025-04-19 17:49:33,202][09004] Updated weights for policy 0, policy_version 330 (0.0008)
[2025-04-19 17:49:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8678.4). Total num frames: 1388544. Throughput: 0: 2166.8. Samples: 345447. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:49:37,304][08935] Avg episode reward: [(0, '8.841')]
[2025-04-19 17:49:37,308][08993] Saving new best policy, reward=8.841!
[2025-04-19 17:49:37,645][09004] Updated weights for policy 0, policy_version 340 (0.0008)
[2025-04-19 17:49:41,959][09004] Updated weights for policy 0, policy_version 350 (0.0008)
[2025-04-19 17:49:42,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8738.4, 300 sec: 8688.5). Total num frames: 1433600. Throughput: 0: 2167.5. Samples: 352424. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:49:42,304][08935] Avg episode reward: [(0, '10.682')]
[2025-04-19 17:49:42,304][08993] Saving new best policy, reward=10.682!
[2025-04-19 17:49:46,346][09004] Updated weights for policy 0, policy_version 360 (0.0010)
[2025-04-19 17:49:47,303][08935] Fps is (10 sec: 9420.6, 60 sec: 8874.6, 300 sec: 8722.1). Total num frames: 1482752. Throughput: 0: 2184.0. Samples: 366514. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:49:47,304][08935] Avg episode reward: [(0, '10.472')]
[2025-04-19 17:49:50,780][09004] Updated weights for policy 0, policy_version 370 (0.0008)
[2025-04-19 17:49:52,303][08935] Fps is (10 sec: 9420.7, 60 sec: 8806.4, 300 sec: 8730.3). Total num frames: 1527808. Throughput: 0: 2205.1. Samples: 380246. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:49:52,304][08935] Avg episode reward: [(0, '11.006')]
[2025-04-19 17:49:52,305][08993] Saving new best policy, reward=11.006!
[2025-04-19 17:49:55,575][09004] Updated weights for policy 0, policy_version 380 (0.0010)
[2025-04-19 17:49:57,303][08935] Fps is (10 sec: 8601.4, 60 sec: 8738.1, 300 sec: 8715.4). Total num frames: 1568768. Throughput: 0: 2204.2. Samples: 386478. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:49:57,304][08935] Avg episode reward: [(0, '11.157')]
[2025-04-19 17:49:57,309][08993] Saving new best policy, reward=11.157!
[2025-04-19 17:50:00,514][09004] Updated weights for policy 0, policy_version 390 (0.0009)
[2025-04-19 17:50:02,303][08935] Fps is (10 sec: 8192.1, 60 sec: 8738.1, 300 sec: 8701.2). Total num frames: 1609728. Throughput: 0: 2205.9. Samples: 399119. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:02,304][08935] Avg episode reward: [(0, '11.236')]
[2025-04-19 17:50:02,305][08993] Saving new best policy, reward=11.236!
[2025-04-19 17:50:05,692][09004] Updated weights for policy 0, policy_version 400 (0.0009)
[2025-04-19 17:50:07,303][08935] Fps is (10 sec: 8192.3, 60 sec: 8669.9, 300 sec: 8687.8). Total num frames: 1650688. Throughput: 0: 2194.5. Samples: 411210. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:07,304][08935] Avg episode reward: [(0, '12.841')]
[2025-04-19 17:50:07,308][08993] Saving new best policy, reward=12.841!
[2025-04-19 17:50:10,659][09004] Updated weights for policy 0, policy_version 410 (0.0009)
[2025-04-19 17:50:12,303][08935] Fps is (10 sec: 8191.7, 60 sec: 8669.8, 300 sec: 8675.1). Total num frames: 1691648. Throughput: 0: 2185.9. Samples: 417349. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:12,306][08935] Avg episode reward: [(0, '13.748')]
[2025-04-19 17:50:12,310][08993] Saving new best policy, reward=13.748!
[2025-04-19 17:50:15,441][09004] Updated weights for policy 0, policy_version 420 (0.0010)
[2025-04-19 17:50:17,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8669.8, 300 sec: 8663.0). Total num frames: 1732608. Throughput: 0: 2180.6. Samples: 430148. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:17,304][08935] Avg episode reward: [(0, '13.048')]
[2025-04-19 17:50:19,931][09004] Updated weights for policy 0, policy_version 430 (0.0009)
[2025-04-19 17:50:22,303][08935] Fps is (10 sec: 9011.5, 60 sec: 8806.4, 300 sec: 8691.5). Total num frames: 1781760. Throughput: 0: 2189.2. Samples: 443963. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:50:22,304][08935] Avg episode reward: [(0, '14.504')]
[2025-04-19 17:50:22,305][08993] Saving new best policy, reward=14.504!
[2025-04-19 17:50:24,259][09004] Updated weights for policy 0, policy_version 440 (0.0008)
[2025-04-19 17:50:27,303][08935] Fps is (10 sec: 9420.9, 60 sec: 8806.4, 300 sec: 8699.1). Total num frames: 1826816. Throughput: 0: 2189.6. Samples: 450958. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:27,305][08935] Avg episode reward: [(0, '15.493')]
[2025-04-19 17:50:27,335][08993] Saving new best policy, reward=15.493!
[2025-04-19 17:50:28,984][09004] Updated weights for policy 0, policy_version 450 (0.0008)
[2025-04-19 17:50:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 8706.4). Total num frames: 1871872. Throughput: 0: 2165.2. Samples: 463947. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:50:32,304][08935] Avg episode reward: [(0, '15.586')]
[2025-04-19 17:50:32,305][08993] Saving new best policy, reward=15.586!
[2025-04-19 17:50:33,757][09004] Updated weights for policy 0, policy_version 460 (0.0009)
[2025-04-19 17:50:37,303][08935] Fps is (10 sec: 8601.2, 60 sec: 8738.1, 300 sec: 8694.7). Total num frames: 1912832. Throughput: 0: 2136.8. Samples: 476402. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:37,304][08935] Avg episode reward: [(0, '16.166')]
[2025-04-19 17:50:37,308][08993] Saving new best policy, reward=16.166!
[2025-04-19 17:50:38,770][09004] Updated weights for policy 0, policy_version 470 (0.0009)
[2025-04-19 17:50:42,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8669.9, 300 sec: 8683.5). Total num frames: 1953792. Throughput: 0: 2136.8. Samples: 482633. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:42,304][08935] Avg episode reward: [(0, '15.322')]
[2025-04-19 17:50:43,683][09004] Updated weights for policy 0, policy_version 480 (0.0008)
[2025-04-19 17:50:47,303][08935] Fps is (10 sec: 8192.4, 60 sec: 8533.4, 300 sec: 8672.8). Total num frames: 1994752. Throughput: 0: 2136.2. Samples: 495246. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:47,304][08935] Avg episode reward: [(0, '16.011')]
[2025-04-19 17:50:47,308][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000000487_1994752.pth...
[2025-04-19 17:50:48,527][09004] Updated weights for policy 0, policy_version 490 (0.0008)
[2025-04-19 17:50:52,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8465.1, 300 sec: 8662.6). Total num frames: 2035712. Throughput: 0: 2150.2. Samples: 507969. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:52,304][08935] Avg episode reward: [(0, '17.297')]
[2025-04-19 17:50:52,305][08993] Saving new best policy, reward=17.297!
[2025-04-19 17:50:53,366][09004] Updated weights for policy 0, policy_version 500 (0.0009)
[2025-04-19 17:50:57,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8533.4, 300 sec: 8669.9). Total num frames: 2080768. Throughput: 0: 2154.3. Samples: 514294. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:50:57,304][08935] Avg episode reward: [(0, '16.608')]
[2025-04-19 17:50:57,851][09004] Updated weights for policy 0, policy_version 510 (0.0009)
[2025-04-19 17:51:02,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8601.6, 300 sec: 8676.8). Total num frames: 2125824. Throughput: 0: 2172.4. Samples: 527908. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:51:02,305][08935] Avg episode reward: [(0, '14.945')]
[2025-04-19 17:51:02,690][09004] Updated weights for policy 0, policy_version 520 (0.0009)
[2025-04-19 17:51:07,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8667.1). Total num frames: 2166784. Throughput: 0: 2142.4. Samples: 540371. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:51:07,306][08935] Avg episode reward: [(0, '15.166')]
[2025-04-19 17:51:07,501][09004] Updated weights for policy 0, policy_version 530 (0.0009)
[2025-04-19 17:51:12,222][09004] Updated weights for policy 0, policy_version 540 (0.0008)
[2025-04-19 17:51:12,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8673.9). Total num frames: 2211840. Throughput: 0: 2133.7. Samples: 546973. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:51:12,304][08935] Avg episode reward: [(0, '15.981')]
[2025-04-19 17:51:17,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8601.6, 300 sec: 8648.9). Total num frames: 2248704. Throughput: 0: 2115.4. Samples: 559142. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:51:17,304][08935] Avg episode reward: [(0, '16.686')]
[2025-04-19 17:51:17,621][09004] Updated weights for policy 0, policy_version 550 (0.0009)
[2025-04-19 17:51:22,303][08935] Fps is (10 sec: 7782.4, 60 sec: 8465.1, 300 sec: 8640.2). Total num frames: 2289664. Throughput: 0: 2104.9. Samples: 571122. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:51:22,304][08935] Avg episode reward: [(0, '18.793')]
[2025-04-19 17:51:22,305][08993] Saving new best policy, reward=18.793!
[2025-04-19 17:51:22,557][09004] Updated weights for policy 0, policy_version 560 (0.0008)
[2025-04-19 17:51:27,101][09004] Updated weights for policy 0, policy_version 570 (0.0008)
[2025-04-19 17:51:27,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8465.1, 300 sec: 8647.1). Total num frames: 2334720. Throughput: 0: 2109.7. Samples: 577571. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:51:27,304][08935] Avg episode reward: [(0, '18.396')]
[2025-04-19 17:51:31,750][09004] Updated weights for policy 0, policy_version 580 (0.0008)
[2025-04-19 17:51:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8465.1, 300 sec: 8653.7). Total num frames: 2379776. Throughput: 0: 2125.0. Samples: 590873. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:51:32,304][08935] Avg episode reward: [(0, '18.771')]
[2025-04-19 17:51:36,167][09004] Updated weights for policy 0, policy_version 590 (0.0009)
[2025-04-19 17:51:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8533.4, 300 sec: 8660.1). Total num frames: 2424832. Throughput: 0: 2151.0. Samples: 604763. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:51:37,304][08935] Avg episode reward: [(0, '19.675')]
[2025-04-19 17:51:37,307][08993] Saving new best policy, reward=19.675!
[2025-04-19 17:51:40,633][09004] Updated weights for policy 0, policy_version 600 (0.0008)
[2025-04-19 17:51:42,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8601.6, 300 sec: 8666.3). Total num frames: 2469888. Throughput: 0: 2162.3. Samples: 611599. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:51:42,304][08935] Avg episode reward: [(0, '20.646')]
[2025-04-19 17:51:42,305][08993] Saving new best policy, reward=20.646!
[2025-04-19 17:51:45,154][09004] Updated weights for policy 0, policy_version 610 (0.0009)
[2025-04-19 17:51:47,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8738.1, 300 sec: 8686.3). Total num frames: 2519040. Throughput: 0: 2162.8. Samples: 625236. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:51:47,304][08935] Avg episode reward: [(0, '21.808')]
[2025-04-19 17:51:47,309][08993] Saving new best policy, reward=21.808!
[2025-04-19 17:51:49,683][09004] Updated weights for policy 0, policy_version 620 (0.0008)
[2025-04-19 17:51:52,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8678.0). Total num frames: 2560000. Throughput: 0: 2184.7. Samples: 638682. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:51:52,304][08935] Avg episode reward: [(0, '21.149')]
[2025-04-19 17:51:54,499][09004] Updated weights for policy 0, policy_version 630 (0.0009)
[2025-04-19 17:51:57,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8664.1). Total num frames: 2605056. Throughput: 0: 2178.2. Samples: 644992. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:51:57,304][08935] Avg episode reward: [(0, '20.334')]
[2025-04-19 17:51:59,027][09004] Updated weights for policy 0, policy_version 640 (0.0008)
[2025-04-19 17:52:02,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8664.1). Total num frames: 2650112. Throughput: 0: 2209.5. Samples: 658569. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:52:02,304][08935] Avg episode reward: [(0, '20.266')]
[2025-04-19 17:52:03,501][09004] Updated weights for policy 0, policy_version 650 (0.0008)
[2025-04-19 17:52:07,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8678.0). Total num frames: 2695168. Throughput: 0: 2245.8. Samples: 672183. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:52:07,304][08935] Avg episode reward: [(0, '19.989')]
[2025-04-19 17:52:08,128][09004] Updated weights for policy 0, policy_version 660 (0.0008)
[2025-04-19 17:52:12,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8678.0). Total num frames: 2740224. Throughput: 0: 2248.4. Samples: 678747. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:52:12,304][08935] Avg episode reward: [(0, '18.762')]
[2025-04-19 17:52:12,729][09004] Updated weights for policy 0, policy_version 670 (0.0009)
[2025-04-19 17:52:17,220][09004] Updated weights for policy 0, policy_version 680 (0.0009)
[2025-04-19 17:52:17,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8942.9, 300 sec: 8678.0). Total num frames: 2785280. Throughput: 0: 2255.8. Samples: 692386. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:52:17,304][08935] Avg episode reward: [(0, '18.158')]
[2025-04-19 17:52:21,725][09004] Updated weights for policy 0, policy_version 690 (0.0009)
[2025-04-19 17:52:22,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 8678.0). Total num frames: 2830336. Throughput: 0: 2245.8. Samples: 705822. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:52:22,304][08935] Avg episode reward: [(0, '20.096')]
[2025-04-19 17:52:26,613][09004] Updated weights for policy 0, policy_version 700 (0.0008)
[2025-04-19 17:52:27,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8664.1). Total num frames: 2871296. Throughput: 0: 2225.4. Samples: 711744. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:52:27,304][08935] Avg episode reward: [(0, '23.016')]
[2025-04-19 17:52:27,307][08993] Saving new best policy, reward=23.016!
[2025-04-19 17:52:31,397][09004] Updated weights for policy 0, policy_version 710 (0.0009)
[2025-04-19 17:52:32,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8664.1). Total num frames: 2916352. Throughput: 0: 2221.7. Samples: 725213. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:52:32,304][08935] Avg episode reward: [(0, '24.012')]
[2025-04-19 17:52:32,305][08993] Saving new best policy, reward=24.012!
[2025-04-19 17:52:35,831][09004] Updated weights for policy 0, policy_version 720 (0.0007)
[2025-04-19 17:52:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8942.9, 300 sec: 8691.8). Total num frames: 2961408. Throughput: 0: 2218.8. Samples: 738526. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:52:37,304][08935] Avg episode reward: [(0, '20.950')]
[2025-04-19 17:52:40,565][09004] Updated weights for policy 0, policy_version 730 (0.0009)
[2025-04-19 17:52:42,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 8691.9). Total num frames: 3002368. Throughput: 0: 2221.2. Samples: 744947. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:52:42,304][08935] Avg episode reward: [(0, '22.606')]
[2025-04-19 17:52:45,714][09004] Updated weights for policy 0, policy_version 740 (0.0008)
[2025-04-19 17:52:47,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8738.1, 300 sec: 8678.0). Total num frames: 3043328. Throughput: 0: 2194.5. Samples: 757323. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:52:47,304][08935] Avg episode reward: [(0, '23.337')]
[2025-04-19 17:52:47,308][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000000743_3043328.pth...
[2025-04-19 17:52:47,355][08993] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000000233_954368.pth
[2025-04-19 17:52:50,370][09004] Updated weights for policy 0, policy_version 750 (0.0008)
[2025-04-19 17:52:52,303][08935] Fps is (10 sec: 8601.5, 60 sec: 8806.4, 300 sec: 8692.0). Total num frames: 3088384. Throughput: 0: 2181.8. Samples: 770366. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:52:52,304][08935] Avg episode reward: [(0, '22.798')]
[2025-04-19 17:52:54,851][09004] Updated weights for policy 0, policy_version 760 (0.0008)
[2025-04-19 17:52:57,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8733.5). Total num frames: 3133440. Throughput: 0: 2186.6. Samples: 777142. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:52:57,304][08935] Avg episode reward: [(0, '21.810')]
[2025-04-19 17:52:59,340][09004] Updated weights for policy 0, policy_version 770 (0.0009)
[2025-04-19 17:53:02,303][08935] Fps is (10 sec: 9011.3, 60 sec: 8806.4, 300 sec: 8761.3). Total num frames: 3178496. Throughput: 0: 2188.4. Samples: 790862. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:02,304][08935] Avg episode reward: [(0, '21.429')]
[2025-04-19 17:53:03,816][09004] Updated weights for policy 0, policy_version 780 (0.0008)
[2025-04-19 17:53:07,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8775.2). Total num frames: 3223552. Throughput: 0: 2198.1. Samples: 804736. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:07,304][08935] Avg episode reward: [(0, '20.614')]
[2025-04-19 17:53:08,231][09004] Updated weights for policy 0, policy_version 790 (0.0008)
[2025-04-19 17:53:12,303][08935] Fps is (10 sec: 9420.7, 60 sec: 8874.7, 300 sec: 8816.8). Total num frames: 3272704. Throughput: 0: 2221.1. Samples: 811693. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:12,304][08935] Avg episode reward: [(0, '21.085')]
[2025-04-19 17:53:12,586][09004] Updated weights for policy 0, policy_version 800 (0.0007)
[2025-04-19 17:53:16,929][09004] Updated weights for policy 0, policy_version 810 (0.0007)
[2025-04-19 17:53:17,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8874.7, 300 sec: 8802.9). Total num frames: 3317760. Throughput: 0: 2233.7. Samples: 825729. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:17,304][08935] Avg episode reward: [(0, '22.154')]
[2025-04-19 17:53:21,335][09004] Updated weights for policy 0, policy_version 820 (0.0008)
[2025-04-19 17:53:22,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8942.9, 300 sec: 8816.8). Total num frames: 3366912. Throughput: 0: 2253.7. Samples: 839944. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:22,304][08935] Avg episode reward: [(0, '21.032')]
[2025-04-19 17:53:25,718][09004] Updated weights for policy 0, policy_version 830 (0.0007)
[2025-04-19 17:53:27,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9011.2, 300 sec: 8816.8). Total num frames: 3411968. Throughput: 0: 2264.6. Samples: 846856. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:27,304][08935] Avg episode reward: [(0, '21.124')]
[2025-04-19 17:53:30,156][09004] Updated weights for policy 0, policy_version 840 (0.0008)
[2025-04-19 17:53:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 8802.9). Total num frames: 3457024. Throughput: 0: 2299.7. Samples: 860811. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:32,304][08935] Avg episode reward: [(0, '23.921')]
[2025-04-19 17:53:34,590][09004] Updated weights for policy 0, policy_version 850 (0.0008)
[2025-04-19 17:53:37,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9079.5, 300 sec: 8803.0). Total num frames: 3506176. Throughput: 0: 2313.7. Samples: 874483. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:37,304][08935] Avg episode reward: [(0, '27.222')]
[2025-04-19 17:53:37,307][08993] Saving new best policy, reward=27.222!
[2025-04-19 17:53:39,119][09004] Updated weights for policy 0, policy_version 860 (0.0008)
[2025-04-19 17:53:42,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9147.7, 300 sec: 8816.8). Total num frames: 3551232. Throughput: 0: 2315.9. Samples: 881356. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:42,304][08935] Avg episode reward: [(0, '27.461')]
[2025-04-19 17:53:42,305][08993] Saving new best policy, reward=27.461!
[2025-04-19 17:53:43,553][09004] Updated weights for policy 0, policy_version 870 (0.0008)
[2025-04-19 17:53:47,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8802.9). Total num frames: 3596288. Throughput: 0: 2315.7. Samples: 895068. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:47,304][08935] Avg episode reward: [(0, '24.830')]
[2025-04-19 17:53:48,006][09004] Updated weights for policy 0, policy_version 880 (0.0008)
[2025-04-19 17:53:52,251][09004] Updated weights for policy 0, policy_version 890 (0.0007)
[2025-04-19 17:53:52,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8816.8). Total num frames: 3645440. Throughput: 0: 2324.4. Samples: 909336. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:52,304][08935] Avg episode reward: [(0, '22.522')]
[2025-04-19 17:53:56,654][09004] Updated weights for policy 0, policy_version 900 (0.0008)
[2025-04-19 17:53:57,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8816.8). Total num frames: 3686400. Throughput: 0: 2321.1. Samples: 916142. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:53:57,304][08935] Avg episode reward: [(0, '23.013')]
[2025-04-19 17:54:01,316][09004] Updated weights for policy 0, policy_version 910 (0.0008)
[2025-04-19 17:54:02,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9284.3, 300 sec: 8830.7). Total num frames: 3735552. Throughput: 0: 2310.6. Samples: 929704. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:54:02,304][08935] Avg episode reward: [(0, '26.877')]
[2025-04-19 17:54:05,695][09004] Updated weights for policy 0, policy_version 920 (0.0008)
[2025-04-19 17:54:07,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8844.6). Total num frames: 3780608. Throughput: 0: 2304.1. Samples: 943630. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:54:07,305][08935] Avg episode reward: [(0, '27.147')]
[2025-04-19 17:54:10,197][09004] Updated weights for policy 0, policy_version 930 (0.0008)
[2025-04-19 17:54:12,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8858.5). Total num frames: 3825664. Throughput: 0: 2303.3. Samples: 950503. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:54:12,304][08935] Avg episode reward: [(0, '26.537')]
[2025-04-19 17:54:14,632][09004] Updated weights for policy 0, policy_version 940 (0.0008)
[2025-04-19 17:54:17,303][08935] Fps is (10 sec: 9420.7, 60 sec: 9284.2, 300 sec: 8886.2). Total num frames: 3874816. Throughput: 0: 2299.1. Samples: 964269. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:54:17,304][08935] Avg episode reward: [(0, '23.645')]
[2025-04-19 17:54:19,111][09004] Updated weights for policy 0, policy_version 950 (0.0008)
[2025-04-19 17:54:22,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 8886.2). Total num frames: 3919872. Throughput: 0: 2303.6. Samples: 978146. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:54:22,304][08935] Avg episode reward: [(0, '24.159')]
[2025-04-19 17:54:23,519][09004] Updated weights for policy 0, policy_version 960 (0.0010)
[2025-04-19 17:54:27,303][08935] Fps is (10 sec: 9011.4, 60 sec: 9216.0, 300 sec: 8900.1). Total num frames: 3964928. Throughput: 0: 2303.0. Samples: 984990. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:54:27,304][08935] Avg episode reward: [(0, '26.718')]
[2025-04-19 17:54:28,024][09004] Updated weights for policy 0, policy_version 970 (0.0008)
[2025-04-19 17:54:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8886.2). Total num frames: 4009984. Throughput: 0: 2306.5. Samples: 998860. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:54:32,304][08935] Avg episode reward: [(0, '26.309')]
[2025-04-19 17:54:32,348][09004] Updated weights for policy 0, policy_version 980 (0.0007)
[2025-04-19 17:54:36,734][09004] Updated weights for policy 0, policy_version 990 (0.0008)
[2025-04-19 17:54:37,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 8900.1). Total num frames: 4059136. Throughput: 0: 2303.2. Samples: 1012981. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:54:37,304][08935] Avg episode reward: [(0, '27.251')]
[2025-04-19 17:54:41,099][09004] Updated weights for policy 0, policy_version 1000 (0.0008)
[2025-04-19 17:54:42,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 8886.2). Total num frames: 4104192. Throughput: 0: 2306.2. Samples: 1019923. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:54:42,304][08935] Avg episode reward: [(0, '24.135')]
[2025-04-19 17:54:45,484][09004] Updated weights for policy 0, policy_version 1010 (0.0008)
[2025-04-19 17:54:47,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8886.2). Total num frames: 4149248. Throughput: 0: 2319.7. Samples: 1034089. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:54:47,304][08935] Avg episode reward: [(0, '22.221')]
[2025-04-19 17:54:47,315][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001014_4153344.pth...
[2025-04-19 17:54:47,359][08993] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000000487_1994752.pth
[2025-04-19 17:54:50,000][09004] Updated weights for policy 0, policy_version 1020 (0.0009)
[2025-04-19 17:54:52,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 8914.0). Total num frames: 4198400. Throughput: 0: 2313.9. Samples: 1047757. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:54:52,304][08935] Avg episode reward: [(0, '24.666')]
[2025-04-19 17:54:54,461][09004] Updated weights for policy 0, policy_version 1030 (0.0008)
[2025-04-19 17:54:57,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8927.9). Total num frames: 4243456. Throughput: 0: 2312.8. Samples: 1054580. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:54:57,304][08935] Avg episode reward: [(0, '25.012')]
[2025-04-19 17:54:58,959][09004] Updated weights for policy 0, policy_version 1040 (0.0007)
[2025-04-19 17:55:02,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8941.8). Total num frames: 4288512. Throughput: 0: 2312.2. Samples: 1068317. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:55:02,304][08935] Avg episode reward: [(0, '24.328')]
[2025-04-19 17:55:03,371][09004] Updated weights for policy 0, policy_version 1050 (0.0010)
[2025-04-19 17:55:07,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8969.6). Total num frames: 4337664. Throughput: 0: 2313.3. Samples: 1082245. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:55:07,304][08935] Avg episode reward: [(0, '26.567')]
[2025-04-19 17:55:07,823][09004] Updated weights for policy 0, policy_version 1060 (0.0008)
[2025-04-19 17:55:12,141][09004] Updated weights for policy 0, policy_version 1070 (0.0008)
[2025-04-19 17:55:12,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8983.4). Total num frames: 4382720. Throughput: 0: 2314.0. Samples: 1089120. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2025-04-19 17:55:12,304][08935] Avg episode reward: [(0, '26.622')]
[2025-04-19 17:55:16,503][09004] Updated weights for policy 0, policy_version 1080 (0.0007)
[2025-04-19 17:55:17,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8969.5). Total num frames: 4427776. Throughput: 0: 2320.7. Samples: 1103292. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:55:17,304][08935] Avg episode reward: [(0, '25.447')]
[2025-04-19 17:55:20,859][09004] Updated weights for policy 0, policy_version 1090 (0.0008)
[2025-04-19 17:55:22,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8983.4). Total num frames: 4476928. Throughput: 0: 2319.6. Samples: 1117363. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:55:22,304][08935] Avg episode reward: [(0, '24.431')]
[2025-04-19 17:55:25,312][09004] Updated weights for policy 0, policy_version 1100 (0.0008)
[2025-04-19 17:55:27,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8983.4). Total num frames: 4521984. Throughput: 0: 2317.4. Samples: 1124208. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:55:27,304][08935] Avg episode reward: [(0, '21.919')]
[2025-04-19 17:55:29,818][09004] Updated weights for policy 0, policy_version 1110 (0.0008)
[2025-04-19 17:55:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9284.3, 300 sec: 8997.3). Total num frames: 4567040. Throughput: 0: 2307.6. Samples: 1137932. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:55:32,304][08935] Avg episode reward: [(0, '23.175')]
[2025-04-19 17:55:34,294][09004] Updated weights for policy 0, policy_version 1120 (0.0008)
[2025-04-19 17:55:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 9011.2). Total num frames: 4612096. Throughput: 0: 2309.2. Samples: 1151673. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:55:37,305][08935] Avg episode reward: [(0, '24.341')]
[2025-04-19 17:55:38,791][09004] Updated weights for policy 0, policy_version 1130 (0.0008)
[2025-04-19 17:55:42,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 9025.1). Total num frames: 4657152. Throughput: 0: 2309.3. Samples: 1158498. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:55:42,304][08935] Avg episode reward: [(0, '24.511')]
[2025-04-19 17:55:43,215][09004] Updated weights for policy 0, policy_version 1140 (0.0008)
[2025-04-19 17:55:47,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 9052.9). Total num frames: 4706304. Throughput: 0: 2314.0. Samples: 1172449. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:55:47,304][08935] Avg episode reward: [(0, '26.442')]
[2025-04-19 17:55:47,579][09004] Updated weights for policy 0, policy_version 1150 (0.0009)
[2025-04-19 17:55:51,913][09004] Updated weights for policy 0, policy_version 1160 (0.0007)
[2025-04-19 17:55:52,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 9052.9). Total num frames: 4751360. Throughput: 0: 2319.6. Samples: 1186629. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:55:52,304][08935] Avg episode reward: [(0, '26.630')]
[2025-04-19 17:55:56,319][09004] Updated weights for policy 0, policy_version 1170 (0.0007)
[2025-04-19 17:55:57,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 9066.7). Total num frames: 4800512. Throughput: 0: 2319.2. Samples: 1193482. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:55:57,304][08935] Avg episode reward: [(0, '28.653')]
[2025-04-19 17:55:57,307][08993] Saving new best policy, reward=28.653!
[2025-04-19 17:56:00,753][09004] Updated weights for policy 0, policy_version 1180 (0.0007)
[2025-04-19 17:56:02,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 9080.6). Total num frames: 4845568. Throughput: 0: 2315.3. Samples: 1207479. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:56:02,304][08935] Avg episode reward: [(0, '25.988')]
[2025-04-19 17:56:05,217][09004] Updated weights for policy 0, policy_version 1190 (0.0008)
[2025-04-19 17:56:07,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 9080.6). Total num frames: 4890624. Throughput: 0: 2309.6. Samples: 1221296. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-04-19 17:56:07,304][08935] Avg episode reward: [(0, '27.006')]
[2025-04-19 17:56:09,687][09004] Updated weights for policy 0, policy_version 1200 (0.0009)
[2025-04-19 17:56:12,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 9108.4). Total num frames: 4935680. Throughput: 0: 2309.8. Samples: 1228147. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:56:12,304][08935] Avg episode reward: [(0, '25.495')]
[2025-04-19 17:56:14,192][09004] Updated weights for policy 0, policy_version 1210 (0.0008)
[2025-04-19 17:56:17,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 9136.2). Total num frames: 4984832. Throughput: 0: 2310.4. Samples: 1241900. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-04-19 17:56:17,304][08935] Avg episode reward: [(0, '24.476')]
[2025-04-19 17:56:18,631][09004] Updated weights for policy 0, policy_version 1220 (0.0008)
[2025-04-19 17:56:19,566][08993] Stopping Batcher_0...
[2025-04-19 17:56:19,566][08993] Loop batcher_evt_loop terminating...
[2025-04-19 17:56:19,566][08935] Component Batcher_0 stopped!
[2025-04-19 17:56:19,566][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
[2025-04-19 17:56:19,579][09006] Stopping RolloutWorker_w0...
[2025-04-19 17:56:19,579][09006] Loop rollout_proc0_evt_loop terminating...
[2025-04-19 17:56:19,579][08935] Component RolloutWorker_w0 stopped!
[2025-04-19 17:56:19,580][09004] Weights refcount: 2 0
[2025-04-19 17:56:19,582][09004] Stopping InferenceWorker_p0-w0...
[2025-04-19 17:56:19,583][09004] Loop inference_proc0-0_evt_loop terminating...
[2025-04-19 17:56:19,584][08935] Component InferenceWorker_p0-w0 stopped!
[2025-04-19 17:56:19,594][09007] Stopping RolloutWorker_w2...
[2025-04-19 17:56:19,594][09007] Loop rollout_proc2_evt_loop terminating...
[2025-04-19 17:56:19,594][08935] Component RolloutWorker_w2 stopped!
[2025-04-19 17:56:19,597][09005] Stopping RolloutWorker_w1...
[2025-04-19 17:56:19,597][09005] Loop rollout_proc1_evt_loop terminating...
[2025-04-19 17:56:19,597][08935] Component RolloutWorker_w1 stopped!
[2025-04-19 17:56:19,618][08993] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000000743_3043328.pth
[2025-04-19 17:56:19,628][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
[2025-04-19 17:56:19,631][09009] Stopping RolloutWorker_w5...
[2025-04-19 17:56:19,632][09009] Loop rollout_proc5_evt_loop terminating...
[2025-04-19 17:56:19,631][08935] Component RolloutWorker_w5 stopped!
[2025-04-19 17:56:19,673][09010] Stopping RolloutWorker_w4...
[2025-04-19 17:56:19,673][09010] Loop rollout_proc4_evt_loop terminating...
[2025-04-19 17:56:19,679][09008] Stopping RolloutWorker_w3...
[2025-04-19 17:56:19,680][09008] Loop rollout_proc3_evt_loop terminating...
[2025-04-19 17:56:19,676][08935] Component RolloutWorker_w4 stopped!
[2025-04-19 17:56:19,682][08935] Component RolloutWorker_w3 stopped!
[2025-04-19 17:56:19,767][08993] Stopping LearnerWorker_p0...
[2025-04-19 17:56:19,768][08993] Loop learner_proc0_evt_loop terminating...
[2025-04-19 17:56:19,767][08935] Component LearnerWorker_p0 stopped!
[2025-04-19 17:56:19,768][08935] Waiting for process learner_proc0 to stop...
[2025-04-19 17:56:20,633][08935] Waiting for process inference_proc0-0 to join...
[2025-04-19 17:56:20,633][08935] Waiting for process rollout_proc0 to join...
[2025-04-19 17:56:20,634][08935] Waiting for process rollout_proc1 to join...
[2025-04-19 17:56:20,635][08935] Waiting for process rollout_proc2 to join...
[2025-04-19 17:56:20,635][08935] Waiting for process rollout_proc3 to join...
[2025-04-19 17:56:20,636][08935] Waiting for process rollout_proc4 to join...
[2025-04-19 17:56:20,636][08935] Waiting for process rollout_proc5 to join...
[2025-04-19 17:56:20,637][08935] Batcher 0 profile tree view: |
|
batching: 14.8875, releasing_batches: 0.0256 |
|
[2025-04-19 17:56:20,637][08935] InferenceWorker_p0-w0 profile tree view: |
|
wait_policy: 0.0000 |
|
wait_policy_total: 9.4959 |
|
update_model: 8.1520 |
|
weight_update: 0.0008 |
|
one_step: 0.0030 |
|
handle_policy_step: 518.4990 |
|
deserialize: 14.0397, stack: 3.2776, obs_to_device_normalize: 122.3004, forward: 259.9247, send_messages: 27.2078 |
|
prepare_outputs: 67.8854 |
|
to_cpu: 40.3985 |
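The profile entries are cumulative seconds over the whole run, and child timers need not sum to their parent. For handle_policy_step the timed children account for roughly 494.6 s of the 518.5 s total, the remainder being untimed loop overhead; a quick check:

    # children of handle_policy_step, in seconds, read off the tree above
    children = [14.0397, 3.2776, 122.3004, 259.9247, 27.2078, 67.8854]
    print(sum(children))             # 494.6356
    print(518.4990 - sum(children))  # ~23.9 s unaccounted for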
|
[2025-04-19 17:56:20,638][08935] Learner 0 profile tree view: |
|
misc: 0.0037, prepare_batch: 8.8470 |
|
train: 34.5834 |
|
epoch_init: 0.0040, minibatch_init: 0.0056, losses_postprocess: 0.5332, kl_divergence: 0.4351, after_optimizer: 15.5133 |
|
calculate_losses: 11.7097 |
|
losses_init: 0.0024, forward_head: 0.7281, bptt_initial: 8.1532, tail: 0.5335, advantages_returns: 0.1383, losses: 1.0890 |
|
bptt: 0.9039 |
|
bptt_forward_core: 0.8562 |
|
update: 6.0418 |
|
clip: 0.6656 |
|
[2025-04-19 17:56:20,639][08935] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.2220, enqueue_policy_requests: 14.9416, env_step: 189.8717, overhead: 9.0107, complete_rollouts: 0.5647 |
|
save_policy_outputs: 15.6368 |
|
split_output_tensors: 5.3303 |
|
[2025-04-19 17:56:20,639][08935] RolloutWorker_w5 profile tree view: |
|
wait_for_trajectories: 0.2358, enqueue_policy_requests: 15.0036, env_step: 190.3710, overhead: 8.8125, complete_rollouts: 0.5870 |
|
save_policy_outputs: 15.6126 |
|
split_output_tensors: 5.3595 |
|
[2025-04-19 17:56:20,640][08935] Loop Runner_EvtLoop terminating... |
|
[2025-04-19 17:56:20,640][08935] Runner profile tree view: |
|
main_loop: 570.1098 |
|
[2025-04-19 17:56:20,641][08935] Collected {0: 5005312}, FPS: 8779.6 |
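The headline number is simply total environment frames divided by the runner's main-loop wall time, as a quick check confirms:

    print(5_005_312 / 570.1098)  # ~8779.6, matching the reported FPS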
|
[2025-04-19 17:56:20,649][08935] Loading existing experiment configuration from ./runs/default_experiment/config.json |
|
[2025-04-19 17:56:20,649][08935] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-04-19 17:56:20,650][08935] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-04-19 17:56:20,650][08935] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-04-19 17:56:20,650][08935] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-04-19 17:56:20,651][08935] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-04-19 17:56:20,651][08935] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! |
|
[2025-04-19 17:56:20,651][08935] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-04-19 17:56:20,652][08935] Adding new argument 'push_to_hub'=True that is not in the saved config file! |
|
[2025-04-19 17:56:20,652][08935] Adding new argument 'hf_repository'='CarlosElArtista/vizdoom_health_gathering_supreme' that is not in the saved config file! |
|
[2025-04-19 17:56:20,652][08935] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-04-19 17:56:20,653][08935] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-04-19 17:56:20,653][08935] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-04-19 17:56:20,654][08935] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-04-19 17:56:20,655][08935] Using frameskip 1 and render_action_repeat=4 for evaluation |
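Frameskip 1 means the evaluation environment emits every rendered frame (so the saved video is smooth), while render_action_repeat=4 repeats each chosen action for four frames, matching the frameskip-4 decision rate the policy was presumably trained with. A toy illustration of the pattern (stand-in names, not sample-factory code):

    render_action_repeat = 4
    action = None
    for frame in range(8):
        if frame % render_action_repeat == 0:
            action = f"policy_action@{frame}"  # stand-in for an inference call
        print(frame, action)  # every frame is rendered; the action changes every 4th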
|
[2025-04-19 17:56:20,677][08935] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-04-19 17:56:20,678][08935] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-04-19 17:56:20,679][08935] RunningMeanStd input shape: (1,) |
|
[2025-04-19 17:56:20,689][08935] ConvEncoder: input_channels=3 |
|
[2025-04-19 17:56:20,867][08935] Conv encoder output size: 512 |
|
[2025-04-19 17:56:20,870][08935] Policy head output size: 512 |
|
[2025-04-19 17:56:21,139][08935] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... |
|
[2025-04-19 17:56:21,141][08935] Could not load from checkpoint, attempt 0 |
|
Traceback (most recent call last): |
|
File "/home/charles/miniconda3/envs/deep_rl_hf_unit_8_2/lib/python3.10/site-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
File "/home/charles/miniconda3/envs/deep_rl_hf_unit_8_2/lib/python3.10/site-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. |
|
(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message. |
|
WeightsUnpickler error: Unsupported global: GLOBAL numpy.core.multiarray.scalar was not an allowed global by default. Please use `torch.serialization.add_safe_globals([scalar])` or the `torch.serialization.safe_globals([scalar])` context manager to allowlist this global if you trust this class/function. |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-04-19 17:56:21,142][08935] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... |
|
[2025-04-19 17:56:21,143][08935] Could not load from checkpoint, attempt 1 |
|
(traceback identical to attempt 0, omitted) |
|
[2025-04-19 17:56:21,144][08935] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... |
|
[2025-04-19 17:56:21,145][08935] Could not load from checkpoint, attempt 2 |
|
(traceback identical to attempt 0, omitted) |
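The load fails because PyTorch 2.6 changed the torch.load default to weights_only=True, and this checkpoint pickles a numpy scalar that is not on the safe-globals allowlist; the enjoy script retries three times and gives up. The run relaunched at 18:34 below loads the same checkpoint cleanly, so presumably one of the two workarounds named in the error message was applied in between. A sketch of both, to be used only with checkpoints you trust (this one was just written by the local training run):

    import numpy.core.multiarray
    import torch

    ckpt = "./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth"

    # option 1: keep weights_only=True but allowlist the offending global
    torch.serialization.add_safe_globals([numpy.core.multiarray.scalar])
    checkpoint_dict = torch.load(ckpt, map_location="cpu")

    # option 2: opt out of the safe unpickler entirely (arbitrary code
    # execution risk, acceptable only for trusted files)
    checkpoint_dict = torch.load(ckpt, map_location="cpu", weights_only=False)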
|
[2025-04-19 18:34:27,236][15432] Saving configuration to ./runs/default_experiment/config.json... |
|
[2025-04-19 18:34:27,238][15432] Rollout worker 0 uses device cpu |
|
[2025-04-19 18:34:27,238][15432] Rollout worker 1 uses device cpu |
|
[2025-04-19 18:34:27,239][15432] Rollout worker 2 uses device cpu |
|
[2025-04-19 18:34:27,240][15432] Rollout worker 3 uses device cpu |
|
[2025-04-19 18:34:27,240][15432] Rollout worker 4 uses device cpu |
|
[2025-04-19 18:34:27,241][15432] Rollout worker 5 uses device cpu |
|
[2025-04-19 18:34:27,309][15432] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-04-19 18:34:27,309][15432] InferenceWorker_p0-w0: min num requests: 2 |
|
[2025-04-19 18:34:27,333][15432] Starting all processes... |
|
[2025-04-19 18:34:27,334][15432] Starting process learner_proc0 |
|
[2025-04-19 18:34:27,383][15432] Starting all processes... |
|
[2025-04-19 18:34:27,386][15432] Starting process inference_proc0-0 |
|
[2025-04-19 18:34:27,387][15432] Starting process rollout_proc0 |
|
[2025-04-19 18:34:27,388][15432] Starting process rollout_proc1 |
|
[2025-04-19 18:34:27,390][15432] Starting process rollout_proc2 |
|
[2025-04-19 18:34:27,390][15432] Starting process rollout_proc3 |
|
[2025-04-19 18:34:27,391][15432] Starting process rollout_proc4 |
|
[2025-04-19 18:34:27,391][15432] Starting process rollout_proc5 |
|
[2025-04-19 18:34:29,635][15496] Worker 1 uses CPU cores [1] |
|
[2025-04-19 18:34:29,700][15501] Worker 5 uses CPU cores [5] |
|
[2025-04-19 18:34:29,860][15495] Worker 0 uses CPU cores [0] |
|
[2025-04-19 18:34:29,985][15500] Worker 4 uses CPU cores [4] |
|
[2025-04-19 18:34:30,046][15499] Worker 3 uses CPU cores [3] |
|
[2025-04-19 18:34:30,244][15484] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-04-19 18:34:30,244][15484] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 |
|
[2025-04-19 18:34:30,259][15484] Num visible devices: 1 |
|
[2025-04-19 18:34:30,260][15484] Starting seed is not provided |
|
[2025-04-19 18:34:30,260][15484] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-04-19 18:34:30,260][15484] Initializing actor-critic model on device cuda:0 |
|
[2025-04-19 18:34:30,260][15484] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-04-19 18:34:30,261][15484] RunningMeanStd input shape: (1,) |
|
[2025-04-19 18:34:30,266][15498] Worker 2 uses CPU cores [2] |
|
[2025-04-19 18:34:30,271][15484] ConvEncoder: input_channels=3 |
|
[2025-04-19 18:34:30,337][15497] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-04-19 18:34:30,337][15497] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 |
|
[2025-04-19 18:34:30,352][15497] Num visible devices: 1 |
|
[2025-04-19 18:34:30,355][15484] Conv encoder output size: 512 |
|
[2025-04-19 18:34:30,355][15484] Policy head output size: 512 |
|
[2025-04-19 18:34:30,367][15484] Created Actor Critic model with architecture: |
|
[2025-04-19 18:34:30,367][15484] ActorCriticSharedWeights( |
|
(obs_normalizer): ObservationNormalizer( |
|
(running_mean_std): RunningMeanStdDictInPlace( |
|
(running_mean_std): ModuleDict( |
|
(obs): RunningMeanStdInPlace() |
|
) |
|
) |
|
) |
|
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) |
|
(encoder): VizdoomEncoder( |
|
(basic_encoder): ConvEncoder( |
|
(enc): RecursiveScriptModule( |
|
original_name=ConvEncoderImpl |
|
(conv_head): RecursiveScriptModule( |
|
original_name=Sequential |
|
(0): RecursiveScriptModule(original_name=Conv2d) |
|
(1): RecursiveScriptModule(original_name=ELU) |
|
(2): RecursiveScriptModule(original_name=Conv2d) |
|
(3): RecursiveScriptModule(original_name=ELU) |
|
(4): RecursiveScriptModule(original_name=Conv2d) |
|
(5): RecursiveScriptModule(original_name=ELU) |
|
) |
|
(mlp_layers): RecursiveScriptModule( |
|
original_name=Sequential |
|
(0): RecursiveScriptModule(original_name=Linear) |
|
(1): RecursiveScriptModule(original_name=ELU) |
|
) |
|
) |
|
) |
|
) |
|
(core): ModelCoreRNN( |
|
(core): GRU(512, 512) |
|
) |
|
(decoder): MlpDecoder( |
|
(mlp): Identity() |
|
) |
|
(critic_linear): Linear(in_features=512, out_features=1, bias=True) |
|
(action_parameterization): ActionParameterizationDefault( |
|
(distribution_linear): Linear(in_features=512, out_features=5, bias=True) |
|
) |
|
) |
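The tail of the printout pins down the output shapes: a 512-unit GRU core, a scalar value head, and logits over five discrete actions. Restated as plain PyTorch modules just to make the shapes concrete (this is not how sample-factory builds the model):

    import torch.nn as nn

    core = nn.GRU(512, 512)                  # recurrent core
    critic_linear = nn.Linear(512, 1)        # state-value estimate
    distribution_linear = nn.Linear(512, 5)  # logits for the 5 discrete actions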
|
[2025-04-19 18:34:30,661][15484] Using optimizer <class 'torch.optim.adam.Adam'> |
|
[2025-04-19 18:34:31,476][15484] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth... |
|
[2025-04-19 18:34:31,498][15484] Loading model from checkpoint |
|
[2025-04-19 18:34:31,499][15484] Loaded experiment state at self.train_step=1222, self.env_steps=5005312 |
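The checkpoint filenames encode both counters: checkpoint_000001222_5005312.pth is train_step 1222 at env_step 5,005,312, consistent with 4,096 environment frames consumed per training iteration (a batch size inferred from these numbers, not read from the config):

    for train_step, env_steps in [(1222, 5_005_312), (1224, 5_013_504)]:
        assert env_steps == train_step * 4096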
|
[2025-04-19 18:34:31,499][15484] Initialized policy 0 weights for model version 1222 |
|
[2025-04-19 18:34:31,501][15484] LearnerWorker_p0 finished initialization! |
|
[2025-04-19 18:34:31,501][15484] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-04-19 18:34:31,666][15497] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-04-19 18:34:31,667][15497] RunningMeanStd input shape: (1,) |
|
[2025-04-19 18:34:31,676][15497] ConvEncoder: input_channels=3 |
|
[2025-04-19 18:34:31,764][15497] Conv encoder output size: 512 |
|
[2025-04-19 18:34:31,764][15497] Policy head output size: 512 |
|
[2025-04-19 18:34:31,794][15432] Inference worker 0-0 is ready! |
|
[2025-04-19 18:34:31,794][15432] All inference workers are ready! Signal rollout workers to start! |
|
[2025-04-19 18:34:31,826][15498] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-04-19 18:34:31,826][15495] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-04-19 18:34:31,828][15500] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-04-19 18:34:31,830][15499] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-04-19 18:34:31,832][15496] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-04-19 18:34:31,859][15501] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-04-19 18:34:32,105][15498] Decorrelating experience for 0 frames... |
|
[2025-04-19 18:34:32,107][15495] Decorrelating experience for 0 frames... |
|
[2025-04-19 18:34:32,108][15496] Decorrelating experience for 0 frames... |
|
[2025-04-19 18:34:32,343][15501] Decorrelating experience for 0 frames... |
|
[2025-04-19 18:34:32,363][15498] Decorrelating experience for 32 frames... |
|
[2025-04-19 18:34:32,393][15495] Decorrelating experience for 32 frames... |
|
[2025-04-19 18:34:32,454][15500] Decorrelating experience for 0 frames... |
|
[2025-04-19 18:34:32,457][15496] Decorrelating experience for 32 frames... |
|
[2025-04-19 18:34:32,599][15501] Decorrelating experience for 32 frames... |
|
[2025-04-19 18:34:32,626][15499] Decorrelating experience for 0 frames... |
|
[2025-04-19 18:34:32,863][15500] Decorrelating experience for 32 frames... |
|
[2025-04-19 18:34:32,885][15499] Decorrelating experience for 32 frames... |
|
[2025-04-19 18:34:34,102][15484] Signal inference workers to stop experience collection... |
|
[2025-04-19 18:34:34,107][15497] InferenceWorker_p0-w0: stopping experience collection |
|
[2025-04-19 18:34:34,149][15432] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 5005312. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-04-19 18:34:34,149][15432] Avg episode reward: [(0, '5.700')] |
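The resumed run stops almost immediately: it comes back at env_steps 5,005,312, which already exceeds what was presumably a 5,000,000-frame training target, so after a single iteration the runner tells every component to shut down. The stop condition amounts to no more than:

    # assuming train_for_env_steps was 5_000_000 for this experiment
    def should_stop(env_steps: int, target: int = 5_000_000) -> bool:
        return env_steps >= target

    print(should_stop(5_005_312))  # True, so the shutdown below begins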
|
[2025-04-19 18:34:34,852][15484] Signal inference workers to resume experience collection... |
|
[2025-04-19 18:34:34,852][15484] Stopping Batcher_0... |
|
[2025-04-19 18:34:34,852][15484] Loop batcher_evt_loop terminating... |
|
[2025-04-19 18:34:34,857][15432] Component Batcher_0 stopped! |
|
[2025-04-19 18:34:34,863][15497] Weights refcount: 2 0 |
|
[2025-04-19 18:34:34,864][15497] Stopping InferenceWorker_p0-w0... |
|
[2025-04-19 18:34:34,865][15497] Loop inference_proc0-0_evt_loop terminating... |
|
[2025-04-19 18:34:34,864][15432] Component InferenceWorker_p0-w0 stopped! |
|
[2025-04-19 18:34:34,879][15501] Stopping RolloutWorker_w5... |
|
[2025-04-19 18:34:34,879][15501] Loop rollout_proc5_evt_loop terminating... |
|
[2025-04-19 18:34:34,879][15432] Component RolloutWorker_w5 stopped! |
|
[2025-04-19 18:34:34,886][15500] Stopping RolloutWorker_w4... |
|
[2025-04-19 18:34:34,886][15500] Loop rollout_proc4_evt_loop terminating... |
|
[2025-04-19 18:34:34,886][15432] Component RolloutWorker_w4 stopped! |
|
[2025-04-19 18:34:34,914][15432] Component RolloutWorker_w1 stopped! |
|
[2025-04-19 18:34:34,918][15496] Stopping RolloutWorker_w1... |
|
[2025-04-19 18:34:34,919][15496] Loop rollout_proc1_evt_loop terminating... |
|
[2025-04-19 18:34:34,954][15432] Component RolloutWorker_w3 stopped! |
|
[2025-04-19 18:34:34,955][15499] Stopping RolloutWorker_w3... |
|
[2025-04-19 18:34:34,956][15499] Loop rollout_proc3_evt_loop terminating... |
|
[2025-04-19 18:34:34,968][15432] Component RolloutWorker_w2 stopped! |
|
[2025-04-19 18:34:34,966][15498] Stopping RolloutWorker_w2... |
|
[2025-04-19 18:34:34,970][15495] Stopping RolloutWorker_w0... |
|
[2025-04-19 18:34:34,971][15495] Loop rollout_proc0_evt_loop terminating... |
|
[2025-04-19 18:34:34,970][15432] Component RolloutWorker_w0 stopped! |
|
[2025-04-19 18:34:34,970][15498] Loop rollout_proc2_evt_loop terminating... |
|
[2025-04-19 18:34:35,066][15484] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001224_5013504.pth... |
|
[2025-04-19 18:34:35,122][15484] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000001014_4153344.pth |
|
[2025-04-19 18:34:35,131][15484] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001224_5013504.pth... |
|
[2025-04-19 18:34:35,304][15484] Stopping LearnerWorker_p0... |
|
[2025-04-19 18:34:35,305][15484] Loop learner_proc0_evt_loop terminating... |
|
[2025-04-19 18:34:35,304][15432] Component LearnerWorker_p0 stopped! |
|
[2025-04-19 18:34:35,305][15432] Waiting for process learner_proc0 to stop... |
|
[2025-04-19 18:34:36,091][15432] Waiting for process inference_proc0-0 to join... |
|
[2025-04-19 18:34:36,092][15432] Waiting for process rollout_proc0 to join... |
|
[2025-04-19 18:34:36,093][15432] Waiting for process rollout_proc1 to join... |
|
[2025-04-19 18:34:36,093][15432] Waiting for process rollout_proc2 to join... |
|
[2025-04-19 18:34:36,094][15432] Waiting for process rollout_proc3 to join... |
|
[2025-04-19 18:34:36,094][15432] Waiting for process rollout_proc4 to join... |
|
[2025-04-19 18:34:36,095][15432] Waiting for process rollout_proc5 to join... |
|
[2025-04-19 18:34:36,096][15432] Batcher 0 profile tree view: |
|
batching: 0.0296, releasing_batches: 0.0003 |
|
[2025-04-19 18:34:36,096][15432] InferenceWorker_p0-w0 profile tree view: |
|
update_model: 0.0145 |
|
wait_policy: 0.0000 |
|
wait_policy_total: 0.6222 |
|
one_step: 0.0021 |
|
handle_policy_step: 1.6087 |
|
deserialize: 0.0368, stack: 0.0062, obs_to_device_normalize: 0.3223, forward: 1.0050, send_messages: 0.0533 |
|
prepare_outputs: 0.1360 |
|
to_cpu: 0.0799 |
|
[2025-04-19 18:34:36,097][15432] Learner 0 profile tree view: |
|
misc: 0.0000, prepare_batch: 0.5963 |
|
train: 1.0635 |
|
epoch_init: 0.0000, minibatch_init: 0.0000, losses_postprocess: 0.0006, kl_divergence: 0.0097, after_optimizer: 0.0431 |
|
calculate_losses: 0.2990 |
|
losses_init: 0.0000, forward_head: 0.1871, bptt_initial: 0.0706, tail: 0.0147, advantages_returns: 0.0023, losses: 0.0216 |
|
bptt: 0.0024 |
|
bptt_forward_core: 0.0023 |
|
update: 0.7104 |
|
clip: 0.0288 |
|
[2025-04-19 18:34:36,098][15432] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.0005, enqueue_policy_requests: 0.0365, env_step: 0.4395, overhead: 0.0177, complete_rollouts: 0.0010 |
|
save_policy_outputs: 0.0293 |
|
split_output_tensors: 0.0102 |
|
[2025-04-19 18:34:36,098][15432] RolloutWorker_w5 profile tree view: |
|
wait_for_trajectories: 0.0005, enqueue_policy_requests: 0.0352, env_step: 0.4595, overhead: 0.0167, complete_rollouts: 0.0008 |
|
save_policy_outputs: 0.0326 |
|
split_output_tensors: 0.0133 |
|
[2025-04-19 18:34:36,099][15432] Loop Runner_EvtLoop terminating... |
|
[2025-04-19 18:34:36,100][15432] Runner profile tree view: |
|
main_loop: 8.7668 |
|
[2025-04-19 18:34:36,100][15432] Collected {0: 5013504}, FPS: 934.4 |
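The apparently poor 934.4 FPS is an artifact of the short resume: only 8,192 new frames (5,013,504 minus 5,005,312) were collected during an 8.77 s main loop dominated by startup and shutdown:

    print((5_013_504 - 5_005_312) / 8.7668)  # ~934.4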
|
[2025-04-19 18:34:36,108][15432] Loading existing experiment configuration from ./runs/default_experiment/config.json |
|
[2025-04-19 18:34:36,108][15432] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-04-19 18:34:36,109][15432] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-04-19 18:34:36,109][15432] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-04-19 18:34:36,110][15432] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-04-19 18:34:36,110][15432] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-04-19 18:34:36,110][15432] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! |
|
[2025-04-19 18:34:36,111][15432] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-04-19 18:34:36,111][15432] Adding new argument 'push_to_hub'=True that is not in the saved config file! |
|
[2025-04-19 18:34:36,111][15432] Adding new argument 'hf_repository'='CarlosElArtista/vizdoom_health_gathering_supreme' that is not in the saved config file! |
|
[2025-04-19 18:34:36,112][15432] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-04-19 18:34:36,113][15432] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-04-19 18:34:36,114][15432] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-04-19 18:34:36,115][15432] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-04-19 18:34:36,115][15432] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-04-19 18:34:36,137][15432] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-04-19 18:34:36,139][15432] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-04-19 18:34:36,140][15432] RunningMeanStd input shape: (1,) |
|
[2025-04-19 18:34:36,150][15432] ConvEncoder: input_channels=3 |
|
[2025-04-19 18:34:36,272][15432] Conv encoder output size: 512 |
|
[2025-04-19 18:34:36,272][15432] Policy head output size: 512 |
|
[2025-04-19 18:34:36,520][15432] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001224_5013504.pth... |
|
[2025-04-19 18:34:37,115][15432] Num frames 100... |
|
[2025-04-19 18:34:37,237][15432] Num frames 200... |
|
[2025-04-19 18:34:37,352][15432] Num frames 300... |
|
[2025-04-19 18:34:37,468][15432] Num frames 400... |
|
[2025-04-19 18:34:37,587][15432] Num frames 500... |
|
[2025-04-19 18:34:37,702][15432] Num frames 600... |
|
[2025-04-19 18:34:37,839][15432] Avg episode rewards: #0: 11.720, true rewards: #0: 6.720 |
|
[2025-04-19 18:34:37,840][15432] Avg episode reward: 11.720, avg true_objective: 6.720 |
|
[2025-04-19 18:34:37,875][15432] Num frames 700... |
|
[2025-04-19 18:34:37,998][15432] Num frames 800... |
|
[2025-04-19 18:34:38,115][15432] Num frames 900... |
|
[2025-04-19 18:34:38,234][15432] Num frames 1000... |
|
[2025-04-19 18:34:38,353][15432] Num frames 1100... |
|
[2025-04-19 18:34:38,470][15432] Num frames 1200... |
|
[2025-04-19 18:34:38,590][15432] Num frames 1300... |
|
[2025-04-19 18:34:38,709][15432] Num frames 1400... |
|
[2025-04-19 18:34:38,828][15432] Num frames 1500... |
|
[2025-04-19 18:34:38,947][15432] Num frames 1600... |
|
[2025-04-19 18:34:39,067][15432] Num frames 1700... |
|
[2025-04-19 18:34:39,185][15432] Num frames 1800... |
|
[2025-04-19 18:34:39,299][15432] Num frames 1900... |
|
[2025-04-19 18:34:39,417][15432] Num frames 2000... |
|
[2025-04-19 18:34:39,537][15432] Num frames 2100... |
|
[2025-04-19 18:34:39,655][15432] Num frames 2200... |
|
[2025-04-19 18:34:39,776][15432] Num frames 2300... |
|
[2025-04-19 18:34:39,894][15432] Num frames 2400... |
|
[2025-04-19 18:34:40,013][15432] Num frames 2500... |
|
[2025-04-19 18:34:40,130][15432] Num frames 2600... |
|
[2025-04-19 18:34:40,214][15432] Avg episode rewards: #0: 33.630, true rewards: #0: 13.130 |
|
[2025-04-19 18:34:40,215][15432] Avg episode reward: 33.630, avg true_objective: 13.130 |
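These summaries are running means over all episodes finished so far, so individual episode scores can be recovered by differencing; episode 2 alone, for instance, scored 55.540 reward and 19.540 true objective:

    ep2_reward = 2 * 33.630 - 11.720  # 55.540
    ep2_true = 2 * 13.130 - 6.720     # 19.540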
|
[2025-04-19 18:34:40,299][15432] Num frames 2700... |
|
[2025-04-19 18:34:40,417][15432] Num frames 2800... |
|
[2025-04-19 18:34:40,532][15432] Num frames 2900... |
|
[2025-04-19 18:34:40,647][15432] Num frames 3000... |
|
[2025-04-19 18:34:40,764][15432] Num frames 3100... |
|
[2025-04-19 18:34:40,880][15432] Num frames 3200... |
|
[2025-04-19 18:34:40,998][15432] Num frames 3300... |
|
[2025-04-19 18:34:41,117][15432] Num frames 3400... |
|
[2025-04-19 18:34:41,233][15432] Num frames 3500... |
|
[2025-04-19 18:34:41,347][15432] Num frames 3600... |
|
[2025-04-19 18:34:41,464][15432] Num frames 3700... |
|
[2025-04-19 18:34:41,578][15432] Num frames 3800... |
|
[2025-04-19 18:34:41,698][15432] Num frames 3900... |
|
[2025-04-19 18:34:41,816][15432] Num frames 4000... |
|
[2025-04-19 18:34:41,938][15432] Num frames 4100... |
|
[2025-04-19 18:34:42,048][15432] Num frames 4200... |
|
[2025-04-19 18:34:42,163][15432] Num frames 4300... |
|
[2025-04-19 18:34:42,281][15432] Num frames 4400... |
|
[2025-04-19 18:34:42,391][15432] Num frames 4500... |
|
[2025-04-19 18:34:42,509][15432] Num frames 4600... |
|
[2025-04-19 18:34:42,630][15432] Num frames 4700... |
|
[2025-04-19 18:34:42,700][15432] Avg episode rewards: #0: 39.709, true rewards: #0: 15.710 |
|
[2025-04-19 18:34:42,701][15432] Avg episode reward: 39.709, avg true_objective: 15.710 |
|
[2025-04-19 18:34:42,798][15432] Num frames 4800... |
|
[2025-04-19 18:34:42,917][15432] Num frames 4900... |
|
[2025-04-19 18:34:43,036][15432] Num frames 5000... |
|
[2025-04-19 18:34:43,155][15432] Num frames 5100... |
|
[2025-04-19 18:34:43,273][15432] Num frames 5200... |
|
[2025-04-19 18:34:43,391][15432] Num frames 5300... |
|
[2025-04-19 18:34:43,510][15432] Num frames 5400... |
|
[2025-04-19 18:34:43,627][15432] Num frames 5500... |
|
[2025-04-19 18:34:43,742][15432] Num frames 5600... |
|
[2025-04-19 18:34:43,857][15432] Num frames 5700... |
|
[2025-04-19 18:34:43,974][15432] Num frames 5800... |
|
[2025-04-19 18:34:44,104][15432] Avg episode rewards: #0: 36.162, true rewards: #0: 14.663 |
|
[2025-04-19 18:34:44,105][15432] Avg episode reward: 36.162, avg true_objective: 14.663 |
|
[2025-04-19 18:34:44,146][15432] Num frames 5900... |
|
[2025-04-19 18:34:44,259][15432] Num frames 6000... |
|
[2025-04-19 18:34:44,376][15432] Num frames 6100... |
|
[2025-04-19 18:34:44,492][15432] Num frames 6200... |
|
[2025-04-19 18:34:44,608][15432] Num frames 6300... |
|
[2025-04-19 18:34:44,726][15432] Num frames 6400... |
|
[2025-04-19 18:34:44,842][15432] Num frames 6500... |
|
[2025-04-19 18:34:44,975][15432] Avg episode rewards: #0: 32.138, true rewards: #0: 13.138 |
|
[2025-04-19 18:34:44,976][15432] Avg episode reward: 32.138, avg true_objective: 13.138 |
|
[2025-04-19 18:34:45,010][15432] Num frames 6600... |
|
[2025-04-19 18:34:45,127][15432] Num frames 6700... |
|
[2025-04-19 18:34:45,243][15432] Num frames 6800... |
|
[2025-04-19 18:34:45,358][15432] Num frames 6900... |
|
[2025-04-19 18:34:45,476][15432] Num frames 7000... |
|
[2025-04-19 18:34:45,593][15432] Num frames 7100... |
|
[2025-04-19 18:34:45,655][15432] Avg episode rewards: #0: 28.680, true rewards: #0: 11.847 |
|
[2025-04-19 18:34:45,656][15432] Avg episode reward: 28.680, avg true_objective: 11.847 |
|
[2025-04-19 18:34:45,761][15432] Num frames 7200... |
|
[2025-04-19 18:34:45,875][15432] Num frames 7300... |
|
[2025-04-19 18:34:45,993][15432] Num frames 7400... |
|
[2025-04-19 18:34:46,108][15432] Num frames 7500... |
|
[2025-04-19 18:34:46,223][15432] Num frames 7600... |
|
[2025-04-19 18:34:46,341][15432] Num frames 7700... |
|
[2025-04-19 18:34:46,457][15432] Num frames 7800... |
|
[2025-04-19 18:34:46,573][15432] Num frames 7900... |
|
[2025-04-19 18:34:46,672][15432] Avg episode rewards: #0: 26.771, true rewards: #0: 11.343 |
|
[2025-04-19 18:34:46,673][15432] Avg episode reward: 26.771, avg true_objective: 11.343 |
|
[2025-04-19 18:34:46,740][15432] Num frames 8000... |
|
[2025-04-19 18:34:46,853][15432] Num frames 8100... |
|
[2025-04-19 18:34:46,968][15432] Num frames 8200... |
|
[2025-04-19 18:34:47,086][15432] Num frames 8300... |
|
[2025-04-19 18:34:47,204][15432] Num frames 8400... |
|
[2025-04-19 18:34:47,320][15432] Num frames 8500... |
|
[2025-04-19 18:34:47,436][15432] Num frames 8600... |
|
[2025-04-19 18:34:47,552][15432] Num frames 8700... |
|
[2025-04-19 18:34:47,671][15432] Num frames 8800... |
|
[2025-04-19 18:34:47,786][15432] Num frames 8900... |
|
[2025-04-19 18:34:47,903][15432] Num frames 9000... |
|
[2025-04-19 18:34:48,022][15432] Num frames 9100... |
|
[2025-04-19 18:34:48,139][15432] Num frames 9200... |
|
[2025-04-19 18:34:48,256][15432] Num frames 9300... |
|
[2025-04-19 18:34:48,374][15432] Num frames 9400... |
|
[2025-04-19 18:34:48,475][15432] Avg episode rewards: #0: 28.176, true rewards: #0: 11.801 |
|
[2025-04-19 18:34:48,476][15432] Avg episode reward: 28.176, avg true_objective: 11.801 |
|
[2025-04-19 18:34:48,541][15432] Num frames 9500... |
|
[2025-04-19 18:34:48,656][15432] Num frames 9600... |
|
[2025-04-19 18:34:48,771][15432] Num frames 9700... |
|
[2025-04-19 18:34:48,887][15432] Num frames 9800... |
|
[2025-04-19 18:34:49,007][15432] Num frames 9900... |
|
[2025-04-19 18:34:49,124][15432] Num frames 10000... |
|
[2025-04-19 18:34:49,241][15432] Num frames 10100... |
|
[2025-04-19 18:34:49,358][15432] Num frames 10200... |
|
[2025-04-19 18:34:49,474][15432] Num frames 10300... |
|
[2025-04-19 18:34:49,590][15432] Num frames 10400... |
|
[2025-04-19 18:34:49,706][15432] Num frames 10500... |
|
[2025-04-19 18:34:49,821][15432] Num frames 10600... |
|
[2025-04-19 18:34:49,938][15432] Num frames 10700... |
|
[2025-04-19 18:34:50,107][15432] Avg episode rewards: #0: 29.331, true rewards: #0: 11.998 |
|
[2025-04-19 18:34:50,108][15432] Avg episode reward: 29.331, avg true_objective: 11.998 |
|
[2025-04-19 18:34:50,111][15432] Num frames 10800... |
|
[2025-04-19 18:34:50,244][15432] Num frames 10900... |
|
[2025-04-19 18:34:50,360][15432] Num frames 11000... |
|
[2025-04-19 18:34:50,475][15432] Num frames 11100... |
|
[2025-04-19 18:34:50,582][15432] Num frames 11200... |
|
[2025-04-19 18:34:50,698][15432] Num frames 11300... |
|
[2025-04-19 18:34:50,837][15432] Avg episode rewards: #0: 27.474, true rewards: #0: 11.374 |
|
[2025-04-19 18:34:50,838][15432] Avg episode reward: 27.474, avg true_objective: 11.374 |
|
[2025-04-19 18:35:08,939][15432] Replay video saved to ./runs/default_experiment/replay.mp4! |
|
|