vizdoom_health_gathering_supreme / sf_log.txt

	[2025-04-19 17:46:50,419][08935] Saving configuration to ./runs/default_experiment/config.json...
	[2025-04-19 17:46:50,420][08935] Rollout worker 0 uses device cpu
	[2025-04-19 17:46:50,421][08935] Rollout worker 1 uses device cpu
	[2025-04-19 17:46:50,422][08935] Rollout worker 2 uses device cpu
	[2025-04-19 17:46:50,423][08935] Rollout worker 3 uses device cpu
	[2025-04-19 17:46:50,423][08935] Rollout worker 4 uses device cpu
	[2025-04-19 17:46:50,424][08935] Rollout worker 5 uses device cpu
	[2025-04-19 17:46:50,507][08935] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 17:46:50,508][08935] InferenceWorker_p0-w0: min num requests: 2
	[2025-04-19 17:46:50,531][08935] Starting all processes...
	[2025-04-19 17:46:50,531][08935] Starting process learner_proc0
	[2025-04-19 17:46:50,581][08935] Starting all processes...
	[2025-04-19 17:46:50,587][08935] Starting process inference_proc0-0
	[2025-04-19 17:46:50,588][08935] Starting process rollout_proc0
	[2025-04-19 17:46:50,588][08935] Starting process rollout_proc1
	[2025-04-19 17:46:50,588][08935] Starting process rollout_proc2
	[2025-04-19 17:46:50,592][08935] Starting process rollout_proc3
	[2025-04-19 17:46:50,593][08935] Starting process rollout_proc4
	[2025-04-19 17:46:50,594][08935] Starting process rollout_proc5
	[2025-04-19 17:46:52,582][08993] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 17:46:52,582][08993] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
	[2025-04-19 17:46:52,600][08993] Num visible devices: 1
	[2025-04-19 17:46:52,603][08993] Starting seed is not provided
	[2025-04-19 17:46:52,603][08993] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 17:46:52,603][08993] Initializing actor-critic model on device cuda:0
	[2025-04-19 17:46:52,603][08993] RunningMeanStd input shape: (3, 72, 128)
	[2025-04-19 17:46:52,605][08993] RunningMeanStd input shape: (1,)
	[2025-04-19 17:46:52,623][08993] ConvEncoder: input_channels=3
	[2025-04-19 17:46:52,812][09005] Worker 1 uses CPU cores [1]
	[2025-04-19 17:46:52,822][08993] Conv encoder output size: 512
	[2025-04-19 17:46:52,822][08993] Policy head output size: 512
	[2025-04-19 17:46:52,846][08993] Created Actor Critic model with architecture:
	[2025-04-19 17:46:52,846][08993] ActorCriticSharedWeights(
	  (obs_normalizer): ObservationNormalizer(
	    (running_mean_std): RunningMeanStdDictInPlace(
	      (running_mean_std): ModuleDict(
	        (obs): RunningMeanStdInPlace()
	      )
	    )
	  )
	  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
	  (encoder): VizdoomEncoder(
	    (basic_encoder): ConvEncoder(
	      (enc): RecursiveScriptModule(
	        original_name=ConvEncoderImpl
	        (conv_head): RecursiveScriptModule(
	          original_name=Sequential
	          (0): RecursiveScriptModule(original_name=Conv2d)
	          (1): RecursiveScriptModule(original_name=ELU)
	          (2): RecursiveScriptModule(original_name=Conv2d)
	          (3): RecursiveScriptModule(original_name=ELU)
	          (4): RecursiveScriptModule(original_name=Conv2d)
	          (5): RecursiveScriptModule(original_name=ELU)
	        )
	        (mlp_layers): RecursiveScriptModule(
	          original_name=Sequential
	          (0): RecursiveScriptModule(original_name=Linear)
	          (1): RecursiveScriptModule(original_name=ELU)
	        )
	      )
	    )
	  )
	  (core): ModelCoreRNN(
	    (core): GRU(512, 512)
	  )
	  (decoder): MlpDecoder(
	    (mlp): Identity()
	  )
	  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
	  (action_parameterization): ActionParameterizationDefault(
	    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
	  )
	)
	[2025-04-19 17:46:52,915][09008] Worker 3 uses CPU cores [3]
	[2025-04-19 17:46:52,918][09010] Worker 4 uses CPU cores [4]
	[2025-04-19 17:46:53,003][09004] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 17:46:53,003][09004] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
	[2025-04-19 17:46:53,040][09004] Num visible devices: 1
	[2025-04-19 17:46:53,157][09009] Worker 5 uses CPU cores [5]
	[2025-04-19 17:46:53,238][08993] Using optimizer <class 'torch.optim.adam.Adam'>
	[2025-04-19 17:46:53,302][09006] Worker 0 uses CPU cores [0]
	[2025-04-19 17:46:53,354][09007] Worker 2 uses CPU cores [2]
	[2025-04-19 17:46:54,037][08993] No checkpoints found
	[2025-04-19 17:46:54,037][08993] Did not load from checkpoint, starting from scratch!
	[2025-04-19 17:46:54,037][08993] Initialized policy 0 weights for model version 0
	[2025-04-19 17:46:54,038][08993] LearnerWorker_p0 finished initialization!
	[2025-04-19 17:46:54,038][08993] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 17:46:54,187][09004] RunningMeanStd input shape: (3, 72, 128)
	[2025-04-19 17:46:54,187][09004] RunningMeanStd input shape: (1,)
	[2025-04-19 17:46:54,196][09004] ConvEncoder: input_channels=3
	[2025-04-19 17:46:54,282][09004] Conv encoder output size: 512
	[2025-04-19 17:46:54,283][09004] Policy head output size: 512
	[2025-04-19 17:46:54,312][08935] Inference worker 0-0 is ready!
	[2025-04-19 17:46:54,313][08935] All inference workers are ready! Signal rollout workers to start!
	[2025-04-19 17:46:54,350][09007] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 17:46:54,354][09005] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 17:46:54,358][09010] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 17:46:54,363][09006] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 17:46:54,372][09009] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 17:46:54,384][09008] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 17:46:54,602][09009] Decorrelating experience for 0 frames...
	[2025-04-19 17:46:54,603][09010] Decorrelating experience for 0 frames...
	[2025-04-19 17:46:54,604][09005] Decorrelating experience for 0 frames...
	[2025-04-19 17:46:54,855][09010] Decorrelating experience for 32 frames...
	[2025-04-19 17:46:54,855][09009] Decorrelating experience for 32 frames...
	[2025-04-19 17:46:54,866][09005] Decorrelating experience for 32 frames...
	[2025-04-19 17:46:54,872][09006] Decorrelating experience for 0 frames...
	[2025-04-19 17:46:55,163][09008] Decorrelating experience for 0 frames...
	[2025-04-19 17:46:55,169][09006] Decorrelating experience for 32 frames...
	[2025-04-19 17:46:55,421][09008] Decorrelating experience for 32 frames...
	[2025-04-19 17:46:56,518][08993] Signal inference workers to stop experience collection...
	[2025-04-19 17:46:56,521][09004] InferenceWorker_p0-w0: stopping experience collection
	[2025-04-19 17:46:57,303][08935] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
	[2025-04-19 17:46:57,304][08935] Avg episode reward: [(0, '3.848')]
	[2025-04-19 17:46:57,347][08993] Signal inference workers to resume experience collection...
	[2025-04-19 17:46:57,348][09004] InferenceWorker_p0-w0: resuming experience collection
	[2025-04-19 17:46:57,618][09007] Decorrelating experience for 0 frames...
	[2025-04-19 17:46:57,900][09007] Decorrelating experience for 32 frames...
	[2025-04-19 17:47:01,303][09004] Updated weights for policy 0, policy_version 10 (0.0067)
	[2025-04-19 17:47:02,303][08935] Fps is (10 sec: 9830.5, 60 sec: 9830.5, 300 sec: 9830.5). Total num frames: 49152. Throughput: 0: 1673.2. Samples: 8366. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:47:02,304][08935] Avg episode reward: [(0, '4.495')]
	[2025-04-19 17:47:05,907][09004] Updated weights for policy 0, policy_version 20 (0.0009)
	[2025-04-19 17:47:07,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9420.8, 300 sec: 9420.8). Total num frames: 94208. Throughput: 0: 2154.7. Samples: 21547. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:47:07,304][08935] Avg episode reward: [(0, '4.573')]
	[2025-04-19 17:47:10,442][09004] Updated weights for policy 0, policy_version 30 (0.0009)
	[2025-04-19 17:47:10,500][08935] Heartbeat connected on Batcher_0
	[2025-04-19 17:47:10,503][08935] Heartbeat connected on LearnerWorker_p0
	[2025-04-19 17:47:10,510][08935] Heartbeat connected on InferenceWorker_p0-w0
	[2025-04-19 17:47:10,512][08935] Heartbeat connected on RolloutWorker_w0
	[2025-04-19 17:47:10,519][08935] Heartbeat connected on RolloutWorker_w2
	[2025-04-19 17:47:10,520][08935] Heartbeat connected on RolloutWorker_w1
	[2025-04-19 17:47:10,524][08935] Heartbeat connected on RolloutWorker_w3
	[2025-04-19 17:47:10,528][08935] Heartbeat connected on RolloutWorker_w4
	[2025-04-19 17:47:10,554][08935] Heartbeat connected on RolloutWorker_w5
	[2025-04-19 17:47:12,303][08935] Fps is (10 sec: 8601.5, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 135168. Throughput: 0: 1900.3. Samples: 28505. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:47:12,304][08935] Avg episode reward: [(0, '4.590')]
	[2025-04-19 17:47:12,305][08993] Saving new best policy, reward=4.590!
	[2025-04-19 17:47:15,226][09004] Updated weights for policy 0, policy_version 40 (0.0008)
	[2025-04-19 17:47:17,303][08935] Fps is (10 sec: 8601.6, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 180224. Throughput: 0: 2068.8. Samples: 41376. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:47:17,304][08935] Avg episode reward: [(0, '4.679')]
	[2025-04-19 17:47:17,308][08993] Saving new best policy, reward=4.679!
	[2025-04-19 17:47:19,663][09004] Updated weights for policy 0, policy_version 50 (0.0008)
	[2025-04-19 17:47:22,303][08935] Fps is (10 sec: 9011.3, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 225280. Throughput: 0: 2193.6. Samples: 54839. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:47:22,304][08935] Avg episode reward: [(0, '4.537')]
	[2025-04-19 17:47:24,358][09004] Updated weights for policy 0, policy_version 60 (0.0009)
	[2025-04-19 17:47:27,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 270336. Throughput: 0: 2054.6. Samples: 61638. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:47:27,304][08935] Avg episode reward: [(0, '4.557')]
	[2025-04-19 17:47:29,041][09004] Updated weights for policy 0, policy_version 70 (0.0008)
	[2025-04-19 17:47:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 315392. Throughput: 0: 2136.9. Samples: 74792. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:47:32,304][08935] Avg episode reward: [(0, '4.539')]
	[2025-04-19 17:47:33,620][09004] Updated weights for policy 0, policy_version 80 (0.0008)
	[2025-04-19 17:47:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 360448. Throughput: 0: 2205.1. Samples: 88203. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:47:37,304][08935] Avg episode reward: [(0, '4.418')]
	[2025-04-19 17:47:38,320][09004] Updated weights for policy 0, policy_version 90 (0.0008)
	[2025-04-19 17:47:42,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8829.2, 300 sec: 8829.2). Total num frames: 397312. Throughput: 0: 2099.0. Samples: 94456. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:47:42,304][08935] Avg episode reward: [(0, '4.463')]
	[2025-04-19 17:47:43,616][09004] Updated weights for policy 0, policy_version 100 (0.0009)
	[2025-04-19 17:47:47,303][08935] Fps is (10 sec: 7782.4, 60 sec: 8765.4, 300 sec: 8765.4). Total num frames: 438272. Throughput: 0: 2177.4. Samples: 106349. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:47:47,304][08935] Avg episode reward: [(0, '4.508')]
	[2025-04-19 17:47:48,450][09004] Updated weights for policy 0, policy_version 110 (0.0008)
	[2025-04-19 17:47:52,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8787.8, 300 sec: 8787.8). Total num frames: 483328. Throughput: 0: 2169.2. Samples: 119159. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:47:52,304][08935] Avg episode reward: [(0, '4.537')]
	[2025-04-19 17:47:53,061][09004] Updated weights for policy 0, policy_version 120 (0.0008)
	[2025-04-19 17:47:57,308][08935] Fps is (10 sec: 8597.1, 60 sec: 8737.4, 300 sec: 8737.4). Total num frames: 524288. Throughput: 0: 2164.1. Samples: 125902. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:47:57,310][08935] Avg episode reward: [(0, '4.575')]
	[2025-04-19 17:47:58,535][09004] Updated weights for policy 0, policy_version 130 (0.0008)
	[2025-04-19 17:48:02,303][08935] Fps is (10 sec: 7372.6, 60 sec: 8465.0, 300 sec: 8570.1). Total num frames: 557056. Throughput: 0: 2111.5. Samples: 136395. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
	[2025-04-19 17:48:02,304][08935] Avg episode reward: [(0, '4.377')]
	[2025-04-19 17:48:04,297][09004] Updated weights for policy 0, policy_version 140 (0.0008)
	[2025-04-19 17:48:07,303][08935] Fps is (10 sec: 6966.8, 60 sec: 8328.5, 300 sec: 8484.6). Total num frames: 593920. Throughput: 0: 2050.9. Samples: 147129. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:48:07,304][08935] Avg episode reward: [(0, '4.458')]
	[2025-04-19 17:48:09,611][09004] Updated weights for policy 0, policy_version 150 (0.0009)
	[2025-04-19 17:48:12,303][08935] Fps is (10 sec: 7782.6, 60 sec: 8328.5, 300 sec: 8465.1). Total num frames: 634880. Throughput: 0: 2033.4. Samples: 153143. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:48:12,304][08935] Avg episode reward: [(0, '4.512')]
	[2025-04-19 17:48:15,205][09004] Updated weights for policy 0, policy_version 160 (0.0009)
	[2025-04-19 17:48:17,303][08935] Fps is (10 sec: 7782.4, 60 sec: 8192.0, 300 sec: 8396.8). Total num frames: 671744. Throughput: 0: 1989.4. Samples: 164315. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:48:17,304][08935] Avg episode reward: [(0, '4.723')]
	[2025-04-19 17:48:17,307][08993] Saving new best policy, reward=4.723!
	[2025-04-19 17:48:19,589][09004] Updated weights for policy 0, policy_version 170 (0.0008)
	[2025-04-19 17:48:22,303][08935] Fps is (10 sec: 8601.7, 60 sec: 8260.3, 300 sec: 8481.1). Total num frames: 720896. Throughput: 0: 2004.3. Samples: 178396. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:48:22,304][08935] Avg episode reward: [(0, '4.696')]
	[2025-04-19 17:48:23,905][09004] Updated weights for policy 0, policy_version 180 (0.0008)
	[2025-04-19 17:48:27,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8260.3, 300 sec: 8510.6). Total num frames: 765952. Throughput: 0: 2020.6. Samples: 185384. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:48:27,304][08935] Avg episode reward: [(0, '4.802')]
	[2025-04-19 17:48:27,307][08993] Saving new best policy, reward=4.802!
	[2025-04-19 17:48:28,363][09004] Updated weights for policy 0, policy_version 190 (0.0008)
	[2025-04-19 17:48:32,303][08935] Fps is (10 sec: 9011.1, 60 sec: 8260.3, 300 sec: 8536.9). Total num frames: 811008. Throughput: 0: 2065.0. Samples: 199275. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:48:32,305][08935] Avg episode reward: [(0, '4.793')]
	[2025-04-19 17:48:32,763][09004] Updated weights for policy 0, policy_version 200 (0.0009)
	[2025-04-19 17:48:37,106][09004] Updated weights for policy 0, policy_version 210 (0.0008)
	[2025-04-19 17:48:37,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8328.5, 300 sec: 8601.6). Total num frames: 860160. Throughput: 0: 2094.5. Samples: 213412. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:48:37,304][08935] Avg episode reward: [(0, '4.841')]
	[2025-04-19 17:48:37,308][08993] Saving new best policy, reward=4.841!
	[2025-04-19 17:48:41,335][09004] Updated weights for policy 0, policy_version 220 (0.0008)
	[2025-04-19 17:48:42,305][08935] Fps is (10 sec: 9828.9, 60 sec: 8533.1, 300 sec: 8660.0). Total num frames: 909312. Throughput: 0: 2104.5. Samples: 220597. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:48:42,306][08935] Avg episode reward: [(0, '5.309')]
	[2025-04-19 17:48:42,308][08993] Saving new best policy, reward=5.309!
	[2025-04-19 17:48:45,971][09004] Updated weights for policy 0, policy_version 230 (0.0008)
	[2025-04-19 17:48:47,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8533.3, 300 sec: 8638.8). Total num frames: 950272. Throughput: 0: 2172.6. Samples: 234161. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:48:47,304][08935] Avg episode reward: [(0, '5.528')]
	[2025-04-19 17:48:47,335][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000000233_954368.pth...
	[2025-04-19 17:48:47,380][08993] Saving new best policy, reward=5.528!
	[2025-04-19 17:48:50,477][09004] Updated weights for policy 0, policy_version 240 (0.0008)
	[2025-04-19 17:48:52,303][08935] Fps is (10 sec: 9012.7, 60 sec: 8601.6, 300 sec: 8690.6). Total num frames: 999424. Throughput: 0: 2240.3. Samples: 247942. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:48:52,304][08935] Avg episode reward: [(0, '5.789')]
	[2025-04-19 17:48:52,304][08993] Saving new best policy, reward=5.789!
	[2025-04-19 17:48:54,905][09004] Updated weights for policy 0, policy_version 250 (0.0007)
	[2025-04-19 17:48:57,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8670.6, 300 sec: 8704.0). Total num frames: 1044480. Throughput: 0: 2260.9. Samples: 254885. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:48:57,304][08935] Avg episode reward: [(0, '5.373')]
	[2025-04-19 17:48:59,243][09004] Updated weights for policy 0, policy_version 260 (0.0008)
	[2025-04-19 17:49:02,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8806.4, 300 sec: 8683.5). Total num frames: 1085440. Throughput: 0: 2309.3. Samples: 268233. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:49:02,304][08935] Avg episode reward: [(0, '5.656')]
	[2025-04-19 17:49:04,443][09004] Updated weights for policy 0, policy_version 270 (0.0010)
	[2025-04-19 17:49:07,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8696.1). Total num frames: 1130496. Throughput: 0: 2280.4. Samples: 281015. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:49:07,304][08935] Avg episode reward: [(0, '6.581')]
	[2025-04-19 17:49:07,309][08993] Saving new best policy, reward=6.581!
	[2025-04-19 17:49:09,165][09004] Updated weights for policy 0, policy_version 280 (0.0008)
	[2025-04-19 17:49:12,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8677.5). Total num frames: 1171456. Throughput: 0: 2264.5. Samples: 287286. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
	[2025-04-19 17:49:12,304][08935] Avg episode reward: [(0, '6.799')]
	[2025-04-19 17:49:12,305][08993] Saving new best policy, reward=6.799!
	[2025-04-19 17:49:14,060][09004] Updated weights for policy 0, policy_version 290 (0.0008)
	[2025-04-19 17:49:17,303][08935] Fps is (10 sec: 8192.0, 60 sec: 9011.2, 300 sec: 8660.1). Total num frames: 1212416. Throughput: 0: 2235.1. Samples: 299852. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:49:17,304][08935] Avg episode reward: [(0, '6.242')]
	[2025-04-19 17:49:18,969][09004] Updated weights for policy 0, policy_version 300 (0.0009)
	[2025-04-19 17:49:22,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8874.7, 300 sec: 8644.0). Total num frames: 1253376. Throughput: 0: 2201.0. Samples: 312458. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:49:22,304][08935] Avg episode reward: [(0, '6.993')]
	[2025-04-19 17:49:22,339][08993] Saving new best policy, reward=6.993!
	[2025-04-19 17:49:23,741][09004] Updated weights for policy 0, policy_version 310 (0.0011)
	[2025-04-19 17:49:27,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 8656.2). Total num frames: 1298432. Throughput: 0: 2186.4. Samples: 318983. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:49:27,304][08935] Avg episode reward: [(0, '8.612')]
	[2025-04-19 17:49:27,310][08993] Saving new best policy, reward=8.612!
	[2025-04-19 17:49:28,341][09004] Updated weights for policy 0, policy_version 320 (0.0010)
	[2025-04-19 17:49:32,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8806.4, 300 sec: 8641.2). Total num frames: 1339392. Throughput: 0: 2174.7. Samples: 332022. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:49:32,304][08935] Avg episode reward: [(0, '8.630')]
	[2025-04-19 17:49:32,315][08993] Saving new best policy, reward=8.630!
	[2025-04-19 17:49:33,202][09004] Updated weights for policy 0, policy_version 330 (0.0008)
	[2025-04-19 17:49:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8678.4). Total num frames: 1388544. Throughput: 0: 2166.8. Samples: 345447. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:49:37,304][08935] Avg episode reward: [(0, '8.841')]
	[2025-04-19 17:49:37,308][08993] Saving new best policy, reward=8.841!
	[2025-04-19 17:49:37,645][09004] Updated weights for policy 0, policy_version 340 (0.0008)
	[2025-04-19 17:49:41,959][09004] Updated weights for policy 0, policy_version 350 (0.0008)
	[2025-04-19 17:49:42,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8738.4, 300 sec: 8688.5). Total num frames: 1433600. Throughput: 0: 2167.5. Samples: 352424. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:49:42,304][08935] Avg episode reward: [(0, '10.682')]
	[2025-04-19 17:49:42,304][08993] Saving new best policy, reward=10.682!
	[2025-04-19 17:49:46,346][09004] Updated weights for policy 0, policy_version 360 (0.0010)
	[2025-04-19 17:49:47,303][08935] Fps is (10 sec: 9420.6, 60 sec: 8874.6, 300 sec: 8722.1). Total num frames: 1482752. Throughput: 0: 2184.0. Samples: 366514. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:49:47,304][08935] Avg episode reward: [(0, '10.472')]
	[2025-04-19 17:49:50,780][09004] Updated weights for policy 0, policy_version 370 (0.0008)
	[2025-04-19 17:49:52,303][08935] Fps is (10 sec: 9420.7, 60 sec: 8806.4, 300 sec: 8730.3). Total num frames: 1527808. Throughput: 0: 2205.1. Samples: 380246. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:49:52,304][08935] Avg episode reward: [(0, '11.006')]
	[2025-04-19 17:49:52,305][08993] Saving new best policy, reward=11.006!
	[2025-04-19 17:49:55,575][09004] Updated weights for policy 0, policy_version 380 (0.0010)
	[2025-04-19 17:49:57,303][08935] Fps is (10 sec: 8601.4, 60 sec: 8738.1, 300 sec: 8715.4). Total num frames: 1568768. Throughput: 0: 2204.2. Samples: 386478. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:49:57,304][08935] Avg episode reward: [(0, '11.157')]
	[2025-04-19 17:49:57,309][08993] Saving new best policy, reward=11.157!
	[2025-04-19 17:50:00,514][09004] Updated weights for policy 0, policy_version 390 (0.0009)
	[2025-04-19 17:50:02,303][08935] Fps is (10 sec: 8192.1, 60 sec: 8738.1, 300 sec: 8701.2). Total num frames: 1609728. Throughput: 0: 2205.9. Samples: 399119. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:02,304][08935] Avg episode reward: [(0, '11.236')]
	[2025-04-19 17:50:02,305][08993] Saving new best policy, reward=11.236!
	[2025-04-19 17:50:05,692][09004] Updated weights for policy 0, policy_version 400 (0.0009)
	[2025-04-19 17:50:07,303][08935] Fps is (10 sec: 8192.3, 60 sec: 8669.9, 300 sec: 8687.8). Total num frames: 1650688. Throughput: 0: 2194.5. Samples: 411210. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:07,304][08935] Avg episode reward: [(0, '12.841')]
	[2025-04-19 17:50:07,308][08993] Saving new best policy, reward=12.841!
	[2025-04-19 17:50:10,659][09004] Updated weights for policy 0, policy_version 410 (0.0009)
	[2025-04-19 17:50:12,303][08935] Fps is (10 sec: 8191.7, 60 sec: 8669.8, 300 sec: 8675.1). Total num frames: 1691648. Throughput: 0: 2185.9. Samples: 417349. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:12,306][08935] Avg episode reward: [(0, '13.748')]
	[2025-04-19 17:50:12,310][08993] Saving new best policy, reward=13.748!
	[2025-04-19 17:50:15,441][09004] Updated weights for policy 0, policy_version 420 (0.0010)
	[2025-04-19 17:50:17,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8669.8, 300 sec: 8663.0). Total num frames: 1732608. Throughput: 0: 2180.6. Samples: 430148. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:17,304][08935] Avg episode reward: [(0, '13.048')]
	[2025-04-19 17:50:19,931][09004] Updated weights for policy 0, policy_version 430 (0.0009)
	[2025-04-19 17:50:22,303][08935] Fps is (10 sec: 9011.5, 60 sec: 8806.4, 300 sec: 8691.5). Total num frames: 1781760. Throughput: 0: 2189.2. Samples: 443963. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:50:22,304][08935] Avg episode reward: [(0, '14.504')]
	[2025-04-19 17:50:22,305][08993] Saving new best policy, reward=14.504!
	[2025-04-19 17:50:24,259][09004] Updated weights for policy 0, policy_version 440 (0.0008)
	[2025-04-19 17:50:27,303][08935] Fps is (10 sec: 9420.9, 60 sec: 8806.4, 300 sec: 8699.1). Total num frames: 1826816. Throughput: 0: 2189.6. Samples: 450958. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:27,305][08935] Avg episode reward: [(0, '15.493')]
	[2025-04-19 17:50:27,335][08993] Saving new best policy, reward=15.493!
	[2025-04-19 17:50:28,984][09004] Updated weights for policy 0, policy_version 450 (0.0008)
	[2025-04-19 17:50:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8874.7, 300 sec: 8706.4). Total num frames: 1871872. Throughput: 0: 2165.2. Samples: 463947. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:50:32,304][08935] Avg episode reward: [(0, '15.586')]
	[2025-04-19 17:50:32,305][08993] Saving new best policy, reward=15.586!
	[2025-04-19 17:50:33,757][09004] Updated weights for policy 0, policy_version 460 (0.0009)
	[2025-04-19 17:50:37,303][08935] Fps is (10 sec: 8601.2, 60 sec: 8738.1, 300 sec: 8694.7). Total num frames: 1912832. Throughput: 0: 2136.8. Samples: 476402. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:37,304][08935] Avg episode reward: [(0, '16.166')]
	[2025-04-19 17:50:37,308][08993] Saving new best policy, reward=16.166!
	[2025-04-19 17:50:38,770][09004] Updated weights for policy 0, policy_version 470 (0.0009)
	[2025-04-19 17:50:42,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8669.9, 300 sec: 8683.5). Total num frames: 1953792. Throughput: 0: 2136.8. Samples: 482633. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:42,304][08935] Avg episode reward: [(0, '15.322')]
	[2025-04-19 17:50:43,683][09004] Updated weights for policy 0, policy_version 480 (0.0008)
	[2025-04-19 17:50:47,303][08935] Fps is (10 sec: 8192.4, 60 sec: 8533.4, 300 sec: 8672.8). Total num frames: 1994752. Throughput: 0: 2136.2. Samples: 495246. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:47,304][08935] Avg episode reward: [(0, '16.011')]
	[2025-04-19 17:50:47,308][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000000487_1994752.pth...
	[2025-04-19 17:50:48,527][09004] Updated weights for policy 0, policy_version 490 (0.0008)
	[2025-04-19 17:50:52,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8465.1, 300 sec: 8662.6). Total num frames: 2035712. Throughput: 0: 2150.2. Samples: 507969. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:52,304][08935] Avg episode reward: [(0, '17.297')]
	[2025-04-19 17:50:52,305][08993] Saving new best policy, reward=17.297!
	[2025-04-19 17:50:53,366][09004] Updated weights for policy 0, policy_version 500 (0.0009)
	[2025-04-19 17:50:57,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8533.4, 300 sec: 8669.9). Total num frames: 2080768. Throughput: 0: 2154.3. Samples: 514294. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:50:57,304][08935] Avg episode reward: [(0, '16.608')]
	[2025-04-19 17:50:57,851][09004] Updated weights for policy 0, policy_version 510 (0.0009)
	[2025-04-19 17:51:02,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8601.6, 300 sec: 8676.8). Total num frames: 2125824. Throughput: 0: 2172.4. Samples: 527908. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:51:02,305][08935] Avg episode reward: [(0, '14.945')]
	[2025-04-19 17:51:02,690][09004] Updated weights for policy 0, policy_version 520 (0.0009)
	[2025-04-19 17:51:07,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8601.6, 300 sec: 8667.1). Total num frames: 2166784. Throughput: 0: 2142.4. Samples: 540371. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:51:07,306][08935] Avg episode reward: [(0, '15.166')]
	[2025-04-19 17:51:07,501][09004] Updated weights for policy 0, policy_version 530 (0.0009)
	[2025-04-19 17:51:12,222][09004] Updated weights for policy 0, policy_version 540 (0.0008)
	[2025-04-19 17:51:12,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8669.9, 300 sec: 8673.9). Total num frames: 2211840. Throughput: 0: 2133.7. Samples: 546973. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:51:12,304][08935] Avg episode reward: [(0, '15.981')]
	[2025-04-19 17:51:17,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8601.6, 300 sec: 8648.9). Total num frames: 2248704. Throughput: 0: 2115.4. Samples: 559142. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:51:17,304][08935] Avg episode reward: [(0, '16.686')]
	[2025-04-19 17:51:17,621][09004] Updated weights for policy 0, policy_version 550 (0.0009)
	[2025-04-19 17:51:22,303][08935] Fps is (10 sec: 7782.4, 60 sec: 8465.1, 300 sec: 8640.2). Total num frames: 2289664. Throughput: 0: 2104.9. Samples: 571122. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:51:22,304][08935] Avg episode reward: [(0, '18.793')]
	[2025-04-19 17:51:22,305][08993] Saving new best policy, reward=18.793!
	[2025-04-19 17:51:22,557][09004] Updated weights for policy 0, policy_version 560 (0.0008)
	[2025-04-19 17:51:27,101][09004] Updated weights for policy 0, policy_version 570 (0.0008)
	[2025-04-19 17:51:27,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8465.1, 300 sec: 8647.1). Total num frames: 2334720. Throughput: 0: 2109.7. Samples: 577571. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:51:27,304][08935] Avg episode reward: [(0, '18.396')]
	[2025-04-19 17:51:31,750][09004] Updated weights for policy 0, policy_version 580 (0.0008)
	[2025-04-19 17:51:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8465.1, 300 sec: 8653.7). Total num frames: 2379776. Throughput: 0: 2125.0. Samples: 590873. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:51:32,304][08935] Avg episode reward: [(0, '18.771')]
	[2025-04-19 17:51:36,167][09004] Updated weights for policy 0, policy_version 590 (0.0009)
	[2025-04-19 17:51:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8533.4, 300 sec: 8660.1). Total num frames: 2424832. Throughput: 0: 2151.0. Samples: 604763. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:51:37,304][08935] Avg episode reward: [(0, '19.675')]
	[2025-04-19 17:51:37,307][08993] Saving new best policy, reward=19.675!
	[2025-04-19 17:51:40,633][09004] Updated weights for policy 0, policy_version 600 (0.0008)
	[2025-04-19 17:51:42,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8601.6, 300 sec: 8666.3). Total num frames: 2469888. Throughput: 0: 2162.3. Samples: 611599. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:51:42,304][08935] Avg episode reward: [(0, '20.646')]
	[2025-04-19 17:51:42,305][08993] Saving new best policy, reward=20.646!
	[2025-04-19 17:51:45,154][09004] Updated weights for policy 0, policy_version 610 (0.0009)
	[2025-04-19 17:51:47,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8738.1, 300 sec: 8686.3). Total num frames: 2519040. Throughput: 0: 2162.8. Samples: 625236. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:51:47,304][08935] Avg episode reward: [(0, '21.808')]
	[2025-04-19 17:51:47,309][08993] Saving new best policy, reward=21.808!
	[2025-04-19 17:51:49,683][09004] Updated weights for policy 0, policy_version 620 (0.0008)
	[2025-04-19 17:51:52,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8678.0). Total num frames: 2560000. Throughput: 0: 2184.7. Samples: 638682. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:51:52,304][08935] Avg episode reward: [(0, '21.149')]
	[2025-04-19 17:51:54,499][09004] Updated weights for policy 0, policy_version 630 (0.0009)
	[2025-04-19 17:51:57,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8738.1, 300 sec: 8664.1). Total num frames: 2605056. Throughput: 0: 2178.2. Samples: 644992. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:51:57,304][08935] Avg episode reward: [(0, '20.334')]
	[2025-04-19 17:51:59,027][09004] Updated weights for policy 0, policy_version 640 (0.0008)
	[2025-04-19 17:52:02,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8738.1, 300 sec: 8664.1). Total num frames: 2650112. Throughput: 0: 2209.5. Samples: 658569. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:52:02,304][08935] Avg episode reward: [(0, '20.266')]
	[2025-04-19 17:52:03,501][09004] Updated weights for policy 0, policy_version 650 (0.0008)
	[2025-04-19 17:52:07,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8678.0). Total num frames: 2695168. Throughput: 0: 2245.8. Samples: 672183. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:52:07,304][08935] Avg episode reward: [(0, '19.989')]
	[2025-04-19 17:52:08,128][09004] Updated weights for policy 0, policy_version 660 (0.0008)
	[2025-04-19 17:52:12,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8678.0). Total num frames: 2740224. Throughput: 0: 2248.4. Samples: 678747. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:52:12,304][08935] Avg episode reward: [(0, '18.762')]
	[2025-04-19 17:52:12,729][09004] Updated weights for policy 0, policy_version 670 (0.0009)
	[2025-04-19 17:52:17,220][09004] Updated weights for policy 0, policy_version 680 (0.0009)
	[2025-04-19 17:52:17,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8942.9, 300 sec: 8678.0). Total num frames: 2785280. Throughput: 0: 2255.8. Samples: 692386. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:52:17,304][08935] Avg episode reward: [(0, '18.158')]
	[2025-04-19 17:52:21,725][09004] Updated weights for policy 0, policy_version 690 (0.0009)
	[2025-04-19 17:52:22,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 8678.0). Total num frames: 2830336. Throughput: 0: 2245.8. Samples: 705822. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:52:22,304][08935] Avg episode reward: [(0, '20.096')]
	[2025-04-19 17:52:26,613][09004] Updated weights for policy 0, policy_version 700 (0.0008)
	[2025-04-19 17:52:27,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8664.1). Total num frames: 2871296. Throughput: 0: 2225.4. Samples: 711744. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:52:27,304][08935] Avg episode reward: [(0, '23.016')]
	[2025-04-19 17:52:27,307][08993] Saving new best policy, reward=23.016!
	[2025-04-19 17:52:31,397][09004] Updated weights for policy 0, policy_version 710 (0.0009)
	[2025-04-19 17:52:32,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8942.9, 300 sec: 8664.1). Total num frames: 2916352. Throughput: 0: 2221.7. Samples: 725213. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:52:32,304][08935] Avg episode reward: [(0, '24.012')]
	[2025-04-19 17:52:32,305][08993] Saving new best policy, reward=24.012!
	[2025-04-19 17:52:35,831][09004] Updated weights for policy 0, policy_version 720 (0.0007)
	[2025-04-19 17:52:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8942.9, 300 sec: 8691.8). Total num frames: 2961408. Throughput: 0: 2218.8. Samples: 738526. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:52:37,304][08935] Avg episode reward: [(0, '20.950')]
	[2025-04-19 17:52:40,565][09004] Updated weights for policy 0, policy_version 730 (0.0009)
	[2025-04-19 17:52:42,303][08935] Fps is (10 sec: 8601.6, 60 sec: 8874.7, 300 sec: 8691.9). Total num frames: 3002368. Throughput: 0: 2221.2. Samples: 744947. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:52:42,304][08935] Avg episode reward: [(0, '22.606')]
	[2025-04-19 17:52:45,714][09004] Updated weights for policy 0, policy_version 740 (0.0008)
	[2025-04-19 17:52:47,303][08935] Fps is (10 sec: 8192.0, 60 sec: 8738.1, 300 sec: 8678.0). Total num frames: 3043328. Throughput: 0: 2194.5. Samples: 757323. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:52:47,304][08935] Avg episode reward: [(0, '23.337')]
	[2025-04-19 17:52:47,308][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000000743_3043328.pth...
	[2025-04-19 17:52:47,355][08993] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000000233_954368.pth
	[2025-04-19 17:52:50,370][09004] Updated weights for policy 0, policy_version 750 (0.0008)
	[2025-04-19 17:52:52,303][08935] Fps is (10 sec: 8601.5, 60 sec: 8806.4, 300 sec: 8692.0). Total num frames: 3088384. Throughput: 0: 2181.8. Samples: 770366. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:52:52,304][08935] Avg episode reward: [(0, '22.798')]
	[2025-04-19 17:52:54,851][09004] Updated weights for policy 0, policy_version 760 (0.0008)
	[2025-04-19 17:52:57,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8733.5). Total num frames: 3133440. Throughput: 0: 2186.6. Samples: 777142. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:52:57,304][08935] Avg episode reward: [(0, '21.810')]
	[2025-04-19 17:52:59,340][09004] Updated weights for policy 0, policy_version 770 (0.0009)
	[2025-04-19 17:53:02,303][08935] Fps is (10 sec: 9011.3, 60 sec: 8806.4, 300 sec: 8761.3). Total num frames: 3178496. Throughput: 0: 2188.4. Samples: 790862. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:02,304][08935] Avg episode reward: [(0, '21.429')]
	[2025-04-19 17:53:03,816][09004] Updated weights for policy 0, policy_version 780 (0.0008)
	[2025-04-19 17:53:07,303][08935] Fps is (10 sec: 9011.2, 60 sec: 8806.4, 300 sec: 8775.2). Total num frames: 3223552. Throughput: 0: 2198.1. Samples: 804736. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:07,304][08935] Avg episode reward: [(0, '20.614')]
	[2025-04-19 17:53:08,231][09004] Updated weights for policy 0, policy_version 790 (0.0008)
	[2025-04-19 17:53:12,303][08935] Fps is (10 sec: 9420.7, 60 sec: 8874.7, 300 sec: 8816.8). Total num frames: 3272704. Throughput: 0: 2221.1. Samples: 811693. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:12,304][08935] Avg episode reward: [(0, '21.085')]
	[2025-04-19 17:53:12,586][09004] Updated weights for policy 0, policy_version 800 (0.0007)
	[2025-04-19 17:53:16,929][09004] Updated weights for policy 0, policy_version 810 (0.0007)
	[2025-04-19 17:53:17,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8874.7, 300 sec: 8802.9). Total num frames: 3317760. Throughput: 0: 2233.7. Samples: 825729. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:17,304][08935] Avg episode reward: [(0, '22.154')]
	[2025-04-19 17:53:21,335][09004] Updated weights for policy 0, policy_version 820 (0.0008)
	[2025-04-19 17:53:22,303][08935] Fps is (10 sec: 9420.8, 60 sec: 8942.9, 300 sec: 8816.8). Total num frames: 3366912. Throughput: 0: 2253.7. Samples: 839944. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:22,304][08935] Avg episode reward: [(0, '21.032')]
	[2025-04-19 17:53:25,718][09004] Updated weights for policy 0, policy_version 830 (0.0007)
	[2025-04-19 17:53:27,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9011.2, 300 sec: 8816.8). Total num frames: 3411968. Throughput: 0: 2264.6. Samples: 846856. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:27,304][08935] Avg episode reward: [(0, '21.124')]
	[2025-04-19 17:53:30,156][09004] Updated weights for policy 0, policy_version 840 (0.0008)
	[2025-04-19 17:53:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9011.2, 300 sec: 8802.9). Total num frames: 3457024. Throughput: 0: 2299.7. Samples: 860811. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:32,304][08935] Avg episode reward: [(0, '23.921')]
	[2025-04-19 17:53:34,590][09004] Updated weights for policy 0, policy_version 850 (0.0008)
	[2025-04-19 17:53:37,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9079.5, 300 sec: 8803.0). Total num frames: 3506176. Throughput: 0: 2313.7. Samples: 874483. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:37,304][08935] Avg episode reward: [(0, '27.222')]
	[2025-04-19 17:53:37,307][08993] Saving new best policy, reward=27.222!
	[2025-04-19 17:53:39,119][09004] Updated weights for policy 0, policy_version 860 (0.0008)
	[2025-04-19 17:53:42,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9147.7, 300 sec: 8816.8). Total num frames: 3551232. Throughput: 0: 2315.9. Samples: 881356. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:42,304][08935] Avg episode reward: [(0, '27.461')]
	[2025-04-19 17:53:42,305][08993] Saving new best policy, reward=27.461!
	[2025-04-19 17:53:43,553][09004] Updated weights for policy 0, policy_version 870 (0.0008)
	[2025-04-19 17:53:47,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8802.9). Total num frames: 3596288. Throughput: 0: 2315.7. Samples: 895068. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:47,304][08935] Avg episode reward: [(0, '24.830')]
	[2025-04-19 17:53:48,006][09004] Updated weights for policy 0, policy_version 880 (0.0008)
	[2025-04-19 17:53:52,251][09004] Updated weights for policy 0, policy_version 890 (0.0007)
	[2025-04-19 17:53:52,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8816.8). Total num frames: 3645440. Throughput: 0: 2324.4. Samples: 909336. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:52,304][08935] Avg episode reward: [(0, '22.522')]
	[2025-04-19 17:53:56,654][09004] Updated weights for policy 0, policy_version 900 (0.0008)
	[2025-04-19 17:53:57,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8816.8). Total num frames: 3686400. Throughput: 0: 2321.1. Samples: 916142. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:53:57,304][08935] Avg episode reward: [(0, '23.013')]
	[2025-04-19 17:54:01,316][09004] Updated weights for policy 0, policy_version 910 (0.0008)
	[2025-04-19 17:54:02,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9284.3, 300 sec: 8830.7). Total num frames: 3735552. Throughput: 0: 2310.6. Samples: 929704. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:54:02,304][08935] Avg episode reward: [(0, '26.877')]
	[2025-04-19 17:54:05,695][09004] Updated weights for policy 0, policy_version 920 (0.0008)
	[2025-04-19 17:54:07,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8844.6). Total num frames: 3780608. Throughput: 0: 2304.1. Samples: 943630. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:54:07,305][08935] Avg episode reward: [(0, '27.147')]
	[2025-04-19 17:54:10,197][09004] Updated weights for policy 0, policy_version 930 (0.0008)
	[2025-04-19 17:54:12,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8858.5). Total num frames: 3825664. Throughput: 0: 2303.3. Samples: 950503. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:54:12,304][08935] Avg episode reward: [(0, '26.537')]
	[2025-04-19 17:54:14,632][09004] Updated weights for policy 0, policy_version 940 (0.0008)
	[2025-04-19 17:54:17,303][08935] Fps is (10 sec: 9420.7, 60 sec: 9284.2, 300 sec: 8886.2). Total num frames: 3874816. Throughput: 0: 2299.1. Samples: 964269. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:54:17,304][08935] Avg episode reward: [(0, '23.645')]
	[2025-04-19 17:54:19,111][09004] Updated weights for policy 0, policy_version 950 (0.0008)
	[2025-04-19 17:54:22,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 8886.2). Total num frames: 3919872. Throughput: 0: 2303.6. Samples: 978146. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:54:22,304][08935] Avg episode reward: [(0, '24.159')]
	[2025-04-19 17:54:23,519][09004] Updated weights for policy 0, policy_version 960 (0.0010)
	[2025-04-19 17:54:27,303][08935] Fps is (10 sec: 9011.4, 60 sec: 9216.0, 300 sec: 8900.1). Total num frames: 3964928. Throughput: 0: 2303.0. Samples: 984990. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:54:27,304][08935] Avg episode reward: [(0, '26.718')]
	[2025-04-19 17:54:28,024][09004] Updated weights for policy 0, policy_version 970 (0.0008)
	[2025-04-19 17:54:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8886.2). Total num frames: 4009984. Throughput: 0: 2306.5. Samples: 998860. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:54:32,304][08935] Avg episode reward: [(0, '26.309')]
	[2025-04-19 17:54:32,348][09004] Updated weights for policy 0, policy_version 980 (0.0007)
	[2025-04-19 17:54:36,734][09004] Updated weights for policy 0, policy_version 990 (0.0008)
	[2025-04-19 17:54:37,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 8900.1). Total num frames: 4059136. Throughput: 0: 2303.2. Samples: 1012981. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:54:37,304][08935] Avg episode reward: [(0, '27.251')]
	[2025-04-19 17:54:41,099][09004] Updated weights for policy 0, policy_version 1000 (0.0008)
	[2025-04-19 17:54:42,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 8886.2). Total num frames: 4104192. Throughput: 0: 2306.2. Samples: 1019923. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:54:42,304][08935] Avg episode reward: [(0, '24.135')]
	[2025-04-19 17:54:45,484][09004] Updated weights for policy 0, policy_version 1010 (0.0008)
	[2025-04-19 17:54:47,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8886.2). Total num frames: 4149248. Throughput: 0: 2319.7. Samples: 1034089. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:54:47,304][08935] Avg episode reward: [(0, '22.221')]
	[2025-04-19 17:54:47,315][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001014_4153344.pth...
	[2025-04-19 17:54:47,359][08993] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000000487_1994752.pth
	[2025-04-19 17:54:50,000][09004] Updated weights for policy 0, policy_version 1020 (0.0009)
	[2025-04-19 17:54:52,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 8914.0). Total num frames: 4198400. Throughput: 0: 2313.9. Samples: 1047757. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:54:52,304][08935] Avg episode reward: [(0, '24.666')]
	[2025-04-19 17:54:54,461][09004] Updated weights for policy 0, policy_version 1030 (0.0008)
	[2025-04-19 17:54:57,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8927.9). Total num frames: 4243456. Throughput: 0: 2312.8. Samples: 1054580. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:54:57,304][08935] Avg episode reward: [(0, '25.012')]
	[2025-04-19 17:54:58,959][09004] Updated weights for policy 0, policy_version 1040 (0.0007)
	[2025-04-19 17:55:02,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8941.8). Total num frames: 4288512. Throughput: 0: 2312.2. Samples: 1068317. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:55:02,304][08935] Avg episode reward: [(0, '24.328')]
	[2025-04-19 17:55:03,371][09004] Updated weights for policy 0, policy_version 1050 (0.0010)
	[2025-04-19 17:55:07,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8969.6). Total num frames: 4337664. Throughput: 0: 2313.3. Samples: 1082245. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:55:07,304][08935] Avg episode reward: [(0, '26.567')]
	[2025-04-19 17:55:07,823][09004] Updated weights for policy 0, policy_version 1060 (0.0008)
	[2025-04-19 17:55:12,141][09004] Updated weights for policy 0, policy_version 1070 (0.0008)
	[2025-04-19 17:55:12,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8983.4). Total num frames: 4382720. Throughput: 0: 2314.0. Samples: 1089120. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
	[2025-04-19 17:55:12,304][08935] Avg episode reward: [(0, '26.622')]
	[2025-04-19 17:55:16,503][09004] Updated weights for policy 0, policy_version 1080 (0.0007)
	[2025-04-19 17:55:17,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 8969.5). Total num frames: 4427776. Throughput: 0: 2320.7. Samples: 1103292. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:55:17,304][08935] Avg episode reward: [(0, '25.447')]
	[2025-04-19 17:55:20,859][09004] Updated weights for policy 0, policy_version 1090 (0.0008)
	[2025-04-19 17:55:22,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8983.4). Total num frames: 4476928. Throughput: 0: 2319.6. Samples: 1117363. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:55:22,304][08935] Avg episode reward: [(0, '24.431')]
	[2025-04-19 17:55:25,312][09004] Updated weights for policy 0, policy_version 1100 (0.0008)
	[2025-04-19 17:55:27,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 8983.4). Total num frames: 4521984. Throughput: 0: 2317.4. Samples: 1124208. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:55:27,304][08935] Avg episode reward: [(0, '21.919')]
	[2025-04-19 17:55:29,818][09004] Updated weights for policy 0, policy_version 1110 (0.0008)
	[2025-04-19 17:55:32,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9284.3, 300 sec: 8997.3). Total num frames: 4567040. Throughput: 0: 2307.6. Samples: 1137932. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:55:32,304][08935] Avg episode reward: [(0, '23.175')]
	[2025-04-19 17:55:34,294][09004] Updated weights for policy 0, policy_version 1120 (0.0008)
	[2025-04-19 17:55:37,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 9011.2). Total num frames: 4612096. Throughput: 0: 2309.2. Samples: 1151673. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:55:37,305][08935] Avg episode reward: [(0, '24.341')]
	[2025-04-19 17:55:38,791][09004] Updated weights for policy 0, policy_version 1130 (0.0008)
	[2025-04-19 17:55:42,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 9025.1). Total num frames: 4657152. Throughput: 0: 2309.3. Samples: 1158498. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:55:42,304][08935] Avg episode reward: [(0, '24.511')]
	[2025-04-19 17:55:43,215][09004] Updated weights for policy 0, policy_version 1140 (0.0008)
	[2025-04-19 17:55:47,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 9052.9). Total num frames: 4706304. Throughput: 0: 2314.0. Samples: 1172449. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:55:47,304][08935] Avg episode reward: [(0, '26.442')]
	[2025-04-19 17:55:47,579][09004] Updated weights for policy 0, policy_version 1150 (0.0009)
	[2025-04-19 17:55:51,913][09004] Updated weights for policy 0, policy_version 1160 (0.0007)
	[2025-04-19 17:55:52,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9216.0, 300 sec: 9052.9). Total num frames: 4751360. Throughput: 0: 2319.6. Samples: 1186629. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:55:52,304][08935] Avg episode reward: [(0, '26.630')]
	[2025-04-19 17:55:56,319][09004] Updated weights for policy 0, policy_version 1170 (0.0007)
	[2025-04-19 17:55:57,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 9066.7). Total num frames: 4800512. Throughput: 0: 2319.2. Samples: 1193482. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:55:57,304][08935] Avg episode reward: [(0, '28.653')]
	[2025-04-19 17:55:57,307][08993] Saving new best policy, reward=28.653!
	[2025-04-19 17:56:00,753][09004] Updated weights for policy 0, policy_version 1180 (0.0007)
	[2025-04-19 17:56:02,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 9080.6). Total num frames: 4845568. Throughput: 0: 2315.3. Samples: 1207479. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:56:02,304][08935] Avg episode reward: [(0, '25.988')]
	[2025-04-19 17:56:05,217][09004] Updated weights for policy 0, policy_version 1190 (0.0008)
	[2025-04-19 17:56:07,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 9080.6). Total num frames: 4890624. Throughput: 0: 2309.6. Samples: 1221296. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
	[2025-04-19 17:56:07,304][08935] Avg episode reward: [(0, '27.006')]
	[2025-04-19 17:56:09,687][09004] Updated weights for policy 0, policy_version 1200 (0.0009)
	[2025-04-19 17:56:12,303][08935] Fps is (10 sec: 9011.2, 60 sec: 9216.0, 300 sec: 9108.4). Total num frames: 4935680. Throughput: 0: 2309.8. Samples: 1228147. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:56:12,304][08935] Avg episode reward: [(0, '25.495')]
	[2025-04-19 17:56:14,192][09004] Updated weights for policy 0, policy_version 1210 (0.0008)
	[2025-04-19 17:56:17,303][08935] Fps is (10 sec: 9420.8, 60 sec: 9284.3, 300 sec: 9136.2). Total num frames: 4984832. Throughput: 0: 2310.4. Samples: 1241900. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
	[2025-04-19 17:56:17,304][08935] Avg episode reward: [(0, '24.476')]
	[2025-04-19 17:56:18,631][09004] Updated weights for policy 0, policy_version 1220 (0.0008)
	[2025-04-19 17:56:19,566][08993] Stopping Batcher_0...
	[2025-04-19 17:56:19,566][08993] Loop batcher_evt_loop terminating...
	[2025-04-19 17:56:19,566][08935] Component Batcher_0 stopped!
	[2025-04-19 17:56:19,566][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
	[2025-04-19 17:56:19,579][09006] Stopping RolloutWorker_w0...
	[2025-04-19 17:56:19,579][09006] Loop rollout_proc0_evt_loop terminating...
	[2025-04-19 17:56:19,579][08935] Component RolloutWorker_w0 stopped!
	[2025-04-19 17:56:19,580][09004] Weights refcount: 2 0
	[2025-04-19 17:56:19,582][09004] Stopping InferenceWorker_p0-w0...
	[2025-04-19 17:56:19,583][09004] Loop inference_proc0-0_evt_loop terminating...
	[2025-04-19 17:56:19,584][08935] Component InferenceWorker_p0-w0 stopped!
	[2025-04-19 17:56:19,594][09007] Stopping RolloutWorker_w2...
	[2025-04-19 17:56:19,594][09007] Loop rollout_proc2_evt_loop terminating...
	[2025-04-19 17:56:19,594][08935] Component RolloutWorker_w2 stopped!
	[2025-04-19 17:56:19,597][09005] Stopping RolloutWorker_w1...
	[2025-04-19 17:56:19,597][09005] Loop rollout_proc1_evt_loop terminating...
	[2025-04-19 17:56:19,597][08935] Component RolloutWorker_w1 stopped!
	[2025-04-19 17:56:19,618][08993] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000000743_3043328.pth
	[2025-04-19 17:56:19,628][08993] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
	[2025-04-19 17:56:19,631][09009] Stopping RolloutWorker_w5...
	[2025-04-19 17:56:19,632][09009] Loop rollout_proc5_evt_loop terminating...
	[2025-04-19 17:56:19,631][08935] Component RolloutWorker_w5 stopped!
	[2025-04-19 17:56:19,673][09010] Stopping RolloutWorker_w4...
	[2025-04-19 17:56:19,673][09010] Loop rollout_proc4_evt_loop terminating...
	[2025-04-19 17:56:19,679][09008] Stopping RolloutWorker_w3...
	[2025-04-19 17:56:19,680][09008] Loop rollout_proc3_evt_loop terminating...
	[2025-04-19 17:56:19,676][08935] Component RolloutWorker_w4 stopped!
	[2025-04-19 17:56:19,682][08935] Component RolloutWorker_w3 stopped!
	[2025-04-19 17:56:19,767][08993] Stopping LearnerWorker_p0...
	[2025-04-19 17:56:19,768][08993] Loop learner_proc0_evt_loop terminating...
	[2025-04-19 17:56:19,767][08935] Component LearnerWorker_p0 stopped!
	[2025-04-19 17:56:19,768][08935] Waiting for process learner_proc0 to stop...
	[2025-04-19 17:56:20,633][08935] Waiting for process inference_proc0-0 to join...
	[2025-04-19 17:56:20,633][08935] Waiting for process rollout_proc0 to join...
	[2025-04-19 17:56:20,634][08935] Waiting for process rollout_proc1 to join...
	[2025-04-19 17:56:20,635][08935] Waiting for process rollout_proc2 to join...
	[2025-04-19 17:56:20,635][08935] Waiting for process rollout_proc3 to join...
	[2025-04-19 17:56:20,636][08935] Waiting for process rollout_proc4 to join...
	[2025-04-19 17:56:20,636][08935] Waiting for process rollout_proc5 to join...
	[2025-04-19 17:56:20,637][08935] Batcher 0 profile tree view:
	batching: 14.8875, releasing_batches: 0.0256
	[2025-04-19 17:56:20,637][08935] InferenceWorker_p0-w0 profile tree view:
	wait_policy: 0.0000
	wait_policy_total: 9.4959
	update_model: 8.1520
	weight_update: 0.0008
	one_step: 0.0030
	handle_policy_step: 518.4990
	deserialize: 14.0397, stack: 3.2776, obs_to_device_normalize: 122.3004, forward: 259.9247, send_messages: 27.2078
	prepare_outputs: 67.8854
	to_cpu: 40.3985
	[2025-04-19 17:56:20,638][08935] Learner 0 profile tree view:
	misc: 0.0037, prepare_batch: 8.8470
	train: 34.5834
	epoch_init: 0.0040, minibatch_init: 0.0056, losses_postprocess: 0.5332, kl_divergence: 0.4351, after_optimizer: 15.5133
	calculate_losses: 11.7097
	losses_init: 0.0024, forward_head: 0.7281, bptt_initial: 8.1532, tail: 0.5335, advantages_returns: 0.1383, losses: 1.0890
	bptt: 0.9039
	bptt_forward_core: 0.8562
	update: 6.0418
	clip: 0.6656
	[2025-04-19 17:56:20,639][08935] RolloutWorker_w0 profile tree view:
	wait_for_trajectories: 0.2220, enqueue_policy_requests: 14.9416, env_step: 189.8717, overhead: 9.0107, complete_rollouts: 0.5647
	save_policy_outputs: 15.6368
	split_output_tensors: 5.3303
	[2025-04-19 17:56:20,639][08935] RolloutWorker_w5 profile tree view:
	wait_for_trajectories: 0.2358, enqueue_policy_requests: 15.0036, env_step: 190.3710, overhead: 8.8125, complete_rollouts: 0.5870
	save_policy_outputs: 15.6126
	split_output_tensors: 5.3595
	[2025-04-19 17:56:20,640][08935] Loop Runner_EvtLoop terminating...
	[2025-04-19 17:56:20,640][08935] Runner profile tree view:
	main_loop: 570.1098
	[2025-04-19 17:56:20,641][08935] Collected {0: 5005312}, FPS: 8779.6
	[2025-04-19 17:56:20,649][08935] Loading existing experiment configuration from ./runs/default_experiment/config.json
	[2025-04-19 17:56:20,649][08935] Overriding arg 'num_workers' with value 1 passed from command line
	[2025-04-19 17:56:20,650][08935] Adding new argument 'no_render'=True that is not in the saved config file!
	[2025-04-19 17:56:20,650][08935] Adding new argument 'save_video'=True that is not in the saved config file!
	[2025-04-19 17:56:20,650][08935] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
	[2025-04-19 17:56:20,651][08935] Adding new argument 'video_name'=None that is not in the saved config file!
	[2025-04-19 17:56:20,651][08935] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
	[2025-04-19 17:56:20,651][08935] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
	[2025-04-19 17:56:20,652][08935] Adding new argument 'push_to_hub'=True that is not in the saved config file!
	[2025-04-19 17:56:20,652][08935] Adding new argument 'hf_repository'='CarlosElArtista/vizdoom_health_gathering_supreme' that is not in the saved config file!
	[2025-04-19 17:56:20,652][08935] Adding new argument 'policy_index'=0 that is not in the saved config file!
	[2025-04-19 17:56:20,653][08935] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
	[2025-04-19 17:56:20,653][08935] Adding new argument 'train_script'=None that is not in the saved config file!
	[2025-04-19 17:56:20,654][08935] Adding new argument 'enjoy_script'=None that is not in the saved config file!
	[2025-04-19 17:56:20,655][08935] Using frameskip 1 and render_action_repeat=4 for evaluation
	[2025-04-19 17:56:20,677][08935] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 17:56:20,678][08935] RunningMeanStd input shape: (3, 72, 128)
	[2025-04-19 17:56:20,679][08935] RunningMeanStd input shape: (1,)
	[2025-04-19 17:56:20,689][08935] ConvEncoder: input_channels=3
	[2025-04-19 17:56:20,867][08935] Conv encoder output size: 512
	[2025-04-19 17:56:20,870][08935] Policy head output size: 512
	[2025-04-19 17:56:21,139][08935] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
	[2025-04-19 17:56:21,141][08935] Could not load from checkpoint, attempt 0
	Traceback (most recent call last):
	File "/home/charles/miniconda3/envs/deep_rl_hf_unit_8_2/lib/python3.10/site-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint
	checkpoint_dict = torch.load(latest_checkpoint, map_location=device)
	File "/home/charles/miniconda3/envs/deep_rl_hf_unit_8_2/lib/python3.10/site-packages/torch/serialization.py", line 1470, in load
	raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
	_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint.
	(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
	(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message.
	WeightsUnpickler error: Unsupported global: GLOBAL numpy.core.multiarray.scalar was not an allowed global by default. Please use `torch.serialization.add_safe_globals([scalar])` or the `torch.serialization.safe_globals([scalar])` context manager to allowlist this global if you trust this class/function.

	Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
	[2025-04-19 17:56:21,142][08935] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
	[2025-04-19 17:56:21,143][08935] Could not load from checkpoint, attempt 1
	Traceback (most recent call last):
	File "/home/charles/miniconda3/envs/deep_rl_hf_unit_8_2/lib/python3.10/site-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint
	checkpoint_dict = torch.load(latest_checkpoint, map_location=device)
	File "/home/charles/miniconda3/envs/deep_rl_hf_unit_8_2/lib/python3.10/site-packages/torch/serialization.py", line 1470, in load
	raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
	_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint.
	(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
	(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message.
	WeightsUnpickler error: Unsupported global: GLOBAL numpy.core.multiarray.scalar was not an allowed global by default. Please use `torch.serialization.add_safe_globals([scalar])` or the `torch.serialization.safe_globals([scalar])` context manager to allowlist this global if you trust this class/function.

	Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
	[2025-04-19 17:56:21,144][08935] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
	[2025-04-19 17:56:21,145][08935] Could not load from checkpoint, attempt 2
	Traceback (most recent call last):
	File "/home/charles/miniconda3/envs/deep_rl_hf_unit_8_2/lib/python3.10/site-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint
	checkpoint_dict = torch.load(latest_checkpoint, map_location=device)
	File "/home/charles/miniconda3/envs/deep_rl_hf_unit_8_2/lib/python3.10/site-packages/torch/serialization.py", line 1470, in load
	raise pickle.UnpicklingError(_get_wo_message(str(e))) from None
	_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint.
	(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source.
	(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message.
	WeightsUnpickler error: Unsupported global: GLOBAL numpy.core.multiarray.scalar was not an allowed global by default. Please use `torch.serialization.add_safe_globals([scalar])` or the `torch.serialization.safe_globals([scalar])` context manager to allowlist this global if you trust this class/function.

	Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html.
	[2025-04-19 18:34:27,236][15432] Saving configuration to ./runs/default_experiment/config.json...
	[2025-04-19 18:34:27,238][15432] Rollout worker 0 uses device cpu
	[2025-04-19 18:34:27,238][15432] Rollout worker 1 uses device cpu
	[2025-04-19 18:34:27,239][15432] Rollout worker 2 uses device cpu
	[2025-04-19 18:34:27,240][15432] Rollout worker 3 uses device cpu
	[2025-04-19 18:34:27,240][15432] Rollout worker 4 uses device cpu
	[2025-04-19 18:34:27,241][15432] Rollout worker 5 uses device cpu
	[2025-04-19 18:34:27,309][15432] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 18:34:27,309][15432] InferenceWorker_p0-w0: min num requests: 2
	[2025-04-19 18:34:27,333][15432] Starting all processes...
	[2025-04-19 18:34:27,334][15432] Starting process learner_proc0
	[2025-04-19 18:34:27,383][15432] Starting all processes...
	[2025-04-19 18:34:27,386][15432] Starting process inference_proc0-0
	[2025-04-19 18:34:27,387][15432] Starting process rollout_proc0
	[2025-04-19 18:34:27,388][15432] Starting process rollout_proc1
	[2025-04-19 18:34:27,390][15432] Starting process rollout_proc2
	[2025-04-19 18:34:27,390][15432] Starting process rollout_proc3
	[2025-04-19 18:34:27,391][15432] Starting process rollout_proc4
	[2025-04-19 18:34:27,391][15432] Starting process rollout_proc5
	[2025-04-19 18:34:29,635][15496] Worker 1 uses CPU cores [1]
	[2025-04-19 18:34:29,700][15501] Worker 5 uses CPU cores [5]
	[2025-04-19 18:34:29,860][15495] Worker 0 uses CPU cores [0]
	[2025-04-19 18:34:29,985][15500] Worker 4 uses CPU cores [4]
	[2025-04-19 18:34:30,046][15499] Worker 3 uses CPU cores [3]
	[2025-04-19 18:34:30,244][15484] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 18:34:30,244][15484] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
	[2025-04-19 18:34:30,259][15484] Num visible devices: 1
	[2025-04-19 18:34:30,260][15484] Starting seed is not provided
	[2025-04-19 18:34:30,260][15484] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 18:34:30,260][15484] Initializing actor-critic model on device cuda:0
	[2025-04-19 18:34:30,260][15484] RunningMeanStd input shape: (3, 72, 128)
	[2025-04-19 18:34:30,261][15484] RunningMeanStd input shape: (1,)
	[2025-04-19 18:34:30,266][15498] Worker 2 uses CPU cores [2]
	[2025-04-19 18:34:30,271][15484] ConvEncoder: input_channels=3
	[2025-04-19 18:34:30,337][15497] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 18:34:30,337][15497] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
	[2025-04-19 18:34:30,352][15497] Num visible devices: 1
	[2025-04-19 18:34:30,355][15484] Conv encoder output size: 512
	[2025-04-19 18:34:30,355][15484] Policy head output size: 512
	[2025-04-19 18:34:30,367][15484] Created Actor Critic model with architecture:
	[2025-04-19 18:34:30,367][15484] ActorCriticSharedWeights(
	(obs_normalizer): ObservationNormalizer(
	(running_mean_std): RunningMeanStdDictInPlace(
	(running_mean_std): ModuleDict(
	(obs): RunningMeanStdInPlace()
	)
	)
	)
	(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
	(encoder): VizdoomEncoder(
	(basic_encoder): ConvEncoder(
	(enc): RecursiveScriptModule(
	original_name=ConvEncoderImpl
	(conv_head): RecursiveScriptModule(
	original_name=Sequential
	(0): RecursiveScriptModule(original_name=Conv2d)
	(1): RecursiveScriptModule(original_name=ELU)
	(2): RecursiveScriptModule(original_name=Conv2d)
	(3): RecursiveScriptModule(original_name=ELU)
	(4): RecursiveScriptModule(original_name=Conv2d)
	(5): RecursiveScriptModule(original_name=ELU)
	)
	(mlp_layers): RecursiveScriptModule(
	original_name=Sequential
	(0): RecursiveScriptModule(original_name=Linear)
	(1): RecursiveScriptModule(original_name=ELU)
	)
	)
	)
	)
	(core): ModelCoreRNN(
	(core): GRU(512, 512)
	)
	(decoder): MlpDecoder(
	(mlp): Identity()
	)
	(critic_linear): Linear(in_features=512, out_features=1, bias=True)
	(action_parameterization): ActionParameterizationDefault(
	(distribution_linear): Linear(in_features=512, out_features=5, bias=True)
	)
	)
	[2025-04-19 18:34:30,661][15484] Using optimizer <class 'torch.optim.adam.Adam'>
	[2025-04-19 18:34:31,476][15484] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001222_5005312.pth...
	[2025-04-19 18:34:31,498][15484] Loading model from checkpoint
	[2025-04-19 18:34:31,499][15484] Loaded experiment state at self.train_step=1222, self.env_steps=5005312
	[2025-04-19 18:34:31,499][15484] Initialized policy 0 weights for model version 1222
	[2025-04-19 18:34:31,501][15484] LearnerWorker_p0 finished initialization!
	[2025-04-19 18:34:31,501][15484] Using GPUs [0] for process 0 (actually maps to GPUs [0])
	[2025-04-19 18:34:31,666][15497] RunningMeanStd input shape: (3, 72, 128)
	[2025-04-19 18:34:31,667][15497] RunningMeanStd input shape: (1,)
	[2025-04-19 18:34:31,676][15497] ConvEncoder: input_channels=3
	[2025-04-19 18:34:31,764][15497] Conv encoder output size: 512
	[2025-04-19 18:34:31,764][15497] Policy head output size: 512
	[2025-04-19 18:34:31,794][15432] Inference worker 0-0 is ready!
	[2025-04-19 18:34:31,794][15432] All inference workers are ready! Signal rollout workers to start!
	[2025-04-19 18:34:31,826][15498] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 18:34:31,826][15495] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 18:34:31,828][15500] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 18:34:31,830][15499] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 18:34:31,832][15496] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 18:34:31,859][15501] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 18:34:32,105][15498] Decorrelating experience for 0 frames...
	[2025-04-19 18:34:32,107][15495] Decorrelating experience for 0 frames...
	[2025-04-19 18:34:32,108][15496] Decorrelating experience for 0 frames...
	[2025-04-19 18:34:32,343][15501] Decorrelating experience for 0 frames...
	[2025-04-19 18:34:32,363][15498] Decorrelating experience for 32 frames...
	[2025-04-19 18:34:32,393][15495] Decorrelating experience for 32 frames...
	[2025-04-19 18:34:32,454][15500] Decorrelating experience for 0 frames...
	[2025-04-19 18:34:32,457][15496] Decorrelating experience for 32 frames...
	[2025-04-19 18:34:32,599][15501] Decorrelating experience for 32 frames...
	[2025-04-19 18:34:32,626][15499] Decorrelating experience for 0 frames...
	[2025-04-19 18:34:32,863][15500] Decorrelating experience for 32 frames...
	[2025-04-19 18:34:32,885][15499] Decorrelating experience for 32 frames...
	[2025-04-19 18:34:34,102][15484] Signal inference workers to stop experience collection...
	[2025-04-19 18:34:34,107][15497] InferenceWorker_p0-w0: stopping experience collection
	[2025-04-19 18:34:34,149][15432] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 5005312. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
	[2025-04-19 18:34:34,149][15432] Avg episode reward: [(0, '5.700')]
	[2025-04-19 18:34:34,852][15484] Signal inference workers to resume experience collection...
	[2025-04-19 18:34:34,852][15484] Stopping Batcher_0...
	[2025-04-19 18:34:34,852][15484] Loop batcher_evt_loop terminating...
	[2025-04-19 18:34:34,857][15432] Component Batcher_0 stopped!
	[2025-04-19 18:34:34,863][15497] Weights refcount: 2 0
	[2025-04-19 18:34:34,864][15497] Stopping InferenceWorker_p0-w0...
	[2025-04-19 18:34:34,865][15497] Loop inference_proc0-0_evt_loop terminating...
	[2025-04-19 18:34:34,864][15432] Component InferenceWorker_p0-w0 stopped!
	[2025-04-19 18:34:34,879][15501] Stopping RolloutWorker_w5...
	[2025-04-19 18:34:34,879][15501] Loop rollout_proc5_evt_loop terminating...
	[2025-04-19 18:34:34,879][15432] Component RolloutWorker_w5 stopped!
	[2025-04-19 18:34:34,886][15500] Stopping RolloutWorker_w4...
	[2025-04-19 18:34:34,886][15500] Loop rollout_proc4_evt_loop terminating...
	[2025-04-19 18:34:34,886][15432] Component RolloutWorker_w4 stopped!
	[2025-04-19 18:34:34,914][15432] Component RolloutWorker_w1 stopped!
	[2025-04-19 18:34:34,918][15496] Stopping RolloutWorker_w1...
	[2025-04-19 18:34:34,919][15496] Loop rollout_proc1_evt_loop terminating...
	[2025-04-19 18:34:34,954][15432] Component RolloutWorker_w3 stopped!
	[2025-04-19 18:34:34,955][15499] Stopping RolloutWorker_w3...
	[2025-04-19 18:34:34,956][15499] Loop rollout_proc3_evt_loop terminating...
	[2025-04-19 18:34:34,968][15432] Component RolloutWorker_w2 stopped!
	[2025-04-19 18:34:34,966][15498] Stopping RolloutWorker_w2...
	[2025-04-19 18:34:34,970][15495] Stopping RolloutWorker_w0...
	[2025-04-19 18:34:34,971][15495] Loop rollout_proc0_evt_loop terminating...
	[2025-04-19 18:34:34,970][15432] Component RolloutWorker_w0 stopped!
	[2025-04-19 18:34:34,970][15498] Loop rollout_proc2_evt_loop terminating...
	[2025-04-19 18:34:35,066][15484] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001224_5013504.pth...
	[2025-04-19 18:34:35,122][15484] Removing ./runs/default_experiment/checkpoint_p0/checkpoint_000001014_4153344.pth
	[2025-04-19 18:34:35,131][15484] Saving ./runs/default_experiment/checkpoint_p0/checkpoint_000001224_5013504.pth...
	[2025-04-19 18:34:35,304][15484] Stopping LearnerWorker_p0...
	[2025-04-19 18:34:35,305][15484] Loop learner_proc0_evt_loop terminating...
	[2025-04-19 18:34:35,304][15432] Component LearnerWorker_p0 stopped!
	[2025-04-19 18:34:35,305][15432] Waiting for process learner_proc0 to stop...
	[2025-04-19 18:34:36,091][15432] Waiting for process inference_proc0-0 to join...
	[2025-04-19 18:34:36,092][15432] Waiting for process rollout_proc0 to join...
	[2025-04-19 18:34:36,093][15432] Waiting for process rollout_proc1 to join...
	[2025-04-19 18:34:36,093][15432] Waiting for process rollout_proc2 to join...
	[2025-04-19 18:34:36,094][15432] Waiting for process rollout_proc3 to join...
	[2025-04-19 18:34:36,094][15432] Waiting for process rollout_proc4 to join...
	[2025-04-19 18:34:36,095][15432] Waiting for process rollout_proc5 to join...
	[2025-04-19 18:34:36,096][15432] Batcher 0 profile tree view:
	batching: 0.0296, releasing_batches: 0.0003
	[2025-04-19 18:34:36,096][15432] InferenceWorker_p0-w0 profile tree view:
	update_model: 0.0145
	wait_policy: 0.0000
	wait_policy_total: 0.6222
	one_step: 0.0021
	handle_policy_step: 1.6087
	deserialize: 0.0368, stack: 0.0062, obs_to_device_normalize: 0.3223, forward: 1.0050, send_messages: 0.0533
	prepare_outputs: 0.1360
	to_cpu: 0.0799
	[2025-04-19 18:34:36,097][15432] Learner 0 profile tree view:
	misc: 0.0000, prepare_batch: 0.5963
	train: 1.0635
	epoch_init: 0.0000, minibatch_init: 0.0000, losses_postprocess: 0.0006, kl_divergence: 0.0097, after_optimizer: 0.0431
	calculate_losses: 0.2990
	losses_init: 0.0000, forward_head: 0.1871, bptt_initial: 0.0706, tail: 0.0147, advantages_returns: 0.0023, losses: 0.0216
	bptt: 0.0024
	bptt_forward_core: 0.0023
	update: 0.7104
	clip: 0.0288
	[2025-04-19 18:34:36,098][15432] RolloutWorker_w0 profile tree view:
	wait_for_trajectories: 0.0005, enqueue_policy_requests: 0.0365, env_step: 0.4395, overhead: 0.0177, complete_rollouts: 0.0010
	save_policy_outputs: 0.0293
	split_output_tensors: 0.0102
	[2025-04-19 18:34:36,098][15432] RolloutWorker_w5 profile tree view:
	wait_for_trajectories: 0.0005, enqueue_policy_requests: 0.0352, env_step: 0.4595, overhead: 0.0167, complete_rollouts: 0.0008
	save_policy_outputs: 0.0326
	split_output_tensors: 0.0133
	[2025-04-19 18:34:36,099][15432] Loop Runner_EvtLoop terminating...
	[2025-04-19 18:34:36,100][15432] Runner profile tree view:
	main_loop: 8.7668
	[2025-04-19 18:34:36,100][15432] Collected {0: 5013504}, FPS: 934.4
	[2025-04-19 18:34:36,108][15432] Loading existing experiment configuration from ./runs/default_experiment/config.json
	[2025-04-19 18:34:36,108][15432] Overriding arg 'num_workers' with value 1 passed from command line
	[2025-04-19 18:34:36,109][15432] Adding new argument 'no_render'=True that is not in the saved config file!
	[2025-04-19 18:34:36,109][15432] Adding new argument 'save_video'=True that is not in the saved config file!
	[2025-04-19 18:34:36,110][15432] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
	[2025-04-19 18:34:36,110][15432] Adding new argument 'video_name'=None that is not in the saved config file!
	[2025-04-19 18:34:36,110][15432] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
	[2025-04-19 18:34:36,111][15432] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
	[2025-04-19 18:34:36,111][15432] Adding new argument 'push_to_hub'=True that is not in the saved config file!
	[2025-04-19 18:34:36,111][15432] Adding new argument 'hf_repository'='CarlosElArtista/vizdoom_health_gathering_supreme' that is not in the saved config file!
	[2025-04-19 18:34:36,112][15432] Adding new argument 'policy_index'=0 that is not in the saved config file!
	[2025-04-19 18:34:36,113][15432] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
	[2025-04-19 18:34:36,114][15432] Adding new argument 'train_script'=None that is not in the saved config file!
	[2025-04-19 18:34:36,115][15432] Adding new argument 'enjoy_script'=None that is not in the saved config file!
	[2025-04-19 18:34:36,115][15432] Using frameskip 1 and render_action_repeat=4 for evaluation
	[2025-04-19 18:34:36,137][15432] Doom resolution: 160x120, resize resolution: (128, 72)
	[2025-04-19 18:34:36,139][15432] RunningMeanStd input shape: (3, 72, 128)
	[2025-04-19 18:34:36,140][15432] RunningMeanStd input shape: (1,)
	[2025-04-19 18:34:36,150][15432] ConvEncoder: input_channels=3
	[2025-04-19 18:34:36,272][15432] Conv encoder output size: 512
	[2025-04-19 18:34:36,272][15432] Policy head output size: 512
	[2025-04-19 18:34:36,520][15432] Loading state from checkpoint ./runs/default_experiment/checkpoint_p0/checkpoint_000001224_5013504.pth...
	[2025-04-19 18:34:37,115][15432] Num frames 100...
	[2025-04-19 18:34:37,237][15432] Num frames 200...
	[2025-04-19 18:34:37,352][15432] Num frames 300...
	[2025-04-19 18:34:37,468][15432] Num frames 400...
	[2025-04-19 18:34:37,587][15432] Num frames 500...
	[2025-04-19 18:34:37,702][15432] Num frames 600...
	[2025-04-19 18:34:37,839][15432] Avg episode rewards: #0: 11.720, true rewards: #0: 6.720
	[2025-04-19 18:34:37,840][15432] Avg episode reward: 11.720, avg true_objective: 6.720
	[2025-04-19 18:34:37,875][15432] Num frames 700...
	[2025-04-19 18:34:37,998][15432] Num frames 800...
	[2025-04-19 18:34:38,115][15432] Num frames 900...
	[2025-04-19 18:34:38,234][15432] Num frames 1000...
	[2025-04-19 18:34:38,353][15432] Num frames 1100...
	[2025-04-19 18:34:38,470][15432] Num frames 1200...
	[2025-04-19 18:34:38,590][15432] Num frames 1300...
	[2025-04-19 18:34:38,709][15432] Num frames 1400...
	[2025-04-19 18:34:38,828][15432] Num frames 1500...
	[2025-04-19 18:34:38,947][15432] Num frames 1600...
	[2025-04-19 18:34:39,067][15432] Num frames 1700...
	[2025-04-19 18:34:39,185][15432] Num frames 1800...
	[2025-04-19 18:34:39,299][15432] Num frames 1900...
	[2025-04-19 18:34:39,417][15432] Num frames 2000...
	[2025-04-19 18:34:39,537][15432] Num frames 2100...
	[2025-04-19 18:34:39,655][15432] Num frames 2200...
	[2025-04-19 18:34:39,776][15432] Num frames 2300...
	[2025-04-19 18:34:39,894][15432] Num frames 2400...
	[2025-04-19 18:34:40,013][15432] Num frames 2500...
	[2025-04-19 18:34:40,130][15432] Num frames 2600...
	[2025-04-19 18:34:40,214][15432] Avg episode rewards: #0: 33.630, true rewards: #0: 13.130
	[2025-04-19 18:34:40,215][15432] Avg episode reward: 33.630, avg true_objective: 13.130
	[2025-04-19 18:34:40,299][15432] Num frames 2700...
	[2025-04-19 18:34:40,417][15432] Num frames 2800...
	[2025-04-19 18:34:40,532][15432] Num frames 2900...
	[2025-04-19 18:34:40,647][15432] Num frames 3000...
	[2025-04-19 18:34:40,764][15432] Num frames 3100...
	[2025-04-19 18:34:40,880][15432] Num frames 3200...
	[2025-04-19 18:34:40,998][15432] Num frames 3300...
	[2025-04-19 18:34:41,117][15432] Num frames 3400...
	[2025-04-19 18:34:41,233][15432] Num frames 3500...
	[2025-04-19 18:34:41,347][15432] Num frames 3600...
	[2025-04-19 18:34:41,464][15432] Num frames 3700...
	[2025-04-19 18:34:41,578][15432] Num frames 3800...
	[2025-04-19 18:34:41,698][15432] Num frames 3900...
	[2025-04-19 18:34:41,816][15432] Num frames 4000...
	[2025-04-19 18:34:41,938][15432] Num frames 4100...
	[2025-04-19 18:34:42,048][15432] Num frames 4200...
	[2025-04-19 18:34:42,163][15432] Num frames 4300...
	[2025-04-19 18:34:42,281][15432] Num frames 4400...
	[2025-04-19 18:34:42,391][15432] Num frames 4500...
	[2025-04-19 18:34:42,509][15432] Num frames 4600...
	[2025-04-19 18:34:42,630][15432] Num frames 4700...
	[2025-04-19 18:34:42,700][15432] Avg episode rewards: #0: 39.709, true rewards: #0: 15.710
	[2025-04-19 18:34:42,701][15432] Avg episode reward: 39.709, avg true_objective: 15.710
	[2025-04-19 18:34:42,798][15432] Num frames 4800...
	[2025-04-19 18:34:42,917][15432] Num frames 4900...
	[2025-04-19 18:34:43,036][15432] Num frames 5000...
	[2025-04-19 18:34:43,155][15432] Num frames 5100...
	[2025-04-19 18:34:43,273][15432] Num frames 5200...
	[2025-04-19 18:34:43,391][15432] Num frames 5300...
	[2025-04-19 18:34:43,510][15432] Num frames 5400...
	[2025-04-19 18:34:43,627][15432] Num frames 5500...
	[2025-04-19 18:34:43,742][15432] Num frames 5600...
	[2025-04-19 18:34:43,857][15432] Num frames 5700...
	[2025-04-19 18:34:43,974][15432] Num frames 5800...
	[2025-04-19 18:34:44,104][15432] Avg episode rewards: #0: 36.162, true rewards: #0: 14.663
	[2025-04-19 18:34:44,105][15432] Avg episode reward: 36.162, avg true_objective: 14.663
	[2025-04-19 18:34:44,146][15432] Num frames 5900...
	[2025-04-19 18:34:44,259][15432] Num frames 6000...
	[2025-04-19 18:34:44,376][15432] Num frames 6100...
	[2025-04-19 18:34:44,492][15432] Num frames 6200...
	[2025-04-19 18:34:44,608][15432] Num frames 6300...
	[2025-04-19 18:34:44,726][15432] Num frames 6400...
	[2025-04-19 18:34:44,842][15432] Num frames 6500...
	[2025-04-19 18:34:44,975][15432] Avg episode rewards: #0: 32.138, true rewards: #0: 13.138
	[2025-04-19 18:34:44,976][15432] Avg episode reward: 32.138, avg true_objective: 13.138
	[2025-04-19 18:34:45,010][15432] Num frames 6600...
	[2025-04-19 18:34:45,127][15432] Num frames 6700...
	[2025-04-19 18:34:45,243][15432] Num frames 6800...
	[2025-04-19 18:34:45,358][15432] Num frames 6900...
	[2025-04-19 18:34:45,476][15432] Num frames 7000...
	[2025-04-19 18:34:45,593][15432] Num frames 7100...
	[2025-04-19 18:34:45,655][15432] Avg episode rewards: #0: 28.680, true rewards: #0: 11.847
	[2025-04-19 18:34:45,656][15432] Avg episode reward: 28.680, avg true_objective: 11.847
	[2025-04-19 18:34:45,761][15432] Num frames 7200...
	[2025-04-19 18:34:45,875][15432] Num frames 7300...
	[2025-04-19 18:34:45,993][15432] Num frames 7400...
	[2025-04-19 18:34:46,108][15432] Num frames 7500...
	[2025-04-19 18:34:46,223][15432] Num frames 7600...
	[2025-04-19 18:34:46,341][15432] Num frames 7700...
	[2025-04-19 18:34:46,457][15432] Num frames 7800...
	[2025-04-19 18:34:46,573][15432] Num frames 7900...
	[2025-04-19 18:34:46,672][15432] Avg episode rewards: #0: 26.771, true rewards: #0: 11.343
	[2025-04-19 18:34:46,673][15432] Avg episode reward: 26.771, avg true_objective: 11.343
	[2025-04-19 18:34:46,740][15432] Num frames 8000...
	[2025-04-19 18:34:46,853][15432] Num frames 8100...
	[2025-04-19 18:34:46,968][15432] Num frames 8200...
	[2025-04-19 18:34:47,086][15432] Num frames 8300...
	[2025-04-19 18:34:47,204][15432] Num frames 8400...
	[2025-04-19 18:34:47,320][15432] Num frames 8500...
	[2025-04-19 18:34:47,436][15432] Num frames 8600...
	[2025-04-19 18:34:47,552][15432] Num frames 8700...
	[2025-04-19 18:34:47,671][15432] Num frames 8800...
	[2025-04-19 18:34:47,786][15432] Num frames 8900...
	[2025-04-19 18:34:47,903][15432] Num frames 9000...
	[2025-04-19 18:34:48,022][15432] Num frames 9100...
	[2025-04-19 18:34:48,139][15432] Num frames 9200...
	[2025-04-19 18:34:48,256][15432] Num frames 9300...
	[2025-04-19 18:34:48,374][15432] Num frames 9400...
	[2025-04-19 18:34:48,475][15432] Avg episode rewards: #0: 28.176, true rewards: #0: 11.801
	[2025-04-19 18:34:48,476][15432] Avg episode reward: 28.176, avg true_objective: 11.801
	[2025-04-19 18:34:48,541][15432] Num frames 9500...
	[2025-04-19 18:34:48,656][15432] Num frames 9600...
	[2025-04-19 18:34:48,771][15432] Num frames 9700...
	[2025-04-19 18:34:48,887][15432] Num frames 9800...
	[2025-04-19 18:34:49,007][15432] Num frames 9900...
	[2025-04-19 18:34:49,124][15432] Num frames 10000...
	[2025-04-19 18:34:49,241][15432] Num frames 10100...
	[2025-04-19 18:34:49,358][15432] Num frames 10200...
	[2025-04-19 18:34:49,474][15432] Num frames 10300...
	[2025-04-19 18:34:49,590][15432] Num frames 10400...
	[2025-04-19 18:34:49,706][15432] Num frames 10500...
	[2025-04-19 18:34:49,821][15432] Num frames 10600...
	[2025-04-19 18:34:49,938][15432] Num frames 10700...
	[2025-04-19 18:34:50,107][15432] Avg episode rewards: #0: 29.331, true rewards: #0: 11.998
	[2025-04-19 18:34:50,108][15432] Avg episode reward: 29.331, avg true_objective: 11.998
	[2025-04-19 18:34:50,111][15432] Num frames 10800...
	[2025-04-19 18:34:50,244][15432] Num frames 10900...
	[2025-04-19 18:34:50,360][15432] Num frames 11000...
	[2025-04-19 18:34:50,475][15432] Num frames 11100...
	[2025-04-19 18:34:50,582][15432] Num frames 11200...
	[2025-04-19 18:34:50,698][15432] Num frames 11300...
	[2025-04-19 18:34:50,837][15432] Avg episode rewards: #0: 27.474, true rewards: #0: 11.374
	[2025-04-19 18:34:50,838][15432] Avg episode reward: 27.474, avg true_objective: 11.374
	[2025-04-19 18:35:08,939][15432] Replay video saved to ./runs/default_experiment/replay.mp4!