[2025-03-29 06:05:01,495][04457] Saving configuration to /content/train_dir/default_experiment/config.json...
[2025-03-29 06:05:01,500][04457] Rollout worker 0 uses device cpu
[2025-03-29 06:05:01,501][04457] Rollout worker 1 uses device cpu
[2025-03-29 06:05:01,502][04457] Rollout worker 2 uses device cpu
[2025-03-29 06:05:01,503][04457] Rollout worker 3 uses device cpu
[2025-03-29 06:05:01,504][04457] Rollout worker 4 uses device cpu
[2025-03-29 06:05:01,505][04457] Rollout worker 5 uses device cpu
[2025-03-29 06:05:01,506][04457] Rollout worker 6 uses device cpu
[2025-03-29 06:05:01,507][04457] Rollout worker 7 uses device cpu
[2025-03-29 06:05:01,601][04457] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-29 06:05:01,603][04457] InferenceWorker_p0-w0: min num requests: 2
[2025-03-29 06:05:01,635][04457] Starting all processes...
[2025-03-29 06:05:01,636][04457] Starting process learner_proc0
[2025-03-29 06:05:01,690][04457] Starting all processes...
[2025-03-29 06:05:01,699][04457] Starting process inference_proc0-0
[2025-03-29 06:05:01,700][04457] Starting process rollout_proc0
[2025-03-29 06:05:01,700][04457] Starting process rollout_proc1
[2025-03-29 06:05:01,700][04457] Starting process rollout_proc2
[2025-03-29 06:05:01,700][04457] Starting process rollout_proc3
[2025-03-29 06:05:01,700][04457] Starting process rollout_proc4
[2025-03-29 06:05:01,700][04457] Starting process rollout_proc5
[2025-03-29 06:05:01,700][04457] Starting process rollout_proc6
[2025-03-29 06:05:01,700][04457] Starting process rollout_proc7
[2025-03-29 06:05:20,059][04749] Worker 5 uses CPU cores [1]
[2025-03-29 06:05:20,112][04745] Worker 1 uses CPU cores [1]
[2025-03-29 06:05:20,443][04731] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-29 06:05:20,443][04731] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
[2025-03-29 06:05:20,524][04731] Num visible devices: 1
[2025-03-29 06:05:20,578][04731] Starting seed is not provided
[2025-03-29 06:05:20,578][04731] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-29 06:05:20,578][04731] Initializing actor-critic model on device cuda:0
[2025-03-29 06:05:20,579][04731] RunningMeanStd input shape: (3, 72, 128)
[2025-03-29 06:05:20,580][04731] RunningMeanStd input shape: (1,)
[2025-03-29 06:05:20,668][04731] ConvEncoder: input_channels=3
[2025-03-29 06:05:20,699][04746] Worker 0 uses CPU cores [0]
[2025-03-29 06:05:20,714][04750] Worker 4 uses CPU cores [0]
[2025-03-29 06:05:20,831][04748] Worker 3 uses CPU cores [1]
[2025-03-29 06:05:20,843][04747] Worker 2 uses CPU cores [0]
[2025-03-29 06:05:20,916][04752] Worker 7 uses CPU cores [1]
[2025-03-29 06:05:20,925][04744] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-29 06:05:20,925][04744] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2025-03-29 06:05:20,927][04751] Worker 6 uses CPU cores [0]
[2025-03-29 06:05:20,953][04744] Num visible devices: 1
[2025-03-29 06:05:21,056][04731] Conv encoder output size: 512
[2025-03-29 06:05:21,056][04731] Policy head output size: 512
[2025-03-29 06:05:21,072][04731] Created Actor Critic model with architecture:
[2025-03-29 06:05:21,073][04731] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2025-03-29 06:05:21,346][04731] Using optimizer <class 'torch.optim.adam.Adam'>
[2025-03-29 06:05:21,598][04457] Heartbeat connected on Batcher_0
[2025-03-29 06:05:21,605][04457] Heartbeat connected on InferenceWorker_p0-w0
[2025-03-29 06:05:21,609][04457] Heartbeat connected on RolloutWorker_w0
[2025-03-29 06:05:21,613][04457] Heartbeat connected on RolloutWorker_w1
[2025-03-29 06:05:21,617][04457] Heartbeat connected on RolloutWorker_w2
[2025-03-29 06:05:21,620][04457] Heartbeat connected on RolloutWorker_w3
[2025-03-29 06:05:21,623][04457] Heartbeat connected on RolloutWorker_w4
[2025-03-29 06:05:21,629][04457] Heartbeat connected on RolloutWorker_w5
[2025-03-29 06:05:21,633][04457] Heartbeat connected on RolloutWorker_w6
[2025-03-29 06:05:21,636][04457] Heartbeat connected on RolloutWorker_w7
[2025-03-29 06:05:26,232][04731] No checkpoints found
[2025-03-29 06:05:26,232][04731] Did not load from checkpoint, starting from scratch!
[2025-03-29 06:05:26,233][04731] Initialized policy 0 weights for model version 0
[2025-03-29 06:05:26,236][04731] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-29 06:05:26,243][04731] LearnerWorker_p0 finished initialization!
[2025-03-29 06:05:26,246][04457] Heartbeat connected on LearnerWorker_p0
[2025-03-29 06:05:26,408][04744] RunningMeanStd input shape: (3, 72, 128)
[2025-03-29 06:05:26,411][04744] RunningMeanStd input shape: (1,)
[2025-03-29 06:05:26,484][04744] ConvEncoder: input_channels=3
[2025-03-29 06:05:26,589][04744] Conv encoder output size: 512
[2025-03-29 06:05:26,589][04744] Policy head output size: 512
[2025-03-29 06:05:26,625][04457] Inference worker 0-0 is ready!
[2025-03-29 06:05:26,626][04457] All inference workers are ready! Signal rollout workers to start!
[2025-03-29 06:05:26,932][04745] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-29 06:05:26,943][04752] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-29 06:05:26,942][04749] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-29 06:05:26,951][04751] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-29 06:05:26,955][04747] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-29 06:05:26,981][04746] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-29 06:05:26,988][04750] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-29 06:05:27,040][04748] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-29 06:05:28,383][04750] Decorrelating experience for 0 frames...
[2025-03-29 06:05:28,385][04747] Decorrelating experience for 0 frames...
[2025-03-29 06:05:28,384][04752] Decorrelating experience for 0 frames...
[2025-03-29 06:05:28,383][04749] Decorrelating experience for 0 frames...
[2025-03-29 06:05:29,184][04747] Decorrelating experience for 32 frames...
[2025-03-29 06:05:29,187][04750] Decorrelating experience for 32 frames...
[2025-03-29 06:05:29,612][04752] Decorrelating experience for 32 frames...
[2025-03-29 06:05:29,625][04748] Decorrelating experience for 0 frames...
[2025-03-29 06:05:29,634][04749] Decorrelating experience for 32 frames...
[2025-03-29 06:05:30,162][04457] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-03-29 06:05:30,250][04750] Decorrelating experience for 64 frames...
[2025-03-29 06:05:30,255][04747] Decorrelating experience for 64 frames...
[2025-03-29 06:05:30,580][04748] Decorrelating experience for 32 frames...
[2025-03-29 06:05:30,763][04747] Decorrelating experience for 96 frames...
[2025-03-29 06:05:30,826][04752] Decorrelating experience for 64 frames...
[2025-03-29 06:05:31,755][04748] Decorrelating experience for 64 frames...
[2025-03-29 06:05:31,760][04749] Decorrelating experience for 64 frames...
[2025-03-29 06:05:31,898][04752] Decorrelating experience for 96 frames...
[2025-03-29 06:05:32,280][04750] Decorrelating experience for 96 frames...
[2025-03-29 06:05:33,344][04749] Decorrelating experience for 96 frames...
[2025-03-29 06:05:33,346][04748] Decorrelating experience for 96 frames...
[2025-03-29 06:05:35,162][04457] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 42.4. Samples: 212. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-03-29 06:05:35,164][04457] Avg episode reward: [(0, '2.129')]
[2025-03-29 06:05:36,534][04731] Signal inference workers to stop experience collection...
[2025-03-29 06:05:36,546][04744] InferenceWorker_p0-w0: stopping experience collection
[2025-03-29 06:05:38,532][04731] Signal inference workers to resume experience collection...
[2025-03-29 06:05:38,533][04744] InferenceWorker_p0-w0: resuming experience collection
[2025-03-29 06:05:40,162][04457] Fps is (10 sec: 1228.8, 60 sec: 1228.8, 300 sec: 1228.8). Total num frames: 12288. Throughput: 0: 225.6. Samples: 2256. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0)
[2025-03-29 06:05:40,166][04457] Avg episode reward: [(0, '3.575')]
[2025-03-29 06:05:45,162][04457] Fps is (10 sec: 3276.8, 60 sec: 2184.5, 300 sec: 2184.5). Total num frames: 32768. Throughput: 0: 545.6. Samples: 8184. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:05:45,166][04457] Avg episode reward: [(0, '4.095')]
[2025-03-29 06:05:47,026][04744] Updated weights for policy 0, policy_version 10 (0.0095)
[2025-03-29 06:05:50,162][04457] Fps is (10 sec: 3686.4, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 49152. Throughput: 0: 560.6. Samples: 11212. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:05:50,163][04457] Avg episode reward: [(0, '4.425')]
[2025-03-29 06:05:55,162][04457] Fps is (10 sec: 3276.7, 60 sec: 2621.4, 300 sec: 2621.4). Total num frames: 65536. Throughput: 0: 627.4. Samples: 15686. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:05:55,166][04457] Avg episode reward: [(0, '4.452')]
[2025-03-29 06:05:58,660][04744] Updated weights for policy 0, policy_version 20 (0.0017)
[2025-03-29 06:06:00,162][04457] Fps is (10 sec: 3686.4, 60 sec: 2867.2, 300 sec: 2867.2). Total num frames: 86016. Throughput: 0: 727.6. Samples: 21828. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:06:00,168][04457] Avg episode reward: [(0, '4.363')]
[2025-03-29 06:06:05,172][04457] Fps is (10 sec: 3682.9, 60 sec: 2924.9, 300 sec: 2924.9). Total num frames: 102400. Throughput: 0: 697.8. Samples: 24430. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:06:05,178][04457] Avg episode reward: [(0, '4.515')]
[2025-03-29 06:06:05,192][04731] Saving new best policy, reward=4.515!
[2025-03-29 06:06:10,162][04457] Fps is (10 sec: 3276.8, 60 sec: 2969.6, 300 sec: 2969.6). Total num frames: 118784. Throughput: 0: 717.0. Samples: 28680. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:06:10,165][04457] Avg episode reward: [(0, '4.458')]
[2025-03-29 06:06:10,992][04744] Updated weights for policy 0, policy_version 30 (0.0017)
[2025-03-29 06:06:15,162][04457] Fps is (10 sec: 3689.9, 60 sec: 3094.7, 300 sec: 3094.7). Total num frames: 139264. Throughput: 0: 771.1. Samples: 34698. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:06:15,164][04457] Avg episode reward: [(0, '4.390')]
[2025-03-29 06:06:20,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3031.0, 300 sec: 3031.0). Total num frames: 151552. Throughput: 0: 825.6. Samples: 37362. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:06:20,167][04457] Avg episode reward: [(0, '4.491')]
[2025-03-29 06:06:22,768][04744] Updated weights for policy 0, policy_version 40 (0.0015)
[2025-03-29 06:06:25,162][04457] Fps is (10 sec: 3276.9, 60 sec: 3127.9, 300 sec: 3127.9). Total num frames: 172032. Throughput: 0: 888.2. Samples: 42224. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:06:25,164][04457] Avg episode reward: [(0, '4.634')]
[2025-03-29 06:06:25,170][04731] Saving new best policy, reward=4.634!
[2025-03-29 06:06:30,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3208.5, 300 sec: 3208.5). Total num frames: 192512. Throughput: 0: 884.4. Samples: 47982. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:06:30,163][04457] Avg episode reward: [(0, '4.621')]
[2025-03-29 06:06:34,387][04744] Updated weights for policy 0, policy_version 50 (0.0020)
[2025-03-29 06:06:35,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3413.3, 300 sec: 3150.8). Total num frames: 204800. Throughput: 0: 871.4. Samples: 50424. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:06:35,166][04457] Avg episode reward: [(0, '4.644')]
[2025-03-29 06:06:35,173][04731] Saving new best policy, reward=4.644!
[2025-03-29 06:06:40,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3218.3). Total num frames: 225280. Throughput: 0: 880.6. Samples: 55312. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2025-03-29 06:06:40,163][04457] Avg episode reward: [(0, '4.728')]
[2025-03-29 06:06:40,166][04731] Saving new best policy, reward=4.728!
[2025-03-29 06:06:45,164][04457] Fps is (10 sec: 3685.6, 60 sec: 3481.5, 300 sec: 3222.1). Total num frames: 241664. Throughput: 0: 870.9. Samples: 61020. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:06:45,166][04457] Avg episode reward: [(0, '4.626')]
[2025-03-29 06:06:45,715][04744] Updated weights for policy 0, policy_version 60 (0.0015)
[2025-03-29 06:06:50,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3225.6). Total num frames: 258048. Throughput: 0: 859.1. Samples: 63080. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:06:50,166][04457] Avg episode reward: [(0, '4.532')]
[2025-03-29 06:06:55,162][04457] Fps is (10 sec: 3277.5, 60 sec: 3481.6, 300 sec: 3228.6). Total num frames: 274432. Throughput: 0: 885.0. Samples: 68506. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:06:55,166][04457] Avg episode reward: [(0, '4.455')]
[2025-03-29 06:06:55,234][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000068_278528.pth...
[2025-03-29 06:06:57,165][04744] Updated weights for policy 0, policy_version 70 (0.0022)
[2025-03-29 06:07:00,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3276.8). Total num frames: 294912. Throughput: 0: 881.1. Samples: 74348. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:07:00,170][04457] Avg episode reward: [(0, '4.619')]
[2025-03-29 06:07:05,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3482.2, 300 sec: 3276.8). Total num frames: 311296. Throughput: 0: 863.7. Samples: 76230. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:07:05,165][04457] Avg episode reward: [(0, '4.563')]
[2025-03-29 06:07:08,989][04744] Updated weights for policy 0, policy_version 80 (0.0016)
[2025-03-29 06:07:10,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3317.8). Total num frames: 331776. Throughput: 0: 881.9. Samples: 81908. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:07:10,164][04457] Avg episode reward: [(0, '4.591')]
[2025-03-29 06:07:15,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3315.8). Total num frames: 348160. Throughput: 0: 878.4. Samples: 87512. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:07:15,167][04457] Avg episode reward: [(0, '4.641')]
[2025-03-29 06:07:20,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3314.0). Total num frames: 364544. Throughput: 0: 866.9. Samples: 89434. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:07:20,163][04457] Avg episode reward: [(0, '4.693')]
[2025-03-29 06:07:20,593][04744] Updated weights for policy 0, policy_version 90 (0.0024)
[2025-03-29 06:07:25,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3348.0). Total num frames: 385024. Throughput: 0: 893.3. Samples: 95510. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:07:25,167][04457] Avg episode reward: [(0, '4.480')]
[2025-03-29 06:07:30,164][04457] Fps is (10 sec: 3685.7, 60 sec: 3481.5, 300 sec: 3345.0). Total num frames: 401408. Throughput: 0: 883.3. Samples: 100766. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:07:30,166][04457] Avg episode reward: [(0, '4.457')]
[2025-03-29 06:07:32,512][04744] Updated weights for policy 0, policy_version 100 (0.0023)
[2025-03-29 06:07:35,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3342.3). Total num frames: 417792. Throughput: 0: 885.6. Samples: 102932. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2025-03-29 06:07:35,166][04457] Avg episode reward: [(0, '4.522')]
[2025-03-29 06:07:40,162][04457] Fps is (10 sec: 3687.0, 60 sec: 3549.9, 300 sec: 3371.3). Total num frames: 438272. Throughput: 0: 899.1. Samples: 108964. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:07:40,164][04457] Avg episode reward: [(0, '4.599')]
[2025-03-29 06:07:43,004][04744] Updated weights for policy 0, policy_version 110 (0.0020)
[2025-03-29 06:07:45,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3367.8). Total num frames: 454656. Throughput: 0: 881.3. Samples: 114006. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:07:45,170][04457] Avg episode reward: [(0, '4.646')]
[2025-03-29 06:07:50,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3364.6). Total num frames: 471040. Throughput: 0: 892.6. Samples: 116398. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:07:50,166][04457] Avg episode reward: [(0, '4.489')]
[2025-03-29 06:07:54,321][04744] Updated weights for policy 0, policy_version 120 (0.0014)
[2025-03-29 06:07:55,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3389.8). Total num frames: 491520. Throughput: 0: 901.6. Samples: 122480. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:07:55,164][04457] Avg episode reward: [(0, '4.535')]
[2025-03-29 06:08:00,163][04457] Fps is (10 sec: 3686.2, 60 sec: 3549.8, 300 sec: 3386.0). Total num frames: 507904. Throughput: 0: 882.6. Samples: 127228. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:08:00,166][04457] Avg episode reward: [(0, '4.570')]
[2025-03-29 06:08:05,162][04457] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3382.5). Total num frames: 524288. Throughput: 0: 898.7. Samples: 129874. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:08:05,164][04457] Avg episode reward: [(0, '4.561')]
[2025-03-29 06:08:06,269][04744] Updated weights for policy 0, policy_version 130 (0.0017)
[2025-03-29 06:08:10,162][04457] Fps is (10 sec: 3686.6, 60 sec: 3549.9, 300 sec: 3404.8). Total num frames: 544768. Throughput: 0: 894.8. Samples: 135778. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:08:10,163][04457] Avg episode reward: [(0, '4.464')]
[2025-03-29 06:08:15,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3376.1). Total num frames: 557056. Throughput: 0: 876.3. Samples: 140198. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:08:15,167][04457] Avg episode reward: [(0, '4.496')]
[2025-03-29 06:08:18,319][04744] Updated weights for policy 0, policy_version 140 (0.0013)
[2025-03-29 06:08:20,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3397.3). Total num frames: 577536. Throughput: 0: 891.9. Samples: 143066. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:08:20,164][04457] Avg episode reward: [(0, '4.530')]
[2025-03-29 06:08:25,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3417.2). Total num frames: 598016. Throughput: 0: 889.3. Samples: 148982. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:08:25,164][04457] Avg episode reward: [(0, '4.468')]
[2025-03-29 06:08:30,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.7, 300 sec: 3390.6). Total num frames: 610304. Throughput: 0: 873.3. Samples: 153306. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:08:30,164][04457] Avg episode reward: [(0, '4.615')]
[2025-03-29 06:08:30,358][04744] Updated weights for policy 0, policy_version 150 (0.0016)
[2025-03-29 06:08:35,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3409.6). Total num frames: 630784. Throughput: 0: 886.6. Samples: 156294. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:08:35,164][04457] Avg episode reward: [(0, '4.897')]
[2025-03-29 06:08:35,170][04731] Saving new best policy, reward=4.897!
[2025-03-29 06:08:40,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3427.7). Total num frames: 651264. Throughput: 0: 882.1. Samples: 162176. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:08:40,165][04457] Avg episode reward: [(0, '5.135')]
[2025-03-29 06:08:40,169][04731] Saving new best policy, reward=5.135!
[2025-03-29 06:08:41,849][04744] Updated weights for policy 0, policy_version 160 (0.0020)
[2025-03-29 06:08:45,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3402.8). Total num frames: 663552. Throughput: 0: 871.7. Samples: 166454. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:08:45,165][04457] Avg episode reward: [(0, '5.074')]
[2025-03-29 06:08:50,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3420.2). Total num frames: 684032. Throughput: 0: 878.9. Samples: 169424. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:08:50,165][04457] Avg episode reward: [(0, '5.023')]
[2025-03-29 06:08:52,664][04744] Updated weights for policy 0, policy_version 170 (0.0022)
[2025-03-29 06:08:55,166][04457] Fps is (10 sec: 3685.0, 60 sec: 3481.4, 300 sec: 3416.6). Total num frames: 700416. Throughput: 0: 876.5. Samples: 175224. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:08:55,167][04457] Avg episode reward: [(0, '4.996')]
[2025-03-29 06:08:55,181][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000171_700416.pth...
[2025-03-29 06:09:00,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3413.3). Total num frames: 716800. Throughput: 0: 878.5. Samples: 179732. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:09:00,164][04457] Avg episode reward: [(0, '5.093')]
[2025-03-29 06:09:04,691][04744] Updated weights for policy 0, policy_version 180 (0.0013)
[2025-03-29 06:09:05,162][04457] Fps is (10 sec: 3687.8, 60 sec: 3549.9, 300 sec: 3429.2). Total num frames: 737280. Throughput: 0: 880.8. Samples: 182702. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:09:05,164][04457] Avg episode reward: [(0, '5.213')]
[2025-03-29 06:09:05,171][04731] Saving new best policy, reward=5.213!
[2025-03-29 06:09:10,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3425.7). Total num frames: 753664. Throughput: 0: 868.4. Samples: 188060. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:09:10,167][04457] Avg episode reward: [(0, '5.248')]
[2025-03-29 06:09:10,169][04731] Saving new best policy, reward=5.248!
[2025-03-29 06:09:15,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3422.4). Total num frames: 770048. Throughput: 0: 879.7. Samples: 192894. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:09:15,167][04457] Avg episode reward: [(0, '5.437')]
[2025-03-29 06:09:15,176][04731] Saving new best policy, reward=5.437!
[2025-03-29 06:09:16,634][04744] Updated weights for policy 0, policy_version 190 (0.0023)
[2025-03-29 06:09:20,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3437.1). Total num frames: 790528. Throughput: 0: 879.5. Samples: 195870. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:09:20,167][04457] Avg episode reward: [(0, '5.129')]
[2025-03-29 06:09:25,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3433.7). Total num frames: 806912. Throughput: 0: 867.2. Samples: 201200. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:09:25,164][04457] Avg episode reward: [(0, '5.183')]
[2025-03-29 06:09:28,326][04744] Updated weights for policy 0, policy_version 200 (0.0015)
[2025-03-29 06:09:30,165][04457] Fps is (10 sec: 3275.9, 60 sec: 3549.7, 300 sec: 3430.4). Total num frames: 823296. Throughput: 0: 885.8. Samples: 206318. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:09:30,166][04457] Avg episode reward: [(0, '5.057')]
[2025-03-29 06:09:35,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3444.0). Total num frames: 843776. Throughput: 0: 885.4. Samples: 209268. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:09:35,166][04457] Avg episode reward: [(0, '5.415')]
[2025-03-29 06:09:40,162][04457] Fps is (10 sec: 3277.6, 60 sec: 3413.3, 300 sec: 3424.3). Total num frames: 856064. Throughput: 0: 868.2. Samples: 214292. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:09:40,164][04457] Avg episode reward: [(0, '5.338')]
[2025-03-29 06:09:40,296][04744] Updated weights for policy 0, policy_version 210 (0.0019)
[2025-03-29 06:09:45,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3437.4). Total num frames: 876544. Throughput: 0: 885.3. Samples: 219570. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:09:45,165][04457] Avg episode reward: [(0, '5.520')]
[2025-03-29 06:09:45,170][04731] Saving new best policy, reward=5.520!
[2025-03-29 06:09:50,162][04457] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3450.1). Total num frames: 897024. Throughput: 0: 884.8. Samples: 222520. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:09:50,163][04457] Avg episode reward: [(0, '5.611')]
[2025-03-29 06:09:50,165][04731] Saving new best policy, reward=5.611!
[2025-03-29 06:09:50,813][04744] Updated weights for policy 0, policy_version 220 (0.0019)
[2025-03-29 06:09:55,162][04457] Fps is (10 sec: 3276.7, 60 sec: 3481.8, 300 sec: 3431.4). Total num frames: 909312. Throughput: 0: 873.4. Samples: 227362. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:09:55,164][04457] Avg episode reward: [(0, '5.849')]
[2025-03-29 06:09:55,172][04731] Saving new best policy, reward=5.849!
[2025-03-29 06:10:00,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3443.7). Total num frames: 929792. Throughput: 0: 889.6. Samples: 232928. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:10:00,164][04457] Avg episode reward: [(0, '6.015')]
[2025-03-29 06:10:00,168][04731] Saving new best policy, reward=6.015!
[2025-03-29 06:10:02,672][04744] Updated weights for policy 0, policy_version 230 (0.0020)
[2025-03-29 06:10:05,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3549.8, 300 sec: 3455.5). Total num frames: 950272. Throughput: 0: 889.5. Samples: 235896. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:10:05,166][04457] Avg episode reward: [(0, '6.375')]
[2025-03-29 06:10:05,173][04731] Saving new best policy, reward=6.375!
[2025-03-29 06:10:10,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3437.7). Total num frames: 962560. Throughput: 0: 867.1. Samples: 240220. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:10:10,163][04457] Avg episode reward: [(0, '6.482')]
[2025-03-29 06:10:10,168][04731] Saving new best policy, reward=6.482!
[2025-03-29 06:10:14,707][04744] Updated weights for policy 0, policy_version 240 (0.0015)
[2025-03-29 06:10:15,162][04457] Fps is (10 sec: 3277.0, 60 sec: 3549.9, 300 sec: 3449.3). Total num frames: 983040. Throughput: 0: 882.7. Samples: 246038. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:10:15,164][04457] Avg episode reward: [(0, '6.845')]
[2025-03-29 06:10:15,171][04731] Saving new best policy, reward=6.845!
[2025-03-29 06:10:20,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3446.3). Total num frames: 999424. Throughput: 0: 883.0. Samples: 249004. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:10:20,163][04457] Avg episode reward: [(0, '6.731')]
[2025-03-29 06:10:25,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3443.4). Total num frames: 1015808. Throughput: 0: 869.2. Samples: 253404. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-29 06:10:25,166][04457] Avg episode reward: [(0, '6.523')]
[2025-03-29 06:10:26,476][04744] Updated weights for policy 0, policy_version 250 (0.0015)
[2025-03-29 06:10:30,164][04457] Fps is (10 sec: 3685.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1036288. Throughput: 0: 886.1. Samples: 259448. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:10:30,167][04457] Avg episode reward: [(0, '7.099')]
[2025-03-29 06:10:30,168][04731] Saving new best policy, reward=7.099!
[2025-03-29 06:10:35,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1052672. Throughput: 0: 887.0. Samples: 262436. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:10:35,166][04457] Avg episode reward: [(0, '7.737')]
[2025-03-29 06:10:35,176][04731] Saving new best policy, reward=7.737!
[2025-03-29 06:10:38,429][04744] Updated weights for policy 0, policy_version 260 (0.0022)
[2025-03-29 06:10:40,162][04457] Fps is (10 sec: 3277.4, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1069056. Throughput: 0: 877.1. Samples: 266832. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:10:40,167][04457] Avg episode reward: [(0, '8.296')]
[2025-03-29 06:10:40,172][04731] Saving new best policy, reward=8.296!
[2025-03-29 06:10:45,165][04457] Fps is (10 sec: 3685.3, 60 sec: 3549.7, 300 sec: 3526.7). Total num frames: 1089536. Throughput: 0: 882.9. Samples: 272662. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:10:45,170][04457] Avg episode reward: [(0, '7.891')]
[2025-03-29 06:10:49,258][04744] Updated weights for policy 0, policy_version 270 (0.0022)
[2025-03-29 06:10:50,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1105920. Throughput: 0: 883.2. Samples: 275640. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:10:50,164][04457] Avg episode reward: [(0, '7.628')]
[2025-03-29 06:10:55,162][04457] Fps is (10 sec: 3277.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1122304. Throughput: 0: 887.0. Samples: 280134. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:10:55,166][04457] Avg episode reward: [(0, '8.082')]
[2025-03-29 06:10:55,176][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000274_1122304.pth...
[2025-03-29 06:10:55,297][04731] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000068_278528.pth
[2025-03-29 06:11:00,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3526.8). Total num frames: 1142784. Throughput: 0: 891.1. Samples: 286138. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:11:00,164][04457] Avg episode reward: [(0, '7.937')]
[2025-03-29 06:11:00,447][04744] Updated weights for policy 0, policy_version 280 (0.0019)
[2025-03-29 06:11:05,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1159168. Throughput: 0: 884.1. Samples: 288790. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:11:05,164][04457] Avg episode reward: [(0, '7.828')]
[2025-03-29 06:11:10,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1175552. Throughput: 0: 890.9. Samples: 293496. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:11:10,163][04457] Avg episode reward: [(0, '7.359')]
[2025-03-29 06:11:12,472][04744] Updated weights for policy 0, policy_version 290 (0.0025)
[2025-03-29 06:11:15,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 1196032. Throughput: 0: 889.5. Samples: 299474. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:11:15,164][04457] Avg episode reward: [(0, '7.848')]
[2025-03-29 06:11:20,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 1212416. Throughput: 0: 875.0. Samples: 301812. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:11:20,164][04457] Avg episode reward: [(0, '8.463')]
[2025-03-29 06:11:20,166][04731] Saving new best policy, reward=8.463!
[2025-03-29 06:11:24,260][04744] Updated weights for policy 0, policy_version 300 (0.0013)
[2025-03-29 06:11:25,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3512.8). Total num frames: 1228800. Throughput: 0: 889.7. Samples: 306868. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:11:25,169][04457] Avg episode reward: [(0, '8.702')]
[2025-03-29 06:11:25,253][04731] Saving new best policy, reward=8.702!
[2025-03-29 06:11:30,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 1249280. Throughput: 0: 891.9. Samples: 312796. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:11:30,166][04457] Avg episode reward: [(0, '9.047')]
[2025-03-29 06:11:30,173][04731] Saving new best policy, reward=9.047!
[2025-03-29 06:11:35,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3512.8). Total num frames: 1261568. Throughput: 0: 867.6. Samples: 314680. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:11:35,168][04457] Avg episode reward: [(0, '9.288')]
[2025-03-29 06:11:35,181][04731] Saving new best policy, reward=9.288!
[2025-03-29 06:11:36,338][04744] Updated weights for policy 0, policy_version 310 (0.0015)
[2025-03-29 06:11:40,166][04457] Fps is (10 sec: 3275.5, 60 sec: 3549.6, 300 sec: 3526.7). Total num frames: 1282048. Throughput: 0: 889.5. Samples: 320166. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:11:40,173][04457] Avg episode reward: [(0, '9.814')]
[2025-03-29 06:11:40,174][04731] Saving new best policy, reward=9.814!
[2025-03-29 06:11:45,163][04457] Fps is (10 sec: 4095.7, 60 sec: 3550.0, 300 sec: 3540.6). Total num frames: 1302528. Throughput: 0: 881.0. Samples: 325784. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:11:45,164][04457] Avg episode reward: [(0, '9.853')]
[2025-03-29 06:11:45,174][04731] Saving new best policy, reward=9.853!
[2025-03-29 06:11:48,020][04744] Updated weights for policy 0, policy_version 320 (0.0020)
[2025-03-29 06:11:50,162][04457] Fps is (10 sec: 3278.1, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1314816. Throughput: 0: 862.1. Samples: 327586. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:11:50,168][04457] Avg episode reward: [(0, '9.601')]
[2025-03-29 06:11:55,162][04457] Fps is (10 sec: 3276.9, 60 sec: 3549.8, 300 sec: 3526.7). Total num frames: 1335296. Throughput: 0: 889.9. Samples: 333540. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:11:55,164][04457] Avg episode reward: [(0, '9.503')]
[2025-03-29 06:11:58,498][04744] Updated weights for policy 0, policy_version 330 (0.0017)
[2025-03-29 06:12:00,162][04457] Fps is (10 sec: 4096.1, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 1355776. Throughput: 0: 875.4. Samples: 338866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-29 06:12:00,163][04457] Avg episode reward: [(0, '9.560')]
[2025-03-29 06:12:05,162][04457] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 1372160. Throughput: 0: 874.6. Samples: 341168. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:12:05,167][04457] Avg episode reward: [(0, '10.115')]
[2025-03-29 06:12:05,175][04731] Saving new best policy, reward=10.115!
[2025-03-29 06:12:10,135][04744] Updated weights for policy 0, policy_version 340 (0.0013)
[2025-03-29 06:12:10,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1392640. Throughput: 0: 893.9. Samples: 347092. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:12:10,163][04457] Avg episode reward: [(0, '10.966')]
[2025-03-29 06:12:10,165][04731] Saving new best policy, reward=10.966!
[2025-03-29 06:12:15,166][04457] Fps is (10 sec: 3275.6, 60 sec: 3481.4, 300 sec: 3526.7). Total num frames: 1404928. Throughput: 0: 867.3. Samples: 351826. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:12:15,167][04457] Avg episode reward: [(0, '11.846')]
[2025-03-29 06:12:15,181][04731] Saving new best policy, reward=11.846!
[2025-03-29 06:12:20,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3526.7). Total num frames: 1425408. Throughput: 0: 884.2. Samples: 354468. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:12:20,166][04457] Avg episode reward: [(0, '12.894')]
[2025-03-29 06:12:20,170][04731] Saving new best policy, reward=12.894!
[2025-03-29 06:12:21,833][04744] Updated weights for policy 0, policy_version 350 (0.0019)
[2025-03-29 06:12:25,162][04457] Fps is (10 sec: 4097.5, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1445888. Throughput: 0: 896.5. Samples: 360506. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:12:25,164][04457] Avg episode reward: [(0, '12.425')]
[2025-03-29 06:12:30,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3526.7). Total num frames: 1458176. Throughput: 0: 872.1. Samples: 365026. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:12:30,164][04457] Avg episode reward: [(0, '12.222')]
[2025-03-29 06:12:33,432][04744] Updated weights for policy 0, policy_version 360 (0.0020)
[2025-03-29 06:12:35,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3526.7). Total num frames: 1478656. Throughput: 0: 900.1. Samples: 368092. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:12:35,163][04457] Avg episode reward: [(0, '12.503')]
[2025-03-29 06:12:40,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3618.4, 300 sec: 3540.6). Total num frames: 1499136. Throughput: 0: 905.6. Samples: 374292. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:12:40,166][04457] Avg episode reward: [(0, '11.351')]
[2025-03-29 06:12:45,034][04744] Updated weights for policy 0, policy_version 370 (0.0028)
[2025-03-29 06:12:45,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 1515520. Throughput: 0: 887.5. Samples: 378804. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:12:45,164][04457] Avg episode reward: [(0, '12.478')]
[2025-03-29 06:12:50,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3686.4, 300 sec: 3540.6). Total num frames: 1536000. Throughput: 0: 903.5. Samples: 381826. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:12:50,164][04457] Avg episode reward: [(0, '12.176')]
[2025-03-29 06:12:55,163][04457] Fps is (10 sec: 3686.1, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1552384. Throughput: 0: 909.5. Samples: 388020. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:12:55,165][04457] Avg episode reward: [(0, '12.219')]
[2025-03-29 06:12:55,175][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000379_1552384.pth...
[2025-03-29 06:12:55,336][04731] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000171_700416.pth
[2025-03-29 06:12:56,206][04744] Updated weights for policy 0, policy_version 380 (0.0017)
[2025-03-29 06:13:00,162][04457] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3540.6). Total num frames: 1568768. Throughput: 0: 903.6. Samples: 392486. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:13:00,163][04457] Avg episode reward: [(0, '12.679')]
[2025-03-29 06:13:05,162][04457] Fps is (10 sec: 3686.7, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1589248. Throughput: 0: 913.2. Samples: 395560. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:13:05,164][04457] Avg episode reward: [(0, '12.644')]
[2025-03-29 06:13:06,766][04744] Updated weights for policy 0, policy_version 390 (0.0014)
[2025-03-29 06:13:10,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1605632. Throughput: 0: 902.7. Samples: 401128. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:13:10,164][04457] Avg episode reward: [(0, '12.762')]
[2025-03-29 06:13:15,162][04457] Fps is (10 sec: 3276.7, 60 sec: 3618.3, 300 sec: 3540.6). Total num frames: 1622016. Throughput: 0: 910.3. Samples: 405992. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:13:15,167][04457] Avg episode reward: [(0, '13.124')]
[2025-03-29 06:13:15,174][04731] Saving new best policy, reward=13.124!
[2025-03-29 06:13:18,699][04744] Updated weights for policy 0, policy_version 400 (0.0019)
[2025-03-29 06:13:20,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1642496. Throughput: 0: 908.0. Samples: 408950. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:13:20,163][04457] Avg episode reward: [(0, '13.467')]
[2025-03-29 06:13:20,165][04731] Saving new best policy, reward=13.467!
[2025-03-29 06:13:25,162][04457] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1658880. Throughput: 0: 886.2. Samples: 414172. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:13:25,166][04457] Avg episode reward: [(0, '13.836')]
[2025-03-29 06:13:25,175][04731] Saving new best policy, reward=13.836!
[2025-03-29 06:13:30,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1675264. Throughput: 0: 904.2. Samples: 419494. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:13:30,166][04457] Avg episode reward: [(0, '14.106')]
[2025-03-29 06:13:30,210][04731] Saving new best policy, reward=14.106!
[2025-03-29 06:13:30,216][04744] Updated weights for policy 0, policy_version 410 (0.0014)
[2025-03-29 06:13:35,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3540.6). Total num frames: 1695744. Throughput: 0: 904.1. Samples: 422512. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:13:35,166][04457] Avg episode reward: [(0, '14.110')]
[2025-03-29 06:13:35,173][04731] Saving new best policy, reward=14.110!
[2025-03-29 06:13:40,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1712128. Throughput: 0: 876.6. Samples: 427468. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:13:40,168][04457] Avg episode reward: [(0, '15.244')]
[2025-03-29 06:13:40,172][04731] Saving new best policy, reward=15.244!
[2025-03-29 06:13:41,972][04744] Updated weights for policy 0, policy_version 420 (0.0015)
[2025-03-29 06:13:45,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1732608. Throughput: 0: 901.4. Samples: 433048. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:13:45,167][04457] Avg episode reward: [(0, '15.282')]
[2025-03-29 06:13:45,173][04731] Saving new best policy, reward=15.282!
[2025-03-29 06:13:50,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1748992. Throughput: 0: 899.2. Samples: 436026. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:13:50,167][04457] Avg episode reward: [(0, '14.893')]
[2025-03-29 06:13:53,646][04744] Updated weights for policy 0, policy_version 430 (0.0028)
[2025-03-29 06:13:55,162][04457] Fps is (10 sec: 3276.9, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1765376. Throughput: 0: 878.0. Samples: 440640. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:13:55,167][04457] Avg episode reward: [(0, '16.465')]
[2025-03-29 06:13:55,177][04731] Saving new best policy, reward=16.465!
[2025-03-29 06:14:00,162][04457] Fps is (10 sec: 3686.6, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1785856. Throughput: 0: 904.4. Samples: 446690. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:14:00,164][04457] Avg episode reward: [(0, '17.535')]
[2025-03-29 06:14:00,165][04731] Saving new best policy, reward=17.535!
[2025-03-29 06:14:04,020][04744] Updated weights for policy 0, policy_version 440 (0.0019)
[2025-03-29 06:14:05,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1802240. Throughput: 0: 906.5. Samples: 449742. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:14:05,164][04457] Avg episode reward: [(0, '17.463')]
[2025-03-29 06:14:10,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1818624. Throughput: 0: 891.5. Samples: 454290. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:14:10,166][04457] Avg episode reward: [(0, '19.372')]
[2025-03-29 06:14:10,168][04731] Saving new best policy, reward=19.372!
[2025-03-29 06:14:15,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3554.5). Total num frames: 1839104. Throughput: 0: 905.4. Samples: 460238. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:14:15,169][04457] Avg episode reward: [(0, '20.278')]
[2025-03-29 06:14:15,176][04731] Saving new best policy, reward=20.278!
[2025-03-29 06:14:15,610][04744] Updated weights for policy 0, policy_version 450 (0.0017)
[2025-03-29 06:14:20,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1855488. Throughput: 0: 902.7. Samples: 463134. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:14:20,168][04457] Avg episode reward: [(0, '18.782')]
[2025-03-29 06:14:25,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3554.5). Total num frames: 1871872. Throughput: 0: 894.0. Samples: 467698. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:14:25,165][04457] Avg episode reward: [(0, '18.194')]
[2025-03-29 06:14:27,326][04744] Updated weights for policy 0, policy_version 460 (0.0017)
[2025-03-29 06:14:30,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3554.5). Total num frames: 1892352. Throughput: 0: 905.9. Samples: 473814. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:14:30,168][04457] Avg episode reward: [(0, '18.235')]
[2025-03-29 06:14:35,165][04457] Fps is (10 sec: 3685.2, 60 sec: 3549.7, 300 sec: 3568.3). Total num frames: 1908736. Throughput: 0: 898.1. Samples: 476444. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:14:35,172][04457] Avg episode reward: [(0, '18.708')]
[2025-03-29 06:14:38,670][04744] Updated weights for policy 0, policy_version 470 (0.0014)
[2025-03-29 06:14:40,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1929216. Throughput: 0: 907.4. Samples: 481472. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:14:40,164][04457] Avg episode reward: [(0, '19.533')]
[2025-03-29 06:14:45,162][04457] Fps is (10 sec: 4097.2, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1949696. Throughput: 0: 908.0. Samples: 487552. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:14:45,164][04457] Avg episode reward: [(0, '20.050')]
[2025-03-29 06:14:50,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3568.4). Total num frames: 1961984. Throughput: 0: 889.3. Samples: 489760. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:14:50,167][04457] Avg episode reward: [(0, '21.371')]
[2025-03-29 06:14:50,230][04731] Saving new best policy, reward=21.371!
[2025-03-29 06:14:50,233][04744] Updated weights for policy 0, policy_version 480 (0.0017)
[2025-03-29 06:14:55,162][04457] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 1982464. Throughput: 0: 908.3. Samples: 495164. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:14:55,164][04457] Avg episode reward: [(0, '20.671')]
[2025-03-29 06:14:55,173][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000484_1982464.pth...
[2025-03-29 06:14:55,289][04731] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000274_1122304.pth
[2025-03-29 06:15:00,162][04457] Fps is (10 sec: 4095.9, 60 sec: 3618.1, 300 sec: 3568.4). Total num frames: 2002944. Throughput: 0: 907.8. Samples: 501090. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:15:00,164][04457] Avg episode reward: [(0, '21.069')]
[2025-03-29 06:15:01,259][04744] Updated weights for policy 0, policy_version 490 (0.0013)
[2025-03-29 06:15:05,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 2019328. Throughput: 0: 885.9. Samples: 502998. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:15:05,164][04457] Avg episode reward: [(0, '20.866')]
[2025-03-29 06:15:10,162][04457] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 2039808. Throughput: 0: 914.4. Samples: 508848. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:15:10,163][04457] Avg episode reward: [(0, '19.552')]
[2025-03-29 06:15:12,120][04744] Updated weights for policy 0, policy_version 500 (0.0014)
[2025-03-29 06:15:15,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 2056192. Throughput: 0: 902.0. Samples: 514402. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:15:15,166][04457] Avg episode reward: [(0, '18.946')]
[2025-03-29 06:15:20,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 2072576. Throughput: 0: 888.0. Samples: 516402. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:15:20,167][04457] Avg episode reward: [(0, '18.789')]
[2025-03-29 06:15:23,793][04744] Updated weights for policy 0, policy_version 510 (0.0015)
[2025-03-29 06:15:25,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 2093056. Throughput: 0: 913.3. Samples: 522572. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:15:25,164][04457] Avg episode reward: [(0, '17.851')]
[2025-03-29 06:15:30,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 2109440. Throughput: 0: 894.8. Samples: 527820. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:15:30,170][04457] Avg episode reward: [(0, '17.103')]
[2025-03-29 06:15:35,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.3, 300 sec: 3582.3). Total num frames: 2125824. Throughput: 0: 898.8. Samples: 530206. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:15:35,164][04457] Avg episode reward: [(0, '17.399')]
[2025-03-29 06:15:35,283][04744] Updated weights for policy 0, policy_version 520 (0.0019)
[2025-03-29 06:15:40,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 2146304. Throughput: 0: 916.0. Samples: 536386. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:15:40,164][04457] Avg episode reward: [(0, '18.382')]
[2025-03-29 06:15:45,164][04457] Fps is (10 sec: 3685.9, 60 sec: 3549.8, 300 sec: 3582.2). Total num frames: 2162688. Throughput: 0: 889.7. Samples: 541126. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:15:45,166][04457] Avg episode reward: [(0, '18.388')]
[2025-03-29 06:15:47,031][04744] Updated weights for policy 0, policy_version 530 (0.0017)
[2025-03-29 06:15:50,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 2183168. Throughput: 0: 907.7. Samples: 543846. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2025-03-29 06:15:50,164][04457] Avg episode reward: [(0, '19.159')]
[2025-03-29 06:15:55,162][04457] Fps is (10 sec: 4096.6, 60 sec: 3686.4, 300 sec: 3596.1). Total num frames: 2203648. Throughput: 0: 913.5. Samples: 549954. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:15:55,166][04457] Avg episode reward: [(0, '19.462')]
[2025-03-29 06:15:57,812][04744] Updated weights for policy 0, policy_version 540 (0.0017)
[2025-03-29 06:16:00,162][04457] Fps is (10 sec: 3276.7, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 2215936. Throughput: 0: 890.1. Samples: 554458. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:16:00,164][04457] Avg episode reward: [(0, '20.325')]
[2025-03-29 06:16:05,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2236416. Throughput: 0: 912.8. Samples: 557480. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:16:05,164][04457] Avg episode reward: [(0, '20.309')]
[2025-03-29 06:16:08,711][04744] Updated weights for policy 0, policy_version 550 (0.0019)
[2025-03-29 06:16:10,162][04457] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2256896. Throughput: 0: 908.5. Samples: 563456. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:16:10,166][04457] Avg episode reward: [(0, '20.751')]
[2025-03-29 06:16:15,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 2269184. Throughput: 0: 889.5. Samples: 567848. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:16:15,165][04457] Avg episode reward: [(0, '20.597')]
[2025-03-29 06:16:20,163][04457] Fps is (10 sec: 3276.6, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2289664. Throughput: 0: 902.6. Samples: 570822. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:16:20,165][04457] Avg episode reward: [(0, '21.166')]
[2025-03-29 06:16:20,539][04744] Updated weights for policy 0, policy_version 560 (0.0023)
[2025-03-29 06:16:25,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 2306048. Throughput: 0: 897.6. Samples: 576780. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:16:25,167][04457] Avg episode reward: [(0, '21.790')]
[2025-03-29 06:16:25,175][04731] Saving new best policy, reward=21.790!
[2025-03-29 06:16:30,162][04457] Fps is (10 sec: 3277.0, 60 sec: 3549.9, 300 sec: 3596.2). Total num frames: 2322432. Throughput: 0: 893.0. Samples: 581308. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:16:30,163][04457] Avg episode reward: [(0, '22.972')]
[2025-03-29 06:16:30,166][04731] Saving new best policy, reward=22.972!
[2025-03-29 06:16:32,321][04744] Updated weights for policy 0, policy_version 570 (0.0016)
[2025-03-29 06:16:35,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 2342912. Throughput: 0: 898.7. Samples: 584290. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:16:35,164][04457] Avg episode reward: [(0, '24.275')]
[2025-03-29 06:16:35,176][04731] Saving new best policy, reward=24.275!
[2025-03-29 06:16:40,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 2359296. Throughput: 0: 884.3. Samples: 589746. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:16:40,166][04457] Avg episode reward: [(0, '23.331')]
[2025-03-29 06:16:43,986][04744] Updated weights for policy 0, policy_version 580 (0.0021)
[2025-03-29 06:16:45,162][04457] Fps is (10 sec: 3276.9, 60 sec: 3550.0, 300 sec: 3596.1). Total num frames: 2375680. Throughput: 0: 897.4. Samples: 594842. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:16:45,164][04457] Avg episode reward: [(0, '22.637')]
[2025-03-29 06:16:50,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3596.2). Total num frames: 2396160. Throughput: 0: 894.7. Samples: 597742. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:16:50,163][04457] Avg episode reward: [(0, '21.792')]
[2025-03-29 06:16:55,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3481.6, 300 sec: 3582.3). Total num frames: 2412544. Throughput: 0: 875.3. Samples: 602846. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:16:55,164][04457] Avg episode reward: [(0, '21.232')]
[2025-03-29 06:16:55,174][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000589_2412544.pth...
[2025-03-29 06:16:55,291][04731] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000379_1552384.pth
[2025-03-29 06:16:55,812][04744] Updated weights for policy 0, policy_version 590 (0.0014)
[2025-03-29 06:17:00,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3596.1). Total num frames: 2433024. Throughput: 0: 899.5. Samples: 608326. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:17:00,169][04457] Avg episode reward: [(0, '20.441')]
[2025-03-29 06:17:05,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2453504. Throughput: 0: 899.3. Samples: 611290. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:17:05,167][04457] Avg episode reward: [(0, '19.821')]
[2025-03-29 06:17:06,670][04744] Updated weights for policy 0, policy_version 600 (0.0019)
[2025-03-29 06:17:10,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3596.2). Total num frames: 2465792. Throughput: 0: 872.5. Samples: 616042. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:17:10,166][04457] Avg episode reward: [(0, '20.266')]
[2025-03-29 06:17:15,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2486272. Throughput: 0: 901.3. Samples: 621868. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:17:15,167][04457] Avg episode reward: [(0, '21.158')]
[2025-03-29 06:17:17,850][04744] Updated weights for policy 0, policy_version 610 (0.0014)
[2025-03-29 06:17:20,165][04457] Fps is (10 sec: 3685.2, 60 sec: 3549.7, 300 sec: 3582.2). Total num frames: 2502656. Throughput: 0: 900.3. Samples: 624804. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:17:20,167][04457] Avg episode reward: [(0, '21.384')]
[2025-03-29 06:17:25,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 2519040. Throughput: 0: 877.8. Samples: 629246. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:17:25,164][04457] Avg episode reward: [(0, '20.646')]
[2025-03-29 06:17:29,493][04744] Updated weights for policy 0, policy_version 620 (0.0015)
[2025-03-29 06:17:30,162][04457] Fps is (10 sec: 3687.6, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2539520. Throughput: 0: 899.2. Samples: 635308. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:17:30,163][04457] Avg episode reward: [(0, '21.935')]
[2025-03-29 06:17:35,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 2555904. Throughput: 0: 902.0. Samples: 638332. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:17:35,164][04457] Avg episode reward: [(0, '21.931')]
[2025-03-29 06:17:40,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3582.3). Total num frames: 2572288. Throughput: 0: 890.1. Samples: 642900. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:17:40,164][04457] Avg episode reward: [(0, '21.073')]
[2025-03-29 06:17:41,067][04744] Updated weights for policy 0, policy_version 630 (0.0017)
[2025-03-29 06:17:45,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3582.3). Total num frames: 2592768. Throughput: 0: 903.8. Samples: 648998. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:17:45,164][04457] Avg episode reward: [(0, '20.192')]
[2025-03-29 06:17:50,164][04457] Fps is (10 sec: 3685.8, 60 sec: 3549.8, 300 sec: 3582.3). Total num frames: 2609152. Throughput: 0: 898.4. Samples: 651720. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:17:50,169][04457] Avg episode reward: [(0, '20.931')]
[2025-03-29 06:17:52,730][04744] Updated weights for policy 0, policy_version 640 (0.0017)
[2025-03-29 06:17:55,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2629632. Throughput: 0: 902.5. Samples: 656654. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:17:55,164][04457] Avg episode reward: [(0, '21.675')]
[2025-03-29 06:18:00,162][04457] Fps is (10 sec: 4096.7, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2650112. Throughput: 0: 909.5. Samples: 662794. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:18:00,169][04457] Avg episode reward: [(0, '21.302')]
[2025-03-29 06:18:03,846][04744] Updated weights for policy 0, policy_version 650 (0.0017)
[2025-03-29 06:18:05,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3549.8, 300 sec: 3596.1). Total num frames: 2666496. Throughput: 0: 896.7. Samples: 665152. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:18:05,171][04457] Avg episode reward: [(0, '22.548')]
[2025-03-29 06:18:10,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 2686976. Throughput: 0: 917.5. Samples: 670532. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:18:10,164][04457] Avg episode reward: [(0, '22.785')]
[2025-03-29 06:18:14,173][04744] Updated weights for policy 0, policy_version 660 (0.0013)
[2025-03-29 06:18:15,164][04457] Fps is (10 sec: 3685.7, 60 sec: 3618.0, 300 sec: 3596.1). Total num frames: 2703360. Throughput: 0: 917.7. Samples: 676606. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:18:15,168][04457] Avg episode reward: [(0, '22.989')]
[2025-03-29 06:18:20,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.3, 300 sec: 3596.1). Total num frames: 2719744. Throughput: 0: 892.0. Samples: 678470. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:18:20,166][04457] Avg episode reward: [(0, '22.170')]
[2025-03-29 06:18:25,162][04457] Fps is (10 sec: 3687.1, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 2740224. Throughput: 0: 916.7. Samples: 684150. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:18:25,164][04457] Avg episode reward: [(0, '21.995')]
[2025-03-29 06:18:25,954][04744] Updated weights for policy 0, policy_version 670 (0.0013)
[2025-03-29 06:18:30,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2756608. Throughput: 0: 908.5. Samples: 689882. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:18:30,166][04457] Avg episode reward: [(0, '21.651')]
[2025-03-29 06:18:35,162][04457] Fps is (10 sec: 3276.9, 60 sec: 3618.1, 300 sec: 3596.1). Total num frames: 2772992. Throughput: 0: 891.1. Samples: 691818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:18:35,163][04457] Avg episode reward: [(0, '20.524')]
[2025-03-29 06:18:37,357][04744] Updated weights for policy 0, policy_version 680 (0.0014)
[2025-03-29 06:18:40,162][04457] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3596.2). Total num frames: 2793472. Throughput: 0: 917.6. Samples: 697946. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:18:40,165][04457] Avg episode reward: [(0, '19.712')]
[2025-03-29 06:18:45,163][04457] Fps is (10 sec: 3685.8, 60 sec: 3618.0, 300 sec: 3596.1). Total num frames: 2809856. Throughput: 0: 900.8. Samples: 703330. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:18:45,167][04457] Avg episode reward: [(0, '19.485')]
[2025-03-29 06:18:48,997][04744] Updated weights for policy 0, policy_version 690 (0.0016)
[2025-03-29 06:18:50,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.5, 300 sec: 3610.0). Total num frames: 2830336. Throughput: 0: 897.3. Samples: 705528. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:18:50,167][04457] Avg episode reward: [(0, '18.301')]
[2025-03-29 06:18:55,162][04457] Fps is (10 sec: 4096.6, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 2850816. Throughput: 0: 915.6. Samples: 711736. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:18:55,163][04457] Avg episode reward: [(0, '17.986')]
[2025-03-29 06:18:55,172][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000696_2850816.pth...
[2025-03-29 06:18:55,276][04731] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000484_1982464.pth
[2025-03-29 06:19:00,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3596.1). Total num frames: 2863104. Throughput: 0: 888.7. Samples: 716596. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:19:00,169][04457] Avg episode reward: [(0, '18.886')]
[2025-03-29 06:19:00,497][04744] Updated weights for policy 0, policy_version 700 (0.0016)
[2025-03-29 06:19:05,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3610.0). Total num frames: 2883584. Throughput: 0: 908.0. Samples: 719330. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:19:05,163][04457] Avg episode reward: [(0, '19.891')]
[2025-03-29 06:19:10,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 2904064. Throughput: 0: 918.5. Samples: 725480. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:19:10,165][04457] Avg episode reward: [(0, '21.116')]
[2025-03-29 06:19:10,394][04744] Updated weights for policy 0, policy_version 710 (0.0014)
[2025-03-29 06:19:15,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.3, 300 sec: 3610.0). Total num frames: 2920448. Throughput: 0: 894.6. Samples: 730140. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:19:15,165][04457] Avg episode reward: [(0, '22.268')]
[2025-03-29 06:19:20,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 2940928. Throughput: 0: 916.0. Samples: 733040. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:19:20,164][04457] Avg episode reward: [(0, '22.745')]
[2025-03-29 06:19:22,231][04744] Updated weights for policy 0, policy_version 720 (0.0013)
[2025-03-29 06:19:25,164][04457] Fps is (10 sec: 3685.7, 60 sec: 3618.0, 300 sec: 3610.0). Total num frames: 2957312. Throughput: 0: 914.9. Samples: 739118. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:19:25,165][04457] Avg episode reward: [(0, '22.977')]
[2025-03-29 06:19:30,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3610.1). Total num frames: 2973696. Throughput: 0: 897.9. Samples: 743734. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:19:30,163][04457] Avg episode reward: [(0, '21.847')]
[2025-03-29 06:19:33,760][04744] Updated weights for policy 0, policy_version 730 (0.0013)
[2025-03-29 06:19:35,162][04457] Fps is (10 sec: 3687.1, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 2994176. Throughput: 0: 914.9. Samples: 746700. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:19:35,164][04457] Avg episode reward: [(0, '21.290')]
[2025-03-29 06:19:40,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3596.2). Total num frames: 3010560. Throughput: 0: 910.4. Samples: 752702. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:19:40,171][04457] Avg episode reward: [(0, '21.502')]
[2025-03-29 06:19:45,162][04457] Fps is (10 sec: 3276.7, 60 sec: 3618.2, 300 sec: 3610.0). Total num frames: 3026944. Throughput: 0: 908.0. Samples: 757458. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:19:45,166][04457] Avg episode reward: [(0, '20.570')]
[2025-03-29 06:19:45,386][04744] Updated weights for policy 0, policy_version 740 (0.0014)
[2025-03-29 06:19:50,166][04457] Fps is (10 sec: 3685.0, 60 sec: 3617.9, 300 sec: 3610.0). Total num frames: 3047424. Throughput: 0: 914.1. Samples: 760470. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:19:50,167][04457] Avg episode reward: [(0, '19.356')]
[2025-03-29 06:19:55,162][04457] Fps is (10 sec: 3686.5, 60 sec: 3549.9, 300 sec: 3596.2). Total num frames: 3063808. Throughput: 0: 901.9. Samples: 766066. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2025-03-29 06:19:55,163][04457] Avg episode reward: [(0, '19.500')]
[2025-03-29 06:19:56,637][04744] Updated weights for policy 0, policy_version 750 (0.0015)
[2025-03-29 06:20:00,162][04457] Fps is (10 sec: 3687.8, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 3084288. Throughput: 0: 913.5. Samples: 771248. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2025-03-29 06:20:00,163][04457] Avg episode reward: [(0, '19.673')]
[2025-03-29 06:20:05,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 3104768. Throughput: 0: 916.9. Samples: 774302. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:20:05,168][04457] Avg episode reward: [(0, '19.835')]
[2025-03-29 06:20:07,022][04744] Updated weights for policy 0, policy_version 760 (0.0015)
[2025-03-29 06:20:10,162][04457] Fps is (10 sec: 3276.7, 60 sec: 3549.8, 300 sec: 3596.1). Total num frames: 3117056. Throughput: 0: 898.1. Samples: 779532. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:20:10,166][04457] Avg episode reward: [(0, '20.243')]
[2025-03-29 06:20:15,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3141632. Throughput: 0: 919.4. Samples: 785108. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:20:15,163][04457] Avg episode reward: [(0, '20.585')]
[2025-03-29 06:20:18,257][04744] Updated weights for policy 0, policy_version 770 (0.0020)
[2025-03-29 06:20:20,162][04457] Fps is (10 sec: 4096.2, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 3158016. Throughput: 0: 922.3. Samples: 788202. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:20:20,167][04457] Avg episode reward: [(0, '18.741')]
[2025-03-29 06:20:25,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3610.0). Total num frames: 3174400. Throughput: 0: 893.2. Samples: 792894. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2025-03-29 06:20:25,168][04457] Avg episode reward: [(0, '18.616')]
[2025-03-29 06:20:29,759][04744] Updated weights for policy 0, policy_version 780 (0.0018)
[2025-03-29 06:20:30,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3194880. Throughput: 0: 922.1. Samples: 798950. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:20:30,168][04457] Avg episode reward: [(0, '18.478')]
[2025-03-29 06:20:35,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3618.1, 300 sec: 3610.0). Total num frames: 3211264. Throughput: 0: 923.7. Samples: 802032. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:20:35,166][04457] Avg episode reward: [(0, '19.352')]
[2025-03-29 06:20:40,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3231744. Throughput: 0: 900.7. Samples: 806598. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:20:40,163][04457] Avg episode reward: [(0, '20.656')]
[2025-03-29 06:20:41,233][04744] Updated weights for policy 0, policy_version 790 (0.0014)
[2025-03-29 06:20:45,162][04457] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3610.0). Total num frames: 3248128. Throughput: 0: 922.4. Samples: 812758. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:20:45,168][04457] Avg episode reward: [(0, '21.552')]
[2025-03-29 06:20:50,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.6, 300 sec: 3610.0). Total num frames: 3268608. Throughput: 0: 921.2. Samples: 815754. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:20:50,165][04457] Avg episode reward: [(0, '23.085')]
[2025-03-29 06:20:52,849][04744] Updated weights for policy 0, policy_version 800 (0.0015)
[2025-03-29 06:20:55,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3284992. Throughput: 0: 905.8. Samples: 820294. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:20:55,168][04457] Avg episode reward: [(0, '22.459')]
[2025-03-29 06:20:55,176][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000802_3284992.pth...
[2025-03-29 06:20:55,296][04731] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000589_2412544.pth
[2025-03-29 06:21:00,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3305472. Throughput: 0: 915.7. Samples: 826314. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:21:00,169][04457] Avg episode reward: [(0, '22.341')]
[2025-03-29 06:21:03,783][04744] Updated weights for policy 0, policy_version 810 (0.0015)
[2025-03-29 06:21:05,165][04457] Fps is (10 sec: 3275.9, 60 sec: 3549.7, 300 sec: 3596.1). Total num frames: 3317760. Throughput: 0: 909.2. Samples: 829118. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:21:05,168][04457] Avg episode reward: [(0, '20.967')]
[2025-03-29 06:21:10,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3338240. Throughput: 0: 912.4. Samples: 833954. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:21:10,167][04457] Avg episode reward: [(0, '21.029')]
[2025-03-29 06:21:14,641][04744] Updated weights for policy 0, policy_version 820 (0.0017)
[2025-03-29 06:21:15,162][04457] Fps is (10 sec: 4097.2, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3358720. Throughput: 0: 911.1. Samples: 839950. Policy #0 lag: (min: 0.0, avg: 0.1, max: 1.0)
[2025-03-29 06:21:15,163][04457] Avg episode reward: [(0, '20.805')]
[2025-03-29 06:21:20,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3610.0). Total num frames: 3371008. Throughput: 0: 895.9. Samples: 842348. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:21:20,168][04457] Avg episode reward: [(0, '21.212')]
[2025-03-29 06:21:25,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3391488. Throughput: 0: 907.4. Samples: 847432. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:21:25,169][04457] Avg episode reward: [(0, '20.622')]
[2025-03-29 06:21:26,637][04744] Updated weights for policy 0, policy_version 830 (0.0014)
[2025-03-29 06:21:30,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3411968. Throughput: 0: 905.4. Samples: 853502. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:21:30,170][04457] Avg episode reward: [(0, '21.765')]
[2025-03-29 06:21:35,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3623.9). Total num frames: 3428352. Throughput: 0: 883.6. Samples: 855514. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:21:35,167][04457] Avg episode reward: [(0, '22.446')]
[2025-03-29 06:21:38,128][04744] Updated weights for policy 0, policy_version 840 (0.0016)
[2025-03-29 06:21:40,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3637.8). Total num frames: 3448832. Throughput: 0: 906.1. Samples: 861068. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:21:40,167][04457] Avg episode reward: [(0, '22.269')]
[2025-03-29 06:21:45,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3465216. Throughput: 0: 901.2. Samples: 866870. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:21:45,163][04457] Avg episode reward: [(0, '22.816')]
[2025-03-29 06:21:49,806][04744] Updated weights for policy 0, policy_version 850 (0.0019)
[2025-03-29 06:21:50,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3623.9). Total num frames: 3481600. Throughput: 0: 880.8. Samples: 868752. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:21:50,163][04457] Avg episode reward: [(0, '22.815')]
[2025-03-29 06:21:55,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3502080. Throughput: 0: 907.3. Samples: 874782. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:21:55,164][04457] Avg episode reward: [(0, '21.494')]
[2025-03-29 06:22:00,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3610.0). Total num frames: 3518464. Throughput: 0: 893.2. Samples: 880142. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:22:00,164][04457] Avg episode reward: [(0, '20.081')]
[2025-03-29 06:22:00,764][04744] Updated weights for policy 0, policy_version 860 (0.0016)
[2025-03-29 06:22:05,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.3, 300 sec: 3623.9). Total num frames: 3534848. Throughput: 0: 890.0. Samples: 882400. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:22:05,165][04457] Avg episode reward: [(0, '21.092')]
[2025-03-29 06:22:10,163][04457] Fps is (10 sec: 3686.2, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3555328. Throughput: 0: 911.5. Samples: 888448. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:22:10,164][04457] Avg episode reward: [(0, '19.569')]
[2025-03-29 06:22:11,277][04744] Updated weights for policy 0, policy_version 870 (0.0019)
[2025-03-29 06:22:15,164][04457] Fps is (10 sec: 3685.8, 60 sec: 3549.8, 300 sec: 3623.9). Total num frames: 3571712. Throughput: 0: 888.4. Samples: 893482. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:22:15,165][04457] Avg episode reward: [(0, '19.010')]
[2025-03-29 06:22:20,162][04457] Fps is (10 sec: 3686.6, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3592192. Throughput: 0: 902.4. Samples: 896124. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:22:20,166][04457] Avg episode reward: [(0, '19.074')]
[2025-03-29 06:22:22,942][04744] Updated weights for policy 0, policy_version 880 (0.0021)
[2025-03-29 06:22:25,162][04457] Fps is (10 sec: 4096.7, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3612672. Throughput: 0: 912.0. Samples: 902108. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:22:25,164][04457] Avg episode reward: [(0, '20.477')]
[2025-03-29 06:22:30,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3623.9). Total num frames: 3624960. Throughput: 0: 885.9. Samples: 906736. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-29 06:22:30,168][04457] Avg episode reward: [(0, '20.125')]
[2025-03-29 06:22:34,484][04744] Updated weights for policy 0, policy_version 890 (0.0014)
[2025-03-29 06:22:35,164][04457] Fps is (10 sec: 3276.1, 60 sec: 3618.0, 300 sec: 3637.8). Total num frames: 3645440. Throughput: 0: 911.5. Samples: 909772. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:22:35,166][04457] Avg episode reward: [(0, '19.245')]
[2025-03-29 06:22:40,163][04457] Fps is (10 sec: 4095.7, 60 sec: 3618.1, 300 sec: 3637.8). Total num frames: 3665920. Throughput: 0: 914.1. Samples: 915916. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:22:40,164][04457] Avg episode reward: [(0, '20.036')]
[2025-03-29 06:22:45,164][04457] Fps is (10 sec: 3686.5, 60 sec: 3618.0, 300 sec: 3637.8). Total num frames: 3682304. Throughput: 0: 897.7. Samples: 920538. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:22:45,165][04457] Avg episode reward: [(0, '20.342')]
[2025-03-29 06:22:45,957][04744] Updated weights for policy 0, policy_version 900 (0.0017)
[2025-03-29 06:22:50,162][04457] Fps is (10 sec: 3686.7, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3702784. Throughput: 0: 916.2. Samples: 923630. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:22:50,163][04457] Avg episode reward: [(0, '19.498')]
[2025-03-29 06:22:55,164][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.0, 300 sec: 3623.9). Total num frames: 3719168. Throughput: 0: 917.0. Samples: 929716. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:22:55,171][04457] Avg episode reward: [(0, '20.298')]
[2025-03-29 06:22:55,183][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000908_3719168.pth...
[2025-03-29 06:22:55,345][04731] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000696_2850816.pth
[2025-03-29 06:22:57,659][04744] Updated weights for policy 0, policy_version 910 (0.0018)
[2025-03-29 06:23:00,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3735552. Throughput: 0: 905.2. Samples: 934214. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:23:00,171][04457] Avg episode reward: [(0, '20.789')]
[2025-03-29 06:23:05,162][04457] Fps is (10 sec: 3687.0, 60 sec: 3686.4, 300 sec: 3623.9). Total num frames: 3756032. Throughput: 0: 914.4. Samples: 937272. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:23:05,164][04457] Avg episode reward: [(0, '21.153')]
[2025-03-29 06:23:07,660][04744] Updated weights for policy 0, policy_version 920 (0.0022)
[2025-03-29 06:23:10,162][04457] Fps is (10 sec: 3686.3, 60 sec: 3618.2, 300 sec: 3623.9). Total num frames: 3772416. Throughput: 0: 907.2. Samples: 942930. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:23:10,166][04457] Avg episode reward: [(0, '20.577')]
[2025-03-29 06:23:15,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3623.9). Total num frames: 3788800. Throughput: 0: 916.7. Samples: 947986. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:23:15,165][04457] Avg episode reward: [(0, '21.214')]
[2025-03-29 06:23:19,148][04744] Updated weights for policy 0, policy_version 930 (0.0017)
[2025-03-29 06:23:20,162][04457] Fps is (10 sec: 3686.5, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3809280. Throughput: 0: 916.5. Samples: 951012. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:23:20,167][04457] Avg episode reward: [(0, '22.990')]
[2025-03-29 06:23:25,164][04457] Fps is (10 sec: 3685.7, 60 sec: 3549.8, 300 sec: 3623.9). Total num frames: 3825664. Throughput: 0: 893.9. Samples: 956144. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:23:25,166][04457] Avg episode reward: [(0, '22.217')]
[2025-03-29 06:23:30,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 3846144. Throughput: 0: 913.1. Samples: 961624. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:23:30,164][04457] Avg episode reward: [(0, '22.455')]
[2025-03-29 06:23:30,758][04744] Updated weights for policy 0, policy_version 940 (0.0018)
[2025-03-29 06:23:35,162][04457] Fps is (10 sec: 4096.8, 60 sec: 3686.5, 300 sec: 3637.8). Total num frames: 3866624. Throughput: 0: 911.2. Samples: 964634. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:23:35,163][04457] Avg episode reward: [(0, '22.501')]
[2025-03-29 06:23:40,162][04457] Fps is (10 sec: 3276.8, 60 sec: 3549.9, 300 sec: 3623.9). Total num frames: 3878912. Throughput: 0: 884.0. Samples: 969496. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:23:40,167][04457] Avg episode reward: [(0, '23.118')]
[2025-03-29 06:23:42,499][04744] Updated weights for policy 0, policy_version 950 (0.0022)
[2025-03-29 06:23:45,162][04457] Fps is (10 sec: 3276.7, 60 sec: 3618.2, 300 sec: 3623.9). Total num frames: 3899392. Throughput: 0: 912.2. Samples: 975262. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2025-03-29 06:23:45,166][04457] Avg episode reward: [(0, '21.909')]
[2025-03-29 06:23:50,163][04457] Fps is (10 sec: 4095.8, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3919872. Throughput: 0: 912.2. Samples: 978322. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:23:50,164][04457] Avg episode reward: [(0, '22.678')]
[2025-03-29 06:23:54,200][04744] Updated weights for policy 0, policy_version 960 (0.0017)
[2025-03-29 06:23:55,162][04457] Fps is (10 sec: 3276.9, 60 sec: 3550.0, 300 sec: 3623.9). Total num frames: 3932160. Throughput: 0: 886.3. Samples: 982812. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:23:55,164][04457] Avg episode reward: [(0, '23.297')]
[2025-03-29 06:24:00,162][04457] Fps is (10 sec: 3277.0, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3952640. Throughput: 0: 908.3. Samples: 988860. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:24:00,166][04457] Avg episode reward: [(0, '23.941')]
[2025-03-29 06:24:04,727][04744] Updated weights for policy 0, policy_version 970 (0.0022)
[2025-03-29 06:24:05,162][04457] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3973120. Throughput: 0: 908.4. Samples: 991888. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-29 06:24:05,163][04457] Avg episode reward: [(0, '24.788')]
[2025-03-29 06:24:05,171][04731] Saving new best policy, reward=24.788!
[2025-03-29 06:24:10,162][04457] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3623.9). Total num frames: 3989504. Throughput: 0: 895.1. Samples: 996422. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-29 06:24:10,169][04457] Avg episode reward: [(0, '24.064')]
[2025-03-29 06:24:14,096][04457] Component Batcher_0 stopped!
[2025-03-29 06:24:14,097][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2025-03-29 06:24:14,098][04457] Component RolloutWorker_w0 process died already! Don't wait for it.
[2025-03-29 06:24:14,102][04457] Component RolloutWorker_w1 process died already! Don't wait for it.
[2025-03-29 06:24:14,109][04731] Stopping Batcher_0...
[2025-03-29 06:24:14,109][04731] Loop batcher_evt_loop terminating...
[2025-03-29 06:24:14,104][04457] Component RolloutWorker_w6 process died already! Don't wait for it.
[2025-03-29 06:24:14,206][04744] Weights refcount: 2 0
[2025-03-29 06:24:14,211][04744] Stopping InferenceWorker_p0-w0...
[2025-03-29 06:24:14,211][04744] Loop inference_proc0-0_evt_loop terminating...
[2025-03-29 06:24:14,211][04457] Component InferenceWorker_p0-w0 stopped!
[2025-03-29 06:24:14,252][04731] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000802_3284992.pth
[2025-03-29 06:24:14,268][04731] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2025-03-29 06:24:14,434][04457] Component LearnerWorker_p0 stopped!
[2025-03-29 06:24:14,439][04731] Stopping LearnerWorker_p0...
[2025-03-29 06:24:14,441][04731] Loop learner_proc0_evt_loop terminating...
[2025-03-29 06:24:14,528][04749] Stopping RolloutWorker_w5...
[2025-03-29 06:24:14,530][04748] Stopping RolloutWorker_w3...
[2025-03-29 06:24:14,530][04748] Loop rollout_proc3_evt_loop terminating...
[2025-03-29 06:24:14,532][04749] Loop rollout_proc5_evt_loop terminating...
[2025-03-29 06:24:14,528][04457] Component RolloutWorker_w5 stopped!
[2025-03-29 06:24:14,534][04457] Component RolloutWorker_w3 stopped!
[2025-03-29 06:24:14,546][04752] Stopping RolloutWorker_w7...
[2025-03-29 06:24:14,548][04752] Loop rollout_proc7_evt_loop terminating...
[2025-03-29 06:24:14,546][04457] Component RolloutWorker_w7 stopped!
[2025-03-29 06:24:14,691][04457] Component RolloutWorker_w2 stopped!
[2025-03-29 06:24:14,697][04747] Stopping RolloutWorker_w2...
[2025-03-29 06:24:14,704][04747] Loop rollout_proc2_evt_loop terminating...
[2025-03-29 06:24:14,739][04457] Component RolloutWorker_w4 stopped!
[2025-03-29 06:24:14,742][04457] Waiting for process learner_proc0 to stop...
[2025-03-29 06:24:14,745][04750] Stopping RolloutWorker_w4...
[2025-03-29 06:24:14,758][04750] Loop rollout_proc4_evt_loop terminating...
[2025-03-29 06:24:16,219][04457] Waiting for process inference_proc0-0 to join...
[2025-03-29 06:24:16,220][04457] Waiting for process rollout_proc0 to join...
[2025-03-29 06:24:16,230][04457] Waiting for process rollout_proc1 to join...
[2025-03-29 06:24:16,232][04457] Waiting for process rollout_proc2 to join...
[2025-03-29 06:24:16,937][04457] Waiting for process rollout_proc3 to join...
[2025-03-29 06:24:17,456][04457] Waiting for process rollout_proc4 to join...
[2025-03-29 06:24:17,458][04457] Waiting for process rollout_proc5 to join...
[2025-03-29 06:24:17,459][04457] Waiting for process rollout_proc6 to join...
[2025-03-29 06:24:17,460][04457] Waiting for process rollout_proc7 to join...
[2025-03-29 06:24:17,461][04457] Batcher 0 profile tree view:
batching: 23.7814, releasing_batches: 0.0287
[2025-03-29 06:24:17,462][04457] InferenceWorker_p0-w0 profile tree view:
wait_policy: 0.0168
  wait_policy_total: 433.1954
update_model: 9.5917
  weight_update: 0.0015
one_step: 0.0030
  handle_policy_step: 644.3335
    deserialize: 15.4057, stack: 3.7877, obs_to_device_normalize: 143.0670, forward: 340.1459, send_messages: 23.5950
    prepare_outputs: 90.4842
      to_cpu: 56.3200
[2025-03-29 06:24:17,463][04457] Learner 0 profile tree view:
misc: 0.0039, prepare_batch: 12.7524
train: 70.2065
  epoch_init: 0.0075, minibatch_init: 0.0070, losses_postprocess: 0.5790, kl_divergence: 0.5858, after_optimizer: 33.5108
  calculate_losses: 23.6525
    losses_init: 0.0083, forward_head: 1.3624, bptt_initial: 15.6632, tail: 0.9594, advantages_returns: 0.2563, losses: 3.1843
    bptt: 1.9398
      bptt_forward_core: 1.8700
  update: 11.3344
    clip: 0.8349
[2025-03-29 06:24:17,465][04457] RolloutWorker_w7 profile tree view:
wait_for_trajectories: 0.4048, enqueue_policy_requests: 107.9757, env_step: 879.4914, overhead: 17.3657, complete_rollouts: 9.0303
save_policy_outputs: 31.5106
  split_output_tensors: 10.1017
[2025-03-29 06:24:17,466][04457] Loop Runner_EvtLoop terminating...
[2025-03-29 06:24:17,468][04457] Runner profile tree view:
main_loop: 1155.8328
[2025-03-29 06:24:17,469][04457] Collected {0: 4005888}, FPS: 3465.8
[2025-03-29 06:24:18,290][04457] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2025-03-29 06:24:18,291][04457] Overriding arg 'num_workers' with value 1 passed from command line
[2025-03-29 06:24:18,292][04457] Adding new argument 'no_render'=True that is not in the saved config file!
[2025-03-29 06:24:18,292][04457] Adding new argument 'save_video'=True that is not in the saved config file!
[2025-03-29 06:24:18,294][04457] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2025-03-29 06:24:18,295][04457] Adding new argument 'video_name'=None that is not in the saved config file!
[2025-03-29 06:24:18,296][04457] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file!
[2025-03-29 06:24:18,298][04457] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2025-03-29 06:24:18,299][04457] Adding new argument 'push_to_hub'=False that is not in the saved config file!
[2025-03-29 06:24:18,300][04457] Adding new argument 'hf_repository'=None that is not in the saved config file!
[2025-03-29 06:24:18,301][04457] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2025-03-29 06:24:18,302][04457] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2025-03-29 06:24:18,304][04457] Adding new argument 'train_script'=None that is not in the saved config file!
[2025-03-29 06:24:18,305][04457] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2025-03-29 06:24:18,305][04457] Using frameskip 1 and render_action_repeat=4 for evaluation
[2025-03-29 06:24:18,359][04457] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-29 06:24:18,364][04457] RunningMeanStd input shape: (3, 72, 128)
[2025-03-29 06:24:18,366][04457] RunningMeanStd input shape: (1,)
[2025-03-29 06:24:18,390][04457] ConvEncoder: input_channels=3
[2025-03-29 06:24:18,554][04457] Conv encoder output size: 512
[2025-03-29 06:24:18,555][04457] Policy head output size: 512
[2025-03-29 06:24:18,756][04457] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2025-03-29 06:24:19,843][04457] Num frames 100...
[2025-03-29 06:24:19,987][04457] Num frames 200...
[2025-03-29 06:24:20,123][04457] Num frames 300...
[2025-03-29 06:24:20,258][04457] Num frames 400...
[2025-03-29 06:24:20,404][04457] Num frames 500...
[2025-03-29 06:24:20,544][04457] Num frames 600...
[2025-03-29 06:24:20,610][04457] Avg episode rewards: #0: 9.080, true rewards: #0: 6.080
[2025-03-29 06:24:20,611][04457] Avg episode reward: 9.080, avg true_objective: 6.080
[2025-03-29 06:24:20,735][04457] Num frames 700...
[2025-03-29 06:24:20,871][04457] Num frames 800...
[2025-03-29 06:24:21,006][04457] Num frames 900...
[2025-03-29 06:24:21,146][04457] Num frames 1000...
[2025-03-29 06:24:21,281][04457] Num frames 1100...
[2025-03-29 06:24:21,423][04457] Num frames 1200...
[2025-03-29 06:24:21,572][04457] Num frames 1300...
[2025-03-29 06:24:21,727][04457] Num frames 1400...
[2025-03-29 06:24:21,866][04457] Num frames 1500...
[2025-03-29 06:24:22,004][04457] Num frames 1600...
[2025-03-29 06:24:22,146][04457] Avg episode rewards: #0: 17.320, true rewards: #0: 8.320
[2025-03-29 06:24:22,147][04457] Avg episode reward: 17.320, avg true_objective: 8.320
[2025-03-29 06:24:22,198][04457] Num frames 1700...
[2025-03-29 06:24:22,335][04457] Num frames 1800...
[2025-03-29 06:24:22,488][04457] Num frames 1900...
[2025-03-29 06:24:22,620][04457] Num frames 2000...
[2025-03-29 06:24:22,755][04457] Num frames 2100...
[2025-03-29 06:24:22,891][04457] Num frames 2200...
[2025-03-29 06:24:23,042][04457] Avg episode rewards: #0: 15.573, true rewards: #0: 7.573
[2025-03-29 06:24:23,043][04457] Avg episode reward: 15.573, avg true_objective: 7.573
[2025-03-29 06:24:23,083][04457] Num frames 2300...
[2025-03-29 06:24:23,218][04457] Num frames 2400...
[2025-03-29 06:24:23,352][04457] Num frames 2500...
[2025-03-29 06:24:23,502][04457] Num frames 2600...
[2025-03-29 06:24:23,638][04457] Num frames 2700...
[2025-03-29 06:24:23,772][04457] Num frames 2800...
[2025-03-29 06:24:23,910][04457] Num frames 2900...
[2025-03-29 06:24:24,048][04457] Num frames 3000...
[2025-03-29 06:24:24,115][04457] Avg episode rewards: #0: 15.020, true rewards: #0: 7.520
[2025-03-29 06:24:24,115][04457] Avg episode reward: 15.020, avg true_objective: 7.520
[2025-03-29 06:24:24,240][04457] Num frames 3100...
[2025-03-29 06:24:24,375][04457] Num frames 3200...
[2025-03-29 06:24:24,522][04457] Num frames 3300...
[2025-03-29 06:24:24,665][04457] Num frames 3400...
[2025-03-29 06:24:24,839][04457] Avg episode rewards: #0: 13.576, true rewards: #0: 6.976
[2025-03-29 06:24:24,840][04457] Avg episode reward: 13.576, avg true_objective: 6.976
[2025-03-29 06:24:24,861][04457] Num frames 3500...
[2025-03-29 06:24:25,006][04457] Num frames 3600...
[2025-03-29 06:24:25,151][04457] Num frames 3700...
[2025-03-29 06:24:25,285][04457] Num frames 3800...
[2025-03-29 06:24:25,421][04457] Num frames 3900...
[2025-03-29 06:24:25,571][04457] Num frames 4000...
[2025-03-29 06:24:25,706][04457] Num frames 4100...
[2025-03-29 06:24:25,844][04457] Num frames 4200...
[2025-03-29 06:24:25,980][04457] Num frames 4300...
[2025-03-29 06:24:26,121][04457] Num frames 4400...
[2025-03-29 06:24:26,259][04457] Num frames 4500...
[2025-03-29 06:24:26,395][04457] Num frames 4600...
[2025-03-29 06:24:26,551][04457] Avg episode rewards: #0: 15.958, true rewards: #0: 7.792
[2025-03-29 06:24:26,552][04457] Avg episode reward: 15.958, avg true_objective: 7.792
[2025-03-29 06:24:26,588][04457] Num frames 4700...
[2025-03-29 06:24:26,728][04457] Num frames 4800...
[2025-03-29 06:24:26,868][04457] Num frames 4900...
[2025-03-29 06:24:27,004][04457] Num frames 5000...
[2025-03-29 06:24:27,151][04457] Num frames 5100...
[2025-03-29 06:24:27,286][04457] Num frames 5200...
[2025-03-29 06:24:27,367][04457] Avg episode rewards: #0: 15.456, true rewards: #0: 7.456
[2025-03-29 06:24:27,368][04457] Avg episode reward: 15.456, avg true_objective: 7.456
[2025-03-29 06:24:27,481][04457] Num frames 5300...
[2025-03-29 06:24:27,627][04457] Num frames 5400...
[2025-03-29 06:24:27,766][04457] Num frames 5500...
[2025-03-29 06:24:27,906][04457] Num frames 5600...
[2025-03-29 06:24:28,044][04457] Num frames 5700...
[2025-03-29 06:24:28,180][04457] Num frames 5800...
[2025-03-29 06:24:28,358][04457] Avg episode rewards: #0: 15.114, true rewards: #0: 7.364
[2025-03-29 06:24:28,359][04457] Avg episode reward: 15.114, avg true_objective: 7.364
[2025-03-29 06:24:28,374][04457] Num frames 5900...
[2025-03-29 06:24:28,511][04457] Num frames 6000...
[2025-03-29 06:24:28,659][04457] Num frames 6100...
[2025-03-29 06:24:28,802][04457] Num frames 6200...
[2025-03-29 06:24:28,941][04457] Num frames 6300...
[2025-03-29 06:24:29,077][04457] Num frames 6400...
[2025-03-29 06:24:29,212][04457] Num frames 6500...
[2025-03-29 06:24:29,349][04457] Num frames 6600...
[2025-03-29 06:24:29,487][04457] Num frames 6700...
[2025-03-29 06:24:29,628][04457] Num frames 6800...
[2025-03-29 06:24:29,798][04457] Num frames 6900...
[2025-03-29 06:24:29,908][04457] Avg episode rewards: #0: 16.252, true rewards: #0: 7.697
[2025-03-29 06:24:29,909][04457] Avg episode reward: 16.252, avg true_objective: 7.697
[2025-03-29 06:24:30,048][04457] Num frames 7000...
[2025-03-29 06:24:30,234][04457] Num frames 7100...
[2025-03-29 06:24:30,412][04457] Num frames 7200...
[2025-03-29 06:24:30,599][04457] Num frames 7300...
[2025-03-29 06:24:30,779][04457] Num frames 7400...
[2025-03-29 06:24:30,953][04457] Num frames 7500...
[2025-03-29 06:24:31,141][04457] Num frames 7600...
[2025-03-29 06:24:31,332][04457] Num frames 7700...
[2025-03-29 06:24:31,523][04457] Num frames 7800...
[2025-03-29 06:24:31,711][04457] Num frames 7900...
[2025-03-29 06:24:31,851][04457] Num frames 8000...
[2025-03-29 06:24:31,991][04457] Num frames 8100...
[2025-03-29 06:24:32,062][04457] Avg episode rewards: #0: 17.211, true rewards: #0: 8.111
[2025-03-29 06:24:32,063][04457] Avg episode reward: 17.211, avg true_objective: 8.111
[2025-03-29 06:25:26,011][04457] Replay video saved to /content/train_dir/default_experiment/replay.mp4!
[2025-03-29 06:26:36,386][04457] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json
[2025-03-29 06:26:36,387][04457] Overriding arg 'num_workers' with value 1 passed from command line
[2025-03-29 06:26:36,388][04457] Adding new argument 'no_render'=True that is not in the saved config file!
[2025-03-29 06:26:36,389][04457] Adding new argument 'save_video'=True that is not in the saved config file!
[2025-03-29 06:26:36,390][04457] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file!
[2025-03-29 06:26:36,391][04457] Adding new argument 'video_name'=None that is not in the saved config file!
[2025-03-29 06:26:36,391][04457] Adding new argument 'max_num_frames'=100000 that is not in the saved config file!
[2025-03-29 06:26:36,392][04457] Adding new argument 'max_num_episodes'=10 that is not in the saved config file!
[2025-03-29 06:26:36,394][04457] Adding new argument 'push_to_hub'=True that is not in the saved config file!
[2025-03-29 06:26:36,395][04457] Adding new argument 'hf_repository'='hnj0022/rl_course_vizdoom_health_gathering_supreme' that is not in the saved config file!
[2025-03-29 06:26:36,396][04457] Adding new argument 'policy_index'=0 that is not in the saved config file!
[2025-03-29 06:26:36,397][04457] Adding new argument 'eval_deterministic'=False that is not in the saved config file!
[2025-03-29 06:26:36,398][04457] Adding new argument 'train_script'=None that is not in the saved config file!
[2025-03-29 06:26:36,399][04457] Adding new argument 'enjoy_script'=None that is not in the saved config file!
[2025-03-29 06:26:36,400][04457] Using frameskip 1 and render_action_repeat=4 for evaluation
[2025-03-29 06:26:36,426][04457] RunningMeanStd input shape: (3, 72, 128)
[2025-03-29 06:26:36,428][04457] RunningMeanStd input shape: (1,)
[2025-03-29 06:26:36,441][04457] ConvEncoder: input_channels=3
[2025-03-29 06:26:36,482][04457] Conv encoder output size: 512
[2025-03-29 06:26:36,483][04457] Policy head output size: 512
[2025-03-29 06:26:36,501][04457] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth...
[2025-03-29 06:26:36,952][04457] Num frames 100...
[2025-03-29 06:26:37,095][04457] Num frames 200...
[2025-03-29 06:26:37,230][04457] Num frames 300...
[2025-03-29 06:26:37,365][04457] Num frames 400...
[2025-03-29 06:26:37,501][04457] Num frames 500...
[2025-03-29 06:26:37,638][04457] Num frames 600...
[2025-03-29 06:26:37,772][04457] Num frames 700...
[2025-03-29 06:26:37,903][04457] Num frames 800...
[2025-03-29 06:26:38,038][04457] Num frames 900...
[2025-03-29 06:26:38,214][04457] Avg episode rewards: #0: 23.880, true rewards: #0: 9.880
[2025-03-29 06:26:38,215][04457] Avg episode reward: 23.880, avg true_objective: 9.880
[2025-03-29 06:26:38,234][04457] Num frames 1000...
[2025-03-29 06:26:38,364][04457] Num frames 1100...
[2025-03-29 06:26:38,499][04457] Num frames 1200...
[2025-03-29 06:26:38,631][04457] Num frames 1300...
[2025-03-29 06:26:38,763][04457] Num frames 1400...
[2025-03-29 06:26:38,896][04457] Num frames 1500...
[2025-03-29 06:26:39,026][04457] Num frames 1600...
[2025-03-29 06:26:39,167][04457] Num frames 1700...
[2025-03-29 06:26:39,299][04457] Num frames 1800...
[2025-03-29 06:26:39,435][04457] Num frames 1900...
[2025-03-29 06:26:39,572][04457] Num frames 2000...
[2025-03-29 06:26:39,705][04457] Num frames 2100...
[2025-03-29 06:26:39,840][04457] Num frames 2200...
[2025-03-29 06:26:39,976][04457] Num frames 2300...
[2025-03-29 06:26:40,121][04457] Num frames 2400...
[2025-03-29 06:26:40,255][04457] Num frames 2500...
[2025-03-29 06:26:40,405][04457] Num frames 2600...
[2025-03-29 06:26:40,465][04457] Avg episode rewards: #0: 32.015, true rewards: #0: 13.015
[2025-03-29 06:26:40,466][04457] Avg episode reward: 32.015, avg true_objective: 13.015
[2025-03-29 06:26:40,629][04457] Num frames 2700...
[2025-03-29 06:26:40,819][04457] Num frames 2800...
[2025-03-29 06:26:40,997][04457] Num frames 2900...
[2025-03-29 06:26:41,181][04457] Num frames 3000...
[2025-03-29 06:26:41,365][04457] Num frames 3100...
[2025-03-29 06:26:41,549][04457] Num frames 3200...
[2025-03-29 06:26:41,726][04457] Num frames 3300...
[2025-03-29 06:26:41,904][04457] Num frames 3400...
[2025-03-29 06:26:42,088][04457] Num frames 3500...
[2025-03-29 06:26:42,272][04457] Avg episode rewards: #0: 29.210, true rewards: #0: 11.877
[2025-03-29 06:26:42,273][04457] Avg episode reward: 29.210, avg true_objective: 11.877
[2025-03-29 06:26:42,343][04457] Num frames 3600...
[2025-03-29 06:26:42,537][04457] Num frames 3700...
[2025-03-29 06:26:42,686][04457] Num frames 3800...
[2025-03-29 06:26:42,824][04457] Num frames 3900...
[2025-03-29 06:26:42,955][04457] Num frames 4000...
[2025-03-29 06:26:43,089][04457] Num frames 4100...
[2025-03-29 06:26:43,231][04457] Num frames 4200...
[2025-03-29 06:26:43,372][04457] Num frames 4300...
[2025-03-29 06:26:43,516][04457] Num frames 4400...
[2025-03-29 06:26:43,654][04457] Num frames 4500...
[2025-03-29 06:26:43,791][04457] Num frames 4600...
[2025-03-29 06:26:43,929][04457] Num frames 4700...
[2025-03-29 06:26:44,067][04457] Num frames 4800...
[2025-03-29 06:26:44,208][04457] Num frames 4900...
[2025-03-29 06:26:44,327][04457] Avg episode rewards: #0: 31.097, true rewards: #0: 12.347
[2025-03-29 06:26:44,328][04457] Avg episode reward: 31.097, avg true_objective: 12.347
[2025-03-29 06:26:44,413][04457] Num frames 5000...
[2025-03-29 06:26:44,550][04457] Num frames 5100...
[2025-03-29 06:26:44,686][04457] Num frames 5200...
[2025-03-29 06:26:44,824][04457] Num frames 5300...
[2025-03-29 06:26:44,960][04457] Num frames 5400...
[2025-03-29 06:26:45,096][04457] Num frames 5500...
[2025-03-29 06:26:45,237][04457] Num frames 5600...
[2025-03-29 06:26:45,380][04457] Num frames 5700...
[2025-03-29 06:26:45,516][04457] Num frames 5800...
[2025-03-29 06:26:45,653][04457] Num frames 5900...
[2025-03-29 06:26:45,792][04457] Num frames 6000...
[2025-03-29 06:26:45,857][04457] Avg episode rewards: #0: 29.410, true rewards: #0: 12.010
[2025-03-29 06:26:45,858][04457] Avg episode reward: 29.410, avg true_objective: 12.010
[2025-03-29 06:26:45,984][04457] Num frames 6100...
[2025-03-29 06:26:46,125][04457] Num frames 6200...
[2025-03-29 06:26:46,258][04457] Num frames 6300...
[2025-03-29 06:26:46,400][04457] Num frames 6400...
[2025-03-29 06:26:46,539][04457] Num frames 6500...
[2025-03-29 06:26:46,677][04457] Num frames 6600...
[2025-03-29 06:26:46,808][04457] Num frames 6700...
[2025-03-29 06:26:46,941][04457] Num frames 6800...
[2025-03-29 06:26:47,071][04457] Num frames 6900...
[2025-03-29 06:26:47,207][04457] Num frames 7000...
[2025-03-29 06:26:47,352][04457] Num frames 7100...
[2025-03-29 06:26:47,489][04457] Num frames 7200...
[2025-03-29 06:26:47,662][04457] Avg episode rewards: #0: 29.975, true rewards: #0: 12.142
[2025-03-29 06:26:47,663][04457] Avg episode reward: 29.975, avg true_objective: 12.142
[2025-03-29 06:26:47,685][04457] Num frames 7300...
[2025-03-29 06:26:47,821][04457] Num frames 7400...
[2025-03-29 06:26:47,957][04457] Num frames 7500...
[2025-03-29 06:26:48,091][04457] Num frames 7600...
[2025-03-29 06:26:48,231][04457] Num frames 7700...
[2025-03-29 06:26:48,377][04457] Num frames 7800...
[2025-03-29 06:26:48,517][04457] Num frames 7900...
[2025-03-29 06:26:48,657][04457] Num frames 8000...
[2025-03-29 06:26:48,794][04457] Num frames 8100...
[2025-03-29 06:26:48,929][04457] Num frames 8200...
[2025-03-29 06:26:49,072][04457] Num frames 8300...
[2025-03-29 06:26:49,217][04457] Num frames 8400...
[2025-03-29 06:26:49,324][04457] Avg episode rewards: #0: 29.339, true rewards: #0: 12.053
[2025-03-29 06:26:49,325][04457] Avg episode reward: 29.339, avg true_objective: 12.053
[2025-03-29 06:26:49,415][04457] Num frames 8500...
[2025-03-29 06:26:49,556][04457] Num frames 8600...
[2025-03-29 06:26:49,695][04457] Num frames 8700...
[2025-03-29 06:26:49,833][04457] Num frames 8800...
[2025-03-29 06:26:50,006][04457] Avg episode rewards: #0: 26.856, true rewards: #0: 11.106
[2025-03-29 06:26:50,007][04457] Avg episode reward: 26.856, avg true_objective: 11.106
[2025-03-29 06:26:50,028][04457] Num frames 8900...
[2025-03-29 06:26:50,165][04457] Num frames 9000...
[2025-03-29 06:26:50,303][04457] Num frames 9100...
[2025-03-29 06:26:50,448][04457] Num frames 9200...
[2025-03-29 06:26:50,585][04457] Num frames 9300...
[2025-03-29 06:26:50,719][04457] Num frames 9400...
[2025-03-29 06:26:50,866][04457] Num frames 9500...
[2025-03-29 06:26:51,004][04457] Num frames 9600...
[2025-03-29 06:26:51,140][04457] Num frames 9700...
[2025-03-29 06:26:51,274][04457] Num frames 9800...
[2025-03-29 06:26:51,410][04457] Num frames 9900...
[2025-03-29 06:26:51,558][04457] Num frames 10000...
[2025-03-29 06:26:51,697][04457] Num frames 10100...
[2025-03-29 06:26:51,834][04457] Num frames 10200...
[2025-03-29 06:26:51,970][04457] Num frames 10300...
[2025-03-29 06:26:52,107][04457] Num frames 10400...
[2025-03-29 06:26:52,245][04457] Num frames 10500...
[2025-03-29 06:26:52,380][04457] Num frames 10600...
[2025-03-29 06:26:52,455][04457] Avg episode rewards: #0: 28.348, true rewards: #0: 11.792
[2025-03-29 06:26:52,456][04457] Avg episode reward: 28.348, avg true_objective: 11.792
[2025-03-29 06:26:52,596][04457] Num frames 10700...
[2025-03-29 06:26:52,782][04457] Num frames 10800...
[2025-03-29 06:26:52,966][04457] Num frames 10900...
[2025-03-29 06:26:53,166][04457] Num frames 11000...
[2025-03-29 06:26:53,356][04457] Num frames 11100...
[2025-03-29 06:26:53,559][04457] Num frames 11200...
[2025-03-29 06:26:53,735][04457] Num frames 11300...
[2025-03-29 06:26:53,911][04457] Num frames 11400...
[2025-03-29 06:26:54,089][04457] Num frames 11500...
[2025-03-29 06:26:54,274][04457] Num frames 11600...
[2025-03-29 06:26:54,473][04457] Avg episode rewards: #0: 27.976, true rewards: #0: 11.676
[2025-03-29 06:26:54,474][04457] Avg episode reward: 27.976, avg true_objective: 11.676
[2025-03-29 06:28:11,595][04457] Replay video saved to /content/train_dir/default_experiment/replay.mp4!