|
[2025-03-22 15:39:06,957][03219] Saving configuration to /content/train_dir/default_experiment/config.json...
[2025-03-22 15:39:06,959][03219] Rollout worker 0 uses device cpu
[2025-03-22 15:39:06,960][03219] Rollout worker 1 uses device cpu
[2025-03-22 15:39:06,961][03219] Rollout worker 2 uses device cpu
[2025-03-22 15:39:06,962][03219] Rollout worker 3 uses device cpu
[2025-03-22 15:39:06,963][03219] Rollout worker 4 uses device cpu
[2025-03-22 15:39:06,964][03219] Rollout worker 5 uses device cpu
[2025-03-22 15:39:06,966][03219] Rollout worker 6 uses device cpu
[2025-03-22 15:39:06,967][03219] Rollout worker 7 uses device cpu
[2025-03-22 15:39:07,114][03219] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-22 15:39:07,114][03219] InferenceWorker_p0-w0: min num requests: 2
[2025-03-22 15:39:07,146][03219] Starting all processes...
[2025-03-22 15:39:07,147][03219] Starting process learner_proc0
[2025-03-22 15:39:07,199][03219] Starting all processes...
[2025-03-22 15:39:07,208][03219] Starting process inference_proc0-0
[2025-03-22 15:39:07,209][03219] Starting process rollout_proc0
[2025-03-22 15:39:07,209][03219] Starting process rollout_proc1
[2025-03-22 15:39:07,211][03219] Starting process rollout_proc2
[2025-03-22 15:39:07,211][03219] Starting process rollout_proc3
[2025-03-22 15:39:07,211][03219] Starting process rollout_proc4
[2025-03-22 15:39:07,211][03219] Starting process rollout_proc5
[2025-03-22 15:39:07,211][03219] Starting process rollout_proc6
[2025-03-22 15:39:07,211][03219] Starting process rollout_proc7
[2025-03-22 15:39:25,225][03414] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-22 15:39:25,225][03414] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0
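
Each process restricts itself to its assigned GPU by setting CUDA_VISIBLE_DEVICES before CUDA initializes, so "cuda:0" always refers to that process's own device. A minimal sketch of the pattern (not Sample Factory's actual code; the helper name is hypothetical):

    import os

    def pin_process_to_gpus(gpu_indices):
        # Must run before the first CUDA call in this process; afterwards the
        # process sees only the listed devices, renumbered from 0.
        os.environ["CUDA_VISIBLE_DEVICES"] = ",".join(str(i) for i in gpu_indices)

    pin_process_to_gpus([0])  # matches the log: GPU indices [0] -> "Num visible devices: 1"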
|
[2025-03-22 15:39:25,310][03414] Num visible devices: 1
[2025-03-22 15:39:25,326][03414] Starting seed is not provided
[2025-03-22 15:39:25,326][03414] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-22 15:39:25,326][03414] Initializing actor-critic model on device cuda:0
[2025-03-22 15:39:25,327][03414] RunningMeanStd input shape: (3, 72, 128)
[2025-03-22 15:39:25,332][03414] RunningMeanStd input shape: (1,)
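
The two RunningMeanStd modules keep running statistics for observations (shape (3, 72, 128)) and for returns (shape (1,)), so both can be normalized to roughly zero mean and unit variance. A simplified stand-in for RunningMeanStdInPlace, using the standard parallel-variance update (an illustrative sketch, assuming NumPy):

    import numpy as np

    class RunningMeanStd:
        def __init__(self, shape, eps=1e-4):
            self.mean = np.zeros(shape, dtype=np.float64)
            self.var = np.ones(shape, dtype=np.float64)
            self.count = eps  # avoids division by zero before the first update

        def update(self, batch):
            # Chan et al. parallel mean/variance combination over the batch axis.
            b_mean, b_var, b_count = batch.mean(0), batch.var(0), batch.shape[0]
            delta = b_mean - self.mean
            total = self.count + b_count
            self.mean = self.mean + delta * b_count / total
            m_a = self.var * self.count
            m_b = b_var * b_count
            self.var = (m_a + m_b + delta ** 2 * self.count * b_count / total) / total
            self.count = total

        def normalize(self, x):
            return (x - self.mean) / np.sqrt(self.var + 1e-8)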
|
[2025-03-22 15:39:25,417][03414] ConvEncoder: input_channels=3
[2025-03-22 15:39:25,484][03428] Worker 1 uses CPU cores [1]
[2025-03-22 15:39:25,765][03429] Worker 0 uses CPU cores [0]
[2025-03-22 15:39:25,873][03435] Worker 7 uses CPU cores [1]
[2025-03-22 15:39:25,888][03432] Worker 4 uses CPU cores [0]
[2025-03-22 15:39:25,911][03431] Worker 3 uses CPU cores [1]
[2025-03-22 15:39:25,921][03434] Worker 6 uses CPU cores [0]
[2025-03-22 15:39:25,954][03430] Worker 2 uses CPU cores [0]
[2025-03-22 15:39:25,971][03427] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-22 15:39:25,972][03427] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0
[2025-03-22 15:39:25,998][03427] Num visible devices: 1
[2025-03-22 15:39:26,039][03433] Worker 5 uses CPU cores [1]
[2025-03-22 15:39:26,057][03414] Conv encoder output size: 512
[2025-03-22 15:39:26,057][03414] Policy head output size: 512
[2025-03-22 15:39:26,111][03414] Created Actor Critic model with architecture:
[2025-03-22 15:39:26,111][03414] ActorCriticSharedWeights(
  (obs_normalizer): ObservationNormalizer(
    (running_mean_std): RunningMeanStdDictInPlace(
      (running_mean_std): ModuleDict(
        (obs): RunningMeanStdInPlace()
      )
    )
  )
  (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace)
  (encoder): VizdoomEncoder(
    (basic_encoder): ConvEncoder(
      (enc): RecursiveScriptModule(
        original_name=ConvEncoderImpl
        (conv_head): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Conv2d)
          (1): RecursiveScriptModule(original_name=ELU)
          (2): RecursiveScriptModule(original_name=Conv2d)
          (3): RecursiveScriptModule(original_name=ELU)
          (4): RecursiveScriptModule(original_name=Conv2d)
          (5): RecursiveScriptModule(original_name=ELU)
        )
        (mlp_layers): RecursiveScriptModule(
          original_name=Sequential
          (0): RecursiveScriptModule(original_name=Linear)
          (1): RecursiveScriptModule(original_name=ELU)
        )
      )
    )
  )
  (core): ModelCoreRNN(
    (core): GRU(512, 512)
  )
  (decoder): MlpDecoder(
    (mlp): Identity()
  )
  (critic_linear): Linear(in_features=512, out_features=1, bias=True)
  (action_parameterization): ActionParameterizationDefault(
    (distribution_linear): Linear(in_features=512, out_features=5, bias=True)
  )
)
[2025-03-22 15:39:26,358][03414] Using optimizer <class 'torch.optim.adam.Adam'>
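
The printed model is a shared-weights actor-critic: a three-layer ELU conv encoder feeding a 512-unit linear layer, a GRU core, and two heads (a scalar value head and a 5-logit action head), trained with torch.optim.Adam. An approximately equivalent PyTorch module, as a sketch (the conv kernel sizes/strides and the Adam learning rate are assumed from Sample Factory defaults, not shown in the log):

    import torch
    import torch.nn as nn

    class ActorCriticSketch(nn.Module):
        def __init__(self, num_actions=5, hidden=512):
            super().__init__()
            self.conv_head = nn.Sequential(              # ConvEncoderImpl.conv_head
                nn.Conv2d(3, 32, 8, stride=4), nn.ELU(),
                nn.Conv2d(32, 64, 4, stride=2), nn.ELU(),
                nn.Conv2d(64, 128, 3, stride=2), nn.ELU(),
            )
            with torch.no_grad():  # infer flattened size for (3, 72, 128) inputs
                n_flat = self.conv_head(torch.zeros(1, 3, 72, 128)).flatten(1).shape[1]
            self.mlp_layers = nn.Sequential(nn.Linear(n_flat, hidden), nn.ELU())
            self.core = nn.GRU(hidden, hidden)           # ModelCoreRNN: GRU(512, 512)
            self.critic_linear = nn.Linear(hidden, 1)    # value head
            self.distribution_linear = nn.Linear(hidden, num_actions)  # action logits

        def forward(self, obs, rnn_state=None):
            x = self.mlp_layers(self.conv_head(obs).flatten(1))
            x, rnn_state = self.core(x.unsqueeze(0), rnn_state)
            x = x.squeeze(0)
            return self.distribution_linear(x), self.critic_linear(x), rnn_state

    model = ActorCriticSketch()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr assumed, not in the log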
|
[2025-03-22 15:39:27,108][03219] Heartbeat connected on Batcher_0
[2025-03-22 15:39:27,114][03219] Heartbeat connected on InferenceWorker_p0-w0
[2025-03-22 15:39:27,121][03219] Heartbeat connected on RolloutWorker_w0
[2025-03-22 15:39:27,125][03219] Heartbeat connected on RolloutWorker_w1
[2025-03-22 15:39:27,132][03219] Heartbeat connected on RolloutWorker_w3
[2025-03-22 15:39:27,132][03219] Heartbeat connected on RolloutWorker_w2
[2025-03-22 15:39:27,135][03219] Heartbeat connected on RolloutWorker_w4
[2025-03-22 15:39:27,142][03219] Heartbeat connected on RolloutWorker_w6
[2025-03-22 15:39:27,144][03219] Heartbeat connected on RolloutWorker_w5
[2025-03-22 15:39:27,146][03219] Heartbeat connected on RolloutWorker_w7
[2025-03-22 15:39:30,747][03414] No checkpoints found
[2025-03-22 15:39:30,747][03414] Did not load from checkpoint, starting from scratch!
[2025-03-22 15:39:30,747][03414] Initialized policy 0 weights for model version 0
[2025-03-22 15:39:30,750][03414] LearnerWorker_p0 finished initialization!
[2025-03-22 15:39:30,751][03219] Heartbeat connected on LearnerWorker_p0
[2025-03-22 15:39:30,753][03414] Using GPUs [0] for process 0 (actually maps to GPUs [0])
[2025-03-22 15:39:30,924][03427] RunningMeanStd input shape: (3, 72, 128)
[2025-03-22 15:39:30,926][03427] RunningMeanStd input shape: (1,)
[2025-03-22 15:39:30,938][03427] ConvEncoder: input_channels=3
[2025-03-22 15:39:31,041][03427] Conv encoder output size: 512
[2025-03-22 15:39:31,041][03427] Policy head output size: 512
[2025-03-22 15:39:31,077][03219] Inference worker 0-0 is ready!
[2025-03-22 15:39:31,077][03219] All inference workers are ready! Signal rollout workers to start!
[2025-03-22 15:39:31,357][03431] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-22 15:39:31,403][03433] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-22 15:39:31,434][03435] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-22 15:39:31,467][03434] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-22 15:39:31,528][03432] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-22 15:39:31,532][03430] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-22 15:39:31,565][03429] Doom resolution: 160x120, resize resolution: (128, 72)
[2025-03-22 15:39:31,573][03428] Doom resolution: 160x120, resize resolution: (128, 72)
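
Each rollout worker renders Doom at 160x120 and downscales frames to 128x72 to match the encoder's (3, 72, 128) input. A minimal sketch of that preprocessing step, assuming OpenCV as the resizer:

    import cv2
    import numpy as np

    def preprocess_frame(frame_hwc: np.ndarray) -> np.ndarray:
        """Downscale a 120x160x3 frame to 72x128 and convert HWC -> CHW."""
        resized = cv2.resize(frame_hwc, (128, 72), interpolation=cv2.INTER_AREA)
        return np.transpose(resized, (2, 0, 1))  # shape (3, 72, 128)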
|
[2025-03-22 15:39:32,776][03219] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
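
These periodic status lines report the frame rate averaged over sliding 10/60/300-second windows (nan until a window contains two measurements), the cumulative frame count, per-policy sample throughput, and policy lag (how many versions behind the current policy the collected experience was; -1.0 before any rollouts finish). A windowed FPS meter of this kind can be sketched as follows (an illustration, not the framework's reporting code):

    import time
    from collections import deque

    class FpsMeter:
        def __init__(self, windows=(10, 60, 300)):
            self.windows = windows
            self.history = deque()  # (timestamp, total_frames) pairs

        def record(self, total_frames):
            now = time.time()
            self.history.append((now, total_frames))
            while self.history and now - self.history[0][0] > max(self.windows):
                self.history.popleft()  # keep only what the largest window needs

        def fps(self, window):
            now = time.time()
            pts = [(t, f) for t, f in self.history if now - t <= window]
            if len(pts) < 2:
                return float("nan")
            (t0, f0), (t1, f1) = pts[0], pts[-1]
            return (f1 - f0) / (t1 - t0)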
|
[2025-03-22 15:39:32,877][03431] Decorrelating experience for 0 frames...
[2025-03-22 15:39:32,879][03432] Decorrelating experience for 0 frames...
[2025-03-22 15:39:32,879][03435] Decorrelating experience for 0 frames...
[2025-03-22 15:39:32,877][03434] Decorrelating experience for 0 frames...
[2025-03-22 15:39:33,670][03432] Decorrelating experience for 32 frames...
[2025-03-22 15:39:33,673][03434] Decorrelating experience for 32 frames...
[2025-03-22 15:39:34,201][03431] Decorrelating experience for 32 frames...
[2025-03-22 15:39:34,203][03435] Decorrelating experience for 32 frames...
[2025-03-22 15:39:34,198][03433] Decorrelating experience for 0 frames...
[2025-03-22 15:39:34,505][03434] Decorrelating experience for 64 frames...
[2025-03-22 15:39:35,059][03428] Decorrelating experience for 0 frames...
[2025-03-22 15:39:35,622][03432] Decorrelating experience for 64 frames...
[2025-03-22 15:39:35,628][03430] Decorrelating experience for 0 frames...
[2025-03-22 15:39:35,967][03433] Decorrelating experience for 32 frames...
[2025-03-22 15:39:36,175][03434] Decorrelating experience for 96 frames...
[2025-03-22 15:39:36,604][03428] Decorrelating experience for 32 frames...
[2025-03-22 15:39:37,235][03430] Decorrelating experience for 32 frames...
[2025-03-22 15:39:37,404][03431] Decorrelating experience for 64 frames...
[2025-03-22 15:39:37,776][03219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-03-22 15:39:39,376][03435] Decorrelating experience for 64 frames...
[2025-03-22 15:39:39,374][03428] Decorrelating experience for 64 frames...
[2025-03-22 15:39:39,551][03433] Decorrelating experience for 64 frames...
[2025-03-22 15:39:39,671][03431] Decorrelating experience for 96 frames...
[2025-03-22 15:39:41,468][03432] Decorrelating experience for 96 frames...
[2025-03-22 15:39:41,852][03435] Decorrelating experience for 96 frames...
[2025-03-22 15:39:41,904][03428] Decorrelating experience for 96 frames...
[2025-03-22 15:39:42,116][03433] Decorrelating experience for 96 frames...
[2025-03-22 15:39:42,227][03430] Decorrelating experience for 64 frames...
[2025-03-22 15:39:42,776][03219] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 72.6. Samples: 726. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0)
[2025-03-22 15:39:42,779][03219] Avg episode reward: [(0, '3.513')]
[2025-03-22 15:39:44,152][03414] Signal inference workers to stop experience collection...
[2025-03-22 15:39:44,171][03427] InferenceWorker_p0-w0: stopping experience collection
[2025-03-22 15:39:44,367][03430] Decorrelating experience for 96 frames...
[2025-03-22 15:39:44,720][03414] Signal inference workers to resume experience collection...
[2025-03-22 15:39:44,720][03427] InferenceWorker_p0-w0: resuming experience collection
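
Before regular collection starts, each of the eight workers steps its environments through a different number of warm-up frames (0, 32, 64, 96) so their trajectories are decorrelated rather than synchronized; the learner also briefly pauses inference while it assembles the first training batch. The staggering idea, as a sketch (hypothetical helper, assuming the classic Gym 4-tuple step API; the real schedule is Sample Factory's own):

    def decorrelate_experience(env, worker_idx, frames_per_slot=32):
        """Advance this worker's env by worker_idx * frames_per_slot random steps
        so different workers start from different points in their episodes."""
        env.reset()
        for _ in range(worker_idx * frames_per_slot):
            obs, reward, done, info = env.step(env.action_space.sample())
            if done:
                env.reset()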
|
[2025-03-22 15:39:47,776][03219] Fps is (10 sec: 1228.8, 60 sec: 819.2, 300 sec: 819.2). Total num frames: 12288. Throughput: 0: 204.9. Samples: 3074. Policy #0 lag: (min: 0.0, avg: 1.1, max: 2.0)
[2025-03-22 15:39:47,779][03219] Avg episode reward: [(0, '3.337')]
[2025-03-22 15:39:52,778][03219] Fps is (10 sec: 2866.6, 60 sec: 1433.4, 300 sec: 1433.4). Total num frames: 28672. Throughput: 0: 374.5. Samples: 7490. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:39:52,782][03219] Avg episode reward: [(0, '3.873')]
[2025-03-22 15:39:55,703][03427] Updated weights for policy 0, policy_version 10 (0.0022)
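
"Updated weights for policy 0, policy_version 10 (0.0022)" marks the inference worker pulling a fresh parameter snapshot from the learner; the number in parentheses is the time the update took in seconds, and the "Policy #0 lag" statistics above measure how many versions old the acting weights were when experience was collected. Version-tagged weight sharing can be sketched like this (an in-process illustration; the real system shares weights across processes):

    import threading

    class PolicyStore:
        """Learner publishes (version, state_dict); workers poll for newer versions."""
        def __init__(self):
            self._lock = threading.Lock()
            self._version = 0
            self._state_dict = None

        def publish(self, state_dict):
            with self._lock:
                self._version += 1
                self._state_dict = state_dict

        def fetch_if_newer(self, have_version):
            with self._lock:
                if self._version > have_version:
                    return self._version, self._state_dict
                return have_version, None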
|
[2025-03-22 15:39:57,776][03219] Fps is (10 sec: 3686.4, 60 sec: 1966.1, 300 sec: 1966.1). Total num frames: 49152. Throughput: 0: 399.8. Samples: 9994. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:39:57,779][03219] Avg episode reward: [(0, '4.432')]
[2025-03-22 15:40:02,776][03219] Fps is (10 sec: 4097.0, 60 sec: 2321.1, 300 sec: 2321.1). Total num frames: 69632. Throughput: 0: 557.7. Samples: 16730. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:40:02,781][03219] Avg episode reward: [(0, '4.395')]
[2025-03-22 15:40:05,553][03427] Updated weights for policy 0, policy_version 20 (0.0020)
[2025-03-22 15:40:07,776][03219] Fps is (10 sec: 3686.4, 60 sec: 2457.6, 300 sec: 2457.6). Total num frames: 86016. Throughput: 0: 629.8. Samples: 22042. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:40:07,780][03219] Avg episode reward: [(0, '4.258')]
[2025-03-22 15:40:12,776][03219] Fps is (10 sec: 3686.4, 60 sec: 2662.4, 300 sec: 2662.4). Total num frames: 106496. Throughput: 0: 624.3. Samples: 24972. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:40:12,778][03219] Avg episode reward: [(0, '4.416')]
[2025-03-22 15:40:12,786][03414] Saving new best policy, reward=4.416!
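
Whenever the average episode reward beats the previous best, the learner writes an extra "best policy" snapshot alongside the rotating checkpoints. The bookkeeping is simple (a sketch with a hypothetical save_checkpoint callable):

    class BestPolicyTracker:
        def __init__(self):
            self.best_reward = float("-inf")

        def maybe_save_best(self, avg_reward, model, save_checkpoint):
            if avg_reward > self.best_reward:
                self.best_reward = avg_reward
                print(f"Saving new best policy, reward={avg_reward:.3f}!")
                save_checkpoint(model, tag="best")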
|
[2025-03-22 15:40:15,677][03427] Updated weights for policy 0, policy_version 30 (0.0022)
[2025-03-22 15:40:17,776][03219] Fps is (10 sec: 4505.7, 60 sec: 2912.7, 300 sec: 2912.7). Total num frames: 131072. Throughput: 0: 708.5. Samples: 31884. Policy #0 lag: (min: 0.0, avg: 0.7, max: 1.0)
[2025-03-22 15:40:17,778][03219] Avg episode reward: [(0, '4.478')]
[2025-03-22 15:40:17,780][03414] Saving new best policy, reward=4.478!
[2025-03-22 15:40:22,776][03219] Fps is (10 sec: 3686.4, 60 sec: 2867.2, 300 sec: 2867.2). Total num frames: 143360. Throughput: 0: 818.6. Samples: 36838. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:40:22,779][03219] Avg episode reward: [(0, '4.385')]
[2025-03-22 15:40:26,491][03427] Updated weights for policy 0, policy_version 40 (0.0031)
[2025-03-22 15:40:27,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3053.4, 300 sec: 3053.4). Total num frames: 167936. Throughput: 0: 871.2. Samples: 39928. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-22 15:40:27,783][03219] Avg episode reward: [(0, '4.331')]
[2025-03-22 15:40:32,776][03219] Fps is (10 sec: 4915.2, 60 sec: 3208.5, 300 sec: 3208.5). Total num frames: 192512. Throughput: 0: 973.1. Samples: 46862. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-22 15:40:32,779][03219] Avg episode reward: [(0, '4.434')]
[2025-03-22 15:40:36,747][03427] Updated weights for policy 0, policy_version 50 (0.0029)
[2025-03-22 15:40:37,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3413.4, 300 sec: 3150.8). Total num frames: 204800. Throughput: 0: 986.2. Samples: 51866. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:40:37,781][03219] Avg episode reward: [(0, '4.430')]
[2025-03-22 15:40:42,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3218.3). Total num frames: 225280. Throughput: 0: 998.4. Samples: 54922. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:40:42,778][03219] Avg episode reward: [(0, '4.420')]
[2025-03-22 15:40:46,680][03427] Updated weights for policy 0, policy_version 60 (0.0032)
[2025-03-22 15:40:47,779][03219] Fps is (10 sec: 4094.9, 60 sec: 3891.0, 300 sec: 3276.7). Total num frames: 245760. Throughput: 0: 991.7. Samples: 61360. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:40:47,780][03219] Avg episode reward: [(0, '4.594')]
[2025-03-22 15:40:47,821][03414] Saving new best policy, reward=4.594!
[2025-03-22 15:40:52,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3823.1, 300 sec: 3225.6). Total num frames: 258048. Throughput: 0: 964.3. Samples: 65436. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:40:52,777][03219] Avg episode reward: [(0, '4.473')]
[2025-03-22 15:40:57,776][03219] Fps is (10 sec: 3687.4, 60 sec: 3891.2, 300 sec: 3325.0). Total num frames: 282624. Throughput: 0: 967.1. Samples: 68490. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:40:57,780][03219] Avg episode reward: [(0, '4.183')]
[2025-03-22 15:40:58,564][03427] Updated weights for policy 0, policy_version 70 (0.0019)
[2025-03-22 15:41:02,779][03219] Fps is (10 sec: 4504.2, 60 sec: 3891.0, 300 sec: 3367.7). Total num frames: 303104. Throughput: 0: 957.8. Samples: 74988. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:41:02,780][03219] Avg episode reward: [(0, '4.145')]
[2025-03-22 15:41:02,785][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000074_303104.pth...
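
Checkpoint filenames encode the policy version (zero-padded to nine digits) and the total environment frames at save time, so here version 74 at 303,104 frames. A sketch of that naming scheme (helper names are assumptions, and the torch.save payload is illustrative):

    import os
    import torch

    def checkpoint_path(train_dir, policy_version, env_steps):
        # e.g. .../checkpoint_p0/checkpoint_000000074_303104.pth
        name = f"checkpoint_{policy_version:09d}_{env_steps}.pth"
        return os.path.join(train_dir, "checkpoint_p0", name)

    def save_checkpoint(model, optimizer, train_dir, policy_version, env_steps):
        path = checkpoint_path(train_dir, policy_version, env_steps)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        torch.save({"model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "policy_version": policy_version,
                    "env_steps": env_steps}, path)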
|
[2025-03-22 15:41:07,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3319.9). Total num frames: 315392. Throughput: 0: 942.9. Samples: 79268. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:41:07,779][03219] Avg episode reward: [(0, '4.476')]
[2025-03-22 15:41:10,003][03427] Updated weights for policy 0, policy_version 80 (0.0027)
[2025-03-22 15:41:12,776][03219] Fps is (10 sec: 3687.6, 60 sec: 3891.2, 300 sec: 3399.7). Total num frames: 339968. Throughput: 0: 946.0. Samples: 82496. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:41:12,779][03219] Avg episode reward: [(0, '4.634')]
[2025-03-22 15:41:12,784][03414] Saving new best policy, reward=4.634!
[2025-03-22 15:41:17,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3754.7, 300 sec: 3393.8). Total num frames: 356352. Throughput: 0: 931.1. Samples: 88760. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:41:17,785][03219] Avg episode reward: [(0, '4.548')]
[2025-03-22 15:41:21,336][03427] Updated weights for policy 0, policy_version 90 (0.0021)
[2025-03-22 15:41:22,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3822.9, 300 sec: 3388.5). Total num frames: 372736. Throughput: 0: 918.7. Samples: 93208. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:41:22,778][03219] Avg episode reward: [(0, '4.506')]
[2025-03-22 15:41:27,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3419.3). Total num frames: 393216. Throughput: 0: 925.6. Samples: 96574. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:41:27,781][03219] Avg episode reward: [(0, '4.727')]
[2025-03-22 15:41:27,784][03414] Saving new best policy, reward=4.727!
[2025-03-22 15:41:30,756][03427] Updated weights for policy 0, policy_version 100 (0.0025)
[2025-03-22 15:41:32,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3447.5). Total num frames: 413696. Throughput: 0: 931.3. Samples: 103264. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:41:32,778][03219] Avg episode reward: [(0, '4.635')]
[2025-03-22 15:41:37,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3440.6). Total num frames: 430080. Throughput: 0: 943.1. Samples: 107874. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:41:37,778][03219] Avg episode reward: [(0, '4.534')]
[2025-03-22 15:41:42,159][03427] Updated weights for policy 0, policy_version 110 (0.0033)
[2025-03-22 15:41:42,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3465.8). Total num frames: 450560. Throughput: 0: 945.8. Samples: 111052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:41:42,778][03219] Avg episode reward: [(0, '4.657')]
[2025-03-22 15:41:47,776][03219] Fps is (10 sec: 4095.9, 60 sec: 3754.8, 300 sec: 3489.2). Total num frames: 471040. Throughput: 0: 940.5. Samples: 117306. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:41:47,778][03219] Avg episode reward: [(0, '4.765')]
[2025-03-22 15:41:47,786][03414] Saving new best policy, reward=4.765!
[2025-03-22 15:41:52,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3452.3). Total num frames: 483328. Throughput: 0: 937.7. Samples: 121464. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:41:52,780][03219] Avg episode reward: [(0, '4.668')]
[2025-03-22 15:41:53,776][03427] Updated weights for policy 0, policy_version 120 (0.0039)
[2025-03-22 15:41:57,776][03219] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3474.5). Total num frames: 503808. Throughput: 0: 935.5. Samples: 124592. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:41:57,777][03219] Avg episode reward: [(0, '4.507')]
[2025-03-22 15:42:02,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3686.6, 300 sec: 3495.3). Total num frames: 524288. Throughput: 0: 940.9. Samples: 131102. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:42:02,778][03219] Avg episode reward: [(0, '4.626')]
[2025-03-22 15:42:04,772][03427] Updated weights for policy 0, policy_version 130 (0.0019)
[2025-03-22 15:42:07,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3754.7, 300 sec: 3488.2). Total num frames: 540672. Throughput: 0: 936.0. Samples: 135326. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:42:07,778][03219] Avg episode reward: [(0, '4.887')]
[2025-03-22 15:42:07,782][03414] Saving new best policy, reward=4.887!
[2025-03-22 15:42:12,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3507.2). Total num frames: 561152. Throughput: 0: 929.6. Samples: 138406. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:42:12,781][03219] Avg episode reward: [(0, '4.708')]
[2025-03-22 15:42:15,292][03427] Updated weights for policy 0, policy_version 140 (0.0013)
[2025-03-22 15:42:17,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3500.2). Total num frames: 577536. Throughput: 0: 920.8. Samples: 144700. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:42:17,778][03219] Avg episode reward: [(0, '4.558')]
[2025-03-22 15:42:22,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3493.6). Total num frames: 593920. Throughput: 0: 909.3. Samples: 148792. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:42:22,778][03219] Avg episode reward: [(0, '4.584')]
[2025-03-22 15:42:27,078][03427] Updated weights for policy 0, policy_version 150 (0.0024)
[2025-03-22 15:42:27,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3510.9). Total num frames: 614400. Throughput: 0: 908.3. Samples: 151926. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:42:27,777][03219] Avg episode reward: [(0, '4.658')]
[2025-03-22 15:42:32,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3527.1). Total num frames: 634880. Throughput: 0: 910.3. Samples: 158270. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:42:32,780][03219] Avg episode reward: [(0, '4.507')]
[2025-03-22 15:42:37,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3498.2). Total num frames: 647168. Throughput: 0: 911.0. Samples: 162458. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:42:37,779][03219] Avg episode reward: [(0, '4.625')]
[2025-03-22 15:42:38,832][03427] Updated weights for policy 0, policy_version 160 (0.0018)
[2025-03-22 15:42:42,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3535.5). Total num frames: 671744. Throughput: 0: 912.0. Samples: 165632. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:42:42,781][03219] Avg episode reward: [(0, '4.971')]
[2025-03-22 15:42:42,788][03414] Saving new best policy, reward=4.971!
[2025-03-22 15:42:47,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3528.9). Total num frames: 688128. Throughput: 0: 906.3. Samples: 171886. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:42:47,777][03219] Avg episode reward: [(0, '5.008')]
[2025-03-22 15:42:47,780][03414] Saving new best policy, reward=5.008!
[2025-03-22 15:42:50,378][03427] Updated weights for policy 0, policy_version 170 (0.0015)
[2025-03-22 15:42:52,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3522.6). Total num frames: 704512. Throughput: 0: 901.6. Samples: 175898. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:42:52,778][03219] Avg episode reward: [(0, '5.009')]
[2025-03-22 15:42:52,789][03414] Saving new best policy, reward=5.009!
[2025-03-22 15:42:57,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3536.5). Total num frames: 724992. Throughput: 0: 901.2. Samples: 178962. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-22 15:42:57,782][03219] Avg episode reward: [(0, '4.845')]
[2025-03-22 15:43:00,690][03427] Updated weights for policy 0, policy_version 180 (0.0020)
[2025-03-22 15:43:02,779][03219] Fps is (10 sec: 3685.3, 60 sec: 3617.9, 300 sec: 3530.3). Total num frames: 741376. Throughput: 0: 904.3. Samples: 185396. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:43:02,783][03219] Avg episode reward: [(0, '4.987')]
[2025-03-22 15:43:02,792][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000181_741376.pth...
[2025-03-22 15:43:07,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3524.5). Total num frames: 757760. Throughput: 0: 909.0. Samples: 189698. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:43:07,781][03219] Avg episode reward: [(0, '4.844')]
[2025-03-22 15:43:12,306][03427] Updated weights for policy 0, policy_version 190 (0.0019)
[2025-03-22 15:43:12,776][03219] Fps is (10 sec: 3687.5, 60 sec: 3618.1, 300 sec: 3537.5). Total num frames: 778240. Throughput: 0: 909.1. Samples: 192836. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:43:12,777][03219] Avg episode reward: [(0, '4.586')]
[2025-03-22 15:43:17,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3549.9). Total num frames: 798720. Throughput: 0: 908.0. Samples: 199132. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:43:17,777][03219] Avg episode reward: [(0, '4.607')]
[2025-03-22 15:43:22,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3526.1). Total num frames: 811008. Throughput: 0: 908.7. Samples: 203348. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:43:22,781][03219] Avg episode reward: [(0, '4.601')]
[2025-03-22 15:43:23,932][03427] Updated weights for policy 0, policy_version 200 (0.0016)
[2025-03-22 15:43:27,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3686.4, 300 sec: 3555.7). Total num frames: 835584. Throughput: 0: 910.6. Samples: 206608. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:43:27,778][03219] Avg episode reward: [(0, '4.838')]
[2025-03-22 15:43:32,776][03219] Fps is (10 sec: 4096.1, 60 sec: 3618.1, 300 sec: 3549.9). Total num frames: 851968. Throughput: 0: 922.7. Samples: 213406. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:43:32,780][03219] Avg episode reward: [(0, '4.721')]
[2025-03-22 15:43:34,379][03427] Updated weights for policy 0, policy_version 210 (0.0028)
[2025-03-22 15:43:37,776][03219] Fps is (10 sec: 3276.7, 60 sec: 3686.4, 300 sec: 3544.3). Total num frames: 868352. Throughput: 0: 934.8. Samples: 217966. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:43:37,778][03219] Avg episode reward: [(0, '4.728')]
[2025-03-22 15:43:42,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3686.4, 300 sec: 3571.7). Total num frames: 892928. Throughput: 0: 941.5. Samples: 221330. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-22 15:43:42,781][03219] Avg episode reward: [(0, '5.023')]
[2025-03-22 15:43:42,787][03414] Saving new best policy, reward=5.023!
[2025-03-22 15:43:44,267][03427] Updated weights for policy 0, policy_version 220 (0.0030)
[2025-03-22 15:43:47,776][03219] Fps is (10 sec: 4096.1, 60 sec: 3686.4, 300 sec: 3565.9). Total num frames: 909312. Throughput: 0: 934.8. Samples: 227460. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:43:47,779][03219] Avg episode reward: [(0, '4.732')]
[2025-03-22 15:43:52,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3560.4). Total num frames: 925696. Throughput: 0: 934.9. Samples: 231770. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:43:52,778][03219] Avg episode reward: [(0, '4.711')]
[2025-03-22 15:43:56,261][03427] Updated weights for policy 0, policy_version 230 (0.0033)
[2025-03-22 15:43:57,776][03219] Fps is (10 sec: 3686.3, 60 sec: 3686.4, 300 sec: 3570.5). Total num frames: 946176. Throughput: 0: 934.6. Samples: 234894. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:43:57,778][03219] Avg episode reward: [(0, '4.790')]
[2025-03-22 15:44:02,778][03219] Fps is (10 sec: 3685.7, 60 sec: 3686.5, 300 sec: 3565.0). Total num frames: 962560. Throughput: 0: 928.8. Samples: 240928. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:44:02,780][03219] Avg episode reward: [(0, '4.827')]
[2025-03-22 15:44:07,776][03219] Fps is (10 sec: 3276.9, 60 sec: 3686.4, 300 sec: 3559.8). Total num frames: 978944. Throughput: 0: 933.8. Samples: 245368. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:44:07,778][03219] Avg episode reward: [(0, '4.752')]
[2025-03-22 15:44:08,141][03427] Updated weights for policy 0, policy_version 240 (0.0023)
[2025-03-22 15:44:12,776][03219] Fps is (10 sec: 3687.1, 60 sec: 3686.4, 300 sec: 3569.4). Total num frames: 999424. Throughput: 0: 930.1. Samples: 248464. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:44:12,778][03219] Avg episode reward: [(0, '4.757')]
[2025-03-22 15:44:17,779][03219] Fps is (10 sec: 4094.8, 60 sec: 3686.2, 300 sec: 3578.6). Total num frames: 1019904. Throughput: 0: 913.3. Samples: 254506. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:44:17,781][03219] Avg episode reward: [(0, '4.863')]
[2025-03-22 15:44:19,210][03427] Updated weights for policy 0, policy_version 250 (0.0039)
[2025-03-22 15:44:22,777][03219] Fps is (10 sec: 3686.3, 60 sec: 3754.6, 300 sec: 3573.4). Total num frames: 1036288. Throughput: 0: 911.5. Samples: 258984. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:44:22,778][03219] Avg episode reward: [(0, '4.586')]
[2025-03-22 15:44:27,776][03219] Fps is (10 sec: 3687.5, 60 sec: 3686.4, 300 sec: 3582.3). Total num frames: 1056768. Throughput: 0: 907.2. Samples: 262156. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:44:27,778][03219] Avg episode reward: [(0, '4.705')]
[2025-03-22 15:44:29,370][03427] Updated weights for policy 0, policy_version 260 (0.0017)
[2025-03-22 15:44:32,776][03219] Fps is (10 sec: 3686.5, 60 sec: 3686.4, 300 sec: 3637.8). Total num frames: 1073152. Throughput: 0: 903.3. Samples: 268108. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:44:32,781][03219] Avg episode reward: [(0, '4.830')]
[2025-03-22 15:44:37,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 1089536. Throughput: 0: 911.8. Samples: 272800. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:44:37,785][03219] Avg episode reward: [(0, '4.862')]
[2025-03-22 15:44:41,248][03427] Updated weights for policy 0, policy_version 270 (0.0013)
[2025-03-22 15:44:42,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3721.1). Total num frames: 1110016. Throughput: 0: 910.6. Samples: 275870. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:44:42,781][03219] Avg episode reward: [(0, '4.801')]
[2025-03-22 15:44:47,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3721.1). Total num frames: 1126400. Throughput: 0: 904.0. Samples: 281606. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:44:47,778][03219] Avg episode reward: [(0, '4.670')]
[2025-03-22 15:44:52,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3618.1, 300 sec: 3707.2). Total num frames: 1142784. Throughput: 0: 907.6. Samples: 286210. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:44:52,782][03219] Avg episode reward: [(0, '4.722')]
[2025-03-22 15:44:53,061][03427] Updated weights for policy 0, policy_version 280 (0.0028)
[2025-03-22 15:44:57,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3707.2). Total num frames: 1163264. Throughput: 0: 908.5. Samples: 289346. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:44:57,781][03219] Avg episode reward: [(0, '4.662')]
[2025-03-22 15:45:02,776][03219] Fps is (10 sec: 3686.3, 60 sec: 3618.2, 300 sec: 3707.2). Total num frames: 1179648. Throughput: 0: 902.6. Samples: 295122. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:45:02,781][03219] Avg episode reward: [(0, '4.748')]
[2025-03-22 15:45:02,788][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000288_1179648.pth...
[2025-03-22 15:45:02,938][03414] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000074_303104.pth
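
Only the newest checkpoints are retained: after each save the oldest file is removed (the version-74 checkpoint is deleted here once version 288 lands, and so on through the rest of the log). A keep-last-N rotation sketch, relying on the zero-padded filenames sorting chronologically:

    import glob
    import os

    def prune_checkpoints(ckpt_dir, keep_last=2):
        # Zero-padded version numbers make lexicographic order == save order.
        ckpts = sorted(glob.glob(os.path.join(ckpt_dir, "checkpoint_*.pth")))
        for old in ckpts[:-keep_last]:
            print(f"Removing {old}")
            os.remove(old)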
|
[2025-03-22 15:45:04,800][03427] Updated weights for policy 0, policy_version 290 (0.0022)
[2025-03-22 15:45:07,777][03219] Fps is (10 sec: 3276.5, 60 sec: 3618.1, 300 sec: 3693.3). Total num frames: 1196032. Throughput: 0: 910.5. Samples: 299958. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:45:07,781][03219] Avg episode reward: [(0, '4.825')]
[2025-03-22 15:45:12,776][03219] Fps is (10 sec: 4096.2, 60 sec: 3686.4, 300 sec: 3693.3). Total num frames: 1220608. Throughput: 0: 908.9. Samples: 303056. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:45:12,778][03219] Avg episode reward: [(0, '4.978')]
[2025-03-22 15:45:14,324][03427] Updated weights for policy 0, policy_version 300 (0.0017)
[2025-03-22 15:45:17,776][03219] Fps is (10 sec: 3686.7, 60 sec: 3550.1, 300 sec: 3693.3). Total num frames: 1232896. Throughput: 0: 899.4. Samples: 308580. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:45:17,780][03219] Avg episode reward: [(0, '4.923')]
[2025-03-22 15:45:22,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3618.2, 300 sec: 3679.5). Total num frames: 1253376. Throughput: 0: 902.8. Samples: 313426. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:45:22,778][03219] Avg episode reward: [(0, '4.979')]
[2025-03-22 15:45:26,221][03427] Updated weights for policy 0, policy_version 310 (0.0039)
[2025-03-22 15:45:27,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3618.1, 300 sec: 3665.6). Total num frames: 1273856. Throughput: 0: 905.6. Samples: 316620. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:45:27,777][03219] Avg episode reward: [(0, '5.056')]
[2025-03-22 15:45:27,781][03414] Saving new best policy, reward=5.056!
[2025-03-22 15:45:32,778][03219] Fps is (10 sec: 3685.7, 60 sec: 3618.0, 300 sec: 3679.4). Total num frames: 1290240. Throughput: 0: 898.1. Samples: 322020. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:45:32,786][03219] Avg episode reward: [(0, '4.992')]
[2025-03-22 15:45:37,776][03219] Fps is (10 sec: 2867.2, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1302528. Throughput: 0: 899.2. Samples: 326674. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:45:37,781][03219] Avg episode reward: [(0, '5.153')]
[2025-03-22 15:45:37,791][03414] Saving new best policy, reward=5.153!
[2025-03-22 15:45:39,982][03427] Updated weights for policy 0, policy_version 320 (0.0019)
[2025-03-22 15:45:42,776][03219] Fps is (10 sec: 2867.8, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 1318912. Throughput: 0: 860.0. Samples: 328048. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:45:42,777][03219] Avg episode reward: [(0, '5.192')]
[2025-03-22 15:45:42,785][03414] Saving new best policy, reward=5.192!
[2025-03-22 15:45:47,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3651.7). Total num frames: 1335296. Throughput: 0: 850.8. Samples: 333406. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:45:47,784][03219] Avg episode reward: [(0, '5.130')]
[2025-03-22 15:45:51,818][03427] Updated weights for policy 0, policy_version 330 (0.0027)
[2025-03-22 15:45:52,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3549.9, 300 sec: 3637.8). Total num frames: 1355776. Throughput: 0: 863.2. Samples: 338800. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:45:52,779][03219] Avg episode reward: [(0, '4.820')]
[2025-03-22 15:45:57,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3549.9, 300 sec: 3637.8). Total num frames: 1376256. Throughput: 0: 867.5. Samples: 342094. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:45:57,781][03219] Avg episode reward: [(0, '4.730')]
[2025-03-22 15:46:02,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3481.6, 300 sec: 3637.8). Total num frames: 1388544. Throughput: 0: 866.4. Samples: 347568. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0)
[2025-03-22 15:46:02,781][03219] Avg episode reward: [(0, '4.788')]
[2025-03-22 15:46:03,047][03427] Updated weights for policy 0, policy_version 340 (0.0025)
[2025-03-22 15:46:07,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3637.8). Total num frames: 1413120. Throughput: 0: 884.1. Samples: 353212. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:46:07,781][03219] Avg episode reward: [(0, '4.894')]
[2025-03-22 15:46:12,058][03427] Updated weights for policy 0, policy_version 350 (0.0018)
[2025-03-22 15:46:12,776][03219] Fps is (10 sec: 4505.6, 60 sec: 3549.9, 300 sec: 3651.7). Total num frames: 1433600. Throughput: 0: 889.5. Samples: 356646. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:46:12,781][03219] Avg episode reward: [(0, '4.585')]
[2025-03-22 15:46:17,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3651.7). Total num frames: 1449984. Throughput: 0: 892.6. Samples: 362184. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2025-03-22 15:46:17,777][03219] Avg episode reward: [(0, '4.664')]
[2025-03-22 15:46:22,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3618.1, 300 sec: 3651.7). Total num frames: 1470464. Throughput: 0: 920.7. Samples: 368104. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:46:22,778][03219] Avg episode reward: [(0, '5.074')]
[2025-03-22 15:46:23,090][03427] Updated weights for policy 0, policy_version 360 (0.0019)
[2025-03-22 15:46:27,776][03219] Fps is (10 sec: 4505.6, 60 sec: 3686.4, 300 sec: 3665.6). Total num frames: 1495040. Throughput: 0: 965.8. Samples: 371508. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:46:27,778][03219] Avg episode reward: [(0, '5.271')]
[2025-03-22 15:46:27,783][03414] Saving new best policy, reward=5.271!
[2025-03-22 15:46:32,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3618.2, 300 sec: 3651.7). Total num frames: 1507328. Throughput: 0: 962.0. Samples: 376698. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2025-03-22 15:46:32,782][03219] Avg episode reward: [(0, '5.093')]
[2025-03-22 15:46:34,049][03427] Updated weights for policy 0, policy_version 370 (0.0021)
[2025-03-22 15:46:37,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3754.7, 300 sec: 3651.7). Total num frames: 1527808. Throughput: 0: 979.5. Samples: 382876. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:46:37,781][03219] Avg episode reward: [(0, '4.752')]
[2025-03-22 15:46:42,776][03219] Fps is (10 sec: 4505.6, 60 sec: 3891.2, 300 sec: 3665.6). Total num frames: 1552384. Throughput: 0: 975.6. Samples: 385996. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:46:42,783][03219] Avg episode reward: [(0, '5.007')]
[2025-03-22 15:46:44,027][03427] Updated weights for policy 0, policy_version 380 (0.0021)
[2025-03-22 15:46:47,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 1568768. Throughput: 0: 965.6. Samples: 391018. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:46:47,778][03219] Avg episode reward: [(0, '4.844')]
[2025-03-22 15:46:52,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3679.5). Total num frames: 1589248. Throughput: 0: 981.4. Samples: 397374. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:46:52,777][03219] Avg episode reward: [(0, '4.992')]
[2025-03-22 15:46:54,330][03427] Updated weights for policy 0, policy_version 390 (0.0019)
[2025-03-22 15:46:57,779][03219] Fps is (10 sec: 4094.8, 60 sec: 3891.0, 300 sec: 3679.4). Total num frames: 1609728. Throughput: 0: 981.6. Samples: 400822. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-22 15:46:57,780][03219] Avg episode reward: [(0, '4.914')]
[2025-03-22 15:47:02,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3679.5). Total num frames: 1626112. Throughput: 0: 970.2. Samples: 405844. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:47:02,782][03219] Avg episode reward: [(0, '4.665')]
[2025-03-22 15:47:02,790][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000397_1626112.pth...
[2025-03-22 15:47:02,926][03414] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000181_741376.pth
[2025-03-22 15:47:05,137][03427] Updated weights for policy 0, policy_version 400 (0.0025)
[2025-03-22 15:47:07,776][03219] Fps is (10 sec: 4097.2, 60 sec: 3959.5, 300 sec: 3693.3). Total num frames: 1650688. Throughput: 0: 989.3. Samples: 412622. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:47:07,777][03219] Avg episode reward: [(0, '4.715')]
[2025-03-22 15:47:12,776][03219] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3707.2). Total num frames: 1671168. Throughput: 0: 991.1. Samples: 416106. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:47:12,777][03219] Avg episode reward: [(0, '5.115')]
[2025-03-22 15:47:15,269][03427] Updated weights for policy 0, policy_version 410 (0.0019)
[2025-03-22 15:47:17,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3707.2). Total num frames: 1687552. Throughput: 0: 982.4. Samples: 420906. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:47:17,778][03219] Avg episode reward: [(0, '5.374')]
[2025-03-22 15:47:17,779][03414] Saving new best policy, reward=5.374!
|
[2025-03-22 15:47:22,776][03219] Fps is (10 sec: 3686.3, 60 sec: 3959.5, 300 sec: 3707.2). Total num frames: 1708032. Throughput: 0: 991.1. Samples: 427476. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:47:22,783][03219] Avg episode reward: [(0, '5.421')]
[2025-03-22 15:47:22,796][03414] Saving new best policy, reward=5.421!
[2025-03-22 15:47:24,741][03427] Updated weights for policy 0, policy_version 420 (0.0016)
[2025-03-22 15:47:27,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 3707.2). Total num frames: 1728512. Throughput: 0: 996.0. Samples: 430818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:47:27,779][03219] Avg episode reward: [(0, '5.277')]
[2025-03-22 15:47:32,776][03219] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3721.1). Total num frames: 1744896. Throughput: 0: 991.2. Samples: 435624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:47:32,781][03219] Avg episode reward: [(0, '5.300')]
[2025-03-22 15:47:35,455][03427] Updated weights for policy 0, policy_version 430 (0.0015)
[2025-03-22 15:47:37,776][03219] Fps is (10 sec: 4095.9, 60 sec: 4027.7, 300 sec: 3721.1). Total num frames: 1769472. Throughput: 0: 1007.5. Samples: 442712. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:47:37,778][03219] Avg episode reward: [(0, '5.463')]
[2025-03-22 15:47:37,780][03414] Saving new best policy, reward=5.463!
[2025-03-22 15:47:42,776][03219] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 1789952. Throughput: 0: 1005.4. Samples: 446062. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:47:42,778][03219] Avg episode reward: [(0, '5.501')]
[2025-03-22 15:47:42,784][03414] Saving new best policy, reward=5.501!
[2025-03-22 15:47:46,208][03427] Updated weights for policy 0, policy_version 440 (0.0026)
[2025-03-22 15:47:47,776][03219] Fps is (10 sec: 3686.5, 60 sec: 3959.5, 300 sec: 3735.0). Total num frames: 1806336. Throughput: 0: 1000.7. Samples: 450874. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:47:47,778][03219] Avg episode reward: [(0, '5.582')]
[2025-03-22 15:47:47,781][03414] Saving new best policy, reward=5.582!
[2025-03-22 15:47:52,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3748.9). Total num frames: 1830912. Throughput: 0: 1000.5. Samples: 457644. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:47:52,777][03219] Avg episode reward: [(0, '5.768')]
[2025-03-22 15:47:52,784][03414] Saving new best policy, reward=5.768!
[2025-03-22 15:47:55,531][03427] Updated weights for policy 0, policy_version 450 (0.0027)
[2025-03-22 15:47:57,779][03219] Fps is (10 sec: 4095.0, 60 sec: 3959.5, 300 sec: 3748.9). Total num frames: 1847296. Throughput: 0: 996.7. Samples: 460962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:47:57,790][03219] Avg episode reward: [(0, '5.919')]
[2025-03-22 15:47:57,796][03414] Saving new best policy, reward=5.919!
[2025-03-22 15:48:02,778][03219] Fps is (10 sec: 3685.6, 60 sec: 4027.6, 300 sec: 3762.7). Total num frames: 1867776. Throughput: 0: 995.3. Samples: 465698. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:48:02,780][03219] Avg episode reward: [(0, '5.886')]
[2025-03-22 15:48:06,249][03427] Updated weights for policy 0, policy_version 460 (0.0021)
[2025-03-22 15:48:07,776][03219] Fps is (10 sec: 4096.9, 60 sec: 3959.4, 300 sec: 3762.8). Total num frames: 1888256. Throughput: 0: 1001.5. Samples: 472544. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:48:07,781][03219] Avg episode reward: [(0, '6.234')]
[2025-03-22 15:48:07,783][03414] Saving new best policy, reward=6.234!
[2025-03-22 15:48:12,776][03219] Fps is (10 sec: 4096.8, 60 sec: 3959.5, 300 sec: 3762.8). Total num frames: 1908736. Throughput: 0: 1000.3. Samples: 475832. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:48:12,780][03219] Avg episode reward: [(0, '6.184')]
[2025-03-22 15:48:16,782][03427] Updated weights for policy 0, policy_version 470 (0.0019)
[2025-03-22 15:48:17,776][03219] Fps is (10 sec: 4096.1, 60 sec: 4027.7, 300 sec: 3790.5). Total num frames: 1929216. Throughput: 0: 1006.4. Samples: 480914. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:48:17,780][03219] Avg episode reward: [(0, '6.274')]
[2025-03-22 15:48:17,783][03414] Saving new best policy, reward=6.274!
[2025-03-22 15:48:22,776][03219] Fps is (10 sec: 4096.1, 60 sec: 4027.7, 300 sec: 3776.7). Total num frames: 1949696. Throughput: 0: 1003.1. Samples: 487850. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:48:22,777][03219] Avg episode reward: [(0, '6.703')]
[2025-03-22 15:48:22,787][03414] Saving new best policy, reward=6.703!
[2025-03-22 15:48:26,652][03427] Updated weights for policy 0, policy_version 480 (0.0016)
[2025-03-22 15:48:27,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3776.7). Total num frames: 1966080. Throughput: 0: 997.0. Samples: 490928. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:48:27,781][03219] Avg episode reward: [(0, '6.651')]
[2025-03-22 15:48:32,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3790.5). Total num frames: 1986560. Throughput: 0: 1006.6. Samples: 496172. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0)
[2025-03-22 15:48:32,781][03219] Avg episode reward: [(0, '6.803')]
[2025-03-22 15:48:32,807][03414] Saving new best policy, reward=6.803!
[2025-03-22 15:48:36,306][03427] Updated weights for policy 0, policy_version 490 (0.0027)
[2025-03-22 15:48:37,776][03219] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3790.5). Total num frames: 2011136. Throughput: 0: 1013.6. Samples: 503254. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:48:37,782][03219] Avg episode reward: [(0, '6.779')]
[2025-03-22 15:48:42,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3790.5). Total num frames: 2027520. Throughput: 0: 1003.3. Samples: 506108. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:48:42,779][03219] Avg episode reward: [(0, '6.342')]
[2025-03-22 15:48:46,703][03427] Updated weights for policy 0, policy_version 500 (0.0018)
[2025-03-22 15:48:47,776][03219] Fps is (10 sec: 4095.9, 60 sec: 4096.0, 300 sec: 3818.3). Total num frames: 2052096. Throughput: 0: 1021.2. Samples: 511650. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:48:47,778][03219] Avg episode reward: [(0, '6.072')]
[2025-03-22 15:48:52,776][03219] Fps is (10 sec: 4505.6, 60 sec: 4027.7, 300 sec: 3818.3). Total num frames: 2072576. Throughput: 0: 1025.8. Samples: 518704. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:48:52,781][03219] Avg episode reward: [(0, '5.767')]
[2025-03-22 15:48:57,063][03427] Updated weights for policy 0, policy_version 510 (0.0015)
[2025-03-22 15:48:57,778][03219] Fps is (10 sec: 3685.8, 60 sec: 4027.8, 300 sec: 3818.3). Total num frames: 2088960. Throughput: 0: 1012.8. Samples: 521410. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:48:57,779][03219] Avg episode reward: [(0, '5.660')]
[2025-03-22 15:49:02,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4096.1, 300 sec: 3846.1). Total num frames: 2113536. Throughput: 0: 1026.9. Samples: 527124. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:49:02,782][03219] Avg episode reward: [(0, '6.020')]
[2025-03-22 15:49:02,790][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000516_2113536.pth...
[2025-03-22 15:49:02,918][03414] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000288_1179648.pth
[2025-03-22 15:49:06,065][03427] Updated weights for policy 0, policy_version 520 (0.0022)
[2025-03-22 15:49:07,776][03219] Fps is (10 sec: 4506.5, 60 sec: 4096.0, 300 sec: 3846.1). Total num frames: 2134016. Throughput: 0: 1028.4. Samples: 534130. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:49:07,781][03219] Avg episode reward: [(0, '6.409')]
[2025-03-22 15:49:12,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3832.2). Total num frames: 2150400. Throughput: 0: 1018.4. Samples: 536758. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:49:12,780][03219] Avg episode reward: [(0, '6.596')]
[2025-03-22 15:49:16,585][03427] Updated weights for policy 0, policy_version 530 (0.0025)
[2025-03-22 15:49:17,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 3860.0). Total num frames: 2174976. Throughput: 0: 1032.3. Samples: 542624. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:49:17,780][03219] Avg episode reward: [(0, '7.498')]
[2025-03-22 15:49:17,784][03414] Saving new best policy, reward=7.498!
[2025-03-22 15:49:22,777][03219] Fps is (10 sec: 4505.2, 60 sec: 4095.9, 300 sec: 3859.9). Total num frames: 2195456. Throughput: 0: 1027.4. Samples: 549488. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:49:22,779][03219] Avg episode reward: [(0, '8.144')]
|
[2025-03-22 15:49:22,895][03414] Saving new best policy, reward=8.144!
[2025-03-22 15:49:27,372][03427] Updated weights for policy 0, policy_version 540 (0.0021)
[2025-03-22 15:49:27,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3860.0). Total num frames: 2211840. Throughput: 0: 1014.6. Samples: 551764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:49:27,777][03219] Avg episode reward: [(0, '8.331')]
[2025-03-22 15:49:27,782][03414] Saving new best policy, reward=8.331!
[2025-03-22 15:49:32,776][03219] Fps is (10 sec: 4096.4, 60 sec: 4164.3, 300 sec: 3887.7). Total num frames: 2236416. Throughput: 0: 1025.0. Samples: 557776. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:49:32,777][03219] Avg episode reward: [(0, '8.851')]
[2025-03-22 15:49:32,783][03414] Saving new best policy, reward=8.851!
[2025-03-22 15:49:36,194][03427] Updated weights for policy 0, policy_version 550 (0.0018)
[2025-03-22 15:49:37,778][03219] Fps is (10 sec: 4504.9, 60 sec: 4095.9, 300 sec: 3887.7). Total num frames: 2256896. Throughput: 0: 1024.9. Samples: 564824. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:49:37,779][03219] Avg episode reward: [(0, '8.774')]
[2025-03-22 15:49:42,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4096.0, 300 sec: 3887.7). Total num frames: 2273280. Throughput: 0: 1010.1. Samples: 566864. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:49:42,777][03219] Avg episode reward: [(0, '8.335')]
[2025-03-22 15:49:46,765][03427] Updated weights for policy 0, policy_version 560 (0.0023)
[2025-03-22 15:49:47,776][03219] Fps is (10 sec: 4096.6, 60 sec: 4096.0, 300 sec: 3915.5). Total num frames: 2297856. Throughput: 0: 1020.6. Samples: 573052. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:49:47,785][03219] Avg episode reward: [(0, '8.480')]
[2025-03-22 15:49:52,776][03219] Fps is (10 sec: 4505.7, 60 sec: 4096.0, 300 sec: 3915.5). Total num frames: 2318336. Throughput: 0: 1018.5. Samples: 579962. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:49:52,779][03219] Avg episode reward: [(0, '8.690')]
[2025-03-22 15:49:57,538][03427] Updated weights for policy 0, policy_version 570 (0.0018)
[2025-03-22 15:49:57,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4096.1, 300 sec: 3915.5). Total num frames: 2334720. Throughput: 0: 1004.9. Samples: 581978. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:49:57,777][03219] Avg episode reward: [(0, '9.524')]
[2025-03-22 15:49:57,783][03414] Saving new best policy, reward=9.524!
[2025-03-22 15:50:02,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3929.4). Total num frames: 2355200. Throughput: 0: 1014.1. Samples: 588260. Policy #0 lag: (min: 0.0, avg: 0.3, max: 2.0)
[2025-03-22 15:50:02,780][03219] Avg episode reward: [(0, '9.506')]
[2025-03-22 15:50:06,324][03427] Updated weights for policy 0, policy_version 580 (0.0024)
[2025-03-22 15:50:07,783][03219] Fps is (10 sec: 4502.4, 60 sec: 4095.5, 300 sec: 3929.3). Total num frames: 2379776. Throughput: 0: 1013.0. Samples: 595080. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:50:07,785][03219] Avg episode reward: [(0, '9.683')]
[2025-03-22 15:50:07,786][03414] Saving new best policy, reward=9.683!
[2025-03-22 15:50:12,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 3943.3). Total num frames: 2396160. Throughput: 0: 1008.0. Samples: 597122. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:50:12,779][03219] Avg episode reward: [(0, '9.998')]
[2025-03-22 15:50:12,787][03414] Saving new best policy, reward=9.998!
[2025-03-22 15:50:16,995][03427] Updated weights for policy 0, policy_version 590 (0.0015)
[2025-03-22 15:50:17,776][03219] Fps is (10 sec: 3689.0, 60 sec: 4027.7, 300 sec: 3943.3). Total num frames: 2416640. Throughput: 0: 1017.4. Samples: 603558. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:50:17,779][03219] Avg episode reward: [(0, '10.634')]
[2025-03-22 15:50:17,782][03414] Saving new best policy, reward=10.634!
[2025-03-22 15:50:22,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.8, 300 sec: 3943.3). Total num frames: 2437120. Throughput: 0: 1007.5. Samples: 610158. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:50:22,777][03219] Avg episode reward: [(0, '12.168')]
[2025-03-22 15:50:22,782][03414] Saving new best policy, reward=12.168!
[2025-03-22 15:50:27,775][03427] Updated weights for policy 0, policy_version 600 (0.0030)
[2025-03-22 15:50:27,777][03219] Fps is (10 sec: 4095.7, 60 sec: 4096.0, 300 sec: 3957.2). Total num frames: 2457600. Throughput: 0: 1005.0. Samples: 612090. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0)
[2025-03-22 15:50:27,783][03219] Avg episode reward: [(0, '11.299')]
[2025-03-22 15:50:32,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3984.9). Total num frames: 2478080. Throughput: 0: 1015.2. Samples: 618734. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:50:32,780][03219] Avg episode reward: [(0, '11.376')]
[2025-03-22 15:50:36,984][03427] Updated weights for policy 0, policy_version 610 (0.0014)
[2025-03-22 15:50:37,777][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.8, 300 sec: 3998.8). Total num frames: 2498560. Throughput: 0: 1005.1. Samples: 625190. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0)
[2025-03-22 15:50:37,780][03219] Avg episode reward: [(0, '10.408')]
[2025-03-22 15:50:42,778][03219] Fps is (10 sec: 3685.6, 60 sec: 4027.6, 300 sec: 3998.8). Total num frames: 2514944. Throughput: 0: 1006.1. Samples: 627254. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:50:42,780][03219] Avg episode reward: [(0, '10.239')]
[2025-03-22 15:50:47,218][03427] Updated weights for policy 0, policy_version 620 (0.0019)
[2025-03-22 15:50:47,776][03219] Fps is (10 sec: 4096.3, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2539520. Throughput: 0: 1019.6. Samples: 634142. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:50:47,780][03219] Avg episode reward: [(0, '9.877')]
[2025-03-22 15:50:52,776][03219] Fps is (10 sec: 4506.6, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2560000. Throughput: 0: 1006.2. Samples: 640352. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:50:52,783][03219] Avg episode reward: [(0, '10.574')]
[2025-03-22 15:50:57,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2576384. Throughput: 0: 1006.8. Samples: 642430. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0)
[2025-03-22 15:50:57,777][03219] Avg episode reward: [(0, '10.810')]
[2025-03-22 15:50:57,897][03427] Updated weights for policy 0, policy_version 630 (0.0024)
[2025-03-22 15:51:02,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4096.0, 300 sec: 4026.6). Total num frames: 2600960. Throughput: 0: 1016.6. Samples: 649304. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:51:02,781][03219] Avg episode reward: [(0, '11.097')]
[2025-03-22 15:51:02,790][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000635_2600960.pth...
[2025-03-22 15:51:02,949][03414] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000397_1626112.pth
[2025-03-22 15:51:07,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3959.9, 300 sec: 4012.7). Total num frames: 2617344. Throughput: 0: 999.0. Samples: 655112. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:51:07,779][03219] Avg episode reward: [(0, '12.026')]
[2025-03-22 15:51:08,173][03427] Updated weights for policy 0, policy_version 640 (0.0018)
[2025-03-22 15:51:12,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2637824. Throughput: 0: 1003.5. Samples: 657246. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:51:12,781][03219] Avg episode reward: [(0, '11.942')]
[2025-03-22 15:51:17,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2658304. Throughput: 0: 1006.6. Samples: 664030. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:51:17,782][03219] Avg episode reward: [(0, '12.031')]
[2025-03-22 15:51:17,973][03427] Updated weights for policy 0, policy_version 650 (0.0024)
[2025-03-22 15:51:22,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 2678784. Throughput: 0: 995.6. Samples: 669990. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0)
[2025-03-22 15:51:22,780][03219] Avg episode reward: [(0, '13.076')]
[2025-03-22 15:51:22,785][03414] Saving new best policy, reward=13.076!
[2025-03-22 15:51:27,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.8, 300 sec: 4040.5). Total num frames: 2699264. Throughput: 0: 1000.6. Samples: 672278. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0)
[2025-03-22 15:51:27,781][03219] Avg episode reward: [(0, '12.828')]
[2025-03-22 15:51:28,611][03427] Updated weights for policy 0, policy_version 660 (0.0013)
[2025-03-22 15:51:32,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2719744. Throughput: 0: 1004.4. Samples: 679340. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0)
[2025-03-22 15:51:32,782][03219] Avg episode reward: [(0, '13.384')]
[2025-03-22 15:51:32,790][03414] Saving new best policy, reward=13.384!
|
[2025-03-22 15:51:37,780][03219] Fps is (10 sec: 3684.9, 60 sec: 3959.2, 300 sec: 4012.6). Total num frames: 2736128. Throughput: 0: 993.1. Samples: 685044. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:51:37,786][03219] Avg episode reward: [(0, '12.395')] |
|
[2025-03-22 15:51:39,259][03427] Updated weights for policy 0, policy_version 670 (0.0015) |
|
[2025-03-22 15:51:42,776][03219] Fps is (10 sec: 4096.1, 60 sec: 4096.2, 300 sec: 4040.5). Total num frames: 2760704. Throughput: 0: 1004.3. Samples: 687624. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:51:42,780][03219] Avg episode reward: [(0, '11.518')] |
|
[2025-03-22 15:51:47,776][03219] Fps is (10 sec: 4507.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2781184. Throughput: 0: 1008.0. Samples: 694662. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-22 15:51:47,778][03219] Avg episode reward: [(0, '12.925')] |
|
[2025-03-22 15:51:47,992][03427] Updated weights for policy 0, policy_version 680 (0.0017) |
|
[2025-03-22 15:51:52,777][03219] Fps is (10 sec: 3685.9, 60 sec: 3959.4, 300 sec: 4026.6). Total num frames: 2797568. Throughput: 0: 999.8. Samples: 700104. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:51:52,779][03219] Avg episode reward: [(0, '12.732')] |
|
[2025-03-22 15:51:57,776][03219] Fps is (10 sec: 3686.5, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2818048. Throughput: 0: 1011.5. Samples: 702764. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:51:57,781][03219] Avg episode reward: [(0, '13.907')] |
|
[2025-03-22 15:51:57,847][03414] Saving new best policy, reward=13.907! |
|
[2025-03-22 15:51:58,908][03427] Updated weights for policy 0, policy_version 690 (0.0018) |
|
[2025-03-22 15:52:02,776][03219] Fps is (10 sec: 4506.2, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2842624. Throughput: 0: 1013.8. Samples: 709650. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:52:02,778][03219] Avg episode reward: [(0, '14.310')] |
|
[2025-03-22 15:52:02,791][03414] Saving new best policy, reward=14.310! |
|
[2025-03-22 15:52:07,776][03219] Fps is (10 sec: 4095.9, 60 sec: 4027.7, 300 sec: 4026.6). Total num frames: 2859008. Throughput: 0: 997.4. Samples: 714874. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:52:07,780][03219] Avg episode reward: [(0, '13.965')] |
|
[2025-03-22 15:52:09,632][03427] Updated weights for policy 0, policy_version 700 (0.0014) |
|
[2025-03-22 15:52:12,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2879488. Throughput: 0: 1009.3. Samples: 717696. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-22 15:52:12,782][03219] Avg episode reward: [(0, '14.523')] |
|
[2025-03-22 15:52:12,789][03414] Saving new best policy, reward=14.523! |
|
[2025-03-22 15:52:17,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2899968. Throughput: 0: 1003.3. Samples: 724490. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-22 15:52:17,777][03219] Avg episode reward: [(0, '14.812')] |
|
[2025-03-22 15:52:17,854][03414] Saving new best policy, reward=14.812! |
|
[2025-03-22 15:52:19,249][03427] Updated weights for policy 0, policy_version 710 (0.0029) |
|
[2025-03-22 15:52:22,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 2916352. Throughput: 0: 988.4. Samples: 729516. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-22 15:52:22,781][03219] Avg episode reward: [(0, '15.211')] |
|
[2025-03-22 15:52:22,791][03414] Saving new best policy, reward=15.211! |
|
[2025-03-22 15:52:27,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4040.5). Total num frames: 2936832. Throughput: 0: 993.2. Samples: 732316. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:52:27,777][03219] Avg episode reward: [(0, '15.374')] |
|
[2025-03-22 15:52:27,786][03414] Saving new best policy, reward=15.374! |
|
[2025-03-22 15:52:29,967][03427] Updated weights for policy 0, policy_version 720 (0.0015) |
|
[2025-03-22 15:52:32,776][03219] Fps is (10 sec: 4505.7, 60 sec: 4027.7, 300 sec: 4040.5). Total num frames: 2961408. Throughput: 0: 986.3. Samples: 739046. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:52:32,777][03219] Avg episode reward: [(0, '16.102')] |
|
[2025-03-22 15:52:32,785][03414] Saving new best policy, reward=16.102! |
|
[2025-03-22 15:52:37,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.7, 300 sec: 4012.7). Total num frames: 2973696. Throughput: 0: 974.5. Samples: 743956. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-22 15:52:37,778][03219] Avg episode reward: [(0, '15.941')] |
|
[2025-03-22 15:52:41,235][03427] Updated weights for policy 0, policy_version 730 (0.0026) |
|
[2025-03-22 15:52:42,776][03219] Fps is (10 sec: 3276.8, 60 sec: 3891.2, 300 sec: 4026.6). Total num frames: 2994176. Throughput: 0: 980.3. Samples: 746878. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:52:42,782][03219] Avg episode reward: [(0, '15.761')] |
|
[2025-03-22 15:52:47,780][03219] Fps is (10 sec: 4503.8, 60 sec: 3959.2, 300 sec: 4026.5). Total num frames: 3018752. Throughput: 0: 977.6. Samples: 753648. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-22 15:52:47,781][03219] Avg episode reward: [(0, '16.045')] |
|
[2025-03-22 15:52:51,944][03427] Updated weights for policy 0, policy_version 740 (0.0021) |
|
[2025-03-22 15:52:52,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3891.3, 300 sec: 4012.7). Total num frames: 3031040. Throughput: 0: 969.9. Samples: 758520. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:52:52,778][03219] Avg episode reward: [(0, '15.885')] |
|
[2025-03-22 15:52:57,776][03219] Fps is (10 sec: 3687.8, 60 sec: 3959.4, 300 sec: 4026.6). Total num frames: 3055616. Throughput: 0: 978.9. Samples: 761746. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:52:57,778][03219] Avg episode reward: [(0, '16.483')] |
|
[2025-03-22 15:52:57,781][03414] Saving new best policy, reward=16.483! |
|
[2025-03-22 15:53:01,205][03427] Updated weights for policy 0, policy_version 750 (0.0017) |
|
[2025-03-22 15:53:02,779][03219] Fps is (10 sec: 4504.2, 60 sec: 3891.0, 300 sec: 4026.5). Total num frames: 3076096. Throughput: 0: 979.2. Samples: 768558. Policy #0 lag: (min: 0.0, avg: 0.7, max: 2.0) |
|
[2025-03-22 15:53:02,785][03219] Avg episode reward: [(0, '15.047')] |
|
[2025-03-22 15:53:02,797][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000751_3076096.pth... |
|
[2025-03-22 15:53:02,968][03414] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000516_2113536.pth |
|
[2025-03-22 15:53:07,776][03219] Fps is (10 sec: 3686.5, 60 sec: 3891.2, 300 sec: 4012.7). Total num frames: 3092480. Throughput: 0: 973.8. Samples: 773338. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-22 15:53:07,778][03219] Avg episode reward: [(0, '14.267')] |
|
[2025-03-22 15:53:11,876][03427] Updated weights for policy 0, policy_version 760 (0.0033) |
|
[2025-03-22 15:53:12,776][03219] Fps is (10 sec: 3687.5, 60 sec: 3891.2, 300 sec: 4012.7). Total num frames: 3112960. Throughput: 0: 987.6. Samples: 776756. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-22 15:53:12,780][03219] Avg episode reward: [(0, '15.728')] |
|
[2025-03-22 15:53:17,776][03219] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 3137536. Throughput: 0: 992.0. Samples: 783688. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-22 15:53:17,779][03219] Avg episode reward: [(0, '15.968')] |
|
[2025-03-22 15:53:22,457][03427] Updated weights for policy 0, policy_version 770 (0.0017) |
|
[2025-03-22 15:53:22,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 3153920. Throughput: 0: 990.7. Samples: 788538. Policy #0 lag: (min: 0.0, avg: 0.2, max: 1.0) |
|
[2025-03-22 15:53:22,782][03219] Avg episode reward: [(0, '16.199')] |
|
[2025-03-22 15:53:27,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4026.6). Total num frames: 3174400. Throughput: 0: 998.9. Samples: 791828. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:53:27,780][03219] Avg episode reward: [(0, '17.856')] |
|
[2025-03-22 15:53:27,783][03414] Saving new best policy, reward=17.856! |
|
[2025-03-22 15:53:31,660][03427] Updated weights for policy 0, policy_version 780 (0.0017) |
|
[2025-03-22 15:53:32,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3891.2, 300 sec: 4012.7). Total num frames: 3194880. Throughput: 0: 999.7. Samples: 798630. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-22 15:53:32,780][03219] Avg episode reward: [(0, '18.213')] |
|
[2025-03-22 15:53:32,793][03414] Saving new best policy, reward=18.213! |
|
[2025-03-22 15:53:37,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 4012.7). Total num frames: 3211264. Throughput: 0: 997.5. Samples: 803406. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:53:37,781][03219] Avg episode reward: [(0, '17.842')] |
|
[2025-03-22 15:53:42,506][03427] Updated weights for policy 0, policy_version 790 (0.0015) |
|
[2025-03-22 15:53:42,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3235840. Throughput: 0: 999.6. Samples: 806728. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:53:42,778][03219] Avg episode reward: [(0, '18.180')] |
|
[2025-03-22 15:53:47,781][03219] Fps is (10 sec: 4503.4, 60 sec: 3959.4, 300 sec: 4012.6). Total num frames: 3256320. Throughput: 0: 1002.0. Samples: 813650. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:53:47,783][03219] Avg episode reward: [(0, '17.431')] |
|
[2025-03-22 15:53:52,776][03219] Fps is (10 sec: 3686.3, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3272704. Throughput: 0: 1001.0. Samples: 818384. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-22 15:53:52,778][03219] Avg episode reward: [(0, '17.132')] |
|
[2025-03-22 15:53:53,305][03427] Updated weights for policy 0, policy_version 800 (0.0033) |
|
[2025-03-22 15:53:57,776][03219] Fps is (10 sec: 3688.2, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 3293184. Throughput: 0: 998.0. Samples: 821666. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:53:57,777][03219] Avg episode reward: [(0, '17.750')] |
|
[2025-03-22 15:54:02,776][03219] Fps is (10 sec: 4096.1, 60 sec: 3959.7, 300 sec: 3998.8). Total num frames: 3313664. Throughput: 0: 994.1. Samples: 828424. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-22 15:54:02,780][03219] Avg episode reward: [(0, '17.928')] |
|
[2025-03-22 15:54:03,283][03427] Updated weights for policy 0, policy_version 810 (0.0016) |
|
[2025-03-22 15:54:07,776][03219] Fps is (10 sec: 4095.9, 60 sec: 4027.7, 300 sec: 4012.7). Total num frames: 3334144. Throughput: 0: 992.5. Samples: 833200. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:54:07,778][03219] Avg episode reward: [(0, '18.541')] |
|
[2025-03-22 15:54:07,779][03414] Saving new best policy, reward=18.541! |
|
[2025-03-22 15:54:12,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3998.8). Total num frames: 3354624. Throughput: 0: 995.3. Samples: 836618. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:54:12,777][03219] Avg episode reward: [(0, '19.677')] |
|
[2025-03-22 15:54:12,785][03414] Saving new best policy, reward=19.677! |
|
[2025-03-22 15:54:13,309][03427] Updated weights for policy 0, policy_version 820 (0.0016) |
|
[2025-03-22 15:54:17,777][03219] Fps is (10 sec: 3686.1, 60 sec: 3891.1, 300 sec: 3984.9). Total num frames: 3371008. Throughput: 0: 989.1. Samples: 843140. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:54:17,778][03219] Avg episode reward: [(0, '19.758')] |
|
[2025-03-22 15:54:17,806][03414] Saving new best policy, reward=19.758! |
|
[2025-03-22 15:54:22,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3998.8). Total num frames: 3391488. Throughput: 0: 993.4. Samples: 848108. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:54:22,778][03219] Avg episode reward: [(0, '20.368')] |
|
[2025-03-22 15:54:22,790][03414] Saving new best policy, reward=20.368! |
|
[2025-03-22 15:54:24,150][03427] Updated weights for policy 0, policy_version 830 (0.0015) |
|
[2025-03-22 15:54:27,776][03219] Fps is (10 sec: 4096.4, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3411968. Throughput: 0: 992.0. Samples: 851370. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-22 15:54:27,782][03219] Avg episode reward: [(0, '18.790')] |
|
[2025-03-22 15:54:32,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3432448. Throughput: 0: 979.7. Samples: 857730. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:54:32,780][03219] Avg episode reward: [(0, '20.068')] |
|
[2025-03-22 15:54:35,098][03427] Updated weights for policy 0, policy_version 840 (0.0028) |
|
[2025-03-22 15:54:37,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3448832. Throughput: 0: 990.1. Samples: 862938. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:54:37,780][03219] Avg episode reward: [(0, '19.636')] |
|
[2025-03-22 15:54:42,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3473408. Throughput: 0: 991.7. Samples: 866292. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:54:42,782][03219] Avg episode reward: [(0, '20.835')] |
|
[2025-03-22 15:54:42,789][03414] Saving new best policy, reward=20.835! |
|
[2025-03-22 15:54:44,142][03427] Updated weights for policy 0, policy_version 850 (0.0013) |
|
[2025-03-22 15:54:47,779][03219] Fps is (10 sec: 4094.8, 60 sec: 3891.3, 300 sec: 3971.0). Total num frames: 3489792. Throughput: 0: 980.6. Samples: 872556. Policy #0 lag: (min: 0.0, avg: 0.6, max: 1.0) |
|
[2025-03-22 15:54:47,780][03219] Avg episode reward: [(0, '19.599')] |
|
[2025-03-22 15:54:52,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3510272. Throughput: 0: 987.1. Samples: 877620. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:54:52,782][03219] Avg episode reward: [(0, '21.726')] |
|
[2025-03-22 15:54:52,790][03414] Saving new best policy, reward=21.726! |
|
[2025-03-22 15:54:55,245][03427] Updated weights for policy 0, policy_version 860 (0.0020) |
|
[2025-03-22 15:54:57,776][03219] Fps is (10 sec: 4097.2, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3530752. Throughput: 0: 987.8. Samples: 881068. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:54:57,782][03219] Avg episode reward: [(0, '20.489')] |
|
[2025-03-22 15:55:02,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 3547136. Throughput: 0: 976.2. Samples: 887068. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:55:02,782][03219] Avg episode reward: [(0, '20.337')] |
|
[2025-03-22 15:55:02,790][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000866_3547136.pth... |
|
[2025-03-22 15:55:02,955][03414] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000635_2600960.pth |
|
[2025-03-22 15:55:06,066][03427] Updated weights for policy 0, policy_version 870 (0.0013) |
|
[2025-03-22 15:55:07,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3971.0). Total num frames: 3567616. Throughput: 0: 983.5. Samples: 892366. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-22 15:55:07,777][03219] Avg episode reward: [(0, '19.988')] |
|
[2025-03-22 15:55:12,776][03219] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3984.9). Total num frames: 3592192. Throughput: 0: 987.8. Samples: 895822. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:55:12,782][03219] Avg episode reward: [(0, '20.494')] |
|
[2025-03-22 15:55:15,940][03427] Updated weights for policy 0, policy_version 880 (0.0026) |
|
[2025-03-22 15:55:17,778][03219] Fps is (10 sec: 4095.2, 60 sec: 3959.4, 300 sec: 3971.0). Total num frames: 3608576. Throughput: 0: 979.2. Samples: 901798. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:55:17,779][03219] Avg episode reward: [(0, '21.034')] |
|
[2025-03-22 15:55:22,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3629056. Throughput: 0: 991.2. Samples: 907542. Policy #0 lag: (min: 0.0, avg: 0.6, max: 2.0) |
|
[2025-03-22 15:55:22,777][03219] Avg episode reward: [(0, '20.752')] |
|
[2025-03-22 15:55:26,038][03427] Updated weights for policy 0, policy_version 890 (0.0014) |
|
[2025-03-22 15:55:27,776][03219] Fps is (10 sec: 4096.8, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3649536. Throughput: 0: 992.3. Samples: 910944. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:55:27,777][03219] Avg episode reward: [(0, '20.562')] |
|
[2025-03-22 15:55:32,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3891.2, 300 sec: 3957.2). Total num frames: 3665920. Throughput: 0: 979.6. Samples: 916634. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:55:32,781][03219] Avg episode reward: [(0, '20.165')] |
|
[2025-03-22 15:55:36,924][03427] Updated weights for policy 0, policy_version 900 (0.0029) |
|
[2025-03-22 15:55:37,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3985.0). Total num frames: 3690496. Throughput: 0: 995.3. Samples: 922410. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:55:37,777][03219] Avg episode reward: [(0, '20.053')] |
|
[2025-03-22 15:55:42,776][03219] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3710976. Throughput: 0: 994.3. Samples: 925810. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-22 15:55:42,777][03219] Avg episode reward: [(0, '20.387')] |
|
[2025-03-22 15:55:47,282][03427] Updated weights for policy 0, policy_version 910 (0.0020) |
|
[2025-03-22 15:55:47,776][03219] Fps is (10 sec: 3686.3, 60 sec: 3959.7, 300 sec: 3957.2). Total num frames: 3727360. Throughput: 0: 986.4. Samples: 931456. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:55:47,778][03219] Avg episode reward: [(0, '20.776')] |
|
[2025-03-22 15:55:52,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3747840. Throughput: 0: 1000.8. Samples: 937404. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:55:52,782][03219] Avg episode reward: [(0, '20.376')] |
|
[2025-03-22 15:55:56,571][03427] Updated weights for policy 0, policy_version 920 (0.0013) |
|
[2025-03-22 15:55:57,776][03219] Fps is (10 sec: 4505.7, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 3772416. Throughput: 0: 1001.6. Samples: 940896. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-22 15:55:57,779][03219] Avg episode reward: [(0, '21.251')] |
|
[2025-03-22 15:56:02,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 3784704. Throughput: 0: 987.2. Samples: 946218. Policy #0 lag: (min: 0.0, avg: 0.4, max: 2.0) |
|
[2025-03-22 15:56:02,780][03219] Avg episode reward: [(0, '20.183')] |
|
[2025-03-22 15:56:07,389][03427] Updated weights for policy 0, policy_version 930 (0.0020) |
|
[2025-03-22 15:56:07,776][03219] Fps is (10 sec: 3686.4, 60 sec: 4027.7, 300 sec: 3971.0). Total num frames: 3809280. Throughput: 0: 997.7. Samples: 952438. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-22 15:56:07,777][03219] Avg episode reward: [(0, '20.141')] |
|
[2025-03-22 15:56:12,776][03219] Fps is (10 sec: 4505.6, 60 sec: 3959.5, 300 sec: 3971.0). Total num frames: 3829760. Throughput: 0: 997.2. Samples: 955818. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:56:12,780][03219] Avg episode reward: [(0, '19.740')] |
|
[2025-03-22 15:56:17,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.6, 300 sec: 3957.2). Total num frames: 3846144. Throughput: 0: 989.2. Samples: 961146. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:56:17,777][03219] Avg episode reward: [(0, '20.032')] |
|
[2025-03-22 15:56:18,141][03427] Updated weights for policy 0, policy_version 940 (0.0017) |
|
[2025-03-22 15:56:22,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 3866624. Throughput: 0: 1002.0. Samples: 967502. Policy #0 lag: (min: 0.0, avg: 0.4, max: 1.0) |
|
[2025-03-22 15:56:22,778][03219] Avg episode reward: [(0, '19.139')] |
|
[2025-03-22 15:56:27,396][03427] Updated weights for policy 0, policy_version 950 (0.0016) |
|
[2025-03-22 15:56:27,780][03219] Fps is (10 sec: 4503.7, 60 sec: 4027.4, 300 sec: 3971.0). Total num frames: 3891200. Throughput: 0: 1002.9. Samples: 970946. Policy #0 lag: (min: 0.0, avg: 0.5, max: 1.0) |
|
[2025-03-22 15:56:27,785][03219] Avg episode reward: [(0, '19.745')] |
|
[2025-03-22 15:56:32,776][03219] Fps is (10 sec: 4096.0, 60 sec: 4027.7, 300 sec: 3971.1). Total num frames: 3907584. Throughput: 0: 986.4. Samples: 975846. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:56:32,778][03219] Avg episode reward: [(0, '19.496')] |
|
[2025-03-22 15:56:37,776][03219] Fps is (10 sec: 3687.9, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 3928064. Throughput: 0: 999.7. Samples: 982392. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:56:37,778][03219] Avg episode reward: [(0, '18.704')] |
|
[2025-03-22 15:56:38,073][03427] Updated weights for policy 0, policy_version 960 (0.0014) |
|
[2025-03-22 15:56:42,776][03219] Fps is (10 sec: 4096.0, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 3948544. Throughput: 0: 995.6. Samples: 985700. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:56:42,778][03219] Avg episode reward: [(0, '18.520')] |
|
[2025-03-22 15:56:47,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 3964928. Throughput: 0: 988.0. Samples: 990678. Policy #0 lag: (min: 0.0, avg: 0.3, max: 1.0) |
|
[2025-03-22 15:56:47,782][03219] Avg episode reward: [(0, '18.811')] |
|
[2025-03-22 15:56:49,624][03427] Updated weights for policy 0, policy_version 970 (0.0026) |
|
[2025-03-22 15:56:52,776][03219] Fps is (10 sec: 3686.4, 60 sec: 3959.5, 300 sec: 3957.2). Total num frames: 3985408. Throughput: 0: 981.7. Samples: 996616. Policy #0 lag: (min: 0.0, avg: 0.5, max: 2.0) |
|
[2025-03-22 15:56:52,781][03219] Avg episode reward: [(0, '19.875')] |
|
[2025-03-22 15:56:56,717][03414] Stopping Batcher_0... |
|
[2025-03-22 15:56:56,717][03219] Component Batcher_0 stopped! |
|
[2025-03-22 15:56:56,718][03414] Loop batcher_evt_loop terminating... |
|
[2025-03-22 15:56:56,720][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 15:56:56,722][03219] Component RolloutWorker_w0 process died already! Don't wait for it. |
|
[2025-03-22 15:56:56,791][03427] Weights refcount: 2 0 |
|
[2025-03-22 15:56:56,794][03219] Component InferenceWorker_p0-w0 stopped! |
|
[2025-03-22 15:56:56,801][03427] Stopping InferenceWorker_p0-w0... |
|
[2025-03-22 15:56:56,801][03427] Loop inference_proc0-0_evt_loop terminating... |
|
[2025-03-22 15:56:56,874][03414] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000751_3076096.pth |
|
[2025-03-22 15:56:56,902][03414] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 15:56:57,104][03219] Component LearnerWorker_p0 stopped! |
|
[2025-03-22 15:56:57,103][03414] Stopping LearnerWorker_p0... |
|
[2025-03-22 15:56:57,105][03414] Loop learner_proc0_evt_loop terminating... |
|
[2025-03-22 15:56:57,298][03431] Stopping RolloutWorker_w3... |
|
[2025-03-22 15:56:57,299][03431] Loop rollout_proc3_evt_loop terminating... |
|
[2025-03-22 15:56:57,298][03219] Component RolloutWorker_w3 stopped! |
|
[2025-03-22 15:56:57,348][03433] Stopping RolloutWorker_w5... |
|
[2025-03-22 15:56:57,349][03219] Component RolloutWorker_w5 stopped! |
|
[2025-03-22 15:56:57,351][03428] Stopping RolloutWorker_w1... |
|
[2025-03-22 15:56:57,351][03219] Component RolloutWorker_w1 stopped! |
|
[2025-03-22 15:56:57,349][03433] Loop rollout_proc5_evt_loop terminating... |
|
[2025-03-22 15:56:57,352][03428] Loop rollout_proc1_evt_loop terminating... |
|
[2025-03-22 15:56:57,392][03435] Stopping RolloutWorker_w7... |
|
[2025-03-22 15:56:57,393][03435] Loop rollout_proc7_evt_loop terminating... |
|
[2025-03-22 15:56:57,392][03219] Component RolloutWorker_w7 stopped! |
|
[2025-03-22 15:56:57,425][03219] Component RolloutWorker_w4 stopped! |
|
[2025-03-22 15:56:57,429][03432] Stopping RolloutWorker_w4... |
|
[2025-03-22 15:56:57,430][03432] Loop rollout_proc4_evt_loop terminating... |
|
[2025-03-22 15:56:57,436][03219] Component RolloutWorker_w2 stopped! |
|
[2025-03-22 15:56:57,438][03430] Stopping RolloutWorker_w2... |
|
[2025-03-22 15:56:57,439][03430] Loop rollout_proc2_evt_loop terminating... |
|
[2025-03-22 15:56:57,506][03219] Component RolloutWorker_w6 stopped! |
|
[2025-03-22 15:56:57,510][03219] Waiting for process learner_proc0 to stop... |
|
[2025-03-22 15:56:57,511][03434] Stopping RolloutWorker_w6... |
|
[2025-03-22 15:56:57,516][03434] Loop rollout_proc6_evt_loop terminating... |
|
[2025-03-22 15:56:59,575][03219] Waiting for process inference_proc0-0 to join... |
|
[2025-03-22 15:56:59,687][03219] Waiting for process rollout_proc0 to join... |
|
[2025-03-22 15:56:59,688][03219] Waiting for process rollout_proc1 to join... |
|
[2025-03-22 15:57:02,117][03219] Waiting for process rollout_proc2 to join... |
|
[2025-03-22 15:57:02,118][03219] Waiting for process rollout_proc3 to join... |
|
[2025-03-22 15:57:02,119][03219] Waiting for process rollout_proc4 to join... |
|
[2025-03-22 15:57:02,120][03219] Waiting for process rollout_proc5 to join... |
|
[2025-03-22 15:57:02,121][03219] Waiting for process rollout_proc6 to join... |
|
[2025-03-22 15:57:02,123][03219] Waiting for process rollout_proc7 to join... |
|
[2025-03-22 15:57:02,124][03219] Batcher 0 profile tree view: |
|
batching: 26.0754, releasing_batches: 0.0299 |
|
[2025-03-22 15:57:02,125][03219] InferenceWorker_p0-w0 profile tree view: |
|
wait_policy: 0.0000 |
|
wait_policy_total: 384.9343 |
|
update_model: 8.9280 |
|
weight_update: 0.0017 |
|
one_step: 0.0026 |
|
handle_policy_step: 613.8213 |
|
deserialize: 14.4172, stack: 3.3808, obs_to_device_normalize: 131.0309, forward: 321.4640, send_messages: 27.6357 |
|
prepare_outputs: 89.5364 |
|
to_cpu: 55.5379 |
|
[2025-03-22 15:57:02,127][03219] Learner 0 profile tree view: |
|
misc: 0.0042, prepare_batch: 12.7394 |
|
train: 73.5539 |
|
epoch_init: 0.0051, minibatch_init: 0.0062, losses_postprocess: 0.7201, kl_divergence: 0.7135, after_optimizer: 33.2484 |
|
calculate_losses: 26.0906 |
|
losses_init: 0.0114, forward_head: 1.3828, bptt_initial: 17.2202, tail: 1.2022, advantages_returns: 0.3084, losses: 3.5212 |
|
bptt: 2.1486 |
|
bptt_forward_core: 2.0358 |
|
update: 12.2106 |
|
clip: 1.0642 |
|
[2025-03-22 15:57:02,128][03219] RolloutWorker_w7 profile tree view: |
|
wait_for_trajectories: 0.2852, enqueue_policy_requests: 81.3337, env_step: 837.6647, overhead: 12.5524, complete_rollouts: 8.0634 |
|
save_policy_outputs: 21.7719 |
|
split_output_tensors: 8.0823 |
|
[2025-03-22 15:57:02,130][03219] Loop Runner_EvtLoop terminating... |
|
[2025-03-22 15:57:02,131][03219] Runner profile tree view: |
|
main_loop: 1074.9850 |
|
[2025-03-22 15:57:02,132][03219] Collected {0: 4005888}, FPS: 3726.5 |
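The final summary is internally consistent with the runner profile above: 4,005,888 collected frames divided by the 1,074.985 s main loop gives the reported ~3,726.5 FPS.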
|
[2025-03-22 15:57:39,396][03219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-03-22 15:57:39,397][03219] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-22 15:57:39,398][03219] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-22 15:57:39,399][03219] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-22 15:57:39,400][03219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 15:57:39,401][03219] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-22 15:57:39,402][03219] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 15:57:39,403][03219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-22 15:57:39,404][03219] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-03-22 15:57:39,405][03219] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-03-22 15:57:39,406][03219] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-22 15:57:39,407][03219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-22 15:57:39,409][03219] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-22 15:57:39,410][03219] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-22 15:57:39,411][03219] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-03-22 15:57:39,440][03219] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 15:57:39,443][03219] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-22 15:57:39,445][03219] RunningMeanStd input shape: (1,) |
|
[2025-03-22 15:57:39,462][03219] ConvEncoder: input_channels=3 |
|
[2025-03-22 15:57:39,575][03219] Conv encoder output size: 512 |
|
[2025-03-22 15:57:39,576][03219] Policy head output size: 512 |
|
[2025-03-22 15:57:39,762][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 15:57:39,765][03219] Could not load from checkpoint, attempt 0 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. |
|
(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message. |
|
WeightsUnpickler error: Unsupported global: GLOBAL numpy._core.multiarray.scalar was not an allowed global by default. Please use `torch.serialization.add_safe_globals([scalar])` or the `torch.serialization.safe_globals([scalar])` context manager to allowlist this global if you trust this class/function. |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
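All three load attempts in this run fail on the same disallowed global, `numpy._core.multiarray.scalar`. Following option (2) in the error message above, a minimal sketch of the allowlist approach is shown below; it assumes PyTorch >= 2.6 with NumPy 2.x, reuses the checkpoint path from this log, and should only be applied to a trusted checkpoint:

```python
# Minimal sketch, not the Sample Factory code path: allowlist the global named
# in the WeightsUnpickler error so torch.load(..., weights_only=True) succeeds.
import torch
import torch.serialization
from numpy._core.multiarray import scalar  # private NumPy 2.x module, named by the error

torch.serialization.add_safe_globals([scalar])

checkpoint_dict = torch.load(
    "/content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth",
    map_location="cpu",  # assumption for the sketch; the learner maps to its own device
)
```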
|
[2025-03-22 15:57:39,767][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 15:57:39,770][03219] Could not load from checkpoint, attempt 1 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. |
|
(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message. |
|
WeightsUnpickler error: Unsupported global: GLOBAL numpy._core.multiarray.scalar was not an allowed global by default. Please use `torch.serialization.add_safe_globals([scalar])` or the `torch.serialization.safe_globals([scalar])` context manager to allowlist this global if you trust this class/function. |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 15:57:39,771][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 15:57:39,774][03219] Could not load from checkpoint, attempt 2 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. |
|
(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message. |
|
WeightsUnpickler error: Unsupported global: GLOBAL numpy._core.multiarray.scalar was not an allowed global by default. Please use `torch.serialization.add_safe_globals([scalar])` or the `torch.serialization.safe_globals([scalar])` context manager to allowlist this global if you trust this class/function. |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:02:16,459][03219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-03-22 16:02:16,460][03219] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-22 16:02:16,461][03219] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-22 16:02:16,461][03219] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-22 16:02:16,462][03219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:02:16,463][03219] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-22 16:02:16,464][03219] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:02:16,465][03219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-22 16:02:16,466][03219] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-03-22 16:02:16,467][03219] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-03-22 16:02:16,469][03219] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-22 16:02:16,471][03219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-22 16:02:16,472][03219] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-22 16:02:16,474][03219] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-22 16:02:16,475][03219] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-03-22 16:02:16,502][03219] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-22 16:02:16,504][03219] RunningMeanStd input shape: (1,) |
|
[2025-03-22 16:02:16,515][03219] ConvEncoder: input_channels=3 |
|
[2025-03-22 16:02:16,552][03219] Conv encoder output size: 512 |
|
[2025-03-22 16:02:16,553][03219] Policy head output size: 512 |
|
[2025-03-22 16:02:16,570][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:02:16,572][03219] Could not load from checkpoint, attempt 0 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. |
|
(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message. |
|
WeightsUnpickler error: Unsupported global: GLOBAL numpy.dtype was not an allowed global by default. Please use `torch.serialization.add_safe_globals([dtype])` or the `torch.serialization.safe_globals([dtype])` context manager to allowlist this global if you trust this class/function. |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:02:16,573][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:02:16,575][03219] Could not load from checkpoint, attempt 1 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. |
|
(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message. |
|
WeightsUnpickler error: Unsupported global: GLOBAL numpy.dtype was not an allowed global by default. Please use `torch.serialization.add_safe_globals([dtype])` or the `torch.serialization.safe_globals([dtype])` context manager to allowlist this global if you trust this class/function. |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:02:16,576][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:02:16,578][03219] Could not load from checkpoint, attempt 2 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. This file can still be loaded, to do so you have two options, do those steps only if you trust the source of the checkpoint. |
|
(1) In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
(2) Alternatively, to load with `weights_only=True` please check the recommended steps in the following error message. |
|
WeightsUnpickler error: Unsupported global: GLOBAL numpy.dtype was not an allowed global by default. Please use `torch.serialization.add_safe_globals([dtype])` or the `torch.serialization.safe_globals([dtype])` context manager to allowlist this global if you trust this class/function. |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
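The retries in this run now reject `numpy.dtype` instead, which suggests `scalar` was allowlisted between runs: each pass surfaces the next disallowed global. The error message also names a scoped variant, `torch.serialization.safe_globals`; a sketch with the allowlist extended accordingly (same assumptions as above) would be:

```python
import numpy as np
import torch
import torch.serialization
from numpy._core.multiarray import scalar

ckpt = "/content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth"

# safe_globals() scopes the allowlist to this one load instead of mutating global state.
with torch.serialization.safe_globals([scalar, np.dtype]):
    checkpoint_dict = torch.load(ckpt, map_location="cpu")
```

As the next run in the log shows, one more global (`numpy.dtypes.Float64DType`) still has to be allowlisted before the load goes through.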
|
[2025-03-22 16:03:41,547][03219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-03-22 16:03:41,548][03219] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-22 16:03:41,549][03219] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-22 16:03:41,550][03219] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-22 16:03:41,551][03219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:03:41,552][03219] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-22 16:03:41,553][03219] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:03:41,554][03219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-22 16:03:41,555][03219] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-03-22 16:03:41,556][03219] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-03-22 16:03:41,557][03219] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-22 16:03:41,558][03219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-22 16:03:41,558][03219] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-22 16:03:41,559][03219] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-22 16:03:41,560][03219] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-03-22 16:03:41,585][03219] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-22 16:03:41,587][03219] RunningMeanStd input shape: (1,) |
|
[2025-03-22 16:03:41,599][03219] ConvEncoder: input_channels=3 |
|
[2025-03-22 16:03:41,633][03219] Conv encoder output size: 512 |
|
[2025-03-22 16:03:41,634][03219] Policy head output size: 512 |
|
[2025-03-22 16:03:41,652][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:03:41,654][03219] Could not load from checkpoint, attempt 0 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Can only build Tensor, Parameter, OrderedDict or types allowlisted via `add_safe_globals`, but got <class 'numpy.dtypes.Float64DType'> |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:03:41,655][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:03:41,657][03219] Could not load from checkpoint, attempt 1 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Can only build Tensor, Parameter, OrderedDict or types allowlisted via `add_safe_globals`, but got <class 'numpy.dtypes.Float64DType'> |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:03:41,658][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:03:41,660][03219] Could not load from checkpoint, attempt 2 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Can only build Tensor, Parameter, OrderedDict or types allowlisted via `add_safe_globals`, but got <class 'numpy.dtypes.Float64DType'> |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:08:58,716][03219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-03-22 16:08:58,717][03219] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-22 16:08:58,718][03219] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-22 16:08:58,718][03219] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-22 16:08:58,719][03219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:08:58,720][03219] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-22 16:08:58,721][03219] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:08:58,722][03219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-22 16:08:58,723][03219] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-03-22 16:08:58,723][03219] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-03-22 16:08:58,724][03219] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-22 16:08:58,725][03219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-22 16:08:58,726][03219] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-22 16:08:58,727][03219] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-22 16:08:58,728][03219] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-03-22 16:08:58,758][03219] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-22 16:08:58,760][03219] RunningMeanStd input shape: (1,) |
|
[2025-03-22 16:08:58,771][03219] ConvEncoder: input_channels=3 |
|
[2025-03-22 16:08:58,806][03219] Conv encoder output size: 512 |
|
[2025-03-22 16:08:58,807][03219] Policy head output size: 512 |
|
[2025-03-22 16:08:58,828][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:08:58,830][03219] Could not load from checkpoint, attempt 0 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
# noinspection PyBroadException |
|
^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Can only build Tensor, Parameter, OrderedDict or types allowlisted via `add_safe_globals`, but got <class 'numpy.dtypes.Float64DType'> |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:08:58,832][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:08:58,834][03219] Could not load from checkpoint, attempt 1 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
# noinspection PyBroadException |
|
^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Can only build Tensor, Parameter, OrderedDict or types allowlisted via `add_safe_globals`, but got <class 'numpy.dtypes.Float64DType'> |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:08:58,836][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:08:58,837][03219] Could not load from checkpoint, attempt 2 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Can only build Tensor, Parameter, OrderedDict or types allowlisted via `add_safe_globals`, but got <class 'numpy.dtypes.Float64DType'> |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
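All three retries above fail identically: PyTorch 2.6 changed the default of the `weights_only` argument of `torch.load` from False to True, and this checkpoint pickles a numpy.dtypes.Float64DType object that the weights-only unpickler rejects. A minimal workaround sketch along the lines the error message itself suggests, assuming PyTorch >= 2.4 (for add_safe_globals), NumPy >= 1.25 (for numpy.dtypes), and a trusted checkpoint file (path copied from the log):

    import numpy as np
    import torch
    from torch.serialization import add_safe_globals

    # Allowlist the dtype class the unpickler complained about, then retry the load.
    add_safe_globals([np.dtypes.Float64DType])
    checkpoint = torch.load(
        "/content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth",
        map_location="cpu",
    )

    # Alternative for trusted files only: opt out of the new safe default entirely,
    # i.e. pass weights_only=False to torch.load.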
[2025-03-22 16:10:29,619][03219] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-03-22 16:10:29,620][03219] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-22 16:10:29,621][03219] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-22 16:10:29,622][03219] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-22 16:10:29,623][03219] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:10:29,624][03219] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-22 16:10:29,625][03219] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:10:29,626][03219] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-22 16:10:29,627][03219] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-03-22 16:10:29,628][03219] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-03-22 16:10:29,629][03219] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-22 16:10:29,630][03219] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-22 16:10:29,631][03219] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-22 16:10:29,632][03219] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-22 16:10:29,633][03219] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-03-22 16:10:29,662][03219] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-22 16:10:29,664][03219] RunningMeanStd input shape: (1,) |
|
[2025-03-22 16:10:29,676][03219] ConvEncoder: input_channels=3 |
|
[2025-03-22 16:10:29,712][03219] Conv encoder output size: 512 |
|
[2025-03-22 16:10:29,713][03219] Policy head output size: 512 |
|
[2025-03-22 16:10:29,733][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:10:29,734][03219] Could not load from checkpoint, attempt 0 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Can only build Tensor, Parameter, OrderedDict or types allowlisted via `add_safe_globals`, but got <class 'numpy.dtypes.Float64DType'> |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:10:29,736][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:10:29,738][03219] Could not load from checkpoint, attempt 1 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Can only build Tensor, Parameter, OrderedDict or types allowlisted via `add_safe_globals`, but got <class 'numpy.dtypes.Float64DType'> |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:10:29,739][03219] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:10:29,741][03219] Could not load from checkpoint, attempt 2 |
|
Traceback (most recent call last): |
|
File "/usr/local/lib/python3.11/dist-packages/sample_factory/algo/learning/learner.py", line 281, in load_checkpoint |
|
checkpoint_dict = torch.load(latest_checkpoint, map_location=device) |
|
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
|
File "/usr/local/lib/python3.11/dist-packages/torch/serialization.py", line 1470, in load |
|
raise pickle.UnpicklingError(_get_wo_message(str(e))) from None |
|
_pickle.UnpicklingError: Weights only load failed. In PyTorch 2.6, we changed the default value of the `weights_only` argument in `torch.load` from `False` to `True`. Re-running `torch.load` with `weights_only` set to `False` will likely succeed, but it can result in arbitrary code execution. Do it only if you got the file from a trusted source. |
|
Please file an issue with the following so that we can make `weights_only=True` compatible with your use case: WeightsUnpickler error: Can only build Tensor, Parameter, OrderedDict or types allowlisted via `add_safe_globals`, but got <class 'numpy.dtypes.Float64DType'> |
|
|
|
Check the documentation of torch.load to learn more about types accepted by default with weights_only https://pytorch.org/docs/stable/generated/torch.load.html. |
|
[2025-03-22 16:15:05,490][15900] Saving configuration to /content/train_dir/default_experiment/config.json... |
|
[2025-03-22 16:15:05,493][15900] Rollout worker 0 uses device cpu |
|
[2025-03-22 16:15:05,494][15900] Rollout worker 1 uses device cpu |
|
[2025-03-22 16:15:05,494][15900] Rollout worker 2 uses device cpu |
|
[2025-03-22 16:15:05,495][15900] Rollout worker 3 uses device cpu |
|
[2025-03-22 16:15:05,496][15900] Rollout worker 4 uses device cpu |
|
[2025-03-22 16:15:05,497][15900] Rollout worker 5 uses device cpu |
|
[2025-03-22 16:15:05,498][15900] Rollout worker 6 uses device cpu |
|
[2025-03-22 16:15:05,499][15900] Rollout worker 7 uses device cpu |
|
[2025-03-22 16:15:05,604][15900] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-22 16:15:05,605][15900] InferenceWorker_p0-w0: min num requests: 2 |
|
[2025-03-22 16:15:05,638][15900] Starting all processes... |
|
[2025-03-22 16:15:05,639][15900] Starting process learner_proc0 |
|
[2025-03-22 16:15:05,798][15900] Starting all processes... |
|
[2025-03-22 16:15:05,809][15900] Starting process inference_proc0-0 |
|
[2025-03-22 16:15:05,809][15900] Starting process rollout_proc0 |
|
[2025-03-22 16:15:05,813][15900] Starting process rollout_proc1 |
|
[2025-03-22 16:15:05,813][15900] Starting process rollout_proc2 |
|
[2025-03-22 16:15:05,813][15900] Starting process rollout_proc3 |
|
[2025-03-22 16:15:05,813][15900] Starting process rollout_proc4 |
|
[2025-03-22 16:15:05,813][15900] Starting process rollout_proc5 |
|
[2025-03-22 16:15:05,813][15900] Starting process rollout_proc6 |
|
[2025-03-22 16:15:05,813][15900] Starting process rollout_proc7 |
|
[2025-03-22 16:15:21,735][16056] Worker 2 uses CPU cores [0] |
|
[2025-03-22 16:15:21,738][16060] Worker 5 uses CPU cores [1] |
|
[2025-03-22 16:15:21,742][16061] Worker 6 uses CPU cores [0] |
|
[2025-03-22 16:15:21,885][16041] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-22 16:15:21,886][16041] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 |
|
[2025-03-22 16:15:21,957][16041] Num visible devices: 1 |
|
[2025-03-22 16:15:21,975][16041] Starting seed is not provided |
|
[2025-03-22 16:15:21,976][16041] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-22 16:15:21,976][16041] Initializing actor-critic model on device cuda:0 |
|
[2025-03-22 16:15:21,977][16041] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-22 16:15:21,982][16041] RunningMeanStd input shape: (1,) |
|
[2025-03-22 16:15:21,998][16059] Worker 4 uses CPU cores [0] |
|
[2025-03-22 16:15:22,046][16062] Worker 7 uses CPU cores [1] |
|
[2025-03-22 16:15:22,050][16058] Worker 0 uses CPU cores [0] |
|
[2025-03-22 16:15:22,050][16055] Worker 1 uses CPU cores [1] |
|
[2025-03-22 16:15:22,102][16057] Worker 3 uses CPU cores [1] |
|
[2025-03-22 16:15:22,125][16054] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-22 16:15:22,126][16054] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 |
|
[2025-03-22 16:15:22,148][16041] ConvEncoder: input_channels=3 |
|
[2025-03-22 16:15:22,149][16054] Num visible devices: 1 |
|
[2025-03-22 16:15:22,268][16041] Conv encoder output size: 512 |
|
[2025-03-22 16:15:22,268][16041] Policy head output size: 512 |
|
[2025-03-22 16:15:22,284][16041] Created Actor Critic model with architecture: |
|
[2025-03-22 16:15:22,285][16041] ActorCriticSharedWeights( |
|
(obs_normalizer): ObservationNormalizer( |
|
(running_mean_std): RunningMeanStdDictInPlace( |
|
(running_mean_std): ModuleDict( |
|
(obs): RunningMeanStdInPlace() |
|
) |
|
) |
|
) |
|
(returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) |
|
(encoder): VizdoomEncoder( |
|
(basic_encoder): ConvEncoder( |
|
(enc): RecursiveScriptModule( |
|
original_name=ConvEncoderImpl |
|
(conv_head): RecursiveScriptModule( |
|
original_name=Sequential |
|
(0): RecursiveScriptModule(original_name=Conv2d) |
|
(1): RecursiveScriptModule(original_name=ELU) |
|
(2): RecursiveScriptModule(original_name=Conv2d) |
|
(3): RecursiveScriptModule(original_name=ELU) |
|
(4): RecursiveScriptModule(original_name=Conv2d) |
|
(5): RecursiveScriptModule(original_name=ELU) |
|
) |
|
(mlp_layers): RecursiveScriptModule( |
|
original_name=Sequential |
|
(0): RecursiveScriptModule(original_name=Linear) |
|
(1): RecursiveScriptModule(original_name=ELU) |
|
) |
|
) |
|
) |
|
) |
|
(core): ModelCoreRNN( |
|
(core): GRU(512, 512) |
|
) |
|
(decoder): MlpDecoder( |
|
(mlp): Identity() |
|
) |
|
(critic_linear): Linear(in_features=512, out_features=1, bias=True) |
|
(action_parameterization): ActionParameterizationDefault( |
|
(distribution_linear): Linear(in_features=512, out_features=5, bias=True) |
|
) |
|
) |
|
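For a sense of scale: the printed heads are plain linear layers, Linear(512, 5) for the action distribution and Linear(512, 1) for the critic. A small illustrative helper (not part of Sample Factory) for counting trainable parameters of such modules:

    import torch.nn as nn

    def count_trainable(module: nn.Module) -> int:
        # Sum element counts over all parameters that receive gradients.
        return sum(p.numel() for p in module.parameters() if p.requires_grad)

    print(count_trainable(nn.Linear(512, 5)))  # 512 * 5 + 5 = 2565 (action head)
    print(count_trainable(nn.Linear(512, 1)))  # 512 * 1 + 1 = 513  (critic head)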
[2025-03-22 16:15:22,457][16041] Using optimizer <class 'torch.optim.adam.Adam'> |
|
[2025-03-22 16:15:23,923][16041] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000978_4005888.pth... |
|
[2025-03-22 16:15:24,130][16041] Loading model from checkpoint |
|
[2025-03-22 16:15:24,134][16041] Loaded experiment state at self.train_step=978, self.env_steps=4005888 |
|
[2025-03-22 16:15:24,134][16041] Initialized policy 0 weights for model version 978 |
|
[2025-03-22 16:15:24,139][16041] Using GPUs [0] for process 0 (actually maps to GPUs [0]) |
|
[2025-03-22 16:15:24,150][16041] LearnerWorker_p0 finished initialization! |
|
[2025-03-22 16:15:24,393][16054] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-22 16:15:24,396][16054] RunningMeanStd input shape: (1,) |
|
[2025-03-22 16:15:24,485][16054] ConvEncoder: input_channels=3 |
|
[2025-03-22 16:15:24,672][16054] Conv encoder output size: 512 |
|
[2025-03-22 16:15:24,673][16054] Policy head output size: 512 |
|
[2025-03-22 16:15:24,727][15900] Inference worker 0-0 is ready! |
|
[2025-03-22 16:15:24,730][15900] All inference workers are ready! Signal rollout workers to start! |
|
[2025-03-22 16:15:25,044][16062] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 16:15:25,060][16057] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 16:15:25,070][16055] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 16:15:25,077][16060] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 16:15:25,219][16058] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 16:15:25,381][16059] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 16:15:25,388][16061] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 16:15:25,397][16056] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 16:15:25,594][15900] Heartbeat connected on Batcher_0 |
|
[2025-03-22 16:15:25,603][15900] Heartbeat connected on LearnerWorker_p0 |
|
[2025-03-22 16:15:25,655][15900] Heartbeat connected on InferenceWorker_p0-w0 |
|
[2025-03-22 16:15:26,293][15900] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 4005888. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-03-22 16:15:27,483][16061] Decorrelating experience for 0 frames... |
|
[2025-03-22 16:15:27,484][16059] Decorrelating experience for 0 frames... |
|
[2025-03-22 16:15:27,486][16058] Decorrelating experience for 0 frames... |
|
[2025-03-22 16:15:27,556][16060] Decorrelating experience for 0 frames... |
|
[2025-03-22 16:15:27,559][16055] Decorrelating experience for 0 frames... |
|
[2025-03-22 16:15:27,561][16057] Decorrelating experience for 0 frames... |
|
[2025-03-22 16:15:27,563][16062] Decorrelating experience for 0 frames... |
|
[2025-03-22 16:15:28,716][16058] Decorrelating experience for 32 frames... |
|
[2025-03-22 16:15:28,722][16059] Decorrelating experience for 32 frames... |
|
[2025-03-22 16:15:28,772][16060] Decorrelating experience for 32 frames... |
|
[2025-03-22 16:15:28,775][16055] Decorrelating experience for 32 frames... |
|
[2025-03-22 16:15:28,784][16062] Decorrelating experience for 32 frames... |
|
[2025-03-22 16:15:28,844][16056] Decorrelating experience for 0 frames... |
|
[2025-03-22 16:15:30,044][16061] Decorrelating experience for 32 frames... |
|
[2025-03-22 16:15:30,146][16056] Decorrelating experience for 32 frames... |
|
[2025-03-22 16:15:30,450][16055] Decorrelating experience for 64 frames... |
|
[2025-03-22 16:15:30,452][16062] Decorrelating experience for 64 frames... |
|
[2025-03-22 16:15:30,455][16060] Decorrelating experience for 64 frames... |
|
[2025-03-22 16:15:30,486][16058] Decorrelating experience for 64 frames... |
|
[2025-03-22 16:15:31,210][16057] Decorrelating experience for 32 frames... |
|
[2025-03-22 16:15:31,295][15900] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 0.0. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-03-22 16:15:31,622][16059] Decorrelating experience for 64 frames... |
|
[2025-03-22 16:15:31,885][16060] Decorrelating experience for 96 frames... |
|
[2025-03-22 16:15:32,028][16061] Decorrelating experience for 64 frames... |
|
[2025-03-22 16:15:32,112][15900] Heartbeat connected on RolloutWorker_w5 |
|
[2025-03-22 16:15:32,293][16058] Decorrelating experience for 96 frames... |
|
[2025-03-22 16:15:32,855][15900] Heartbeat connected on RolloutWorker_w0 |
|
[2025-03-22 16:15:33,499][16056] Decorrelating experience for 64 frames... |
|
[2025-03-22 16:15:33,503][16055] Decorrelating experience for 96 frames... |
|
[2025-03-22 16:15:33,919][15900] Heartbeat connected on RolloutWorker_w1 |
|
[2025-03-22 16:15:34,438][16056] Decorrelating experience for 96 frames... |
|
[2025-03-22 16:15:34,538][16057] Decorrelating experience for 64 frames... |
|
[2025-03-22 16:15:34,751][15900] Heartbeat connected on RolloutWorker_w2 |
|
[2025-03-22 16:15:35,954][16062] Decorrelating experience for 96 frames... |
|
[2025-03-22 16:15:36,293][15900] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 4005888. Throughput: 0: 30.2. Samples: 302. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) |
|
[2025-03-22 16:15:36,299][15900] Avg episode reward: [(0, '4.154')] |
|
[2025-03-22 16:15:36,519][15900] Heartbeat connected on RolloutWorker_w7 |
|
[2025-03-22 16:15:37,311][16041] Signal inference workers to stop experience collection... |
|
[2025-03-22 16:15:37,320][16054] InferenceWorker_p0-w0: stopping experience collection |
|
[2025-03-22 16:15:37,492][16057] Decorrelating experience for 96 frames... |
|
[2025-03-22 16:15:37,587][15900] Heartbeat connected on RolloutWorker_w3 |
|
[2025-03-22 16:15:37,649][16059] Decorrelating experience for 96 frames... |
|
[2025-03-22 16:15:37,785][15900] Heartbeat connected on RolloutWorker_w4 |
|
[2025-03-22 16:15:38,397][16061] Decorrelating experience for 96 frames... |
|
[2025-03-22 16:15:38,827][15900] Heartbeat connected on RolloutWorker_w6 |
|
[2025-03-22 16:15:39,140][16041] Signal inference workers to resume experience collection... |
|
[2025-03-22 16:15:39,141][16054] InferenceWorker_p0-w0: resuming experience collection |
|
[2025-03-22 16:15:39,159][16041] Stopping Batcher_0... |
|
[2025-03-22 16:15:39,159][16041] Loop batcher_evt_loop terminating... |
|
[2025-03-22 16:15:39,161][15900] Component Batcher_0 stopped! |
|
[2025-03-22 16:15:39,320][16054] Weights refcount: 2 0 |
|
[2025-03-22 16:15:39,329][15900] Component InferenceWorker_p0-w0 stopped! |
|
[2025-03-22 16:15:39,332][16054] Stopping InferenceWorker_p0-w0... |
|
[2025-03-22 16:15:39,337][16054] Loop inference_proc0-0_evt_loop terminating... |
|
[2025-03-22 16:15:39,732][15900] Component RolloutWorker_w7 stopped! |
|
[2025-03-22 16:15:39,735][16062] Stopping RolloutWorker_w7... |
|
[2025-03-22 16:15:39,741][16062] Loop rollout_proc7_evt_loop terminating... |
|
[2025-03-22 16:15:39,748][15900] Component RolloutWorker_w1 stopped! |
|
[2025-03-22 16:15:39,750][16055] Stopping RolloutWorker_w1... |
|
[2025-03-22 16:15:39,752][16055] Loop rollout_proc1_evt_loop terminating... |
|
[2025-03-22 16:15:39,756][15900] Component RolloutWorker_w3 stopped! |
|
[2025-03-22 16:15:39,759][16057] Stopping RolloutWorker_w3... |
|
[2025-03-22 16:15:39,760][16057] Loop rollout_proc3_evt_loop terminating... |
|
[2025-03-22 16:15:39,785][15900] Component RolloutWorker_w5 stopped! |
|
[2025-03-22 16:15:39,787][16060] Stopping RolloutWorker_w5... |
|
[2025-03-22 16:15:39,788][16060] Loop rollout_proc5_evt_loop terminating... |
|
[2025-03-22 16:15:39,791][16041] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000980_4014080.pth... |
|
[2025-03-22 16:15:39,878][15900] Component RolloutWorker_w6 stopped! |
|
[2025-03-22 16:15:39,883][16061] Stopping RolloutWorker_w6... |
|
[2025-03-22 16:15:39,883][16061] Loop rollout_proc6_evt_loop terminating... |
|
[2025-03-22 16:15:39,896][16058] Stopping RolloutWorker_w0... |
|
[2025-03-22 16:15:39,897][15900] Component RolloutWorker_w0 stopped! |
|
[2025-03-22 16:15:39,902][16058] Loop rollout_proc0_evt_loop terminating... |
|
[2025-03-22 16:15:39,914][15900] Component RolloutWorker_w4 stopped! |
|
[2025-03-22 16:15:39,915][16059] Stopping RolloutWorker_w4... |
|
[2025-03-22 16:15:39,916][16059] Loop rollout_proc4_evt_loop terminating... |
|
[2025-03-22 16:15:39,959][15900] Component RolloutWorker_w2 stopped! |
|
[2025-03-22 16:15:39,960][16056] Stopping RolloutWorker_w2... |
|
[2025-03-22 16:15:39,961][16056] Loop rollout_proc2_evt_loop terminating... |
|
[2025-03-22 16:15:39,981][16041] Removing /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000866_3547136.pth |
|
[2025-03-22 16:15:39,988][16041] Saving /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000980_4014080.pth... |
|
[2025-03-22 16:15:40,205][16041] Stopping LearnerWorker_p0... |
|
[2025-03-22 16:15:40,207][16041] Loop learner_proc0_evt_loop terminating... |
|
[2025-03-22 16:15:40,211][15900] Component LearnerWorker_p0 stopped! |
|
[2025-03-22 16:15:40,212][15900] Waiting for process learner_proc0 to stop... |
|
[2025-03-22 16:15:42,205][15900] Waiting for process inference_proc0-0 to join... |
|
[2025-03-22 16:15:42,206][15900] Waiting for process rollout_proc0 to join... |
|
[2025-03-22 16:15:44,163][15900] Waiting for process rollout_proc1 to join... |
|
[2025-03-22 16:15:44,296][15900] Waiting for process rollout_proc2 to join... |
|
[2025-03-22 16:15:44,300][15900] Waiting for process rollout_proc3 to join... |
|
[2025-03-22 16:15:44,304][15900] Waiting for process rollout_proc4 to join... |
|
[2025-03-22 16:15:44,305][15900] Waiting for process rollout_proc5 to join... |
|
[2025-03-22 16:15:44,307][15900] Waiting for process rollout_proc6 to join... |
|
[2025-03-22 16:15:44,308][15900] Waiting for process rollout_proc7 to join... |
|
[2025-03-22 16:15:44,309][15900] Batcher 0 profile tree view: |
|
batching: 0.0475, releasing_batches: 0.0004 |
|
[2025-03-22 16:15:44,310][15900] InferenceWorker_p0-w0 profile tree view: |
|
wait_policy: 0.0051 |
|
wait_policy_total: 9.5581 |
|
update_model: 0.0227 |
|
weight_update: 0.0013 |
|
one_step: 0.0892 |
|
handle_policy_step: 2.9471 |
|
deserialize: 0.0564, stack: 0.0093, obs_to_device_normalize: 0.5743, forward: 1.9118, send_messages: 0.0664 |
|
prepare_outputs: 0.2480 |
|
to_cpu: 0.1698 |
|
[2025-03-22 16:15:44,311][15900] Learner 0 profile tree view: |
|
misc: 0.0000, prepare_batch: 2.0491 |
|
train: 2.4677 |
|
epoch_init: 0.0000, minibatch_init: 0.0000, losses_postprocess: 0.0005, kl_divergence: 0.0173, after_optimizer: 0.0624 |
|
calculate_losses: 0.7091 |
|
losses_init: 0.0000, forward_head: 0.3859, bptt_initial: 0.2138, tail: 0.0541, advantages_returns: 0.0012, losses: 0.0413 |
|
bptt: 0.0122 |
|
bptt_forward_core: 0.0121 |
|
update: 1.6725 |
|
clip: 0.0682 |
|
[2025-03-22 16:15:44,312][15900] RolloutWorker_w0 profile tree view: |
|
wait_for_trajectories: 0.0011, enqueue_policy_requests: 1.0230, env_step: 2.4242, overhead: 0.0986, complete_rollouts: 0.0278 |
|
save_policy_outputs: 0.0770 |
|
split_output_tensors: 0.0315 |
|
[2025-03-22 16:15:44,313][15900] RolloutWorker_w7 profile tree view: |
|
wait_for_trajectories: 0.0029, enqueue_policy_requests: 0.0343, env_step: 0.7313, overhead: 0.0167, complete_rollouts: 0.0000 |
|
save_policy_outputs: 0.0230 |
|
split_output_tensors: 0.0104 |
|
[2025-03-22 16:15:44,315][15900] Loop Runner_EvtLoop terminating... |
|
[2025-03-22 16:15:44,316][15900] Runner profile tree view: |
|
main_loop: 38.6781 |
|
[2025-03-22 16:15:44,317][15900] Collected {0: 4014080}, FPS: 211.8 |
|
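This run resumed from train_step 978 / env_steps 4005888 and stopped two train steps later at 4014080 frames. A hypothetical relaunch sketch that would produce such a resume; the sf_examples module path and the env name (inferred from the hf_repository used later in this log) are assumptions, while --restart_behavior and --train_for_env_steps are standard Sample Factory 2.x flags:

    import runpy
    import sys

    # Assumed entry point and env name; experiment/train_dir values are from the log.
    sys.argv = [
        "train_vizdoom",
        "--env=doom_health_gathering_supreme",
        "--experiment=default_experiment",
        "--train_dir=/content/train_dir",
        "--restart_behavior=resume",
        "--train_for_env_steps=4014080",
    ]
    runpy.run_module("sf_examples.vizdoom.train_vizdoom", run_name="__main__")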
[2025-03-22 16:16:19,731][15900] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-03-22 16:16:19,732][15900] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-22 16:16:19,733][15900] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-22 16:16:19,734][15900] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-22 16:16:19,735][15900] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:16:19,736][15900] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-22 16:16:19,737][15900] Adding new argument 'max_num_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:16:19,741][15900] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-22 16:16:19,741][15900] Adding new argument 'push_to_hub'=False that is not in the saved config file! |
|
[2025-03-22 16:16:19,742][15900] Adding new argument 'hf_repository'=None that is not in the saved config file! |
|
[2025-03-22 16:16:19,743][15900] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-22 16:16:19,744][15900] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-22 16:16:19,746][15900] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-22 16:16:19,747][15900] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-22 16:16:19,748][15900] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
[2025-03-22 16:16:19,791][15900] Doom resolution: 160x120, resize resolution: (128, 72) |
|
[2025-03-22 16:16:19,795][15900] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-22 16:16:19,796][15900] RunningMeanStd input shape: (1,) |
|
[2025-03-22 16:16:19,811][15900] ConvEncoder: input_channels=3 |
|
[2025-03-22 16:16:19,913][15900] Conv encoder output size: 512 |
|
[2025-03-22 16:16:19,914][15900] Policy head output size: 512 |
|
[2025-03-22 16:16:20,094][15900] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000980_4014080.pth... |
|
[2025-03-22 16:16:20,844][15900] Num frames 100... |
|
[2025-03-22 16:16:20,977][15900] Num frames 200... |
|
[2025-03-22 16:16:21,117][15900] Num frames 300... |
|
[2025-03-22 16:16:21,241][15900] Avg episode rewards: #0: 4.520, true rewards: #0: 3.520 |
|
[2025-03-22 16:16:21,242][15900] Avg episode reward: 4.520, avg true_objective: 3.520 |
|
[2025-03-22 16:16:21,307][15900] Num frames 400... |
|
[2025-03-22 16:16:21,439][15900] Num frames 500... |
|
[2025-03-22 16:16:21,568][15900] Num frames 600... |
|
[2025-03-22 16:16:21,696][15900] Num frames 700... |
|
[2025-03-22 16:16:21,826][15900] Num frames 800... |
|
[2025-03-22 16:16:21,958][15900] Num frames 900... |
|
[2025-03-22 16:16:22,146][15900] Avg episode rewards: #0: 7.460, true rewards: #0: 4.960 |
|
[2025-03-22 16:16:22,147][15900] Avg episode reward: 7.460, avg true_objective: 4.960 |
|
[2025-03-22 16:16:22,161][15900] Num frames 1000... |
|
[2025-03-22 16:16:22,290][15900] Num frames 1100... |
|
[2025-03-22 16:16:22,421][15900] Num frames 1200... |
|
[2025-03-22 16:16:22,549][15900] Num frames 1300... |
|
[2025-03-22 16:16:22,682][15900] Num frames 1400... |
|
[2025-03-22 16:16:22,815][15900] Num frames 1500... |
|
[2025-03-22 16:16:22,944][15900] Num frames 1600... |
|
[2025-03-22 16:16:23,080][15900] Num frames 1700... |
|
[2025-03-22 16:16:23,211][15900] Num frames 1800... |
|
[2025-03-22 16:16:23,296][15900] Avg episode rewards: #0: 9.747, true rewards: #0: 6.080 |
|
[2025-03-22 16:16:23,297][15900] Avg episode reward: 9.747, avg true_objective: 6.080 |
|
[2025-03-22 16:16:23,398][15900] Num frames 1900... |
|
[2025-03-22 16:16:23,528][15900] Num frames 2000... |
|
[2025-03-22 16:16:23,656][15900] Num frames 2100... |
|
[2025-03-22 16:16:23,792][15900] Num frames 2200... |
|
[2025-03-22 16:16:23,925][15900] Num frames 2300... |
|
[2025-03-22 16:16:24,057][15900] Num frames 2400... |
|
[2025-03-22 16:16:24,196][15900] Num frames 2500... |
|
[2025-03-22 16:16:24,328][15900] Num frames 2600... |
|
[2025-03-22 16:16:24,459][15900] Num frames 2700... |
|
[2025-03-22 16:16:24,594][15900] Num frames 2800... |
|
[2025-03-22 16:16:24,727][15900] Num frames 2900... |
|
[2025-03-22 16:16:24,865][15900] Num frames 3000... |
|
[2025-03-22 16:16:25,014][15900] Num frames 3100... |
|
[2025-03-22 16:16:25,181][15900] Num frames 3200... |
|
[2025-03-22 16:16:25,367][15900] Num frames 3300... |
|
[2025-03-22 16:16:25,553][15900] Num frames 3400... |
|
[2025-03-22 16:16:25,735][15900] Num frames 3500... |
|
[2025-03-22 16:16:25,917][15900] Num frames 3600... |
|
[2025-03-22 16:16:25,994][15900] Avg episode rewards: #0: 19.025, true rewards: #0: 9.025 |
|
[2025-03-22 16:16:25,995][15900] Avg episode reward: 19.025, avg true_objective: 9.025 |
|
[2025-03-22 16:16:26,153][15900] Num frames 3700... |
|
[2025-03-22 16:16:26,328][15900] Num frames 3800... |
|
[2025-03-22 16:16:26,502][15900] Num frames 3900... |
|
[2025-03-22 16:16:26,686][15900] Num frames 4000... |
|
[2025-03-22 16:16:26,849][15900] Avg episode rewards: #0: 16.714, true rewards: #0: 8.114 |
|
[2025-03-22 16:16:26,850][15900] Avg episode reward: 16.714, avg true_objective: 8.114 |
|
[2025-03-22 16:16:26,932][15900] Num frames 4100... |
|
[2025-03-22 16:16:27,113][15900] Num frames 4200... |
|
[2025-03-22 16:16:27,297][15900] Num frames 4300... |
|
[2025-03-22 16:16:27,432][15900] Num frames 4400... |
|
[2025-03-22 16:16:27,561][15900] Num frames 4500... |
|
[2025-03-22 16:16:27,691][15900] Num frames 4600... |
|
[2025-03-22 16:16:27,827][15900] Num frames 4700... |
|
[2025-03-22 16:16:27,958][15900] Num frames 4800... |
|
[2025-03-22 16:16:28,049][15900] Avg episode rewards: #0: 16.375, true rewards: #0: 8.042 |
|
[2025-03-22 16:16:28,050][15900] Avg episode reward: 16.375, avg true_objective: 8.042 |
|
[2025-03-22 16:16:28,153][15900] Num frames 4900... |
|
[2025-03-22 16:16:28,293][15900] Num frames 5000... |
|
[2025-03-22 16:16:28,424][15900] Num frames 5100... |
|
[2025-03-22 16:16:28,556][15900] Num frames 5200... |
|
[2025-03-22 16:16:28,690][15900] Num frames 5300... |
|
[2025-03-22 16:16:28,827][15900] Num frames 5400... |
|
[2025-03-22 16:16:28,885][15900] Avg episode rewards: #0: 16.001, true rewards: #0: 7.716 |
|
[2025-03-22 16:16:28,885][15900] Avg episode reward: 16.001, avg true_objective: 7.716 |
|
[2025-03-22 16:16:29,014][15900] Num frames 5500... |
|
[2025-03-22 16:16:29,146][15900] Num frames 5600... |
|
[2025-03-22 16:16:29,282][15900] Num frames 5700... |
|
[2025-03-22 16:16:29,416][15900] Num frames 5800... |
|
[2025-03-22 16:16:29,547][15900] Num frames 5900... |
|
[2025-03-22 16:16:29,680][15900] Num frames 6000... |
|
[2025-03-22 16:16:29,816][15900] Num frames 6100... |
|
[2025-03-22 16:16:29,950][15900] Num frames 6200... |
|
[2025-03-22 16:16:30,083][15900] Num frames 6300... |
|
[2025-03-22 16:16:30,217][15900] Num frames 6400... |
|
[2025-03-22 16:16:30,398][15900] Avg episode rewards: #0: 17.236, true rewards: #0: 8.111 |
|
[2025-03-22 16:16:30,399][15900] Avg episode reward: 17.236, avg true_objective: 8.111 |
|
[2025-03-22 16:16:30,414][15900] Num frames 6500... |
|
[2025-03-22 16:16:30,546][15900] Num frames 6600... |
|
[2025-03-22 16:16:30,681][15900] Num frames 6700... |
|
[2025-03-22 16:16:30,814][15900] Num frames 6800... |
|
[2025-03-22 16:16:30,953][15900] Num frames 6900... |
|
[2025-03-22 16:16:31,092][15900] Num frames 7000... |
|
[2025-03-22 16:16:31,144][15900] Avg episode rewards: #0: 16.333, true rewards: #0: 7.778 |
|
[2025-03-22 16:16:31,145][15900] Avg episode reward: 16.333, avg true_objective: 7.778 |
|
[2025-03-22 16:16:31,277][15900] Num frames 7100... |
|
[2025-03-22 16:16:31,421][15900] Num frames 7200... |
|
[2025-03-22 16:16:31,553][15900] Num frames 7300... |
|
[2025-03-22 16:16:31,689][15900] Num frames 7400... |
|
[2025-03-22 16:16:31,824][15900] Num frames 7500... |
|
[2025-03-22 16:16:31,920][15900] Avg episode rewards: #0: 15.631, true rewards: #0: 7.531 |
|
[2025-03-22 16:16:31,921][15900] Avg episode reward: 15.631, avg true_objective: 7.531 |
|
[2025-03-22 16:17:20,425][15900] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
|
[2025-03-22 16:18:57,804][15900] Loading existing experiment configuration from /content/train_dir/default_experiment/config.json |
|
[2025-03-22 16:18:57,805][15900] Overriding arg 'num_workers' with value 1 passed from command line |
|
[2025-03-22 16:18:57,807][15900] Adding new argument 'no_render'=True that is not in the saved config file! |
|
[2025-03-22 16:18:57,808][15900] Adding new argument 'save_video'=True that is not in the saved config file! |
|
[2025-03-22 16:18:57,809][15900] Adding new argument 'video_frames'=1000000000.0 that is not in the saved config file! |
|
[2025-03-22 16:18:57,810][15900] Adding new argument 'video_name'=None that is not in the saved config file! |
|
[2025-03-22 16:18:57,811][15900] Adding new argument 'max_num_frames'=100000 that is not in the saved config file! |
|
[2025-03-22 16:18:57,812][15900] Adding new argument 'max_num_episodes'=10 that is not in the saved config file! |
|
[2025-03-22 16:18:57,814][15900] Adding new argument 'push_to_hub'=True that is not in the saved config file! |
|
[2025-03-22 16:18:57,815][15900] Adding new argument 'hf_repository'='zimka/HFRLC_U8_health_gathering_supreme' that is not in the saved config file! |
|
[2025-03-22 16:18:57,816][15900] Adding new argument 'policy_index'=0 that is not in the saved config file! |
|
[2025-03-22 16:18:57,817][15900] Adding new argument 'eval_deterministic'=False that is not in the saved config file! |
|
[2025-03-22 16:18:57,818][15900] Adding new argument 'train_script'=None that is not in the saved config file! |
|
[2025-03-22 16:18:57,819][15900] Adding new argument 'enjoy_script'=None that is not in the saved config file! |
|
[2025-03-22 16:18:57,820][15900] Using frameskip 1 and render_action_repeat=4 for evaluation |
|
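The overrides above (no_render, save_video, max_num_frames=100000, max_num_episodes=10, push_to_hub, hf_repository) are what an enjoy-style evaluation adds on top of the saved config.json. A hypothetical sketch of such a launch; the flag names come straight from the log lines, the module path and env name are assumptions:

    import runpy
    import sys

    sys.argv = [
        "enjoy_vizdoom",
        "--env=doom_health_gathering_supreme",
        "--experiment=default_experiment",
        "--train_dir=/content/train_dir",
        "--no_render",
        "--save_video",
        "--max_num_frames=100000",
        "--max_num_episodes=10",
        "--push_to_hub",
        "--hf_repository=zimka/HFRLC_U8_health_gathering_supreme",
    ]
    runpy.run_module("sf_examples.vizdoom.enjoy_vizdoom", run_name="__main__")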
[2025-03-22 16:18:57,846][15900] RunningMeanStd input shape: (3, 72, 128) |
|
[2025-03-22 16:18:57,847][15900] RunningMeanStd input shape: (1,) |
|
[2025-03-22 16:18:57,859][15900] ConvEncoder: input_channels=3 |
|
[2025-03-22 16:18:57,895][15900] Conv encoder output size: 512 |
|
[2025-03-22 16:18:57,896][15900] Policy head output size: 512 |
|
[2025-03-22 16:18:57,916][15900] Loading state from checkpoint /content/train_dir/default_experiment/checkpoint_p0/checkpoint_000000980_4014080.pth... |
|
[2025-03-22 16:18:58,379][15900] Num frames 100... |
|
[2025-03-22 16:18:58,513][15900] Num frames 200... |
|
[2025-03-22 16:18:58,640][15900] Num frames 300... |
|
[2025-03-22 16:18:58,763][15900] Avg episode rewards: #0: 4.520, true rewards: #0: 3.520 |
|
[2025-03-22 16:18:58,764][15900] Avg episode reward: 4.520, avg true_objective: 3.520 |
|
[2025-03-22 16:18:58,841][15900] Num frames 400... |
|
[2025-03-22 16:18:58,974][15900] Num frames 500... |
|
[2025-03-22 16:18:59,118][15900] Num frames 600... |
|
[2025-03-22 16:18:59,249][15900] Num frames 700... |
|
[2025-03-22 16:18:59,380][15900] Num frames 800... |
|
[2025-03-22 16:18:59,511][15900] Num frames 900... |
|
[2025-03-22 16:18:59,645][15900] Num frames 1000... |
|
[2025-03-22 16:18:59,788][15900] Num frames 1100... |
|
[2025-03-22 16:18:59,922][15900] Num frames 1200... |
|
[2025-03-22 16:19:00,065][15900] Num frames 1300... |
|
[2025-03-22 16:19:00,221][15900] Avg episode rewards: #0: 13.880, true rewards: #0: 6.880 |
|
[2025-03-22 16:19:00,222][15900] Avg episode reward: 13.880, avg true_objective: 6.880 |
|
[2025-03-22 16:19:00,259][15900] Num frames 1400... |
|
[2025-03-22 16:19:00,389][15900] Num frames 1500... |
|
[2025-03-22 16:19:00,520][15900] Num frames 1600... |
|
[2025-03-22 16:19:00,692][15900] Num frames 1700... |
|
[2025-03-22 16:19:00,872][15900] Num frames 1800... |
|
[2025-03-22 16:19:00,973][15900] Avg episode rewards: #0: 11.080, true rewards: #0: 6.080 |
|
[2025-03-22 16:19:00,974][15900] Avg episode reward: 11.080, avg true_objective: 6.080 |
|
[2025-03-22 16:19:01,112][15900] Num frames 1900... |
|
[2025-03-22 16:19:01,281][15900] Num frames 2000... |
|
[2025-03-22 16:19:01,452][15900] Num frames 2100... |
|
[2025-03-22 16:19:01,621][15900] Num frames 2200... |
|
[2025-03-22 16:19:01,793][15900] Num frames 2300... |
|
[2025-03-22 16:19:01,972][15900] Num frames 2400... |
|
[2025-03-22 16:19:02,165][15900] Num frames 2500... |
|
[2025-03-22 16:19:02,340][15900] Num frames 2600... |
|
[2025-03-22 16:19:02,528][15900] Num frames 2700... |
|
[2025-03-22 16:19:02,622][15900] Avg episode rewards: #0: 12.800, true rewards: #0: 6.800 |
|
[2025-03-22 16:19:02,623][15900] Avg episode reward: 12.800, avg true_objective: 6.800 |
|
[2025-03-22 16:19:02,760][15900] Num frames 2800... |
|
[2025-03-22 16:19:02,894][15900] Num frames 2900... |
|
[2025-03-22 16:19:03,028][15900] Num frames 3000... |
|
[2025-03-22 16:19:03,171][15900] Num frames 3100... |
|
[2025-03-22 16:19:03,301][15900] Num frames 3200... |
|
[2025-03-22 16:19:03,435][15900] Num frames 3300... |
|
[2025-03-22 16:19:03,566][15900] Num frames 3400... |
|
[2025-03-22 16:19:03,701][15900] Num frames 3500... |
|
[2025-03-22 16:19:03,836][15900] Num frames 3600... |
|
[2025-03-22 16:19:03,965][15900] Num frames 3700... |
|
[2025-03-22 16:19:04,095][15900] Num frames 3800... |
|
[2025-03-22 16:19:04,265][15900] Num frames 3900... |
|
[2025-03-22 16:19:04,394][15900] Num frames 4000... |
|
[2025-03-22 16:19:04,564][15900] Avg episode rewards: #0: 16.976, true rewards: #0: 8.176 |
|
[2025-03-22 16:19:04,565][15900] Avg episode reward: 16.976, avg true_objective: 8.176 |
|
[2025-03-22 16:19:04,583][15900] Num frames 4100... |
|
[2025-03-22 16:19:04,710][15900] Num frames 4200... |
|
[2025-03-22 16:19:04,846][15900] Num frames 4300... |
|
[2025-03-22 16:19:04,980][15900] Num frames 4400... |
|
[2025-03-22 16:19:05,115][15900] Num frames 4500... |
|
[2025-03-22 16:19:05,250][15900] Num frames 4600... |
|
[2025-03-22 16:19:05,378][15900] Num frames 4700... |
|
[2025-03-22 16:19:05,510][15900] Avg episode rewards: #0: 16.100, true rewards: #0: 7.933 |
|
[2025-03-22 16:19:05,512][15900] Avg episode reward: 16.100, avg true_objective: 7.933 |
|
[2025-03-22 16:19:05,567][15900] Num frames 4800... |
|
[2025-03-22 16:19:05,707][15900] Num frames 4900... |
|
[2025-03-22 16:19:05,841][15900] Num frames 5000... |
|
[2025-03-22 16:19:05,972][15900] Num frames 5100... |
|
[2025-03-22 16:19:06,105][15900] Num frames 5200... |
|
[2025-03-22 16:19:06,237][15900] Num frames 5300... |
|
[2025-03-22 16:19:06,374][15900] Num frames 5400... |
|
[2025-03-22 16:19:06,505][15900] Num frames 5500... |
|
[2025-03-22 16:19:06,638][15900] Num frames 5600... |
|
[2025-03-22 16:19:06,770][15900] Num frames 5700... |
|
[2025-03-22 16:19:06,854][15900] Avg episode rewards: #0: 16.600, true rewards: #0: 8.171 |
|
[2025-03-22 16:19:06,855][15900] Avg episode reward: 16.600, avg true_objective: 8.171 |
|
[2025-03-22 16:19:06,962][15900] Num frames 5800... |
|
[2025-03-22 16:19:07,099][15900] Num frames 5900... |
|
[2025-03-22 16:19:07,231][15900] Num frames 6000... |
|
[2025-03-22 16:19:07,370][15900] Num frames 6100... |
|
[2025-03-22 16:19:07,497][15900] Num frames 6200... |
|
[2025-03-22 16:19:07,629][15900] Num frames 6300... |
|
[2025-03-22 16:19:07,760][15900] Num frames 6400... |
|
[2025-03-22 16:19:07,851][15900] Avg episode rewards: #0: 16.405, true rewards: #0: 8.030 |
|
[2025-03-22 16:19:07,852][15900] Avg episode reward: 16.405, avg true_objective: 8.030 |
|
[2025-03-22 16:19:07,951][15900] Num frames 6500... |
|
[2025-03-22 16:19:08,088][15900] Num frames 6600... |
|
[2025-03-22 16:19:08,227][15900] Num frames 6700... |
|
[2025-03-22 16:19:08,367][15900] Num frames 6800... |
|
[2025-03-22 16:19:08,503][15900] Num frames 6900... |
|
[2025-03-22 16:19:08,636][15900] Num frames 7000... |
|
[2025-03-22 16:19:08,774][15900] Num frames 7100... |
|
[2025-03-22 16:19:08,908][15900] Num frames 7200... |
|
[2025-03-22 16:19:09,044][15900] Num frames 7300... |
|
[2025-03-22 16:19:09,181][15900] Num frames 7400... |
|
[2025-03-22 16:19:09,312][15900] Num frames 7500... |
|
[2025-03-22 16:19:09,451][15900] Num frames 7600... |
|
[2025-03-22 16:19:09,584][15900] Num frames 7700... |
|
[2025-03-22 16:19:09,722][15900] Num frames 7800... |
|
[2025-03-22 16:19:09,773][15900] Avg episode rewards: #0: 18.222, true rewards: #0: 8.667 |
|
[2025-03-22 16:19:09,775][15900] Avg episode reward: 18.222, avg true_objective: 8.667 |
|
[2025-03-22 16:19:09,903][15900] Num frames 7900... |
|
[2025-03-22 16:19:10,032][15900] Num frames 8000... |
|
[2025-03-22 16:19:10,168][15900] Num frames 8100... |
|
[2025-03-22 16:19:10,298][15900] Num frames 8200... |
|
[2025-03-22 16:19:10,438][15900] Num frames 8300... |
|
[2025-03-22 16:19:10,574][15900] Num frames 8400... |
|
[2025-03-22 16:19:10,760][15900] Avg episode rewards: #0: 17.797, true rewards: #0: 8.497 |
|
[2025-03-22 16:19:10,761][15900] Avg episode reward: 17.797, avg true_objective: 8.497 |
|
[2025-03-22 16:19:10,769][15900] Num frames 8500... |
|
[2025-03-22 16:20:03,421][15900] Replay video saved to /content/train_dir/default_experiment/replay.mp4! |
|
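With the replay and checkpoint pushed to zimka/HFRLC_U8_health_gathering_supreme, the experiment can later be pulled back down into a local train_dir; a sketch assuming Sample Factory's sample_factory.huggingface.load_from_hub helper and its -r / -d flags:

    import runpy
    import sys

    # -r selects the Hub repository, -d the local directory to download into.
    sys.argv = [
        "load_from_hub",
        "-r", "zimka/HFRLC_U8_health_gathering_supreme",
        "-d", "/content/train_dir",
    ]
    runpy.run_module("sample_factory.huggingface.load_from_hub", run_name="__main__")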
|