---
tags:
- LunarLander-v2
- ppo
- deep-reinforcement-learning
- reinforcement-learning
- custom-implementation
- deep-rl-course
model-index:
- name: PPO
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: LunarLander-v2
      type: LunarLander-v2
    metrics:
    - type: mean_reward
      value: 35.48 +/- 126.92
      name: mean_reward
      verified: false
---

# PPO Agent Playing LunarLander-v2

A PPO agent that plays LunarLander-v2 but decides to go for a walk instead. Its mean reward is 35.48 +/- 126.92, so do not download it if you are looking for an agent that follows the plan.

# Hyperparameters

```json
{
  "exp_name": "ppo",
  "seed": 1,
  "torch_deterministic": true,
  "cuda": true,
  "track": false,
  "wandb_project_name": "cleanRL",
  "wandb_entity": null,
  "capture_video": false,
  "env_id": "LunarLander-v2",
  "total_timesteps": 1000000,
  "learning_rate": 0.00025,
  "num_envs": 4,
  "num_steps": 128,
  "anneal_lr": true,
  "gae": true,
  "gamma": 0.99,
  "gae_lambda": 0.95,
  "num_minibatches": 4,
  "update_epochs": 4,
  "norm_adv": true,
  "clip_coef": 0.2,
  "clip_vloss": true,
  "ent_coef": 0.01,
  "vf_coef": 0.5,
  "max_grad_norm": 0.5,
  "target_kl": null,
  "repo_id": "salym/PPO-CleanRL-LunarLander-v2",
  "batch_size": 512,
  "minibatch_size": 128
}
```
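
The last two hyperparameters (`batch_size` and `minibatch_size`) are derived from the others, not set independently. The sketch below shows one way those values can be reproduced from `num_envs`, `num_steps`, and `num_minibatches`. The `num_updates` count and the `annealed_lr` helper are assumptions: they follow the linear schedule commonly used in CleanRL-style PPO scripts when `anneal_lr` is true, and this card does not confirm the exact schedule used for this run.

```python
# Base hyperparameters copied from the card above.
num_envs = 4
num_steps = 128
num_minibatches = 4
total_timesteps = 1_000_000
learning_rate = 0.00025

# Derived values: one rollout collects num_steps transitions from each environment.
batch_size = num_envs * num_steps                # matches "batch_size" on the card
minibatch_size = batch_size // num_minibatches   # matches "minibatch_size" on the card

# Assumption: training runs one policy update per collected rollout batch.
num_updates = total_timesteps // batch_size


def annealed_lr(update: int) -> float:
    """Hypothetical helper: linear learning-rate decay, with `update` 1-indexed.

    Assumes a CleanRL-style schedule; not confirmed by this model card.
    """
    frac = 1.0 - (update - 1.0) / num_updates
    return frac * learning_rate


print(batch_size, minibatch_size, num_updates)
```

Under these assumptions, the learning rate starts at `learning_rate` on the first update and shrinks linearly toward zero by the final update.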