João Raposo
metadata
library_name: sample-factory
tags:
  - deep-reinforcement-learning
  - reinforcement-learning
  - sample-factory
model-index:
  - name: APPO
    results:
      - task:
          type: reinforcement-learning
          name: reinforcement-learning
        dataset:
          name: doom_health_gathering_supreme
          type: doom_health_gathering_supreme
        metrics:
          - type: mean_reward
            value: 11.88 +/- 5.17
            name: mean_reward
            verified: false

An APPO model trained on the doom_health_gathering_supreme environment.

This model was trained using Sample-Factory 2.0: https://github.com/alex-petrenko/sample-factory. Documentation for how to use Sample-Factory can be found at https://www.samplefactory.dev/

IMPORTANT NOTE TO MAKE THE CODE RUN IN COLAB

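The following code was needed to test the model after training. Place it before the call to the enjoy function. Since PyTorch 2.6, torch.load defaults to weights_only=True, which rejects checkpoints that contain objects outside the safe globals allowlist. The patch forces torch.load to load the full checkpoint instead of weights only, which should bypass the safe globals error. A full load unpickles arbitrary Python objects, so use it only with checkpoints you trust.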

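# Monkey-patch torch.load to disable weights_only mode.
import torch

old_torch_load = torch.load  # keep a reference to the original function

def patched_torch_load(*args, **kwargs):
    # Override any weights_only value so the full checkpoint is unpickled.
    kwargs['weights_only'] = False
    return old_torch_load(*args, **kwargs)

torch.load = patched_torch_load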
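
The patch above stays active for the rest of the session. If you would rather limit it to the evaluation call, one option is a context manager that restores the original function afterwards. The sketch below is an illustration, not part of Sample-Factory: force_kwargs is a hypothetical helper, and fake_torch is a stand-in object so the example runs without PyTorch installed.

```python
import contextlib
import functools
import types


@contextlib.contextmanager
def force_kwargs(owner, name, **forced):
    """Temporarily replace owner.<name> with a wrapper that forces keyword arguments."""
    original = getattr(owner, name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        kwargs.update(forced)  # forced values override whatever the caller passed
        return original(*args, **kwargs)

    setattr(owner, name, wrapper)
    try:
        yield
    finally:
        setattr(owner, name, original)  # always restore, even if an error occurs


# Stand-in for the torch module (hypothetical); it mimics the weights_only default.
fake_torch = types.SimpleNamespace(
    load=lambda path, weights_only=True: (path, weights_only)
)

with force_kwargs(fake_torch, "load", weights_only=False):
    inside = fake_torch.load("checkpoint.pth")   # patched call
outside = fake_torch.load("checkpoint.pth")      # original behavior restored

print(inside, outside)
```

With PyTorch installed, the same helper would presumably be used as `with force_kwargs(torch, "load", weights_only=False):` around the enjoy call.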

Downloading the model

After installing Sample-Factory, download the model with:

python -m sample_factory.huggingface.load_from_hub -r DarkDummo/rl_course_vizdoom_health_gathering_supreme
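
To check that the download landed where the later commands expect it, a small sketch like the one below may help. It assumes the repository is placed in a folder named after the repository (without the user prefix) inside ./train_dir, which is inferred from the --train_dir and --experiment values used in this README. experiment_dir is a hypothetical helper, not a Sample-Factory function.

```python
from pathlib import Path


def experiment_dir(repo_id: str, train_dir: str = "./train_dir") -> Path:
    """Return the folder expected to hold the downloaded experiment (assumed layout)."""
    repo_name = repo_id.split("/")[-1]  # drop the user/organization prefix
    return Path(train_dir) / repo_name


path = experiment_dir("DarkDummo/rl_course_vizdoom_health_gathering_supreme")
print(path, "exists" if path.is_dir() else "not found yet")
```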

Using the model

To run the model after download, use the enjoy script corresponding to this environment:

python -m <path.to.enjoy.module> --algo=APPO --env=doom_health_gathering_supreme --train_dir=./train_dir --experiment=rl_course_vizdoom_health_gathering_supreme
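
In a notebook such as Colab, it can be convenient to assemble the same options as a list of strings rather than typing a shell command. The sketch below only builds that list; build_args is a hypothetical helper, and how the list is handed to the enjoy entry point depends on the environment-specific module, which is not shown here.

```python
def build_args(**options):
    """Turn keyword options into --key=value strings; True becomes a bare flag."""
    args = []
    for key, value in options.items():
        if value is True:
            args.append(f"--{key}")           # boolean switch, e.g. --no_render
        elif value is not False and value is not None:
            args.append(f"--{key}={value}")   # regular key=value option
    return args


# Same options as the command line above.
enjoy_args = build_args(
    algo="APPO",
    env="doom_health_gathering_supreme",
    train_dir="./train_dir",
    experiment="rl_course_vizdoom_health_gathering_supreme",
)
print(" ".join(enjoy_args))
```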

You can also upload models to the Hugging Face Hub with the same enjoy script by adding the --push_to_hub flag. For more details, see https://www.samplefactory.dev/10-huggingface/huggingface/

Training with this model

To continue training with this model, use the train script corresponding to this environment:

python -m <path.to.train.module> --algo=APPO --env=doom_health_gathering_supreme --train_dir=./train_dir --experiment=rl_course_vizdoom_health_gathering_supreme --restart_behavior=resume --train_for_env_steps=10000000000

Note: you may need to set --train_for_env_steps to a suitably high number, because the experiment resumes from the step count at which it previously stopped.
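
The arithmetic behind this note can be sketched as follows, assuming --train_for_env_steps is a total budget counted from step zero rather than a number of additional steps. The step counts are hypothetical and chosen only for illustration; remaining_env_steps is not a Sample-Factory function.

```python
def remaining_env_steps(train_for_env_steps: int, steps_done: int) -> int:
    """Steps left to train when resuming, under the total-budget assumption."""
    return max(0, train_for_env_steps - steps_done)


steps_done = 4_000_000  # hypothetical step count stored in the checkpoint

# A budget equal to the steps already done leaves nothing to train.
print(remaining_env_steps(4_000_000, steps_done))

# A much larger budget leaves room for further training.
print(remaining_env_steps(10_000_000_000, steps_done))
```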