RaiBP committed
Commit 77409d2 · verified · Parent: f739a35

Update README.md

Files changed (1): README.md (+5, -7)
README.md CHANGED
@@ -29,11 +29,10 @@ More information needed
 
 ## Training procedure
 
-The following command was used:
-
+The [`run_clm.py` script](https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm.py) from the transformers library was used. Training was distributed on two NVIDIA Quadro RTX 6000 GPUs:
 ```bash
-CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 run_clm.py \
---output_dir="./training_full" \
+TORCH_CPP_LOG_LEVEL=INFO NCCL_DEBUG=INFO CUDA_VISIBLE_DEVICES=0,1 nohup python -m torch.distributed.launch \
+--nproc_per_node=2 run_clm.py --output_dir="./training_full" \
 --model_type="gpt2" \
 --config_name="./training" \
 --tokenizer_name="./training" \
@@ -47,9 +46,8 @@ CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 run_clm.py \
 --num_train_epochs="1" \
 --logging_steps="500" \
 --save_steps="5000" --preprocessing_num_workers="16" \
---gradient_accumulation_steps="4" \
---report_to="tensorboard" \
---logging_dir="./log_full"
+--gradient_accumulation_steps="4" --report_to="tensorboard" \
+--logging_dir="./log_full" > command_full_log.log 2>&1 &
 ```
 
 ### Training hyperparameters
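
As a side note, `python -m torch.distributed.launch` is deprecated in recent PyTorch releases in favor of `torchrun`, which accepts the same `--nproc_per_node` flag. A minimal sketch of the equivalent two-GPU launch, showing only the flags visible in this diff (the flags elided between the two hunks would carry over unchanged):

```bash
# Sketch: same launch via torchrun, the maintained successor to
# torch.distributed.launch. nohup keeps the job alive after logout,
# "> command_full_log.log 2>&1" sends stdout and stderr to one file,
# and the trailing "&" backgrounds the run.
TORCH_CPP_LOG_LEVEL=INFO NCCL_DEBUG=INFO CUDA_VISIBLE_DEVICES=0,1 \
nohup torchrun --nproc_per_node=2 run_clm.py \
  --output_dir="./training_full" \
  --model_type="gpt2" \
  --config_name="./training" \
  --tokenizer_name="./training" \
  --num_train_epochs="1" \
  --logging_steps="500" \
  --save_steps="5000" --preprocessing_num_workers="16" \
  --gradient_accumulation_steps="4" --report_to="tensorboard" \
  --logging_dir="./log_full" > command_full_log.log 2>&1 &

# Follow progress without attaching to the backgrounded job:
tail -f command_full_log.log
```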