Update README.md
README.md CHANGED
@@ -29,11 +29,10 @@ More information needed
 
 ## Training procedure
 
-The
-
+The [`run_clm.py` script](https://github.com/huggingface/transformers/blob/main/examples/pytorch/language-modeling/run_clm.py) from the transformers library was used. Training was distributed on two NVIDIA Quadro RTX 6000 GPUs:
 ```bash
-CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch
---output_dir="./training_full" \
+TORCH_CPP_LOG_LEVEL=INFO NCCL_DEBUG=INFO CUDA_VISIBLE_DEVICES=0,1 nohup python -m torch.distributed.launch \
+--nproc_per_node=2 run_clm.py --output_dir="./training_full" \
 --model_type="gpt2" \
 --config_name="./training" \
 --tokenizer_name="./training" \
@@ -47,9 +46,8 @@ CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 r
 --num_train_epochs="1" \
 --logging_steps="500" \
 --save_steps="5000" --preprocessing_num_workers="16" \
---gradient_accumulation_steps="4" \
---
---logging_dir="./log_full"
+--gradient_accumulation_steps="4" --report_to="tensorboard" \
+--logging_dir="./log_full" > command_full_log.log 2>&1 &
 ```
 
 ### Training hyperparameters
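The updated command runs in the background through `nohup`, writes its console output to `command_full_log.log`, and reports metrics to TensorBoard via `--report_to="tensorboard"` and `--logging_dir="./log_full"`. A minimal sketch for monitoring such a run, assuming the `tensorboard` package is installed in the same environment:

```bash
# Follow the console output that nohup redirects to command_full_log.log
tail -f command_full_log.log

# Browse the training curves written to the --logging_dir
# (assumption: tensorboard is installed; it serves on http://localhost:6006 by default)
tensorboard --logdir ./log_full
```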