luodian committed
Commit e9f1edb
1 parent: 52dc246

Update README.md

Files changed (1): README.md (+3 -2)
README.md CHANGED
@@ -39,7 +39,7 @@ We conducted the training on LLaVA-1.6's codebase with adding support of Llama-3
 ### Training Hyperparameters
 
 ```shell
-LLM_VERSION="Qwen/Qwen1.5-72B-Chat"
+LLM_VERSION="Qwen/Qwen1.5-110B-Chat"
 LLM_VERSION_CLEAN="${LLM_VERSION//\//_}"
 VISION_MODEL_VERSION="openai/clip-vit-large-patch14-336"
 VISION_MODEL_VERSION_CLEAN="${VISION_MODEL_VERSION//\//_}"
@@ -80,7 +80,7 @@ torchrun # with necessary torchrun information for distributed training\
 --num_train_epochs 1 \
 --per_device_train_batch_size 1 \
 --per_device_eval_batch_size 4 \
---gradient_accumulation_steps 2 \
+--gradient_accumulation_steps 1 \
 --evaluation_strategy "no" \
 --save_strategy "steps" \
 --save_steps 3000 \
@@ -108,6 +108,7 @@ torchrun # with necessary torchrun information for distributed training\
 - 500K academic-task-oriented VQA data mixture.
 - 50K GPT-4V data mixture.
 - 40K ShareGPT data.
+- 20K COCO Caption data.
 
 #### Speeds, Sizes, Times [optional]
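A note on the `*_CLEAN` variables touched by the first hunk: the bash expansion `${VAR//\//_}` replaces every `/` in the Hugging Face model ID with `_`, presumably so the ID can be used in file and run names. A minimal sketch of the behavior:

```shell
# Bash pattern substitution: ${VAR//pattern/replacement} replaces ALL
# matches, so the "/" in the Hugging Face model ID becomes "_".
LLM_VERSION="Qwen/Qwen1.5-110B-Chat"
LLM_VERSION_CLEAN="${LLM_VERSION//\//_}"
echo "$LLM_VERSION_CLEAN"   # -> Qwen_Qwen1.5-110B-Chat
```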
 
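The second hunk halves `--gradient_accumulation_steps` from 2 to 1. With `--per_device_train_batch_size 1`, the effective global batch size is per-device batch x accumulation steps x world size; the commit does not record the GPU count, so the world size below is a hypothetical placeholder:

```shell
# Effective global batch size = per-device batch * grad-accum steps * world size.
# WORLD_SIZE is an assumption -- the commit does not state the GPU count.
WORLD_SIZE=8                      # hypothetical number of GPUs
PER_DEVICE_TRAIN_BATCH_SIZE=1     # from --per_device_train_batch_size
GRADIENT_ACCUMULATION_STEPS=1     # from --gradient_accumulation_steps (was 2)
echo $(( WORLD_SIZE * PER_DEVICE_TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS ))
# -> 8; the old value of 2 would give 16 under the same assumptions
```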
 
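The third hunk adds 20K COCO Caption examples to the data mixture, bringing the listed total from 590K to 610K examples:

```shell
# Mixture size after this commit, in thousands of examples.
echo "$(( 500 + 50 + 40 + 20 ))K"   # -> 610K
```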