[2024-08-14 23:36:44 root] (mobilequant.py 132): INFO Namespace(hf_path='checkpoints/hfmodels/llama-1.1b', dtype='float32', output_dir='results/llama-1.1b-e2e-w4a8-s1024-e60', cache_dir='./cache', resume=None, calib_dataset='pile', nsamples=1024, seqlen=2048, act_dict_path='checkpoints/hfmodels/llama-1.1b/act_dict.json', override_qcfg_path='checkpoints/hfmodels/llama-1.1b/default_qcfg.json', weight_bitwidth=4, weight_group_size=-1, weight_is_per_channel=True, weight_is_symmetric=False, weight_is_dynamic=False, act_bitwidth=8, act_group_size=-1, act_is_per_channel=False, act_is_symmetric=False, act_is_dynamic=False, let=True, lwc=True, lrl=True, let_lr=0.001, lwc_lr=0.01, lrl_lr=1e-06, let_min_lr=0.0001, lwc_min_lr=0.001, lrl_min_lr=1e-07, wd=0, epochs=60, warmup_epochs=0, use_shift=False, aug_loss=False, deactive_amp=True, batch_size=1, num_fewshot=0, tasks='wikitext', mode='e2e', original_omniquant=False, cache_in_gpu=False, use_8bit_softmax_input=False, use_8bit_softmax_output=False, model_family='llama')
[2024-08-14 23:37:00 root] (mobilequant.py 218): INFO === start quantization ===
[2024-08-14 23:37:00 root] (mobilequant.py 224): INFO load calibration set from ./cache/dataloader_llama_pile_1024.cache
[2024-08-14 23:37:00 root] (algorithm.py 588): INFO Starting ...