Using different optimizer and learning rate scheduler to improve performance ff5518b verified avanishd committed on Apr 11