Error when loading audio processor saved with processor.save_pretrained(path_to_dir), and num_logits_to_keep wrong init value

#73
by YaelAiola - opened
  1. After fine-tuning the model on a speech task, I saved both the model and the processor, using processor.save_pretrained(path_to_dir) for the processor. Loading them back then failed with an error, because the field names written by the processor do not match the input arguments it expects:

audio_compression_rate → compression_rate

qformer_compression_rate → audio_downsample_rate

audio_feat_stride → feat_stride

Additionally, some arguments such as feature_size, sampling_rate, and padding_value are hardcoded in the processor, so loading fails when the saved config supplies them a second time.
(Screenshot: preprocessor_config.png)
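
In case it helps anyone hitting the same thing, here is a rough sketch of a workaround that patches the saved config before loading. It assumes the names on the left of each arrow are the keys written to preprocessor_config.json, the names on the right are what the processor's constructor accepts, and all of them sit at the top level of the JSON. The helper names (patch_preprocessor_config, patch_file) are made up for illustration and are not part of the repo or of transformers.

```python
import json
from pathlib import Path

# Saved key -> expected argument name, as listed in this post (assumed direction).
RENAMED_KEYS = {
    "audio_compression_rate": "compression_rate",
    "qformer_compression_rate": "audio_downsample_rate",
    "audio_feat_stride": "feat_stride",
}

# Arguments the post reports as hardcoded in the processor; dropped here
# so they are not passed a second time on load.
HARDCODED_KEYS = {"feature_size", "sampling_rate", "padding_value"}


def patch_preprocessor_config(config: dict) -> dict:
    """Return a copy with saved keys renamed and hardcoded keys removed."""
    patched = {}
    for key, value in config.items():
        if key in HARDCODED_KEYS:
            continue
        patched[RENAMED_KEYS.get(key, key)] = value
    return patched


def patch_file(path) -> None:
    """Hypothetical helper: rewrite a saved preprocessor_config.json in place."""
    path = Path(path)
    config = json.loads(path.read_text())
    path.write_text(json.dumps(patch_preprocessor_config(config), indent=2))
```

Calling patch_file on the preprocessor_config.json inside path_to_dir before loading should be enough if the assumptions above hold. Keep a backup of the original file, since the rewrite is in place.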

  2. If the user does not provide the num_logits_to_keep argument, the code fails because prepare_inputs_for_generation in modeling_phi4mm.py initializes the argument to None instead of a default value of 0.
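
The kind of fix I have in mind is a simple None fallback. The snippet below is only an illustration of that pattern, not the actual code from modeling_phi4mm.py; resolve_num_logits_to_keep is a hypothetical helper name.

```python
from typing import Optional


def resolve_num_logits_to_keep(num_logits_to_keep: Optional[int] = None) -> int:
    """Hypothetical helper: fall back to 0 when the caller passes nothing."""
    # Compare against None explicitly so that any integer the caller
    # passes is returned unchanged.
    return 0 if num_logits_to_keep is None else num_logits_to_keep
```

Applying the same check at the top of prepare_inputs_for_generation, or changing the default in its signature to 0, would presumably have the same effect.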
