Error when loading audio processor saved with processor.save_pretrained(path_to_dir), and num_logits_to_keep wrong init value
#73, opened by YaelAiola
- After fine-tuning the model on a speech task, I attempted to save both the model and the processor using processor.save_pretrained(path_to_dir). However, when loading the model, I encountered an error due to mismatches between the processor's saved fields and the expected input arguments:
audio_compression_rate → compression_rate
qformer_compression_rate → audio_downsample_rate
audio_feat_stride → feat_stride
Additionally, some arguments such as feature_size, sampling_rate, and padding_value are hardcoded in the processor, so loading fails when the same values are read back from the saved config and passed in a second time.
- If the user does not provide the num_logits_to_keep argument, the code fails because in modeling_phi4mm.py, within the prepare_inputs_for_generation function, the argument is initialized to None instead of a default value of 0.
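A possible workaround for both points, as a minimal sketch rather than the actual fix in the repository. The helper names (`patch_processor_config`, `patch_config_file`, `resolve_num_logits_to_keep`) are hypothetical, and the rename direction is an assumption: it treats the left-hand name in each pair as whatever the saved file contains and the right-hand name as what the constructor expects. The three pairs above do not all appear to point the same way, so flip any entry that does not match your saved file.

```python
import json
from pathlib import Path

# ASSUMPTION: saved key -> name the constructor expects.
# Flip individual entries if your saved config uses the other name.
RENAMES = {
    "compression_rate": "audio_compression_rate",
    "qformer_compression_rate": "audio_downsample_rate",
    "feat_stride": "audio_feat_stride",
}

# Keys the post says are hardcoded in the processor, so they must not
# be passed again from the saved config.
HARDCODED = ("feature_size", "sampling_rate", "padding_value")


def patch_processor_config(config: dict) -> dict:
    """Return a copy with hardcoded keys dropped and mismatched keys renamed."""
    patched = {}
    for key, value in config.items():
        if key in HARDCODED:
            continue
        patched[RENAMES.get(key, key)] = value
    return patched


def patch_config_file(path) -> None:
    """Rewrite a saved processor config (JSON) in place."""
    path = Path(path)
    config = json.loads(path.read_text())
    path.write_text(json.dumps(patch_processor_config(config), indent=2))


def resolve_num_logits_to_keep(num_logits_to_keep=None) -> int:
    """Fall back to 0 when the caller did not pass a value."""
    return 0 if num_logits_to_keep is None else num_logits_to_keep
```

Running `patch_config_file` on the JSON file written by `processor.save_pretrained(path_to_dir)` before loading should avoid the argument mismatch if the assumptions above hold. Until the default is fixed in modeling_phi4mm.py, passing `num_logits_to_keep` explicitly when generating sidesteps the second problem.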