numpy
paddlenlp>=3.0.0b2
tensorboardX
opencv-python
Pillow
pycocoevalcap
ftfy
regex
einops>=0.6.1
soundfile
librosa
h5py
jsonschema>=4.19.0
referencing>=0.32.1
decord>=0.6.0