[xcodec2 model mismatch]

#13
by judy9710 - opened

I am getting this error:

"You are using a model of type xcodec2 to instantiate a model of type xcodec. This is not supported for all configurations of models and can yield errors."

It seems that, as of now, the xcodec2 model is not natively supported by Hugging Face Transformers.

Should I try to customize AutoConfig and Trainer at the code level?

It's working for me with xcodec2 version 0.1.3, but with 0.1.5 it fails. Furthermore, even with xcodec2 0.1.3, it fails if transformers is the latest version. By "fails" I mean it shows the message below:

You are using a model of type xcodec2 to instantiate a model of type xcodec. This is not supported for all configurations of models and can yield errors.
Some weights of the model checkpoint at /data1/nowshad/models/xcodec2 were not used when initializing XCodec2Model: ['CodecEnc.conv_blocks.1.block.0.block.0.act.beta', 'CodecEnc.conv_blocks.1.block.0.block.2.act.beta', 'CodecEnc.conv_blocks.1.block.1.block.0.act.beta', 'CodecEnc.conv_blocks.1.block.1.block.2.act.beta', 'CodecEnc.conv_blocks.1.block.2.block.0.act.beta', 'CodecEnc.conv_blocks.1.block.2.block.2.act.beta', 'CodecEnc.conv_blocks.1.block.3.act.beta', 'CodecEnc.conv_blocks.2.block.0.block.0.act.beta', 'CodecEnc.conv_blocks.2.block.0.block.2.act.beta', 'CodecEnc.conv_blocks.2.block.1.block.0.act.beta', 'CodecEnc.conv_blocks.2.block.1.block.2.act.beta', 'CodecEnc.conv_blocks.2.block.2.block.0.act.beta', 'CodecEnc.conv_blocks.2.block.2.block.2.act.beta', 'CodecEnc.conv_blocks.2.block.3.act.beta', 'CodecEnc.conv_blocks.3.block.0.block.0.act.beta', 'CodecEnc.conv_blocks.3.block.0.block.2.act.beta', 'CodecEnc.conv_blocks.3.block.1.block.0.act.beta', 'CodecEnc.conv_blocks.3.block.1.block.2.act.beta', 'CodecEnc.conv_blocks.3.block.2.block.0.act.beta', 'CodecEnc.conv_blocks.3.block.2.block.2.act.beta', 'CodecEnc.conv_blocks.3.block.3.act.beta', 'CodecEnc.conv_blocks.4.block.0.block.0.act.beta', 'CodecEnc.conv_blocks.4.block.0.block.2.act.beta', 'CodecEnc.conv_blocks.4.block.1.block.0.act.beta', 'CodecEnc.conv_blocks.4.block.1.block.2.act.beta', 'CodecEnc.conv_blocks.4.block.2.block.0.act.beta', 'CodecEnc.conv_blocks.4.block.2.block.2.act.beta', 'CodecEnc.conv_blocks.4.block.3.act.beta', 'CodecEnc.conv_blocks.5.block.0.block.0.act.beta', 'CodecEnc.conv_blocks.5.block.0.block.2.act.beta', 'CodecEnc.conv_blocks.5.block.1.block.0.act.beta', 'CodecEnc.conv_blocks.5.block.1.block.2.act.beta', 'CodecEnc.conv_blocks.5.block.2.block.0.act.beta', 'CodecEnc.conv_blocks.5.block.2.block.2.act.beta', 'CodecEnc.conv_blocks.5.block.3.act.beta', 'CodecEnc.conv_final_block.0.act.beta']
- This IS expected if you are initializing XCodec2Model from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing XCodec2Model from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of XCodec2Model were not initialized from the model checkpoint at /data1/nowshad/models/xcodec2 and are newly initialized: ['CodecEnc.conv_blocks.1.block.0.block.0.act.bias', 'CodecEnc.conv_blocks.1.block.0.block.2.act.bias', 'CodecEnc.conv_blocks.1.block.1.block.0.act.bias', 'CodecEnc.conv_blocks.1.block.1.block.2.act.bias', 'CodecEnc.conv_blocks.1.block.2.block.0.act.bias', 'CodecEnc.conv_blocks.1.block.2.block.2.act.bias', 'CodecEnc.conv_blocks.1.block.3.act.bias', 'CodecEnc.conv_blocks.2.block.0.block.0.act.bias', 'CodecEnc.conv_blocks.2.block.0.block.2.act.bias', 'CodecEnc.conv_blocks.2.block.1.block.0.act.bias', 'CodecEnc.conv_blocks.2.block.1.block.2.act.bias', 'CodecEnc.conv_blocks.2.block.2.block.0.act.bias', 'CodecEnc.conv_blocks.2.block.2.block.2.act.bias', 'CodecEnc.conv_blocks.2.block.3.act.bias', 'CodecEnc.conv_blocks.3.block.0.block.0.act.bias', 'CodecEnc.conv_blocks.3.block.0.block.2.act.bias', 'CodecEnc.conv_blocks.3.block.1.block.0.act.bias', 'CodecEnc.conv_blocks.3.block.1.block.2.act.bias', 'CodecEnc.conv_blocks.3.block.2.block.0.act.bias', 'CodecEnc.conv_blocks.3.block.2.block.2.act.bias', 'CodecEnc.conv_blocks.3.block.3.act.bias', 'CodecEnc.conv_blocks.4.block.0.block.0.act.bias', 'CodecEnc.conv_blocks.4.block.0.block.2.act.bias', 'CodecEnc.conv_blocks.4.block.1.block.0.act.bias', 'CodecEnc.conv_blocks.4.block.1.block.2.act.bias', 'CodecEnc.conv_blocks.4.block.2.block.0.act.bias', 'CodecEnc.conv_blocks.4.block.2.block.2.act.bias', 'CodecEnc.conv_blocks.4.block.3.act.bias', 'CodecEnc.conv_blocks.5.block.0.block.0.act.bias', 'CodecEnc.conv_blocks.5.block.0.block.2.act.bias', 'CodecEnc.conv_blocks.5.block.1.block.0.act.bias', 'CodecEnc.conv_blocks.5.block.1.block.2.act.bias', 'CodecEnc.conv_blocks.5.block.2.block.0.act.bias', 'CodecEnc.conv_blocks.5.block.2.block.2.act.bias', 'CodecEnc.conv_blocks.5.block.3.act.bias', 'CodecEnc.conv_final_block.0.act.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Because some weights are not initialised properly, the encoded values come out as 0.

OK, the issue is not with xcodec2 but with the latest transformers. As the earlier warning shows, some weights are not loaded properly; after renaming the keys in the model.safetensors file, it works without any issues on the latest transformers as well as the latest xcodec2.
Basically, the latest transformers expects the key name to end in bias instead of beta (e.g. the original safetensors file contains the key CodecEnc.conv_blocks.1.block.0.block.0.act.beta, but transformers expects CodecEnc.conv_blocks.1.block.0.block.0.act.bias). After renaming, it works as expected.
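For anyone hitting the same issue, here is a minimal sketch of the rename described above. The `rename_act_beta_keys` helper is a name I made up, and the checkpoint paths are placeholders; the `.act.beta` → `.act.bias` mapping is taken from the warning messages in this thread. It uses the `safetensors` library's `load_file`/`save_file` functions to rewrite the checkpoint:

```python
def rename_act_beta_keys(state):
    """Return a copy of a state dict where every key ending in
    '.act.beta' is renamed to end in '.act.bias'; all other keys
    (and all tensor values) are left untouched."""
    return {
        (k[:-len("beta")] + "bias" if k.endswith(".act.beta") else k): v
        for k, v in state.items()
    }

if __name__ == "__main__":
    # Load the original checkpoint, rename the keys, and save a
    # patched copy that the latest transformers can load.
    from safetensors.torch import load_file, save_file

    state = load_file("/data1/nowshad/models/xcodec2/model.safetensors")
    save_file(rename_act_beta_keys(state), "model_renamed.safetensors")
```

Note this only renames the offending activation keys, so the fix is a no-op on checkpoints that already use the `.act.bias` naming.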
