ValueError Rope Scaling
Hi, when I try to deploy the model via an HF Inference Endpoint I get this: [Server Message] Endpoint Failed to Start.
With the following details:
Exit code: 1. Reason:
      1.0,
      1.0,
      1.0,
      1.0
    ],
    "type": "longrope"
  },
  "rope_theta": 10000.0,
  "transformers_version": "4.48.3",
  "use_cache": true,
  "vocab_size": 200064
}
ValueError: `rope_scaling`'s short_factor field must have length 64, got 48"},"target":"text_generation_launcher","span":{"rank":0,"name":"shard-manager"},"spans":[{"rank":0,"name":"shard-manager"}]}
{"timestamp":"2025-03-17T11:18:59.041473Z","level":"ERROR","fields":{"message":"Shard 0 failed to start"},"target":"text_generation_launcher"}
{"timestamp":"2025-03-17T11:18:59.041521Z","level":"INFO","fields":{"message":"Shutting down shards"},"target":"text_generation_launcher"}
Error: ShardCannotStart
I'm pretty new to using HF Endpoints, so I just wanted to know if there was a way I could fix it myself, or if I needed to wait for a model update or something like that.
Same thing here too:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in <cell line: 0>()
----> 1 generate(PHI3, messages)

5 frames
/usr/local/lib/python3.11/dist-packages/transformers/models/phi3/configuration_phi3.py in _rope_scaling_validation(self)
    206                 )
    207             if not len(rope_scaling_short_factor) == self.hidden_size // self.num_attention_heads // 2:
--> 208                 raise ValueError(
    209                     f"`rope_scaling`'s short_factor field must have length {self.hidden_size // self.num_attention_heads // 2}, got {len(rope_scaling_short_factor)}"
    210                 )

ValueError: `rope_scaling`'s short_factor field must have length 64, got 48
```
Hi @clawvyrin and @3mar2000 ,
Thanks for your interest!
Yes, support for the new model has already been added to the latest HF Transformers (v4.49.0) and vLLM (v0.7.3).
Can you upgrade your Transformers version, or patch in the changes below and try again?
vLLM: https://github.com/vllm-project/vllm/pull/12718
HF: https://github.com/huggingface/transformers/pull/35947
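For example, a quick way to confirm the upgrade took effect is to load the config locally (a minimal sketch; it assumes `transformers >= 4.49.0` is installed, otherwise it raises the same ValueError shown above):

```python
from transformers import AutoConfig

# With transformers >= 4.49.0 the longrope validation accounts for the
# partial rotary factor, so this no longer raises the
# "short_factor field must have length 64, got 48" error.
config = AutoConfig.from_pretrained("microsoft/Phi-4-mini-instruct")
print(len(config.rope_scaling["short_factor"]))  # 48
```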
Hi @ykim362, I'm not using Transformers but the InferenceClient, and it still doesn't work, unfortunately.
I am trying to deploy the model directly from this page: https://huggingface.co/microsoft/Phi-4-mini-instruct
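For reference, this is roughly how the endpoint is being called (a minimal sketch with a placeholder endpoint URL; the failure reported above happens server-side while the endpoint starts, so the client never gets a healthy endpoint to talk to):

```python
from huggingface_hub import InferenceClient

# Placeholder endpoint URL; with the failing deployment the endpoint never
# becomes healthy, so this call cannot succeed regardless of client version.
client = InferenceClient(model="https://<your-endpoint>.endpoints.huggingface.cloud")
print(client.text_generation("Hello", max_new_tokens=32))
```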
If you want to fix it yourself, ensure that the partial rotary factor is taken into account (here hard-coded as 0.75) in `_rope_scaling_validation` in `configuration_phi3.py`:

```python
rotary_ndims = int(self.hidden_size // self.num_attention_heads * 0.75)
if not len(rope_scaling_short_factor) == rotary_ndims // 2:
    raise ValueError(
        f"`rope_scaling`'s short_factor field must have length {rotary_ndims // 2}, got {len(rope_scaling_short_factor)}"
    )
```