ValueError Rope Scaling

#22 · opened by clawvyrin

Hi, when I try to deploy the model via an HF Inference Endpoint, I get this: [Server Message] Endpoint Failed to Start.

With the following details:

Exit code: 1. Reason (truncated config dump followed by the error):

```
        1.0,
        1.0,
        1.0,
        1.0
      ],
      "type": "longrope"
    },
    "rope_theta": 10000.0,
    "transformers_version": "4.48.3",
    "use_cache": true,
    "vocab_size": 200064
  }

ValueError: rope_scaling's short_factor field must have length 64, got 48
{"timestamp":"2025-03-17T11:18:59.041473Z","level":"ERROR","fields":{"message":"Shard 0 failed to start"},"target":"text_generation_launcher"}
{"timestamp":"2025-03-17T11:18:59.041521Z","level":"INFO","fields":{"message":"Shutting down shards"},"target":"text_generation_launcher"}
Error: ShardCannotStart
```

I'm pretty new to using HF Endpoints, so I just wanted to know whether there is a way I can fix this myself, or if I need to wait for a model update or something like that.

Same thing here too:

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
in <cell line: 0>()
----> 1 generate(PHI3, messages)

5 frames
/usr/local/lib/python3.11/dist-packages/transformers/models/phi3/configuration_phi3.py in _rope_scaling_validation(self)
    206                 )
    207             if not len(rope_scaling_short_factor) == self.hidden_size // self.num_attention_heads // 2:
--> 208                 raise ValueError(
    209                     f"rope_scaling's short_factor field must have length {self.hidden_size // self.num_attention_heads // 2}, got {len(rope_scaling_short_factor)}"
    210                 )

ValueError: rope_scaling's short_factor field must have length 64, got 48
```

Microsoft org

Hi @clawvyrin and @3mar2000 ,

Thanks for your interest!
Yes, support for this model was added in the latest transformers (v4.49.0) and vLLM (v0.7.3).

Can you upgrade your transformers version, or patch in the changes below, and try again?

vLLM: https://github.com/vllm-project/vllm/pull/12718
HF: https://github.com/huggingface/transformers/pull/35947
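
As a quick local sanity check after upgrading, something like the following should work (a minimal sketch; it only assumes the model id from this page and the `rope_scaling` fields visible in the log above):

```python
# pip install --upgrade "transformers>=4.49.0"
from transformers import AutoConfig

# Instantiating the config runs the rope_scaling validation; on an
# older transformers this line raises the ValueError shown above.
config = AutoConfig.from_pretrained("microsoft/Phi-4-mini-instruct")

print(config.rope_scaling["type"])               # "longrope"
print(len(config.rope_scaling["short_factor"]))  # 48 once the fix is in
```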

Hi @ykim362, I'm not using Transformers but InferenceClient, and it still doesn't work, unfortunately.
I am trying to deploy the model directly from this page: https://huggingface.co/microsoft/Phi-4-mini-instruct

If you want to fix it yourself, patch the validation in `configuration_phi3.py` so that it accounts for the partial rotary factor (here hard-coded as 0.75):

```python
rotary_ndims = int(self.hidden_size // self.num_attention_heads * 0.75)
if not len(rope_scaling_short_factor) == rotary_ndims // 2:
    raise ValueError(
        f"rope_scaling's short_factor field must have length {rotary_ndims // 2}, got {len(rope_scaling_short_factor)}"
    )
```
