Gustavo de Rosa committed
Commit 57faa53 · Parent(s): 34d6e8c

chore(root): Adds top_k information even if 50 is already the default.

Files changed:
- README.md +2 -1
- generation_config.json +1 -0
README.md CHANGED

@@ -56,7 +56,7 @@ library_name: transformers
 ## Usage
 
 > [!IMPORTANT]
-> To fully take advantage of the model's capabilities, inference must use `temperature=0.8`, `top_p=0.95`, and `do_sample=True`. For more complex queries, set `max_new_tokens=32768` to allow for longer chain-of-thought (CoT).
+> To fully take advantage of the model's capabilities, inference must use `temperature=0.8`, `top_k=50`, `top_p=0.95`, and `do_sample=True`. For more complex queries, set `max_new_tokens=32768` to allow for longer chain-of-thought (CoT).
 
 ### Input Formats
 
@@ -88,6 +88,7 @@ outputs = model.generate(
     inputs.to(model.device),
     max_new_tokens=4096,
     temperature=0.8,
+    top_k=50,
     top_p=0.95,
     do_sample=True,
 )
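For context, a minimal self-contained sketch of the full `generate()` call that the README hunk above is excerpted from. The repo id `org/model-name` is a placeholder (the commit page does not name the model), and the chat-template plumbing is an assumption based on the standard transformers API:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/model-name"  # placeholder; substitute the actual repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "What is 7 * 6?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(
    inputs.to(model.device),
    max_new_tokens=4096,  # the README suggests 32768 for longer chain-of-thought
    temperature=0.8,
    top_k=50,             # the value this commit makes explicit
    top_p=0.95,
    do_sample=True,
)
# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```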
generation_config.json CHANGED

@@ -5,6 +5,7 @@
   "eos_token_id": 100265,
   "pad_token_id": 100349,
   "temperature": 0.8,
+  "top_k": 50,
   "top_p": 0.95,
   "transformers_version": "4.51.1"
 }
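Since `generation_config.json` supplies the default decoding parameters that `model.generate()` falls back on, the committed values can be inspected directly; a quick sketch, again with a placeholder repo id:

```python
from transformers import GenerationConfig

# Load the decoding defaults shipped with the checkpoint.
# "org/model-name" is a placeholder; substitute the actual repo id.
gen_config = GenerationConfig.from_pretrained("org/model-name")
print(gen_config.temperature)  # 0.8
print(gen_config.top_k)        # 50 (explicit after this commit; 50 is also the library default)
print(gen_config.top_p)        # 0.95
```

Explicit keyword arguments passed to `generate()` still override these file-level defaults, which is why making `top_k=50` visible in both the README and the config keeps the two in sync without changing behavior.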