sytelus committed
Commit
a54cd58
1 Parent(s): 2e0e076

Phrasing and other minor changes to sync with other readme

Files changed (1):
  README.md +3 -3
README.md CHANGED
@@ -12,9 +12,9 @@ tags:
 
 ## Model Summary
 
-Phi-3 Mini-4K-Instruct is a 3.8B parameters, lightweight, state-of-the-art open model built upon datasets used for Phi-2 - synthetic data and filtered websites - with a focus on very high-quality, reasoning dense data. The model belongs to the Phi-3 model family, and the Mini version comes in two variants [4K](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and [128K](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) which is the context length (in tokens) it can support.
+The Phi-3-Mini-4K-Instruct is a 3.8B parameters, lightweight, state-of-the-art open model trained with the Phi-3 datasets that includes both synthetic data and the filtered websites data with a focus on high-quality and reasoning dense properties. The model belongs to the Phi-3 family with the Mini version in two variants [4K](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and [128K](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct) which is the context length (in tokens) that it can support.
 
-The model underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
+The model has underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization for the instruction following and safety measures.
 When assessed against benchmarks testing common sense, language understanding, math, code, long context and logical reasoning, Phi-3 Mini-4K-Instruct showcased a robust and state-of-the-art performance among models with less than 13 billion parameters.
 
 Resources and Technical Documentation:
@@ -33,7 +33,7 @@ The model is intended for commercial and research use in English. The model prov
 
 1) Memory/compute constrained environments
 2) Latency bound scenarios
-3) Strong reasoning (especially math and logic)
+3) Strong reasoning (especially code, math and logic)
 
 Our model is designed to accelerate research on language and multimodal models, for use as a building block for generative AI powered features.
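The model summary above distinguishes the two Mini variants only by the context length (in tokens) they support. As a minimal sketch of acting on that distinction, assuming the "4K" and "128K" labels correspond to roughly 4,096 and 131,072 tokens (the exact counts are not stated in the card, and the `pick_variant` helper is hypothetical), one could select a repo ID by prompt length:

```python
# Repo IDs are taken from the model card links; the token counts for the
# "4K" and "128K" labels (4,096 and 131,072) are assumptions, not stated facts.
PHI3_MINI_VARIANTS = {
    "microsoft/Phi-3-mini-4k-instruct": 4_096,
    "microsoft/Phi-3-mini-128k-instruct": 131_072,
}

def pick_variant(prompt_tokens: int) -> str:
    """Return the smallest Phi-3 Mini variant whose context window fits the prompt."""
    # Iterate variants from smallest to largest context window.
    for repo_id, ctx_len in sorted(PHI3_MINI_VARIANTS.items(), key=lambda kv: kv[1]):
        if prompt_tokens <= ctx_len:
            return repo_id
    raise ValueError(f"Prompt of {prompt_tokens} tokens exceeds all known variants")
```

For example, `pick_variant(2_000)` returns the 4K repo ID, while `pick_variant(10_000)` falls through to the 128K variant.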