---
language:
- en
pipeline_tag: text-generation
tags:
- OpenVINO
- Phi-3
- PyTorch
- weight_compression
license: mit
library_name: transformers
---

# Phi-3-128K-Instruct-ov-fp16-int4-asym

## Model Description

This is a version of the original [Phi-3-128K-Instruct](https://huggingface.co/microsoft/Phi-3-128k-instruct) model, converted to OpenVINO™ IR (Intermediate Representation) format for optimized inference on Intel® hardware. It was created using the procedures detailed in the [OpenVINO™ Notebooks](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks) repository.
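
A conversion of this kind can be reproduced with Optimum Intel's export path. A minimal sketch, assuming the stock Transformers checkpoint (the output directory name is illustrative, and the exact steps in the notebooks may differ):

```python
from optimum.intel.openvino import OVModelForCausalLM

# Export the original PyTorch checkpoint to OpenVINO IR in one call
ov_model = OVModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-128k-instruct",
    export=True,             # convert the Transformers model to IR
    trust_remote_code=True,  # Phi-3 ships custom modeling code
)
ov_model.save_pretrained("./Phi-3-128K-Instruct-ov-fp16")  # writes openvino_model.xml/.bin
```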

## Intended Use

This model is designed for advanced natural language understanding and generation tasks. It is well suited to developers and researchers, in both academic and commercial settings, who need efficient AI capabilities on devices with limited computational power. It is not intended for creating or promoting harmful or illegal content, in accordance with the guidelines outlined in the Phi-3 Acceptable Use Policy.

## Licensing and Redistribution

This model is released under the [MIT license](https://huggingface.co/microsoft/Phi-3-128k-instruct/resolve/main/LICENSE). Redistribution requires inclusion of this license and a citation of the original model. Modifications and derivative works must prominently display "Built with Phi-3 Technology" and adhere to the redistribution policies detailed in the original model's license terms.

## Weight Compression Parameters

For more information on these parameters, refer to the [OpenVINO™ 2024.1.0 documentation](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html).

* mode: **INT4_ASYM**
* group_size: **128**
* ratio: **0.8**
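
These settings map onto Optimum Intel's `OVWeightQuantizationConfig` (INT4_ASYM corresponds to `bits=4, sym=False`). A minimal sketch of reproducing the compression; the exact invocation used to produce this model is an assumption:

```python
from optimum.intel.openvino import OVModelForCausalLM, OVWeightQuantizationConfig

# 4-bit asymmetric weight compression with the parameters listed above;
# ratio=0.8 leaves roughly 20% of the weights in 8-bit precision.
q_config = OVWeightQuantizationConfig(bits=4, sym=False, group_size=128, ratio=0.8)

model = OVModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-128k-instruct",
    export=True,
    quantization_config=q_config,
    trust_remote_code=True,
)
model.save_pretrained("./Phi-3-128K-Instruct-ov-fp16-int4-asym")
```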

## Running Model Inference

Install the packages required for using the [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO™ backend:

```bash
pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
```

Then load the model and run generation through a standard `transformers` pipeline:

```python
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

# Model id matching this card's title (uploader namespace assumed; the
# original snippet pointed at a mismatched "fp32" id)
model_id = "nsbendre25/Phi-3-128K-Instruct-ov-fp16-int4-asym"

# Initialize the tokenizer and the OpenVINO model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id)

# Build a text-generation pipeline on top of the OpenVINO model
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
pipe("I am in Paris, plan me a 2 week trip")
```
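
Since Phi-3 is a chat-tuned model, prompts generally work better when wrapped in the model's chat template. A minimal sketch, reusing `tokenizer` and `pipe` from the snippet above:

```python
# Wrap the request in Phi-3's chat template before generating
messages = [{"role": "user", "content": "I am in Paris, plan me a 2 week trip"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

outputs = pipe(prompt, max_new_tokens=512)
print(outputs[0]["generated_text"])
```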