lbourdois committed
Commit 87504dc · verified · 1 Parent(s): 21b54f1

Improve language tag


Hi! As the model is multilingual, this PR adds languages other than English to the language tag to improve how the model is referenced. Note that 29 languages are announced in the README, but only 13 are explicitly listed, so I was only able to add those 13 languages.

Files changed (1)
README.md +110 -98
README.md CHANGED
@@ -1,99 +1,111 @@
- ---
- license: apache-2.0
- language:
- - en
- base_model:
- - Qwen/Qwen2.5-3B-Instruct
- pipeline_tag: text-generation
- library_name: transformers
- ---
-
- # Qwen-2.5-3B-Instruct-ov-INT8
- * Model creator: [Qwen](https://huggingface.co/Qwen)
- * Original model: [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)
-
- ## Description
- This is [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
-
- ## Quantization Parameters
-
- Weight compression was performed using `nncf.compress_weights` with the following parameters:
-
- * mode: **int8_asym**
- * ratio: **0.8**
- * group_size: **128**
-
- For more information on quantization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html).
-
-
- ## Compatibility
-
- The provided OpenVINO™ IR model is compatible with:
-
- * OpenVINO version 2024.4.0 and higher
- * Optimum Intel 1.19.0 and higher
-
- ## Prompt Template
-
- ```
- <|im_start|>system
- You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
- <|im_start|>user
- {input}<|im_end|>
- ```
-
- ## Running Model Inference with [Optimum Intel](https://huggingface.co/docs/optimum/intel/index)
-
-
- 1. Install packages required for using [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
-
- ```
- pip install optimum[openvino]
- ```
-
- 2. Run model inference:
-
- ```
- from transformers import AutoTokenizer
- from optimum.intel.openvino import OVModelForCausalLM
-
- model_id = "srang992/Qwen-2.5-3B-Instruct-ov-INT8"
- tokenizer = AutoTokenizer.from_pretrained(model_id)
- model = OVModelForCausalLM.from_pretrained(model_id)
-
- inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
-
- outputs = model.generate(**inputs, max_length=200)
- text = tokenizer.batch_decode(outputs)[0]
- print(text)
- ```
-
- For more examples and possible optimizations, refer to the [OpenVINO Large Language Model Inference Guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html).
-
- ## Running Model Inference with [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai)
-
- 1. Install packages required for using OpenVINO GenAI.
- ```
- pip install openvino-genai huggingface_hub
- ```
-
- 2. Download model from HuggingFace Hub
-
- ```
- import huggingface_hub as hf_hub
-
- model_id = "srang992/Qwen-2.5-3B-Instruct-ov-INT8"
- model_path = "Qwen-2.5-3B-Instruct-ov-INT8"
-
- hf_hub.snapshot_download(model_id, local_dir=model_path)
-
- ```
-
- 3. Run model inference:
-
- ```
- import openvino_genai as ov_genai
-
- device = "CPU"
- pipe = ov_genai.LLMPipeline(model_path, device)
+ ---
+ license: apache-2.0
+ language:
+ - zho
+ - eng
+ - fra
+ - spa
+ - por
+ - deu
+ - ita
+ - rus
+ - jpn
+ - kor
+ - vie
+ - tha
+ - ara
+ base_model:
+ - Qwen/Qwen2.5-3B-Instruct
+ pipeline_tag: text-generation
+ library_name: transformers
+ ---
+
+ # Qwen-2.5-3B-Instruct-ov-INT8
+ * Model creator: [Qwen](https://huggingface.co/Qwen)
+ * Original model: [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct)
+
+ ## Description
+ This is the [Qwen2.5-3B-Instruct](https://huggingface.co/Qwen/Qwen2.5-3B-Instruct) model converted to the [OpenVINO™ IR](https://docs.openvino.ai/2024/documentation/openvino-ir-format.html) (Intermediate Representation) format, with weights compressed to INT8 by [NNCF](https://github.com/openvinotoolkit/nncf).
+
+ ## Quantization Parameters
+
+ Weight compression was performed using `nncf.compress_weights` with the following parameters (a sketch of the call is shown after this list):
+
+ * mode: **int8_asym**
+ * ratio: **0.8**
+ * group_size: **128**
+
+ For more information on quantization, check the [OpenVINO model optimization guide](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html).
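+
+ As a rough illustration only (not the exact script used to build this repository; the IR path below is hypothetical), the compression step could look like:
+
+ ```
+ import openvino as ov
+ import nncf
+
+ core = ov.Core()
+ # Hypothetical path to the exported, uncompressed OpenVINO IR
+ model = core.read_model("Qwen2.5-3B-Instruct-ov/openvino_model.xml")
+
+ # INT8 asymmetric weight compression, matching the mode reported above.
+ # The ratio and group_size values listed in this card are omitted here,
+ # since in NNCF they mainly drive mixed-precision (INT4) schemes.
+ compressed = nncf.compress_weights(model, mode=nncf.CompressWeightsMode.INT8_ASYM)
+
+ ov.save_model(compressed, "Qwen-2.5-3B-Instruct-ov-INT8/openvino_model.xml")
+ ```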
+
+ ## Compatibility
+
+ The provided OpenVINO™ IR model is compatible with:
+
+ * OpenVINO version 2024.4.0 and higher
+ * Optimum Intel 1.19.0 and higher
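+
+ One way to satisfy these minimum versions at install time (assuming the standard `openvino` and `optimum-intel` PyPI packages):
+
+ ```
+ pip install "openvino>=2024.4.0" "optimum-intel>=1.19.0"
+ ```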
+
+ ## Prompt Template
+
+ ```
+ <|im_start|>system
+ You are Qwen, created by Alibaba Cloud. You are a helpful assistant.<|im_end|>
+ <|im_start|>user
+ {input}<|im_end|>
+ ```
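+
+ Rather than assembling this string by hand, the chat template bundled with the tokenizer can produce it. A minimal sketch (`add_generation_prompt=True` appends the assistant turn marker for generation):
+
+ ```
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("srang992/Qwen-2.5-3B-Instruct-ov-INT8")
+ messages = [
+     {"role": "system", "content": "You are Qwen, created by Alibaba Cloud. You are a helpful assistant."},
+     {"role": "user", "content": "What is OpenVINO?"},
+ ]
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ print(prompt)
+ ```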
+
+ ## Running Model Inference with [Optimum Intel](https://huggingface.co/docs/optimum/intel/index)
+
+ 1. Install the packages required for using the [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO backend:
+
+ ```
+ pip install optimum[openvino]
+ ```
+
+ 2. Run model inference:
+
+ ```
+ from transformers import AutoTokenizer
+ from optimum.intel.openvino import OVModelForCausalLM
+
+ model_id = "srang992/Qwen-2.5-3B-Instruct-ov-INT8"
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
+ model = OVModelForCausalLM.from_pretrained(model_id)
+
+ inputs = tokenizer("What is OpenVINO?", return_tensors="pt")
+
+ outputs = model.generate(**inputs, max_length=200)
+ text = tokenizer.batch_decode(outputs)[0]
+ print(text)
+ ```
+
+ For more examples and possible optimizations, refer to the [OpenVINO Large Language Model Inference Guide](https://docs.openvino.ai/2024/learn-openvino/llm_inference_guide.html).
+
+ ## Running Model Inference with [OpenVINO GenAI](https://github.com/openvinotoolkit/openvino.genai)
+
+ 1. Install the packages required for using OpenVINO GenAI:
+
+ ```
+ pip install openvino-genai huggingface_hub
+ ```
+
+ 2. Download the model from the Hugging Face Hub:
+
+ ```
+ import huggingface_hub as hf_hub
+
+ model_id = "srang992/Qwen-2.5-3B-Instruct-ov-INT8"
+ model_path = "Qwen-2.5-3B-Instruct-ov-INT8"
+
+ hf_hub.snapshot_download(model_id, local_dir=model_path)
+ ```
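+
+ Equivalently, the command-line tool that ships with `huggingface_hub` can be used (same repository id as above):
+
+ ```
+ huggingface-cli download srang992/Qwen-2.5-3B-Instruct-ov-INT8 --local-dir Qwen-2.5-3B-Instruct-ov-INT8
+ ```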
+
+ 3. Run model inference:
+
+ ```
+ import openvino_genai as ov_genai
+
+ device = "CPU"
+ pipe = ov_genai.LLMPipeline(model_path, device)
  print(pipe.generate("What is OpenVINO?", max_length=200))
+ ```
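+
+ For token-by-token output, `LLMPipeline.generate` also accepts a streamer callback. A minimal sketch (the callback receives each decoded chunk; returning `False` keeps generation going):
+
+ ```
+ def streamer(subword):
+     print(subword, end="", flush=True)
+     return False  # continue generating
+
+ pipe.generate("What is OpenVINO?", max_length=200, streamer=streamer)
+ ```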