---
language:
- en
pipeline_tag: text-generation
tags:
- OpenVINO
- Phi-3
- PyTorch
- weight_compression
- optimum-intel
license: mit
library_name: transformers
---
# Phi-3-128K-Instruct-ov-fp16-int4-asym
## Model Description
This is a version of the original [Phi-3-128K-Instruct](https://huggingface.co/microsoft/Phi-3-128k-instruct) model converted to OpenVINO™ IR (Intermediate Representation) format for optimized inference on Intel® hardware. It was created using the procedures detailed in the [OpenVINO™ Notebooks](https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks) repository.
## Intended Use
This model is designed for advanced natural language understanding and generation tasks. It is well suited to developers and researchers, in academic and commercial settings alike, who need efficient AI capabilities on devices with limited computational power. It is not intended for creating or promoting harmful or illegal content, in accordance with the guidelines outlined in the Phi-3 Acceptable Use Policy.
## Licensing and Redistribution
This model is released under the [MIT license](https://huggingface.co/microsoft/Phi-3-128k-instruct/resolve/main/LICENSE).
## Weight Compression Parameters
For more information on the parameters, refer to the [OpenVINO™ 2024.1.0 documentation](https://docs.openvino.ai/2024/openvino-workflow/model-optimization-guide/weight-compression.html).
* mode: **INT4_ASYM**
* group_size: **128**
* ratio: **0.8**
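Comparable weights can be produced with the weight-compression API in Optimum Intel. The sketch below illustrates applying the parameters above via `OVWeightQuantizationConfig`; it is not necessarily the exact procedure used to build this model:
```python
# A sketch of INT4 asymmetric weight compression with Optimum Intel / NNCF
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

quant_config = OVWeightQuantizationConfig(
    bits=4,          # INT4 weights
    sym=False,       # asymmetric quantization (INT4_ASYM)
    group_size=128,  # per-group quantization scales
    ratio=0.8,       # compress 80% of weight layers to INT4, keep the rest in INT8
)
model = OVModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-128k-instruct",
    export=True,                      # convert from PyTorch to OpenVINO IR
    quantization_config=quant_config,
    trust_remote_code=True,
)
model.save_pretrained("Phi-3-128K-Instruct-ov-fp16-int4-asym")
```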
## Running Model Inference
Install the packages required to use the [Optimum Intel](https://huggingface.co/docs/optimum/intel/index) integration with the OpenVINO™ backend:
```bash
pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
```
Then load the model and run generation:
```python
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline

model_id = "nsbendre25/Phi-3-128K-Instruct-ov-fp16-int4-asym"

# Initialize the tokenizer and the OpenVINO model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id)

# Build a text-generation pipeline; torch_dtype and device_map are
# PyTorch-specific and do not apply to OpenVINO models
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
print(pipe("i am in paris, plan me a 2 week trip")[0]["generated_text"])
```
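Because this is an instruction-tuned model, prompts can also be formatted with the tokenizer's chat template and passed to `generate` directly. A minimal sketch, reusing the `tokenizer` and `model` objects from the snippet above:
```python
# Chat-style generation (a sketch; Phi-3 instruct models expect chat-formatted prompts)
messages = [{"role": "user", "content": "I am in Paris, plan me a 2 week trip."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=512)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```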