	---
	language:
	- en
	- ar
	library_name: openvino
	pipeline_tag: text-generation
	license: apache-2.0
	base_model: inceptionai/jais-13b
	tags:
	- openvino
	- optimized
	- int4
	- awq
	- bilingual
	- arabic
	- english
	- jais
	---

	# Jais-13B OpenVINO INT4

	This repository contains the [inceptionai/jais-13b](https://huggingface.co/inceptionai/jais-13b) model converted for inference with Intel's OpenVINO runtime. Its weights are quantized to INT4 with AWQ (Activation-aware Weight Quantization), which shrinks the model and speeds up inference while aiming to keep output quality close to the original.

	## Model Details

	* Original Model: [inceptionai/jais-13b](https://huggingface.co/inceptionai/jais-13b)
	* Model Type: Bilingual (Arabic-English) Large Language Model
	* Parameters: 13B
	* OpenVINO Version: 2024.0+
	* Quantization: INT4 Symmetric AWQ (Activation-aware Weight Quantization)
	* Group Size: -1 (per-channel quantization)
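As a rough, back-of-the-envelope illustration of what INT4 means for a 13B-parameter model, the sketch below compares weight storage at 16 and 4 bits per weight. It assumes exactly 13 billion parameters and ignores quantization scales, any layers kept at higher precision, and runtime memory such as the KV cache, so the real file sizes in this repository will differ.

```python
def weight_footprint_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# "13B" from the list above, treated here as exactly 13 billion (an assumption).
PARAMS = 13e9

fp16_gb = weight_footprint_gb(PARAMS, 16)
int4_gb = weight_footprint_gb(PARAMS, 4)

print(f"FP16 weights: ~{fp16_gb:.1f} GB")
print(f"INT4 weights: ~{int4_gb:.1f} GB")
print(f"Reduction:    ~{fp16_gb / int4_gb:.0f}x")
```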

	Jais-13B is a bilingual model that supports both Arabic and English text generation. The model can:
	- Generate fluent text in both Arabic and English
	- Respond to prompts in either language
	- Handle code-switching between the two languages

	## Optimization Details

	This model was converted from the original Hugging Face model to OpenVINO format using the Optimum Intel library. The following optimization command was used:

	```bash
	optimum-cli export openvino \
	-m inceptionai/jais-13b \
	--weight-format int4 \
	--sym \
	--dataset auto \
	--awq \
	--group-size -1 \
	--trust-remote-code \
	jais-13b-int4-sym-ov
	```
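The same settings can likely be expressed through the Optimum Intel Python API. The sketch below is an approximate equivalent of the command above, not the procedure actually used for this repository: `build_quantization_kwargs` and `export_int4` are hypothetical helper names, and the mapping from CLI flags to `OVWeightQuantizationConfig` arguments is an assumption that may vary between Optimum Intel versions.

```python
def build_quantization_kwargs() -> dict:
    """Assumed mapping of the CLI flags onto OVWeightQuantizationConfig arguments."""
    return {
        "bits": 4,              # --weight-format int4
        "sym": True,            # --sym
        "group_size": -1,       # --group-size -1
        "dataset": "auto",      # --dataset auto
        "quant_method": "awq",  # --awq
    }


def export_int4(model_id: str = "inceptionai/jais-13b",
                output_dir: str = "jais-13b-int4-sym-ov") -> None:
    """Export and quantize the model; downloads the full 13B checkpoint."""
    # Imported lazily so the helper above can be inspected without these packages.
    from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig
    from transformers import AutoTokenizer

    config = OVWeightQuantizationConfig(**build_quantization_kwargs())
    model = OVModelForCausalLM.from_pretrained(
        model_id,
        export=True,                  # convert from the PyTorch checkpoint
        quantization_config=config,
        trust_remote_code=True,       # --trust-remote-code
    )
    model.save_pretrained(output_dir)

    # Save the tokenizer next to the model so the folder is self-contained.
    tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
    tokenizer.save_pretrained(output_dir)
```

Calling `export_int4()` starts the conversion. Expect it to need substantial RAM and disk space, since the original checkpoint is loaded before compression.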

	### Optimization Parameters:
	- INT4 Quantization: Weights compressed to 4-bit integers
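	- Symmetric Quantization: Weights map to a signed range centered on zero with no zero-point offset, which is typically faster at inference than asymmetric quantization but can cost some accuracy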
	- AWQ: Activation-aware Weight Quantization to preserve model quality
	- Auto Dataset: `--dataset auto` lets Optimum Intel build the calibration set for AWQ automatically instead of using a named dataset
	- Group Size: -1 (quantize each output channel independently)
	- Trust Remote Code: Enabled to support custom model code
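To make "symmetric" and "group size -1" more concrete, here is a minimal pure-Python sketch of symmetric per-channel quantization to 4-bit integers. It is illustrative only: the helper names, the toy weights, and the choice of `max_abs / 7` as the scale are assumptions, it omits the AWQ step entirely, and it is not how NNCF or OpenVINO implement compression internally.

```python
def quantize_row_sym_int4(row):
    """Quantize one output channel to signed 4-bit integers with a single scale."""
    max_abs = max(abs(w) for w in row)
    # One scale per row and no zero point: this is the "symmetric" part.
    scale = max_abs / 7 if max_abs > 0 else 1.0
    # Clamp to the signed 4-bit range [-8, 7].
    q = [max(-8, min(7, round(w / scale))) for w in row]
    return q, scale


def dequantize_row(q, scale):
    """Recover approximate floating-point weights from the integers."""
    return [v * scale for v in q]


# Toy weight matrix: each row stands for one output channel.
weights = [
    [0.70, -0.30, 0.12, 0.00],
    [-0.02, 0.05, -0.08, 0.01],
]

for row in weights:
    q, scale = quantize_row_sym_int4(row)
    restored = dequantize_row(q, scale)
    max_err = max(abs(a - b) for a, b in zip(row, restored))
    print(q, f"scale={scale:.4f}", f"max_err={max_err:.4f}")
```

Because every row gets its own scale, the small-magnitude second row still uses the full integer range instead of being rounded toward zero by the larger values in the first row.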


	## Usage

	### Prerequisites
	- OpenVINO 2024.0 or newer
	- optimum-intel
	- transformers

	### Sample Inference Code with Optimum Intel

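	```python
	from optimum.intel import OVModelForCausalLM
	from transformers import AutoTokenizer

	# Load tokenizer and model.
	# trust_remote_code=True mirrors the --trust-remote-code flag used at export;
	# drop it if your installed versions load the model without custom code.
	model_id = "rpanchum/jais-13b-int4-sym-ov"
	tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
	model = OVModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

	# Tokenize the prompt (returns input_ids and attention_mask)
	prompt = "Write a short story about a robot learning to paint:"
	inputs = tokenizer(prompt, return_tensors="pt")

	# Generate text; do_sample=True is needed for temperature and top_p to take effect
	output = model.generate(
	    **inputs,
	    max_new_tokens=512,
	    do_sample=True,
	    temperature=0.7,
	    top_p=0.9,
	)
	response = tokenizer.decode(output[0], skip_special_tokens=True)
	print(response)
	```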

	### Alternative: Using OpenVINO GenAI

	1. Install packages required for using OpenVINO GenAI.
	```bash
	pip install openvino-genai huggingface_hub
	```

	2. Download model and run inference.

	```python
	import huggingface_hub as hf_hub
	import openvino_genai as ov_genai

	model_id = "rpanchum/jais-13b-int4-sym-ov"
	model_path = "jais-13b-int4-sym-ov"

	# Download the model files into a local directory
	hf_hub.snapshot_download(model_id, local_dir=model_path)

	# Build the pipeline, then run one Arabic and one English prompt
	device = "CPU"
	pipe = ov_genai.LLMPipeline(model_path, device)
	print(pipe.generate("ما هو الذكاء الاصطناعي؟", max_length=200))  # "What is AI?" in Arabic
	print(pipe.generate("What is artificial intelligence?", max_length=200))
	```
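The example above hard-codes `device = "CPU"`. If you want to pick a device at runtime, one possible approach is sketched below. `pick_device` and `list_openvino_devices` are hypothetical helpers, the preference order is an assumption, and whether a 13B model actually fits on a given GPU depends on its memory.

```python
def pick_device(available, preferred=("GPU", "CPU")):
    """Return the first available device matching the preference order.

    Matches exact names ("GPU") and indexed names ("GPU.0"); falls back to "CPU".
    """
    for family in preferred:
        for name in available:
            if name == family or name.startswith(family + "."):
                return name
    return "CPU"


def list_openvino_devices():
    """Query OpenVINO for the devices visible on this machine."""
    import openvino as ov  # imported lazily so pick_device works without OpenVINO

    return ov.Core().available_devices


# Demonstration with hard-coded device lists (no OpenVINO install needed).
print(pick_device(["CPU", "GPU.0", "GPU.1"]))
print(pick_device(["CPU"]))
```

With OpenVINO installed, `device = pick_device(list_openvino_devices())` can replace the hard-coded value passed to `ov_genai.LLMPipeline`.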


	## License

	This model inherits the license of the original [inceptionai/jais-13b](https://huggingface.co/inceptionai/jais-13b) model.