|
--- |
|
language: |
|
- en |
|
- ar |
|
library_name: openvino |
|
pipeline_tag: text-generation |
|
license: apache-2.0 |
|
base_model: inceptionai/jais-13b |
|
tags: |
|
- openvino |
|
- optimized |
|
- int4 |
|
- awq |
|
- bilingual |
|
- arabic |
|
- english |
|
- jais |
|
--- |
|
|
|
# Jais-13B OpenVINO INT4 |
|
|
|
|
This repository contains the [inceptionai/jais-13b](https://huggingface.co/inceptionai/jais-13b) model optimized for inference with Intel's OpenVINO runtime. The model has been quantized to INT4 using the AWQ quantization scheme for improved performance while maintaining quality. |
|
|
|
## Model Details |
|
|
|
* **Original Model**: [inceptionai/jais-13b](https://huggingface.co/inceptionai/jais-13b) |
|
* **Model Type**: Bilingual (Arabic-English) Large Language Model |
|
* **Parameters**: 13B |
|
* **OpenVINO Version**: 2024.0+ |
|
* **Quantization**: INT4 Symmetric AWQ (Activation-aware Weight Quantization) |
|
* **Group Size**: -1 (per-channel quantization) |
|
|
|
Jais-13B is a bilingual model that supports both Arabic and English text generation. The model can: |
|
- Generate fluent text in both Arabic and English |
|
- Respond to prompts in either language |
|
- Handle code-switching between the two languages |
|
|
|
## Optimization Details |
|
|
|
This model was converted from the original Hugging Face model to OpenVINO format using the Optimum Intel library. The following optimization command was used: |
|
|
|
```bash |
|
optimum-cli export openvino \ |
|
-m inceptionai/jais-13b \ |
|
--weight-format int4 \ |
|
--sym \ |
|
--dataset auto \ |
|
--awq \ |
|
--group-size -1 \ |
|
--trust-remote-code \ |
|
jais-13b-int4-sym-ov |
|
``` |
|
|
|
### Optimization Parameters
|
- **INT4 Quantization**: Weights compressed to 4-bit integers |
|
- **Symmetric Quantization**: Using symmetric quantization for better accuracy |
|
- **AWQ**: Activation-aware Weight Quantization to preserve model quality |
|
- **Auto Dataset**: Used automatic dataset sampling for calibration |
|
- **Group Size**: -1 (quantize each output channel independently) |
|
- **Trust Remote Code**: Enabled to support custom model code |
|
|
|
|
|
## Usage |
|
|
|
### Prerequisites |
|
- OpenVINO 2024.0 or newer |
|
- optimum-intel |
|
- transformers |
|
|
|
### Sample Inference Code with Optimum Intel
|
|
|
```python |
|
from optimum.intel import OVModelForCausalLM |
|
from transformers import AutoTokenizer |
|
|
|
# Load tokenizer and model |
|
model_id = "rpanchum/jais-13b-int4-sym-ov" |
|
tokenizer = AutoTokenizer.from_pretrained(model_id) |
|
model = OVModelForCausalLM.from_pretrained(model_id) |
|
|
|
# Generate text |
|
prompt = "Write a short story about a robot learning to paint:" |
|
inputs = tokenizer(prompt, return_tensors="pt")

output = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,  # sampling must be enabled for temperature/top_p to take effect
    temperature=0.7,
    top_p=0.9,
)
|
response = tokenizer.decode(output[0], skip_special_tokens=True) |
|
print(response) |
|
``` |
|
|
|
### Alternative: Using OpenVINO GenAI |
|
|
|
1. Install packages required for using OpenVINO GenAI. |
|
```bash |
|
pip install openvino-genai huggingface_hub |
|
``` |
|
|
|
2. Download model and run inference. |
|
|
|
```python |
|
import huggingface_hub as hf_hub |
|
|
|
model_id = "rpanchum/jais-13b-int4-sym-ov" |
|
model_path = "jais-13b-int4-sym-ov" |
|
|
|
hf_hub.snapshot_download(model_id, local_dir=model_path) |
|
|
|
import openvino_genai as ov_genai |
|
|
|
device = "CPU" |
|
pipe = ov_genai.LLMPipeline(model_path, device) |
|
print(pipe.generate("ما هو الذكاء الاصطناعي؟", max_length=200))  # "What is artificial intelligence?" in Arabic
|
print(pipe.generate("What is artificial intelligence?", max_length=200)) |
|
``` |
|
|
|
|
|
## License |
|
|
|
This model inherits the license of the original [inceptionai/jais-13b](https://huggingface.co/inceptionai/jais-13b) model. |
|
|