metadata
license: apache-2.0
datasets:
  - lambada
language:
  - en
library_name: transformers
pipeline_tag: text-generation
tags:
  - text-generation-inference
  - causal-lm
  - int8
  - ONNX
  - PostTrainingDynamic
  - Intel® Neural Compressor
  - neural-compressor

Model Details: INT8 GPT-J 6B

GPT-J 6B is a transformer model trained using Ben Wang's Mesh Transformer JAX. "GPT-J" refers to the class of model, while "6B" represents the number of trainable parameters.

This INT8 ONNX model was generated with Intel® Neural Compressor using post-training dynamic quantization. The FP32 ONNX model can be exported with the following command:

python -m transformers.onnx --model=EleutherAI/gpt-j-6B onnx_gptj/ --framework pt --opset 13 --feature=causal-lm-with-past
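
For intuition about what INT8 quantization does to a weight tensor, here is a minimal sketch in plain Python. It assumes simple symmetric per-tensor quantization and made-up weight values; it is an illustration only, not the scheme Intel® Neural Compressor or ONNX Runtime actually applies to this model. The helper names `quantize_int8` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(values: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(v) for v in values)
    # One scale for the whole tensor; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the INT8 codes."""
    return [q * scale for q in quantized]


# Illustrative weights (assumed values, not taken from GPT-J).
weights = [0.42, -1.27, 0.05, 0.9]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Rounding error per value is bounded by half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight is stored in one byte instead of four, which is where most of the size reduction comes from. In the dynamic variant, activation scales are computed at inference time rather than calibrated in advance.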

| Model Detail | Description |
| --- | --- |
| Model Authors - Company | Intel |
| Date | April 10, 2022 |
| Version | 1 |
| Type | Text Generation |
| Paper or Other Resources | - |
| License | Apache 2.0 |
| Questions or Comments | Community Tab |

| Intended Use | Description |
| --- | --- |
| Primary intended uses | You can use the raw model for text generation inference |
| Primary intended users | Anyone doing text generation inference |
| Out-of-scope uses | This model in most cases will need to be fine-tuned for your particular task. The model should not be used to intentionally create hostile or alienating environments for people. |

How to use

Download the model and script by cloning the repository:

git clone https://huggingface.co/Intel/gpt-j-6B-int8-dynamic

Then run inference with the model by following the included notebook 'evaluation.ipynb'.

Metrics (Model Performance):

| Model | Model Size (GB) | Lambada Acc |
| --- | --- | --- |
| FP32 | 23 | 0.7954 |
| INT8 | 6 | 0.7926 |
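
To put the two rows in perspective, the short calculation below derives the size reduction and the accuracy drop from the reported figures. It uses only the numbers in the table above; the variable names are illustrative.

```python
# Figures copied from the metrics table.
fp32_size_gb, int8_size_gb = 23, 6
fp32_acc, int8_acc = 0.7954, 0.7926

compression_ratio = fp32_size_gb / int8_size_gb
absolute_drop = fp32_acc - int8_acc
relative_drop_pct = 100 * absolute_drop / fp32_acc

print(f"Size reduction: {compression_ratio:.2f}x")
print(f"Accuracy drop: {absolute_drop:.4f} absolute ({relative_drop_pct:.2f}% relative)")
```

In other words, the INT8 model is a little under four times smaller while losing well under one percent of the FP32 Lambada accuracy.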