# Detoxify ONNX πŸš€

This project provides an ONNX-exported and quantized version of the Detoxify multilingual model, enabling faster and lighter toxicity detection with ONNX Runtime.
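
The export and quantization steps are not documented in this card, but a comparable artifact can be produced with `optimum`. A minimal sketch follows; the base checkpoint name (`unitary/multilingual-toxic-xlm-roberta`) and the dynamic-quantization config are illustrative assumptions, not necessarily the exact settings used for this model:

```python
# Sketch: export a Detoxify checkpoint to ONNX and apply dynamic INT8
# quantization with optimum. Checkpoint name and quantization config are
# assumptions; the actual pipeline lives in the GitHub repository.
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export the PyTorch checkpoint to ONNX
onnx_model = ORTModelForSequenceClassification.from_pretrained(
    "unitary/multilingual-toxic-xlm-roberta", export=True
)
onnx_model.save_pretrained("detoxify-onnx")

# Dynamic (weight-only) quantization; no calibration data needed
quantizer = ORTQuantizer.from_pretrained(onnx_model)
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="detoxify-onnx", quantization_config=qconfig)
```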

## πŸ§ͺ ONNX Evaluation Results

**Original model** (evaluated with the Detoxify library and ONNX):

| Threshold | Accuracy | Precision | Recall | F1     | AUC-ROC |
|-----------|----------|-----------|--------|--------|---------|
| 0.2       | 0.8408   | 0.4899    | 0.8659 | 0.6257 | 0.9345  |
| 0.4       | 0.8723   | 0.5628    | 0.7577 | 0.6459 | 0.9345  |
| 0.5       | 0.8845   | 0.6073    | 0.7041 | 0.6521 | 0.9345  |
| 0.7       | 0.8954   | 0.6951    | 0.5691 | 0.6258 | 0.9345  |
| 0.9       | 0.8941   | 0.8501    | 0.3780 | 0.5234 | 0.9345  |

Time per threshold evaluation: β‰ˆ 3 min 30 s

**Quantized model:**

| Threshold | Accuracy | Precision | Recall | F1     | AUC-ROC |
|-----------|----------|-----------|--------|--------|---------|
| 0.2       | 0.8581   | 0.5249    | 0.8154 | 0.6387 | 0.9306  |
| 0.4       | 0.8809   | 0.6001    | 0.6748 | 0.6353 | 0.9306  |
| 0.5       | 0.8880   | 0.6408    | 0.6179 | 0.6291 | 0.9306  |
| 0.7       | 0.8969   | 0.7467    | 0.4984 | 0.5978 | 0.9306  |
| 0.9       | 0.8869   | 0.8878    | 0.3024 | 0.4512 | 0.9306  |

Time per threshold evaluation: β‰ˆ 2 min 41 s
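
The per-threshold metrics above follow the standard scikit-learn definitions; note that AUC-ROC is threshold-free, which is why it is constant across rows. A minimal sketch of such a sweep (function and variable names are illustrative, not the repository's actual evaluation script):

```python
# Sketch: sweep classification thresholds over binary labels and predicted
# toxicity probabilities. Illustrative only; see the GitHub repo for the
# actual evaluation scripts.
import numpy as np
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score, roc_auc_score
)

def sweep_thresholds(y_true, y_prob, thresholds=(0.2, 0.4, 0.5, 0.7, 0.9)):
    # AUC-ROC does not depend on the threshold, so compute it once
    auc = roc_auc_score(y_true, y_prob)
    for t in thresholds:
        y_pred = (np.asarray(y_prob) >= t).astype(int)
        print(f"{t:.1f}  "
              f"acc={accuracy_score(y_true, y_pred):.4f}  "
              f"prec={precision_score(y_true, y_pred):.4f}  "
              f"rec={recall_score(y_true, y_pred):.4f}  "
              f"f1={f1_score(y_true, y_pred):.4f}  "
              f"auc={auc:.4f}")
```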

## πŸ€— Usage

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSequenceClassification
import numpy as np

# Load the quantized ONNX model and tokenizer via optimum
model = ORTModelForSequenceClassification.from_pretrained(
    "gravitee-io/detoxify-onnx", file_name="model.quant.onnx"
)
tokenizer = AutoTokenizer.from_pretrained("gravitee-io/detoxify-onnx")

# Tokenize input
text = "Your comment here"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

# Run inference
outputs = model(**inputs)
logits = outputs.logits

# Optional: convert logits to per-label probabilities (sigmoid)
probs = 1 / (1 + np.exp(-logits.numpy()))
print(probs)
```
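
Detoxify is a multi-label classifier, so a sigmoid is applied per label rather than a softmax across labels. Assuming the exported config carries the usual `id2label` mapping (standard `transformers` behavior), the label names can be printed alongside the probabilities:

```python
# Print each toxicity label with its probability (one sigmoid per head)
for i, p in enumerate(probs[0]):
    print(model.config.id2label[i], round(float(p), 4))
```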

πŸ™ GitHub Repository:

You can find the full source code, CLI tools, and evaluation scripts in the official GitHub repository.

Base model: unitary/toxic-bert