# Llama-Prompt-Guard-2-86M-onnx

This repository provides an ONNX-converted and quantized version of [meta-llama/Llama-Prompt-Guard-2-86M](https://huggingface.co/meta-llama/Llama-Prompt-Guard-2-86M).

## 🧠 Built With

This model targets ONNX Runtime and is loaded through 🤗 Optimum (see the usage example below).
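The exact export pipeline for this repository lives in the GitHub project linked below; as a rough sketch, an ONNX export plus dynamic quantization with 🤗 Optimum looks like the following. The quantization preset and output paths here are illustrative assumptions, not necessarily what was used for this repo.

```python
# Illustrative sketch only; the actual pipeline for this repo may differ.
# Requires optimum[onnxruntime] and access to the gated base model.
from optimum.onnxruntime import ORTModelForSequenceClassification, ORTQuantizer
from optimum.onnxruntime.configuration import AutoQuantizationConfig

# Export the PyTorch checkpoint to ONNX
model = ORTModelForSequenceClassification.from_pretrained(
    "meta-llama/Llama-Prompt-Guard-2-86M", export=True
)
model.save_pretrained("prompt-guard-onnx")

# Dynamically quantize the exported graph (preset is an assumption;
# the resulting file name may differ from this repo's model.quant.onnx)
quantizer = ORTQuantizer.from_pretrained("prompt-guard-onnx", file_name="model.onnx")
qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)
quantizer.quantize(save_dir="prompt-guard-onnx", quantization_config=qconfig)
```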

## 📥 Evaluation Dataset

We use the [jackhhao/jailbreak-classification](https://huggingface.co/datasets/jackhhao/jailbreak-classification) dataset for evaluation.
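For reference, the dataset can be loaded with the 🤗 `datasets` library. The split and column names below (`test`, `prompt`, `type`) are assumptions based on the dataset card and should be verified:

```python
from datasets import load_dataset

# Split and column names ("test", "prompt", "type") are assumptions;
# check the dataset card before relying on them.
ds = load_dataset("jackhhao/jailbreak-classification", split="test")
print(ds[0])  # e.g. {"prompt": "...", "type": "benign" | "jailbreak"}
```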

## 🧪 Evaluation Results

| Model | Accuracy | Precision | Recall | F1 Score | AUC-ROC | Inference Time |
|-------|----------|-----------|--------|----------|---------|----------------|
| Llama-Prompt-Guard-2-22M | 0.9569 | 0.9879 | 0.9260 | 0.9559 | 0.9259 | 33s |
| Llama-Prompt-Guard-2-22M-q | 0.9473 | 1.0000 | 0.8956 | 0.9449 | 0.9032 | 29s |
| Llama-Prompt-Guard-2-86M | 0.9770 | 0.9980 | 0.9564 | 0.9767 | 0.9523 | 1m29s |
| Llama-Prompt-Guard-2-86M-q | 0.8937 | 1.0000 | 0.7894 | 0.8823 | 0.7263 | 1m15s |
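The actual evaluation scripts live in the GitHub repository linked below; the snippet here is a minimal sketch of how comparable metrics could be computed with scikit-learn, assuming the dataset layout above and treating class index 1 as the jailbreak class:

```python
import numpy as np
from datasets import load_dataset
from sklearn.metrics import (
    accuracy_score, precision_score, recall_score, f1_score, roc_auc_score
)
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSequenceClassification

repo = "gravitee-io/Llama-Prompt-Guard-2-86M-onnx"
model = ORTModelForSequenceClassification.from_pretrained(repo, file_name="model.quant.onnx")
tokenizer = AutoTokenizer.from_pretrained(repo)

# Dataset layout (split/column names, label strings) is an assumption
ds = load_dataset("jackhhao/jailbreak-classification", split="test")

y_true, y_score = [], []
for row in ds:
    inputs = tokenizer(row["prompt"], return_tensors="pt", truncation=True)
    logits = model(**inputs).logits.detach().numpy()[0]
    probs = np.exp(logits) / np.exp(logits).sum()  # softmax over the two classes
    y_score.append(float(probs[1]))                # index 1 assumed to be "jailbreak"
    y_true.append(1 if row["type"] == "jailbreak" else 0)

y_pred = [int(s >= 0.5) for s in y_score]
print("Accuracy :", accuracy_score(y_true, y_pred))
print("Precision:", precision_score(y_true, y_pred))
print("Recall   :", recall_score(y_true, y_pred))
print("F1 Score :", f1_score(y_true, y_pred))
print("AUC-ROC  :", roc_auc_score(y_true, y_score))
```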

## 🤗 Usage

```python
from transformers import AutoTokenizer
from optimum.onnxruntime import ORTModelForSequenceClassification
import numpy as np

# Load the quantized ONNX model and tokenizer via optimum
model = ORTModelForSequenceClassification.from_pretrained(
    "gravitee-io/Llama-Prompt-Guard-2-86M-onnx", file_name="model.quant.onnx"
)
tokenizer = AutoTokenizer.from_pretrained("gravitee-io/Llama-Prompt-Guard-2-86M-onnx")

# Tokenize input
text = "Your prompt here"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)

# Run inference
outputs = model(**inputs)
logits = outputs.logits.detach().numpy()

# Optional: convert the two-class logits to probabilities.
# A softmax (rather than an element-wise sigmoid) is the right conversion
# for a classification head with one logit per class.
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
print(probs)
```
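To map the scores back to class names, the `id2label` mapping from the model config can be used; note that the exact label strings depend on the exported config:

```python
# Pick the most likely class and look up its name in the model config
# (label strings depend on the exported config, e.g. "LABEL_0"/"LABEL_1")
pred_id = int(np.argmax(probs, axis=-1)[0])
print(model.config.id2label[pred_id])
```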

πŸ™ GitHub Repository:

You can find the full source code, CLI tools, and evaluation scripts in the official GitHub repository.
