Model Card: HySAC
Hyperbolic Safety-Aware CLIP (HySAC), introduced in the paper Hyperbolic Safety-Aware Vision-Language Models, models hierarchical safety relations in hyperbolic space. This enables effective retrieval of unsafe content and its dynamic redirection to safer alternatives, for enhanced content moderation.
NSFW Definition
In our work we adopt Safe-CLIP's definition of NSFW: a finite and fixed set of concepts that are considered inappropriate, offensive, or harmful to individuals. These concepts are divided into seven categories: hate, harassment, violence, self-harm, sexual, shocking, and illegal activities.
Use HySAC
See the snippet below for downloading and using HySAC. Before proceeding, make sure to install the HySAC code from our GitHub repository.
>>> from hysac.models import HySAC
>>> # Download the weights from the Hugging Face Hub and load the model on GPU
>>> model_id = "aimagelab/hysac"
>>> model = HySAC.from_pretrained(model_id, device="cuda").to("cuda")
Use the standard methods encode_image and encode_text to encode images and text with the model.
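As a minimal sketch (the preprocess and tokenize helpers below are hypothetical stand-ins for the CLIP-compatible preprocessing and tokenization provided in the repository):
>>> import torch
>>> from PIL import Image
>>> # Hypothetical helpers: prepare a batched image tensor and token ids
>>> image = preprocess(Image.open("photo.jpg")).unsqueeze(0).to("cuda")
>>> tokens = tokenize(["a photo of a dog"]).to("cuda")
>>> with torch.no_grad():
...     image_features = model.encode_image(image)
...     text_features = model.encode_text(tokens)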
Before using the model to retrieve data, you can apply the safety traversal to a query embedding by calling the traverse_to_safe_image and traverse_to_safe_text methods, as sketched below.
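A hedged example, continuing the snippet above (the scoring used for retrieval in hyperbolic space is defined in the repository; the matrix product here is only an illustrative placeholder, and gallery_features is a hypothetical tensor of precomputed candidate embeddings):
>>> # Redirect potentially unsafe queries toward the safe region of the space
>>> safe_text_features = model.traverse_to_safe_text(text_features)
>>> safe_image_features = model.traverse_to_safe_image(image_features)
>>> # Illustrative ranking of a gallery by the traversed text query
>>> scores = safe_text_features @ gallery_features.T
>>> best_match = scores.argmax(dim=-1)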
Model Details
HySAC is a version of the CLIP model fine-tuned in hyperbolic space. Fine-tuning is performed on the ViSU (Visual Safe and Unsafe) dataset, introduced in the same paper.
ViSU contains quadruplets: safe and NSFW sentence pairs along with corresponding safe and NSFW images. The text portion of the ViSU dataset is publicly released on the HuggingFace ViSU-Text page. We decided not to release the vision portion of the dataset due to the presence of extremely inappropriate images, which have the potential to cause harm and distress to individuals. Consequently, releasing this part of the dataset would be irresponsible and contrary to the principles of safe and ethical use of AI technology. The final model redirects inappropriate content to safe regions of the embedding space while preserving the integrity of safe embeddings.
Model release date: 17 March 2025.
For more information about the model, training details, dataset, and evaluation, please refer to the paper. You can find more in the paper's repository here.
Official Repository
More example code is available in the official HySAC repo.
Citation
Please cite with the following BibTeX:
@inproceedings{poppi2025hyperbolic,
  title={{Hyperbolic Safety-Aware Vision-Language Models}},
  author={Poppi, Tobia and Kasarla, Tejaswi and Mettes, Pascal and Baraldi, Lorenzo and Cucchiara, Rita},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2025}
}