PerceptCLIP-Memorability is a model designed to predict image memorability (the likelihood that an image will be remembered). This is the official model from the paper "Don't Judge Before You CLIP: A Unified Approach for Perceptual Tasks". We apply LoRA adaptation to the CLIP visual encoder and add an MLP head on top. Our model achieves state-of-the-art results on image memorability prediction.

Training Details

  • Dataset: LaMem (Large-Scale Image Memorability)
  • Architecture: CLIP Vision Encoder (ViT-L/14) with LoRA adaptation (see the sketch after this list)
  • Loss Function: Mean Squared Error (MSE) Loss for memorability prediction
  • Optimizer: AdamW
  • Learning Rate: 5e-05
  • Batch Size: 32
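
The recipe above corresponds to a standard parameter-efficient fine-tuning loop. The following is a minimal sketch, not the authors' exact code: the LoRA rank/alpha, target modules, and MLP head sizes are assumptions not specified in the list.

import torch
import torch.nn as nn
from transformers import CLIPVisionModel
from peft import LoraConfig, get_peft_model

class MemorabilityModel(nn.Module):
    def __init__(self):
        super().__init__()
        encoder = CLIPVisionModel.from_pretrained("openai/clip-vit-large-patch14")
        # LoRA on the attention projections (target modules are an assumption)
        lora_cfg = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"])
        self.encoder = get_peft_model(encoder, lora_cfg)
        # MLP head from the pooled CLIP embedding (1024-d for ViT-L/14) to one score
        self.head = nn.Sequential(nn.Linear(1024, 512), nn.ReLU(), nn.Linear(512, 1))

    def forward(self, pixel_values):
        pooled = self.encoder(pixel_values=pixel_values).pooler_output
        return self.head(pooled)

model = MemorabilityModel()
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
criterion = nn.MSELoss()

def train_step(images, targets):
    # images: (32, 3, 224, 224) preprocessed batch; targets: (32, 1) memorability scores
    optimizer.zero_grad()
    loss = criterion(model(images), targets)
    loss.backward()
    optimizer.step()
    return loss.item()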

Requirements

  • python=3.9.15
  • cudatoolkit=11.7
  • torchvision=0.14.0
  • transformers=4.45.2
  • peft=0.14.0
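
A minimal install sketch (assuming the default PyTorch 1.13.0 wheel, which ships with CUDA 11.7 and pairs with torchvision 0.14.0):

pip install torch==1.13.0 torchvision==0.14.0 transformers==4.45.2 peft==0.14.0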

Usage

To use the model for inference:

from torchvision import transforms
import torch
from PIL import Image
from huggingface_hub import hf_hub_download
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Download the checkpoint and load the serialized model
model_path = hf_hub_download(repo_id="PerceptCLIP/PerceptCLIP_Memorability", filename="perceptCLIP_Memorability.pth")
model = torch.load(model_path, map_location=device).eval()  # on torch >= 2.6, pass weights_only=False

# Load an image
image = Image.open("image_path.jpg").convert("RGB")

# Preprocessing: resize, center-crop to 224x224, and normalize with CLIP's image statistics
def Mem_preprocess():
    return transforms.Compose([
        transforms.Resize(224),
        transforms.CenterCrop(size=(224, 224)),
        transforms.ToTensor(),
        transforms.Normalize(mean=(0.48145466, 0.4578275, 0.40821073),
                             std=(0.26862954, 0.26130258, 0.27577711))
    ])

input_tensor = Mem_preprocess()(image).unsqueeze(0).to(device)

with torch.no_grad():
    mem_score = model(input_tensor).item()

print(f"Predicted Memorability Score: {mem_score:.4f}")