allenai/wildguard · Instructions for moderating ONLY user prompts

Hi and thanks for this model.

I see from the model card that instruction_format has placeholders for both the user prompt and the assistant response.
I was wondering whether:

1. I should use specific instructions for classifying prompts only (e.g., at the beginning of a conversation, before having a response from the assistant), or
2. I can use the provided instructions with an empty response:

model_input = instruction_format.format(prompt="How can I rob the bank?", response="")
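To make the second option concrete, here is a minimal sketch of what I have in mind. The template below is only a stand-in: the {prompt} and {response} placeholders match the model card, but the surrounding wording is a placeholder, and build_prompt_only_input is a hypothetical helper name. Whether the model behaves well with an empty response is exactly what I am unsure about.

```python
# Stand-in for the model card's instruction_format. Only the {prompt} and
# {response} placeholders are taken from the card; the rest is illustrative.
instruction_format = (
    "Human user:\n{prompt}\n\n"
    "AI assistant:\n{response}\n\n"
    "---\n\nAnswers:"
)


def build_prompt_only_input(prompt: str) -> str:
    """Fill the template with the user prompt and an empty assistant response."""
    return instruction_format.format(prompt=prompt, response="")


model_input = build_prompt_only_input("How can I rob the bank?")
print(model_input)
```

With this approach the assistant section is still present in the input, just empty, so the model would presumably still emit its response-related labels, and I would only read the prompt-harm label.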

If the first option is the right one, could you provide those instructions?

Thanks,

Elias