Update README.md
README.md
CHANGED
@@ -23,6 +23,14 @@ Made with Exllamav2 0.2.8 with the default dataset. Granite3 models require Exll
 Exl2 models don't support native RAM offloading, so the model has to fully fit into GPU VRAM.
 It's also required to use Nvidia RTX on Windows or Nvidia RTX/AMD ROCm on Linux.
 
+In case you downloaded the model and it answers only Yes/No: that's [intended behavior](https://github.com/ibm-granite/granite-guardian/tree/main#scope-of-use).
+This is hardcoded in the model's Jinja2 chat template, which can be viewed in the tokenizer_config.json file.
+By default, in chat mode it evaluates whether the user's or assistant's message is harmful in a general sense, according to the model's risk definitions.
+However, it also lets you choose a different predefined risk, set custom harm definitions, or detect risks in RAG or function-calling pipelines.
+If you're using TabbyAPI, you can set either risk_name or risk_definition via [template variables](https://github.com/theroyallab/tabbyAPI/wiki/04.-Chat-Completions#template-variables).
+For example, you can switch to violence detection by adding ``"template_vars": {"guardian_config": {"risk_name": "violence"}}`` to a v1/chat/completions request (see the request sketch below).
+For more information, refer to the Granite Guardian [documentation](https://github.com/ibm-granite/granite-guardian) and the model's Jinja2 template.
+
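For illustration, here are two short Python sketches. The first just dumps the Jinja2 chat template so you can see the hardcoded Yes/No logic; it assumes you run it from the model directory and that the template sits under the usual chat_template key of tokenizer_config.json.

```python
import json

# Print the Jinja2 chat template that drives the Yes/No answers.
# Assumes the current directory is the model directory.
with open("tokenizer_config.json", encoding="utf-8") as f:
    print(json.load(f)["chat_template"])
```

The second is a minimal sketch of the TabbyAPI request described above, using the requests library. The endpoint URL, x-api-key header, API key, and model name are placeholders for a typical local TabbyAPI setup; only the template_vars payload shape comes from the example line in the diff.

```python
import requests

TABBY_URL = "http://127.0.0.1:5000/v1/chat/completions"  # adjust to your instance
API_KEY = "your-tabbyapi-key"                            # placeholder

payload = {
    "model": "granite-guardian-3.1-8b-exl2",  # whatever name your loaded model has
    "messages": [{"role": "user", "content": "Message to be screened goes here."}],
    "max_tokens": 5,    # the guardian only answers Yes/No
    "temperature": 0,
    # TabbyAPI template variables: switch from the default general-harm check
    # to the predefined "violence" risk. A custom definition would presumably go
    # in "risk_definition" instead of "risk_name" (see the Granite Guardian docs).
    "template_vars": {"guardian_config": {"risk_name": "violence"}},
}

resp = requests.post(TABBY_URL, json=payload, headers={"x-api-key": API_KEY}, timeout=60)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"].strip())
```

If everything is wired up, the printed reply should be just Yes or No, as described in the scope-of-use note above.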
 # Original model card
 # Granite Guardian 3.1 8B
 