cgus committed
Commit fc16dae · verified · 1 Parent(s): 7567f03

Update README.md

Files changed (1): README.md +8 -0
README.md CHANGED
@@ -23,6 +23,14 @@ Made with Exllamav2 0.2.8 with the default dataset. Granite3 models require Exll
  Exl2 models don't support native RAM offloading, so the model has to fully fit into GPU VRAM.
  It's also required to use Nvidia RTX on Windows or Nvidia RTX/AMD ROCm on Linux.
 
+ If you downloaded the model and it only answers Yes/No, that's [intended behavior](https://github.com/ibm-granite/granite-guardian/tree/main#scope-of-use).
+ This is hardcoded in the model's Jinja2 template, which can be viewed in the tokenizer_config.json file.
+ By default, in chat mode it evaluates whether the user's or the assistant's message is harmful in a general sense, according to the model's risk definitions.
+ However, it also lets you choose a different predefined option, set custom harm definitions, or detect risks in RAG or function-calling pipelines.
+ If you're using TabbyAPI, you can set either risk_name or risk_definition via [template variables](https://github.com/theroyallab/tabbyAPI/wiki/04.-Chat-Completions#template-variables).
+ For example, you can switch to violence detection by adding ``"template_vars": {"guardian_config": {"risk_name": "violence"}}`` to the v1/chat/completions request (see the request sketch below).
+ For more information, refer to the Granite Guardian [documentation](https://github.com/ibm-granite/granite-guardian) and the model's Jinja2 template.
+
  # Original model card
  # Granite Guardian 3.1 8B
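
To make the ``template_vars`` example above concrete, here is a minimal request sketch in Python. It assumes a local TabbyAPI instance on its default port with this model loaded; the host, port, auth header name, and sample message are assumptions, while the ``template_vars`` payload follows the TabbyAPI wiki page linked in the diff.

```python
import requests

# Assumed local TabbyAPI endpoint; adjust the host/port to your setup.
URL = "http://localhost:5000/v1/chat/completions"
HEADERS = {"x-api-key": "your-tabbyapi-key"}  # placeholder key; auth header name is an assumption

payload = {
    "messages": [
        {"role": "user", "content": "Example message to screen."}
    ],
    # Template variable from the README change above: switch the guardian
    # from the default general-harm check to violence detection.
    "template_vars": {"guardian_config": {"risk_name": "violence"}},
}

response = requests.post(URL, headers=HEADERS, json=payload)
# Per the model's Jinja2 template, the reply is just "Yes" or "No".
print(response.json()["choices"][0]["message"]["content"])
```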