kgreenewald committed
Commit f582866 · verified · 1 Parent(s): 6f90e1d

Update README.md

Files changed (1): README.md (+7 −7)

README.md CHANGED
@@ -16,7 +16,7 @@ keep an eye out for feedback and questions in the [Community section](https://hu
 
 ## Model Summary
 
-**Granite Intrinsics 3.0 8b Instruct v0.1** is a LoRA adapter for [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct),
+**Granite 3.0 8B Instruct - Intrinsics LoRA v0.1** is a LoRA adapter for [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct),
 providing access to the Uncertainty, Hallucination Detection, and Safety Exception intrinsics in addition to retaining the full abilities of the [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct) model.
 
 - **Developer:** IBM Research
@@ -49,7 +49,7 @@ The Safety Exception intrinsic was designed as a binary classifier that analyses
 
 This is an experimental LoRA testing new functionality being developed for IBM's Granite LLM family. We are welcoming the community to test it out and give us feedback, but we are NOT recommending this model be used for real deployments at this time. Stay tuned for more updates on the Granite roadmap.
 
-**Granite Intrinsics 3.0 8b v0.1** is lightly tuned so that its behavior closely mimics that of [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct),
+**Granite 3.0 8B Instruct - Intrinsics LoRA v0.1** is lightly tuned so that its behavior closely mimics that of [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct),
 with the added ability to generate the three specified intrinsics.
 
 
@@ -89,7 +89,7 @@ we can evaluate the certainty and hallucination status of this reply by invoking
 
 
 ### Intrinsics Example with PDL
-Given a hosted instance of **Granite Intrinsics 3.0 8b Instruct v0.1** at `API_BASE` (insert the host address here), this uses the [PDL language](https://github.com/IBM/prompt-declaration-language) to implement the RAG intrinsic invocation scenario described above.
+Given a hosted instance of **Granite 3.0 8B Instruct - Intrinsics LoRA v0.1** at `API_BASE` (insert the host address here), this uses the [PDL language](https://github.com/IBM/prompt-declaration-language) to implement the RAG intrinsic invocation scenario described above.
 Note that the hosted instance must be supported by LiteLLM ([https://docs.litellm.ai/docs/providers](https://docs.litellm.ai/docs/providers))
 
 First, create a file `intrinsics.pdl` with the following content.
@@ -288,7 +288,7 @@ def main_chat_flow (s, doc, query):
 
 
 if __name__ == "__main__":
-    model_path = "ibm-granite/granite-intrinsics-3.0-8b-lora-v0.1"
+    model_path = "ibm-granite/granite-3.0-8b-lora-intrinsics-v0.1"
 
     # Setting the model_path to the granite model, and chat template to be the granite template
     # This assumes "granite3-instruct" chat template has been registered in "sglang/lang/chat_template.py"
@@ -317,20 +317,20 @@ red-teamed examples.
 ## Evaluation
 We evaluate the performance of the intrinsics themselves and the RAG performance of the model.
 
-We first find that the performance of the intrinsics in our shared model **Granite Instrinsics 3.0 8b v0.1** is not degraded
+We first find that the performance of the intrinsics in our shared model **Granite 3.0 8B Instruct - Intrinsics LoRA v0.1** is not degraded
 versus the baseline procedure of maintaining 3 separate instrinsic models. Here, percent error is shown for the Hallucination Detection and Safety Exception intrinsics as they have
 binary output, and Mean Absolute Error (MAE) is shown for the Uncertainty Intrinsic as it outputs numbers 0 to 9. For all, lower is better. Performance is calculated on a randomly drawn 400 sample validation set from each intrinsic's dataset.
 
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6602ffd971410cf02bf42c06/NsvMpweFjmjIhWFaKtI-K.png)
 
-We then find that RAG performance of **Granite Instrinsics 3.0 8b v0.1** does not suffer with respect to the base model [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct). Here we evaluate the RAGBench benchmark on RAGAS faithfulness and correction metrics.
+We then find that RAG performance of **Granite 3.0 8B Instruct - Intrinsics LoRA v0.1** does not suffer with respect to the base model [ibm-granite/granite-3.0-8b-instruct](https://huggingface.co/ibm-granite/granite-3.0-8b-instruct). Here we evaluate the RAGBench benchmark on RAGAS faithfulness and correction metrics.
 
 
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/6602ffd971410cf02bf42c06/hyOlQmXPirlCYeILLBXhc.png)
 
 ## Training Details
-The **Granite Instrinsics 3.0 8b v0.1** model is a LoRA adapter finetuned to provide 3 desired intrinsic outputs - Uncertainty Quantification, Hallucination Detection, and Safety.
+The **Granite 3.0 8B Instruct - Intrinsics LoRA v0.1** model is a LoRA adapter finetuned to provide 3 desired intrinsic outputs - Uncertainty Quantification, Hallucination Detection, and Safety.
 
 
 
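The commit above renames the adapter repo id to `ibm-granite/granite-3.0-8b-lora-intrinsics-v0.1`. For orientation, a minimal sketch of loading that adapter on top of the base model with the standard `peft` API — the two repo ids come from the diff, everything else (function name, device settings) is illustrative and not the authors' code; the heavy download is kept behind a `__main__` guard:

```python
# Repo ids taken from the diff in this commit.
BASE_MODEL = "ibm-granite/granite-3.0-8b-instruct"
LORA_ADAPTER = "ibm-granite/granite-3.0-8b-lora-intrinsics-v0.1"  # new name

def load_intrinsics_model():
    """Illustrative loader: base Granite model + intrinsics LoRA adapter.

    Requires `transformers` and `peft`, and downloads several GB of
    weights, so the call is guarded behind __main__ below.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    base = AutoModelForCausalLM.from_pretrained(BASE_MODEL, device_map="auto")
    model = PeftModel.from_pretrained(base, LORA_ADAPTER)
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_intrinsics_model()
```

The old id `ibm-granite/granite-intrinsics-3.0-8b-lora-v0.1` no longer matches this README after the commit; scripts pinned to it would need the same one-line change the diff makes to `model_path`.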
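The Evaluation hunk names two metrics: percent error for the binary Hallucination Detection and Safety Exception intrinsics, and MAE for the 0–9 Uncertainty scores. A small sketch of what those metrics compute — illustrative only, not the authors' evaluation code:

```python
def percent_error(preds, labels):
    """Percentage of binary predictions that disagree with the labels."""
    assert len(preds) == len(labels) and labels
    wrong = sum(p != y for p, y in zip(preds, labels))
    return 100.0 * wrong / len(labels)

def mean_absolute_error(preds, labels):
    """MAE for integer-valued uncertainty scores in the range 0-9."""
    assert len(preds) == len(labels) and labels
    return sum(abs(p - y) for p, y in zip(preds, labels)) / len(labels)

# Binary intrinsic (e.g. hallucination flags): one mismatch in four -> 25.0
print(percent_error([1, 0, 1, 1], [1, 0, 0, 1]))
# Uncertainty scores: |7-5| + |2-2| + |9-6| over 3 samples -> 5/3
print(mean_absolute_error([7, 2, 9], [5, 2, 6]))
```

Lower is better for both, matching the README's reading of the result plots.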