Jlonge4 committed · verified
Commit 56ccee8 · 1 Parent(s): e044658

Update README.md

Files changed (1)
  1. README.md +154 -3
README.md CHANGED
@@ -12,12 +12,163 @@ language:
  - en
  ---

- # Uploaded model
+ # Reasoning Model for Content Evaluation

  - **Developed by:** Jlonge4
  - **License:** apache-2.0
- - **Finetuned from model :** unsloth/phi-4-bnb-4bit
+ - **Finetuned from model:** unsloth/phi-4-bnb-4bit

- This llama model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library.
+ This Phi-4 model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Huggingface's TRL library. It specializes in three key reasoning tasks:
+
+ 1. **Toxicity Detection**: Evaluates whether text content contains toxic elements such as hate speech, harassment, or inappropriate language.
+ 2. **Hallucination Detection**: Determines if an AI model's response contains factual inaccuracies or fabricated information when compared to a reference.
+ 3. **RAG Relevance Assessment**: Analyzes whether retrieved context is relevant to answering a specific query.
+
+ ## Model Capabilities
+
+ The model provides structured reasoning and binary classification for each task:
+ - Toxicity: `toxic` vs `non-toxic`
+ - Hallucination: `yes` vs `no`
+ - RAG Relevance: `relevant` vs `unrelated`
+
+ All outputs follow a consistent format:
+
+ ```markdown
+ <reasoning>
+ Detailed explanation of the decision
+ </reasoning>
+ <answer>
+ Classification result
+ </answer>
+ ```
+
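+ For downstream use, here is a minimal sketch of how the tagged output could be parsed; the `parse_output` helper and its regexes are illustrative assumptions, not shipped with the model:
+
+ ```python
+ import re
+
+ def parse_output(text: str) -> dict:
+     """Extract the <reasoning> and <answer> blocks from a model response."""
+     reasoning = re.search(r"<reasoning>(.*?)</reasoning>", text, re.DOTALL)
+     answer = re.search(r"<answer>(.*?)</answer>", text, re.DOTALL)
+     return {
+         "reasoning": reasoning.group(1).strip() if reasoning else None,
+         "answer": answer.group(1).strip() if answer else None,
+     }
+ ```
+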
+ ## Functions:
+ ### Toxicity
+ ```python
+ def toxic_format_func(text):
+     system = """Your job is to evaluate whether a human text is toxic or non-toxic.
+ Respond in the following format:
+ <reasoning>
+ ...
+ </reasoning>
+ <answer>
+ ...
+ </answer>
+ """
+     prompt = f"""You are examining written text content. Here is the text:
+ <info>
+ [Text]: {text}
+ </info>
+ Examine the text and determine whether the text is toxic or not. Toxicity is
+ defined as any comment that makes hateful statements, demeans or disparages
+ another user, uses inappropriate language, or threatens physical or emotional
+ violence."""
+     return system, prompt
+ ```
+ ### Hallucinations
+ ```python
+ def halu_format_func(reference, query, response):
+     system = """Your job is to evaluate whether a machine learning model has hallucinated or not. A hallucination occurs when the response is coherent but factually incorrect or nonsensical outputs that are not grounded in the provided context.
+ Respond in the following format:
+ <reasoning>
+ ...
+ </reasoning>
+ <answer>
+ ...
+ </answer>
+ """
+     prompt = f"""You are given the following information:
+ <info>
+ [Knowledge]: {reference}
+ [User Input]: {query}
+ [Model Response]: {response}
+ </info>
+ Based on the information provided is the model output a hallucination?"""
+     return system, prompt
+ ```
+
+ ### RAG Relevance
+ ```python
+ def rag_format_func(reference, query):
+     system = """Your job is to evaluate whether a retrieved context is relevant, or unrelated to a user query.
+ Respond in the following format:
+ <reasoning>
+ ...
+ </reasoning>
+ <answer>
+ ...
+ </answer>
+ """
+     prompt = f"""You are comparing a reference text to a question and trying to determine if the reference text
+ contains information relevant to answering the question. Here is the info:
+ <info>
+ [Question]: {query}
+ [Reference text]: {reference}
+ </info>
+ Compare the Question above to the Reference text. Your response must be single word,
+ either "relevant" or "unrelated"."""
+     return system, prompt
+ ```
+
+ ## Usage:
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+ model = AutoModelForCausalLM.from_pretrained("grounded-ai/phi4-r1-guard")
+ tokenizer = AutoTokenizer.from_pretrained("grounded-ai/phi4-r1-guard")
+ ```
+
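+ Depending on your hardware, you may also want to pick a dtype and place the model on GPU. The arguments below are standard `transformers` options, shown as a suggestion rather than a requirement of this checkpoint:
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # device_map="auto" requires the `accelerate` package
+ model = AutoModelForCausalLM.from_pretrained(
+     "grounded-ai/phi4-r1-guard",
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
+ tokenizer = AutoTokenizer.from_pretrained("grounded-ai/phi4-r1-guard")
+ ```
+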
120
+ ### Toxicity Detection Example:
121
+ ```python
122
+ text_to_evaluate = "This is some text to evaluate"
123
+ system, prompt = toxic_format_func(text_to_evaluate)
124
+ inputs = tokenizer(prompt, return_tensors="pt")
125
+ output = model.generate(inputs)
126
+ result = tokenizer.decode(output[0])
127
+ ```
128
+
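+ The example above drops the `system` string. If the tokenizer carries the base model's chat template (typical for Phi-4 fine-tunes), a fuller flow would look roughly like the sketch below; the `run_guard` helper and its generation settings are illustrative assumptions, not part of the model card:
+
+ ```python
+ def run_guard(system: str, prompt: str, max_new_tokens: int = 512) -> str:
+     """Format system + user messages, generate, and return only the new text."""
+     messages = [
+         {"role": "system", "content": system},
+         {"role": "user", "content": prompt},
+     ]
+     input_ids = tokenizer.apply_chat_template(
+         messages, add_generation_prompt=True, return_tensors="pt"
+     ).to(model.device)
+     output = model.generate(input_ids, max_new_tokens=max_new_tokens)
+     # Slice off the prompt tokens so only the model's reply is decoded
+     return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)
+
+ result = run_guard(system, prompt)
+ answer = parse_output(result)["answer"]  # parse_output: the sketch from the format section above
+ ```
+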
+ ### Hallucination Detection Example:
+ ```python
+ reference = "The Eiffel Tower was completed in 1889."
+ query = "When was the Eiffel Tower built?"
+ response = "The Eiffel Tower was completed in 1925."
+ system, prompt = halu_format_func(reference, query, response)
+ ```
+
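+ Generation then follows the same pattern as the toxicity example; with the illustrative `run_guard` sketch above, a response that contradicts the reference should yield a "yes" verdict:
+
+ ```python
+ result = run_guard(system, prompt)
+ print(parse_output(result)["answer"])  # expected: a "yes" (hallucination) classification
+ ```
+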
+ ### RAG Relevance Example:
+ ```python
+ reference = "The process of photosynthesis in plants..."
+ query = "How does photosynthesis work?"
+ system, prompt = rag_format_func(reference, query)
+ ```
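+
+ Running this pair through the same `run_guard` sketch should return the single-word answer "relevant":
+
+ ```python
+ print(parse_output(run_guard(system, prompt))["answer"])  # expected: "relevant"
+ ```
+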
+ ## Sample Output:
+ ```markdown
+ HALLUCINATION - YES CASE:
+ System: Your job is to evaluate whether a machine learning model has hallucinated or not. A hallucination occurs when the response is coherent but factually incorrect or nonsensical outputs that are not grounded in the provided context.
+
+ Respond in the following format:
+ <reasoning>
+ ...
+ </reasoning>
+ <answer>
+ ...
+ </answer>
+
+ Prompt: You are given the following information:
+ <info>
+ [Knowledge]: The Eiffel Tower was completed in 1889 and stands 324 meters tall. It was built for the World's Fair in Paris.
+ [User Input]: When was the Eiffel Tower built and how tall is it?
+ [Model Response]: The Eiffel Tower was completed in 1925 and stands 450 meters tall. It was built to celebrate France's victory in World War I.
+ </info>
+ Based on the information provided is the model output a hallucination?
+
+ ##############################################################################
+ Result: <reasoning>
+ The model's response contains several factual inaccuracies when compared to the provided knowledge. According to the information given, the Eiffel Tower was completed in 1889, not 1925, and it stands 324 meters tall, not 450 meters. Additionally, the Eiffel Tower was built for the World's Fair in Paris, not to celebrate France's victory in World War I. These discrepancies indicate that the model's response is not grounded in the provided context and includes factually incorrect information. Therefore, the model's output can be classified as a hallucination.
+ </reasoning>
+ <answer>
+ Yes, the model output is a hallucination.
+ </answer>
+ ```

  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)