Bharatdeep-H committed on
Commit
9f1553b
·
verified ·
1 Parent(s): 053a558

Update README.md

Files changed (1)
  1. README.md +26 -6
README.md CHANGED
@@ -5,7 +5,14 @@
5
  This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) optimized for function calling. It was trained using GRPO (Group Relative Policy Optimization) on the [NousResearch/hermes-function-calling-v1](https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1) dataset, specifically the `func_calling_singleturn` subset.
6
 
7
  ## Intended Uses
8
- [YET TO FILL]
 
 
 
 
 
 
 
9
 
10
  ## Training Details
11
 
@@ -18,10 +25,19 @@ This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggi
18
  ## Performance and Limitations
19
 
20
  ### Strengths
21
- [YET TO FILL]
 
 
 
 
 
22
 
23
  ### Limitations
24
- [YET TO FILL]
 
 
 
 
25
 
26
  ## Usage
27
 
@@ -67,11 +83,11 @@ Respond in the following format:
67
  """
68
 
69
  SYSTEM_MIX_USER_PROMPT = "You are a function calling AI model. You are provided with function signatures within <tools> </tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions.\n\n<tools>[[{'type': 'function', 'function': {'name': 'book_appointment', 'description': 'Books an appointment for a patient with a specific dentist at a given date and time.', 'parameters': {'type': 'object', 'properties': {'patient_id': {'type': 'string', 'description': 'The unique identifier for the patient.'}, 'dentist_id': {'type': 'string', 'description': 'The unique identifier for the dentist.'}, 'preferred_date': {'type': 'string', 'description': 'The preferred date for the appointment.'}, 'time_slot': {'type': 'string', 'description': 'The preferred time slot for the appointment.'}}, 'required': ['patient_id', 'dentist_id', 'preferred_date', 'time_slot']}}}, {'type': 'function', 'function': {'name': 'reschedule_appointment', 'description': 'Reschedules an existing appointment to a new date and time.', 'parameters': {'type': 'object', 'properties': {'appointment_id': {'type': 'string', 'description': 'The unique identifier for the existing appointment.'}, 'new_date': {'type': 'string', 'description': 'The new date for the rescheduled appointment.'}, 'new_time_slot': {'type': 'string', 'description': 'The new time slot for the rescheduled appointment.'}}, 'required': ['appointment_id', 'new_date', 'new_time_slot']}}}, {'type': 'function', 'function': {'name': 'cancel_appointment', 'description': 'Cancels an existing appointment.', 'parameters': {'type': 'object', 'properties': {'appointment_id': {'type': 'string', 'description': 'The unique identifier for the appointment to be canceled.'}}, 'required': ['appointment_id']}}}, {'type': 'function', 'function': {'name': 'find_available_time_slots', 'description': 'Finds available time slots for a dentist on a given date.', 'parameters': {'type': 'object', 'properties': {'dentist_id': {'type': 'string', 'description': 'The unique identifier for the dentist.'}, 'date': {'type': 'string', 'description': 'The date to check for available time slots.'}}, 'required': ['dentist_id', 'date']}}}, {'type': 'function', 'function': {'name': 'send_appointment_reminder', 'description': 'Sends an automated reminder to the patient for an upcoming appointment.', 'parameters': {'type': 'object', 'properties': {'appointment_id': {'type': 'string', 'description': 'The unique identifier for the appointment.'}, 'reminder_time': {'type': 'string', 'description': 'The time before the appointment when the reminder should be sent.'}}, 'required': ['appointment_id', 'reminder_time']}}}]]</tools>\n\nFor each user query, you must:\n\n1. First, generate your reasoning within <chain_of_thought> </chain_of_thought> tags. This should explain your analysis of the user's request and how you determined which function(s) to call, or why no appropriate function is available.\n\n2. Then, call the appropriate function(s) by returning a JSON object within <tool_call> </tool_call> tags using the following schema:\n<tool_call>\n{'arguments': <args-dict>, 'name': <function-name>}\n</tool_call>\n\n3. If you determine that none of the provided tools can appropriately resolve the user's query based on the tools' descriptions, you must still provide your reasoning in <chain_of_thought> tags, followed by:\n<tool_call>NO_CALL_AVAILABLE</tool_call>\n\nRemember that your <chain_of_thought> analysis must ALWAYS precede any <tool_call> tags, regardless of whether a suitable function is available."
70
- COMPLETE_SYSTEM_PROMPT = "As the manager of a dental practice, I'm looking to streamline our booking process. I need to schedule an appointment for our patient, John Doe with ID 'p123', with Dr. Sarah Smith, whose dentist ID is 'd456'. Please book this appointment for May 15, 2023, at 2:00 PM. Additionally, I would like to set up an automated reminder for John Doe to ensure he remembers his appointment. Can you book this appointment and arrange for the reminder to be sent out in advance?"
71
 
72
  text = tokenizer.apply_chat_template([
73
  {'role': 'system', 'content': FORMAT_PROMPT},
74
- {'role': 'user', 'content': SYSTEM_MIX_USER_PROMPT + "\n\nUSER QUERY: " + COMPLETE_SYSTEM_PROMPT}
75
  ], tokenize = False, add_generation_prompt = True)
76
 
77
  sampling_params = SamplingParams(
@@ -88,4 +104,8 @@ output = model.fast_generate(
88
  )[0].outputs[0].text
89
 
90
  print(output)
91
- ```
 
 
 
 
 
5
  This model is a fine-tuned version of [Qwen/Qwen2.5-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-1.5B-Instruct) optimized for function calling. It was trained using GRPO (Group Relative Policy Optimization) on the [NousResearch/hermes-function-calling-v1](https://huggingface.co/datasets/NousResearch/hermes-function-calling-v1) dataset, specifically the `func_calling_singleturn` subset.
6
 
7
  ## Intended Uses
8
+
9
+ This model is designed for:
10
+ - Small agentic setups where an agent needs low latency and good accuracy on medium-difficulty tasks
11
+ - Basic chatbots that need to scale horizontally with minimal vertical scaling
12
+ - Parsing user requests and identifying when to call specific functions
13
+ - Generating accurate function call schemas based on user inputs
14
+ - Supporting tool use in conversational AI applications
15
+ - Enabling structured data extraction from natural language
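
For the function-calling uses listed above, an application would typically check a generated call against the tool schema before executing it. The following is a minimal sketch, assuming the call has already been parsed into a dict with `name` and `arguments` keys (the schema shown in the Usage prompt). `missing_required` and `TOOLS` are hypothetical names introduced for illustration; they are not part of the model or any library.

```python
# Hypothetical tool list in the same shape as the <tools> block in the Usage prompt.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "cancel_appointment",
            "description": "Cancels an existing appointment.",
            "parameters": {
                "type": "object",
                "properties": {"appointment_id": {"type": "string"}},
                "required": ["appointment_id"],
            },
        },
    }
]


def missing_required(call: dict, tools: list) -> list:
    """Return required parameter names absent from a tool call's arguments.

    Raises KeyError when the call names a function that is not in `tools`.
    """
    # Index the schemas by function name for direct lookup.
    schemas = {tool["function"]["name"]: tool["function"] for tool in tools}
    schema = schemas[call["name"]]
    required = schema["parameters"].get("required", [])
    args = call.get("arguments", {})
    return [name for name in required if name not in args]
```

A call such as `{"name": "cancel_appointment", "arguments": {}}` would be reported as missing `appointment_id`, so the application can ask the user for it instead of executing an incomplete call.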
16
 
17
  ## Training Details
18
 
 
25
  ## Performance and Limitations
26
 
27
  ### Strengths
28
+ - Reliable format following: output stays well-formed even when generating multiple tool calls, because GRPO was used mainly to improve format adherence rather than accuracy
29
+ - The chain-of-thought (CoT) output shows the model's current alignment; further DPO on top of this GRPO model could improve accuracy significantly
30
+ - At 1.5B parameters, the model is small enough to run on good CPU hardware at a decent speed
31
+ - Efficiently handles function calling with minimal computational resources
32
+ - Maintains the conversational capabilities of the base Qwen2.5-1.5B-Instruct model
33
+ - 4-bit quantization enables deployment in resource-constrained environments
34
 
35
  ### Limitations
36
+ - Beyond 5000 input tokens the model starts regressing. DPO or ORPO on specific cases can improve this; until then, keep tool descriptions brief if you want to scale horizontally
37
+ - CoT reasoning traces can become very lengthy at times. The model can take an instruction header for the CoT to (1) reduce the length of the reasoning traces and (2) improve accuracy by specifying what should go inside the CoT tags (currently I rely solely on the model's own CoT capability)
38
+ - Performance may vary compared to larger function calling models
39
+ - 1.5B parameter size inherently limits complexity of reasoning compared to larger models
40
+ - May struggle with highly complex or multi-step function calling scenarios
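
Given the 5000-input-token limitation above, it may help to check prompt length before generation. This is a minimal sketch, not part of the model's tooling: `within_budget` and `whitespace_encode` are hypothetical helpers, and the whitespace encoder is only a stand-in so the example runs without downloading a tokenizer. In real use one would pass the model tokenizer's `encode` method instead.

```python
from typing import Callable, Sequence

MAX_INPUT_TOKENS = 5000  # threshold quoted in the limitation above


def within_budget(
    prompt: str,
    encode: Callable[[str], Sequence[int]],
    limit: int = MAX_INPUT_TOKENS,
) -> bool:
    """Return True when the encoded prompt has at most `limit` tokens."""
    return len(encode(prompt)) <= limit


def whitespace_encode(text: str) -> list:
    # Stand-in for a real tokenizer (assumption): one fake token id per word.
    return list(range(len(text.split())))
```

With a Hugging Face tokenizer loaded as `tokenizer`, the check would be `within_budget(text, tokenizer.encode)`. If it fails, shortening the tool descriptions is the first thing to try.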
41
 
42
  ## Usage
43
 
 
83
  """
84
 
85
  SYSTEM_MIX_USER_PROMPT = "You are a function calling AI model. You are provided with function signatures within <tools> </tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions.\n\n<tools>[[{'type': 'function', 'function': {'name': 'book_appointment', 'description': 'Books an appointment for a patient with a specific dentist at a given date and time.', 'parameters': {'type': 'object', 'properties': {'patient_id': {'type': 'string', 'description': 'The unique identifier for the patient.'}, 'dentist_id': {'type': 'string', 'description': 'The unique identifier for the dentist.'}, 'preferred_date': {'type': 'string', 'description': 'The preferred date for the appointment.'}, 'time_slot': {'type': 'string', 'description': 'The preferred time slot for the appointment.'}}, 'required': ['patient_id', 'dentist_id', 'preferred_date', 'time_slot']}}}, {'type': 'function', 'function': {'name': 'reschedule_appointment', 'description': 'Reschedules an existing appointment to a new date and time.', 'parameters': {'type': 'object', 'properties': {'appointment_id': {'type': 'string', 'description': 'The unique identifier for the existing appointment.'}, 'new_date': {'type': 'string', 'description': 'The new date for the rescheduled appointment.'}, 'new_time_slot': {'type': 'string', 'description': 'The new time slot for the rescheduled appointment.'}}, 'required': ['appointment_id', 'new_date', 'new_time_slot']}}}, {'type': 'function', 'function': {'name': 'cancel_appointment', 'description': 'Cancels an existing appointment.', 'parameters': {'type': 'object', 'properties': {'appointment_id': {'type': 'string', 'description': 'The unique identifier for the appointment to be canceled.'}}, 'required': ['appointment_id']}}}, {'type': 'function', 'function': {'name': 'find_available_time_slots', 'description': 'Finds available time slots for a dentist on a given date.', 'parameters': {'type': 'object', 'properties': {'dentist_id': {'type': 'string', 'description': 'The unique identifier for the dentist.'}, 'date': {'type': 'string', 'description': 'The date to check for available time slots.'}}, 'required': ['dentist_id', 'date']}}}, {'type': 'function', 'function': {'name': 'send_appointment_reminder', 'description': 'Sends an automated reminder to the patient for an upcoming appointment.', 'parameters': {'type': 'object', 'properties': {'appointment_id': {'type': 'string', 'description': 'The unique identifier for the appointment.'}, 'reminder_time': {'type': 'string', 'description': 'The time before the appointment when the reminder should be sent.'}}, 'required': ['appointment_id', 'reminder_time']}}}]]</tools>\n\nFor each user query, you must:\n\n1. First, generate your reasoning within <chain_of_thought> </chain_of_thought> tags. This should explain your analysis of the user's request and how you determined which function(s) to call, or why no appropriate function is available.\n\n2. Then, call the appropriate function(s) by returning a JSON object within <tool_call> </tool_call> tags using the following schema:\n<tool_call>\n{'arguments': <args-dict>, 'name': <function-name>}\n</tool_call>\n\n3. If you determine that none of the provided tools can appropriately resolve the user's query based on the tools' descriptions, you must still provide your reasoning in <chain_of_thought> tags, followed by:\n<tool_call>NO_CALL_AVAILABLE</tool_call>\n\nRemember that your <chain_of_thought> analysis must ALWAYS precede any <tool_call> tags, regardless of whether a suitable function is available."
86
+ USER_QUERY = "As the manager of a dental practice, I'm looking to streamline our booking process. I need to schedule an appointment for our patient, John Doe with ID 'p123', with Dr. Sarah Smith, whose dentist ID is 'd456'. Please book this appointment for May 15, 2023, at 2:00 PM. Additionally, I would like to set up an automated reminder for John Doe to ensure he remembers his appointment. Can you book this appointment and arrange for the reminder to be sent out in advance?"
87
 
88
  text = tokenizer.apply_chat_template([
89
  {'role': 'system', 'content': FORMAT_PROMPT},
90
+ {'role': 'user', 'content': SYSTEM_MIX_USER_PROMPT + "\n\nUSER QUERY: " + USER_QUERY}
91
  ], tokenize = False, add_generation_prompt = True)
92
 
93
  sampling_params = SamplingParams(
 
104
  )[0].outputs[0].text
105
 
106
  print(output)
107
+ ```
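
The printed `output` follows the tag format requested in the prompt, so it can be split into reasoning and tool calls afterwards. The sketch below is one possible parser, not the author's: `parse_output` is a hypothetical helper. It assumes each `<tool_call>` body is either JSON or a Python-style dict literal (the prompt's schema uses single quotes), and it skips the `NO_CALL_AVAILABLE` sentinel.

```python
import ast
import json
import re

COT_RE = re.compile(r"<chain_of_thought>(.*?)</chain_of_thought>", re.DOTALL)
CALL_RE = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def parse_output(output: str):
    """Split a completion into (reasoning, list of tool-call dicts)."""
    cot = COT_RE.search(output)
    reasoning = cot.group(1).strip() if cot else None

    calls = []
    for raw in CALL_RE.findall(output):
        raw = raw.strip()
        if raw == "NO_CALL_AVAILABLE":
            continue  # the model decided that no tool applies
        try:
            call = json.loads(raw)  # strict JSON with double quotes
        except json.JSONDecodeError:
            try:
                call = ast.literal_eval(raw)  # single-quoted dict literal
            except (ValueError, SyntaxError):
                continue  # malformed call: drop it rather than guess
        if isinstance(call, dict) and "name" in call:
            calls.append(call)
    return reasoning, calls
```

Calling `reasoning, calls = parse_output(output)` gives the CoT text and a list of dicts with `name` and `arguments` keys. Using `ast.literal_eval` rather than `eval` keeps the parsing limited to literals, which matters because the text comes from a model.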
108
+
109
+ ## Citation
110
+
111
+ If you intend to use this model for testing, hit me up!