am5uc committed · verified
Commit 79f4083 · Parent(s): 42b6922

Update README.md

Files changed (1): README.md (+4 -0)
README.md CHANGED
@@ -14,6 +14,10 @@ ServiceNow is a platform that helps businesses automate their processes and work
For this project, the training data was structured around ServiceNow ITSM tables, specifically the Incident, Change, and Problem tables. I used a subset of fields from each; for example, the Problem table has a problem ID, priority, status, root cause, and resolved-at field. Since I can't use official data from in-use ServiceNow instances, which contain private information, I generated a synthetic dataset with custom code. I then structured that data in SQA format, which is the format best suited to the model I was using, TAPAS. For this, I saved each table in its own CSV file. The final refined dataset contains an id, a question, a table_file, answer_coordinates if the answer is in the table itself, the actual answer, and a float_answer if the answer is a numeric value not in the data, such as a count. I also have an aggregation_label field, which I set right before the training process but after the train/validation/test split. I used the method train_test_split() to obtain the training, validation, and test data, with a fixed seed of 42, as shown in the snippet at the end of this section:
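
To make the record format concrete first, here is a rough sketch of how a synthetic Problem table and two SQA-style records might be put together. This is illustrative only, not the project's actual generation code: the column names follow the description above, while the values, questions, file name, and coordinates are invented.

```python
import pandas as pd

# 1) Synthesize a tiny Problem table and save it as its own CSV,
#    since each table is handed to TAPAS from a file.
problem_table = pd.DataFrame({
    "problem_id": ["PRB0001", "PRB0002", "PRB0003"],
    "priority": ["High", "Medium", "Low"],
    "status": ["Open", "Resolved", "Open"],
    "root_cause": ["Disk failure", "Config drift", "Memory leak"],
    "resolved_at": [None, "2024-03-01 10:15:00", None],
})
problem_table.to_csv("problem.csv", index=False)

# 2) One SQA-style record per question. answer_coordinates is set only when
#    the answer is a cell in the table; float_answer is set when the answer
#    is a numeric value not present in the data, such as a count.
sqa_rows = [
    {
        "id": "q-0001",
        "question": "What is the root cause of PRB0001?",
        "table_file": "problem.csv",
        "answer_coordinates": [(0, 3)],  # (row, column) of the answering cell
        "answer": "Disk failure",
        "float_answer": None,
    },
    {
        "id": "q-0002",
        "question": "How many problems are open?",
        "table_file": "problem.csv",
        "answer_coordinates": None,      # computed answer, no source cell
        "answer": "2",
        "float_answer": 2.0,
    },
]
data = pd.DataFrame(sqa_rows)
```

A frame like `data` is what the train_test_split() call below operates on.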
 
+ Example CSV with training data:
+
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/67885e8302ab11c0b0ed0853/-8piWOY40wzTk3qU1tmRS.png)
+
```python
from sklearn.model_selection import train_test_split

# Hold out 10% as the test set; random_state=42 makes the split reproducible
train_val_data, test_data = train_test_split(data, test_size=0.1, random_state=42)
# Then split train+validation into train and validation