AIEngineerYvar committed on
Commit
08fd18e
·
verified ·
1 Parent(s): 05550a6

Update README.md

  1. README.md +124 -3
README.md CHANGED
@@ -1,3 +1,124 @@
1
- ---
2
- license: mit
3
- ---
1
+ ---
2
+ license: mit
3
+ language:
4
+ - en
5
+ metrics:
6
+ - precision
7
+ - recall
8
+ - accuracy
9
+ pipeline_tag: tabular-classification
10
+ library_name: sklearn
11
+ tags:
12
+ - healthcare
13
+ - science
14
+ ---
15
+
16
+
17
+ ### Model Description
18
+
19
+ <!-- Provide a longer summary of what this model is. -->
20
+ This model predicts, from 13 input features, whether a person has or is at risk of developing heart disease.
21
+ It is designed for use within form-based applications, i.e. software applications that require
22
+ user input.
23
+
24
+ NOTE: This model is meant as an assistive tool and must NOT be used directly to produce the final verdict on a person's or patient's condition.
25
+ Its predictions are meant to prompt further evaluation.
26
+
27
+
28
+ - **Developed by:** DeepNeural
29
+ - **Model type:** Tabular Classifier
30
+ - **Language(s):** English
31
+ - **License:** MIT
32
+
33
+ ### Model Inputs
34
+ | Variable Name | Type | Description & Input Value |
35
+ |---------------|------|---------------------------|
36
+ | age | Integer | Patient's age |
37
+ | sex | Binary | Patient's sex (1 = male, 0 = female) |
38
+ | chest pain type | Integer | 1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic |
39
+ | resting blood pressure | Integer | Resting blood pressure (in mm Hg on admission to the hospital) |
40
+ | serum cholesterol | Integer | Serum cholesterol in mg/dl |
41
+ | fasting blood sugar > 120 mg/dl | Binary | Is the patient's fasting blood sugar level greater than 120 mg/dl? |
42
+ | resting electrocardiographic results | Integer | 0 = normal, 1 = ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), 2 = probable or definite left ventricular hypertrophy by Estes' criteria |
43
+ | maximum heart rate achieved | Integer | |
44
+ | exercise induced angina | Binary | Does the patient suffer from exercise-induced angina? |
45
+ | oldpeak | Integer | ST depression induced by exercise relative to rest |
46
+ | the slope of the peak exercise ST segment | Integer | 1 = upsloping, 2 = flat, 3 = downsloping |
47
+ | number of major vessels colored by fluoroscopy | Integer | Value from 0 to 3 |
48
+ | thal | Integer | 0 = normal, 1 = fixed defect, 2 = reversible defect |
49
+
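A form-based application will usually want to check submitted values before passing them to the model. The following is a minimal sketch of such a check for the categorical inputs, using the value ranges from the table above. The short field names (`cp`, `fbs`, and so on) and the helper `find_invalid_fields` are hypothetical names chosen for this illustration, not part of the model.

```python
# Allowed values for the categorical inputs, copied from the table above.
# The short keys (cp, fbs, ...) are hypothetical names chosen for this sketch.
ALLOWED_VALUES = {
    "sex": {0, 1},
    "cp": {1, 2, 3, 4},      # chest pain type
    "fbs": {0, 1},           # fasting blood sugar > 120 mg/dl
    "restecg": {0, 1, 2},    # resting electrocardiographic results
    "exang": {0, 1},         # exercise induced angina
    "slope": {1, 2, 3},      # slope of the peak exercise ST segment
    "ca": {0, 1, 2, 3},      # major vessels colored by fluoroscopy
    "thal": {0, 1, 2},
}


def find_invalid_fields(form):
    """Return the categorical fields whose value is missing or not allowed."""
    return [name for name, allowed in ALLOWED_VALUES.items()
            if form.get(name) not in allowed]


# Illustrative values only, not taken from a real patient.
form = {"sex": 1, "cp": 3, "fbs": 0, "restecg": 1,
        "exang": 0, "slope": 2, "ca": 0, "thal": 2}
print(find_invalid_fields(form))                 # → []
print(find_invalid_fields({**form, "cp": 7}))    # → ['cp']
```

Numeric inputs such as age or resting blood pressure are not range-checked here, since the table does not state bounds for them.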
50
+ ### Model Sources
51
+
52
+ <!-- Provide the basic links for the model. -->
53
+
54
+ - **Repository:** https://www.kaggle.com/datasets/johnsmith88/heart-disease-dataset
55
+
56
+ ## Uses
57
+
58
+ <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
59
+ This model is primarily designed for Data Scientists, Software Engineers and Machine Learning Engineers who are interested in developing heart disease
60
+ software applications for healthcare institutions, ranging from hospitals to clinics. This model is also designed for educational
61
+ purposes within academia, where heart disease risk analysis is a priority of the study.
62
+
63
+ Foreseeable users of software applications developed with this model include doctors and nurses, who would use them with respect to their patients.
64
+
65
+
66
+ ## Bias, Risks, and Limitations
67
+
68
+ <!-- This section is meant to convey both technical and sociotechnical limitations. -->
69
+ Please be advised that our model was trained on a specific dataset for heart disease classification,
70
+ and although it has a high level of accuracy and precision, misclassifications may still occur.
71
+
72
+ ### Recommendations
73
+
74
+ <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
75
+
76
+ Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More research is needed for further recommendations.
77
+ Furthermore, this model will continuously undergo improvements and testing to address the limitations mentioned in the previous
78
+ section.
79
+
80
+ ## How to Get Started with the Model
81
+ To make use of this model, please refer to the steps below, which
82
+ show how the model can be loaded directly into an application. Please note that,
83
+ because it was built with the Scikit-Learn Machine Learning library, the model has been saved
84
+ as a .joblib file. With that in mind, please proceed by copying the following code into your Python environment.
85
+
86
+ 1. Install Joblib and Scikit-Learn
87
+ ```python
88
+ !pip install joblib scikit-learn  # notebook syntax; drop the leading "!" in a terminal
89
+
90
+ ```
91
+
92
+ 2. Load the model after installation
93
+ ```python
94
+ import joblib
95
+ my_model = joblib.load('heart_disease_classifier_model_v1.joblib')
96
+ ```
97
+
98
+ 3. Make predictions (binary or probability)
99
+ ```python
100
+ my_model.predict(X_test)  # X_test: your 2-D input data with the 13 features
101
+
102
+ # For probability-based outputs
103
+
104
+ my_model.predict_proba(X_test)
105
+ ```
106
+
107
+ NOTE: This model requires input data in a 2-dimensional format (a Pandas DataFrame) with the column names,
108
+ since the model is intended for use in form-based applications.
109
+
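As an illustration of the 2-dimensional input requirement, the sketch below turns one submitted form into a single-row table with 13 named columns. The column names are an assumption: they are borrowed from the linked Kaggle dataset, and the saved model may expect different names or a different order. The helper `form_to_frame` is likewise hypothetical.

```python
import pandas as pd

# Hypothetical column names, borrowed from the linked Kaggle dataset.
# The saved model may expect different names or a different order.
FEATURE_COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
                   "thalach", "exang", "oldpeak", "slope", "ca", "thal"]


def form_to_frame(form):
    """Turn one submitted form (a dict) into a 1-row, 13-column DataFrame."""
    missing = [c for c in FEATURE_COLUMNS if c not in form]
    if missing:
        raise ValueError(f"missing inputs: {missing}")
    # Passing columns= fixes the column order regardless of the dict's order.
    return pd.DataFrame([form], columns=FEATURE_COLUMNS)


# Illustrative values only, not taken from a real patient.
example = {"age": 54, "sex": 1, "cp": 3, "trestbps": 130, "chol": 240,
           "fbs": 0, "restecg": 1, "thalach": 150, "exang": 0,
           "oldpeak": 1, "slope": 2, "ca": 0, "thal": 2}
X_new = form_to_frame(example)
print(X_new.shape)  # → (1, 13)
```

With the model loaded as in step 2, `X_new` could then be passed to `my_model.predict(X_new)` or `my_model.predict_proba(X_new)`, provided the column names match those used in training.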
110
+
111
+ #### Metrics
112
+
113
+ <!-- These are the evaluation metrics being used, ideally with a description of why. -->
114
+ We evaluated several ML models, namely Logistic Regression, Stochastic Gradient Descent, Support Vector Machines, and
115
+ K-Nearest Neighbors. After hyperparameter tuning, we chose the K-Nearest Neighbors model for predictive purposes
116
+ as it showed the best results. The metrics used were accuracy, precision, recall, F1-score and AUC.
117
+ The results for our model can be seen in the 'Results' section.
118
+
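For readers who want to compute the same kinds of metrics on their own held-out data, here is a minimal sketch using `sklearn.metrics`. It is not the authors' evaluation script; the helper `evaluate` and the toy labels are assumptions made purely for illustration.

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)


def evaluate(y_true, y_pred, y_score):
    """Compute the five metrics named above for a binary classifier.

    y_pred holds predicted class labels; y_score holds the predicted
    probability of the positive class (needed for ROC AUC).
    """
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "recall": recall_score(y_true, y_pred),
        "f1": f1_score(y_true, y_pred),
        "roc_auc": roc_auc_score(y_true, y_score),
    }


# Toy labels and scores, invented for this example.
y_true = [0, 0, 1, 1]
y_pred = [0, 1, 1, 1]
y_score = [0.1, 0.6, 0.7, 0.9]
scores = evaluate(y_true, y_pred, y_score)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

With a loaded model, `y_pred` would typically come from `my_model.predict(X_test)` and `y_score` from `my_model.predict_proba(X_test)[:, 1]`.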
119
+ ### Results
120
+
121
+ - Accuracy: 94%
122
+ - Precision: 94%
123
+ - Recall: 94%
124
+ - AUC ROC: 94%