jbeno committed
Commit 3524153 · Parent: eef716b

Added more performance metrics to README

Files changed (1)
README.md (+107 -2)
README.md CHANGED
@@ -82,7 +82,7 @@ The code used to train the model can be found on GitHub:
 
  The research paper can be found here: [ELECTRA and GPT-4o: Cost-Effective Partners for Sentiment Analysis](https://github.com/jbeno/sentiment/research_paper.pdf)
 
- ### Performance
+ ### Performance Summary
 
  - **Merged Dataset**
  - Macro Average F1: **79.29**
@@ -97,7 +97,6 @@ The research paper can be found here: [ELECTRA and GPT-4o: Cost-Effective Partne
  - Macro Average F1: **69.95**
  - Accuracy: **78.24**
 
-
  ## Model Architecture
 
  - **Base Model**: ELECTRA base discriminator (`google/electra-base-discriminator`)
@@ -255,6 +254,112 @@ The model's configuration (config.json) includes custom parameters:
  - `dropout_rate`: Dropout rate used in the classifier.
  - `pooling`: Pooling strategy used ('mean').
 
+ ## Performance by Dataset
+
+ ### Merged Dataset
+
+ ```
+ Merged Dataset Classification Report
+
+               precision    recall  f1-score   support
+
+     negative   0.847081  0.777211  0.810643      2352
+      neutral   0.704453  0.761072  0.731669      1829
+     positive   0.828047  0.844615  0.836249      2349
+
+     accuracy                       0.796937      6530
+    macro avg   0.793194  0.794299  0.792854      6530
+ weighted avg   0.800285  0.796937  0.797734      6530
+
+ ROC AUC: 0.926344
+
+ Predicted  negative  neutral  positive
+ Actual
+ negative       1828      331       193
+ neutral         218     1392       219
+ positive        112      253      1984
+
+ Macro F1 Score: 0.79
+ ```
+
+ ### DynaSent Round 1
+
+ ```
+ DynaSent Round 1 Classification Report
+
+               precision    recall  f1-score   support
+
+     negative   0.901222  0.737500  0.811182      1200
+      neutral   0.745957  0.922500  0.824888      1200
+     positive   0.850970  0.804167  0.826907      1200
+
+     accuracy                       0.821389      3600
+    macro avg   0.832716  0.821389  0.820992      3600
+ weighted avg   0.832716  0.821389  0.820992      3600
+
+ ROC AUC: 0.945131
+
+ Predicted  negative  neutral  positive
+ Actual
+ negative        885      201       114
+ neutral          38     1107        55
+ positive         59      176       965
+
+ Macro F1 Score: 0.82
+ ```
+
+ ### DynaSent Round 2
+
+ ```
+ DynaSent Round 2 Classification Report
+
+               precision    recall  f1-score   support
+
+     negative   0.696154  0.754167  0.724000       240
+      neutral   0.770408  0.629167  0.692661       240
+     positive   0.704545  0.775000  0.738095       240
+
+     accuracy                       0.719444       720
+    macro avg   0.723702  0.719444  0.718252       720
+ weighted avg   0.723702  0.719444  0.718252       720
+
+ ROC AUC: 0.88842
+
+ Predicted  negative  neutral  positive
+ Actual
+ negative        181       26        33
+ neutral          44      151        45
+ positive         35       19       186
+
+ Macro F1 Score: 0.72
+ ```
+
+ ### Stanford Sentiment Treebank (SST-3)
+
+ ```
+ SST-3 Classification Report
+
+               precision    recall  f1-score   support
+
+     negative   0.831878  0.835526  0.833698       912
+      neutral   0.452703  0.344473  0.391241       389
+     positive   0.834669  0.916392  0.873623       909
+
+     accuracy                       0.782353      2210
+    macro avg   0.706417  0.698797  0.699521      2210
+ weighted avg   0.766284  0.782353  0.772239      2210
+
+ ROC AUC: 0.885009
+
+ Predicted  negative  neutral  positive
+ Actual
+ negative        762      104        46
+ neutral         136      134       119
+ positive         18       58       833
+
+ Macro F1 Score: 0.70
+ ```
+
  ## License
 
  This model is licensed under the MIT License.
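
Reports in the format added above (per-class precision/recall/F1, ROC AUC, a confusion matrix, and a macro F1 score) can be produced with scikit-learn and pandas. The sketch below is illustrative only and is not taken from the training repository; `y_true`, `y_pred`, and `y_prob` are placeholder names for the gold labels, predicted labels, and predicted class probabilities on a test split.

```python
# Illustrative sketch (not from the sentiment repo): metrics in the same
# layout as the classification reports above, using placeholder data.
import pandas as pd
from sklearn.metrics import classification_report, f1_score, roc_auc_score

labels = ["negative", "neutral", "positive"]

# Placeholder gold labels, predictions, and per-class probabilities
y_true = ["negative", "neutral", "positive", "positive"]
y_pred = ["negative", "neutral", "negative", "positive"]
y_prob = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.3, 0.2],
    [0.1, 0.2, 0.7],
]

# Per-class precision/recall/F1 plus accuracy, macro and weighted averages
print(classification_report(y_true, y_pred, labels=labels, digits=6))

# One-vs-rest ROC AUC over the three classes (columns of y_prob follow `labels`)
print("ROC AUC:", roc_auc_score(y_true, y_prob, multi_class="ovr", labels=labels))

# Confusion matrix with actual labels as rows and predicted labels as columns
print(pd.crosstab(pd.Series(y_true, name="Actual"),
                  pd.Series(y_pred, name="Predicted")))

print(f"Macro F1 Score: {f1_score(y_true, y_pred, average='macro'):.2f}")
```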