Multilingual BERT base (uncased) model trained to predict CAP issue codes.

The model was trained on 120,000 assorted political documents, mostly from the Comparative Agendas Project (CAP).

Countries:

LABELS USED IN TRAINING

Model labels -> CAP labels:
{0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0, 5: 6.0, 6: 7.0, 7: 8.0, 8: 9.0, 9: 10.0, 10: 12.0, 11: 13.0, 12: 14.0, 13: 15.0, 14: 16.0, 15: 17.0, 16: 18.0, 17: 19.0, 18: 20.0, 19: 23.0}
Model labels -> CAP issues:
{0: 'macroeconomics', 1: 'civil_rights', 2: 'healthcare', 3: 'agriculture', 4: 'labour', 5: 'education', 6: 'environment', 7: 'energy', 8: 'immigration', 9: 'transportation', 10: 'law_crime', 11: 'social_welfare', 12: 'housing', 13: 'domestic_commerce', 14: 'defense', 15: 'technology', 16: 'foreign_trade', 17: 'international_affairs', 18: 'government_operations', 19: 'culture'}
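
The model labels run contiguously from 0 to 19, while the CAP codes skip 11, 21, and 22, so a predicted label cannot be used as a CAP code directly. A minimal sketch of the lookup, using the two dictionaries above, might look like this. The function name `decode_prediction` and the example scores are hypothetical and not part of the model; the sketch assumes the input is one score per model label (for example, the logits for a single document).

```python
# Mappings copied verbatim from the two dictionaries above.
LABEL_TO_CAP_CODE = {
    0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0, 5: 6.0, 6: 7.0, 7: 8.0,
    8: 9.0, 9: 10.0, 10: 12.0, 11: 13.0, 12: 14.0, 13: 15.0, 14: 16.0,
    15: 17.0, 16: 18.0, 17: 19.0, 18: 20.0, 19: 23.0,
}
LABEL_TO_CAP_ISSUE = {
    0: 'macroeconomics', 1: 'civil_rights', 2: 'healthcare',
    3: 'agriculture', 4: 'labour', 5: 'education', 6: 'environment',
    7: 'energy', 8: 'immigration', 9: 'transportation', 10: 'law_crime',
    11: 'social_welfare', 12: 'housing', 13: 'domestic_commerce',
    14: 'defense', 15: 'technology', 16: 'foreign_trade',
    17: 'international_affairs', 18: 'government_operations',
    19: 'culture',
}


def decode_prediction(scores):
    """Return (model_label, cap_code, cap_issue) for one list of class scores.

    Hypothetical helper: `scores` is assumed to hold one number per model
    label, with a higher number meaning a more likely class.
    """
    if len(scores) != len(LABEL_TO_CAP_CODE):
        raise ValueError("expected one score per model label")
    # Index of the highest score is the predicted model label.
    label = max(range(len(scores)), key=lambda i: scores[i])
    return label, LABEL_TO_CAP_CODE[label], LABEL_TO_CAP_ISSUE[label]


# Made-up scores in which model label 10 is the most likely class.
example_scores = [0.0] * 20
example_scores[10] = 5.0
print(decode_prediction(example_scores))  # (10, 12.0, 'law_crime')
```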

Class	Precision	Recall	F1-score	Support
0	0.72	0.83	0.77	211
1	0.82	0.77	0.79	242
2	0.82	0.86	0.84	251
3	0.92	0.89	0.90	228
4	0.81	0.85	0.83	220
5	0.90	0.93	0.91	244
6	0.87	0.87	0.87	230
7	0.92	0.88	0.90	251
8	0.94	0.90	0.92	237
9	0.87	0.88	0.87	263
10	0.70	0.88	0.78	189
11	0.90	0.81	0.85	248
12	0.87	0.90	0.88	222
13	0.76	0.72	0.74	255
14	0.84	0.84	0.84	241
15	0.92	0.79	0.85	276
16	0.95	0.90	0.92	258
17	0.71	0.82	0.76	200
18	0.77	0.73	0.75	215
19	0.92	0.91	0.92	239
Accuracy			0.85	4720
Macro Avg	0.85	0.85	0.85	4720
Weighted Avg	0.85	0.85	0.85	4720
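
As a rough check on how the summary rows relate to the per-class rows, the sketch below recomputes the macro and support-weighted averages from the table. The helper names are hypothetical. Because the inputs are the rounded two-decimal values shown above, the results may differ from the reported 0.85 in the second decimal place.

```python
# (precision, recall, f1, support) for classes 0-19, copied from the table.
ROWS = [
    (0.72, 0.83, 0.77, 211), (0.82, 0.77, 0.79, 242),
    (0.82, 0.86, 0.84, 251), (0.92, 0.89, 0.90, 228),
    (0.81, 0.85, 0.83, 220), (0.90, 0.93, 0.91, 244),
    (0.87, 0.87, 0.87, 230), (0.92, 0.88, 0.90, 251),
    (0.94, 0.90, 0.92, 237), (0.87, 0.88, 0.87, 263),
    (0.70, 0.88, 0.78, 189), (0.90, 0.81, 0.85, 248),
    (0.87, 0.90, 0.88, 222), (0.76, 0.72, 0.74, 255),
    (0.84, 0.84, 0.84, 241), (0.92, 0.79, 0.85, 276),
    (0.95, 0.90, 0.92, 258), (0.71, 0.82, 0.76, 200),
    (0.77, 0.73, 0.75, 215), (0.92, 0.91, 0.92, 239),
]
PRECISION, RECALL, F1, SUPPORT = 0, 1, 2, 3


def macro_average(rows, column):
    """Unweighted mean of one metric column across classes."""
    return sum(row[column] for row in rows) / len(rows)


def weighted_average(rows, column):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(row[SUPPORT] for row in rows)
    return sum(row[column] * row[SUPPORT] for row in rows) / total


total_support = sum(row[SUPPORT] for row in ROWS)
print("total support:", total_support)
for name, col in (("precision", PRECISION), ("recall", RECALL), ("f1", F1)):
    print(name,
          "macro:", round(macro_average(ROWS, col), 3),
          "weighted:", round(weighted_average(ROWS, col), 3))
```

Support-weighted recall is the same quantity as overall accuracy, which is why the Accuracy row and the weighted recall agree.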