Multilingual BERT base (uncased) model trained to predict Comparative Agendas Project (CAP) issue codes.
The model was trained on 120,000 assorted political documents, mostly from the Comparative Agendas Project.
Countries:
- Italy
- Sweden
- France
- Switzerland
- Poland
- Netherlands
- Germany
- Denmark
- Spain
- UK
- Austria
- Ireland
LABELS USED IN TRAINING
Model labels -> CAP labels:
{0: 1.0, 1: 2.0, 2: 3.0, 3: 4.0, 4: 5.0, 5: 6.0, 6: 7.0, 7: 8.0, 8: 9.0, 9: 10.0, 10: 12.0, 11: 13.0, 12: 14.0, 13: 15.0, 14: 16.0, 15: 17.0, 16: 18.0, 17: 19.0, 18: 20.0, 19: 23.0}
Model labels -> CAP issues:
{0: 'macroeconomics', 1: 'civil_rights', 2: 'healthcare', 3: 'agriculture', 4: 'labour', 5: 'education', 6: 'environment', 7: 'energy', 8: 'immigration', 9: 'transportation', 10: 'law_crime', 11: 'social_welfare', 12: 'housing', 13: 'domestic_commerce', 14: 'defense', 15: 'technology', 16: 'foreign_trade', 17: 'international_affairs', 18: 'government_operations', 19: 'culture'}
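The two mappings above can be combined to turn a predicted class index into a CAP code and issue name. The following is a minimal sketch, not part of the released model: the helper name `decode_label` is hypothetical, and the CAP codes are stored as integers here rather than the floats shown above. Note that the CAP codes are not contiguous (11, 21, and 22 do not appear), so the model index cannot be used as a CAP code directly.

```python
from typing import Tuple

# Model label -> CAP code, copied from the mapping above (floats written as ints).
LABEL_TO_CAP_CODE = {
    0: 1, 1: 2, 2: 3, 3: 4, 4: 5, 5: 6, 6: 7, 7: 8, 8: 9, 9: 10,
    10: 12, 11: 13, 12: 14, 13: 15, 14: 16, 15: 17, 16: 18, 17: 19,
    18: 20, 19: 23,
}

# Model label -> CAP issue name, copied from the mapping above.
LABEL_TO_CAP_ISSUE = {
    0: 'macroeconomics', 1: 'civil_rights', 2: 'healthcare', 3: 'agriculture',
    4: 'labour', 5: 'education', 6: 'environment', 7: 'energy',
    8: 'immigration', 9: 'transportation', 10: 'law_crime',
    11: 'social_welfare', 12: 'housing', 13: 'domestic_commerce',
    14: 'defense', 15: 'technology', 16: 'foreign_trade',
    17: 'international_affairs', 18: 'government_operations', 19: 'culture',
}


def decode_label(label_id: int) -> Tuple[int, str]:
    """Hypothetical helper: map a model class index to (CAP code, issue name)."""
    if label_id not in LABEL_TO_CAP_CODE:
        raise ValueError(f"unknown model label: {label_id}")
    return LABEL_TO_CAP_CODE[label_id], LABEL_TO_CAP_ISSUE[label_id]


if __name__ == "__main__":
    # Example: decode the class index a classifier returned for one document.
    code, issue = decode_label(10)
    print(code, issue)
```

In practice the index passed to `decode_label` would be the argmax over the model's 20 output logits.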
Validation
Class | Precision | Recall | F1-score | Support |
---|---|---|---|---|
0 | 0.72 | 0.83 | 0.77 | 211 |
1 | 0.82 | 0.77 | 0.79 | 242 |
2 | 0.82 | 0.86 | 0.84 | 251 |
3 | 0.92 | 0.89 | 0.90 | 228 |
4 | 0.81 | 0.85 | 0.83 | 220 |
5 | 0.90 | 0.93 | 0.91 | 244 |
6 | 0.87 | 0.87 | 0.87 | 230 |
7 | 0.92 | 0.88 | 0.90 | 251 |
8 | 0.94 | 0.90 | 0.92 | 237 |
9 | 0.87 | 0.88 | 0.87 | 263 |
10 | 0.70 | 0.88 | 0.78 | 189 |
11 | 0.90 | 0.81 | 0.85 | 248 |
12 | 0.87 | 0.90 | 0.88 | 222 |
13 | 0.76 | 0.72 | 0.74 | 255 |
14 | 0.84 | 0.84 | 0.84 | 241 |
15 | 0.92 | 0.79 | 0.85 | 276 |
16 | 0.95 | 0.90 | 0.92 | 258 |
17 | 0.71 | 0.82 | 0.76 | 200 |
18 | 0.77 | 0.73 | 0.75 | 215 |
19 | 0.92 | 0.91 | 0.92 | 239 |
Accuracy | | | 0.85 | 4720 |
Macro Avg | 0.85 | 0.85 | 0.85 | 4720 |
Weighted Avg | 0.85 | 0.85 | 0.85 | 4720 |
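The summary rows can be cross-checked against the per-class rows. The sketch below assumes the usual definitions: the macro average is the unweighted mean over classes, and accuracy equals the support-weighted mean of per-class recall. Because the table values are rounded to two decimals, the recomputed figures only approximate the reported 0.85. The function names are illustrative.

```python
# (precision, recall, f1, support) per class, copied from the table above.
ROWS = [
    (0.72, 0.83, 0.77, 211), (0.82, 0.77, 0.79, 242), (0.82, 0.86, 0.84, 251),
    (0.92, 0.89, 0.90, 228), (0.81, 0.85, 0.83, 220), (0.90, 0.93, 0.91, 244),
    (0.87, 0.87, 0.87, 230), (0.92, 0.88, 0.90, 251), (0.94, 0.90, 0.92, 237),
    (0.87, 0.88, 0.87, 263), (0.70, 0.88, 0.78, 189), (0.90, 0.81, 0.85, 248),
    (0.87, 0.90, 0.88, 222), (0.76, 0.72, 0.74, 255), (0.84, 0.84, 0.84, 241),
    (0.92, 0.79, 0.85, 276), (0.95, 0.90, 0.92, 258), (0.71, 0.82, 0.76, 200),
    (0.77, 0.73, 0.75, 215), (0.92, 0.91, 0.92, 239),
]


def total_support(rows):
    """Number of validation documents across all classes."""
    return sum(support for _, _, _, support in rows)


def macro_f1(rows):
    """Unweighted mean of the per-class F1 scores."""
    return sum(f1 for _, _, f1, _ in rows) / len(rows)


def accuracy_from_recall(rows):
    """Support-weighted mean recall, which equals overall accuracy."""
    correct = sum(recall * support for _, recall, _, support in rows)
    return correct / total_support(rows)


if __name__ == "__main__":
    print("support:", total_support(ROWS))
    print("macro F1 (approx.):", round(macro_f1(ROWS), 3))
    print("accuracy (approx.):", round(accuracy_from_recall(ROWS), 3))
```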