Update README.md
README.md
@@ -3,84 +3,247 @@ library_name: transformers
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- actuarial
- insurance
- multilabel-classification
- sentence-classification
- skills-extraction
- career-planning
- modernbert
- job-analysis
datasets:
- actuarial-jobs-7k
language:
- en
metrics:
- f1
- precision
- recall
model-index:
- name: modernbert-actuarial-skills-classifier
  results:
  - task:
      type: text-classification
      name: Multi-Label Text Classification
    metrics:
    - type: f1_micro
      value: 0.6728
      name: F1 Micro
    - type: f1_macro
      value: 0.1060
      name: F1 Macro
    - type: precision_micro
      value: 0.7915
      name: Precision Micro
    - type: recall_micro
      value: 0.5850
      name: Recall Micro
widget:
- text: "I am looking for an entry-level actuarial position in life insurance pricing where I can apply my knowledge of mortality tables and statistical analysis. I have strong Python programming skills and experience with GLM models from my university projects. I am particularly interested in learning more about IFRS 17 implementation and would like to work with modern actuarial software like Prophet or MoSes."
  example_title: "Life Insurance Entry Level"
- text: "I have three years of experience as a reserving actuary in property and casualty insurance, working primarily with workers compensation and general liability lines. I am proficient in R and SQL for data analysis and have built several predictive models using machine learning techniques. I am now seeking a senior analyst role where I can lead pricing projects and mentor junior actuaries, with a target salary range of at least 85000 dollars annually."
  example_title: "P&C Career Growth"
- text: "After completing my actuarial exams up to ASA level, I want to transition into a health insurance role focusing on medical cost trend analysis and risk adjustment. I enjoy working with large datasets and have self-taught Python and SAS for healthcare analytics. My ideal position would involve building pricing models for group health products and I am hoping to find opportunities that offer around 70000 dollars per year as I build my specialization in this area."
  example_title: "Health Insurance Transition"
- text: "I am a recent mathematics graduate passionate about pension actuarial work and retirement planning. I have limited professional experience but completed internships where I learned about defined benefit schemes, asset liability management, and regulatory compliance under Solvency II. I am eager to develop my Excel and VBA skills further and would consider positions starting at 40000 dollars minimum while I continue studying for my actuarial fellowship exams."
  example_title: "Pensions Graduate Role"
- text: "As a data scientist looking to move into the actuarial field, I bring extensive experience with machine learning frameworks like TensorFlow and PyTorch, as well as strong programming abilities in Python and Scala. I am particularly interested in applying deep learning techniques to mortality forecasting and longevity risk modeling in life insurance. I am seeking roles that value innovation in actuarial modeling and offer competitive compensation of at least 95000 dollars given my technical background."
  example_title: "Data Science to Actuarial"
inference:
  parameters:
    threshold: 0.5
    top_k: 15
---

<div align="center">
<img src="./photo.png" width="150" height="150" style="border-radius: 50%; margin: 20px 0;">

# Connect with me on LinkedIn!

[LinkedIn](https://www.linkedin.com/in/manuel-caccone-42872141/)

**Manuel Caccone - Actuarial Data Scientist & Open Source Educator**

*Let's discuss actuarial science, AI, and career development!*

---

</div>

# 🎯 ModernBERT Actuarial Skills Classifier: Your Career Planning Assistant

---

## Model Description

**ModernBERT-actuarial-skills-classifier** is a ModernBERT-base model fine-tuned on over 7,000 actuarial job postings to identify actuarial competencies and technical skills in natural-language descriptions. It supports career planning, skills gap analysis, and learning roadmap generation for actuarial professionals and students.

---

## ✨ Key Features

- **Multi-Label Classification:** Identifies multiple relevant skills from a single description
- **Career-Focused:** Trained on real job postings covering Life, P&C, Health, and Pensions
- **Instant Analysis:** Get results in under 1 second
- **Open Source:** Apache 2.0 License for educational and commercial use
- **Interactive Demo:** Full-featured Gradio Space with learning roadmaps and batch processing

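
Multi-label classification scores each skill independently, which is why several skills can clear the threshold for a single description. The sketch below contrasts per-label sigmoid scores with a single-label softmax in plain Python. The skill names and logits are made up for illustration and are not real model output.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to an independent probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits: list[float]) -> list[float]:
    """Single-label alternative, shown for contrast: probabilities sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for three hypothetical skills (not real model output).
skills = ["Python", "GLM Modeling", "Solvency II"]
logits = [2.0, 1.5, -3.0]

independent = [sigmoid(v) for v in logits]  # one probability per skill
competing = softmax(logits)                 # skills compete for one unit of mass

# With sigmoid scores, more than one skill can exceed a 0.5 threshold.
selected = [s for s, p in zip(skills, independent) if p > 0.5]
print(selected)
print([round(p, 3) for p in competing])
```
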
---

## 💡 Intended Use Cases

- **Career Planning:** Students and early-career actuaries discovering required skills for target roles
- **Job Analysis:** Extracting structured skill requirements from job descriptions
- **Skills Gap Assessment:** Identifying learning priorities when changing specializations
- **Market Research:** Analyzing trends in actuarial job requirements across industries
- **Resume Optimization:** Matching your background to employer expectations

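
One way to read the skills gap use case is to classify a target job description and a description of your own background, then compare the two skill sets. The sketch below assumes both sets were already produced by the classifier. The `skills_gap` helper and the skill names are hypothetical and not part of the model's API.

```python
def skills_gap(target_skills: set[str], current_skills: set[str]) -> dict[str, list[str]]:
    """Split target-role skills into those already covered and those still missing."""
    return {
        "covered": sorted(target_skills & current_skills),
        "missing": sorted(target_skills - current_skills),
    }


# Hypothetical classifier outputs for a target role and for a candidate profile.
target = {"Python", "IFRS 17", "Life Insurance Pricing", "GLM Modeling"}
current = {"Python", "Excel", "GLM Modeling"}

gap = skills_gap(target, current)
print("Already covered:", gap["covered"])
print("Learning priorities:", gap["missing"])
```
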
### Examples

```
Input: "I am looking for an entry-level actuarial position in life insurance pricing
where I can apply my knowledge of mortality tables and statistical analysis. I have
strong Python programming skills and experience with GLM models from my university
projects. I am particularly interested in learning more about IFRS 17 implementation."

Output: Life Insurance Pricing (92%), Python (88%), GLM Modeling (85%),
Statistical Analysis (82%), Mortality Tables (78%), IFRS 17 (75%),
Entry Level (71%), Excel (68%)...
```

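
The example output lists skills in descending order of confidence. A minimal sketch of that post-processing follows, assuming a mapping from skill name to probability and reusing the `threshold: 0.5` and `top_k: 15` values from the model card metadata. `format_predictions` is a hypothetical helper, not part of the model's API.

```python
def format_predictions(scores: dict[str, float], threshold: float = 0.5, top_k: int = 15) -> str:
    """Keep skills above the threshold, sort by confidence, and cap at top_k."""
    kept = [(skill, p) for skill, p in scores.items() if p > threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return ", ".join(f"{skill} ({p:.0%})" for skill, p in kept[:top_k])


# Three scores taken from the illustrative output, plus one hypothetical
# low-confidence label to show the threshold filtering it out.
scores = {
    "Python": 0.88,
    "Life Insurance Pricing": 0.92,
    "GLM Modeling": 0.85,
    "Solvency II": 0.12,
}

full_line = format_predictions(scores)
short_line = format_predictions(scores, top_k=2)
print(full_line)
print(short_line)
```
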
---

## Training Data

- **Dataset Size:** 7,000+ real actuarial job postings
- **Time Period:** 2023-2025 job market
- **Coverage:** Life, P&C, Health, Pensions, Reinsurance, Consulting
- **Labels:** 100+ unique skills covering actuarial domains, programming, tools, certifications, and soft skills
- **Format:** Multi-label classification with manual validation by actuarial professionals

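
In a multi-label setup, the skills attached to each posting are typically encoded as a multi-hot vector with one slot per label. The sketch below assumes a tiny hypothetical vocabulary of four skills. The real label set (100+ skills) and the dataset's actual storage format are not shown in this card.

```python
def build_label_index(all_skills: list[str]) -> dict[str, int]:
    """Assign each distinct skill a fixed position in the label vector."""
    return {skill: i for i, skill in enumerate(sorted(set(all_skills)))}


def encode_multi_hot(skills: list[str], label_index: dict[str, int]) -> list[float]:
    """Return a vector with 1.0 at the position of every skill present."""
    vector = [0.0] * len(label_index)
    for skill in skills:
        vector[label_index[skill]] = 1.0
    return vector


# Hypothetical four-skill vocabulary; the real model has 100+ labels.
label_index = build_label_index(["Python", "Excel", "IFRS 17", "Solvency II"])
posting_skills = ["Python", "IFRS 17"]

encoded = encode_multi_hot(posting_skills, label_index)
print(label_index)
print(encoded)
```
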
---

## Training Statistics

| Metric          | Value  | Notes                                                        |
|-----------------|--------|--------------------------------------------------------------|
| Epochs          | 10     | Best model at epoch 7                                        |
| Final F1 Micro  | 0.6728 | Overall performance across all skills                        |
| Final F1 Macro  | 0.1060 | Unweighted average per skill; sensitive to class imbalance   |
| Precision Micro | 0.7915 | About 79% of predicted skills are correct                    |
| Recall Micro    | 0.5850 | Captures 58.5% of relevant skills                            |
| Hamming Loss    | 0.0207 | About 2% of individual label decisions are wrong             |
| Training Loss   | 0.0602 | Final validation loss                                        |
| Learning Rate   | 2e-5   | With 10% warmup                                              |
| Batch Size      | 16     | Effective (8 per device × 2 gradient accumulation steps)     |
| Hardware        | GPU    | Mixed precision training (FP16)                              |

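
The gap between F1 Micro and F1 Macro in the table is typical of imbalanced label sets. Micro-averaging pools every label decision, while macro-averaging gives each skill equal weight, so rare skills that are seldom predicted pull the macro score down. The from-scratch sketch below shows the effect on hypothetical data (three labels, four samples), not on the model's evaluation set.

```python
def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_metrics(y_true: list[list[int]], y_pred: list[list[int]]) -> dict[str, float]:
    """Compute micro F1, macro F1, and Hamming loss for 0/1 label matrices."""
    n_labels = len(y_true[0])
    counts = [[0, 0, 0] for _ in range(n_labels)]  # [tp, fp, fn] per label
    wrong = 0
    for truth, pred in zip(y_true, y_pred):
        for j, (t, p) in enumerate(zip(truth, pred)):
            if t and p:
                counts[j][0] += 1  # true positive
            elif p:
                counts[j][1] += 1  # false positive
            elif t:
                counts[j][2] += 1  # false negative
            wrong += int(t != p)
    tp, fp, fn = (sum(c[k] for c in counts) for k in range(3))
    return {
        "f1_micro": f1(tp, fp, fn),
        "f1_macro": sum(f1(*c) for c in counts) / n_labels,
        "hamming_loss": wrong / (len(y_true) * n_labels),
    }


# Hypothetical data: label 0 is common, labels 1 and 2 are rare and never predicted.
y_true = [[1, 0, 0], [1, 0, 0], [1, 1, 0], [1, 0, 1]]
y_pred = [[1, 0, 0], [1, 0, 0], [1, 0, 0], [1, 0, 0]]

# Micro F1 is 0.8 here, while macro F1 is only about 0.33.
metrics = multilabel_metrics(y_true, y_pred)
print(metrics)
```
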
---

## 🛠️ Dependencies

```
transformers>=4.48.0
torch>=2.0.0
pandas
numpy
```

---

## ⚠️ Limitations & Ethics

- **Domain-Specific:** Optimized for actuarial and insurance contexts only
- **English Only:** Trained exclusively on English job postings
- **Class Imbalance:** Rare skills may have lower prediction confidence
- **Not Exhaustive:** Cannot predict skills not present in training data
- **Career Guidance Only:** Not a substitute for professional career counseling
- **Geographic Bias:** Primarily reflects US, UK, and EU job markets

---

## 💻 Usage Example

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "manuelcaccone/modernbert-actuarial-skills-classifier"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Prepare text
text = """I am a recent mathematics graduate passionate about pension actuarial work
and retirement planning. I have limited professional experience but completed internships
where I learned about defined benefit schemes and regulatory compliance. I am eager to
develop my Excel skills further and would consider positions starting at 40000 dollars
minimum while I continue studying for my actuarial exams."""

# Tokenize and predict; sigmoid yields one independent probability per skill
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
    probabilities = torch.sigmoid(outputs.logits)

# Get predictions above threshold
threshold = 0.5
predicted_indices = torch.where(probabilities[0] > threshold)[0]

# Display results
print("Predicted Skills:")
for idx in predicted_indices:
    skill = model.config.id2label[idx.item()]
    confidence = probabilities[0][idx].item()
    print(f" {skill}: {confidence:.1%}")
```

---

## Related Resources

This model is part of an actuarial AI ecosystem:

- **Interactive Demo:** [Skills Classifier Space](https://huggingface.co/spaces/manuelcaccone/actuarial-skills-classifier)
- **Model Repository:** [modernbert-actuarial-skills-classifier](https://huggingface.co/manuelcaccone/modernbert-actuarial-skills-classifier)
- **Base Model:** [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)

---

## 👤 Author & Citation

- **Creator:** Manuel Caccone (Actuarial Data Scientist & Open Source Educator)
- [LinkedIn](https://www.linkedin.com/in/manuel-caccone-42872141/)

```bibtex
@misc{caccone2025actuarialskills,
  title={ModernBERT Actuarial Skills Classifier: Career Planning with Multi-Label Classification},
  author={Caccone, Manuel},
  year={2025},
  publisher={Hugging Face},
  url={https://huggingface.co/manuelcaccone/modernbert-actuarial-skills-classifier},
  note={Fine-tuned ModernBERT for actuarial skills extraction from job descriptions}
}
```

---

## License

Apache 2.0 License: use, modify, and cite for ethical, research, educational, and commercial purposes.

---

<div align="center">

### 🤝 Want to collaborate or discuss actuarial AI?

[LinkedIn](https://www.linkedin.com/in/manuel-caccone-42872141/)

</div>

---

*Part of the actuarial open-source education initiative, bringing AI tools to the actuarial community!*