manuelcaccone
/

modernbert-actuarial-skills-multilabel

@@ -3,84 +3,247 @@ library_name: transformers
 license: apache-2.0
 base_model: answerdotai/ModernBERT-base
 tags:
-- generated_from_trainer
 model-index:
-- name: modernbert_multilabel_model
-  results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
-# modernbert_multilabel_model
-This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the None dataset.
-It achieves the following results on the evaluation set:
-- Loss: 0.0616
-- F1 Micro: 0.6728
-- F1 Macro: 0.1060
-- Precision Micro: 0.7915
-- Precision Macro: 0.2013
-- Recall Micro: 0.5850
-- Recall Macro: 0.0860
-- Hamming Loss: 0.0207
-- Exact Match Accuracy: 0.0457
-## Model description
-More information needed
-## Intended uses & limitations
-More information needed
-## Training and evaluation data
-More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 2e-05
-- train_batch_size: 8
-- eval_batch_size: 16
-- seed: 42
-- gradient_accumulation_steps: 2
-- total_train_batch_size: 16
-- optimizer: Use OptimizerNames.ADAMW_TORCH with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
-- lr_scheduler_type: linear
-- lr_scheduler_warmup_ratio: 0.1
-- num_epochs: 10
-- mixed_precision_training: Native AMP
-### Training results
-| Training Loss | Epoch  | Step | Validation Loss | F1 Micro | F1 Macro | Precision Micro | Precision Macro | Recall Micro | Recall Macro | Hamming Loss | Exact Match Accuracy |
-|:-------------:|:------:|:----:|:---------------:|:--------:|:--------:|:---------------:|:---------------:|:------------:|:------------:|:------------:|:--------------------:|
-| 0.1435        | 0.5797 | 100  | 0.0719          | 0.5833   | 0.0436   | 0.7128          | 0.0604          | 0.4936       | 0.0412       | 0.0253       | 0.0054               |
-| 0.1352        | 1.1565 | 200  | 0.0684          | 0.6069   | 0.0558   | 0.7282          | 0.0756          | 0.5202       | 0.0508       | 0.0242       | 0.0098               |
-| 0.1314        | 1.7362 | 300  | 0.0643          | 0.6137   | 0.0605   | 0.8010          | 0.0911          | 0.4975       | 0.0515       | 0.0225       | 0.0098               |
-| 0.1141        | 2.3130 | 400  | 0.0632          | 0.6173   | 0.0707   | 0.7935          | 0.1299          | 0.5052       | 0.0576       | 0.0225       | 0.0087               |
-| 0.1153        | 2.8928 | 500  | 0.0611          | 0.6483   | 0.0697   | 0.7948          | 0.1330          | 0.5474       | 0.0585       | 0.0213       | 0.0207               |
-| 0.1033        | 3.4696 | 600  | 0.0593          | 0.6649   | 0.0875   | 0.7772          | 0.1448          | 0.5810       | 0.0746       | 0.0210       | 0.0174               |
-| 0.0983        | 4.0464 | 700  | 0.0589          | 0.6699   | 0.0934   | 0.7770          | 0.1504          | 0.5887       | 0.0794       | 0.0208       | 0.0229               |
-| 0.0874        | 4.6261 | 800  | 0.0593          | 0.6661   | 0.0894   | 0.7864          | 0.1740          | 0.5777       | 0.0732       | 0.0208       | 0.0250               |
-| 0.0771        | 5.2029 | 900  | 0.0591          | 0.6684   | 0.0968   | 0.8022          | 0.1778          | 0.5729       | 0.0783       | 0.0204       | 0.0229               |
-| 0.0702        | 5.7826 | 1000 | 0.0583          | 0.6803   | 0.1035   | 0.7895          | 0.1850          | 0.5976       | 0.0845       | 0.0202       | 0.0305               |
-| 0.0596        | 6.3594 | 1100 | 0.0592          | 0.6722   | 0.1043   | 0.8050          | 0.1927          | 0.5770       | 0.0833       | 0.0202       | 0.0348               |
-| 0.0563        | 6.9391 | 1200 | 0.0588          | 0.6815   | 0.1069   | 0.7962          | 0.1845          | 0.5957       | 0.0868       | 0.0200       | 0.0283               |
-| 0.0461        | 7.5159 | 1300 | 0.0592          | 0.6845   | 0.1082   | 0.7849          | 0.1944          | 0.6069       | 0.0875       | 0.0201       | 0.0348               |
-| 0.0422        | 8.0928 | 1400 | 0.0599          | 0.6810   | 0.1075   | 0.7969          | 0.1916          | 0.5945       | 0.0868       | 0.0200       | 0.0348               |
-| 0.0369        | 8.6725 | 1500 | 0.0602          | 0.6803   | 0.1076   | 0.7919          | 0.1948          | 0.5962       | 0.0866       | 0.0201       | 0.0359               |
-| 0.0334        | 9.2493 | 1600 | 0.0601          | 0.6833   | 0.1075   | 0.7920          | 0.1939          | 0.6008       | 0.0865       | 0.0200       | 0.0370               |
-| 0.0325        | 9.8290 | 1700 | 0.0602          | 0.6809   | 0.1062   | 0.7921          | 0.1909          | 0.5971       | 0.0856       | 0.0201       | 0.0326               |
-### Framework versions
-- Transformers 4.56.1
-- Pytorch 2.8.0+cu126
-- Datasets 4.0.0
-- Tokenizers 0.22.0

 license: apache-2.0
 base_model: answerdotai/ModernBERT-base
 tags:
+- actuarial
+- insurance
+- multilabel-classification
+- sentence-classification
+- skills-extraction
+- career-planning
+- modernbert
+- job-analysis
+datasets:
+- actuarial-jobs-7k
+language:
+- en
+metrics:
+- f1
+- precision
+- recall
 model-index:
+- name: modernbert-actuarial-skills-classifier
+  results:
+  - task:
+      type: text-classification
+      name: Multi-Label Text Classification
+    metrics:
+    - type: f1_micro
+      value: 0.6728
+      name: F1 Micro
+    - type: f1_macro
+      value: 0.1060
+      name: F1 Macro
+    - type: precision_micro
+      value: 0.7915
+      name: Precision Micro
+    - type: recall_micro
+      value: 0.5850
+      name: Recall Micro
+widget:
+- text: "I am looking for an entry-level actuarial position in life insurance pricing where I can apply my knowledge of mortality tables and statistical analysis. I have strong Python programming skills and experience with GLM models from my university projects. I am particularly interested in learning more about IFRS 17 implementation and would like to work with modern actuarial software like Prophet or MoSes."
+  example_title: "Life Insurance Entry Level"
+- text: "I have three years of experience as a reserving actuary in property and casualty insurance, working primarily with workers compensation and general liability lines. I am proficient in R and SQL for data analysis and have built several predictive models using machine learning techniques. I am now seeking a senior analyst role where I can lead pricing projects and mentor junior actuaries, with a target salary range of at least 85000 dollars annually."
+  example_title: "P&C Career Growth"
+- text: "After completing my actuarial exams up to ASA level, I want to transition into a health insurance role focusing on medical cost trend analysis and risk adjustment. I enjoy working with large datasets and have self-taught Python and SAS for healthcare analytics. My ideal position would involve building pricing models for group health products and I am hoping to find opportunities that offer around 70000 dollars per year as I build my specialization in this area."
+  example_title: "Health Insurance Transition"
+- text: "I am a recent mathematics graduate passionate about pension actuarial work and retirement planning. I have limited professional experience but completed internships where I learned about defined benefit schemes, asset liability management, and regulatory compliance under Solvency II. I am eager to develop my Excel and VBA skills further and would consider positions starting at 40000 dollars minimum while I continue studying for my actuarial fellowship exams."
+  example_title: "Pensions Graduate Role"
+- text: "As a data scientist looking to move into the actuarial field, I bring extensive experience with machine learning frameworks like TensorFlow and PyTorch, as well as strong programming abilities in Python and Scala. I am particularly interested in applying deep learning techniques to mortality forecasting and longevity risk modeling in life insurance. I am seeking roles that value innovation in actuarial modeling and offer competitive compensation of at least 95000 dollars given my technical background."
+  example_title: "Data Science to Actuarial"
+inference:
+  parameters:
+    threshold: 0.5
+    top_k: 15
 ---
+<div align="center">
+  <img src="./photo.png" width="150" height="150" style="border-radius: 50%; margin: 20px 0;">
+  # 👋 Connect with me on LinkedIn!
+  [![LinkedIn](https://img.shields.io/badge/LinkedIn-Connect-blue?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/manuel-caccone-42872141/)
+  **Manuel Caccone - Actuarial Data Scientist & Open Source Educator**
+  *Let's discuss actuarial science, AI, and career development!*
+  ---
+</div>
+![Model Thumbnail](./Thumbnail.png)
+# 🎯 ModernBERT Actuarial Skills Classifier: Your Career Planning Assistant
+---
+## 🚩 Model Description
+**ModernBERT-actuarial-skills-classifier** is a fine-tuned ModernBERT-base model trained on over 7,000 actuarial job postings, purpose-built to extract and identify actuarial competencies and technical skills from natural language descriptions. It powers career planning, skills gap analysis, and learning roadmap generation for actuarial professionals and students.
+---
+## ✨ Key Features
+- 🎯 **Multi-Label Classification:** Identifies multiple relevant skills from a single description
+- 📚 **Career-Focused:** Trained on real job postings covering Life, P&C, Health, and Pensions
+- 🚀 **Instant Analysis:** Get results in under 1 second
+- 🔓 **Open Source:** Apache 2.0 License for educational and commercial use
+- 🌐 **Interactive Demo:** Full-featured Gradio Space with learning roadmaps and batch processing
+---
+## 💡 Intended Use Cases
+- **Career Planning:** Students and early-career actuaries discovering required skills for target roles
+- **Job Analysis:** Extracting structured skill requirements from job descriptions
+- **Skills Gap Assessment:** Identifying learning priorities when changing specializations
+- **Market Research:** Analyzing trends in actuarial job requirements across industries
+- **Resume Optimization:** Matching your background to employer expectations
+### Examples
+```
+Input: "I am looking for an entry-level actuarial position in life insurance pricing
+where I can apply my knowledge of mortality tables and statistical analysis. I have
+strong Python programming skills and experience with GLM models from my university
+projects. I am particularly interested in learning more about IFRS 17 implementation."
+Output: Life Insurance Pricing (92%), Python (88%), GLM Modeling (85%),
+        Statistical Analysis (82%), Mortality Tables (78%), IFRS 17 (75%),
+        Entry Level (71%), Excel (68%)...
+```
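
A skills gap assessment can be built on top of predictions like the ones above by comparing two sets of scores. The following is a minimal sketch, not the method used by the demo Space: the helper `skills_gap`, the two score dictionaries, and the 0.5 cut-off are illustrative assumptions.

```python
def skills_gap(current, target, threshold=0.5):
    """Return skills predicted for the target role but not for the current
    profile, sorted by the target-side confidence (highest first)."""
    have = {skill for skill, p in current.items() if p > threshold}
    missing = [
        (skill, p)
        for skill, p in target.items()
        if p > threshold and skill not in have
    ]
    return sorted(missing, key=lambda pair: pair[1], reverse=True)


# Hypothetical classifier scores for a candidate profile and a job posting.
profile = {"Python": 0.88, "Statistical Analysis": 0.82, "Excel": 0.68}
job = {"Python": 0.90, "IFRS 17": 0.75, "Life Insurance Pricing": 0.92, "Excel": 0.40}

gap = skills_gap(profile, job)
for skill, confidence in gap:
    print(f"Missing: {skill} ({confidence:.0%})")
```

Skills below the threshold on the job side (here `Excel` at 0.40) are ignored, so the gap only contains skills the model considers likely requirements.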
+---
+## 📂 Training Data
+- **Dataset Size:** 7,000+ real actuarial job postings
+- **Time Period:** 2023-2025 job market
+- **Coverage:** Life, P&C, Health, Pensions, Reinsurance, Consulting
+- **Labels:** 100+ unique skills covering actuarial domains, programming, tools, certifications, and soft skills
+- **Format:** Multi-label classification with manual validation by actuarial professionals
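
In a multi-label setup, each posting is typically paired with a multi-hot target vector that has one slot per skill. The sketch below shows that encoding under stated assumptions: the label names are a hypothetical subset, and the helper `encode_multi_hot` is not part of the released code. Floats are used because the multi-label head in `transformers` trains with a binary cross-entropy loss that expects float targets.

```python
def encode_multi_hot(skills, label2id):
    """Turn a list of skill names into a 0.0/1.0 vector with one slot per label."""
    vector = [0.0] * len(label2id)
    for skill in skills:
        if skill in label2id:  # skills outside the label set are skipped
            vector[label2id[skill]] = 1.0
    return vector


# Hypothetical subset of the label vocabulary.
label_names = ["Python", "GLM Modeling", "IFRS 17", "Excel"]
label2id = {name: i for i, name in enumerate(label_names)}

target = encode_multi_hot(["Python", "IFRS 17", "Prophet"], label2id)
print(target)
```

Here `Prophet` is not in the toy label set, so it leaves no trace in the vector, which mirrors the limitation that the model cannot predict skills absent from its training labels.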
+---
+## 📊 Training Statistics
+| Metric             | Value          | Notes                                      |
+|--------------------|----------------|--------------------------------------------|
+| Epochs             | 10             | Best model at epoch 7                      |
+| Final F1 Micro     | 0.6728         | Overall performance across all skills      |
+| Final F1 Macro     | 0.1060         | Unweighted mean of per-skill F1; sensitive to rare skills |
+| Precision Micro    | 0.7915         | About 79% of predicted skills are correct  |
+| Recall Micro       | 0.5850         | Recovers about 58.5% of relevant skills    |
+| Hamming Loss       | 0.0207         | About 2% of individual label decisions are wrong |
+| Validation Loss    | 0.0602         | Final logged validation loss (step 1700)   |
+| Learning Rate      | 2e-5           | With 10% warmup                            |
+| Batch Size         | 16             | Effective (8 per device, 2 grad accum)     |
+| Hardware           | GPU            | Mixed precision training (native AMP)      |
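
The large gap between F1 Micro and F1 Macro in the table is what one would expect when a few frequent skills are predicted well and many rare ones are missed. The sketch below illustrates the two averaging schemes on made-up toy data in plain Python; it is not the evaluation code used for this model, and per-label F1 is taken as 0.0 when a label has no true or predicted positives.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; 0.0 when there are no positives at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(y_true, y_pred):
    """Return (micro_f1, macro_f1) for multi-hot matrices given as lists of 0/1 rows."""
    n_labels = len(y_true[0])
    counts = []  # per-label (tp, fp, fn)
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        counts.append((tp, fp, fn))
    # Micro: pool the counts over all labels, then compute a single F1.
    micro = f1(
        sum(c[0] for c in counts),
        sum(c[1] for c in counts),
        sum(c[2] for c in counts),
    )
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1(*c) for c in counts) / n_labels
    return micro, macro


# Toy data: label 0 is frequent and always right; labels 1-3 are rare and never predicted.
y_true = [[1, 0, 0, 0], [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1], [1, 0, 0, 0], [0, 0, 0, 0]]
y_pred = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [1, 0, 0, 0], [0, 0, 0, 0]]

micro, macro = micro_macro_f1(y_true, y_pred)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")  # micro F1 = 0.769, macro F1 = 0.250
```

Micro averaging is dominated by the frequent label, while macro averaging gives each of the three missed rare labels the same weight as the frequent one.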
+---
+## 🛠️ Dependencies
+```
+transformers>=4.44.0
+torch>=2.0.0
+pandas
+numpy
+```
+---
+## ⚠️ Limitations & Ethics
+- **Domain-Specific:** Optimized for actuarial and insurance contexts only
+- **English Only:** Trained exclusively on English job postings
+- **Class Imbalance:** Rare skills may have lower prediction confidence
+- **Not Exhaustive:** Cannot predict skills not present in training data
+- **Career Guidance Only:** Not a substitute for professional career counseling
+- **Geographic Bias:** Primarily reflects US, UK, and EU job markets
+---
+## 💻 Usage Example
+```python
+from transformers import AutoTokenizer, AutoModelForSequenceClassification
+import torch
+# Load model
+model_name = "manuelcaccone/modernbert-actuarial-skills-classifier"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForSequenceClassification.from_pretrained(model_name)
+# Prepare text
+text = """I am a recent mathematics graduate passionate about pension actuarial work
+and retirement planning. I have limited professional experience but completed internships
+where I learned about defined benefit schemes and regulatory compliance. I am eager to
+develop my Excel skills further and would consider positions starting at 40000 dollars
+minimum while I continue studying for my actuarial exams."""
+# Tokenize and predict
+inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
+with torch.no_grad():
+    outputs = model(**inputs)
+    probabilities = torch.sigmoid(outputs.logits)
+# Get predictions above threshold
+threshold = 0.5
+predicted_indices = torch.where(probabilities[0] > threshold)[0]
+# Display results
+print("Predicted Skills:")
+for idx in predicted_indices:
+    skill = model.config.id2label[idx.item()]
+    confidence = probabilities[0][idx].item()
+    print(f"  {skill}: {confidence:.1%}")
+```
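
The card metadata lists `threshold: 0.5` and `top_k: 15` as inference parameters. How the hosted widget applies them is not documented here, so the following is only a sketch of one plausible post-processing step: keep labels above the threshold, sort by confidence, and cap the list at `top_k`. The helper `select_labels` and the toy label map are hypothetical.

```python
def select_labels(probabilities, id2label, threshold=0.5, top_k=15):
    """Keep labels with probability strictly above threshold,
    sorted from most to least confident, capped at top_k entries."""
    scored = [
        (id2label[i], p)
        for i, p in enumerate(probabilities)
        if p > threshold
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Hypothetical label map and sigmoid outputs for one input text.
id2label = {0: "Python", 1: "Excel", 2: "Solvency II", 3: "Pensions"}
probs = [0.62, 0.91, 0.35, 0.88]

selected = select_labels(probs, id2label, threshold=0.5, top_k=2)
for skill, confidence in selected:
    print(f"  {skill}: {confidence:.1%}")
```

With the usage example above, the same helper could be called as `select_labels(probabilities[0].tolist(), model.config.id2label)`.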
+---
+## 🌟 Related Resources
+This model is part of an actuarial AI ecosystem:
+- **Interactive Demo:** [Skills Classifier Space](https://huggingface.co/spaces/manuelcaccone/actuarial-skills-classifier)
+- **Model Repository:** [modernbert-actuarial-skills-classifier](https://huggingface.co/manuelcaccone/modernbert-actuarial-skills-classifier)
+- **Base Model:** [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
+---
+## 👤 Author & Citation
+- **Creator:** Manuel Caccone (Actuarial Data Scientist & Open Source Educator)
+- [LinkedIn](https://www.linkedin.com/in/manuel-caccone-42872141/)
+```bibtex
+@misc{caccone2025actuarialskills,
+  title={ModernBERT Actuarial Skills Classifier: Career Planning with Multi-Label Classification},
+  author={Caccone, Manuel},
+  year={2025},
+  publisher={Hugging Face},
+  url={https://huggingface.co/manuelcaccone/modernbert-actuarial-skills-classifier},
+  note={Fine-tuned ModernBERT for actuarial skills extraction from job descriptions}
+}
+```
+---
+## 📜 License
+Apache 2.0 License — use, modify, and cite for ethical, research, educational, and commercial purposes.
+---
+<div align="center">
+### 🤝 Want to collaborate or discuss actuarial AI?
+[![LinkedIn](https://img.shields.io/badge/Let's_Connect_on_LinkedIn!-blue?style=for-the-badge&logo=linkedin&logoColor=white)](https://www.linkedin.com/in/manuel-caccone-42872141/)
+</div>
+---
+*Part of the actuarial open-source education initiative—bringing AI tools to the actuarial community!*