HERBERT: Leveraging UMLS Hierarchical Knowledge to Enhance Clinical Entity Normalization in Spanish
HERBERT-P is a contrastive-learning-based bi-encoder for medical entity normalization in Spanish, leveraging synonym and parent relationships from UMLS to enhance candidate retrieval for entity linking in clinical texts.
Key features:
- Base model: PlanTL-GOB-ES/roberta-base-biomedical-clinical-es
- Trained with 15 positive pairs per anchor (synonyms + parents)
- Task: Normalization of disease, procedure, and symptom mentions to SNOMED-CT/UMLS codes.
- Domain: Spanish biomedical/clinical texts.
- Corpora: DisTEMIST, MedProcNER, SympTEMIST.
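As a rough illustration of how a bi-encoder is typically used for candidate retrieval, the sketch below ranks candidate concepts by cosine similarity to a mention embedding. It is a minimal sketch, not the authors' pipeline: the vectors are toy 3-dimensional values standing in for encoder outputs, and the codes `CODE_A`, `CODE_B`, `CODE_C` and the helper names `cosine` and `retrieve` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(mention_vec, candidates, k=5):
    """Return the k (code, score) pairs most similar to the mention vector."""
    scored = [(code, cosine(mention_vec, vec)) for code, vec in candidates.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy embeddings (assumption): in practice these would be produced by
# encoding concept names and the mention text with the bi-encoder.
candidates = {
    "CODE_A": [0.9, 0.1, 0.0],
    "CODE_B": [0.0, 1.0, 0.0],
    "CODE_C": [0.5, 0.5, 0.5],
}
mention = [1.0, 0.2, 0.0]

top = retrieve(mention, candidates, k=2)
for code, score in top:
    print(code, round(score, 3))
```

With real data, the candidate embeddings would usually be computed once and stored in a vector index, so that only the mention needs encoding at query time.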
Benchmark Results
| Corpus | Top-1 | Top-5 | Top-25 | Top-200 |
|---|---|---|---|---|
| DisTEMIST | 0.574 | 0.720 | 0.803 | 0.869 |
| SympTEMIST | 0.630 | 0.779 | 0.881 | 0.945 |
| MedProcNER | 0.651 | 0.763 | 0.838 | 0.892 |
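The card does not define the Top-k columns. Assuming they report top-k accuracy (the fraction of mentions whose gold code appears among the k highest-ranked candidates), the metric can be sketched as follows. The function name `top_k_accuracy` and the toy rankings are illustrative only and are not taken from the benchmark.

```python
def top_k_accuracy(ranked_lists, gold_codes, k):
    """Fraction of mentions whose gold code is within the first k candidates."""
    hits = sum(
        1 for ranked, gold in zip(ranked_lists, gold_codes) if gold in ranked[:k]
    )
    return hits / len(gold_codes)

# Toy example (assumption): three mentions, each with a ranked candidate list.
ranked_lists = [
    ["A", "B", "C"],  # gold "A" at rank 1
    ["B", "A", "C"],  # gold "A" at rank 2
    ["C", "B", "A"],  # gold "A" at rank 3
]
gold_codes = ["A", "A", "A"]

for k in (1, 2, 3):
    print(k, top_k_accuracy(ranked_lists, gold_codes, k))
```

Under this reading, the metric can only stay level or rise as k grows, which matches the pattern in every row of the table.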