Model Details

Model Description

This model performs sentence segmentation of MIMIC-III notes. It takes clinical text as input and predicts a BIO tag for each token, where B marks the beginning of a sentence, I marks the inside of a sentence, and O marks text outside any sentence. More details are in the paper Automatic sentence segmentation of clinical record narratives in real-world data. Sample code for using this model is available on GitHub.
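To illustrate how BIO tags translate into sentences, here is a minimal sketch in plain Python. It assumes the predictions have already been reduced to one tag string per token, using the literal labels "B", "I", and "O"; the function name `bio_to_sentences` and the example tokens are hypothetical and not part of the released code. Treating a stray I tag with no preceding B as part of a new sentence is also an assumption.

```python
from typing import List


def bio_to_sentences(tokens: List[str], tags: List[str]) -> List[List[str]]:
    """Group tokens into sentences using per-token B/I/O tags (illustrative helper)."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    sentences: List[List[str]] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            # A B tag closes any open sentence and starts a new one.
            if current:
                sentences.append(current)
            current = [token]
        elif tag == "I":
            # Assumption: an I tag without a preceding B simply opens a sentence.
            current.append(token)
        else:
            # Any other tag (O) is outside a sentence: close the open one, drop the token.
            if current:
                sentences.append(current)
                current = []
    if current:
        sentences.append(current)
    return sentences


# Hypothetical example: a separator line between two sentences is tagged O.
tokens = ["Pt", "stable", ".", "---", "Denies", "pain", "."]
tags = ["B", "I", "I", "O", "B", "I", "I"]
sentences = bio_to_sentences(tokens, tags)
print(sentences)
```

Tokens tagged O, such as separators or formatting debris in a note, are left out of every sentence in this sketch.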

Our segmentation model is based on microsoft/BiomedNLP-BiomedBERT-base-uncased-abstract-fulltext, which we trained on MIMIC-III notes for a sequence labeling (token classification) task.
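Because this is a token classification model, it can probably be loaded with the generic Hugging Face `transformers` classes. The sketch below is not the authors' sample code. It assumes the repository id is `dongfangxu/SentenceSegmenter-MIMIC`, that label names are available in `model.config.id2label`, and that the tokenizer emits WordPiece continuation pieces prefixed with `##`. The helper names `predict_token_tags` and `merge_wordpieces` are hypothetical.

```python
from typing import List, Tuple

MODEL_ID = "dongfangxu/SentenceSegmenter-MIMIC"  # assumed Hub repository id


def predict_token_tags(text: str, model_id: str = MODEL_ID) -> List[Tuple[str, str]]:
    """Return (wordpiece, predicted label) pairs for one text (needs torch + transformers)."""
    # Imported lazily so the pure-Python helper below works without these packages.
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    model.eval()

    # Long notes are truncated to the model's maximum length in this sketch.
    encoded = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**encoded).logits
    label_ids = logits.argmax(dim=-1)[0].tolist()

    pieces = tokenizer.convert_ids_to_tokens(encoded["input_ids"][0].tolist())
    special = set(tokenizer.all_special_tokens)
    return [
        (piece, model.config.id2label[label_id])
        for piece, label_id in zip(pieces, label_ids)
        if piece not in special  # drop [CLS], [SEP], and similar markers
    ]


def merge_wordpieces(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Join '##' continuation pieces into words, keeping the first piece's label."""
    merged: List[Tuple[str, str]] = []
    for piece, tag in pairs:
        if piece.startswith("##") and merged:
            word, first_tag = merged[-1]
            merged[-1] = (word + piece[2:], first_tag)
        else:
            merged.append((piece, tag))
    return merged
```

A call such as `merge_wordpieces(predict_token_tags(note_text))` would download the model on first use. Keeping the first piece's label for each word is one common convention, chosen here for simplicity rather than taken from the paper.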

Citation

Dongfang Xu, Davy Weissenbacher, Karen O’Connor, Siddharth Rawal, and Graciela Gonzalez Hernandez. 2024. Automatic sentence segmentation of clinical record narratives in real-world data. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 20780–20793, Miami, Florida, USA. Association for Computational Linguistics.
