---
library_name: transformers
datasets:
- SoFairOA/sofair_softcite_somesci
metrics:
- recall
- precision
- f1
- accuracy
base_model:
- answerdotai/ModernBERT-base
---
# SoFair ModernBERT base filter
A fine-tuned ModernBERT model that identifies candidate documents for software mention extraction.
It was trained on [SoFairOA/sofair_softcite_somesci](https://huggingface.co/datasets/SoFairOA/sofair_softcite_somesci) (sofair_softcite_somesci_documents) to classify whether a given document contains at least one annotation.
## Usage
To process a collection of documents with this model, use [https://github.com/SoFairOA/filter](https://github.com/SoFairOA/filter), a simple command-line tool we created for this purpose.
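
If you prefer to call the model directly from Python, the following is a minimal sketch using the generic `transformers` text-classification pipeline, not the command-line tool itself. It rests on several assumptions: `MODEL_ID` is a placeholder for this model's Hub repository ID, `filter_documents` and `load_classifier` are hypothetical helper names, and the positive label string (for example `LABEL_1`) should be checked against the model's `config.id2label` before use. A recent `transformers` release with ModernBERT support is also assumed.

```python
from typing import Callable, Iterable, List

# Placeholder (assumption): replace with this model's actual Hub repository ID.
MODEL_ID = "<this-model-repo-id>"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so filter_documents can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def filter_documents(
    texts: Iterable[str], classifier: Callable, positive_label: str
) -> List[str]:
    """Return the documents whose predicted label equals positive_label.

    positive_label is an assumption about the model's label names;
    inspect the model's config.id2label to find the right value.
    """
    texts = list(texts)
    # Truncate long documents to the model's maximum sequence length.
    predictions = classifier(texts, truncation=True)
    return [t for t, p in zip(texts, predictions) if p["label"] == positive_label]
```

With a real model ID filled in, usage would look like `filter_documents(docs, load_classifier(), positive_label="LABEL_1")`, where the label name is again an assumption to verify.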
## Evaluation
We evaluated this model on the test set of the [SoFairOA/sofair_softcite_somesci](https://huggingface.co/datasets/SoFairOA/sofair_softcite_somesci) (sofair_softcite_somesci_documents) dataset:
<table>
<tr>
<th>precision</th>
<td>0.8625730994152047</td>
</tr>
<tr>
<th>recall</th>
<td>0.9104938271604939</td>
</tr>
<tr>
<th>f1</th>
<td>0.8858858858858859</td>
</tr>
<tr>
<th>accuracy</th>
<td>0.9268527430221367</td>
</tr>
</table>
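
As a quick consistency check on the table, F1 is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall; the helper name `f1_score` is our own and is not taken from the evaluation code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values copied from the evaluation table.
precision = 0.8625730994152047
recall = 0.9104938271604939

f1 = f1_score(precision, recall)
print(f"{f1:.4f}")  # 0.8859, matching the reported f1
```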