Change example and fix preprocessing script link
README.md
CHANGED
@@ -26,10 +26,12 @@ ner_pipeline = pipeline("token-classification",
                         aggregation_strategy="max")
 
 # Apply it to some text
-ner_pipeline("
+ner_pipeline("ZNF598 is a Zinc finger containing E3 ubiquitin ligase.")
 
 # Output:
-# [ {"entity_group": "
+# [ {"entity_group": "Gene", "score": 0.99889, "word": "znf598", "start": 0, "end": 6},
+# {"entity_group": "DomainMotif", "score": 0.74961, "word": "zinc finger", "start": 12, "end": 23},
+# {"entity_group": "FamilyName", "score": 0.89084, "word": "e3 ubiquitin ligase", "start": 35, "end": 54} ]
 ```
 
 ## Dataset Info
@@ -38,7 +40,7 @@ ner_pipeline("EGFR T790M mutations have been known to affect treatment outcomes
 
 The dataset should be cited with: Wei, Chih-Hsuan, Hung-Yu Kao, and Zhiyong Lu. "GNormPlus: an integrative approach for tagging genes, gene families, and protein domains." BioMed research international 2015.1 (2015): 918710. DOI: [10.1155/2015/918710](https://doi.org/10.1155/2015/918710)
 
-**Preprocessing:** The training set was split 75/25 to create a training and validation set. No changes were made to the annotations. The preprocessing script for this dataset is [
+**Preprocessing:** The training set was split 75/25 to create a training and validation set. No changes were made to the annotations. The preprocessing script for this dataset is [prepare_gnormplus.py](https://github.com/Glasgow-AI4BioMed/bioner/blob/main/prepare_gnormplus.py).
 
 ## Performance
 
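
Note on the new example output: the `word` fields are lowercased ("znf598"), while `start` and `end` are character offsets into the original input. Presumably the lowercasing comes from the tokenizer, though the diff does not say so. A minimal sketch of recovering the original-cased spans from those offsets follows; the entity list is copied from the updated README example, and `surface_forms` is a hypothetical helper name, not part of the model card or of `transformers`.

```python
# Input text and entity spans copied from the updated README example.
text = "ZNF598 is a Zinc finger containing E3 ubiquitin ligase."
entities = [
    {"entity_group": "Gene", "score": 0.99889, "word": "znf598", "start": 0, "end": 6},
    {"entity_group": "DomainMotif", "score": 0.74961, "word": "zinc finger", "start": 12, "end": 23},
    {"entity_group": "FamilyName", "score": 0.89084, "word": "e3 ubiquitin ligase", "start": 35, "end": 54},
]


def surface_forms(text, entities):
    """Hypothetical helper: pair each label with the original-cased text span."""
    return [(e["entity_group"], text[e["start"]:e["end"]]) for e in entities]


spans = surface_forms(text, entities)
for label, span in spans:
    print(f"{label}: {span}")
```

Slicing by offset avoids depending on how the pipeline normalises `word`, which is useful when the extracted mentions need to match the source text exactly.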
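
For readers who do not want to open the linked script, the 75/25 split described in the Preprocessing line could look roughly like the sketch below. This is an illustration only: the function name `split_train_validation`, the shuffling step, and the seed are assumptions, and the actual `prepare_gnormplus.py` may split the documents differently.

```python
import random


def split_train_validation(documents, train_fraction=0.75, seed=42):
    """Hypothetical helper: shuffle documents and split them train/validation.

    The shuffle and the fixed seed are assumptions made for reproducibility;
    the annotations inside each document are left untouched.
    """
    shuffled = list(documents)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder document IDs standing in for the GNormPlus training documents.
docs = [f"doc_{i}" for i in range(8)]
train_docs, val_docs = split_train_validation(docs)
print(len(train_docs), len(val_docs))
```

Splitting at the document level, rather than per sentence or per annotation, keeps all mentions from one document on the same side of the split.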