elderprince committed on
Commit 4a953d7 · verified · 1 Parent(s): bf1000e

Update README.md

Files changed (1)
  1. README.md +18 -1
README.md CHANGED
@@ -3,4 +3,21 @@ license: gpl-3.0
3
  base_model:
4
  - naver-clova-ix/donut-base
5
  pipeline_tag: visual-document-retrieval
6
- ---
6
+ ---
7
+
8
+ # HeR-T: Herbarium specimen label Recognition Transformer
9
+
10
+ ## 📃 Paper
11
+ Application of computer vision to the automated extraction of metadata from natural history specimen labels: A case study on herbarium specimens (Under Review)
12
+
13
+ ## 💁 Authors
14
+ Zacchigna, Jacopo; Liu, Weiwei; Pellegrino, Felice Andrea; Peron, Adriano; Roma-Marzio, Francesco; Peruzzi, Lorenzo; Martellos, Stefano
15
+
16
+ ## 🚀 Overview
17
+ HeR-T (Herbarium specimen label Recognition Transformer) is a fine-tuned vision-language model for automated metadata extraction from natural history specimen labels, in particular herbarium specimen labels. It builds on Donut-base and was fine-tuned on 55,089 herbarium specimen images from the Herbarium of the University of Pisa (international acronym PI).
18
+
19
+ ## 🔥 Features
20
+ - **Fine-tuned on** specimen images from the Herbarium of the University of Pisa for automated metadata extraction from natural history specimen labels
21
+ - **Supports** image inputs whose labels contain printed, handwritten, or mixed-format text
22
+ - **Evaluation**: Tree Edit Distance (TED) accuracy, computed as max(0, 1 − TED(pr, gt) / TED(φ, gt)), where pr, gt, and φ denote the predicted tree, the ground-truth tree, and the empty tree, respectively
23
+ - **Pre-trained weights** are loaded from Donut-base (naver-clova-ix/donut-base)
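
The TED accuracy formula in the Evaluation bullet can be illustrated with a small, self-contained sketch. It is not the authors' evaluation code: it assumes unit costs for inserting, deleting, and relabelling a node, and the helper names (`to_tree`, `forest_distance`, `ted_accuracy`) and the sample field names are hypothetical. The README does not state the cost model, so scores from the actual evaluation may differ (for example, if leaf relabelling is weighted by string similarity).

```python
from functools import lru_cache


def to_tree(obj, label="<root>"):
    """Convert nested JSON-like data into a hashable (label, children) tree.

    Assumes dictionary keys are strings; leaf values become single nodes.
    """
    if isinstance(obj, dict):
        children = tuple(to_tree(v, str(k)) for k, v in sorted(obj.items()))
    elif isinstance(obj, list):
        children = tuple(to_tree(v, "<item>") for v in obj)
    else:
        children = ((str(obj), ()),)
    return (label, children)


def size(tree):
    """Number of nodes in a tree."""
    return 1 + sum(size(child) for child in tree[1])


@lru_cache(maxsize=None)
def forest_distance(f, g):
    """Ordered edit distance between two forests, with unit costs (assumption)."""
    if not f and not g:
        return 0
    if not f:
        return sum(size(t) for t in g)  # insert every node of g
    if not g:
        return sum(size(t) for t in f)  # delete every node of f
    (v_label, v_children), (w_label, w_children) = f[-1], g[-1]
    # Delete the rightmost root of f; its children take its place.
    delete = forest_distance(f[:-1] + v_children, g) + 1
    # Insert the rightmost root of g; its children take its place.
    insert = forest_distance(f, g[:-1] + w_children) + 1
    # Map the two rightmost roots onto each other, relabelling if needed.
    match = (
        forest_distance(v_children, w_children)
        + forest_distance(f[:-1], g[:-1])
        + (0 if v_label == w_label else 1)
    )
    return min(delete, insert, match)


def ted(a, b):
    """Tree edit distance between two (label, children) trees."""
    return forest_distance((a,), (b,))


def ted_accuracy(prediction, ground_truth):
    """max(0, 1 - TED(pr, gt) / TED(phi, gt)), with phi the empty tree."""
    pr, gt, phi = to_tree(prediction), to_tree(ground_truth), to_tree({})
    denominator = ted(phi, gt)
    if denominator == 0:  # empty ground truth: only an empty prediction is correct
        return 1.0 if pr == gt else 0.0
    return max(0.0, 1.0 - ted(pr, gt) / denominator)


# Hypothetical label fields, for illustration only.
gt = {"genus": "Quercus", "species": "ilex"}
pred = {"genus": "Quercus", "species": "ilax"}
print(ted_accuracy(pred, gt))  # 0.75: one relabelled leaf out of four non-root nodes
```

Under these assumptions an exact match scores 1.0 and an empty prediction scores 0.0. The memoised recursion is meant for small label structures; a dedicated tree edit distance library would be a better fit for large trees.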