Update README.md
README.md CHANGED
@@ -54,13 +54,13 @@ from ultralytics import YOLO
 
 # Download and load nano model
 model = YOLO(hf_hub_download(
-    repo_id="
+    repo_id="biglam/historic-newspaper-illustrations-yolov11",
     filename="yolo11n.pt"
 ))
 
 # Or download and load small model
 model = YOLO(hf_hub_download(
-    repo_id="
+    repo_id="biglam/historic-newspaper-illustrations-yolov11",
     filename="yolo11s.pt"
 ))
 
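The hunk above completes the `repo_id` arguments. For readers following along, here is a minimal sketch of running detection with the downloaded weights; the README's own inference code (around line 84, referenced in the next hunk) is not shown in this diff, so the predict call and the image path `page.jpg` below are assumptions based on the standard ultralytics API:

```python
from huggingface_hub import hf_hub_download
from ultralytics import YOLO

# Download the nano checkpoint from the Hub and load it
model = YOLO(hf_hub_download(
    repo_id="biglam/historic-newspaper-illustrations-yolov11",
    filename="yolo11n.pt",
))

# "page.jpg" is a hypothetical scanned newspaper page
results = model("page.jpg")

# Each result carries the detected visual-content regions
for result in results:
    for box in result.boxes:
        # Class name and bounding box in xyxy pixel coordinates
        print(result.names[int(box.cls)], box.xyxy[0].tolist())
```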
@@ -84,19 +84,19 @@ for result in results:
 
 ## 📰 Use Case
 
-The model is
+The model is intended to extract visual content from historical newspapers.
 
 ## 📊 Dataset
 
-The models were trained on the [Beyond Words dataset](https://labs.loc.gov/work/experiments/beyond-words/), which was created via crowdsourcing and augmented with expert-
+The models were trained on the [Beyond Words dataset](https://labs.loc.gov/work/experiments/beyond-words/), which was created via crowdsourcing and augmented with expert-labelled annotations for headlines and advertisements. The dataset consists of:
 
 - 3,559 annotated newspaper pages (train+val)
-- 48,409
+- 48,409 labelled visual content regions
 - 7 content categories
 
 Annotations are provided in COCO format and were aligned with METS/ALTO OCR for further downstream use.
 
-It's important to note that the training data does not represent a representative sample of historical newspaper content. The training data is drawn from the Library of Congress's _Chronicling America_ collection, which is focused on American newspapers from the late 19th and early 20th centuries. This also means that the models are likely to do much better on images from this
+It's important to note that the training data is not a representative sample of historical newspaper content. It is drawn from the Library of Congress's _Chronicling America_ collection, which focuses on American newspapers from the late 19th and early 20th centuries. This also means the models are likely to perform much better on images from this period that were digitised in a similar way.
 
 ## 🧠 Model Details
 
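The hunk above notes that the annotations are provided in COCO format. A minimal sketch of inspecting them with pycocotools (the editor's choice of library, not named in the README; the annotation file path is hypothetical):

```python
from pycocotools.coco import COCO

# "annotations.json" is a hypothetical path to the COCO-format file
coco = COCO("annotations.json")

# The 7 content categories described above
for cat in coco.loadCats(coco.getCatIds()):
    print(cat["id"], cat["name"])

# Count pages and labelled regions
print(f"{len(coco.getImgIds())} pages, {len(coco.getAnnIds())} labelled regions")
```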
@@ -194,4 +194,4 @@ If you use this model or dataset, please cite the following:
 
 ## License
 
-This model is released under the AGPL-3.0 license.
+This model is released under the AGPL-3.0 license. The dataset is licensed CC0 (Public Domain).