Update README.md
README.md
CHANGED
@@ -69,7 +69,7 @@ print(output)
 
 ## Training Data
 
-Karamaru was trained using a custom Edo-period text dataset totaling approximately
+Karamaru was trained using a custom Edo-period text dataset totaling approximately 25 million characters.
 1. [Minna de Honkoku](https://www.honkoku.org/) 12 millions characters.
 2. [Kuzushiji Dataset](https://codh.rois.ac.jp/char-shape/) 1 million characters.
 3. [Pre-Modern Japanese Text Dataset](https://codh.rois.ac.jp/pmjt/) 12 million characters using AI Kuzushiji OCR model [RURI](https://codh.rois.ac.jp/miwo/) and using Sakana AI's LLM based [classical Japanese OCR Refiner](https://ipsj.ixsq.nii.ac.jp/records/241512).