tkasasagi committed on
Commit
1741192
·
verified ·
1 Parent(s): e10984f

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -69,7 +69,7 @@ print(output)

  ## Training Data

- Karamaru was trained using a custom Edo-period text dataset totaling approximately 24 million characters.
+ Karamaru was trained using a custom Edo-period text dataset totaling approximately 25 million characters.
  1. [Minna de Honkoku](https://www.honkoku.org/) 12 millions characters.
  2. [Kuzushiji Dataset](https://codh.rois.ac.jp/char-shape/) 1 million characters.
  3. [Pre-Modern Japanese Text Dataset](https://codh.rois.ac.jp/pmjt/) 12 million characters using AI Kuzushiji OCR model [RURI](https://codh.rois.ac.jp/miwo/) and using Sakana AI's LLM based [classical Japanese OCR Refiner](https://ipsj.ixsq.nii.ac.jp/records/241512).