DenisT committed
Commit 7cf86f8 · 1 Parent(s): 62accd1

converted app into gradio application, made faster

.gitattributes CHANGED
@@ -34,3 +34,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 fonts/**/* filter=lfs diff=lfs merge=lfs -text
+model_creation/runs/detect/train5/weights/best.pt filter=lfs diff=lfs merge=lfs -text
+model_creation/runs/detect/train5/weights/last.pt filter=lfs diff=lfs merge=lfs -text
.gitignore CHANGED
@@ -166,3 +166,4 @@ Pipfile.lock
 
 data/
 bounding_box_images/
+image.png
Dockerfile DELETED
@@ -1,29 +0,0 @@
-# read the doc: https://huggingface.co/docs/hub/spaces-sdks-docker
-# you will also find guides on how best to write your Dockerfile
-
-FROM python:3.11
-
-WORKDIR /code
-
-COPY ./requirements.txt /code/requirements.txt
-
-RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt
-
-# Install OpenCV to combat the error: "ImportError: libGL.so.1: cannot open shared object file: No such file or directory"
-RUN apt-get update && apt-get install -y python3-opencv
-RUN pip install opencv-python
-
-COPY . .
-
-RUN useradd -m -u 1000 user
-
-USER user
-
-ENV HOME=/home/user \
-    PATH=/home/user/.local/bin:$PATH
-
-WORKDIR $HOME/app
-
-COPY --chown=user . $HOME/app
-
-CMD ["uvicorn", "server:app", "--host", "0.0.0.0", "--port", "7860"]
README.md CHANGED
@@ -1,10 +1,13 @@
 ---
 title: Manga Translator
+short_description: Translate manga from Japanese to English
+tags: ["manga", "translate", "manga panel"]
 emoji: 📖
 colorFrom: pink
 colorTo: yellow
-sdk: docker
-pinned: false
+sdk: gradio
+pinned: true
+app_file: app.py
 ---
 
 Check out the configuration reference at <https://huggingface.co/docs/hub/spaces-config-reference>
@@ -13,24 +16,28 @@ Check out the configuration reference at <https://huggingface.co/docs/hub/spaces-config-reference>
 
 - [Manga Translator](#manga-translator)
   - [Introduction](#introduction)
+  - [GitHub Project](#github-project)
   - [Approach](#approach)
     - [Data Collection](#data-collection)
    - [Yolov8](#yolov8)
    - [Manga-ocr](#manga-ocr)
    - [Deep-translator](#deep-translator)
-  - [Server](#server)
-  - [Demo](#demo)
 
 ## Introduction
 
 I love reading manga, and I can't wait for the next chapter of my favorite manga to be released. However, the newest chapters are usually in Japanese, and they are translated to English after some time. I want to read the newest chapters as soon as possible, so I decided to build a manga translator that can translate Japanese manga to English.
 
+## GitHub Project
+
+The GitHub repository for this project can be found [here](https://github.com/Detopall/manga-translator).
+
 ## Approach
 
 I want to translate the text in the manga images from Japanese to English. I will first need to know where these speech bubbles are on the image. For this I will use `Yolov8` to detect the speech bubbles. Once I have the speech bubbles, I will use `manga-ocr` to extract the text from the speech bubbles. Finally, I will use `deep-translator` to translate the text from Japanese to English.
 
 ![Manga Translator](./assets/MangaTranslator.png)
 
+
 ### Data Collection
 
 This [dataset](https://universe.roboflow.com/speechbubbledetection-y9yz3/bubble-detection-gbjon/dataset/2#) contains over 8500 images of manga pages together with their annotations from Roboflow. I will use this dataset to train `Yolov8` to detect the speech bubbles in the manga images. To use this dataset with Yolov8, I will need to convert the annotations to the YOLO format, which is a text file containing the class label and the bounding box coordinates of the object in the image.
@@ -50,34 +57,3 @@ Optical character recognition for Japanese text, with the main focus being Japan
 ### Deep-translator
 
 `Deep-translator` is a Python package that uses the Google Translate API to translate text from one language to another. I will use `deep-translator` to translate the text extracted from the manga images from Japanese to English.
-
-## Server
-
-I created a simple server and client using FastAPI. The server will receive the manga image from the client, detect the speech bubbles, extract the text from the speech bubbles, and translate the text from Japanese to English. The server will then send the translated text back to the client.
-
-To run the server, you will need to install the required packages. You can do this by running the following command:
-
-```bash
-pip install -r requirements.txt
-```
-
-You can then start the server by running the following command:
-
-```bash
-python app.py
-```
-
-The server will start running on `http://localhost:8000`. You can then send a POST request to `http://localhost:8000/predict` with the manga image as the request body.
-
-```json
-POST /predict
-{
-    "image": "base64_encoded_image"
-}
-```
-
-## Demo
-
-The following video is a screen recording of the client sending a manga image to the server, and the server detecting the speech bubbles, extracting the text, and translating the text from Japanese to English.
-
-[![Manga Translator](./assets/MangaTranslator.png)](https://www.youtube.com/watch?v=P0VZu4whrz4)
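Note on the YOLO label format mentioned in the README's Data Collection section: each image gets a sibling `.txt` file with one line per object, `class x_center y_center width height`, all coordinates normalized to the image dimensions. A minimal conversion sketch (the helper name and the pixel-box input are illustrative assumptions, not code from this repository):

```python
# Hypothetical helper: convert a pixel-space box (x1, y1, x2, y2) into a
# YOLO label line, assuming class 0 is the speech-bubble class.
def to_yolo_line(x1, y1, x2, y2, img_w, img_h, cls=0):
    x_c = (x1 + x2) / 2 / img_w   # normalized box center
    y_c = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w         # normalized box size
    h = (y2 - y1) / img_h
    return f"{cls} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

print(to_yolo_line(120, 40, 360, 200, img_w=800, img_h=1200))
# -> 0 0.300000 0.100000 0.300000 0.133333
```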
 
app.py ADDED
@@ -0,0 +1,46 @@
+import numpy as np
+from PIL import Image
+import gradio as gr
+
+from main import predict
+
+def process_image(image):
+    if image is not None:
+        if not isinstance(image, np.ndarray):
+            image = np.array(Image.open(image))
+        print(image)
+
+        translated_image = predict(image)
+        return translated_image
+    return None
+
+
+with gr.Blocks() as demo:
+    gr.Markdown(
+        """
+        <div style="display: flex; align-items: center; flex-direction: row; justify-content: center; margin-bottom: 20px; text-align: center;">
+            <a href="https://github.com/Detopall/manga-translator" target="_blank" rel="noopener noreferrer" style="text-decoration: none;">
+                <h1 style="display: inline; margin-left: 10px; text-decoration: underline;">Manga Translator</h1>
+            </a>
+        </div>
+        """
+    )
+
+    with gr.Row():
+        with gr.Column(scale=1):
+            image_input = gr.Image()
+            submit_button = gr.Button("Translate")
+        with gr.Column(scale=1):
+            image_output = gr.Image()
+
+    submit_button.click(process_image, inputs=image_input, outputs=image_output)
+
+    examples = gr.Examples(examples=[
+        ["./examples/ex1.jpg"],
+        ["./examples/ex2.jpg"],
+        ["./examples/ex3.jpg"],
+        ["./examples/ex4.jpg"],
+    ], inputs=image_input)
+
+if __name__ == "__main__":
+    demo.launch()
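The new `app.py` exposes the pipeline through a Gradio `Blocks` UI: `process_image` accepts either a numpy array or a file path (per the `isinstance` check) and returns the translated page. A minimal local smoke test, assuming the repo's example images are present and you run from the repo root (importing `app` also loads the YOLO and OCR models, so the first call is slow):

```python
# Sketch: exercise the Gradio callback directly, without launching the UI.
import numpy as np
from PIL import Image

from app import process_image  # assumption: executed from the repo root

result = process_image(np.array(Image.open("./examples/ex1.jpg")))
if result is not None:
    Image.fromarray(result).save("translated_ex1.png")
```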
assets/MangaTranslator.png DELETED
Binary file (413 kB)
 
examples/ex1.jpg ADDED
examples/ex2.jpg ADDED
examples/ex3.jpg ADDED
examples/ex4.jpg ADDED
fonts/mangat.ttf CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:7de5680ca9a17be79d6b311c1010865e4352da94f3512f6f1738111381a59a26
-size 29964
+oid sha256:da397371e46e5ee93be5f59478a667c3a2c2434754a60624561034e18c8beaa9
+size 32756
main.py ADDED
@@ -0,0 +1,53 @@
+import io
+import base64
+
+import numpy as np
+from PIL import Image
+from ultralytics import YOLO
+
+from utils.predict_bounding_boxes import predict_bounding_boxes
+from utils.manga_ocr_utils import get_text_from_image
+from utils.translate_manga import translate_manga
+from utils.process_contour import process_contour
+from utils.write_text_on_image import add_text
+
+MODEL_PATH = "./model_creation/runs/detect/train5/weights/best.pt"
+object_detection_model = YOLO(MODEL_PATH)
+
+def extract_text_from_regions(image: np.ndarray, results: list):
+
+    for result in results:
+        x1, y1, x2, y2, _, _ = result
+        detected_image = image[int(y1):int(y2), int(x1):int(x2)]
+        if detected_image.shape[-1] == 4:
+            detected_image = detected_image[:, :, :3]
+        im = Image.fromarray(np.uint8(detected_image * 255))
+        text = get_text_from_image(im)
+
+        processed_image, cont = process_contour(detected_image)
+        translated_text = translate_manga(text, source_lang="auto", target_lang="en")
+        add_text(processed_image, translated_text, cont)
+
+
+def convert_image_to_base64(image: Image.Image) -> str:
+    buff = io.BytesIO()
+    image.save(buff, format="PNG")
+    return base64.b64encode(buff.getvalue()).decode("utf-8")
+
+
+def predict(image: np.ndarray):
+
+    image = Image.fromarray(image)
+    image.save("image.png")
+
+    try:
+        np_image = np.array(image)
+
+        results = predict_bounding_boxes(object_detection_model, "image.png")
+        extract_text_from_regions(np_image, results)
+
+        return np_image
+
+    except Exception as e:
+        print(f"Error: {str(e)}")
+        return None
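One detail worth noting in `main.py`: `extract_text_from_regions` returns nothing, yet `predict` hands back a translated page. That only works if the per-bubble drawing mutates the original array, e.g. because `detected_image` is a numpy slice *view* into `np_image` (this assumes `process_contour` and `add_text` operate on that buffer in place, which this diff does not show). A toy illustration of the view behavior (not repo code):

```python
# numpy slicing returns a view, so writing into a cropped region also
# updates the full page array.
import numpy as np

page = np.zeros((4, 4), dtype=np.uint8)
bubble = page[1:3, 1:3]   # a view into `page`, not a copy
bubble[:] = 255           # "draw" into the crop
print(page[1, 1])         # 255 -- the full page saw the change
```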
model_creation/{011.jpg → 011.png} RENAMED
File without changes
requirements.txt CHANGED
@@ -1,10 +1,5 @@
-ipykernel==6.29.4
 pillow==10.3.0
-ultralytics==8.2.23
-manga-ocr==0.1.11
-googletrans==4.0.0-rc1
+ultralytics==8.3.78
+manga-ocr==0.1.14
 deep-translator==1.11.4
-fastapi==0.110.3
-uvicorn==0.30.0
-opencv-python==4.9.0.80
-numpy==1.26.4
+torch==2.6.0
server.py DELETED
@@ -1,104 +0,0 @@
-"""
-This file contains the FastAPI application that serves the web interface and handles the API requests.
-"""
-
-import os
-import io
-import base64
-from typing import Dict
-
-import numpy as np
-from fastapi import FastAPI
-from fastapi import status
-from fastapi.middleware.cors import CORSMiddleware
-from fastapi.staticfiles import StaticFiles
-from fastapi.responses import JSONResponse
-from fastapi.templating import Jinja2Templates
-from starlette.requests import Request
-from PIL import Image
-import uvicorn
-from ultralytics import YOLO
-
-from utils.predict_bounding_boxes import predict_bounding_boxes
-from utils.manga_ocr import get_text_from_image
-from utils.translate_manga import translate_manga
-from utils.process_contour import process_contour
-from utils.write_text_on_image import add_text
-
-
-# Load the object detection model
-best_model_path = "./model_creation/runs/detect/train5"
-object_detection_model = YOLO(os.path.join(best_model_path, "weights/best.pt"))
-
-app = FastAPI()
-
-# Add CORS middleware
-app.add_middleware(
-    CORSMiddleware,
-    allow_origins=["*"],
-    allow_methods=["*"],
-    allow_headers=["*"]
-)
-
-# Serve static files and templates
-app.mount("/static", StaticFiles(directory="static"), name="static")
-app.mount("/fonts", StaticFiles(directory="fonts"), name="fonts")
-templates = Jinja2Templates(directory="templates")
-
-@app.get("/")
-def home(request: Request):
-    return templates.TemplateResponse("index.html", {"request": request})
-
-
-@app.post("/predict")
-def predict(request: Dict):
-    try:
-
-        image = request["image"]
-
-        # Decode base64-encoded image
-        image = base64.b64decode(image)
-        image = Image.open(io.BytesIO(image))
-        image_path = "image.png"
-        translated_image_path = "translated_image.png"
-
-        # Save the image locally
-        image.save(image_path)
-
-        results = predict_bounding_boxes(object_detection_model, image_path)
-        image = np.array(image)
-
-        for result in results:
-            x1, y1, x2, y2, _, _ = result
-            detected_image = image[int(y1):int(y2), int(x1):int(x2)]
-            im = Image.fromarray(np.uint8((detected_image)*255))
-            text = get_text_from_image(im)
-            detected_image, cont = process_contour(detected_image)
-            text_translated = translate_manga(text)
-            add_text(detected_image, text_translated, cont)
-
-        # Display the translated image
-        result_image = Image.fromarray(image, 'RGB')
-        result_image.save(translated_image_path)
-
-        # Convert the image to base64
-        buff = io.BytesIO()
-        result_image.save(buff, format="PNG")
-        img_str = base64.b64encode(buff.getvalue()).decode("utf-8")
-
-        # Clean up
-        os.remove(image_path)
-        os.remove(translated_image_path)
-
-        return {"image": img_str}
-    except Exception as e:
-        # Return with status code 500 (Internal Server Error) if an error occurs
-        return JSONResponse(
-            status_code=500,
-            content={
-                "code": status.HTTP_500_INTERNAL_SERVER_ERROR,
-                "message": "Internal Server Error"}
-        )
-
-if __name__ == '__main__':
-    uvicorn.run('app:app', host='localhost', port=8000, reload=True)
static/index.js DELETED
@@ -1,81 +0,0 @@
-"use strict";
-
-const fileInput = document.getElementById('fileInput');
-const translateButton = document.getElementById('translateButton');
-const spinner = document.getElementById('spinner');
-const inputImage = document.getElementById('inputImage');
-const outputImage = document.getElementById('outputImage');
-const downloadButton = document.getElementById('downloadButton');
-
-downloadButton.style.display = 'none';
-
-fileInput.addEventListener('change', () => {
-    if (fileInput.files.length === 0) {
-        alert('Please select an image file.');
-        return;
-    }
-
-    // Clear the previous images
-    inputImage.src = '';
-    outputImage.src = '';
-
-    const file = fileInput.files[0];
-    const reader = new FileReader();
-
-    reader.onload = function () {
-        const base64Image = reader.result.split(',')[1];
-        inputImage.src = `data:image/jpeg;base64,${base64Image}`;
-        inputImage.style.display = 'block';
-    };
-
-    reader.readAsDataURL(file);
-});
-
-async function predict() {
-    if (fileInput.files.length === 0) {
-        alert('Please select an image file.');
-        return;
-    }
-
-    const file = fileInput.files[0];
-    const reader = new FileReader();
-
-    reader.onloadend = async function () {
-        const base64Image = reader.result.split(',')[1];
-
-        const response = await fetch('/predict', {
-            method: 'POST',
-            headers: {
-                'Content-Type': 'application/json'
-            },
-            body: JSON.stringify({ image: base64Image })
-        });
-
-        const result = await response.json();
-        if (response.status !== 200) {
-            alert(result.message);
-
-            // Reset the input
-            fileInput.value = '';
-            inputImage.style.display = 'none';
-            outputImage.style.display = 'none';
-            spinner.style.display = 'none';
-            downloadButton.style.display = 'none';
-            translateButton.style.display = 'block';
-            return;
-        }
-
-        outputImage.src = `data:image/jpeg;base64,${result.image}`;
-        outputImage.style.display = 'block';
-        downloadButton.querySelector('a').href = outputImage.src;
-        downloadButton.style.display = 'block';
-
-        translateButton.style.display = 'inline-block';
-        spinner.style.display = 'none';
-    };
-
-    reader.readAsDataURL(file);
-
-    translateButton.style.display = 'none';
-    spinner.style.display = 'block';
-}
static/styles.css DELETED
@@ -1,113 +0,0 @@
-@font-face {
-    font-family: "MangaFont";
-    src: url("../fonts/mangat.ttf") format("truetype");
-}
-
-body {
-    font-family: "MangaFont", Arial, sans-serif;
-    text-align: center;
-    background-color: #f0f0f0;
-    margin: 0;
-    padding: 0;
-}
-
-header {
-    background-color: #4caf50;
-    color: white;
-    padding: 10px 0;
-}
-
-a {
-    color: white;
-    text-decoration: none;
-}
-
-.container {
-    padding: 20px;
-}
-
-.actions {
-    display: flex;
-    justify-content: center;
-    align-items: center;
-    flex-flow: column wrap;
-    gap: 1rem;
-}
-
-input[type="file"] {
-    margin: 20px 0;
-    padding: 10px;
-    border: 2px solid #4caf50;
-    border-radius: 5px;
-    background-color: #fff;
-    cursor: pointer;
-    transition: border-color 0.3s;
-}
-
-input[type="file"]:hover {
-    border-color: #45a049;
-}
-
-button {
-    padding: 10px 20px;
-    background-color: #4caf50;
-    color: white;
-    border: none;
-    border-radius: 5px;
-    cursor: pointer;
-    font-size: 16px;
-    transition: background-color 0.3s, transform 0.3s;
-}
-
-button:hover {
-    background-color: #45a049;
-    transform: scale(1.05);
-}
-
-.spinner {
-    border: 16px solid #f3f3f3;
-    border-top: 16px solid #4caf50;
-    border-radius: 50%;
-    width: 50px;
-    height: 50px;
-    animation: spin 2s linear infinite;
-    margin: 20px auto;
-}
-
-@keyframes spin {
-    0% {
-        transform: rotate(0deg);
-    }
-    100% {
-        transform: rotate(360deg);
-    }
-}
-
-.images-container {
-    display: flex;
-    justify-content: space-around;
-    margin-top: 20px;
-}
-
-.image-wrapper {
-    width: 45%;
-}
-
-.image-wrapper h3 {
-    margin-bottom: 10px;
-}
-
-#fileInput::file-selector-button {
-    padding: 10px 20px;
-    background-color: #4caf50;
-    color: white;
-    border: none;
-    border-radius: 5px;
-    cursor: pointer;
-    transition: background-color 0.3s, transform 0.3s;
-}
-
-#fileInput::file-selector-button:hover {
-    background-color: #45a049;
-    transform: scale(1.05);
-}
templates/index.html DELETED
@@ -1,51 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-    <head>
-        <meta charset="UTF-8" />
-        <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-        <title>Manga Translator</title>
-        <link rel="stylesheet" href="/static/styles.css" />
-    </head>
-    <body>
-        <header>
-            <h1>Manga Translator</h1>
-            <p>
-                Translate your manga panels from <b>Japanese</b> to
-                <b>English</b>!
-            </p>
-            <p>
-                Make sure the image is clear, black and white, and has text in
-                Japanese.
-            </p>
-        </header>
-        <div class="container">
-            <div class="actions">
-                <input type="file" id="fileInput" accept="image/*" />
-                <button id="translateButton" onclick="predict()">
-                    Translate
-                </button>
-                <button id="downloadButton" style="display: none">
-                    <a href="#" download="translated_manga.jpg">
-                        Download Translated Image
-                    </a>
-                </button>
-            </div>
-            <div id="spinner" class="spinner" style="display: none"></div>
-            <div class="images-container">
-                <div class="image-wrapper">
-                    <h3>Original Image</h3>
-                    <img id="inputImage" style="max-width: 100%" />
-                </div>
-                <div class="image-wrapper">
-                    <h3>Translated Image</h3>
-                    <img
-                        id="outputImage"
-                        alt="Translated Manga"
-                        style="max-width: 100%; display: none"
-                    />
-                </div>
-            </div>
-        </div>
-        <script src="/static/index.js"></script>
-    </body>
-</html>
utils/{manga_ocr.py → manga_ocr_utils.py} RENAMED
@@ -4,11 +4,16 @@ This module is used to extract text from images using manga_ocr.
 
 from manga_ocr import MangaOcr
 
+mocr = MangaOcr()
 
 def get_text_from_image(image):
     """
     Extract text from images using manga_ocr.
     """
-    mocr = MangaOcr()
 
-    return mocr(image)
+    try:
+        result = mocr(image)
+        return result
+    except Exception as e:
+        print(f"An error occurred: {str(e)}")
+        return None
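The rename aside, the substantive change here is hoisting `MangaOcr()` to module scope: the OCR model now loads once at import instead of on every call, which is presumably the "made faster" part of this commit. A sketch of the difference, assuming an example image exists at the path shown:

```python
# Before: every get_text_from_image call paid the model-loading cost.
# After: construction happens once; each call only runs inference.
from manga_ocr import MangaOcr
from PIL import Image

mocr = MangaOcr()                              # slow: loads weights once
text = mocr(Image.open("./examples/ex1.jpg"))  # fast per call
print(text)
```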
utils/predict_bounding_boxes.py CHANGED
@@ -31,10 +31,9 @@ def predict_bounding_boxes(model: YOLO, image_path: str) -> List:
         label = result.names[box.cls[0].item()]
         coords = [round(x) for x in box.xyxy[0].tolist()]
         prob = round(box.conf[0].item(), 4)
-        print("Object: {}\nCoordinates: {}\nProbability: {}".format(label, coords, prob))
         cropped_image = image.crop(coords)
 
         # save each image under a unique name
         cropped_image.save(f"{bounding_box_images_path}/{uuid.uuid4()}.png")
-
+
     return result.boxes.data.tolist()
utils/translate_manga.py CHANGED
@@ -1,15 +1,21 @@
 """
-This module is used to translate manga from Japanese to English.
+This module is used to translate manga from one language to another.
 """
 
 from deep_translator import GoogleTranslator
 
-def translate_manga(text: str) -> str:
-    """
-    Translate manga from Japanese to English.
-    """
-    translated_text = GoogleTranslator(source="ja", target="en").translate(text)
-    print("Original text:", text)
-    print("Translated text:", translated_text)
-
-    return translated_text
+
+def translate_manga(text: str, source_lang: str = "ja", target_lang: str = "en") -> str:
+    """
+    Translate manga from one language to another.
+    """
+
+    if source_lang == target_lang:
+        return text
+
+    translated_text = GoogleTranslator(
+        source=source_lang, target=target_lang).translate(text)
+    print("Original text:", text)
+    print("Translated text:", translated_text)
+
+    return translated_text
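A quick usage check of the new signature (requires network access for Google Translate; the sample string and its translation are illustrative, not repo output):

```python
from utils.translate_manga import translate_manga

print(translate_manga("おはよう", source_lang="ja", target_lang="en"))
# e.g. "Good morning"
print(translate_manga("hello", source_lang="en", target_lang="en"))
# "hello" -- short-circuits when source and target match
```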