Commit 8073d84 by priteshmistry · verified · 1 parent: 4d81aed

Upload 16 files

.dockerignore ADDED
@@ -0,0 +1,25 @@
+ # Ignore Git repository files
+ .git/
+
+ # Ignore Docker-related files (not needed in the image)
+ # Dockerfile
+ # .dockerignore
+
+ # Ignore cache and temporary files
+ .cache
+ tmp/*
+
+ # Ignore all files and subdirectories in models/ EXCEPT version.txt
+ models/*
+ !models/version.txt
+
+ # Ignore virtual environments (if using venv)
+ python_env/
+
+ # Ignore compiled Python bytecode
+ **/__pycache__/
+
+ # Ignore specific directories inside the `audiobooks` project
+ audiobooks/cli/*
+ audiobooks/gui/gradio/*
+ audiobooks/gui/host/*
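The `models/*` rule followed by `!models/version.txt` depends on rule order: in a `.dockerignore` file, the last pattern that matches a path decides whether it is excluded. The shell sketch below imitates that last-match-wins behavior for a few of the rules above. The `is_ignored` helper is hypothetical and relies on shell `case` globbing, so it only approximates Docker's real matcher.

```shell
#!/bin/sh
# Hypothetical helper: approximates last-match-wins evaluation for a few
# of the .dockerignore rules above. This is not Docker's actual matcher.
is_ignored() {
    path=$1
    ignored=no
    # Rules are checked in file order; a later match overrides an earlier one.
    case $path in .git/*) ignored=yes ;; esac
    case $path in models/*) ignored=yes ;; esac
    case $path in models/version.txt) ignored=no ;; esac  # the "!" exception
    case $path in python_env/*) ignored=yes ;; esac
    echo "$ignored"
}

# Show the decision for a few made-up paths.
for p in models/xtts/model.pth models/version.txt app.py; do
    printf '%s -> %s\n' "$p" "$(is_ignored "$p")"
done
```

If the two `models` rules were swapped, the exception would be overridden and `models/version.txt` would be left out of the build context.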
CODE_OF_CONDUCT.md ADDED
@@ -0,0 +1,128 @@
+ # Contributor Covenant Code of Conduct
+
+ ## Our Pledge
+
+ We as members, contributors, and leaders pledge to make participation in our
+ community a harassment-free experience for everyone, regardless of age, body
+ size, visible or invisible disability, ethnicity, sex characteristics, gender
+ identity and expression, level of experience, education, socio-economic status,
+ nationality, personal appearance, race, religion, or sexual identity
+ and orientation.
+
+ We pledge to act and interact in ways that contribute to an open, welcoming,
+ diverse, inclusive, and healthy community.
+
+ ## Our Standards
+
+ Examples of behavior that contributes to a positive environment for our
+ community include:
+
+ * Demonstrating empathy and kindness toward other people
+ * Being respectful of differing opinions, viewpoints, and experiences
+ * Giving and gracefully accepting constructive feedback
+ * Accepting responsibility and apologizing to those affected by our mistakes,
+   and learning from the experience
+ * Focusing on what is best not just for us as individuals, but for the
+   overall community
+
+ Examples of unacceptable behavior include:
+
+ * The use of sexualized language or imagery, and sexual attention or
+   advances of any kind
+ * Trolling, insulting or derogatory comments, and personal or political attacks
+ * Public or private harassment
+ * Publishing others' private information, such as a physical or email
+   address, without their explicit permission
+ * Other conduct which could reasonably be considered inappropriate in a
+   professional setting
+
+ ## Enforcement Responsibilities
+
+ Community leaders are responsible for clarifying and enforcing our standards of
+ acceptable behavior and will take appropriate and fair corrective action in
+ response to any behavior that they deem inappropriate, threatening, offensive,
+ or harmful.
+
+ Community leaders have the right and responsibility to remove, edit, or reject
+ comments, commits, code, wiki edits, issues, and other contributions that are
+ not aligned to this Code of Conduct, and will communicate reasons for moderation
+ decisions when appropriate.
+
+ ## Scope
+
+ This Code of Conduct applies within all community spaces, and also applies when
+ an individual is officially representing the community in public spaces.
+ Examples of representing our community include using an official e-mail address,
+ posting via an official social media account, or acting as an appointed
+ representative at an online or offline event.
+
+ ## Enforcement
+
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
+ reported to the community leaders responsible for enforcement at
+ email.
+ All complaints will be reviewed and investigated promptly and fairly.
+
+ All community leaders are obligated to respect the privacy and security of the
+ reporter of any incident.
+
+ ## Enforcement Guidelines
+
+ Community leaders will follow these Community Impact Guidelines in determining
+ the consequences for any action they deem in violation of this Code of Conduct:
+
+ ### 1. Correction
+
+ **Community Impact**: Use of inappropriate language or other behavior deemed
+ unprofessional or unwelcome in the community.
+
+ **Consequence**: A private, written warning from community leaders, providing
+ clarity around the nature of the violation and an explanation of why the
+ behavior was inappropriate. A public apology may be requested.
+
+ ### 2. Warning
+
+ **Community Impact**: A violation through a single incident or series
+ of actions.
+
+ **Consequence**: A warning with consequences for continued behavior. No
+ interaction with the people involved, including unsolicited interaction with
+ those enforcing the Code of Conduct, for a specified period of time. This
+ includes avoiding interactions in community spaces as well as external channels
+ like social media. Violating these terms may lead to a temporary or
+ permanent ban.
+
+ ### 3. Temporary Ban
+
+ **Community Impact**: A serious violation of community standards, including
+ sustained inappropriate behavior.
+
+ **Consequence**: A temporary ban from any sort of interaction or public
+ communication with the community for a specified period of time. No public or
+ private interaction with the people involved, including unsolicited interaction
+ with those enforcing the Code of Conduct, is allowed during this period.
+ Violating these terms may lead to a permanent ban.
+
+ ### 4. Permanent Ban
+
+ **Community Impact**: Demonstrating a pattern of violation of community
+ standards, including sustained inappropriate behavior, harassment of an
+ individual, or aggression toward or disparagement of classes of individuals.
+
+ **Consequence**: A permanent ban from any sort of public interaction within
+ the community.
+
+ ## Attribution
+
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage],
+ version 2.0, available at
+ https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
+
+ Community Impact Guidelines were inspired by [Mozilla's code of conduct
+ enforcement ladder](https://github.com/mozilla/diversity).
+
+ [homepage]: https://www.contributor-covenant.org
+
+ For answers to common questions about this code of conduct, see the FAQ at
+ https://www.contributor-covenant.org/faq. Translations are available at
+ https://www.contributor-covenant.org/translations.
Dockerfile ADDED
@@ -0,0 +1,127 @@
+ ARG BASE=python:3.12
+ ARG BASE_IMAGE=base
+ FROM ${BASE} AS base
+
+ # Set environment PATH for local installations
+ ENV PATH="/root/.local/bin:$PATH"
+ # Set non-interactive mode to prevent tzdata prompt
+ ENV DEBIAN_FRONTEND=noninteractive
+ # Install system packages
+ RUN apt-get update && \
+     apt-get install -y gcc g++ make wget git calibre ffmpeg libmecab-dev mecab mecab-ipadic-utf8 libsndfile1-dev libc-dev curl espeak-ng sox && \
+     curl -fsSL https://deb.nodesource.com/setup_18.x | bash - && \
+     apt-get install -y nodejs && \
+     apt-get clean && \
+     rm -rf /var/lib/apt/lists/*
+ # Install Rust compiler
+ RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
+ ENV PATH="/root/.cargo/bin:${PATH}"
+ # Copy the application
+ WORKDIR /app
+ COPY . /app
+ # Install UniDic (non-torch dependent)
+ RUN pip install --no-cache-dir unidic-lite unidic && \
+     python3 -m unidic download && \
+     mkdir -p /root/.local/share/unidic
+ ENV UNIDIC_DIR=/root/.local/share/unidic
+
+ # Second stage for PyTorch installation + swappable base image if you want to use a pulled image
+ FROM $BASE_IMAGE AS pytorch
+ # Add parameter for PyTorch version with a default empty value
+ ARG TORCH_VERSION=""
+ # Add parameter to control whether to skip the XTTS test
+ ARG SKIP_XTTS_TEST="false"
+
+
+ # Extract torch versions from requirements.txt (empty strings if not found), then install PyTorch.
+ # Both steps share one RUN because shell variables do not persist across RUN instructions.
+ RUN TORCH_VERSION_REQ=$(grep -E "^torch==" requirements.txt | cut -d'=' -f3 || echo "") && \
+     TORCHAUDIO_VERSION_REQ=$(grep -E "^torchaudio==" requirements.txt | cut -d'=' -f3 || echo "") && \
+     TORCHVISION_VERSION_REQ=$(grep -E "^torchvision==" requirements.txt | cut -d'=' -f3 || echo "") && \
+     echo "Found in requirements: torch==$TORCH_VERSION_REQ torchaudio==$TORCHAUDIO_VERSION_REQ torchvision==$TORCHVISION_VERSION_REQ" && \
+     # Install PyTorch with CUDA support if specified
+     if [ ! -z "$TORCH_VERSION" ]; then \
+         # Check if we need to use specific versions or get the latest
+         if [ ! -z "$TORCH_VERSION_REQ" ] && [ ! -z "$TORCHVISION_VERSION_REQ" ] && [ ! -z "$TORCHAUDIO_VERSION_REQ" ]; then \
+             echo "Using specific versions from requirements.txt" && \
+             TORCH_SPEC="torch==${TORCH_VERSION_REQ}" && \
+             TORCHVISION_SPEC="torchvision==${TORCHVISION_VERSION_REQ}" && \
+             TORCHAUDIO_SPEC="torchaudio==${TORCHAUDIO_VERSION_REQ}"; \
+         else \
+             echo "Using latest versions for the selected variant" && \
+             TORCH_SPEC="torch" && \
+             TORCHVISION_SPEC="torchvision" && \
+             TORCHAUDIO_SPEC="torchaudio"; \
+         fi && \
+         \
+         # Check if TORCH_VERSION contains "cuda" and extract version number
+         if echo "$TORCH_VERSION" | grep -q "cuda"; then \
+             CUDA_VERSION=$(echo "$TORCH_VERSION" | sed 's/cuda//g') && \
+             echo "Detected CUDA version: $CUDA_VERSION" && \
+             echo "Attempting to install PyTorch nightly for CUDA $CUDA_VERSION..." && \
+             #if ! pip install --no-cache-dir --pre $TORCH_SPEC $TORCHVISION_SPEC $TORCHAUDIO_SPEC --index-url https://download.pytorch.org/whl/nightly/cu${CUDA_VERSION}; then \
+             if ! pip install --no-cache-dir --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu${CUDA_VERSION}; then \
+                 echo "❌ Nightly build for CUDA $CUDA_VERSION not available or failed" && \
+                 echo "🔄 Trying stable release for CUDA $CUDA_VERSION..." && \
+                 #if pip install --no-cache-dir $TORCH_SPEC $TORCHVISION_SPEC $TORCHAUDIO_SPEC --extra-index-url https://download.pytorch.org/whl/cu${CUDA_VERSION}; then \
+                 if pip install --no-cache-dir torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu${CUDA_VERSION}; then \
+                     echo "✅ Successfully installed stable PyTorch for CUDA $CUDA_VERSION"; \
+                 else \
+                     echo "❌ Both nightly and stable builds failed for CUDA $CUDA_VERSION"; \
+                     echo "💡 This CUDA version may not be supported by PyTorch"; \
+                     exit 1; \
+                 fi; \
+             else \
+                 echo "✅ Successfully installed nightly PyTorch for CUDA $CUDA_VERSION"; \
+             fi; \
+         else \
+             # Handle non-CUDA cases (existing functionality)
+             case "$TORCH_VERSION" in \
+                 "rocm") \
+                     # Using the correct syntax for ROCm PyTorch installation
+                     pip install --no-cache-dir $TORCH_SPEC $TORCHVISION_SPEC $TORCHAUDIO_SPEC --extra-index-url https://download.pytorch.org/whl/rocm6.2 \
+                     ;; \
+                 "xpu") \
+                     # Install PyTorch with Intel XPU support through IPEX
+                     pip install --no-cache-dir $TORCH_SPEC $TORCHVISION_SPEC $TORCHAUDIO_SPEC && \
+                     pip install --no-cache-dir intel-extension-for-pytorch --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/xpu/us/ \
+                     ;; \
+                 "cpu") \
+                     pip install --no-cache-dir $TORCH_SPEC $TORCHVISION_SPEC $TORCHAUDIO_SPEC --extra-index-url https://download.pytorch.org/whl/cpu \
+                     ;; \
+                 *) \
+                     pip install --no-cache-dir $TORCH_VERSION \
+                     ;; \
+             esac; \
+         fi && \
+         # Install remaining requirements, skipping torch packages that might be there
+         grep -v -E "^torch==|^torchvision==|^torchaudio==|^torchvision$" requirements.txt > requirements_no_torch.txt && \
+         pip install --no-cache-dir --upgrade -r requirements_no_torch.txt && \
+         rm requirements_no_torch.txt; \
+     else \
+         # Install all requirements as specified
+         pip install --no-cache-dir --upgrade -r requirements.txt; \
+     fi
+
+ # Do a test run to pre-download and bake base models into the image, but only if SKIP_XTTS_TEST is not true
+ RUN if [ "$SKIP_XTTS_TEST" != "true" ]; then \
+         echo "Running XTTS test to pre-download models..."; \
+         if [ "$TORCH_VERSION" = "xpu" ]; then \
+             TORCH_DEVICE_BACKEND_AUTOLOAD=0 python app.py --headless --ebook test.txt --script_mode full_docker; \
+         else \
+             python app.py --headless --language eng --ebook "tools/workflow-testing/test1.txt" --tts_engine XTTSv2 --script_mode full_docker; \
+         fi; \
+     else \
+         echo "Skipping XTTS test run as requested."; \
+     fi
+
+
+ # Expose the required port
+ EXPOSE 7860
+ # Start the Gradio app with the required flag
+ ENTRYPOINT ["python", "app.py", "--script_mode", "full_docker"]
+
+
+ #docker build --pull --build-arg BASE_IMAGE=athomasson2/ebook2audiobook:latest -t your-image-name .
+ #The --pull flag forces Docker to always try to pull the latest version of the image, even if it already exists locally.
+ #Without --pull, Docker will only use the local version if it exists, which might not be the latest.
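The Dockerfile derives two values with plain text tools: the pinned package versions from `requirements.txt` (`grep` plus `cut`) and the CUDA suffix of the wheel index URL from `TORCH_VERSION` (`sed`). The sketch below isolates both steps so they can be tried outside a build. The helper names `pinned_version` and `cuda_suffix`, the sample pins, and the `cuda124` value are illustrative assumptions, not part of the commit.

```shell
#!/bin/sh
# Hypothetical helper: print the version pinned for a package ("name==x.y.z")
# in a requirements file, or nothing if the package is not pinned.
pinned_version() {
    pkg=$1
    file=$2
    # Splitting "torch==2.1.0" on "=" yields "torch", "", "2.1.0",
    # so the version is field 3.
    grep -E "^${pkg}==" "$file" | cut -d'=' -f3
}

# Hypothetical helper: turn a TORCH_VERSION value such as "cuda124" into the
# suffix appended to ".../whl/cu" in the wheel index URL.
cuda_suffix() {
    echo "$1" | sed 's/cuda//g'
}

# Demo with a throwaway requirements file (the pins are made up).
req=$(mktemp)
printf 'torch==2.1.0\ntorchaudio==2.1.0\ngradio\n' > "$req"

echo "torch pin:       $(pinned_version torch "$req")"
echo "torchvision pin: $(pinned_version torchvision "$req")"
echo "index URL:       https://download.pytorch.org/whl/cu$(cuda_suffix cuda124)"
rm -f "$req"
```

Inside a Dockerfile, values computed this way exist only within the `RUN` instruction that sets them, because each `RUN` starts a fresh shell. Any step that consumes them has to live in the same `RUN`.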
LICENSE ADDED
@@ -0,0 +1,201 @@
+ Apache License
+ Version 2.0, January 2004
+ http://www.apache.org/licenses/
+
+ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+ 1. Definitions.
+
+ "License" shall mean the terms and conditions for use, reproduction,
+ and distribution as defined by Sections 1 through 9 of this document.
+
+ "Licensor" shall mean the copyright owner or entity authorized by
+ the copyright owner that is granting the License.
+
+ "Legal Entity" shall mean the union of the acting entity and all
+ other entities that control, are controlled by, or are under common
+ control with that entity. For the purposes of this definition,
+ "control" means (i) the power, direct or indirect, to cause the
+ direction or management of such entity, whether by contract or
+ otherwise, or (ii) ownership of fifty percent (50%) or more of the
+ outstanding shares, or (iii) beneficial ownership of such entity.
+
+ "You" (or "Your") shall mean an individual or Legal Entity
+ exercising permissions granted by this License.
+
+ "Source" form shall mean the preferred form for making modifications,
+ including but not limited to software source code, documentation
+ source, and configuration files.
+
+ "Object" form shall mean any form resulting from mechanical
+ transformation or translation of a Source form, including but
+ not limited to compiled object code, generated documentation,
+ and conversions to other media types.
+
+ "Work" shall mean the work of authorship, whether in Source or
+ Object form, made available under the License, as indicated by a
+ copyright notice that is included in or attached to the work
+ (an example is provided in the Appendix below).
+
+ "Derivative Works" shall mean any work, whether in Source or Object
+ form, that is based on (or derived from) the Work and for which the
+ editorial revisions, annotations, elaborations, or other modifications
+ represent, as a whole, an original work of authorship. For the purposes
+ of this License, Derivative Works shall not include works that remain
+ separable from, or merely link (or bind by name) to the interfaces of,
+ the Work and Derivative Works thereof.
+
+ "Contribution" shall mean any work of authorship, including
+ the original version of the Work and any modifications or additions
+ to that Work or Derivative Works thereof, that is intentionally
+ submitted to Licensor for inclusion in the Work by the copyright owner
+ or by an individual or Legal Entity authorized to submit on behalf of
+ the copyright owner. For the purposes of this definition, "submitted"
+ means any form of electronic, verbal, or written communication sent
+ to the Licensor or its representatives, including but not limited to
+ communication on electronic mailing lists, source code control systems,
+ and issue tracking systems that are managed by, or on behalf of, the
+ Licensor for the purpose of discussing and improving the Work, but
+ excluding communication that is conspicuously marked or otherwise
+ designated in writing by the copyright owner as "Not a Contribution."
+
+ "Contributor" shall mean Licensor and any individual or Legal Entity
+ on behalf of whom a Contribution has been received by Licensor and
+ subsequently incorporated within the Work.
+
+ 2. Grant of Copyright License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ copyright license to reproduce, prepare Derivative Works of,
+ publicly display, publicly perform, sublicense, and distribute the
+ Work and such Derivative Works in Source or Object form.
+
+ 3. Grant of Patent License. Subject to the terms and conditions of
+ this License, each Contributor hereby grants to You a perpetual,
+ worldwide, non-exclusive, no-charge, royalty-free, irrevocable
+ (except as stated in this section) patent license to make, have made,
+ use, offer to sell, sell, import, and otherwise transfer the Work,
+ where such license applies only to those patent claims licensable
+ by such Contributor that are necessarily infringed by their
+ Contribution(s) alone or by combination of their Contribution(s)
+ with the Work to which such Contribution(s) was submitted. If You
+ institute patent litigation against any entity (including a
+ cross-claim or counterclaim in a lawsuit) alleging that the Work
+ or a Contribution incorporated within the Work constitutes direct
+ or contributory patent infringement, then any patent licenses
+ granted to You under this License for that Work shall terminate
+ as of the date such litigation is filed.
+
+ 4. Redistribution. You may reproduce and distribute copies of the
+ Work or Derivative Works thereof in any medium, with or without
+ modifications, and in Source or Object form, provided that You
+ meet the following conditions:
+
+ (a) You must give any other recipients of the Work or
+ Derivative Works a copy of this License; and
+
+ (b) You must cause any modified files to carry prominent notices
+ stating that You changed the files; and
+
+ (c) You must retain, in the Source form of any Derivative Works
+ that You distribute, all copyright, patent, trademark, and
+ attribution notices from the Source form of the Work,
+ excluding those notices that do not pertain to any part of
+ the Derivative Works; and
+
+ (d) If the Work includes a "NOTICE" text file as part of its
+ distribution, then any Derivative Works that You distribute must
+ include a readable copy of the attribution notices contained
+ within such NOTICE file, excluding those notices that do not
+ pertain to any part of the Derivative Works, in at least one
+ of the following places: within a NOTICE text file distributed
+ as part of the Derivative Works; within the Source form or
+ documentation, if provided along with the Derivative Works; or,
+ within a display generated by the Derivative Works, if and
+ wherever such third-party notices normally appear. The contents
+ of the NOTICE file are for informational purposes only and
+ do not modify the License. You may add Your own attribution
+ notices within Derivative Works that You distribute, alongside
+ or as an addendum to the NOTICE text from the Work, provided
+ that such additional attribution notices cannot be construed
+ as modifying the License.
+
+ You may add Your own copyright statement to Your modifications and
+ may provide additional or different license terms and conditions
+ for use, reproduction, or distribution of Your modifications, or
+ for any such Derivative Works as a whole, provided Your use,
+ reproduction, and distribution of the Work otherwise complies with
+ the conditions stated in this License.
+
+ 5. Submission of Contributions. Unless You explicitly state otherwise,
+ any Contribution intentionally submitted for inclusion in the Work
+ by You to the Licensor shall be under the terms and conditions of
+ this License, without any additional terms or conditions.
+ Notwithstanding the above, nothing herein shall supersede or modify
+ the terms of any separate license agreement you may have executed
+ with Licensor regarding such Contributions.
+
+ 6. Trademarks. This License does not grant permission to use the trade
+ names, trademarks, service marks, or product names of the Licensor,
+ except as required for reasonable and customary use in describing the
+ origin of the Work and reproducing the content of the NOTICE file.
+
+ 7. Disclaimer of Warranty. Unless required by applicable law or
+ agreed to in writing, Licensor provides the Work (and each
+ Contributor provides its Contributions) on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+ implied, including, without limitation, any warranties or conditions
+ of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+ PARTICULAR PURPOSE. You are solely responsible for determining the
+ appropriateness of using or redistributing the Work and assume any
+ risks associated with Your exercise of permissions under this License.
+
+ 8. Limitation of Liability. In no event and under no legal theory,
+ whether in tort (including negligence), contract, or otherwise,
+ unless required by applicable law (such as deliberate and grossly
+ negligent acts) or agreed to in writing, shall any Contributor be
+ liable to You for damages, including any direct, indirect, special,
+ incidental, or consequential damages of any character arising as a
+ result of this License or out of the use or inability to use the
+ Work (including but not limited to damages for loss of goodwill,
+ work stoppage, computer failure or malfunction, or any and all
+ other commercial damages or losses), even if such Contributor
+ has been advised of the possibility of such damages.
+
+ 9. Accepting Warranty or Additional Liability. While redistributing
+ the Work or Derivative Works thereof, You may choose to offer,
+ and charge a fee for, acceptance of support, warranty, indemnity,
+ or other liability obligations and/or rights consistent with this
+ License. However, in accepting such obligations, You may act only
+ on Your own behalf and on Your sole responsibility, not on behalf
+ of any other Contributor, and only if You agree to indemnify,
+ defend, and hold each Contributor harmless for any liability
+ incurred by, or claims asserted against, such Contributor by reason
+ of your accepting any such warranty or additional liability.
+
+ END OF TERMS AND CONDITIONS
+
+ APPENDIX: How to apply the Apache License to your work.
+
+ To apply the Apache License to your work, attach the following
+ boilerplate notice, with the fields enclosed by brackets "[]"
+ replaced with your own identifying information. (Don't include
+ the brackets!) The text should be enclosed in the appropriate
+ comment syntax for the file format. We also recommend that a
+ file or class name and description of purpose be included on the
+ same "printed page" as the copyright notice for easier
+ identification within third-party archives.
+
+ Copyright [yyyy] [name of copyright owner]
+
+ Licensed under the Apache License, Version 2.0 (the "License");
+ you may not use this file except in compliance with the License.
+ You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing, software
+ distributed under the License is distributed on an "AS IS" BASIS,
+ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ See the License for the specific language governing permissions and
+ limitations under the License.
Mac Ebook2Audiobook Launcher.command ADDED
@@ -0,0 +1,9 @@
+ #!/bin/zsh
+ # Prevent Conda from initializing
+ export CONDA_SHLVL=0
+ unset CONDA_PREFIX
+ unset CONDA_DEFAULT_ENV
+ # Change directory to the location of the launcher; stop if that fails
+ cd "$(dirname "$0")" || exit 1
+ # Execute the ebook2audiobook.sh script
+ ./ebook2audiobook.sh
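The launcher changes into its own directory before calling `./ebook2audiobook.sh`, so the relative path resolves no matter where the launcher is started from (for example, by double-clicking it in Finder). A minimal sketch of that idiom follows; the `script_dir` helper and the temporary file are hypothetical and exist only for the demonstration.

```shell
#!/bin/sh
# Hypothetical helper: print the absolute directory that contains the given
# script path, the same idea as cd "$(dirname "$0")" in the launcher.
script_dir() {
    # The subshell keeps the caller's working directory unchanged.
    (cd "$(dirname "$1")" && pwd)
}

# Demo with a throwaway directory standing in for the project folder.
demo=$(mktemp -d)
touch "$demo/launcher.command"
echo "launcher lives in: $(script_dir "$demo/launcher.command")"
rm -rf "$demo"
```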
README.md CHANGED
@@ -1,12 +1,530 @@
1
- ---
2
- title: Ebook2audiobook
3
- emoji: 🏆
4
- colorFrom: indigo
5
- colorTo: purple
6
- sdk: docker
7
- pinned: false
8
- license: mit
9
- short_description: ebook to audiobook
10
- ---
11
-
12
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ # 📚 ebook2audiobook
2
+ CPU/GPU Converter from eBooks to audiobooks with chapters and metadata<br/>
3
+ using XTTSv2, Bark, Vits, Fairseq, YourTTS, Tacotron and more. Supports voice cloning and +1110 languages!
4
+ > [!IMPORTANT]
5
+ **This tool is intended for use with non-DRM, legally acquired eBooks only.** <br>
6
+ The authors are not responsible for any misuse of this software or any resulting legal consequences. <br>
7
+ Use this tool responsibly and in accordance with all applicable laws.
8
+
9
+ [![Discord](https://dcbadge.limes.pink/api/server/https://discord.gg/63Tv3F65k6)](https://discord.gg/63Tv3F65k6)
10
+
11
+ ### Thanks to support ebook2audiobook developers!
12
+ [![Ko-Fi](https://img.shields.io/badge/Ko--fi-F16061?style=for-the-badge&logo=ko-fi&logoColor=white)](https://ko-fi.com/athomasson2)
13
+
14
+ ### Run locally
15
+
16
+ [![Quick Start](https://img.shields.io/badge/Quick%20Start-blue?style=for-the-badge)](#launching-gradio-web-interface)
17
+
18
+ [![Docker Build](https://github.com/DrewThomasson/ebook2audiobook/actions/workflows/Docker-Build.yml/badge.svg)](https://github.com/DrewThomasson/ebook2audiobook/actions/workflows/Docker-Build.yml) [![Download](https://img.shields.io/badge/Download-Now-blue.svg)](https://github.com/DrewThomasson/ebook2audiobook/releases/latest)
19
+
20
+
21
+ <a href="https://github.com/DrewThomasson/ebook2audiobook">
22
+ <img src="https://img.shields.io/badge/Platform-mac%20|%20linux%20|%20windows-lightgrey" alt="Platform">
23
+ </a><a href="https://hub.docker.com/r/athomasson2/ebook2audiobook">
24
+ <img alt="Docker Pull Count" src="https://img.shields.io/docker/pulls/athomasson2/ebook2audiobook.svg"/>
25
+ </a>
26
+
27
+ ### Run Remotely
28
+ [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Spaces-yellow?style=flat&logo=huggingface)](https://huggingface.co/spaces/drewThomasson/ebook2audiobook)
29
+ [![Free Google Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DrewThomasson/ebook2audiobook/blob/main/Notebooks/colab_ebook2audiobook.ipynb) [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=flat&logo=kaggle&logoColor=white)](https://github.com/Rihcus/ebook2audiobookXTTS/blob/main/Notebooks/kaggle-ebook2audiobook.ipynb)
30
+
31
+ #### GUI Interface
32
+ ![demo_web_gui](assets/demo_web_gui.gif)
33
+
34
+ <details>
35
+ <summary>Click to see images of Web GUI</summary>
36
+ <img width="1728" alt="GUI Screen 1" src="assets/gui_1.png">
37
+ <img width="1728" alt="GUI Screen 2" src="assets/gui_2.png">
38
+ <img width="1728" alt="GUI Screen 3" src="assets/gui_3.png">
39
+ </details>
40
+
41
+ ## Demos
42
+
43
+ **New Default Voice Demo**
44
+
45
+ https://github.com/user-attachments/assets/750035dc-e355-46f1-9286-05c1d9e88cea
46
+
47
+ <details>
48
+ <summary>More Demos</summary>
49
+
50
+ **ASMR Voice**
51
+
52
+ https://github.com/user-attachments/assets/68eee9a1-6f71-4903-aacd-47397e47e422
53
+
54
+ **Rainy Day Voice**
55
+
56
+ https://github.com/user-attachments/assets/d25034d9-c77f-43a9-8f14-0d167172b080
57
+
58
+ **Scarlett Voice**
59
+
60
+ https://github.com/user-attachments/assets/b12009ee-ec0d-45ce-a1ef-b3a52b9f8693
61
+
62
+ **David Attenborough Voice**
63
+
64
+ https://github.com/user-attachments/assets/81c4baad-117e-4db5-ac86-efc2b7fea921
65
+
66
+ **Example**
67
+
68
+ ![Example](https://github.com/DrewThomasson/VoxNovel/blob/dc5197dff97252fa44c391dc0596902d71278a88/readme_files/example_in_app.jpeg)
69
+ </details>
70
+
71
+ ## README.md
72
+
73
+ ## Table of Contents
74
+ - [ebook2audiobook](#-ebook2audiobook)
75
+ - [Features](#features)
76
+ - [GUI Interface](#gui-interface)
77
+ - [Demos](#demos)
78
+ - [Supported Languages](#supported-languages)
79
+ - [Minimum Requirements](#hardware-requirements)
80
+ - [Usage](#launching-gradio-web-interface)
81
+ - [Run Locally](#launching-gradio-web-interface)
82
+ - [Launching Gradio Web Interface](#launching-gradio-web-interface)
83
+ - [Basic Headless Usage](#basic--usage)
84
+ - [Headless Custom XTTS Model Usage](#example-of-custom-model-zip-upload)
85
+ - [Help command output](#help-command-output)
86
+ - [Run Remotely](#run-remotely)
87
+ - [Fine Tuned TTS models](#fine-tuned-tts-models)
88
+ - [Collection of Fine-Tuned TTS Models](#fine-tuned-tts-collection)
89
+ - [Train XTTSv2](#fine-tune-your-own-xttsv2-model)
90
+ - [Docker](#docker-gpu-options)
91
+ - [GPU options](#docker-gpu-options)
92
+ - [Docker Run](#running-the-pre-built-docker-container)
93
+ - [Docker Build](#building-the-docker-container)
94
+ - [Docker Compose](#docker-compose)
95
+ - [Docker headless guide](#docker-headless-guide)
96
+ - [Docker container file locations](#docker-container-file-locations)
97
+ - [Common Docker issues](#common-docker-issues)
98
+ - [Supported eBook Formats](#supported-ebook-formats)
99
+ - [Output Formats](#output-formats)
100
+ - [Updating to Latest Version](#updating-to-latest-version)
101
+ - [Revert to older Version](#reverting-to-older-versions)
102
+ - [Common Issues](#common-issues)
103
+ - [Special Thanks](#special-thanks)
104
+ - [Table of Contents](#table-of-contents)
105
+
106
+
107
+ ## Features
108
+ - 📚 Splits eBook into chapters for organized audio.
109
+ - 🎙️ High-quality text-to-speech with [Coqui XTTSv2](https://huggingface.co/coqui/XTTS-v2) and [Fairseq](https://github.com/facebookresearch/fairseq/tree/main/examples/mms) (and more).
110
+ - 🗣️ Optional voice cloning with your own voice file.
111
+ - 🌍 Supports +1110 languages (English by default). [List of Supported languages](https://dl.fbaipublicfiles.com/mms/tts/all-tts-languages.html)
112
+ - 🖥️ Designed to run on 4GB RAM.
113
+
114
+
115
+ ## Supported Languages
116
+ | **Arabic (ar)** | **Chinese (zh)** | **English (en)** | **Spanish (es)** |
117
+ |:------------------:|:------------------:|:------------------:|:------------------:|
118
+ | **French (fr)** | **German (de)** | **Italian (it)** | **Portuguese (pt)** |
119
+ | **Polish (pl)** | **Turkish (tr)** | **Russian (ru)** | **Dutch (nl)** |
120
+ | **Czech (cs)** | **Japanese (ja)** | **Hindi (hi)** | **Bengali (bn)** |
121
+ | **Hungarian (hu)** | **Korean (ko)** | **Vietnamese (vi)**| **Swedish (sv)** |
122
+ | **Persian (fa)** | **Yoruba (yo)** | **Swahili (sw)** | **Indonesian (id)**|
123
+ | **Slovak (sk)** | **Croatian (hr)** | **Tamil (ta)** | **Danish (da)** |
124
+ - [**+1100 languages and dialects here**](https://dl.fbaipublicfiles.com/mms/tts/all-tts-languages.html)
125
+
126
+
127
+ ## Hardware Requirements
128
+ - 4GB RAM minimum, 8GB recommended
129
+ - Virtualization enabled if running on Windows (Docker only)
130
+ - CPU (Intel, AMD, ARM), GPU (NVIDIA, AMD*, Intel*) (Recommended), MPS (Apple Silicon CPU)
131
+ *available very soon
132
+
133
+ > [!IMPORTANT]
134
+ **Before posting an install or bug issue, search the open and closed issues carefully<br>
135
+ to make sure your issue has not already been reported.**
136
+
137
+
138
+ >[!NOTE]
139
+ **eBooks lack any standard structure (what counts as a chapter, paragraph, preface, etc.),<br>
140
+ so you should first manually remove any text you don't want converted to audio.**
141
+
142
+ ### Installation Instructions
143
+ 1. **Clone repo**
144
+ ```bash
145
+ git clone https://github.com/DrewThomasson/ebook2audiobook.git
146
+ cd ebook2audiobook
147
+ ```
148
+
149
+ ### Launching Gradio Web Interface
150
+ 1. **Run ebook2audiobook**:
151
+ - **Linux/MacOS**
152
+ ```bash
153
+ ./ebook2audiobook.sh # Run launch script
154
+ ```
155
+
156
+ - **Mac Launcher**
157
+ Double click `Mac Ebook2Audiobook Launcher.command`
158
+
159
+
160
+ - **Windows**
161
+ ```bash
162
+ ebook2audiobook.cmd # Run launch script or double click on it
163
+ ```
164
+
165
+ - **Windows Launcher**
166
+ Double click `ebook2audiobook.cmd`
167
+
168
+
169
+ - **Manual Python Install**
170
+ ```bash
171
+ # (for experts only!)
172
+ REQUIRED_PROGRAMS=("calibre" "ffmpeg" "nodejs" "mecab" "espeak-ng" "rust" "sox")
173
+ REQUIRED_PYTHON_VERSION="3.12"
174
+ pip install -r requirements.txt # Install Python Requirements
175
+ python app.py # Run Ebook2Audiobook
176
+ ```
177
+
178
+ 1. **Open the Web App**: Click the URL provided in the terminal to access the web app and convert eBooks. `http://localhost:7860/`
179
+ 2. **For Public Link**:
180
+ `python app.py --share` (all OS)
181
+ `./ebook2audiobook.sh --share` (Linux/MacOS)
182
+ `ebook2audiobook.cmd --share` (Windows)
183
+
184
+ > [!IMPORTANT]
185
+ **If the script is stopped and run again, you need to refresh the Gradio page in your browser<br>
186
+ so that it can reconnect to the new connection socket.**
187
+
188
+ ### Basic Usage
189
+ - **Linux/MacOS**:
190
+ ```bash
191
+ ./ebook2audiobook.sh --headless --ebook <path_to_ebook_file> \
192
+ --voice [path_to_voice_file] --language [language_code]
193
+ ```
194
+ - **Windows**
195
+ ```bash
196
+ ebook2audiobook.cmd --headless --ebook <path_to_ebook_file> ^
197
+ --voice [path_to_voice_file] --language [language_code]
198
+ ```
199
+
200
+ - **[--ebook]**: Path to your eBook file
201
+ - **[--voice]**: Voice cloning file path (optional)
202
+ - **[--language]**: Language code in ISO-639-3 (e.g. `ita` for Italian, `eng` for English, `deu` for German).<br>
203
+ The default language is `eng`, as set in ./lib/lang.py, so `--language` is optional when converting in the default language.<br>
204
+ Two-letter ISO-639-1 codes are also supported.
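As a rough illustration of accepting both code styles, the sketch below normalizes a two-letter ISO-639-1 code to its three-letter ISO-639-3 form. The mapping table and the `normalize_language` helper are hypothetical and cover only a handful of languages; the project's real table lives in ./lib/lang.py and may be organized differently.

```python
# Hypothetical, partial mapping for illustration only (not taken from ./lib/lang.py).
ISO_639_1_TO_3 = {"en": "eng", "it": "ita", "de": "deu", "fr": "fra", "es": "spa"}


def normalize_language(code, default="eng"):
    """Return a 3-letter code for a 2- or 3-letter input; use the default when empty."""
    code = (code or "").strip().lower()
    if not code:
        return default
    if len(code) == 2:
        if code not in ISO_639_1_TO_3:
            raise ValueError(f"Unknown ISO-639-1 code: {code!r}")
        return ISO_639_1_TO_3[code]
    if len(code) == 3:
        # Assumed to already be an ISO-639-3 code; not validated in this sketch.
        return code
    raise ValueError(f"Expected a 2- or 3-letter language code, got {code!r}")
```

With a helper like this, `--language it` and `--language ita` would end up selecting the same language.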
205
+
206
+
207
+ ### Example of Custom Model Zip Upload
208
+ (must be a .zip file containing the mandatory model files. Example for XTTSv2: config.json, model.pth, vocab.json and ref.wav)
209
+ - **Linux/MacOS**
210
+ ```bash
211
+ ./ebook2audiobook.sh --headless --ebook <ebook_file_path> \
212
+ --voice <target_voice_file_path> --language <language> --custom_model <custom_model_path>
213
+ ```
214
+ - **Windows**
215
+ ```bash
216
+ ebook2audiobook.cmd --headless --ebook <ebook_file_path> ^
217
+ --voice <target_voice_file_path> --language <language> --custom_model <custom_model_path>
218
+ ```
219
+ - **<custom_model_path>**: Path to the `model_name.zip` file,
220
+ which must contain all the files that are mandatory for the chosen TTS engine<br>
221
+ (see ./lib/models.py).
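Before passing a zip to `--custom_model`, it can help to check that the mandatory files are present. The following is a minimal sketch using only the standard library; `missing_model_files` is a hypothetical helper, the file list is the XTTSv2 example given above, and matching on base names (so files inside a top-level folder still count) is an assumption rather than the project's documented behavior.

```python
import zipfile

# File names from the XTTSv2 example above; other engines may require different files.
XTTS_REQUIRED_FILES = {"config.json", "model.pth", "vocab.json", "ref.wav"}


def missing_model_files(zip_path, required=XTTS_REQUIRED_FILES):
    """Return the sorted list of required file names that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        # Compare base names so that files nested in a folder are still found.
        present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return sorted(set(required) - present)
```

An empty list means every expected file was found; anything else names the files still to be added.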
222
+
223
+
224
+ ### Detailed Guide with the List of All Parameters
225
+ - **Linux/MacOS**
226
+ ```bash
227
+ ./ebook2audiobook.sh --help
228
+ ```
229
+ - **Windows**
230
+ ```bash
231
+ ebook2audiobook.cmd --help
232
+ ```
233
+ - **Or for all OS**
234
+ ```bash
235
+ python app.py --help
236
+ ```
237
+
238
+ <a id="help-command-output"></a>
239
+ ```bash
240
+ usage: app.py [-h] [--session SESSION] [--share] [--headless] [--ebook EBOOK]
241
+ [--ebooks_dir EBOOKS_DIR] [--language LANGUAGE] [--voice VOICE]
242
+ [--device {cpu,gpu,mps}]
243
+ [--tts_engine {XTTSv2,BARK,VITS,FAIRSEQ,TACOTRON2,YOURTTS,xtts,bark,vits,fairseq,tacotron,yourtts}]
244
+ [--custom_model CUSTOM_MODEL] [--fine_tuned FINE_TUNED]
245
+ [--output_format OUTPUT_FORMAT] [--temperature TEMPERATURE]
246
+ [--length_penalty LENGTH_PENALTY] [--num_beams NUM_BEAMS]
247
+ [--repetition_penalty REPETITION_PENALTY] [--top_k TOP_K]
248
+ [--top_p TOP_P] [--speed SPEED] [--enable_text_splitting]
249
+ [--text_temp TEXT_TEMP] [--waveform_temp WAVEFORM_TEMP]
250
+ [--output_dir OUTPUT_DIR] [--version]
251
+
252
+ Convert eBooks to Audiobooks using a Text-to-Speech model. You can either launch the Gradio interface or run the script in headless mode for direct conversion.
253
+
254
+ options:
255
+ -h, --help show this help message and exit
256
+ --session SESSION Session to resume the conversion in case of interruption, crash,
257
+ or reuse of custom models and custom cloning voices.
258
+
259
+ **** The following options are for all modes:
260
+ Optional
261
+
262
+ **** The following options are for gradio/gui mode only:
263
+ Optional
264
+
265
+ --share Enable a public shareable Gradio link.
266
+
267
+ **** The following options are for --headless mode only:
268
+ --headless Run the script in headless mode
269
+ --ebook EBOOK Path to the ebook file for conversion. Cannot be used when --ebooks_dir is present.
270
+ --ebooks_dir EBOOKS_DIR
271
+ Relative or absolute path of the directory containing the files to convert.
272
+ Cannot be used when --ebook is present.
273
+ --language LANGUAGE Language of the e-book. Default language is set
274
+ in ./lib/lang.py and is used if not present. All compatible language codes are in ./lib/lang.py
275
+
276
+ optional parameters:
277
+ --voice VOICE (Optional) Path to the voice cloning file for TTS engine.
278
+ Uses the default voice if not present.
279
+ --device {cpu,gpu,mps}
280
+ (Optional) Processor unit type for the conversion.
281
+ Default is set in ./lib/conf.py if not present. Fall back to CPU if GPU not available.
282
+ --tts_engine {XTTSv2,BARK,VITS,FAIRSEQ,TACOTRON2,YOURTTS,xtts,bark,vits,fairseq,tacotron,yourtts}
283
+ (Optional) Preferred TTS engine (available are: ['XTTSv2', 'BARK', 'VITS', 'FAIRSEQ', 'TACOTRON2', 'YOURTTS', 'xtts', 'bark', 'vits', 'fairseq', 'tacotron', 'yourtts']).
284
+ Default depends on the selected language. The tts engine should be compatible with the chosen language
285
+ --custom_model CUSTOM_MODEL
286
+ (Optional) Path to the custom model zip file containing mandatory model files.
287
+ Please refer to ./lib/models.py
288
+ --fine_tuned FINE_TUNED
289
+ (Optional) Fine tuned model path. Default is builtin model.
290
+ --output_format OUTPUT_FORMAT
291
+ (Optional) Output audio format. Default is set in ./lib/conf.py
292
+ --temperature TEMPERATURE
293
+ (xtts only, optional) Temperature for the model.
294
+ Default to config.json model. Higher temperatures lead to more creative outputs.
295
+ --length_penalty LENGTH_PENALTY
296
+ (xtts only, optional) A length penalty applied to the autoregressive decoder.
297
+ Default to config.json model. Not applied to custom models.
298
+ --num_beams NUM_BEAMS
299
+ (xtts only, optional) Controls how many alternative sequences the model explores. Must be equal to or greater than length penalty.
300
+ Default to config.json model.
301
+ --repetition_penalty REPETITION_PENALTY
302
+ (xtts only, optional) A penalty that prevents the autoregressive decoder from repeating itself.
303
+ Default to config.json model.
304
+ --top_k TOP_K (xtts only, optional) Top-k sampling.
305
+ Lower values mean more likely outputs and increased audio generation speed.
306
+ Default to config.json model.
307
+ --top_p TOP_P (xtts only, optional) Top-p sampling.
308
+ Lower values mean more likely outputs and increased audio generation speed. Default to config.json model.
309
+ --speed SPEED (xtts only, optional) Speed factor for the speech generation.
310
+ Default to config.json model.
311
+ --enable_text_splitting
312
+ (xtts only, optional) Enable TTS text splitting. This option is known to not be very efficient.
313
+ Default to config.json model.
314
+ --text_temp TEXT_TEMP
315
+ (bark only, optional) Text Temperature for the model.
316
+ Default to 0.85. Higher temperatures lead to more creative outputs.
317
+ --waveform_temp WAVEFORM_TEMP
318
+ (bark only, optional) Waveform Temperature for the model.
319
+ Default to 0.5. Higher temperatures lead to more creative outputs.
320
+ --output_dir OUTPUT_DIR
321
+ (Optional) Path to the output directory. Default is set in ./lib/conf.py
322
+ --version Show the version of the script and exit
323
+
324
+ Example usage:
325
+ Windows:
326
+ Gradio/GUI:
327
+ ebook2audiobook.cmd
328
+ Headless mode:
329
+ ebook2audiobook.cmd --headless --ebook '/path/to/file'
330
+ Linux/Mac:
331
+ Gradio/GUI:
332
+ ./ebook2audiobook.sh
333
+ Headless mode:
334
+ ./ebook2audiobook.sh --headless --ebook '/path/to/file'
335
+
336
+ Tip: to add a silence (1.4 seconds) to your text, just use "###" or "[pause]".
337
+
338
+ ```
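The help text states that `--ebook` and `--ebooks_dir` cannot be combined. One way to enforce that kind of rule in `argparse` is a mutually exclusive group. This is a minimal sketch of the technique under that assumption, not a description of how app.py actually handles the conflict; `build_parser` and the `prog` name are illustrative.

```python
import argparse


def build_parser():
    """Build a tiny parser in which --ebook and --ebooks_dir exclude each other."""
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("--headless", action="store_true")
    # argparse rejects the command line itself if both options of the group are given.
    source = parser.add_mutually_exclusive_group()
    source.add_argument("--ebook", type=str)
    source.add_argument("--ebooks_dir", type=str)
    return parser
```

Passing both options makes `parse_args` print a usage error and exit, so the conflict never reaches the conversion code.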
339
+
340
+ NOTE: in gradio/gui mode, to cancel a running conversion, just click on the [X] from the ebook upload component.
341
+
342
+ TIP: if the audio needs more pauses, add '###' or '[pause]' between the words where you want a longer pause. One [pause] equals 1.4 seconds.
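To make the pause markers concrete, here is a small sketch of how text could be split into speech and silence segments. The `split_on_pauses` helper and the segment format are hypothetical; only the two markers and the 1.4 second duration come from the tip above, and the project's own sentence splitting may work differently.

```python
import re

PAUSE_SECONDS = 1.4  # duration stated in the tip above
PAUSE_PATTERN = re.compile(r"###|\[pause\]")


def split_on_pauses(text):
    """Split text into ('text', str) and ('pause', seconds) segments, in order."""
    segments = []
    parts = PAUSE_PATTERN.split(text)
    for index, part in enumerate(parts):
        chunk = part.strip()
        if chunk:
            segments.append(("text", chunk))
        # Every boundary between two parts corresponds to exactly one marker.
        if index < len(parts) - 1:
            segments.append(("pause", PAUSE_SECONDS))
    return segments
```

Two markers in a row simply yield two consecutive pause segments, which matches the idea that each marker adds its own silence.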
343
+
344
+ #### Docker GPU Options
345
+
346
+ Available pre-built tags: `latest` (CUDA 11.8)
347
+ #### Edit: If the GPU isn't detected, you'll have to build the image -> [Building the Docker Container](#building-the-docker-container)
348
+
349
+
350
+
351
+ #### Running the pre-built Docker Container
352
+
353
+ - Run with CPU only
354
+ ```powershell
355
+ docker run --pull always --rm -p 7860:7860 athomasson2/ebook2audiobook
356
+ ```
357
+ - Run with GPU Speedup (NVIDIA compatible only)
358
+ ```powershell
359
+ docker run --pull always --rm --gpus all -p 7860:7860 athomasson2/ebook2audiobook
360
+ ```
361
+
362
+ These commands start the Gradio interface on port 7860 (localhost:7860).
363
+ - For more options add the parameter `--help`
364
+
365
+
366
+ #### Building the Docker Container
367
+ - You can build the docker image with the command:
368
+ ```powershell
369
+ docker build -t athomasson2/ebook2audiobook .
370
+ ```
371
+ #### Available Docker Build Arguments
372
+
373
+ `--build-arg TORCH_VERSION=cuda118` Available tags: [cuda121, cuda118, cuda128, rocm, xpu, cpu]
374
+
375
+ All CUDA version numbers should work, e.g. CUDA 11.6 -> cuda116
376
+
377
+ `--build-arg SKIP_XTTS_TEST=true` (Saves space by not baking XTTSv2 model into docker image)
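The tag naming rule above (CUDA 11.6 -> cuda116) drops the dot from the version number. A small sketch of that conversion follows; `cuda_tag` is a hypothetical helper for composing the `TORCH_VERSION` build argument and is not part of the project.

```python
import re


def cuda_tag(version):
    """Turn a CUDA version such as '11.6' into a tag such as 'cuda116'."""
    match = re.fullmatch(r"(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"Expected a version like '11.8', got {version!r}")
    return f"cuda{match.group(1)}{match.group(2)}"
```

The result can then be placed after `--build-arg TORCH_VERSION=` in the `docker build` command.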
378
+
379
+
380
+ ## Docker container file locations
381
+ All ebook2audiobook directories inside the container have the base dir `/app/`
382
+ For example:
383
+ `tmp` = `/app/tmp`
384
+ `audiobooks` = `/app/audiobooks`
385
+
386
+
387
+ ## Docker headless guide
388
+
389
+ - Before running this, create two directories in your current directory: `input-folder`, which is mounted into the container
390
+ and holds your input files, and `audiobooks`, which receives the output
391
+ ```bash
392
+ mkdir input-folder audiobooks
393
+ ```
394
+ - In the command below, replace **YOUR_EBOOK_FILE** with the name of your input file
395
+ ```bash
396
+ docker run --pull always --rm \
397
+ -v $(pwd)/input-folder:/app/input_folder \
398
+ -v $(pwd)/audiobooks:/app/audiobooks \
399
+ athomasson2/ebook2audiobook \
400
+ --headless --ebook /app/input_folder/YOUR_EBOOK_FILE
401
+ ```
402
+ - The output audiobooks will be found in the `audiobooks` folder, located
403
+ in the directory where you ran this docker command
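For scripted or batch use, the same headless invocation can be assembled as an argument list and handed to `subprocess.run`. This is a sketch only: `docker_headless_command` is a hypothetical helper, and it assumes the ebook is addressed by its in-container path under the `/app/input_folder` mount.

```python
from pathlib import Path

IMAGE = "athomasson2/ebook2audiobook"


def docker_headless_command(ebook_name, workdir="."):
    """Build the docker run argument list for a headless conversion of one file."""
    base = Path(workdir).resolve()
    return [
        "docker", "run", "--pull", "always", "--rm",
        # Host directories are mounted at the container paths used in this guide.
        "-v", f"{base / 'input-folder'}:/app/input_folder",
        "-v", f"{base / 'audiobooks'}:/app/audiobooks",
        IMAGE,
        "--headless", "--ebook", f"/app/input_folder/{ebook_name}",
    ]
```

Passing the list to `subprocess.run(command, check=True)` avoids shell quoting problems with file names that contain spaces.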
404
+
405
+
406
+ ## To list the other parameters this program accepts, run
407
+
408
+ ```bash
409
+ docker run --pull always --rm athomasson2/ebook2audiobook --help
410
+
411
+ ```
412
+ This prints the following:
413
+ [Help command output](#help-command-output)
414
+
415
+
416
+ ### Docker Compose
417
+ This project uses Docker Compose to run locally. You can enable or disable GPU support
418
+ by setting either `*gpu-enabled` or `*gpu-disabled` in `docker-compose.yml`
419
+
420
+
421
+ #### Steps to Run
422
+ 1. **Clone the Repository** (if you haven't already):
423
+ ```bash
424
+ git clone https://github.com/DrewThomasson/ebook2audiobook.git
425
+ cd ebook2audiobook
426
+ ```
427
+ 2. **Set GPU Support (disabled by default)**
428
+ To enable GPU support, modify `docker-compose.yml` and change `*gpu-disabled` to `*gpu-enabled`
429
+ 3. **Start the service:**
430
+ ```bash
431
+ # Docker
432
+ docker-compose up -d # To update add --build
433
+
434
+ # Podman
435
+ podman compose -f podman-compose.yml up -d # To update add --build
436
+ ```
437
+ 4. **Access the service:**
438
+ The service will be available at http://localhost:7860.
439
+
440
+
441
+ ## Common Docker Issues
442
+
443
+ - My NVIDIA GPU isn't being detected? -> [GPU ISSUES Wiki Page](https://github.com/DrewThomasson/ebook2audiobook/wiki/GPU-ISSUES)
444
+
445
+ - `python: can't open file '/home/user/app/app.py': [Errno 2] No such file or directory` (Just remove the arguments after the image name, as `CMD` has been replaced with `ENTRYPOINT` in the [Dockerfile](Dockerfile))
446
+ - Example: `docker run --pull always athomasson2/ebook2audiobook app.py --script_mode full_docker` -> corrected -> `docker run --pull always athomasson2/ebook2audiobook`
447
+ - Arguments can now be added directly, like this: `docker run --pull always athomasson2/ebook2audiobook --share`
448
+
449
+ - Docker gets stuck downloading Fine-Tuned models.
450
+ (This does not happen for every computer but some appear to run into this issue)
451
+ Disabling the progress bar appears to fix the issue,
452
+ as discussed [here in #191](https://github.com/DrewThomasson/ebook2audiobook/issues/191)
453
+ Example of adding this fix in the `docker run` command
454
+ ```bash
455
+ docker run --pull always --rm --gpus all -e HF_HUB_DISABLE_PROGRESS_BARS=1 -e HF_HUB_ENABLE_HF_TRANSFER=0 \
456
+ -p 7860:7860 athomasson2/ebook2audiobook
457
+ ```
458
+
459
+
460
+ ## Fine Tuned TTS models
461
+ #### Fine Tune your own XTTSv2 model
462
+
463
+ [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Spaces-yellow?style=flat&logo=huggingface)](https://huggingface.co/spaces/drewThomasson/xtts-finetune-webui-gpu) [![Kaggle](https://img.shields.io/badge/Kaggle-035a7d?style=flat&logo=kaggle&logoColor=white)](https://github.com/DrewThomasson/ebook2audiobook/blob/v25/Notebooks/finetune/xtts/kaggle-xtts-finetune-webui-gradio-gui.ipynb) [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/DrewThomasson/ebook2audiobook/blob/v25/Notebooks/finetune/xtts/colab_xtts_finetune_webui.ipynb)
464
+
465
+
466
+
467
+
468
+
469
+ #### De-noise training data
470
+
471
+ [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Spaces-yellow?style=flat&logo=huggingface)](https://huggingface.co/spaces/drewThomasson/DeepFilterNet2_no_limit) [![GitHub Repo](https://img.shields.io/badge/DeepFilterNet-181717?logo=github)](https://github.com/Rikorose/DeepFilterNet)
472
+
473
+
474
+ ### Fine Tuned TTS Collection
475
+
476
+ [![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Models-yellow?style=flat&logo=huggingface)](https://huggingface.co/drewThomasson/fineTunedTTSModels/tree/main)
477
+
478
+ For an XTTSv2 custom model, a reference audio clip of the voice is mandatory.
479
+
480
+
481
+ ## Supported eBook Formats
482
+ - `.epub`, `.pdf`, `.mobi`, `.txt`, `.html`, `.rtf`, `.chm`, `.lit`,
483
+ `.pdb`, `.fb2`, `.odt`, `.cbr`, `.cbz`, `.prc`, `.lrf`, `.pml`,
484
+ `.snb`, `.cbc`, `.rb`, `.tcr`
485
+ - **Best results**: `.epub` or `.mobi` for automatic chapter detection
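A quick pre-flight check against this list can save a failed conversion run. The sketch below compares a file's extension with the formats listed above; `is_supported_ebook` is a hypothetical helper, and the case-insensitive comparison is an assumption.

```python
from pathlib import Path

# Extensions copied from the list above.
SUPPORTED_EBOOK_FORMATS = {
    ".epub", ".pdf", ".mobi", ".txt", ".html", ".rtf", ".chm", ".lit",
    ".pdb", ".fb2", ".odt", ".cbr", ".cbz", ".prc", ".lrf", ".pml",
    ".snb", ".cbc", ".rb", ".tcr",
}


def is_supported_ebook(path):
    """Return True when the file extension is one of the listed eBook formats."""
    return Path(path).suffix.lower() in SUPPORTED_EBOOK_FORMATS
```

This only inspects the file name; it does not confirm that the file content is valid for that format.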
486
+
487
+
488
+ ## Output Formats
489
+ - Creates an audio file with metadata and chapters in one of the following formats (set in ./lib/conf.py): `['m4b', 'm4a', 'mp4', 'webm', 'mov', 'mp3', 'flac', 'wav', 'ogg', 'aac']`.
490
+
491
+ ## Updating to Latest Version
492
+ ```bash
493
+ git pull # Locally/Compose
494
+
495
+ docker pull athomasson2/ebook2audiobook:latest # For pre-built docker images
496
+ ```
497
+
498
+ ## Reverting to older Versions
499
+ Releases can be found -> [here](https://github.com/DrewThomasson/ebook2audiobook/releases)
500
+ ```bash
501
+ git checkout tags/VERSION_NUM # Locally/Compose -> Example: git checkout tags/v25.7.7
502
+
503
+ docker pull athomasson2/ebook2audiobook:VERSION_NUM # For pre-built docker images -> Example: docker pull athomasson2/ebook2audiobook:v25.7.7
504
+ ```
505
+
506
+ ## Common Issues:
507
+ - My NVIDIA GPU isn't being detected? -> [GPU ISSUES Wiki Page](https://github.com/DrewThomasson/ebook2audiobook/wiki/GPU-ISSUES)
508
+ - CPU conversion is slow (better on multi-core server CPUs), while an NVIDIA GPU can reach almost real-time conversion.
509
+ [Discussion about this](https://github.com/DrewThomasson/ebook2audiobook/discussions/19#discussioncomment-10879846)
510
+ For faster multilingual generation I would suggest my other
511
+ [project that uses piper-tts](https://github.com/DrewThomasson/ebook2audiobookpiper-tts) instead
512
+ (It doesn't have zero-shot voice cloning, and the voices are Siri quality, but it is much faster on CPU).
513
+ - "I'm having dependency issues" - Just use Docker; it's fully self-contained and has a headless mode.
514
+ Add the `--help` parameter at the end of the docker run command for more information.
515
+ - "I'm getting a truncated audio issue!" - PLEASE MAKE AN ISSUE OF THIS,
516
+ we don't speak every language and need advice from users to fine-tune the sentence splitting logic.😊
517
+
518
+
519
+ ## What we need help with! 🙌
520
+ ## [The full list can be found here](https://github.com/DrewThomasson/ebook2audiobook/issues/32)
521
+ - Any help from people speaking any of the supported languages to help us improve the models
522
+
523
+ ## Do you need to rent a GPU to boost service from us?
524
+ - A poll is open here https://github.com/DrewThomasson/ebook2audiobook/discussions/889
525
+
526
+ ## Special Thanks
527
+ - **Coqui TTS**: [Coqui TTS GitHub](https://github.com/idiap/coqui-ai-TTS)
528
+ - **Calibre**: [Calibre Website](https://calibre-ebook.com)
529
+ - **FFmpeg**: [FFmpeg Website](https://ffmpeg.org)
530
+ - [@shakenbake15 for better chapter saving method](https://github.com/DrewThomasson/ebook2audiobook/issues/8)
VERSION.txt ADDED
@@ -0,0 +1 @@
1
+ 25.8.18
app.py ADDED
@@ -0,0 +1,331 @@
1
+ import argparse
2
+ import filecmp
3
+ import importlib.util
4
+ import os
5
+ import shutil
6
+ import socket
7
+ import subprocess
8
+ import sys
9
+ import tempfile
10
+
11
+ from pathlib import Path
12
+ from lib import *
13
+
14
+ def check_virtual_env(script_mode):
15
+     current_version = sys.version_info[:2] # (major, minor)
16
+     if os.path.basename(sys.prefix) == 'python_env' or script_mode == FULL_DOCKER or min_python_version <= current_version <= max_python_version:
17
+         return True
18
+     error = f'''***********
19
+ Wrong launch! ebook2audiobook must run in its own virtual environment!
20
+ NOTE: If you are running Docker, you are probably using an old version of ebook2audiobook.
21
+ To solve this issue, download the new version at https://github.com/DrewThomasson/ebook2audiobook
22
+ If the directory python_env does not exist in the ebook2audiobook root directory,
23
+ run your command with "./ebook2audiobook.sh" for Linux and Mac or "ebook2audiobook.cmd" for Windows
24
+ to install it all automatically.
25
+ {install_info}
26
+ ***********'''
27
+     print(error)
28
+     return False
29
+
30
+ def check_python_version():
31
+     current_version = sys.version_info[:2] # (major, minor)
32
+     if current_version < min_python_version or current_version > max_python_version:
33
+         error = f'''***********
34
+ Wrong launch: Your OS Python version is not compatible! (current: {current_version[0]}.{current_version[1]})
35
+ In order to install and/or use ebook2audiobook correctly you must run
36
+ "./ebook2audiobook.sh" for Linux and Mac or "ebook2audiobook.cmd" for Windows.
37
+ {install_info}
38
+ ***********'''
39
+         print(error)
40
+         return False
41
+     else:
42
+         return True
43
+
44
+ def check_and_install_requirements(file_path):
45
+     if not os.path.exists(file_path):
46
+         error = f'Warning: File {file_path} not found. Skipping package check.'
47
+         print(error)
48
+         return False
49
+     try:
50
+         from importlib.metadata import version, PackageNotFoundError
51
+         try:
52
+             from packaging.specifiers import SpecifierSet
53
+         except ImportError:
54
+             subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--no-cache-dir', 'packaging'])
55
+             from packaging.specifiers import SpecifierSet
56
+         import regex as re
57
+         from tqdm import tqdm
58
+         with open(file_path, 'r') as f:
59
+             contents = f.read().replace('\r', '\n')
60
+             packages = [
61
+                 pkg.strip()
62
+                 for pkg in contents.splitlines()
63
+                 if pkg.strip() and re.search(r'[a-zA-Z0-9]', pkg)
64
+             ]
65
+         missing_packages = []
66
+         for package in packages:
67
+             # remove extras so 'pkg[lang]==x.y' becomes 'pkg==x.y'
68
+             clean_pkg = re.sub(r'\[.*?\]', '', package)
69
+             pkg_name = re.split(r'[<>=]', clean_pkg, maxsplit=1)[0].strip()
70
+             try:
71
+                 installed_version = version(pkg_name)
72
+                 if pkg_name == 'num2words':
73
+                     code = "ZH_CN"
74
+                     spec = importlib.util.find_spec(f"num2words.lang_{code}")
75
+                     if spec is None:
76
+                         missing_packages.append(package)
77
+             except PackageNotFoundError:
78
+                 error = f'{package} is missing.'
79
+                 print(error)
80
+                 missing_packages.append(package)
81
+             else:
82
+                 # get specifier from clean_pkg, not from the raw string
83
+                 spec_str = clean_pkg[len(pkg_name):].strip()
84
+                 if spec_str:
85
+                     spec = SpecifierSet(spec_str)
86
+                     if installed_version not in spec:
87
+                         error = (f'{pkg_name} (installed {installed_version}) does not satisfy "{spec_str}".')
88
+                         print(error)
89
+                         missing_packages.append(package)
90
+         if missing_packages:
91
+             msg = '\nInstalling missing or upgrade packages...\n'
92
+             print(msg)
93
+             tmp_dir = tempfile.mkdtemp()
94
+             os.environ['TMPDIR'] = tmp_dir
95
+             result = subprocess.call([sys.executable, '-m', 'pip', 'cache', 'purge'])
96
+             subprocess.check_call([sys.executable, '-m', 'pip', 'install', '--upgrade', 'pip'])
97
+             with tqdm(total=len(missing_packages),
98
+                 desc='Installation 0.00%',
99
+                 bar_format='{desc}: {n_fmt}/{total_fmt} ',
100
+                 unit='step') as t:
101
+                 for package in tqdm(missing_packages, desc="Installing", unit="pkg"):
102
+                     try:
103
+                         if package == 'num2words':
104
+                             pkgs = ['git+https://github.com/savoirfairelinux/num2words.git', '--force']
105
+                         else:
106
+                             pkgs = [package]
107
+                         subprocess.check_call([
108
+                             sys.executable, '-m', 'pip', 'install',
109
+                             '--no-cache-dir', '--use-pep517',
110
+                             *pkgs
111
+                         ])
112
+                         t.update(1)
113
+                     except subprocess.CalledProcessError as e:
114
+                         error = f'Failed to install {package}: {e}'
115
+                         print(error)
116
+                         return False
117
+             msg = '\nAll required packages are installed.'
118
+             print(msg)
119
+         return True
120
+     except Exception as e:
121
+         error = f'check_and_install_requirements() error: {e}'
122
+         raise SystemExit(error)
123
+         return False
124
+
125
+ def check_dictionary():
126
+     import unidic
127
+     unidic_path = unidic.DICDIR
128
+     dicrc = os.path.join(unidic_path, 'dicrc')
129
+     if not os.path.exists(dicrc) or os.path.getsize(dicrc) == 0:
130
+         try:
131
+             error = 'UniDic dictionary not found or incomplete. Downloading now...'
132
+             print(error)
133
+             subprocess.run([sys.executable, '-m', 'unidic', 'download'], check=True)
134
+         except subprocess.CalledProcessError as e:
135
+             error = f'Failed to download UniDic dictionary. Error: {e}. Unable to continue without UniDic. Exiting...'
136
+             raise SystemExit(error)
137
+             return False
138
+     return True
139
+
140
+ def is_port_in_use(port):
141
+     with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
142
+         return s.connect_ex(('0.0.0.0', port)) == 0
143
+
144
+ def main():
145
+ # Argument parser to handle optional parameters with descriptions
146
+ parser = argparse.ArgumentParser(
147
+ description='Convert eBooks to Audiobooks using a Text-to-Speech model. You can either launch the Gradio interface or run the script in headless mode for direct conversion.',
148
+ epilog='''
149
+ Example usage:
150
+ Windows:
151
+ Gradio/GUI:
152
+ ebook2audiobook.cmd
153
+ Headless mode:
154
+ ebook2audiobook.cmd --headless --ebook '/path/to/file'
155
+ Linux/Mac:
156
+ Gradio/GUI:
157
+ ./ebook2audiobook.sh
158
+ Headless mode:
159
+ ./ebook2audiobook.sh --headless --ebook '/path/to/file'
160
+
161
+ Tip: to add a silence (1.4 seconds) to your text, just use "###" or "[pause]".
162
+ ''',
163
+ formatter_class=argparse.RawTextHelpFormatter
164
+ )
165
+ options = [
166
+ '--script_mode', '--session', '--share', '--headless',
167
+ '--ebook', '--ebooks_dir', '--language', '--voice', '--device', '--tts_engine',
168
+ '--custom_model', '--fine_tuned', '--output_format',
169
+ '--temperature', '--length_penalty', '--num_beams', '--repetition_penalty', '--top_k', '--top_p', '--speed', '--enable_text_splitting',
170
+ '--text_temp', '--waveform_temp',
171
+ '--output_dir', '--version', '--workflow', '--help'
172
+ ]
173
+ tts_engine_list_keys = [k for k in TTS_ENGINES.keys()]
174
+ tts_engine_list_values = [k for k in TTS_ENGINES.values()]
175
+ all_group = parser.add_argument_group('**** The following options are for all modes', 'Optional')
176
+ all_group.add_argument(options[0], type=str, help=argparse.SUPPRESS)
177
+ parser.add_argument(options[1], type=str, help='''Session to resume the conversion in case of interruption, crash,
178
+ or reuse of custom models and custom cloning voices.''')
179
+ gui_group = parser.add_argument_group('**** The following options are for gradio/gui mode only', 'Optional')
180
+ gui_group.add_argument(options[2], action='store_true', help='''Enable a public shareable Gradio link.''')
181
+ headless_group = parser.add_argument_group('**** The following options are for --headless mode only')
182
+ headless_group.add_argument(options[3], action='store_true', help='''Run the script in headless mode''')
183
+ headless_group.add_argument(options[4], type=str, help='''Path to the ebook file for conversion. Cannot be used when --ebooks_dir is present.''')
184
+ headless_group.add_argument(options[5], type=str, help=f'''Relative or absolute path of the directory containing the files to convert.
185
+ Cannot be used when --ebook is present.''')
186
+ headless_group.add_argument(options[6], type=str, default=default_language_code, help=f'''Language of the e-book. Default language is set
187
+ in ./lib/lang.py and is used if not present. All compatible language codes are in ./lib/lang.py''')
188
+ headless_optional_group = parser.add_argument_group('optional parameters')
189
+ headless_optional_group.add_argument(options[7], type=str, default=None, help='''(Optional) Path to the voice cloning file for TTS engine.
190
+ Uses the default voice if not present.''')
191
+ headless_optional_group.add_argument(options[8], type=str, default=default_device, choices=device_list, help=f'''(Optional) Processor unit type for the conversion.
192
+ Default is set in ./lib/conf.py if not present. Fall back to CPU if GPU not available.''')
193
+ headless_optional_group.add_argument(options[9], type=str, default=None, choices=tts_engine_list_keys+tts_engine_list_values, help=f'''(Optional) Preferred TTS engine (available are: {tts_engine_list_keys+tts_engine_list_values}).
194
+ Default depends on the selected language. The tts engine should be compatible with the chosen language''')
195
+ headless_optional_group.add_argument(options[10], type=str, default=None, help=f'''(Optional) Path to the custom model zip file containing mandatory model files.
196
+ Please refer to ./lib/models.py''')
197
+ headless_optional_group.add_argument(options[11], type=str, default=default_fine_tuned, help='''(Optional) Fine tuned model path. Default is builtin model.''')
198
+ headless_optional_group.add_argument(options[12], type=str, default=default_output_format, help=f'''(Optional) Output audio format. Default is set in ./lib/conf.py''')
199
+ headless_optional_group.add_argument(options[13], type=float, default=None, help=f"""(xtts only, optional) Temperature for the model.
200
+ Default to config.json model. Higher temperatures lead to more creative outputs.""")
201
+ headless_optional_group.add_argument(options[14], type=float, default=None, help=f"""(xtts only, optional) A length penalty applied to the autoregressive decoder.
202
+ Default to config.json model. Not applied to custom models.""")
203
+ headless_optional_group.add_argument(options[15], type=int, default=None, help=f"""(xtts only, optional) Controls how many alternative sequences the model explores. Must be equal to or greater than length penalty.
204
+ Default to config.json model.""")
205
+ headless_optional_group.add_argument(options[16], type=float, default=None, help=f"""(xtts only, optional) A penalty that prevents the autoregressive decoder from repeating itself.
206
+ Default to config.json model.""")
207
+ headless_optional_group.add_argument(options[17], type=int, default=None, help=f"""(xtts only, optional) Top-k sampling.
208
+ Lower values mean more likely outputs and increased audio generation speed.
209
+ Default to config.json model.""")
210
+ headless_optional_group.add_argument(options[18], type=float, default=None, help=f"""(xtts only, optional) Top-p sampling.
211
+ Lower values mean more likely outputs and increased audio generation speed. Default to config.json model.""")
212
+ headless_optional_group.add_argument(options[19], type=float, default=None, help=f"""(xtts only, optional) Speed factor for the speech generation.
213
+ Default to config.json model.""")
214
+ headless_optional_group.add_argument(options[20], action='store_true', help=f"""(xtts only, optional) Enable TTS text splitting. This option is known to not be very efficient.
215
+ Default to config.json model.""")
216
+ headless_optional_group.add_argument(options[21], type=float, default=None, help=f"""(bark only, optional) Text Temperature for the model.
217
+ Default to {default_engine_settings[TTS_ENGINES['BARK']]['text_temp']}. Higher temperatures lead to more creative outputs.""")
218
+ headless_optional_group.add_argument(options[22], type=float, default=None, help=f"""(bark only, optional) Waveform Temperature for the model.
219
+ Default to {default_engine_settings[TTS_ENGINES['BARK']]['waveform_temp']}. Higher temperatures lead to more creative outputs.""")
220
+ headless_optional_group.add_argument(options[23], type=str, help=f'''(Optional) Path to the output directory. Default is set in ./lib/conf.py''')
221
+ headless_optional_group.add_argument(options[24], action='version', version=f'ebook2audiobook version {prog_version}', help='''Show the version of the script and exit''')
222
+ headless_optional_group.add_argument(options[25], action='store_true', help=argparse.SUPPRESS)
223
+
224
+ for arg in sys.argv:
225
+ if arg.startswith('--') and arg not in options:
226
+ error = f'Error: Unrecognized option "{arg}"'
227
+ print(error)
228
+ sys.exit(1)
229
+
230
+ args = vars(parser.parse_args())
231
+
232
+ if 'help' not in args:
233
+ if not check_virtual_env(args['script_mode']):
234
+ sys.exit(1)
235
+
236
+ if not check_python_version():
237
+ sys.exit(1)
238
+
239
+ # Check if the port is already in use to prevent multiple launches
240
+ if not args['headless'] and is_port_in_use(interface_port):
241
+ error = f'Error: Port {interface_port} is already in use. The web interface may already be running.'
242
+ print(error)
243
+ sys.exit(1)
244
+
245
+ args['script_mode'] = args['script_mode'] if args['script_mode'] else NATIVE
246
+ args['session'] = 'ba800d22-ee51-11ef-ac34-d4ae52cfd9ce' if args['workflow'] else args['session'] if args['session'] else None
247
+ args['share'] = args['share'] if args['share'] else False
248
+ args['ebook_list'] = None
249
+
250
+ print(f"v{prog_version} {args['script_mode']} mode")
251
+
252
+ if args['script_mode'] == NATIVE:
253
+ check_pkg = check_and_install_requirements(requirements_file)
254
+ if check_pkg:
255
+ if not check_dictionary():
256
+ sys.exit(1)
257
+ else:
258
+ error = 'Some packages could not be installed'
259
+ print(error)
260
+ sys.exit(1)
261
+
262
+ from lib.functions import SessionContext, convert_ebook_batch, convert_ebook, web_interface
263
+ ctx = SessionContext()
264
+ # Conditions based on the --headless flag
265
+ if args['headless']:
266
+ args['is_gui_process'] = False
267
+ args['audiobooks_dir'] = os.path.abspath(args['output_dir']) if args['output_dir'] else audiobooks_cli_dir
268
+ args['device'] = 'cuda' if args['device'] == 'gpu' else args['device']
269
+ args['tts_engine'] = TTS_ENGINES[args['tts_engine']] if args['tts_engine'] in TTS_ENGINES.keys() else args['tts_engine'] if args['tts_engine'] in TTS_ENGINES.values() else None
270
+ args['output_split'] = default_output_split
271
+ args['output_split_hours'] = default_output_split_hours
272
+ # Condition to stop if both --ebook and --ebooks_dir are provided
273
+ if args['ebook'] and args['ebooks_dir']:
274
+ error = 'Error: You cannot specify both --ebook and --ebooks_dir in headless mode.'
275
+ print(error)
276
+ sys.exit(1)
277
+ # Convert voice and custom_model to absolute paths, if provided
278
+ if args['voice']:
279
+ if os.path.exists(args['voice']):
280
+ args['voice'] = os.path.abspath(args['voice'])
281
+ if args['custom_model']:
282
+ if os.path.exists(args['custom_model']):
283
+ args['custom_model'] = os.path.abspath(args['custom_model'])
284
+ if not os.path.exists(args['audiobooks_dir']):
285
+ error = 'Error: --output_dir path does not exist.'
286
+ print(error)
287
+ sys.exit(1)
288
+ if args['ebooks_dir']:
289
+ args['ebooks_dir'] = os.path.abspath(args['ebooks_dir'])
290
+ if not os.path.exists(args['ebooks_dir']):
291
+ error = f'Error: The provided --ebooks_dir "{args["ebooks_dir"]}" does not exist.'
292
+ print(error)
293
+ sys.exit(1)
294
+ args['ebook_list'] = []
295
+ for file in os.listdir(args['ebooks_dir']):
296
+ if any(file.endswith(ext) for ext in ebook_formats):
297
+ full_path = os.path.abspath(os.path.join(args['ebooks_dir'], file))
298
+ args['ebook_list'].append(full_path)
299
+ progress_status, passed = convert_ebook_batch(args, ctx)
300
+ if passed is False:
301
+ error = f'Conversion failed: {progress_status}'
302
+ print(error)
303
+ sys.exit(1)
304
+ elif args['ebook']:
305
+ args['ebook'] = os.path.abspath(args['ebook'])
306
+ if not os.path.exists(args['ebook']):
307
+ error = f'Error: The provided --ebook "{args["ebook"]}" does not exist.'
308
+ print(error)
309
+ sys.exit(1)
310
+ progress_status, passed = convert_ebook(args, ctx)
311
+ if passed is False:
312
+ error = f'Conversion failed: {progress_status}'
313
+ print(error)
314
+ sys.exit(1)
315
+ else:
316
+ error = 'Error: In headless mode, you must specify either an ebook file using --ebook or an ebook directory using --ebooks_dir.'
317
+ print(error)
318
+ sys.exit(1)
319
+ else:
320
+ args['is_gui_process'] = True
321
+ passed_arguments = sys.argv[1:]
322
+ allowed_arguments = {'--share', '--script_mode'}
323
+ passed_args_set = {arg for arg in passed_arguments if arg.startswith('--')}
324
+ if passed_args_set.issubset(allowed_arguments):
325
+ web_interface(args, ctx)
326
+ else:
327
+ error = 'Error: In non-headless mode, only --share and --script_mode can be passed'
328
+ print(error)
329
+ sys.exit(1)
330
+ if __name__ == '__main__':
331
+ main()
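The headless branch of `main()` above resolves `--tts_engine` with a nested conditional expression: a known key is mapped to its value, a known value is kept as is, and anything else becomes `None`. The following is a minimal sketch of that lookup as a small function. `resolve_tts_engine` and `EXAMPLE_ENGINES` are hypothetical names, and the example keys and values are placeholders, not the project's real `TTS_ENGINES` mapping.

```python
from typing import Mapping, Optional

# Placeholder mapping for illustration only (assumption); the real
# TTS_ENGINES dictionary is defined elsewhere in the project.
EXAMPLE_ENGINES = {"XTTSv2": "xtts", "BARK": "bark"}


def resolve_tts_engine(value: Optional[str], engines: Mapping[str, str]) -> Optional[str]:
    """Map an engine key or value to the internal engine value, else None."""
    if value in engines:
        # The user passed a key such as "BARK".
        return engines[value]
    if value in engines.values():
        # The user passed the internal value directly.
        return value
    # Unknown or missing engine: the caller falls back to a language default.
    return None
```

Splitting the two membership tests into separate branches keeps the behavior of the one-line expression while making the fallback to `None` explicit.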
docker-compose.yml ADDED
@@ -0,0 +1,40 @@
1
+ x-gpu-enabled: &gpu-enabled
2
+ devices:
3
+ - driver: nvidia
4
+ count: all
5
+ capabilities:
6
+ - gpu # Enables GPU access for the container.
7
+
8
+ x-gpu-disabled: &gpu-disabled
9
+ devices: [] # Disables GPU access (default for systems without an NVIDIA GPU).
10
+
11
+ services:
12
+ ebook2audiobook:
13
+ build:
14
+ context: .
15
+ args:
16
+ #TORCH_VERSION: cuda118 # Available tags: [cuda121, cuda118, cuda128, rocm, xpu, cpu] # All CUDA version numbers should work, Ex: CUDA 11.6-> cuda116
17
+ SKIP_XTTS_TEST: "true" # (Saves space by not baking xtts model into docker image)
18
+ # To update ebook2audiobook to the latest you may have to rebuild
19
+ entrypoint: ["python", "app.py", "--script_mode", "full_docker"]
20
+ command: [] # <- Extra ebook2audiobook parameters can be added here
21
+ tty: true
22
+ stdin_open: true
23
+ ports:
24
+ - 7860:7860 # Maps container's port 7860 to the host's port 7860.
25
+ deploy:
26
+ resources:
27
+ reservations:
28
+ <<: *gpu-disabled # Use *gpu-enabled if you have an NVIDIA GPU.
29
+ limits: {} # Keeps limits as an empty mapping to avoid errors. Uncomment and configure below.
30
+ volumes:
31
+ - ./:/app # Maps the local directory to the container.
32
+
33
+ # Common Issues: ----
34
+ # --> `python: can't open file '/home/user/app/app.py': [Errno 2] No such file or directory`
35
+ # Removed all post arguments as CMD was replaced with ENTRYPOINT in the Dockerfile
36
+ # Example correction:
37
+ # Before: command: ["python", "app.py", "--script_mode", "full_docker"] or -> `docker run athomasson2/ebook2audiobook python app.py --script_mode full_docker`
38
+ # After: nothing needed or just -> `docker run athomasson2/ebook2audiobook`
39
+ # Extra arguments after app.py can still be added to the -> command: []
40
+ # Example adding extra arguments -> command: ["--share"] or -> command: ["--help"]
ebook2audiobook.cmd ADDED
@@ -0,0 +1,245 @@
1
+ @echo off
2
+ setlocal enabledelayedexpansion
3
+
4
+ :: Capture all arguments into ARGS
5
+ set "ARGS=%*"
6
+
7
+ set "NATIVE=native"
8
+ set "FULL_DOCKER=full_docker"
9
+
10
+ set "SCRIPT_MODE=%NATIVE%"
11
+ set "SCRIPT_DIR=%~dp0"
12
+
13
+ set "ARCH=%PROCESSOR_ARCHITECTURE%"
14
+ set "PYTHON_VERSION=3.12"
15
+ set "PYTHON_ENV=python_env"
16
+ set "PYTHONUTF8=1"
17
+ set "PYTHONIOENCODING=utf-8"
18
+ set "CURRENT_ENV="
19
+
20
+ set "PROGRAMS_LIST=calibre-normal ffmpeg nodejs espeak-ng sox"
21
+
22
+ set "TMP=%SCRIPT_DIR%\tmp"
23
+ set "TEMP=%SCRIPT_DIR%\tmp"
24
+
25
+ set "ESPEAK_DATA_PATH=%USERPROFILE%\scoop\apps\espeak-ng\current\eSpeak NG\espeak-ng-data"
26
+
27
+ set "SCOOP_HOME=%USERPROFILE%\scoop"
28
+ set "SCOOP_SHIMS=%SCOOP_HOME%\shims"
29
+ set "SCOOP_APPS=%SCOOP_HOME%\apps"
30
+
31
+ set "CONDA_URL=https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe"
32
+ set "CONDA_INSTALL_DIR=%USERPROFILE%\Miniforge3"
33
+ set "CONDA_INSTALLER=Miniforge3-Windows-x86_64.exe"
34
+ set "CONDA_ENV=%CONDA_INSTALL_DIR%\condabin\conda.bat"
35
+ set "CONDA_PATH=%CONDA_INSTALL_DIR%\condabin"
36
+
37
+ set "NODE_PATH=%SCOOP_HOME%\apps\nodejs\current"
38
+
39
+ set "PATH=%SCOOP_SHIMS%;%SCOOP_APPS%;%CONDA_PATH%;%NODE_PATH%;%PATH%" 2>&1 >nul
40
+
41
+ set "SCOOP_CHECK=0"
42
+ set "CONDA_CHECK=0"
43
+ set "PROGRAMS_CHECK=0"
44
+ set "DOCKER_CHECK=0"
45
+
46
+ set "HELP_FOUND=%ARGS:--help=%"
47
+
48
+ :: Refresh environment variables (append registry Path to current PATH)
49
+ for /f "tokens=2,*" %%A in ('reg query "HKLM\SYSTEM\CurrentControlSet\Control\Session Manager\Environment" /v Path') do (
50
+ set "PATH=%%B;%PATH%"
51
+ )
52
+
53
+ cd /d "%SCRIPT_DIR%"
54
+
55
+ if "%ARCH%"=="x86" (
56
+ echo Error: 32-bit architecture is not supported.
57
+ goto :failed
58
+ )
59
+
60
+ :: Check if running inside Docker
61
+ if defined CONTAINER (
62
+ set "SCRIPT_MODE=%FULL_DOCKER%"
63
+ goto :main
64
+ )
65
+
66
+ goto :scoop_check
67
+
68
+ :scoop_check
69
+ where /Q scoop
70
+ if %errorlevel% neq 0 (
71
+ echo Scoop is not installed.
72
+ set "SCOOP_CHECK=1"
73
+ goto :install_components
74
+ )
75
+ goto :conda_check
76
+ exit /b
77
+
78
+ :conda_check
79
+ where /Q conda
80
+ if %errorlevel% neq 0 (
81
+ call rmdir /s /q "%CONDA_INSTALL_DIR%" 2>nul
82
+ echo Miniforge3 is not installed.
83
+ set "CONDA_CHECK=1"
84
+ goto :install_components
85
+ )
86
+ :: Check if running in a Conda environment
87
+ if defined CONDA_DEFAULT_ENV (
88
+ set "CURRENT_ENV=%CONDA_PREFIX%"
89
+ )
90
+ :: Check if running in a Python virtual environment
91
+ if defined VIRTUAL_ENV (
92
+ set "CURRENT_ENV=%VIRTUAL_ENV%"
93
+ )
94
+ for /f "delims=" %%i in ('where python 2^>nul') do (
95
+ if defined CONDA_PREFIX (
96
+ if /i "%%i"=="%CONDA_PREFIX%\Scripts\python.exe" (
97
+ set "CURRENT_ENV=%CONDA_PREFIX%"
98
+ break
99
+ )
100
+ ) else if defined VIRTUAL_ENV (
101
+ if /i "%%i"=="%VIRTUAL_ENV%\Scripts\python.exe" (
102
+ set "CURRENT_ENV=%VIRTUAL_ENV%"
103
+ break
104
+ )
105
+ )
106
+ )
107
+ if not "%CURRENT_ENV%"=="" (
108
+ echo Current python virtual environment detected: %CURRENT_ENV%.
109
+ echo This script runs with its own virtual env and must be out of any other virtual environment when it's launched.
110
+ goto :failed
111
+ )
112
+ goto :programs_check
113
+ exit /b
114
+
115
+ :programs_check
116
+ set "missing_prog_array="
117
+ for %%p in (%PROGRAMS_LIST%) do (
118
+ set "prog=%%p"
119
+ if "%%p"=="nodejs" set "prog=node"
120
+ if "%%p"=="calibre-normal" set "prog=calibre"
121
+ where /Q !prog!
122
+ if !errorlevel! neq 0 (
123
+ echo %%p is not installed.
124
+ set "missing_prog_array=!missing_prog_array! %%p"
125
+ )
126
+ )
127
+ if not "%missing_prog_array%"=="" (
128
+ set "PROGRAMS_CHECK=1"
129
+ goto :install_components
130
+ )
131
+ goto :dispatch
132
+ exit /b
133
+
134
+ :install_components
135
+ :: Install Scoop if not already installed
136
+ if not "%SCOOP_CHECK%"=="0" (
137
+ echo Installing Scoop...
138
+ call powershell -command "Set-ExecutionPolicy RemoteSigned -scope CurrentUser"
139
+ call powershell -command "iwr -useb get.scoop.sh | iex"
140
+ call scoop install git
141
+ call scoop bucket add muggle https://github.com/hu3rror/scoop-muggle.git
142
+ call scoop bucket add extras
143
+ call scoop bucket add versions
144
+ echo Scoop installed successfully.
145
+ if "%PROGRAMS_CHECK%"=="0" (
146
+ set "SCOOP_CHECK=0"
147
+ )
148
+ start "" cmd /k cd /d "%CD%" ^& call "%~f0"
149
+ exit
150
+ )
151
+ :: Install Conda if not already installed
152
+ if not "%CONDA_CHECK%"=="0" (
153
+ echo Installing Miniforge...
154
+ call powershell -Command "Invoke-WebRequest -Uri '%CONDA_URL%' -OutFile '%CONDA_INSTALLER%'"
155
+ call start /wait "" "%CONDA_INSTALLER%" /InstallationType=JustMe /RegisterPython=0 /S /D=%UserProfile%\Miniforge3
156
+ where /Q conda
157
+ if !errorlevel! neq 0 (
158
+ echo Conda installation failed.
159
+ goto :failed
160
+ )
161
+ call conda config --set auto_activate_base false
162
+ call conda update conda -y
163
+ del "%CONDA_INSTALLER%"
164
+ set "CONDA_CHECK=0"
165
+ echo Conda installed successfully.
166
+ start "" cmd /k cd /d "%CD%" ^& call "%~f0"
167
+ exit
168
+ )
169
+ :: Install missing packages one by one
170
+ if not "%PROGRAMS_CHECK%"=="0" (
171
+ echo Installing missing programs...
172
+ if "%SCOOP_CHECK%"=="0" (
173
+ call scoop bucket add muggle https://github.com/hu3rror/scoop-muggle.git
174
+ call scoop bucket add extras
175
+ call scoop bucket add versions
176
+ )
177
+ for %%p in (%missing_prog_array%) do (
178
+ call scoop install %%p
179
+ set "prog=%%p"
180
+ if "%%p"=="nodejs" (
181
+ set "prog=node"
182
+ )
183
+ if "%%p"=="calibre-normal" set "prog=calibre"
184
+ where /Q !prog!
185
+ if !errorlevel! neq 0 (
186
+ echo %%p installation failed...
187
+ goto :failed
188
+ )
189
+ )
190
+ call powershell -command "[System.Environment]::SetEnvironmentVariable('Path', [System.Environment]::GetEnvironmentVariable('Path', 'User') + ';%SCOOP_SHIMS%;%SCOOP_APPS%;%CONDA_PATH%;%NODE_PATH%', 'User')"
191
+ set "SCOOP_CHECK=0"
192
+ set "PROGRAMS_CHECK=0"
193
+ set "missing_prog_array="
194
+ )
195
+ goto :dispatch
196
+ exit /b
197
+
198
+ :dispatch
199
+ if "%SCOOP_CHECK%"=="0" (
200
+ if "%PROGRAMS_CHECK%"=="0" (
201
+ if "%CONDA_CHECK%"=="0" (
202
+ if "%DOCKER_CHECK%"=="0" (
203
+ goto :main
204
+ ) else (
205
+ goto :failed
206
+ )
207
+ )
208
+ )
209
+ )
210
+ echo PROGRAMS_CHECK: %PROGRAMS_CHECK%
211
+ echo CONDA_CHECK: %CONDA_CHECK%
212
+ echo DOCKER_CHECK: %DOCKER_CHECK%
213
+ goto :install_components
214
+ exit /b
215
+
216
+ :main
217
+ if "%SCRIPT_MODE%"=="%FULL_DOCKER%" (
218
+ call python "%SCRIPT_DIR%\app.py" --script_mode %SCRIPT_MODE% %ARGS%
219
+ ) else (
220
+ if not exist "%SCRIPT_DIR%\%PYTHON_ENV%" (
221
+ call conda create --prefix "%SCRIPT_DIR%\%PYTHON_ENV%" python=%PYTHON_VERSION% -y
222
+ call %CONDA_ENV% activate base
223
+ call conda activate "%SCRIPT_DIR%\%PYTHON_ENV%"
224
+ call python -m pip cache purge >nul 2>&1
225
+ call python -m pip install --upgrade pip
226
+ for /f "usebackq delims=" %%p in ("requirements.txt") do (
227
+ echo Installing %%p...
228
+ call python -m pip install --upgrade --no-cache-dir --use-pep517 --progress-bar=on "%%p"
229
+ )
230
+ echo All required packages are installed.
231
+ ) else (
232
+ call %CONDA_ENV% activate base
233
+ call conda activate "%SCRIPT_DIR%\%PYTHON_ENV%"
234
+ )
235
+ call python "%SCRIPT_DIR%\app.py" --script_mode %SCRIPT_MODE% %ARGS%
236
+ call conda deactivate
237
+ )
238
+ exit /b
239
+
240
+ :failed
241
+ echo ebook2audiobook is not correctly installed or run.
242
+ exit /b
243
+
244
+ endlocal
245
+ pause
ebook2audiobook.sh ADDED
@@ -0,0 +1,326 @@
1
+ #!/usr/bin/env bash
2
+
3
+ if [[ "$OSTYPE" = "darwin"* && -z "$SWITCHED_TO_ZSH" && "$(ps -p $$ -o comm=)" != "zsh" ]]; then
4
+ export SWITCHED_TO_ZSH=1
5
+ exec env zsh "$0" "$@"
6
+ fi
7
+
8
+ unset SWITCHED_TO_ZSH
9
+
10
+ ARCH=$(uname -m)
11
+ PYTHON_VERSION="3.12"
12
+
13
+ export PYTHONUTF8="1"
14
+ export PYTHONIOENCODING="utf-8"
15
+ export TTS_CACHE="./models"
16
+
17
+ ARGS=("$@")
18
+
19
+ declare -A arguments # associative array
20
+ declare -a programs_missing # indexed array
21
+
22
+ # Parse arguments
23
+ while [[ "$#" -gt 0 ]]; do
24
+ case "$1" in
25
+ --*)
26
+ key="${1/--/}" # Remove leading '--'
27
+ if [[ -n "$2" && ! "$2" =~ ^-- ]]; then
28
+ # If the next argument is a value (not another option)
29
+ arguments[$key]="$2"
30
+ shift # Move past the value
31
+ else
32
+ # Set to true for flags without values
33
+ arguments[$key]=true
34
+ fi
35
+ ;;
36
+ *)
37
+ echo "Unknown option: $1"
38
+ exit 1
39
+ ;;
40
+ esac
41
+ shift # Move to the next argument
42
+ done
43
+
44
+ NATIVE="native"
45
+ FULL_DOCKER="full_docker"
46
+
47
+ SCRIPT_MODE="$NATIVE"
48
+ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
49
+
50
+ WGET=$(which wget 2>/dev/null)
51
+ REQUIRED_PROGRAMS=("curl" "calibre" "ffmpeg" "nodejs" "espeak-ng" "rust" "sox")
52
+ PYTHON_ENV="python_env"
53
+ CURRENT_ENV=""
54
+
55
+ if [[ "$OSTYPE" != "linux"* && "$OSTYPE" != "darwin"* ]]; then
56
+ echo "Error: OS $OSTYPE unsupported."
57
+ exit 1;
58
+ fi
59
+
60
+ if [[ "$OSTYPE" = "darwin"* ]]; then
61
+ CONDA_URL="https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-$(uname -m).sh"
62
+ CONFIG_FILE="$HOME/.zshrc"
63
+ if [[ "$ARCH" == "x86_64" ]]; then
64
+ PYTHON_VERSION="3.11"
65
+ fi
66
+ elif [[ "$OSTYPE" = "linux"* ]]; then
67
+ CONDA_URL="https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
68
+ CONFIG_FILE="$HOME/.bashrc"
69
+ fi
70
+
71
+ CONDA_INSTALLER="/tmp/Miniforge3.sh"
72
+ CONDA_INSTALL_DIR="$HOME/Miniforge3"
73
+ CONDA_PATH="$CONDA_INSTALL_DIR/bin"
74
+ CONDA_ENV="$CONDA_INSTALL_DIR/etc/profile.d/conda.sh"
75
+
76
+ export TMPDIR="$SCRIPT_DIR/.cache"
77
+ export PATH="$CONDA_PATH:$PATH"
78
+
79
+ # Check if the current script is run inside a docker container
80
+ if [[ -n "$container" || -f /.dockerenv ]]; then
81
+ SCRIPT_MODE="$FULL_DOCKER"
82
+ else
83
+ if [[ -n "${arguments['script_mode']+exists}" ]]; then
84
+ if [ "${arguments['script_mode']}" = "$NATIVE" ]; then
85
+ SCRIPT_MODE="${arguments['script_mode']}"
86
+ fi
87
+ fi
88
+ fi
89
+
90
+ if [[ -n "${arguments['help']+exists}" && ${arguments['help']} = true ]]; then
91
+ python app.py "${ARGS[@]}"
92
+ else
93
+ # Check if running in a Conda or Python virtual environment
94
+ if [[ -n "$CONDA_DEFAULT_ENV" ]]; then
95
+ CURRENT_ENV="$CONDA_PREFIX"
96
+ elif [[ -n "$VIRTUAL_ENV" ]]; then
97
+ CURRENT_ENV="$VIRTUAL_ENV"
98
+ fi
99
+
100
+ # If neither environment variable is set, check Python path
101
+ if [[ -z "$CURRENT_ENV" ]]; then
102
+ PYTHON_PATH=$(which python 2>/dev/null)
103
+ if [[ ( -n "$CONDA_PREFIX" && "$PYTHON_PATH" = "$CONDA_PREFIX/bin/python" ) || ( -n "$VIRTUAL_ENV" && "$PYTHON_PATH" = "$VIRTUAL_ENV/bin/python" ) ]]; then
104
+ CURRENT_ENV="${CONDA_PREFIX:-$VIRTUAL_ENV}"
105
+ fi
106
+ fi
107
+
108
+ # Output result if a virtual environment is detected
109
+ if [[ -n "$CURRENT_ENV" ]]; then
110
+ echo -e "Current python virtual environment detected: $CURRENT_ENV."
111
+ echo -e "This script runs with its own virtual env and must be out of any other virtual environment when it's launched."
112
+ echo -e "If you are using conda then you would type in:"
113
+ echo -e "conda deactivate"
114
+ exit 1
115
+ fi
116
+
117
+ # Check if .cache folder exists inside the eb2ab folder for Miniforge3
118
+ if [[ ! -d .cache ]]; then
119
+ mkdir .cache
120
+ fi
121
+
122
+ function required_programs_check {
123
+ local programs=("$@")
124
+ programs_missing=()
125
+ for program in "${programs[@]}"; do
126
+ if [ "$program" = "nodejs" ]; then
127
+ bin="node"
128
+ elif [ "$program" = "rust" ]; then
129
+ if command -v apt-get &> /dev/null; then
130
+ bin="rustc"
131
+ fi
132
+ else
133
+ bin="$program"
134
+ fi
135
+ if ! command -v "$bin" >/dev/null 2>&1; then
136
+ echo -e "\e[33m$program is not installed.\e[0m"
137
+ programs_missing+=("$program")
138
+ fi
139
+ done
140
+ local count=${#programs_missing[@]}
141
+ if [[ $count -eq 0 ]]; then
142
+ return 0
143
+ else
144
+ return 1
145
+ fi
146
+ }
147
+
148
+ function install_programs {
149
+ if [[ "$OSTYPE" = "darwin"* ]]; then
150
+ echo -e "\e[33mInstalling required programs...\e[0m"
151
+ if [ ! -d $TMPDIR ]; then
152
+ mkdir -p $TMPDIR
153
+ fi
154
+ SUDO=""
155
+ PACK_MGR="brew install"
156
+ if ! command -v brew &> /dev/null; then
157
+ echo -e "\e[33mHomebrew is not installed. Installing Homebrew...\e[0m"
158
+ /usr/bin/env bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
159
+ echo 'eval "$(/opt/homebrew/bin/brew shellenv)"' >> $HOME/.zprofile
160
+ eval "$(/opt/homebrew/bin/brew shellenv)"
161
+ fi
162
+ else
163
+ SUDO="sudo"
164
+ echo -e "\e[33mInstalling required programs. NOTE: you must have 'sudo' privileges to install ebook2audiobook.\e[0m"
165
+ PACK_MGR_OPTIONS=""
166
+ if command -v emerge &> /dev/null; then
167
+ PACK_MGR="emerge"
168
+ elif command -v dnf &> /dev/null; then
169
+ PACK_MGR="dnf install"
170
+ PACK_MGR_OPTIONS="-y"
171
+ elif command -v yum &> /dev/null; then
172
+ PACK_MGR="yum install"
173
+ PACK_MGR_OPTIONS="-y"
174
+ elif command -v zypper &> /dev/null; then
175
+ PACK_MGR="zypper install"
176
+ PACK_MGR_OPTIONS="-y"
177
+ elif command -v pacman &> /dev/null; then
178
+ PACK_MGR="pacman -Sy"
179
+ elif command -v apt-get &> /dev/null; then
180
+ $SUDO apt-get update
181
+ PACK_MGR="apt-get install"
182
+ PACK_MGR_OPTIONS="-y"
183
+ elif command -v apk &> /dev/null; then
184
+ PACK_MGR="apk add"
185
+ else
186
+ echo "Cannot recognize your system's package manager. Please install the required programs manually."
187
+ return 1
188
+ fi
189
+
190
+ fi
191
+ if [ -z "$WGET" ]; then
192
+ echo -e "\e[33m wget is missing! trying to install it... \e[0m"
193
+ result=$(eval "$PACK_MGR wget $PACK_MGR_OPTIONS" 2>&1)
194
+ result_code=$?
195
+ if [ $result_code -eq 0 ]; then
196
+ WGET=$(which wget 2>/dev/null)
197
+ else
198
+ echo "Cannot install 'wget'. Please install 'wget' manually."
199
+ return 1
200
+ fi
201
+ fi
202
+ for program in "${programs_missing[@]}"; do
203
+ if [ "$program" = "calibre" ];then
204
+ # avoid conflict with calibre builtin lxml
205
+ pip uninstall lxml -y 2>/dev/null
206
+ echo -e "\e[33mInstalling Calibre...\e[0m"
207
+ if [[ "$OSTYPE" = "darwin"* ]]; then
208
+ eval "$PACK_MGR --cask calibre"
209
+ else
210
+ $WGET -nv -O- https://download.calibre-ebook.com/linux-installer.sh | $SUDO sh /dev/stdin
211
+ fi
212
+ if command -v $program >/dev/null 2>&1; then
213
+ echo -e "\e[32m===============>>> Calibre is installed! <<===============\e[0m"
214
+ else
215
+ eval "$SUDO $PACK_MGR $program $PACK_MGR_OPTIONS"
216
+ if command -v $program >/dev/null 2>&1; then
217
+ echo -e "\e[32m===============>>> $program is installed! <<===============\e[0m"
218
+ else
219
+ echo "$program installation failed."
220
+ fi
221
+ fi
222
+ elif [ "$program" = "rust" ]; then
223
+ if command -v apt-get &> /dev/null; then
224
+ app="rustc"
225
+ else
226
+ app="$program"
227
+ fi
228
+ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
229
+ source $HOME/.cargo/env
230
+ if command -v $app &>/dev/null; then
231
+ echo -e "\e[32m===============>>> $program is installed! <<===============\e[0m"
232
+ else
233
+ echo "$program installation failed."
234
+ fi
235
+ else
236
+ eval "$SUDO $PACK_MGR $program $PACK_MGR_OPTIONS"
237
+ if command -v $program >/dev/null 2>&1; then
238
+ echo -e "\e[32m===============>>> $program is installed! <<===============\e[0m"
239
+ else
240
+ echo "$program installation failed."
241
+ fi
242
+ fi
243
+ done
244
+ if required_programs_check "${REQUIRED_PROGRAMS[@]}"; then
245
+ return 0
246
+ else
247
+ echo "Some programs didn't install successfully. Please report the log to support."
248
+ fi
249
+ }
250
+
251
+ function conda_check {
252
+ if ! command -v conda &> /dev/null || [ ! -f "$CONDA_ENV" ]; then
253
+ echo -e "\e[33mDownloading Miniforge3 installer...\e[0m"
254
+ if [[ "$OSTYPE" = "darwin"* ]]; then
255
+ curl -fsSLo "$CONDA_INSTALLER" "$CONDA_URL"
256
+ else
257
+ wget -O "$CONDA_INSTALLER" "$CONDA_URL"
258
+ fi
259
+ if [[ -f "$CONDA_INSTALLER" ]]; then
260
+ echo -e "\e[33mInstalling Miniforge3...\e[0m"
261
+ bash "$CONDA_INSTALLER" -b -u -p "$CONDA_INSTALL_DIR"
262
+ rm -f "$CONDA_INSTALLER"
263
+ if [[ -f "$CONDA_INSTALL_DIR/bin/conda" ]]; then
264
+ $CONDA_INSTALL_DIR/bin/conda config --set auto_activate_base false
265
+ source $CONDA_ENV
266
+ echo -e "\e[32m===============>>> conda is installed! <<===============\e[0m"
267
+ else
268
+ echo -e "\e[31mconda installation failed.\e[0m"
269
+ return 1
270
+ fi
271
+ else
272
+ echo -e "\e[31mFailed to download Miniforge3 installer.\e[0m"
273
+ echo -e "\e[33mIt's better to use install.sh to install everything needed.\e[0m"
274
+ return 1
275
+ fi
276
+ fi
277
+ if [[ ! -d "$SCRIPT_DIR/$PYTHON_ENV" ]]; then
278
+ # Use this condition to chmod writable folders once
279
+ chmod -R 777 ./audiobooks ./tmp ./models
280
+ conda create --prefix "$SCRIPT_DIR/$PYTHON_ENV" python=$PYTHON_VERSION -y
281
+ conda init > /dev/null 2>&1
282
+ source $CONDA_ENV
283
+ conda activate "$SCRIPT_DIR/$PYTHON_ENV"
284
+ python -m pip cache purge > /dev/null 2>&1
285
+ python -m pip install --upgrade pip
286
+ python -m pip install --upgrade --no-cache-dir --use-pep517 --progress-bar=on -r requirements.txt
287
+ tts_version=$(python -c "import importlib.metadata; print(importlib.metadata.version('coqui-tts'))" 2>/dev/null)
288
+ if [[ -n "$tts_version" ]]; then
289
+ if [[ "$(printf '%s\n' "$tts_version" "0.26.1" | sort -V | tail -n1)" == "0.26.1" ]]; then
290
+ python -m pip install --no-cache-dir --use-pep517 --progress-bar=on 'transformers<=4.51.3'
291
+ fi
292
+ fi
293
+ conda deactivate
294
+ fi
295
+ return 0
296
+ }
297
+
298
+ if [ "$SCRIPT_MODE" = "$FULL_DOCKER" ]; then
299
+ python app.py --script_mode "$SCRIPT_MODE" "${ARGS[@]}"
300
+ conda deactivate
301
+ conda deactivate
302
+ elif [ "$SCRIPT_MODE" = "$NATIVE" ]; then
303
+ pass=true
304
+ if [ "$SCRIPT_MODE" = "$NATIVE" ]; then
305
+ if ! required_programs_check "${REQUIRED_PROGRAMS[@]}"; then
306
+ if ! install_programs; then
307
+ pass=false
308
+ fi
309
+ fi
310
+ fi
311
+ if [ $pass = true ]; then
312
+ if conda_check; then
313
+ conda init > /dev/null 2>&1
314
+ source $CONDA_ENV
315
+ conda activate "$SCRIPT_DIR/$PYTHON_ENV"
316
+ python app.py --script_mode "$SCRIPT_MODE" "${ARGS[@]}"
317
+ conda deactivate
318
+ conda deactivate
319
+ fi
320
+ fi
321
+ else
322
+ echo -e "\e[33mebook2audiobook is not correctly installed or run.\e[0m"
323
+ fi
324
+ fi
325
+
326
+ exit 0
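The `conda_check` function above decides whether to pin `transformers` by piping the installed `coqui-tts` version and `0.26.1` through `sort -V` and checking which one sorts last. That test is true when the installed version is less than or equal to `0.26.1`. A minimal Python sketch of the same comparison follows, assuming plain dotted numeric versions (no pre-release or local suffixes). `version_tuple` and `needs_transformers_pin` are hypothetical helper names that do not exist in the repository.

```python
def version_tuple(version: str) -> tuple:
    """Turn '0.26.1' into (0, 26, 1); assumes purely numeric dotted parts."""
    return tuple(int(part) for part in version.split("."))


def needs_transformers_pin(tts_version: str, ceiling: str = "0.26.1") -> bool:
    """Mirror the shell test: True when tts_version <= ceiling."""
    return version_tuple(tts_version) <= version_tuple(ceiling)


if __name__ == "__main__":
    for candidate in ("0.26.0", "0.26.1", "0.27.0"):
        print(candidate, needs_transformers_pin(candidate))
```

Comparing integer tuples avoids the string-ordering pitfall where `"0.9.0"` would sort after `"0.26.1"`, which is the same reason the shell script relies on `sort -V` rather than a plain string comparison.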
favicon.ico ADDED
podman-compose.yml ADDED
@@ -0,0 +1,34 @@
1
+ x-gpu-enabled: &gpu-enabled
2
+ devices:
3
+ - /dev/nvidia0:/dev/nvidia0
4
+ - /dev/nvidiactl:/dev/nvidiactl
5
+ - /dev/nvidia-uvm:/dev/nvidia-uvm
6
+ x-gpu-disabled: &gpu-disabled
7
+ devices: [] # Disables GPU access (default for systems without an NVIDIA GPU).
8
+
9
+ services:
10
+ ebook2audiobook:
11
+ build:
12
+ context: .
13
+ args:
14
+ #TORCH_VERSION: cuda118 # Available tags: [cuda121, cuda118, cuda128, rocm, xpu, cpu] # All CUDA version numbers should work, Ex: CUDA 11.6-> cuda116
15
+ SKIP_XTTS_TEST: "true" # (Saves space by not baking xtts model into docker image)
16
+ # To update ebook2audiobook to the latest you may have to rebuild
17
+ entrypoint: ["python", "app.py", "--script_mode", "full_docker"]
18
+ command: [] # <- Extra ebook2audiobook parameters can be added here
19
+ tty: true
20
+ stdin_open: true
21
+ ports:
22
+ - 7860:7860 # Maps container's port 7860 to the host's port 7860.
23
+ <<: *gpu-disabled # Use *gpu-enabled if you have an NVIDIA GPU.
24
+ volumes:
25
+ - ./:/app # Maps the local directory to the container.
26
+
27
+ # Common Issues: ----
28
+ # --> `python: can't open file '/home/user/app/app.py': [Errno 2] No such file or directory`
29
+ # Removed all post arguments as CMD was replaced with ENTRYPOINT in the Dockerfile
30
+ # Example correction:
31
+ # Before: command: ["python", "app.py", "--script_mode", "full_docker"] or -> `podman run athomasson2/ebook2audiobook python app.py --script_mode full_docker`
32
+ # After: nothing needed or just -> `podman run athomasson2/ebook2audiobook`
33
+ # Extra arguments after app.py can still be added to the -> command: []
34
+ # Example adding extra arguments -> command: ["--share"] or -> command: ["--help"]
pyproject.toml ADDED
@@ -0,0 +1,64 @@
1
+ [build-system]
2
+ name = "ebook2audiobook"
3
+ requires = ["setuptools >= 64"]
4
+ build-backend = "setuptools.build_meta"
5
+
6
+ [tool.poetry]
7
+ name = "ebook2audiobook"
8
+ version = "0.0.0"
9
+
10
+ [tool.setuptools.dynamic]
11
+ version = {file = "VERSION.txt"}
12
+
13
+ [project]
14
+ name = "ebook2audiobook"
15
+ description = "Convert eBooks to audiobooks with chapters and metadata"
16
+ authors = [
17
+ { name = "Drew Thomasson" }
18
+ ]
19
+ dependencies = [
20
+ "argostranslate",
21
+ "beautifulsoup4",
22
+ "cutlet",
23
+ "deep_translator",
24
+ "demucs",
25
+ "docker",
26
+ "ebooklib",
27
+ "fastapi",
28
+ "fugashi",
29
+ "gradio>=5.42.0",
30
+ "hangul-romanize",
31
+ "indic-nlp-library",
32
+ "iso-639",
33
+ "jieba",
34
+ "soynlp",
35
+ "pythainlp",
36
+ "pydub",
37
+ "pyannote-audio",
38
+ "mutagen",
39
+ "nvidia-ml-py",
40
+ "PyOpenGL",
41
+ "pypinyin",
42
+ "ray",
43
+ "regex",
44
+ "translate",
45
+ "tqdm",
46
+ "unidic",
47
+ "pymupdf4llm",
48
+ "sudachipy",
49
+ "sudachidict_core",
50
+ "transformers==4.51.3",
51
+ "coqui-tts[languages]==0.26.0",
52
+ "torchvggish"
53
+ ]
54
+ readme = "README.md"
55
+ requires-python = ">3.9,<3.13"
56
+ classifiers = [
57
+ "Programming Language :: Python :: 3",
58
+ "License :: OSI Approved :: MIT License",
59
+ "Operating System :: OS Independent",
60
+ ]
61
+ scripts = { "ebook2audiobook" = "app:main" }
62
+
63
+ [project.urls]
64
+ "Homepage" = "https://github.com/DrewThomasson/ebook2audiobook"
requirements.txt ADDED
@@ -0,0 +1,35 @@
1
+ argostranslate
2
+ beautifulsoup4
3
+ cutlet
4
+ deep_translator
5
+ demucs
6
+ docker
7
+ ebooklib
8
+ fastapi
9
+ fugashi
10
+ gradio>=5.42.0
11
+ hangul-romanize
12
+ indic-nlp-library
13
+ iso-639
14
+ jieba
15
+ soynlp
16
+ num2words
17
+ pythainlp
18
+ mutagen
19
+ nvidia-ml-py
20
+ phonemizer-fork
21
+ pydub
22
+ pyannote-audio
23
+ PyOpenGL
24
+ pypinyin
25
+ ray
26
+ regex
27
+ translate
28
+ tqdm
29
+ unidic
30
+ pymupdf4llm
31
+ sudachipy
32
+ sudachidict_core
33
+ transformers==4.51.3
34
+ coqui-tts[languages]==0.26.0
35
+ torchvggish
setup.py ADDED
@@ -0,0 +1,54 @@
+ import subprocess
+ import sys
+ from setuptools import setup, find_packages
+ from setuptools.command.develop import develop
+ from setuptools.command.install import install
+ import os
+
+ cwd = os.path.dirname(os.path.abspath(__file__))
+
+ def get_version():
+     with open("VERSION.txt", "r") as f:
+         return f.read().strip()
+
+ with open("README.md", "r", encoding='utf-8') as fh:
+     long_description = fh.read()
+
+ with open('requirements.txt') as f:
+     requirements = f.read().splitlines()
+
+ class PostInstallCommand(install):
+     def run(self):
+         install.run(self)
+         try:  # equivalent to running: python -m unidic download
+             subprocess.run([sys.executable, '-m', 'unidic', 'download'], check=True)
+         except Exception:
+             print("unidic download failed during installation, but it will be re-attempted a different way when the app itself runs.")
+
+
+ setup(
+     name='ebook2audiobook',
+     version=get_version(),
+     python_requires=">3.9,<3.13",
+     author="Drew Thomasson",
+     description="Convert eBooks to audiobooks with chapters and metadata",
+     long_description=long_description,
+     long_description_content_type="text/markdown",
+     url="https://github.com/DrewThomasson/ebook2audiobook",
+     packages=find_packages(),
+     install_requires=requirements,
+     classifiers=[
+         "Programming Language :: Python :: 3",
+         "License :: OSI Approved :: MIT License",
+         "Operating System :: OS Independent",
+     ],
+     include_package_data=True,
+     entry_points={
+         "console_scripts": [
+             "ebook2audiobook = app:main",
+         ],
+     },
+     cmdclass={
+         'install': PostInstallCommand,
+     }
+ )
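
A note on how setup.py reads requirements.txt: it passes every line of the file straight to `install_requires`, which works for the file as committed but would also pass along blank lines or comments if any were added later. The following is a minimal sketch of a more defensive reader, not part of this commit. The helper name `parse_requirements` is hypothetical, and the sketch assumes the file holds only plain requirement specifiers (no `-r` includes or other pip options).

```python
def parse_requirements(text: str) -> list[str]:
    """Return requirement specifiers, skipping blank lines and comments.

    Hypothetical helper for illustration; it does not handle pip options
    such as "-r other.txt" or "--index-url".
    """
    requirements = []
    for raw_line in text.splitlines():
        # Drop a trailing inline comment (pip expects whitespace before '#').
        line = raw_line.split(" #", 1)[0].strip()
        # Skip empty lines and full-line comments.
        if not line or line.startswith("#"):
            continue
        requirements.append(line)
    return requirements


if __name__ == "__main__":
    sample = "gradio>=5.42.0\n\n# audio\npydub  # pinned later\n"
    print(parse_requirements(sample))
```

In setup.py this could stand in for `f.read().splitlines()`, for example `requirements = parse_requirements(f.read())`, under the same assumptions.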