fadi77/StyleTTS2-LibriTTS-arabic

fadi77 committed on Apr 19

Commit 49ae1d1 · verified · 1 Parent(s): ec38903

Update README.md

Files changed (1)

README.md +10 -9

@@ -14,6 +14,16 @@ hardware: H100
 This is an Arabic text-to-speech model based on StyleTTS2 architecture, specifically adapted for Arabic language synthesis. The model achieves good quality Arabic speech synthesis, though not yet state-of-the-art, and further experimentation is needed to optimize performance for Arabic language specifically. All training objectives from the original StyleTTS2 were maintained, except for the WavLM objectives which were removed as they were primarily designed for English speech.
+## Example
+Here is an example output from the model:
+#### Sample 1
+<audio controls>
+  <source src="https://huggingface.co/fadi77/StyleTTS2-LibriTTS-arabic/resolve/main/synthesized_audio.wav" type="audio/wav">
+  Your browser does not support the audio element.
+</audio>
 ## Efficiency and Performance
 A key strength of this model lies in its efficiency and performance characteristics:
@@ -25,15 +35,6 @@ A key strength of this model lies in its efficiency and performance characterist
 Note: According to the StyleTTS2 authors, performance should improve further when training a single-speaker model from scratch rather than fine-tuning. This wasn't attempted in our case due to computational resource constraints, suggesting potential for even better results with more extensive training.
-## Example
-Here is an example output from the model:
-#### Sample 1
-<audio controls>
-  <source src="https://huggingface.co/fadi77/StyleTTS2-LibriTTS-arabic/resolve/main/synthesized_audio.wav" type="audio/wav">
-  Your browser does not support the audio element.
-</audio>
 ## Model Details