Text-to-Speech (TTS) with Tacotron2 trained on a custom german dataset with 12 days voice using speechbrain.
Trained for 39 epochs (english speechbrain models are trained for 750 epochs) so there is room for improvement and the model is most likely to be updated soon. The hifigan vocoder can fortunately be used language-independently.
How to use
Install speechbrain.
pip install speechbrain
Generate spectrogram (line 17) and generate wav using hifigan (line 18).
import torchaudio
from speechbrain.pretrained import Tacotron2
from speechbrain.pretrained import HIFIGAN
# Intialize TTS (tacotron2) and Vocoder (HiFIGAN)
tacotron2 = Tacotron2.from_hparams(source="padmalcom/tts-tacotron2-german", savedir="tmpdir_tts")
hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")
# Running the TTS
mel_output, mel_length, alignment = tacotron2.encode_text("Die Sonne schien den ganzen Tag.")
# Running Vocoder (spectrogram-to-waveform)
waveforms = hifi_gan.decode_batch(mel_output)
# Save the waverform
torchaudio.save('example_TTS.wav',waveforms.squeeze(1), 22050)
數據統計
數據評估
本站OpenI提供的padmalcom/tts-tacotron2-german都來源于網絡,不保證外部鏈接的準確性和完整性,同時,對于該外部鏈接的指向,不由OpenI實際控制,在2023年 5月 26日 下午6:13收錄時,該網頁上的內容,都屬于合規合法,后期網頁的內容如出現違規,可以直接聯系網站管理員進行刪除,OpenI不承擔任何責任。