Vocoder with HiFIGAN trained on LJSpeech

This repository provides all the necessary tools for using a HiFIGAN vocoder trained on LJSpeech.
The pre-trained model takes a spectrogram as input and produces a waveform as output. Typically, a vocoder is used after a TTS model that converts input text into a spectrogram.
The sampling frequency is 22050 Hz.

Install SpeechBrain

pip install speechbrain

We encourage you to read our tutorials and learn more about SpeechBrain.

Using the Vocoder

import torch
from speechbrain.pretrained import HIFIGAN
hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir")
mel_specs = torch.rand(2, 80, 298)
waveforms = hifi_gan.decode_batch(mel_specs)
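The random input in the example above has shape (2, 80, 298), read here as (batch, mel channels, frames). As a rough sketch of how the frame count relates to the output length, the helpers below estimate the number of samples and the duration at 22050 Hz. The hop length of 256 samples per frame is an assumption, not something stated on this page, so check the training hparams before relying on it. The function names are hypothetical.

```python
def expected_num_samples(n_frames: int, hop_length: int = 256) -> int:
    """Estimate waveform length in samples for a mel spectrogram.

    hop_length=256 is an assumed value; confirm it against the recipe's hparams.
    """
    return n_frames * hop_length


def duration_seconds(n_frames: int, hop_length: int = 256, sample_rate: int = 22050) -> float:
    """Estimate waveform duration in seconds at the given sampling rate."""
    return expected_num_samples(n_frames, hop_length) / sample_rate


frames = 298  # last dimension of mel_specs in the example above
print(expected_num_samples(frames))        # 76288
print(round(duration_seconds(frames), 2))  # 3.46
```

Under that assumption, each of the two spectrograms in the batch would decode to roughly 3.5 seconds of audio.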

Using the Vocoder with the TTS

import torchaudio
from speechbrain.pretrained import Tacotron2
from speechbrain.pretrained import HIFIGAN

# Initialize TTS (Tacotron2) and vocoder (HiFIGAN)
tacotron2 = Tacotron2.from_hparams(source="speechbrain/tts-tacotron2-ljspeech", savedir="tmpdir_tts")
hifi_gan = HIFIGAN.from_hparams(source="speechbrain/tts-hifigan-ljspeech", savedir="tmpdir_vocoder")

# Run the TTS (text-to-spectrogram)
mel_output, mel_length, alignment = tacotron2.encode_text("Mary had a little lamb")

# Run the vocoder (spectrogram-to-waveform)
waveforms = hifi_gan.decode_batch(mel_output)

# Save the waveform at the model's 22050 Hz sampling rate
torchaudio.save('example_TTS.wav', waveforms.squeeze(1), 22050)

Inference on GPU

To perform inference on the GPU, add run_opts={"device":"cuda"} when calling the from_hparams method.
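As a minimal sketch of the option described above, the helper below builds the `run_opts` dictionary and falls back to the CPU when no GPU is available. The function names `make_run_opts` and `load_vocoder` are hypothetical. The SpeechBrain import is kept inside `load_vocoder` so the first helper can be used on its own.

```python
def make_run_opts(cuda_available: bool) -> dict:
    """Build the run_opts dict for from_hparams, falling back to CPU."""
    return {"device": "cuda" if cuda_available else "cpu"}


def load_vocoder(cuda_available: bool):
    """Load the pre-trained HiFIGAN vocoder on the selected device."""
    # Imported lazily so make_run_opts works without SpeechBrain installed.
    from speechbrain.pretrained import HIFIGAN

    return HIFIGAN.from_hparams(
        source="speechbrain/tts-hifigan-ljspeech",
        savedir="tmpdir_vocoder",
        run_opts=make_run_opts(cuda_available),
    )
```

A typical call would be `hifi_gan = load_vocoder(torch.cuda.is_available())`, after importing `torch`.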

Training

The model was trained with SpeechBrain.
To train it from scratch, follow these steps:

Clone SpeechBrain:

git clone https://github.com/speechbrain/speechbrain/

Install it:

cd speechbrain
pip install -r requirements.txt
pip install -e .

Run Training:

cd recipes/LJSpeech/TTS/vocoder/hifi_gan/
python train.py hparams/train.yaml --data_folder /path/to/LJspeech

You can find our training results (models, logs, etc) here.
