lunarlist/tts-thai-last-step


	---
	license: mit
	datasets:
	- lunarlist/edited_common_voice
	language:
	- th
	library_name: nemo
	pipeline_tag: text-to-speech
	---

	This is a Thai text-to-speech (TTS) model. It uses a voice from the [Common Voice dataset](https://commonvoice.mozilla.org/), modified so that it does not sound like the original speaker.

	> pip install "nemo_toolkit[tts]" soundfile

	```python
	import soundfile as sf
	import torch
	from nemo.collections.tts.models import Tacotron2Model, UnivNetModel

	# Load the Thai Tacotron2 model and an English UnivNet vocoder, both on CPU
	# so that the spectrogram and the vocoder end up on the same device.
	model = Tacotron2Model.from_pretrained("lunarlist/tts-thai-last-step").to("cpu")
	vocoder = UnivNetModel.from_pretrained(model_name="tts_en_libritts_univnet").to("cpu")

	text = "ภาษาไทย ง่าย นิด เดียว"  # roughly: "Thai is really easy"

	# Map each character to its index in the model's label list.
	char_to_id = {ch: i for i, ch in enumerate(model.hparams["cfg"]["labels"])}
	# The character ids are wrapped in the fixed ids 66 (start) and 67 (end).
	tokens = torch.tensor([[66] + [char_to_id[ch] for ch in text] + [67]]).int().to("cpu")

	spectrogram = model.generate_spectrogram(tokens=tokens)
	audio = vocoder.convert_spectrogram_to_audio(spec=spectrogram)

	# Save the audio to disk in a file called speech.wav (22,050 Hz sample rate).
	sf.write("speech.wav", audio.to("cpu").detach().numpy()[0], 22050)
	```
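
The character-to-id lookup in the example raises a `KeyError` when the input contains a character that is not in the model's label list. Below is a minimal sketch of a more defensive encoder. `encode_text` and the toy label list are hypothetical names used only for illustration, and the start and end ids are passed in as parameters (66 and 67 are simply the values hard-coded in the example).

```python
from typing import List, Sequence, Tuple


def encode_text(
    text: str,
    labels: Sequence[str],
    start_id: int,
    end_id: int,
) -> Tuple[List[int], List[str]]:
    """Map each character to its index in `labels`, wrapped in start/end ids.

    Returns the id sequence and the characters that were skipped because
    they do not appear in `labels`.
    """
    char_to_id = {ch: i for i, ch in enumerate(labels)}
    ids = [start_id]
    skipped = []
    for ch in text:
        if ch in char_to_id:
            ids.append(char_to_id[ch])
        else:
            skipped.append(ch)
    ids.append(end_id)
    return ids, skipped


# Toy label list for illustration only; with the real model the list
# would come from model.hparams["cfg"]["labels"].
toy_labels = [" ", "ก", "ข", "ค"]
ids, skipped = encode_text("กข x", toy_labels, start_id=66, end_id=67)
print(ids)      # [66, 1, 2, 0, 67]
print(skipped)  # ['x']
```

With the real model, the returned `ids` could be wrapped as `torch.tensor([ids]).int()` and passed to `generate_spectrogram`, and `skipped` could be logged to show which characters were dropped.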

	Medium: [Thai Text-To-Speech with Tacotron2](https://medium.com/@taetiyateachamatavorn/text-to-speech-%E0%B8%A0%E0%B8%B2%E0%B8%A9%E0%B8%B2%E0%B9%84%E0%B8%97%E0%B8%A2%E0%B8%94%E0%B9%89%E0%B8%A7%E0%B8%A2-tacotron2-986417b44edc)