The Wayback Machine - https://web.archive.org/web/20201209181706/https://github.com/TensorSpeech/TensorFlowTTS/issues/201
🇨🇳 Chinese TTS now available 😘 #201

Open
dathudeptrai opened this issue Aug 11, 2020 · 28 comments
Comments

Collaborator

@dathudeptrai dathudeptrai commented Aug 11, 2020

Chinese TTS is now available, thanks to @azraelkuan for his support :D. The model was trained on the Baker dataset (https://www.data-baker.com/open_source.htmlt). The pretrained model is licensed under CC BY-NC-SA 4.0 (https://creativecommons.org/licenses/by-nc-sa/4.0/) since the dataset is non-commercial :D

Please check out the Colab below and enjoy :D.

https://colab.research.google.com/drive/1YpSHRBRPBI7cnTkQn1UcVTWEQVbsUm1S?usp=sharing

Note: these are just initial results; there is more that can be done to make the model better.

cc: @candlewill @l4zyf9x @machineko


@wyp1996 wyp1996 commented Aug 27, 2020

Hello. Thanks for your great work! I'm new to the TTS area and this notebook could be a good start.
However, I gave it a try and found that the Chinese model currently doesn't make pauses. Is this already among your planned improvements?

Collaborator Author

@dathudeptrai dathudeptrai commented Aug 27, 2020

cc: @azraelkuan (person in charge)

Collaborator

@azraelkuan azraelkuan commented Aug 28, 2020

@wyp1996 For now, we do not have a frontend model, but we did place the #1, #2, #3 and sil symbols in the training data.


@jinfagang jinfagang commented Aug 29, 2020

@dathudeptrai Is this already supported on the master branch?

Collaborator Author

@dathudeptrai dathudeptrai commented Aug 29, 2020

@jinfagang Everything is on the master branch. (updated content :D.)


@jinfagang jinfagang commented Aug 30, 2020

@dathudeptrai Is there a README on how to train on the Biaobei data?

Collaborator

@azraelkuan azraelkuan commented Aug 30, 2020

@jinfagang Just download the Biaobei data and extract it to baker, then run:

tensorflow-tts-preprocess --dataset baker --rootdir ~/Data/baker --outdir dump --config ./preprocess/baker_preprocess.yaml

Then train it using baker's YAML config.


@IreneZhou2018 IreneZhou2018 commented Sep 4, 2020

@azraelkuan As far as I know, the sampling rate of the audio in the Biaobei dataset is 48 kHz, but in baker_preprocess.yaml the sampling rate is set to 24 kHz. I haven't tried the preprocessing yet. Is that a mistake, or am I misunderstanding the code?

Collaborator Author

@dathudeptrai dathudeptrai commented Sep 4, 2020

@IreneZhou2018 The sampling rate in the config is the target sampling rate. If the dataset's sample rate is 48 kHz, we resample it to the target rate (see the code here: https://github.com/TensorSpeech/TensorFlowTTS/blob/master/tensorflow_tts/bin/preprocess.py#L194-L196)
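
As a rough illustration of the resample-to-target idea described above, here is a minimal, standard-library-only sketch. It is not the code at the linked preprocess.py lines: the helper names `resample_linear` and `maybe_resample` are hypothetical, the `sampling_rate` config key is an assumption, and naive linear interpolation stands in for whatever resampler the project actually uses (a real pipeline would want a band-limited resampler to avoid aliasing).

```python
def resample_linear(samples, source_rate, target_rate):
    """Naively resample a mono signal by linear interpolation.

    Illustration only: there is no anti-aliasing filter here.
    """
    if source_rate == target_rate:
        return list(samples)
    n_out = int(len(samples) * target_rate / source_rate)
    ratio = source_rate / target_rate
    out = []
    for i in range(n_out):
        pos = i * ratio                      # fractional index into the input
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out


def maybe_resample(samples, source_rate, config):
    """Resample only when the file's rate differs from the configured target."""
    target_rate = config["sampling_rate"]    # assumed key name
    if source_rate != target_rate:
        samples = resample_linear(samples, source_rate, target_rate)
    return samples, target_rate


# One second of dummy 48 kHz audio, resampled to a 24 kHz target.
audio_48k = [float(i) for i in range(48000)]
audio_24k, rate = maybe_resample(audio_48k, 48000, {"sampling_rate": 24000})
print(len(audio_24k), rate)
```

The point of the sketch is only that the config value describes the output, so a 48 kHz source and a 24 kHz config are not in conflict.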


@IreneZhou2018 IreneZhou2018 commented Sep 4, 2020

@dathudeptrai OK, thanks for your reply. The work is amazing!


@MachineLP MachineLP commented Sep 8, 2020

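TensorFlowTTS training data generation: fetching text data, converting the text to pinyin, generating TensorFlowTTS training audio with Alibaba Cloud TTS, and running preprocess/normalize before training: https://github.com/MachineLP/TensorFlowTTS_chinese/tree/master/generate_tts_data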


@wyp1996 wyp1996 commented Sep 8, 2020

TensorFlowTTS training data generation: fetching text data, converting the text to pinyin, generating TensorFlowTTS training audio with Alibaba Cloud TTS, and running preprocess/normalize before training: https://github.com/MachineLP/TensorFlowTTS_chinese/tree/master/generate_tts_data

Hi, do you have a more specific README? It seems promising and I'd like to give it a try :)


@Hongpeng1992 Hongpeng1992 commented Sep 8, 2020

@jinfagang Just download the Biaobei data and extract it to baker, then run:

tensorflow-tts-preprocess --dataset baker --rootdir ~/Data/baker --outdir dump --config ./preprocess/baker_preprocess.yaml

Then train it using baker's YAML config.


@Hongpeng1992 Hongpeng1992 commented Sep 8, 2020

It seems that the FastSpeech2 model does not work properly when the sentence is long, for example: 君不见 黄河之水天上来 奔流到海不复回 君不见 高堂明镜悲白发 朝如青丝暮成雪 人生得意须尽欢 莫使金樽空对月

Collaborator Author

@dathudeptrai dathudeptrai commented Sep 8, 2020

It seems that the FastSpeech2 model does not work properly when the sentence is long, for example: 君不见 黄河之水天上来 奔流到海不复回 君不见 高堂明镜悲白发 朝如青丝暮成雪 人生得意须尽欢 莫使金樽空对月

#208 (comment)


@Hongpeng1992 Hongpeng1992 commented Sep 8, 2020

Thank you. I am still evaluating the model.


@MachineLP MachineLP commented Sep 8, 2020

Chinese TTS: you are welcome to add WeChat ID lp9628 and join the WeChat group to discuss training and testing details.


@jinfagang jinfagang commented Oct 16, 2020

@dathudeptrai I tried to train and got this error:

2020-10-16 22:28:20.499294: W tensorflow/core/grappler/optimizers/loop_optimizer.cc:906] Skipping loop optimization for Merge node with control input: cond/branch_executed/_8
Traceback (most recent call last):
  File "examples/tacotron2/train_tacotron2.py", line 488, in <module>
    main()
  File "examples/tacotron2/train_tacotron2.py", line 476, in main
    trainer.fit(
  File "/media/jintian/samsung/source/ai/swarm/exp/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 870, in fit
    self.run()
  File "/media/jintian/samsung/source/ai/swarm/exp/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 101, in run
    self._train_epoch()
  File "/media/jintian/samsung/source/ai/swarm/exp/TensorFlowTTS/tensorflow_tts/trainers/base_trainer.py", line 123, in _train_epoch
    self._train_step(batch)
  File "examples/tacotron2/train_tacotron2.py", line 109, in _train_step
    self.one_step_forward(batch)
  File "/home/jintian/anaconda3/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 780, in __call__
    result = self._call(*args, **kwds)
  File "/home/jintian/anaconda3/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 840, in _call
    return self._stateless_fn(*args, **kwds)
  File "/home/jintian/anaconda3/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 2829, in __call__
    return graph_function._filtered_call(args, kwargs)  # pylint: disable=protected-access
  File "/home/jintian/anaconda3/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1843, in _filtered_call
    return self._call_flat(
  File "/home/jintian/anaconda3/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 1923, in _call_flat
    return self._build_call_outputs(self._inference_function.call(
  File "/home/jintian/anaconda3/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 545, in call
    outputs = execute.execute(
  File "/home/jintian/anaconda3/lib/python3.8/site-packages/tensorflow/python/eager/execute.py", line 59, in quick_execute
    tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.InvalidArgumentError:    Trying to access element 62 in a list with 62 elements.
	 [[{{node while_19/body/_1/while/TensorArrayV2Read_1/TensorListGetItem}}]]
	 [[tacotron2/encoder/bilstm/forward_lstm/PartitionedCall]] [Op:__inference__one_step_forward_23575]

Function call stack:
_one_step_forward -> _one_step_forward -> _one_step_forward

My command:

python examples/tacotron2/train_tacotron2.py \
  --train-dir ./dump/train/ \
  --dev-dir ./dump/valid/ \
  --outdir ./examples/tacotron2/exp/train.tacotron2.baker.v1/ \
  --config ./examples/tacotron2/conf/tacotron2.baker.v1.yaml \
  --use-norm 1 \
  --mixed_precision 0 \
  --resume ""


@leijue222 leijue222 commented Oct 21, 2020

  1. Pauses at punctuation do not seem to be handled.
  2. Arabic numerals cannot be synthesized directly.
  3. I hope mixed Chinese and English text can be supported.
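
For item 2, one possible workaround outside the model is to spell out digits as Chinese characters before the text reaches the frontend. The sketch below is only an assumption about how that could look: `spell_out_digits` and `DIGIT_TO_HANZI` are hypothetical names, not part of TensorFlowTTS, and the conversion reads numbers digit by digit, so it does not produce proper cardinal readings such as 二十 for 20.

```python
# Hypothetical preprocessing helper, not part of TensorFlowTTS.
DIGIT_TO_HANZI = {
    "0": "零", "1": "一", "2": "二", "3": "三", "4": "四",
    "5": "五", "6": "六", "7": "七", "8": "八", "9": "九",
}


def spell_out_digits(text):
    """Replace each ASCII digit with its Chinese character, digit by digit."""
    return "".join(DIGIT_TO_HANZI.get(ch, ch) for ch in text)


print(spell_out_digits("我有3个苹果"))
```

A fuller text normalizer would also need to handle cardinals, dates, decimals and units, which this sketch deliberately leaves out.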

@jucaowei jucaowei commented Oct 22, 2020

Hello, the link to the Baker dataset has expired, and the official website says I have no right to access the dataset. I hate to ask, but could you provide another way to get the dataset?


@leijue222 leijue222 commented Oct 22, 2020

@jucaowei The link is here. The data only has a female voice.


@jucaowei jucaowei commented Oct 22, 2020

@jucaowei The link is here. The data only has a female voice.

Thank you!!


@jucaowei jucaowei commented Oct 22, 2020

@jucaowei The link is here. The data only has a female voice.

I get a 404 error. Can you access the website? I got 404 Not Found:
HTTP Status 404 – Not Found
Type Status Report
Message /open_source.htmlt
Description The origin server did not find a current representation for the target resource or is not willing to disclose that one exists.


@leijue222 leijue222 commented Oct 22, 2020

@jucaowei
I can access the website normally. Which country are you in now? Maybe a VPN is needed for network problems?


@jucaowei jucaowei commented Oct 22, 2020

@jucaowei
I can access the website normally. Which country are you in now? Maybe a VPN is needed for network problems?

I already tried a VPN with an HK node and it did not work, but my friend was able to access the website just now. I really appreciate your reply.


@leijue222 leijue222 commented Nov 7, 2020

@azraelkuan Hi! Thanks for your work. Compared with some other reimplementations, your Tacotron2 can synthesize very long sentences without stress or omission problems.
With your model I was able to synthesize up to about 90 seconds of audio. By rights, since the Biaobei dataset consists of relatively short sentences, a model trained on it should not be able to synthesize such long sentences.
Have you applied any special treatment for long sentences?

Collaborator

@azraelkuan azraelkuan commented Nov 9, 2020
