US Patent:
20220310056, Sep 29, 2022
Inventors:
- Mountain View CA, US
Zhehuai Chen - Jersey City NJ, US
Fadi Biadsy - Mountain View CA, US
Pedro J. Moreno Mengibar - Jersey City NJ, US
Assignee:
Google LLC - Mountain View CA
International Classification:
G10L 13/027
G10L 25/18
G10L 15/22
G10L 15/16
G10L 13/047
Abstract:
A method for speech conversion includes receiving, as input to an encoder of a speech conversion model, an input spectrogram corresponding to an utterance, the encoder including a stack of self-attention blocks. The method further includes generating, as output from the encoder, an encoded spectrogram and receiving, as input to a spectrogram decoder of the speech conversion model, the encoded spectrogram generated as output from the encoder. The method further includes generating, as output from the spectrogram decoder, an output spectrogram corresponding to a synthesized speech representation of the utterance.
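The data flow described in the abstract (input spectrogram, then an encoder built from stacked self-attention blocks, then a spectrogram decoder producing an output spectrogram) can be illustrated with a minimal sketch. This is not the patented implementation: the function names, the identity query/key/value projections, the residual connection, the single linear projection used as a stand-in decoder, and all numeric values are assumptions chosen only to make the pipeline concrete.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows are time frames, columns are feature bins


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of attention scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_block(frames: Matrix) -> Matrix:
    """Toy self-attention block (assumption: identity Q/K/V projections).

    Each frame attends to every frame via scaled dot-product attention,
    and the attended vector is added back to the frame (residual).
    """
    dim = len(frames[0])
    out = []
    for query in frames:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in frames
        ]
        weights = softmax(scores)
        attended = [
            sum(w * value[j] for w, value in zip(weights, frames))
            for j in range(dim)
        ]
        out.append([q + a for q, a in zip(query, attended)])
    return out


def encode(spectrogram: Matrix, num_blocks: int = 2) -> Matrix:
    """Encoder: a stack of self-attention blocks applied in sequence."""
    encoded = spectrogram
    for _ in range(num_blocks):
        encoded = self_attention_block(encoded)
    return encoded


def decode_spectrogram(encoded: Matrix, projection: Matrix) -> Matrix:
    """Stand-in spectrogram decoder (assumption: one linear projection).

    `projection` has shape (encoded_dim, output_bins).
    """
    out_bins = len(projection[0])
    return [
        [sum(frame[i] * projection[i][j] for i in range(len(frame)))
         for j in range(out_bins)]
        for frame in encoded
    ]


def convert(spectrogram: Matrix, projection: Matrix, num_blocks: int = 2) -> Matrix:
    """Full path: input spectrogram -> encoded spectrogram -> output spectrogram."""
    return decode_spectrogram(encode(spectrogram, num_blocks), projection)


# Hypothetical 4-frame, 3-bin input and a 3x2 decoder projection.
input_spectrogram = [[0.1, 0.2, 0.3], [0.0, 0.5, 0.1], [0.4, 0.1, 0.2], [0.3, 0.3, 0.3]]
projection = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
output_spectrogram = convert(input_spectrogram, projection)
```

In this sketch the output keeps one row per input frame, so `output_spectrogram` has four frames of two bins each. A real system would use learned projections in both the encoder and the decoder, and would typically pass the output spectrogram to a vocoder to produce a waveform.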