
woodson on April 21, 2023, on: NaturalSpeech 2: Zero-shot speech and singing synt...

No, it’s not similar (other than superficially, as both use diffusion methods in some way). It uses diffusion to generate latent vectors that are fed into a neural audio codec model to produce speech.
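
To make the two stages concrete, here is a rough toy sketch of that kind of pipeline: an iterative denoising loop that turns noise into latent vectors, followed by a decoder that turns the latents into waveform samples. Everything in it is a hypothetical stand-in (the names `toy_denoiser`, `toy_codec_decoder`, `synthesize`, the constants, and the fixed target value); it is not the NaturalSpeech 2 code, and the "denoiser" is a hand-written update rule rather than a trained diffusion model.

```python
import math
import random

LATENT_DIM = 4   # hypothetical size of each latent vector
NUM_FRAMES = 3   # hypothetical number of latent frames
NUM_STEPS = 10   # hypothetical number of denoising steps
HOP = 2          # hypothetical number of waveform samples per latent frame


def toy_denoiser(latents, step):
    """Stand-in for a trained diffusion network.

    Pulls every value toward a fixed target; a real model would predict
    the update from conditioning such as text and a speaker prompt.
    """
    target = 0.5  # hypothetical constant, for illustration only
    return [[x + (target - x) / (step + 1) for x in frame] for frame in latents]


def generate_latents(rng):
    """Stage 1: start from Gaussian noise and refine it step by step."""
    latents = [[rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]
               for _ in range(NUM_FRAMES)]
    for step in reversed(range(NUM_STEPS)):
        latents = toy_denoiser(latents, step)
    return latents


def toy_codec_decoder(latents):
    """Stage 2: stand-in for a neural audio codec decoder.

    Maps each latent frame to HOP waveform samples in [-1, 1].
    """
    waveform = []
    for frame in latents:
        level = math.tanh(sum(frame) / len(frame))
        waveform.extend([level] * HOP)
    return waveform


def synthesize(seed=0):
    """Run both stages: noise -> latents -> waveform samples."""
    rng = random.Random(seed)
    return toy_codec_decoder(generate_latents(rng))
```

The point of the sketch is only the data flow: the diffusion stage never touches audio samples directly, it works in the latent space, and the codec model is what produces the waveform.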