MelNet can also mimic Stephen Hawking, George Takei, and Jane Goodall, among others, which makes sense considering that it was trained on audiobooks as well as a 452-hour dataset of TED talks. Unlike WaveNet and other models that are trained directly on audio waveforms, MelNet is trained on spectrograms. Because a spectrogram packs far more audio into each timestep, the model can capture the longer-range patterns, known as "high-level structure," in a person's voice, which could one day result in higher-quality AI assistants.
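To make the waveform-versus-spectrogram distinction concrete, here is a minimal sketch (not MelNet's actual pipeline) using a synthetic tone in place of real speech. It shows how a one-second clip that a waveform model must process as 16,000 individual samples becomes a much shorter sequence of frames once converted to a spectrogram, which is what lets a spectrogram-based model see longer-range structure at once:

```python
import numpy as np
from scipy.signal import spectrogram

# Stand-in for a speech clip: one second of a 440 Hz tone at 16 kHz.
sr = 16000
t = np.arange(sr) / sr
waveform = np.sin(2 * np.pi * 440 * t)

# A waveform model (e.g. WaveNet) operates on every raw sample.
print(waveform.shape)  # (16000,) samples for one second of audio

# A spectrogram collapses short windows of samples into single frames,
# so the same second of audio becomes only a few dozen timesteps.
freqs, times, spec = spectrogram(waveform, fs=sr, nperseg=512, noverlap=256)
print(spec.shape)  # (frequency bins, time frames), far fewer frames than samples
```

The exact window sizes here are illustrative; the point is the ratio: tens of spectrogram frames stand in for thousands of raw samples, trading fine waveform detail for a view of structure that unfolds over longer stretches of speech.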