AI Voice Synthesis with Zach Johnson | LMNT | Generative AI Games
Voice synthesis is about to transform game development, virtual worlds, and simulations of all kinds. I spoke with Zach Johnson, cofounder of LMNT, an AI voice synthesis company that uses the magic of prosody imitation to make voices that sound strikingly like the original speaker.
We covered how these technologies work; walked through demos with David Attenborough and a flight simulator; discussed the new type of creator economy this will form; and talked about technologies that enable scalability and reliability (such as memory-safe languages, e.g., Rust), as well as other things Zach learned from his time working on Google Glass.
If you prefer to listen to the podcast, you can find it on Spotify here:
Voice Synthesis with Zach Johnson, LMNT - on Spotify
However, I suggest at least watching the flight simulator demo at 55:28 because it shows a really compelling example of what can be done with real-time voice synthesis in a simulated environment.
Time Stamps
00:00 Introduction
03:38 LMNT Voice Synthesis Demo
06:55 Prosody
10:15 Creativity with Generative AI
12:22 Intentionality in Voice Synthesis
19:05 How AI Voice Tech works
23:18 Zach's Background at Google
32:08 Scalability of Systems
36:21 Learning New Things
46:28 AI Transforming the World
51:52 New AI Creator Economy
55:28 AI Voice in Flight Simulator ← must see!
59:40 Business Models
1:02:26 User Generated Content
Show Notes
Zach’s company is LMNT (pronounced “Element”).
You can try a live demo of LMNT voice synthesis at this URL: app.lmnt.com
Zach referred to a Berkeley paper, “Denoising Diffusion Probabilistic Models,” about image synthesis using diffusion, when we discussed how the first voice synthesis models (e.g., DiffWave) used a similar approach: https://arxiv.org/pdf/2006.11239.pdf
…and also a Stanford paper, “Score-Based Generative Modeling through Stochastic Differential Equations,” which improved upon early diffusion techniques: https://arxiv.org/pdf/2011.13456.pdf
The DiffWave paper covers how diffusion techniques were applied to audio synthesis.
The LMNT implementation of DiffWave can be found in the lmnt-diffwave GitHub repo.
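For readers curious what "diffusion" means in the papers above, here is a minimal sketch of the forward (noising) step that denoising diffusion models are trained to reverse, applied to a toy audio waveform. It is not LMNT's code and not taken from the lmnt-diffwave repo: the function names, the 50-step linear schedule, and the sine-wave input are all illustrative assumptions, and the learned denoising network is left out entirely.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.05):
    """Linearly spaced noise variances, one per diffusion step (illustrative values)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t is the running product of (1 - beta_s) for all s up to t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(clean, alpha_bar, rng):
    """Sample x_t = sqrt(alpha_bar) * x_0 + sqrt(1 - alpha_bar) * gaussian_noise."""
    signal_scale = math.sqrt(alpha_bar)
    noise_scale = math.sqrt(1.0 - alpha_bar)
    return [signal_scale * x + noise_scale * rng.gauss(0.0, 1.0) for x in clean]


# Toy "audio": 10 ms of a 440 Hz sine wave at 16 kHz (hypothetical input).
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(160)]

betas = linear_beta_schedule(50)
alpha_bars = cumulative_alphas(betas)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
lightly_noised = add_noise(clean, alpha_bars[0], rng)    # early step: almost clean
heavily_noised = add_noise(clean, alpha_bars[-1], rng)   # last step: noise dominates
```

A model in this family is trained to predict the noise that was added at each step, so that at synthesis time it can start from noise and work backward, step by step, toward a clean waveform.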