Building Voice AI Agents That Don’t Suck with Kwindla Kramer

15 Jul 2025 • 73 min • EN
73 min
00:00
01:13:02
No file found

In this episode, Kwindla Kramer, co-founder and CEO of Daily and creator of the open source Pipecat framework, joins us to discuss the architecture and challenges of building real-time, production-ready conversational voice AI. Kwin breaks down the full stack for voice agents—from the models and APIs to the critical orchestration layer that manages the complexities of multi-turn conversations. We explore why many production systems favor a modular, multi-model approach over the end-to-end models demonstrated by large AI labs, and how this impacts everything from latency and cost to observability and evaluation. Kwin also digs into the core challenges of interruption handling, turn-taking, and creating truly natural conversational dynamics, and how to overcome them. We discuss use cases, thoughts on where the technology is headed, the move toward hybrid edge-cloud pipelines, and the exciting future of real-time video avatars, and much more. The complete show notes for this episode can be found at https://twimlai.com/go/739.

From "The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)"

Listen on your iPhone

Download our iOS app and listen to interviews anywhere. Enjoy all of the listener functions in one slick package. Why not give it a try?

App Store Logo
application screenshot

Popular categories