#222 Andrew Feldman: How Cerebras Systems Is Disrupting AI Inference

28 Nov 2024 • 42 min • EN

This episode is sponsored by Shopify. Shopify is a commerce platform that allows anyone to set up an online store and sell their products. Whether you’re selling online, on social media, or in person, Shopify has you covered on every base. With Shopify you can sell physical and digital products. You can sell services, memberships, ticketed events, rentals, and even classes and lessons. Sign up for a $1 per month trial period at http://shopify.com/eyeonai

In this episode of the Eye on AI podcast, Andrew D. Feldman, Co-Founder and CEO of Cerebras Systems, unveils how Cerebras is disrupting AI inference and high-performance computing.

Andrew joins Craig Smith to discuss the groundbreaking wafer-scale engine, Cerebras’ record-breaking inference speeds, and the future of AI in enterprise workflows. From designing the fastest inference platform to simplifying AI deployment with an API-driven cloud service, Cerebras is setting new standards in AI hardware innovation.

We explore the shift from GPUs to custom architectures, the rise of large language models like Llama and GPT, and how AI is driving enterprise transformation. Andrew also dives into the debate over open-source vs. proprietary models, AI’s role in climate mitigation, and Cerebras’ partnerships with global supercomputing centers and industry leaders.

Discover how Cerebras is shaping the future of AI inference and why speed and scalability are redefining what’s possible in computing.

Don’t miss this deep dive into AI’s next frontier with Andrew Feldman.

Like, subscribe, and hit the notification bell for more episodes!

Stay Updated:
Craig Smith Twitter: https://twitter.com/craigss
Eye on A.I. Twitter: https://twitter.com/EyeOn_AI

(00:00) Intro to Andrew Feldman & Cerebras Systems
(00:43) The rise of AI inference
(03:16) Cerebras’ API-powered cloud
(04:48) Competing with NVIDIA’s CUDA
(06:52) The rise of Llama and LLMs
(07:40) OpenAI's hardware strategy
(10:06) Shifting focus from training to inference
(13:28) Open-source vs proprietary AI
(15:00) AI's role in enterprise workflows
(17:42) Edge computing vs cloud AI
(19:08) Edge AI for consumer apps
(20:51) Machine-to-machine AI inference
(24:20) Managing uncertainty with models
(27:24) Impact of U.S.–China export rules
(30:29) U.S. innovation policy challenges
(33:31) Developing wafer-scale engines
(34:45) Cerebras’ fast inference service
(37:40) Global partnerships in AI
(38:14) AI in climate & energy solutions
(39:58) Training and inference cycles
(41:33) AI training market competition

From "Eye On A.I."

