
AI safety researcher Nate Soares explains why he believes there's at least a 95% chance that current AI development will lead to human extinction, and why we're accelerating toward that outcome. Soares, who has been working on AI alignment since 2012, breaks down the fundamental problem: we're building increasingly intelligent systems without any ability to control what they actually want or pursue.

The conversation covers current AI behavior that wasn't programmed: threatening users, keeping psychotic people in delusional states, and repeatedly lying when caught. Soares explains why these aren't bugs to be fixed but symptoms of a deeper problem. We can't point AI systems at any specific goal, not even something simple like "make a diamond." Instead, we get systems with bizarre drives that are only distantly related to their training.

Soares addresses the "racing China" argument and why it misunderstands the threat. He explains why AI engineers can build powerful systems without understanding what's actually happening inside them, and why this matters. Using examples from evolutionary biology, he shows why there's no reason to expect AI systems to develop human-like morality or values.

The discussion covers why a catastrophic warning event probably won't help, what international coordination could look like, and why current safety efforts fall short of what's needed. Soares is direct about industry motivations, technical limitations, and the timeline we're facing.

Nate Soares has been researching AI alignment and safety since 2012. He works at the Machine Intelligence Research Institute (MIRI), one of the pioneering organizations focused on ensuring advanced AI systems are aligned with human values.
From "Doomer Optimism"