Secure Talk Podcast

Hijacking AI Memory: Inside Johann Rehberger's ChatGPT Security Breakthrough

01 Apr 2025 • 46 min • EN

In this eye-opening episode of SecureTalk, host Justin Beals interviews Johann Rehberger, a seasoned cybersecurity expert and Red Team Director at Electronic Arts, about his groundbreaking discovery of a critical vulnerability in ChatGPT's memory system. Johann shares how his security background and curiosity about AI led him to uncover the "SPAIWARE" attack - a persistent malicious instruction that can be injected into ChatGPT's long-term memory, potentially leading to data exfiltration and other security risks.

Key Topics Covered

- Johann's journey from Microsoft development consultant to becoming a leading red team expert specializing in AI security
- The discovery of ChatGPT's memory system vulnerability and how it could be exploited
- How traditional security concepts like the CIA security triad (Confidentiality, Integrity, Availability) apply to AI systems
- The development of "SPAIWARE" - a persistent prompt injection attack that can leak user data
- Command and control infrastructure using prompt injection techniques
- The challenges of securing agentic AI systems that can control web browsers and execute tasks
- The evolving relationship between security researchers and AI companies like OpenAI

Notable Quotes

"I think using this system is just so important because it can help you. They are so powerful. I started using it daily. But the security mindset of course too, because I use it for my productivity, but I always use it for trying to find the flaws and trying to understand how it works." - Johann Rehberger

"What I did basically was use that technique and then insert that instruction in memory. So that whenever there's a conversation turn, the user has a question, ChatGPT responds. Every single conversation turn will be sent to the third-party server. So this is where the word spyware basically kind of came from." - Johann Rehberger

"The better the models become, the better they follow instructions, including attacker instructions." - Johann Rehberger

About Johann Rehberger

Johann Rehberger is the Red Team Director at Electronic Arts with extensive experience in cybersecurity. His career includes roles at Microsoft, where he led the Red Team for Azure Data, and Uber, where he served as Red Team Lead. Johann is known for his pioneering work in AI security, specifically identifying and responsibly disclosing vulnerabilities in large language models like ChatGPT.

Resources Mentioned

- Johann's blog on machine learning security (https://embracethered.com/blog/index.html)
- Black Hat Europe presentation on ChatGPT security vulnerabilities
- LLM OWASP Top 10 vulnerability classifications

Connect With Us

Follow SecureTalk for more insights on cybersecurity trends and emerging threats. Visit our website at www.securetalkpodcast.com for more episodes and resources.

#AISecurityRisks #PromptInjection #ChatGPT #Cybersecurity #AIVulnerabilities #RedTeaming #SecureTalk
