Beyond Guardrails: Defending LLMs Against Sophisticated Attacks

22 May 2025 • 44 min • EN

Jason Martin is an AI Security Researcher at HiddenLayer. This episode explores “policy puppetry,” a universal attack technique that bypasses the safety features of all major language models by using structured formats such as XML or JSON. Detailed show notes, with links to many references, can be found on The Data Exchange website.

Subscribe to the Gradient Flow Newsletter: https://gradientflow.substack.com/

From "The Data Exchange with Ben Lorica"
