Unlocking Unstructured Data with LLMs

03 Jul 2025 • 27 min • EN

Shreya Shankar is a PhD student in the EECS department at UC Berkeley. This episode explores how Large Language Models (LLMs) are changing the way enterprises process unstructured data such as text documents and PDFs. It introduces DocETL, a framework that applies a MapReduce approach with LLMs to perform semantic extraction, thematic analysis, and summarization at scale.

Subscribe to the Gradient Flow Newsletter 📩 https://gradientflow.substack.com/

Subscribe: Apple · Spotify · Overcast · Pocket Casts · AntennaPod · Podcast Addict · Amazon · RSS.

Detailed show notes, with links to many references, can be found on The Data Exchange website.

From "The Data Exchange with Ben Lorica"
