Diving Deep into Synthetic Data with Alex Watson of Gretel.ai
Alex Watson is the co-founder and CEO of Gretel.ai, a startup that offers APIs for creating anonymized and synthetic datasets. Previously he was the founder of Harvest.ai, whose product Macie, an analytics platform protecting against data breaches, was acquired by AWS.

Learn more about Alex and Gretel AI: http://gretel.ai

Every Thursday I send out the most useful things I’ve learned, curated specifically for the busy machine learning engineer. Sign up here: https://www.cyou.ai/newsletter

Follow Charlie on Twitter: https://twitter.com/CharlieYouAI
Subscribe to ML Engineered: https://mlengineered.com/listen
Comments? Questions? Submit them here: http://bit.ly/mle-survey
Take the Giving What We Can Pledge: https://www.givingwhatwecan.org/

Timestamps:
02:15 Introducing Alex Watson
03:45 How Alex was first exposed to programming
05:00 Alex's experience starting Harvest AI, getting acquired by AWS, and integrating their product at massive scale
21:20 How Alex first saw the opportunity for Gretel.ai
24:20 The most exciting use-cases for synthetic data
28:55 Theoretical guarantees of anonymized data with differential privacy
36:40 Combining pre-training with synthetic data
38:40 When to anonymize data and when to synthesize it
41:25 How Gretel's synthetic data engine works
44:50 Requirements of a dataset to create a synthetic version
49:25 Augmenting datasets with synthetic examples to address representation bias
52:45 How Alex recommends teams get started with Gretel.ai
59:00 Expected accuracy loss from training models on synthetic data
01:03:15 Biggest surprises from building Gretel.ai
01:05:25 Organizational patterns for protecting sensitive data
01:07:40 Alex's vision for Gretel's data catalog
01:11:15 Rapid fire questions

Links:
Gretel.ai Blog
NetFlix Cancels Recommendation Contest After Privacy Lawsuit
Greylock - The Github of Data
Improving massively imbalanced datasets in machine learning with synthetic data
Deep dive on generating synthetic data for Healthcare
Gretel’s New Synthetic Performance Report
The...
From "Machine Learning Engineered"