Predictive Models on Random Data - podcast episode cover

Predictive Models on Random Data

Jul 22, 201637 min
--:--
--:--
Download Metacast podcast app
Listen to this episode in Metacast mobile app
Don't just listen to podcasts. Learn from them with transcripts, summaries, and chapters for every episode. Skim, search, and bookmark insights. Learn more

Episode description

This week is an insightful discussion with Claudia Perlich about some situations in machine learning where models can be built, perhaps by well-intentioned practitioners, to appear to be highly predictive despite being trained on random data. Our discussion covers some novel observations about ROC and AUC, as well as an informative discussion of leakage.

Much of our discussion is inspired by two excellent papers Claudia authored: Leakage in Data Mining: Formulation, Detection, and Avoidance and On Cross Validation and Stacking: Building Seemingly Predictive Models on Random Data. Both are highly recommended reading!

For the best experience, listen in Metacast app for iOS or Android