Self-Boost via Optimal Retraining: An Analysis via Approximate Message Passing

Nov 27, 2025 · 15 min

Episode description

This research presents a principled framework for **Bayes-optimal retraining** when the training labels are noisy. The central contribution is the derivation of the **Bayes optimal aggregator function**, which specifies how to combine a model's current predictions with the initial noisy labels so as to minimize prediction error. Using the **Approximate Message Passing (AMP)** framework, the authors analyze this iterative retraining procedure in two ground-truth settings: the **Gaussian mixture model (GMM)** and the **generalized linear model (GLM)**. The analysis yields a precise state evolution recursion that characterizes the asymptotic behavior of the estimator across multiple retraining rounds. The authors also develop a practical variant of the optimal aggregator for linear probing, where it is shown to significantly outperform existing retraining baselines, particularly in **high label noise regimes**.
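To make the idea of an aggregator concrete, here is a minimal Python sketch of how a model's prediction and a noisy label could be combined by Bayes' rule. It is not the aggregator derived in the paper. It assumes binary labels in {-1, +1}, a known symmetric flip probability, a calibrated model probability, and conditional independence of the noisy label and the model output given the true label. The function name `aggregate` and its parameters are hypothetical.

```python
def aggregate(pred_prob_pos: float, noisy_label: int, flip_prob: float) -> float:
    """Illustrative soft label E[y | model prediction, noisy label], y in {-1, +1}.

    Assumptions (not taken from the paper):
      - pred_prob_pos is a calibrated estimate of P(y = +1 | x).
      - The observed label equals the true label flipped with probability flip_prob.
      - Given the true label, the noisy label is independent of the model output.
    """
    if noisy_label not in (-1, 1):
        raise ValueError("noisy_label must be -1 or +1")
    if not 0.0 < flip_prob < 0.5:
        raise ValueError("flip_prob must lie strictly between 0 and 0.5")
    if not 0.0 <= pred_prob_pos <= 1.0:
        raise ValueError("pred_prob_pos must lie in [0, 1]")

    # Likelihood of the observed noisy label under each candidate true label.
    like_pos = 1.0 - flip_prob if noisy_label == 1 else flip_prob
    like_neg = flip_prob if noisy_label == 1 else 1.0 - flip_prob

    # Bayes' rule, with the model prediction acting as the prior on y = +1.
    numerator = pred_prob_pos * like_pos
    denominator = numerator + (1.0 - pred_prob_pos) * like_neg
    post_pos = numerator / denominator

    # Map the posterior probability to a soft label in [-1, 1].
    return 2.0 * post_pos - 1.0


# An uninformative model defers to the noisy label; a confident model can
# override it when the label noise is high.
soft_uninformed = aggregate(0.5, 1, 0.2)
soft_override = aggregate(0.9, -1, 0.4)
```

Under these assumptions, the soft label would serve as the retraining target for the next round. The override case illustrates why such a combination may matter most when label noise is high.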