Rewriting History: A Recipe for Interventional Analyses to Study Data Effects on Model Behavior

Oct 22, 2025, 19 min

Episode description

This paper introduces an experimental recipe for interventional analyses that study how training data affects the behavior of language models (LMs). The methodology, termed "Rewriting History," has three stages: selecting target evaluation items, matching relevant pretraining documents to those items, and modifying those documents before retraining the model to measure the effect. The authors demonstrate the approach through case studies on factual knowledge acquisition in LMs, examining how both term co-occurrence and information retrieval (IR) methods relate to a model's ability to learn and report facts. The overall aim is to give researchers a standardized, flexible method for testing fine-grained hypotheses about the relationship between pretraining data and specific model behaviors, moving beyond purely observational studies.
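
To make the three stages concrete, here is a minimal Python sketch of the select, match, and modify steps on a toy corpus. It is an illustration under stated assumptions, not the authors' implementation: the `Fact` type, the function names, the simple substring-based co-occurrence matcher, and the "drop every matched document" intervention are all hypothetical choices made for this example, and the retraining and evaluation step is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """A target evaluation item: a (subject, answer) pair. Hypothetical type."""
    subject: str
    answer: str


def cooccurs(doc: str, fact: Fact) -> bool:
    """True if both the subject and the answer appear in the document (case-insensitive)."""
    text = doc.lower()
    return fact.subject.lower() in text and fact.answer.lower() in text


def match_documents(corpus: list[str], facts: list[Fact]) -> dict[Fact, list[int]]:
    """Stage 2: map each fact to the indices of documents where its terms co-occur."""
    return {f: [i for i, d in enumerate(corpus) if cooccurs(d, f)] for f in facts}


def ablate(corpus: list[str], matches: dict[Fact, list[int]]) -> list[str]:
    """Stage 3 (one possible intervention): drop every matched document."""
    drop = {i for idxs in matches.values() for i in idxs}
    return [d for i, d in enumerate(corpus) if i not in drop]


# Toy stand-in for a pretraining corpus (assumed data, for illustration only).
corpus = [
    "Paris is the capital of France.",
    "France is known for its cheese.",
    "Canberra is the capital of Australia.",
]

facts = [Fact("France", "Paris")]          # Stage 1: select target evaluation items
matches = match_documents(corpus, facts)   # Stage 2: match pretraining documents
modified_corpus = ablate(corpus, matches)  # Stage 3: modify the data
# A model would then be retrained on modified_corpus and re-evaluated on facts.
```

In a real study the substring matcher could be swapped for an IR-based retriever, and the intervention could edit documents rather than delete them; comparing the retrained model against a baseline trained on the unmodified corpus is what turns the analysis from observational into interventional.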
