Rewriting History: A Recipe for Interventional Analyses to Study Data Effects on Model Behavior

Best AI papers explained

Oct 22, 2025 • 19 min

Episode description

This paper introduces an experimental recipe for interventional analyses that study how training data affects the behavior of language models (LMs). The methodology, termed "Rewriting History," has three stages: selecting target evaluation items, matching relevant pretraining documents to those items, and modifying those documents before retraining the model to measure the effect. The authors demonstrate the approach through case studies on factual knowledge acquisition, examining how term co-occurrence and information retrieval (IR) methods relate to a model's ability to learn and report facts. The aim is a standardized, flexible method for testing fine-grained hypotheses about the relationship between pretraining data and specific model behaviors, moving beyond purely observational studies.
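
The select, match, and modify stages can be illustrated with a toy sketch. This is not the paper's implementation: the `Fact` type, the substring-based co-occurrence rule, the choice of removal as the intervention, and the three-sentence corpus are all assumptions made for illustration, and the retraining step is omitted entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    """Hypothetical evaluation item: a subject and the answer a model should report."""
    subject: str
    answer: str


def cooccurs(doc: str, fact: Fact) -> bool:
    """Assumed match rule: subject and answer both appear in the document text."""
    text = doc.lower()
    return fact.subject.lower() in text and fact.answer.lower() in text


def match_documents(corpus: list[str], facts: list[Fact]) -> dict[Fact, list[int]]:
    """Stage 2: map each target fact to the indices of documents that match it."""
    return {f: [i for i, d in enumerate(corpus) if cooccurs(d, f)] for f in facts}


def remove_matched(corpus: list[str], matches: dict[Fact, list[int]]) -> list[str]:
    """Stage 3 (one possible intervention, assumed here): drop every matched document."""
    drop = {i for idxs in matches.values() for i in idxs}
    return [d for i, d in enumerate(corpus) if i not in drop]


# Toy corpus standing in for pretraining data (illustrative only).
corpus = [
    "Paris is the capital of France.",
    "France is known for its cheese.",
    "The capital of Japan is Tokyo.",
]

targets = [Fact("France", "Paris")]          # Stage 1: select target evaluation items
matches = match_documents(corpus, targets)   # Stage 2: match documents to items
edited_corpus = remove_matched(corpus, matches)  # Stage 3: modify the data

# Retraining on edited_corpus and re-evaluating the targets would follow;
# that step is outside the scope of this sketch.
```

In a real study the matching rule could be swapped for an IR-based retriever, and the intervention could rewrite matched documents instead of deleting them; the structure of the three functions would stay the same.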
