(Voiceover) Building on evaluation quicksand

Oct 16, 2024 (17 min)

Episode description

Read the full post here: https://www.interconnects.ai/p/building-on-evaluation-quicksand

Chapters

00:00 Building on evaluation quicksand

01:26 The causes of closed evaluation silos

06:35 The challenge facing open evaluation tools

10:47 Frontiers in evaluation

11:32 New types of synthetic data contamination

13:57 Building harder evaluations

Figures

Fig 1: https://huggingface.co/datasets/natolambert/interconnects-figures/resolve/main/manual/openai-predictions.webp



Get full access to Interconnects at www.interconnects.ai/subscribe