128 - Dynamic Benchmarking, with Douwe Kiela

Jun 19, 2021 · 47 min

Episode description

We discussed adversarial dataset construction and dynamic benchmarking in this episode with Douwe Kiela, a research scientist at Facebook AI Research who has been working on a dynamic benchmarking platform called Dynabench. Dynamic benchmarking tries to address the problem that many recent datasets get solved while little progress is made on the underlying tasks. The idea is to involve models in the data collection loop so that humans are encouraged to provide data points that are hard for those models, thereby continuously collecting harder datasets. We discussed the details of this approach and some potential caveats. We also discussed dynamic leaderboards, a recent addition to Dynabench that ranks systems based on their utility for specific use cases.

Papers discussed in this episode:
1. Dynabench: Rethinking Benchmarking in NLP (https://www.semanticscholar.org/paper/Dynabench%3A-Rethinking-Benchmarking-in-NLP-Kiela-Bartolo/77a096d80eb4dd4ccd103d1660c5a5498f7d026b)
2. Dynaboard: An Evaluation-As-A-Service Platform for Holistic Next-Generation Benchmarking (https://www.semanticscholar.org/paper/Dynaboard%3A-An-Evaluation-As-A-Service-Platform-for-Ma-Ethayarajh/d25bb256e5b69f769a429750217b0d9ec1cf4d86)
3. Adversarial NLI: A New Benchmark for Natural Language Understanding (https://www.semanticscholar.org/paper/Adversarial-NLI%3A-A-New-Benchmark-for-Natural-Nie-Williams/9d87300892911275520a4f7a5e5abf4f1c002fec)
4. DynaSent: A Dynamic Benchmark for Sentiment Analysis (https://www.semanticscholar.org/paper/DynaSent%3A-A-Dynamic-Benchmark-for-Sentiment-Potts-Wu/284dfcf7f25ca87b2db235c6cdc848b4143d3923)

Douwe Kiela's webpage: https://douwekiela.github.io/

The hosts for this episode are Pradeep Dasigi and Alexis Ross.