706: Large Language Model Leaderboards and Benchmarks

Aug 18, 2023 · 33 min

Episode description

In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena.

Additional materials: www.superdatascience.com/706

Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.