BigCodeBench Challenges, Cambrian-1 Leap, D-MERIT's Evaluation, Long Context Breakthrough in Vision
Jun 27, 2024 • 11 min • Ep. 55
Episode description
DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
Evaluating D-MERIT of Partial-annotation on Information Retrieval
Long Context Transfer from Language to Vision
