Scaling In-Context Reinforcement Learning, ChartMimic's AI Benchmark, Multimodal Document Comprehension, Long Context Reasoning Challenges
Jun 21, 2024 • 11 min • Ep. 53
Episode description
XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning
Make It Count: Text-to-Image Generation with an Accurate Number of Objects
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation
Needle In A Multimodal Haystack
BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack
