Today, we're joined by Amir Bar, a PhD candidate at Tel Aviv University and UC Berkeley to discuss his research on visual-based learning, including his recent paper, “EgoPet: Egomotion and Interaction Data from an Animal’s Perspective.” Amir shares his research projects focused on self-supervised object detection and analogy reasoning for general computer vision tasks. We also discuss the current limitations of caption-based datasets in model training, the ‘learning problem’ in robotics, and the...
Jul 09, 2024•43 min•Ep 692•Transcript available on Metacast Today, we're joined by Sarah Bird, chief product officer of responsible AI at Microsoft. We discuss the testing and evaluation techniques Microsoft applies to ensure safe deployment and use of generative AI, large language models, and image generation. In our conversation, we explore the unique risks and challenges presented by generative AI, the balance between fairness and security concerns, the application of adaptive and layered defense strategies for rapid response to unforeseen AI behavior...
Jul 01, 2024•57 min•Ep 691•Transcript available on Metacast Today, we're joined by Eric Nguyen, PhD student at Stanford University. In our conversation, we explore his research on long context foundation models and their application to biology particularly Hyena, and its evolution into Hyena DNA and Evo models. We discuss Hyena, a convolutional-based language model developed to tackle the challenges posed by long context lengths in language modeling. We dig into the limitations of transformers in dealing with longer sequences, the motivation for using co...
Jun 25, 2024•46 min•Ep 690•Transcript available on Metacast Today, we're joined by Andres Ravinet, sustainability global black belt at Microsoft, to discuss the role of AI in sustainability. We explore real-world use cases where AI-driven solutions are leveraged to help tackle environmental and societal challenges, from early warning systems for extreme weather events to reducing food waste along the supply chain to conserving the Amazon rainforest. We cover the major threats that sustainability aims to address, the complexities in standardized sustainab...
Jun 18, 2024•48 min•Ep 689•Transcript available on Metacast Today we’re joined by Fatih Porikli, senior director of technology at Qualcomm AI Research. In our conversation, we covered several of the Qualcomm team’s 16 accepted main track and workshop papers at this year’s CVPR conference. The papers span a variety of generative AI and traditional computer vision topics, with an emphasis on increased training and inference efficiency for mobile and edge deployment. We explore efficient diffusion models for text-to-image generation, grounded reasoning in v...
Jun 10, 2024•1 hr 11 min•Ep 688•Transcript available on Metacast Today, we're joined by Sasha Luccioni, AI and Climate lead at Hugging Face, to discuss the environmental impact of AI models. We dig into her recent research into the relative energy consumption of general purpose pre-trained models vs. task-specific, non-generative models for common AI tasks. We discuss the implications of the significant difference in efficiency and power consumption between the two types of models. Finally, we explore the complexities of energy efficiency and performance benc...
Jun 03, 2024•48 min•Ep 687•Transcript available on Metacast Today, we're joined by Christopher Manning, the Thomas M. Siebel professor in Machine Learning at Stanford University and a recent recipient of the 2024 IEEE John von Neumann medal. In our conversation with Chris, we discuss his contributions to foundational research areas in NLP, including word embeddings and attention. We explore his perspectives on the intersection of linguistics and large language models, their ability to learn human language structures, and their potential to teach us about...
May 27, 2024•56 min•Ep 686•Transcript available on Metacast Today we're joined by Abdul Fatir Ansari, a machine learning scientist at AWS AI Labs in Berlin, to discuss his paper, "Chronos: Learning the Language of Time Series." Fatir explains the challenges of leveraging pre-trained language models for time series forecasting. We explore the advantages of Chronos over statistical models, as well as its promising results in zero-shot forecasting benchmarks. Finally, we address critiques of Chronos, the ongoing research to improve synthetic data quality, a...
May 20, 2024•43 min•Ep 685•Transcript available on Metacast Today we're joined by Joel Hestness, principal research scientist and lead of the core machine learning team at Cerebras. We discuss Cerebras’ custom silicon for machine learning, Wafer Scale Engine 3, and how the latest version of the company’s single-chip platform for ML has evolved to support large language models. Joel shares how WSE3 differs from other AI hardware solutions, such as GPUs, TPUs, and AWS’ Inferentia, and talks through the homogenous design of the WSE chip and its memory archi...
May 13, 2024•55 min•Ep 684•Transcript available on Metacast Today we're joined by Laurent Boinot, power and utilities lead for the Americas at Microsoft, to discuss the intersection of AI and energy infrastructure. We discuss the many challenges faced by current power systems in North America and the role AI is beginning to play in driving efficiencies in areas like demand forecasting and grid optimization. Laurent shares a variety of examples along the way, including some of the ways utility companies are using AI to ensure secure systems, interact with...
May 07, 2024•50 min•Ep 683•Transcript available on Metacast Today we're joined by Azarakhsh (Aza) Jalalvand, a research scholar at Princeton University, to discuss his work using deep reinforcement learning to control plasma instabilities in nuclear fusion reactors. Aza explains his team developed a model to detect and avoid a fatal plasma instability called ‘tearing mode’. Aza walks us through the process of collecting and pre-processing the complex diagnostic data from fusion experiments, training the models, and deploying the controller algorithm on t...
Apr 29, 2024•42 min•Ep 682•Transcript available on Metacast Today we're joined by Kirk Marple, CEO and founder of Graphlit, to explore the emerging paradigm of "GraphRAG," or Graph Retrieval Augmented Generation. In our conversation, Kirk digs into the GraphRAG architecture and how Graphlit uses it to offer a multi-stage workflow for ingesting, processing, retrieving, and generating content using LLMs (like GPT-4) and other Generative AI tech. He shares how the system performs entity extraction to build a knowledge graph and how graph, vector, and object...
Apr 22, 2024•47 min•Ep 681•Transcript available on Metacast Today we're joined by Alex Havrilla, a PhD student at Georgia Tech, to discuss "Teaching Large Language Models to Reason with Reinforcement Learning." Alex discusses the role of creativity and exploration in problem solving and explores the opportunities presented by applying reinforcement learning algorithms to the challenge of improving reasoning in large language models. Alex also shares his research on the effect of noise on language model training, highlighting the robustness of LLM archite...
Apr 16, 2024•46 min•Ep 680•Transcript available on Metacast Today we're joined by Peter Hase, a fifth-year PhD student at the University of North Carolina NLP lab. We discuss "scalable oversight", and the importance of developing a deeper understanding of how large neural networks make decisions. We learn how matrices are probed by interpretability researchers, and explore the two schools of thought regarding how LLMs store knowledge. Finally, we discuss the importance of deleting sensitive information from model weights, and how "easy-to-hard generaliza...
Apr 08, 2024•50 min•Ep 679•Transcript available on Metacast Today we're joined by Jonas Geiping, a research group leader at the ELLIS Institute, to explore his paper: "Coercing LLMs to Do and Reveal (Almost) Anything". Jonas explains how neural networks can be exploited, highlighting the risk of deploying LLM agents that interact with the real world. We discuss the role of open models in enabling security research, the challenges of optimizing over certain constraints, and the ongoing difficulties in achieving robustness in neural networks. Finally, we d...
Apr 01, 2024•48 min•Ep 678•Transcript available on Metacast Today we’re joined by Mido Assran, a research scientist at Meta’s Fundamental AI Research (FAIR). In this conversation, we discuss V-JEPA, a new model being billed as “the next step in Yann LeCun's vision” for true artificial reasoning. V-JEPA, the video version of Meta’s Joint Embedding Predictive Architecture, aims to bridge the gap between human and machine intelligence by training models to learn abstract concepts in a more efficient predictive manner than generative models. V-JEPA uses a no...
Mar 25, 2024•48 min•Ep 677•Transcript available on Metacast Today we’re joined by Sherry Yang, senior research scientist at Google DeepMind and a PhD student at UC Berkeley. In this interview, we discuss her new paper, "Video as the New Language for Real-World Decision Making,” which explores how generative video models can play a role similar to language models as a way to solve tasks in the real world. Sherry draws the analogy between natural language as a unified representation of information and text prediction as a common task interface and demonstr...
Mar 18, 2024•50 min•Ep 676•Transcript available on Metacast Today we’re joined by Sayash Kapoor, a Ph.D. student in the Department of Computer Science at Princeton University. Sayash walks us through his paper: "On the Societal Impact of Open Foundation Models.” We dig into the controversy around AI safety, the risks and benefits of releasing open model weights, and how we can establish common ground for assessing the threats posed by AI. We discuss the application of the framework presented in the paper to specific risks, such as the biosecurity risk of...
Mar 11, 2024•40 min•Ep 675•Transcript available on Metacast Today we’re joined by Akshita Bhagia, a senior research engineer at the Allen Institute for AI. Akshita joins us to discuss OLMo, a new open source language model with 7 billion and 1 billion variants, but with a key difference compared to similar models offered by Meta, Mistral, and others. Namely, the fact that AI2 has also published the dataset and key tools used to train the model. In our chat with Akshita, we dig into the OLMo models and the various projects falling under the OLMo umbrella,...
Mar 04, 2024•32 min•Ep 674•Transcript available on Metacast Today we’re joined by Ben Prystawski, a PhD student in the Department of Psychology at Stanford University working at the intersection of cognitive science and machine learning. Our conversation centers on Ben’s recent paper, “Why think step by step? Reasoning emerges from the locality of experience,” which he recently presented at NeurIPS 2023. In this conversation, we start out exploring basic questions about LLM reasoning, including whether it exists, how we can define it, and how techniques ...
Feb 26, 2024•25 min•Ep 673•Transcript available on Metacast Today we're joined by Armineh Nourbakhsh of JP Morgan AI Research to discuss the development and capabilities of DocLLM, a layout-aware large language model for multimodal document understanding. Armineh provides a historical overview of the challenges of document AI and an introduction to the DocLLM model. Armineh explains how this model, distinct from both traditional LLMs and document AI models, incorporates both textual semantics and spatial layout in processing enterprise documents like rep...
Feb 19, 2024•46 min•Ep 672•Transcript available on Metacast Today we’re joined by Sanmi Koyejo, assistant professor at Stanford University, to continue our NeurIPS 2024 series. In our conversation, Sanmi discusses his two recent award-winning papers. First, we dive into his paper, “Are Emergent Abilities of Large Language Models a Mirage?”. We discuss the different ways LLMs are evaluated and the excitement surrounding their“emergent abilities” such as the ability to perform arithmetic Sanmi describes how evaluating model performance using nonlinear metr...
Feb 12, 2024•1 hr 6 min•Ep 671•Transcript available on Metacast Today we’re joined by Kamyar Azizzadenesheli, a staff researcher at Nvidia, to continue our AI Trends 2024 series. In our conversation, Kamyar updates us on the latest developments in reinforcement learning (RL), and how the RL community is taking advantage of the abstract reasoning abilities of large language models (LLMs). Kamyar shares his insights on how LLMs are pushing RL performance forward in a variety of applications, such as ALOHA, a robot that can learn to fold clothes, and Voyager, a...
Feb 05, 2024•1 hr 10 min•Ep 670•Transcript available on Metacast Today we’re joined by Ram Sriharsha, VP of engineering at Pinecone. In our conversation, we dive into the topic of vector databases and retrieval augmented generation (RAG). We explore the trade-offs between relying solely on LLMs for retrieval tasks versus combining retrieval in vector databases and LLMs, the advantages and complexities of RAG with vector databases, the key considerations for building and deploying real-world RAG-based applications, and an in-depth look at Pinecone's new server...
Jan 29, 2024•35 min•Ep 669•Transcript available on Metacast Today we’re joined by Ben Zhao, a Neubauer professor of computer science at the University of Chicago. In our conversation, we explore his research at the intersection of security and generative AI. We focus on Ben’s recent Fawkes, Glaze, and Nightshade projects, which use “poisoning” approaches to provide users with security and protection against AI encroachments. The first tool we discuss, Fawkes, imperceptibly “cloaks” images in such a way that models perceive them as highly distorted, effec...
Jan 22, 2024•40 min•Ep 668•Transcript available on Metacast Today, we continue our NeurIPS series with Dan Friedman, a PhD student in the Princeton NLP group. In our conversation, we explore his research on mechanistic interpretability for transformer models, specifically his paper, Learning Transformer Programs. The LTP paper proposes modifications to the transformer architecture which allow transformer models to be easily converted into human-readable programs, making them inherently interpretable. In our conversation, we compare the approach proposed ...
Jan 15, 2024•39 min•Ep 667•Transcript available on Metacast Today we continue our AI Trends 2024 series with a conversation with Thomas Dietterich, distinguished professor emeritus at Oregon State University. As you might expect, Large Language Models figured prominently in our conversation, and we covered a vast array of papers and use cases exploring current research into topics such as monolithic vs. modular architectures, hallucinations, the application of uncertainty quantification (UQ), and using RAG as a sort of memory module for LLMs. Lastly, don...
Jan 08, 2024•1 hr 5 min•Ep 666•Transcript available on Metacast Today we kick off our AI Trends 2024 series with a conversation with Naila Murray, director of AI research at Meta. In our conversation with Naila, we dig into the latest trends and developments in the realm of computer vision. We explore advancements in the areas of controllable generation, visual programming, 3D Gaussian splatting, and multimodal models, specifically vision plus LLMs. We discuss tools and open source projects, including Segment Anything–a tool for versatile zero-shot image seg...
Jan 02, 2024•52 min•Ep 665•Transcript available on Metacast Today we’re joined by Ed Anuff, chief product officer at DataStax. In our conversation, we discuss Ed’s insights on RAG, vector databases, embedding models, and more. We dig into the underpinnings of modern vector databases (like HNSW and DiskANN) that allow them to efficiently handle massive and unstructured data sets, and discuss how they help users serve up relevant results for RAG, AI assistants, and other use cases. We also discuss embedding models and their role in vector comparisons and d...
Dec 28, 2023•48 min•Ep 664•Transcript available on Metacast Today we’re joined by Markus Nagel, research scientist at Qualcomm AI Research, who helps us kick off our coverage of NeurIPS 2023. In our conversation with Markus, we cover his accepted papers at the conference, along with other work presented by Qualcomm AI Research scientists. Markus’ first paper, Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing, focuses on tackling activation quantization issues introduced by the attention mechanism and how to solve them. We ...
Dec 26, 2023•47 min•Ep 663•Transcript available on Metacast