Hailey Joren is a Ph.D. student at UCSD! Hailey and collaborators at Duke University and Google have recently published Sufficient Context: A New Lens on Retrieval Augmented Generation Systems in ICLR 2025! There are so many interesting nuggets to this work! Firstly, it really helped me understand the difference between *relevant* search results and sufficient context for answering the question. Armed with this lens of looking at retrieved context, Hailey and collaborators make all sorts of inte...
Jul 02, 2025•51 min
Nandan Thakur is a Ph.D. student at the University of Waterloo! Nandan has contributed to many of the most impactful works in Retrieval-Augmented Generation (RAG) and Information Retrieval. His work ranges from benchmarks such as BEIR, MIRACL, TREC, and FreshStack, to improving the training of embedding models and re-rankers, and more!
Jun 25, 2025•1 hr 5 min
Multi-vector retrieval offers richer, more nuanced search, but often comes with a significant cost in storage and computational overhead. How can we harness the power of multi-vector representations without breaking the bank? Rajesh Jayaram, the first author of the groundbreaking MUVERA algorithm from Google, and Roberto Esposito from Weaviate, who spearheaded its implementation, reveal how MUVERA tackles this critical challenge. Dive deep into MUVERA, a novel compression technique specifically ...
May 28, 2025•1 hr 13 min
AI agents are getting more complex and harder to debug. How do you know what's happening when your agent makes 20+ function calls? What if you have a Multi-Agent System orchestrating several Agents? Anand Kannappan, co-founder of Patronus AI, reveals how their groundbreaking tool Percival transforms agent debugging and evaluation. Percival can instantly analyze complex agent traces, pinpoint failures across 60 different modes, and automatically suggest prompt fixes to improve performance...
May 15, 2025•1 hr 1 min
How do you ensure your AI systems actually do what you expect them to do? Leonard Tang takes us deep into the revolutionary world of AI evaluation with concrete techniques you can apply today. Learn how Haize Labs is transforming AI testing through "scaling judge-time compute" - stacking weaker models to effectively evaluate stronger ones. Leonard unpacks the game-changing Verdict library that outperforms frontier models by 10-20% while dramatically reducing costs. Discover practical insights on...
May 12, 2025•54 min
Ben walks us through Box's three-layer infrastructure puzzle: First, the mind-boggling base infrastructure (think millions of interactions per second and trillions of files). Second, their unique multi-tenant security challenge - unlike most SaaS platforms, Box users share content across company boundaries, making traditional tenant isolation impossible. And third, ensuring AI respects all these complex permissions while still delivering value. The podcast then dives further into how vector embe...
May 07, 2025•56 min
Hey everyone! Thanks so much for watching another episode of the Weaviate Podcast! Dive into the fascinating world of structured outputs with Will Kurt and Cameron Pfeiffer, the brilliant minds behind Outlines, the revolutionary open-source library from .txt.ai that's changing how we interact with LLMs. In this episode, we explore how constrained decoding enables predictable, reliable outputs from language models—unlocking everything from perfect JSON generation to guided reasoning processes. Wil...
Apr 09, 2025•1 hr 10 min
Synthetic Data: The Building Blocks of AI's Future! Hey everyone! I am SUPER EXCITED to publish the 118th episode of the Weaviate Podcast featuring David Berenstein and Ben Burtenshaw from HuggingFace! This podcast explores the intricacies of synthetic data generation, detailing methodologies such as data augmentation, distillation, and instruction refinement. The conversation delves into persona-driven synthetic data, highlighting applications like Persona Hub, and discusses algorithms to enhanc...
Mar 25, 2025•1 hr 2 min
Hey everyone! Thank you so much for watching the 117th episode of the Weaviate podcast! In this episode, we dive deep into the cutting edge of AI agent development with Sarah Wooders, co-founder and CTO of Letta AI. Emerging from Berkeley's Sky Computing Lab, Sarah and her team have pioneered a revolutionary approach to stateful agents - AI systems that genuinely remember both you and themselves across extended conversations. The conversation explores how the groundbreaking MemGPT project evolve...
Mar 03, 2025•58 min
Hey everyone! Thank you so much for watching another episode of the Weaviate Podcast! I am SUPER excited to welcome Matt Biilmann, Co-Founder and CEO of Netlify, as well as Sebastian Witalec and Charles Pierse from Weaviate to discuss Agent Experience! You have probably heard about how you can connect LLMs to external software tools. This supercharges the capabilities of AI systems and what they can do. So what does that mean for you as a software developer? This podcast explores different ideas ...
Feb 27, 2025•52 min
Hey everyone! Thank you so much for watching the 115th episode of the Weaviate Podcast featuring Shirley Wu from Stanford University! We explore the innovative Avatar Optimizer—a novel framework that leverages contrastive reasoning to refine LLM agent prompts for optimal tool usage. Shirley explains how this self-improving system evolves through iterative feedback by contrasting positive and negative examples, enabling agents to handle complex tasks more effectively. We also dive into the STaRK ...
Feb 19, 2025•1 hr
Hey everyone! Thank you so much for watching the 114th episode of the Weaviate Podcast featuring Amanpreet Singh, Co-Founder and CTO of Contextual AI! Contextual AI is at the forefront of production-grade RAG agents! I learned so much from this conversation! We began by discussing the vision of RAG 2.0, jointly optimizing generative and retrieval models! This then led us to discuss Agentic RAG and how the RAG 2.0 roadmap is evolving with emerging perspectives on tool use. Amanpreet continues to...
Feb 12, 2025•58 min
Hey everyone! Thank you so much for watching the 113th episode of the Weaviate Podcast with Karan Goel from Cartesia AI! Cartesia AI is leading the AI world in text-to-speech models! As exciting as these new applications in speech generation are, Cartesia is also building around an incredibly exciting new neural network architecture that cuts across all of AI -- State Space Models. State Space Models (SSMs) present a new approach to modeling long sequences circumventing the quadratic attention b...
Jan 28, 2025•54 min
Hey everyone! Thank you so much for watching the 112th episode of the Weaviate Podcast! This is another super exciting one, diving into the release of the Vertex AI RAG Engine, its integration with Weaviate and thoughts on the future of connecting AI systems with knowledge sources! The podcast begins by reflecting on Bob's experience speaking at Google in 2016 on Knowledge Graphs! This transitions into discussing the evolution of knowledge representation perspectives and things like the semantic...
Jan 15, 2025•58 min
Hey everyone! I am SUPER EXCITED to publish the 111th Weaviate Podcast with Aravind Kesiraju from Morningstar! Aravind is a Principal Software Engineer who has led the development behind the Morningstar Intelligence Engine! There are so many interesting aspects to this, and if you are building Agentic systems that would benefit from a high-quality financial retrieval API, you should check this out right now! The podcast dives into all sorts of ingredients that went into building this system: fr...
Jan 08, 2025•53 min
Hey everyone! Thank you so much for watching the 110th episode of the Weaviate Podcast! Today we are diving into Snowflake’s Arctic Embedding model series and their newly released Arctic Embed 2.0 open-source model, additionally supporting multilingual text embeddings. The podcast covers the origin of Arctic Embed, Pre-training embedding models, Matryoshka Representation Learning (MRL), Fine-tuning embedding models, Synthetic Query Generation, Hard Negative Mining, and Single-Vector Embeddings M...
Dec 18, 2024•1 hr 34 min
Hey everyone! Thank you so much for watching the 109th episode of the Weaviate Podcast with Erika Cardenas! Erika, in collaboration with Leonie Monigatti, has recently published "What is Agentic RAG". This blog post was even covered in VentureBeat with additional quotes from Weaviate Co-Founder and CEO Bob van Luijt! This podcast continues the discussion on all things Agentic RAG, covering the basics of Agents, how Agentic RAG changes the game compared to Vanilla RAG systems, Multi-Agent S...
Nov 13, 2024•34 min
JSON mode has been one of the biggest enablers for working with Large Language Models! JSON mode is even expanding into Multimodal Foundation models! But how exactly is JSON mode achieved? There are generally 3 paths to JSON mode: (1) constrained generation (such as Outlines), (2) begging the model for a JSON response in the prompt, and (3) A two stage process of generate-then-format. I am BEYOND EXCITED to publish the 108th Weaviate Podcast with Zhi Rui Tam, the lead author of Let Me Speak Free...
Nov 07, 2024•40 min
Hey everyone! Thank you so much for watching the 107th episode of the Weaviate Podcast! This one dives into SWE-bench, SWE-agent, and most recently SWE-bench Multimodal with John Yang from Stanford University and Carlos E. Jimenez from Princeton University! One of the most impactful applications of AI we have seen so far is in programming and software engineering! John, Carlos, and team are at the cutting-edge of developing and benchmarking these systems! I learned so much from the conversation ...
Oct 30, 2024•58 min
Hey everyone! I am SUPER excited to publish the 106th episode of the Weaviate Podcast featuring Rose E. Wang!! Rose is a Ph.D. student at Stanford University where she has led incredible research at the cutting-edge of AI applications in Education. The podcast heavily discusses her recent work on Tutor CoPilot! Tutor CoPilot is one of the world's largest randomized controlled trials on the impact AI is having on education, testing 900 students and 1800 tutors in grades K-12. I think this is such a...
Oct 22, 2024•51 min
Hey everyone! Thank you so much for tuning into the 105th episode of the Weaviate Podcast! This one features Philip Kiely diving into all sorts of aspects related to Compound AI Systems! We are now seeing far better results with AI models by breaking up tasks into multiple stages and inferences. Philip explains the work they are doing at Baseten to optimize and scale deployments of these emerging systems and all sorts of aspects about them from Structured Generation to their distinction with Age...
Oct 17, 2024•57 min
AI Researchers have overfit to maximizing state-of-the-art accuracy at the expense of the cost to run these AI systems! We need to account for cost during optimization. Even if a chatbot can produce an amazing answer, it isn't that valuable if it costs, say $5 per response! I am beyond excited to present the 104th Weaviate Podcast with Sayash Kapoor and Benedikt Stroebl from Princeton Language and Intelligence! Sayash and Benedikt are co-first authors of "AI Agents That Matter"! This is one of m...
Sep 18, 2024•1 hr 1 min
I am beyond excited to publish our interview with Krista Opsahl-Ong from Stanford University! Krista is the lead author of MIPRO, short for Multi-prompt Instruction Proposal Optimizer, and one of the leading developers and scientists behind DSPy! This was such a fun discussion beginning with the motivation of Automated Prompt Engineering, Multi-Layer Language Programs (also commonly referred to as Compound AI Systems), and their intersection. We then dove into the details of how MIPRO achieves t...
Aug 28, 2024•1 hr 1 min
AI is completely transforming how we build software! But how exactly? What does it mean for a software application to be AI-Native versus AI-Enabled? How many other aspects of software development and creativity are impacted by AI? I am super excited to publish our 102nd Weaviate Podcast with Guy Podjarny and Bob van Luijt on AI-Native Development! Guy Podjarny is a co-founder of Snyk, a remarkably successful Cybersecurity company. He is now back on the founder journey, diving into AI-Native Dev...
Aug 14, 2024•53 min
Hey everyone! Thank you so much for watching the 101st episode of the Weaviate Podcast with Devin Petersohn! Devin is the creator of Modin, one of the world's most advanced systems for scaling Pandas! Devin then went on to co-found Ponder, which was acquired by Snowflake in early 2023. This was one of my favorite podcasts of all time. I learned so much about the internals of Data Systems and I hope you do as well!
Jul 17, 2024•48 min
What is an AI-native application? This has been one of the questions we are most interested in answering at Weaviate! This podcast explores this question with Weaviate Co-founder Bob van Luijt and Lucas Negritto. Formerly at OpenAI, Lucas is now building Odapt, a remarkable example of such an application where we no longer use front-end code and instead render the UI entirely within the generative model!! There are many interesting topics covered, such as, of course, firstly how this works and how ...
Jul 04, 2024•44 min
Liana Patel is a Ph.D. student at Stanford University who is the lead author of ACORN, a breakthrough in Approximate Nearest Neighbor Search with Filters! Also joining the podcast is Abdel Rodriguez, a Vector Index Researcher and Engineer at Weaviate. This podcast dives into all sorts of details behind ACORN. Starting with how Liana developed her interest in Approximate Nearest Neighbor Search algorithms and then transitioning into how ACORN differs from previous approaches, the Two-Hop Neighbor...
Jun 25, 2024•54 min
Josh Engels is a Ph.D. student at MIT who has published several works advancing the state of the art in Vector Search. Josh has recently developed the Window Search Tree, a new algorithm particularly targeted for improving Filtered Vector Search. Even more particularly than that, the WST algorithm targets Filtered Search with continuous-valued filters such as "price" or "date", also known as range filters. This is a huge application for Vector Databases and it was incredible getting to pick Josh...
Jun 19, 2024•59 min
Hey everyone! I am SUPER excited to publish our 97th Weaviate Podcast on the state of AI-powered Search technology featuring Nils Reimers and Erika Cardenas! Erika and I have been super excited about Cohere's latest works to advance RAG and Search and it was amazing getting to pick Nils' brain about all these topics! We began with the development of Compass! Nils explains the current problem with embeddings as a soup!! For example, imagine embedding this video description, the first part is abou...
Jun 11, 2024•1 hr
Hey everyone! Thank you so much for watching the 96th episode of the Weaviate podcast featuring Letitia Parcalabescu! While completing her Ph.D. studies at the University of Heidelberg, Letitia started her YouTube channel: AI Coffee Break with Letitia! Her videos break down complex concepts in AI with a creative mix of technical expertise and visualizations unlike anyone else in the space! We began the podcast by discussing our shared background in creating content on YouTube from starting, to pl...
Jun 05, 2024•1 hr 35 min