Hey everyone! Thank you so much for watching the 66th Weaviate Podcast with David Garnitz, the creator of VectorFlow! VectorFlow (open-sourced on GH and linked below) is a new tool for ingesting data into Vector Databases such as Weaviate! There is quite an interesting End-to-End stack emerging at the ingestion layer, from retrieving data from misc. sources such as Slack, Salesforce, GitHub, Google Drive, Notion, ... to then Chunking the Text (maybe with the use of Visual Document Layout parsers...
Sep 07, 2023•1 hr 5 min
Hey everyone! Thank you so much for watching the Weaviate Podcast! I am SUPER excited to publish my conversation with Ofir Press! Ofir has done incredible work pioneering ALiBi attention and Self-Ask prompting and I learned so much from speaking with him! As always we are more than happy to answer any questions or discuss any ideas you have about the content in the podcast! +Huge Congratulations on your Ph.D. Ofir! ALiBi Attention: https://arxiv.org/abs/2108.12409 Self-Ask Prompting: https://arx...
Aug 31, 2023•1 hr 7 min
Hey everyone! Thank you so much for watching the 64th Weaviate Podcast with Shishir Patil and Tianjun Zhang, co-authors of Gorilla: Large Language Models Connected with Massive APIs! I learned so much about Gorilla from Shishir and Tianjun, from the APIBench dataset to the continually evolving APIZoo, how the models are trained with Retrieval-Aware Training, Self-Instruct Training data and how the authors think of fine-tuning LLaMA-7B models for tasks such as this, and many more! I hope you enjo...
Aug 30, 2023•49 min
Hey everyone! Thank you so much for watching the 63rd Weaviate Podcast, I couldn't be more excited to welcome Nils Reimers back to the podcast!! Similar to our debut episode together, we began by describing the latest collaboration of Weaviate and Cohere (episode 1, new multilingual embedding models; episode 2, rerankers!), and then continued into some of the key questions around search technology. In this one, we discussed the importance of temporal queries and metadata extraction, long documen...
Aug 17, 2023•1 hr 5 min
Hey everyone! Thank you so much for watching the 62nd Weaviate Podcast with Atai Barkai! We are stepping into the meta with this one for a podcast about podcasts! Podcasts are one of the biggest opportunities of new technologies, starting with Whisper's ability to transcribe audio to text and advances with speaker diarization. The question to be explored is: what Vector Database and LLM applications can we build with this data?! What is the future of podcasting with these new technologies?! I...
Aug 09, 2023•56 min
Hey everyone! Thank you so much for watching the 61st episode of the Weaviate Podcast! I am beyond excited to publish this one! I first met Rohit at the Cal Hacks event hosted by UC Berkeley where we had a debate about the impact of Semantic Caching! Rohit taught me a ton about the topic and I think it's going to be one of the most impactful early applications of Generative Feedback Loops! Rohit is building Portkey, a SUPER interesting LLM middleware that does things like load balancing between ...
Aug 03, 2023•49 min
Hey everyone! Thank you so much for watching the 60th Weaviate podcast with Patrice Bourgougnon! Patrice is the creator of WPSolr, integrating AI search capabilities with WordPress and WooCommerce. Patrice is one of the most active contributors to Weaviate, filing issues and poking holes in new releases! Patrice shared incredible feedback on Weaviate and how he sees the state of Vector Databases and Search! As always, we are more than happy to answer any questions or ideas you have about the con...
Aug 02, 2023•1 hr 26 min
Hey everyone! Thank you so much for watching the 58th episode of the Weaviate Podcast! I am SUPER excited to welcome Andriy Mulyar! Andriy is the Co-Founder of Nomic AI, a company fresh off a $17M series A raise! Nomic has created some incredible products such as Atlas and GPT4All! I was really impressed by Andriy's vision of the state and forecasted evolution of these topics! I hope you enjoy the podcast! As always, we are more than happy to answer any questions or discuss any ideas you have ab...
Jul 18, 2023•58 min
Hey everyone! Thank you so much for watching the 57th Weaviate podcast with Charles Frye! Charles is an educator at Full Stack Deep Learning, one of the world's top courses on Deep Learning with lectures available on YouTube (link below)! This was one of the most thorough Weaviate podcasts published so far, covering all sorts of topics around the evolution of Deep Learning! Particularly we discussed the Retrieval-Augmented Generation stack with Vector Databases and Zero-Shot Large Language Model...
Jul 13, 2023•1 hr 39 min
Chapters 0:00 Weaviate 1.20!!! 0:40 Multi-Tenancy 35:36 PQ Rescoring 47:20 Re-Ranking, AutoCut, Rank Fusion 58:58 Cloud Monitoring Metrics
Jul 12, 2023•1 hr 3 min
Hey everyone! Thank you so much for watching the 55th episode of the Weaviate Podcast with Aleksa Gordić! This episode dives into Aleksa's incredible story from Deep Learning YouTube to DeepMind and now creating Ortus! We dived into all sorts of topics, I loved hearing about the latest updates on Ortus and how Aleksa sees the current state of AI development! We are more than happy to answer any questions or discuss any ideas you might have about the content in the podcast! Thanks so much fo...
Jul 05, 2023•1 hr 7 min
Chapters 0:00 Introduction 0:38 Founding Story of Vody 8:15 Custom Embedding Models 12:42 Movie Genre Vectors 13:42 Classification and Contrastive Learning 15:45 Foundation Model Tuning 21:13 Multimodal Generative Models 25:08 Training Embedding Models 33:20 Tabular Data Ranking Models 36:00 RoomGPT 41:36 Diversity in Recommendations 48:25 Future Directions in Multimodal AI 51:15 Open-Source 55:45 Keeping up with Vody!
Jun 22, 2023•56 min
Hey everyone, thank you so much for watching the 52nd episode of the Weaviate Podcast with Yana Welinder! Yana is the Founder and CEO of Kraftful (https://www.kraftful.com/). Kraftful is an incredibly interesting "ChatGPT but for Product Research" -- curating specific skills for Product Managers into a collection of prompts. We discussed all sorts of things from the latest innovations in LLMs to the ChatGPT marketplace and product management, I really hope you enjoy the podcast!
Jun 14, 2023•42 min
Hey everyone, thank you so much for watching the 51st episode of the Weaviate Podcast with Greg Kamradt and Colin Harmon! Greg and Colin are both entrepreneurs in the space of new AI tools powered by LLMs! This podcast is about keeping up with the evolution of LLM Agents from AutoGPT to connecting LLMs with Vector Databases or Wolfram Alpha, as well as the ChatGPT Marketplace, Personalized LLMs, Private LLMs, and many more! I think there are so many interesting nuggets from this podcast, thank y...
Jun 07, 2023•55 min
This video explores a new paper on using summarization chains to represent long texts and using (original text, summary) pairs to optimize text embedding models! Here are 3 main takeaways I think everyone working with Weaviate may get value from: 1. Understanding of Summary Indexing and the Prompts (as well as Prompt Chains) used to build them. 2. Continued development of LLM-generated data for search -- creating (full text, summary) pairs gives you (1) data to build a summary inde...
Jun 02, 2023•28 min
Hey everyone, thank you so much for watching the 50th (!!!) Weaviate Podcast with Emil Sorensen and Finn Bauer from Kapa AI! Are you curious about taking either your, or your company's, specific information and putting it into a Vector DB + LLM system? Emil and Finn are doing this at the highest level, taking the documentation of software companies like Weaviate and building these LLM-augmented assistant systems for them. This podcast takes a complete tour from Data Ingestion to Cleaning, Chunking...
May 31, 2023•36 min
Hey everyone, thank you so much for watching the 49th episode of the Weaviate Podcast!! This podcast features Professor Laura Dietz from the University of New Hampshire! I came across Dr. Dietz's tutorial at ECIR on Neuro-Symbolic Approaches for Information Retrieval and am so grateful that she was interested in joining the Weaviate Podcast! I learned so much about Neurosymbolic Search, especially around the role of Entity Linking and Entity Re-Ranking -- as well as the topic of Knowledge Graphs...
May 25, 2023•1 hr 30 min
Hey everyone, thank you so much for watching the 48th episode of the Weaviate Podcast!! This is a SUPER exciting one, welcoming Brian Raymond, the CEO / Founder of Unstructured! Unstructured is a perfect complementary technology for Weaviate, helping people get their Unstructured data into Weaviate! The podcast dives into the nuances of this task, but it generally revolves around Unstructured's abstraction of Partitioning, Cleaning, and Staging! Unstructured is making groundbreaking innovations o...
May 23, 2023•43 min
Hey everyone, thank you so much for watching the Weaviate podcast! I am so excited about this episode! ChatArena is a software framework for multi-agent chat games. There are quite a few interesting applications of this. First, we can use this kind of system to evaluate the intelligence of an LLM based on how intelligent it sounds in conversation with another LLM! Another interesting idea is to have the LLM impersonate people such as Lex Fridman or Sam Altman and simulate conversations between ...
May 17, 2023•52 min
Hey everyone! Thank you so much for watching the Weaviate Podcast! This is a pretty novel episode featuring both Weaviate Co-Founders Bob van Luijt and Etienne Dilocker! This is also extremely novel because we are featuring a competitor vector database, HyperDB! John Dagdelen is the founder of HyperDB which is a hyper-fast local vector database for use with LLM Agents. Now accepting SAFEs at $135M cap. HyperDB: https://github.com/jdagdelen/hyperDB More seriously, John has produced an incredible bo...
May 10, 2023•1 hr 6 min
Hey everyone! Thank you so much for watching the Generative Feedback Loops Podcast! We have also created a blog post and GitHub repository for more information! Chapters 0:00 Bob the Podcast Host 1:20 Retrieval-Augmented Generation 4:10 Hallucination in LLMs 6:15 Solving Hallucination with RLHF 7:44 LLM Monster - Reasoning and Knowledge 10:12 Feedback Loops 11:00 Hands-on Code Demo 26:00 Demo Analysis from Bob and Connor 30:35 Star Wars Wes Anderson Generated Video 34:12 Multimodal Vector Databa...
May 05, 2023•55 min
Hey everyone! Thank you so much for watching the Weaviate 1.19 release podcast! We have all sorts of cool new features, in addition to the database and module features, I really want to encourage readers to see the `groupBy` search discussed at 14:32, quite an interesting idea for improving search performance! Chapters 0:00 Welcome Etienne! 0:38 gRPC API 9:50 Generative Cohere 14:32 groupBy search 19:33 Bitmap or BM25 index tuning 22:20 Additional Tokenization Options 24:05 Tunable Consistency...
May 04, 2023•27 min
Thank you so much for watching the 43rd episode of the Weaviate Podcast with Roman Grebennikov and Vsevolod Goloviznin from Metarank, as well as Erika Cardenas from Weaviate! This podcast is a masterclass on Ranking models, additionally touching on the connection between Search and Recommendation. Learning-to-rank is an exciting idea where we use models that produce more fine-grained relevance scores than the offline indexing techniques of vector search and BM25, however with the tradeoff of the...
Apr 12, 2023•1 hr 1 min
Thank you so much for watching the 42nd episode of the Weaviate Podcast! Ethan Steininger is the founder of Mixpeek, an intelligence layer that sits on top of your S3 bucket, so you can search and analyze unstructured data at scale. Ethan has also created Collie with the headline of "Enter your website and Collie will fetch every asset, then give you an embedded search bar that wows your users". Ethan began the podcast by describing his background at MongoDB and integrating the database with ful...
Apr 05, 2023•1 hr 23 min
Chapters 0:00 Welcome Dennis Xu! 0:30 Founding Vision of Mem 4:18 Personalized Embeddings 6:02 GPT-4, How will this change everything? 11:00 Writing code with LLMs 13:18 Embeddings at Mem 17:10 Structure in Vector Search 19:10 Zero-Shot vs. Fine-Tuned Models 25:05 Ranking Models and LLM Distillation
Mar 29, 2023•45 min
Chapters 0:00 Weaviate 1.18!!! 0:32 Bitmap Indexing! 11:40 HNSW PQ 25:33 Cursor API 30:03 Filters in Hybrid Search 32:55 WAND Scoring 40:35 Replication 49:10 Building a Database in Golang 1:00:55 Thank you!
Mar 07, 2023•1 hr 3 min
Check out the website here! https://openverkiezingen.nl/
Mar 06, 2023•37 min
Hey everyone! Thank you so much for watching the 38th episode of the Weaviate podcast! This episode features Leo Boytsov, an expert in Information Retrieval technology! We discussed a very wide range of topics from an overview of IR methods such as BM25, Neural Bi-Encoder and Cross-Encoder rankers, and a super exciting new work Leo has co-authored on using Large Language Models to generate training data for Neural Ranking models titled "InPars-Light: Cost-Effective Unsupervised Training of Effic...
Mar 01, 2023•1 hr 28 min
Hey everyone! Thank you so much for watching the 37th episode of the Weaviate podcast! This episode discusses some of the ideas behind GPT Index. GPT Index presents really exciting ideas about how we use LLMs to index our data and then traverse these data structures. We began the podcast by discussing the origins of the tool and the ideas behind the Tree Index. We then discussed generalizing these trees to graphs and whether we are headed to the Knowledge Graph 2.0. Another really interesting to...
Feb 22, 2023•52 min
Hey everyone! Thank you so much for watching the 36th episode of the Weaviate podcast! This episode continues on the marriage between LLMs and Semantic Search, welcoming back Weaviate CEO and Co-Founder Bob van Luijt! Enter LangChain and its creator, Harrison Chase, providing the glue between LLMs and tools, such as semantic search. LangChain provides a set of abstractions around chaining multiple language model calls with different prompts, strategies for overcoming the 4096 token limit, and co...
Feb 15, 2023•48 min