David Garnitz on VectorFlow - Weaviate Podcast #66!

Sep 07, 2023 | 1 hr 5 min

Episode description

Hey everyone! Thank you so much for watching the 66th Weaviate Podcast with David Garnitz, the creator of VectorFlow! VectorFlow (open-sourced on GH and linked below) is a new tool for ingesting data into Vector Databases such as Weaviate!

There is quite an interesting End-to-End stack emerging at the ingestion layer, from retrieving data from misc. sources such as Slack, Salesforce, GitHub, Google Drive, Notion, ... to then Chunking the Text (maybe with the use of Visual Document Layout parsers like what Unstructured is imagining), extracting Metadata potentially (say the "age" of an NBA player as in the Evaporate-Code+ research) -- then sending this data off to embedding model inference and unpacking that can of worms from inference acceleration to load balancing, and finally -- importing the vectors themselves to Weaviate!

I learned so much from this conversation, I really hope you enjoy listening and please check out VectorFlow below!

VectorFlow: https://github.com/dgarnitz/vectorflow

Chapters
0:00 VectorFlow on GitHub!
0:52 Welcome David Garnitz!
1:17 Vector Flow, Founding Vision
2:00 Billions of Vectors in Weaviate!
4:20 End-to-end data importing
6:30 Metadata Extraction in Vector Database Flows
10:15 Vectorizing 100s of millions of billions of chunks
15:58 Fine-Tuning Embedding Models
23:50 Zero-Shot Models in Metadata and Chunking
36:36 Vector + SQL
42:45 Self-Driving Databases
49:23 Generative Feedback Loop REST API
51:38 GPT Cache
55:55 Building VectorFlow