177: Vector Databases

Programming Throwdown

Nov 04, 2024•1 hr 28 min•Ep. 177

Episode description

Intro topic: Buying a Car

News/Links:

Cognitive Load is what Matters
- https://github.com/zakirullin/cognitive-load
Diffusion models are Real-Time Game Engines
- https://gamengen.github.io/
Your Company Needs Junior Devs
- https://softwaredoug.com/blog/2024/09/07/your-team-needs-juniors
Seamless Streaming / Fish Speech / LLaMA Omni
- Seamless: https://huggingface.co/facebook/seamless-streaming
- Fish: https://github.com/fishaudio/fish-speech
- LLaMA Omni: https://github.com/ictnlp/LLaMA-Omni

Book of the Show

Patrick:
- Thought Emporium YouTube
  - https://youtu.be/8X1_HEJk2Hw?si=T8EaHul-QMahyUvQ
Jason:
- Novel Minds
  - https://www.novelminds.ai/

Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h

Tool of the Show

Patrick:
- Escape Simulator
  - https://pinestudio.com/games/escape-simulator/
Jason:
- Cursor IDE
  - https://www.cursor.com/

Topic: Vector Databases (~54 min)

How computers represent data traditionally
- ASCII values
- RGB values
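
As a small illustration of the two representations above, here is a sketch in Python. The sample string and color are arbitrary choices for this example, not taken from the episode.

```python
# Text as ASCII: each character maps to one small integer (one byte).
text = "Hi"
ascii_codes = [ord(ch) for ch in text]   # [72, 105]
raw_bytes = text.encode("ascii")         # b'Hi'

# Color as RGB: three integers in 0..255, one byte per channel.
orange = (255, 165, 0)
pixel_bytes = bytes(orange)              # 3 bytes: red, green, blue
hex_color = "#{:02x}{:02x}{:02x}".format(*orange)  # '#ffa500'

print(ascii_codes, raw_bytes, len(pixel_bytes), hex_color)
```

Note that nearby code values say nothing about meaning: "H" and "I" are adjacent in ASCII but unrelated semantically, which is the gap embeddings are meant to fill.
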
How traditional compression works
- Huffman encoding (tree structure)
- Lossy example: Fourier Transform & store coefficients
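
To make the Huffman item concrete, here is a minimal sketch of building a Huffman tree with a priority queue and using it to encode and decode a string. The function names (`huffman_codes`, `encode`, `decode`) and the sample text are made up for this example; a real compressor would also pack the bits and store the code table.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each symbol in `text` to its bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, tree). A tree is either a
    # symbol (leaf) or a (left, right) tuple (internal node).
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Repeatedly merge the two lightest trees.
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (t1, t2)))
        next_id += 1

    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes


def encode(text, codes):
    return "".join(codes[ch] for ch in text)


def decode(bits, codes):
    inverse = {code: sym for sym, code in codes.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free, so the first match is correct
            out.append(inverse[current])
            current = ""
    return "".join(out)


sample = "abracadabra"
codes = huffman_codes(sample)
bits = encode(sample, codes)
print(len(bits), "bits instead of", 8 * len(sample))
```

Frequent symbols end up near the root of the tree and so get the shortest codes; here "a" receives a one-bit code.
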
How embeddings are computed
- Pairwise (contrastive) methods
- Forward models (self-supervised)
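
The notes do not say which pairwise objective is meant. As one common possibility, here is a sketch of a margin-based contrastive loss on a single pair of embeddings: similar pairs are pulled together, dissimilar pairs are pushed apart until they are at least `margin` away. The function names and the default margin of 1.0 are assumptions for this example.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(a, b, is_similar, margin=1.0):
    """Loss for one pair of embedding vectors.

    Similar pairs are penalized by squared distance; dissimilar pairs are
    penalized only while they are closer than `margin`.
    """
    d = euclidean(a, b)
    if is_similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


# A dissimilar pair sitting at the same point pays the full margin penalty,
# while one that is already far apart pays nothing.
print(contrastive_loss([0.0, 0.0], [0.0, 0.0], is_similar=False))
print(contrastive_loss([0.0, 0.0], [3.0, 4.0], is_similar=False))
```

In training, this value would be averaged over many pairs and minimized with respect to the parameters of the model that produces the embeddings; that part is omitted here.
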
Similarity metrics
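
Three metrics that come up most often with embeddings are the dot product, cosine similarity, and Euclidean distance. The sketch below uses plain Python lists; the function names are chosen for this example, and `cosine_similarity` assumes neither vector is all zeros.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    # Dot product of the two vectors after scaling each to unit length.
    # Assumes neither vector is the zero vector.
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


u, v = [1.0, 2.0], [2.0, 4.0]
# Same direction, different length: cosine treats them as identical,
# while Euclidean distance does not.
print(cosine_similarity(u, v), euclidean_distance(u, v), dot(u, v))
```

Which metric is appropriate depends on how the embeddings were trained; for unit-length vectors, ranking by cosine similarity, dot product, or Euclidean distance gives the same order.
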
Approximate Nearest Neighbors (ANN)
Sub-Linear ANN
- Clustering
- Space Partitioning (e.g. K-D Trees)
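
The clustering idea can be sketched as follows: group the vectors into buckets by nearest centroid, then at query time scan only the buckets whose centroids are closest to the query. This is a toy version under stated assumptions: the centroids are hard-coded (in practice they would come from something like k-means), and `build_index`, `search`, and `nprobe` are names picked for this example.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_index(vectors, centroids):
    """Put each vector id into the bucket of its nearest centroid."""
    buckets = [[] for _ in centroids]
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)),
                      key=lambda c: sq_dist(vec, centroids[c]))
        buckets[nearest].append(vid)
    return buckets


def search(query, vectors, centroids, buckets, k=1, nprobe=1):
    """Return up to k vector ids, scanning only the nprobe closest buckets."""
    order = sorted(range(len(centroids)),
                   key=lambda c: sq_dist(query, centroids[c]))
    candidates = [vid for c in order[:nprobe] for vid in buckets[c]]
    candidates.sort(key=lambda vid: sq_dist(query, vectors[vid]))
    return candidates[:k]


vectors = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids = [(0.33, 0.33), (10.33, 10.33)]  # assumed, not learned here
buckets = build_index(vectors, centroids)

# Only the bucket around (10.33, 10.33) is scanned: 3 distance
# computations against vectors instead of 6.
print(search((9, 9), vectors, centroids, buckets, k=1, nprobe=1))
```

The result is approximate because a true nearest neighbor that sits in an unprobed bucket is never examined. Raising `nprobe` trades speed for recall; with `nprobe` equal to the number of buckets this degenerates to an exact linear scan.
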
What a vector database does
- Perform nearest-neighbors with many different similarity metrics
- Store the vectors and the data structures to support sub-linear ANN
- Handle updates, deletes, rebalancing/reclustering, backups/restores
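
To show the shape of that interface, here is a deliberately tiny in-memory sketch. `ToyVectorStore` and its methods are hypothetical names, not the API of any real product, and the query is a brute-force linear scan; a real vector database would add a sub-linear index, persistence, and index maintenance on top of this.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Each metric returns a score where higher means more similar.
# "cosine" assumes non-zero vectors.
METRICS = {
    "dot": _dot,
    "cosine": lambda a, b: _dot(a, b) / math.sqrt(_dot(a, a) * _dot(b, b)),
    "euclidean": lambda a, b: -math.sqrt(
        sum((x - y) ** 2 for x, y in zip(a, b))),
}


class ToyVectorStore:
    def __init__(self):
        self._items = {}  # item_id -> (vector, payload)

    def upsert(self, item_id, vector, payload=None):
        """Insert a new item or overwrite an existing one."""
        self._items[item_id] = (list(vector), payload)

    def delete(self, item_id):
        self._items.pop(item_id, None)

    def query(self, vector, k=3, metric="cosine"):
        """Return the k best (score, item_id, payload) tuples."""
        score = METRICS[metric]
        scored = [(score(vector, vec), item_id, payload)
                  for item_id, (vec, payload) in self._items.items()]
        scored.sort(key=lambda t: t[0], reverse=True)
        return scored[:k]


store = ToyVectorStore()
store.upsert("cat", [1.0, 0.0], {"text": "cat"})
store.upsert("kitten", [0.9, 0.1], {"text": "kitten"})
store.upsert("car", [0.0, 1.0], {"text": "car"})

top_two = [item_id for _, item_id, _ in store.query([1.0, 0.0], k=2)]
store.delete("cat")
after_delete = [item_id for _, item_id, _ in store.query([1.0, 0.0], k=2)]
print(top_two, after_delete)
```

The hard part of a real system is hidden in the last bullet above: once a clustering or tree index exists, every upsert and delete has to keep that index consistent, which is why rebalancing and reclustering are listed as core responsibilities.
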
Examples
- pgvector: a vector-search extension for PostgreSQL
- Weaviate, Pinecone
- Milvus
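
For a feel of what using pgvector looks like, here is a sketch that only builds SQL strings in Python and does not connect to a database. The table name `items`, the column `embedding`, the dimension 3, and the helper names are assumptions for this example. The distance operators are pgvector's own: `<->` for L2 distance, `<#>` for negative inner product, and `<=>` for cosine distance.

```python
# pgvector distance operators; smaller values sort first, which is why
# the inner-product operator returns the negated product.
OPERATORS = {"l2": "<->", "inner_product": "<#>", "cosine": "<=>"}

# Hypothetical schema for this example.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE items (id bigserial PRIMARY KEY, embedding vector(3));
"""


def to_pgvector_literal(vec):
    """Format a Python sequence as pgvector's text form, e.g. '[1.0,2.0,3.0]'."""
    return "[" + ",".join(str(float(x)) for x in vec) + "]"


def nearest_sql(table, column, metric="l2", k=5):
    """Build a k-nearest-neighbor query with one %s placeholder for the vector.

    `table` and `column` are interpolated directly, so they must be trusted
    identifiers, never user input.
    """
    op = OPERATORS[metric]
    return (f"SELECT id FROM {table} "
            f"ORDER BY {column} {op} %s::vector LIMIT {int(k)}")


query = nearest_sql("items", "embedding", metric="cosine", k=5)
param = to_pgvector_literal([3, 1, 2])
print(query)
print(param)
```

With a PostgreSQL driver, `query` and `param` would be passed together to the cursor's execute call. Without an index this is an exact scan; pgvector can also build approximate indexes on the column, which is where the sub-linear ANN structures discussed above come in.
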

★ Support this podcast on Patreon ★