Yury Malkov - Staff Engineer, Twitter - Author of the most adopted ANN algorithm HNSW - podcast episode cover

Yury Malkov - Staff Engineer, Twitter - Author of the most adopted ANN algorithm HNSW

Jan 31, 20221 hr 30 minSeason 1Ep. 7
--:--
--:--
Download Metacast podcast app
Listen to this episode in Metacast mobile app
Don't just listen to podcasts. Learn from them with transcripts, summaries, and chapters for every episode. Skim, search, and bookmark insights. Learn more

Episode description

Topics:

00:00 Introduction

01:04 Yury’s background in laser physics, computer vision and startups

05:14 How Yury entered the field of nearest neighbor search and his impression of it

09:03 “Not all Small Worlds are Navigable”

10:10 Gentle introduction into the theory of Small World Navigable Graphs and related concepts

13:55 Further clarification on the input constraints for the NN search algorithm design

15:03 What did not work in NSW algorithm and how did Yury set up to invent new algorithm called HNSW

24:06 Collaboration with Leo Boytsov on integrating HNSW in nmslib

26:01 Differences between HNSW and NSW

27:55 Does algorithm always converge?

31:56 How FAISS’s implementation is different from the original HNSW

33:13 Could Yury predict that his algorithm would be implemented in so many frameworks and vector databases in languages like Go and Rust?

36:51 How our perception of high-dimensional spaces change compared to 3D?

38:30 ANN Benchmarks

41:33 Feeling proud of the invention and publication process during 2,5 years!

48:10 Yury’s effort to maintain HNSW and its GitHub community and the algorithm’s design principles

53:29 Dmitry’s ANN algorithm KANNDI, which uses HNSW as a building block

1:02:16 Java / Python Virtual Machines, profiling and benchmarking. “Your analysis of performance contradicts the profiler”

1:05:36 What are Yury’s hopes and goals for HNSW and role of symbolic filtering in ANN in general

1:13:05 The future of ANN field: search inside a neural network, graph ANN

1:15:14 Multistage ranking with graph based nearest neighbor search

1:18:18 Do we have the “best” ANN algorithm? How ANN algorithms influence each other

1:21:27 Yury’s plans on publishing his ideas

1:23:42 The intriguing question of Why

Show notes:

- HNSW library: https://github.com/nmslib/hnswlib/

- HNSW paper Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. TPAMI, 42(4), 824-836. (arxiv:1603.09320)

- NSW paper Malkov, Y., Ponomarenko, A., Logvinov, A., & Krylov, V. (2014). Approximate nearest neighbor algorithm based on navigable small world graphs. Information Systems, 45, 61-68.

- Yury Lifshits’s paper: https://yury.name/papers/lifshits2009combinatorial.pdf

- Sergey Brin’s work in nearest neighbour search: GNAT - Geometric Near-neighbour Access Tree: [CiteSeerX — Near neighbor search in large metric spaces](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.173.8156)

- Podcast with Leo Boytsov: https://rare-technologies.com/rrp-4-leo-boytsov-knn-search/

- Million-Scale ANN Benchmarks: http://ann-benchmarks.com/

- Billion Scale ANN Benchmarks: https://github.com/harsha-simhadri/big-ann-benchmarks

- FALCONN algorithm: https://github.com/falconn-lib/falconn

- Mentioned navigable small world papers:

Kleinberg, J. M. (2000). Navigation in a small world. Nature, 406(6798), 845-845.;

Boguna, M., Krioukov, D., & Claffy, K. C. (2009). Navigability of complex networks. Nature Physics, 5(1), 74-80.

For the best experience, listen in Metacast app for iOS or Android