Yury Malkov - Staff Engineer, Twitter - Author of the most adopted ANN algorithm HNSW - podcast episode cover

Yury Malkov - Staff Engineer, Twitter - Author of the most adopted ANN algorithm HNSW

Jan 31, 20222 hr 30 minEp 7Transcript available on Metacast
--:--
--:--
Listen in podcast apps:

Episode description

Topics:

00:00 Introduction

01:04 Yury’s background in laser physics, computer vision and startups

05:14 How Yury entered the field of nearest neighbor search and his impression of it

09:03 “Not all Small Worlds are Navigable”

10:10 Gentle introduction into the theory of Small World Navigable Graphs and related concepts

13:55 Further clarification on the input constraints for the NN search algorithm design

15:03 What did not work in NSW algorithm and how did Yury set up to invent new algorithm called HNSW

24:06 Collaboration with Leo Boytsov on integrating HNSW in nmslib

26:01 Differences between HNSW and NSW

27:55 Does algorithm always converge?

31:56 How FAISS’s implementation is different from the original HNSW

33:13 Could Yury predict that his algorithm would be implemented in so many frameworks and vector databases in languages like Go and Rust?

36:51 How our perception of high-dimensional spaces change compared to 3D?

38:30 ANN Benchmarks

41:33 Feeling proud of the invention and publication process during 2,5 years!

48:10 Yury’s effort to maintain HNSW and its GitHub community and the algorithm’s design principles

53:29 Dmitry’s ANN algorithm KANNDI, which uses HNSW as a building block

1:02:16 Java / Python Virtual Machines, profiling and benchmarking. “Your analysis of performance contradicts the profiler”

1:05:36 What are Yury’s hopes and goals for HNSW and role of symbolic filtering in ANN in general

1:13:05 The future of ANN field: search inside a neural network, graph ANN

1:15:14 Multistage ranking with graph based nearest neighbor search

1:18:18 Do we have the “best” ANN algorithm? How ANN algorithms influence each other

1:21:27 Yury’s plans on publishing his ideas

1:23:42 The intriguing question of Why

Show notes:

- HNSW library: https://github.com/nmslib/hnswlib/

- HNSW paper Malkov, Y. A., & Yashunin, D. A. (2018). Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs. TPAMI, 42(4), 824-836. (arxiv:1603.09320)

- NSW paper Malkov, Y., Ponomarenko, A., Logvinov, A., & Krylov, V. (2014). Approximate nearest neighbor algorithm based on navigable small world graphs. Information Systems, 45, 61-68.

- Yury Lifshits’s paper: https://yury.name/papers/lifshits2009combinatorial.pdf

- Sergey Brin’s work in nearest neighbour search: GNAT - Geometric Near-neighbour Access Tree: [CiteSeerX — Near neighbor search in large metric spaces](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.173.8156)

- Podcast with Leo Boytsov: https://rare-technologies.com/rrp-4-leo-boytsov-knn-search/

- Million-Scale ANN Benchmarks: http://ann-benchmarks.com/

- Billion Scale ANN Benchmarks: https://github.com/harsha-simhadri/big-ann-benchmarks

- FALCONN algorithm: https://github.com/falconn-lib/falconn

- Mentioned navigable small world papers:

Kleinberg, J. M. (2000). Navigation in a small world. Nature, 406(6798), 845-845.;

Boguna, M., Krioukov, D., & Claffy, K. C. (2009). Navigability of complex networks. Nature Physics, 5(1), 74-80.