![Joan Fontanals - Principal Engineer - Jina AI - podcast episode cover](https://media.rss.com/vector-podcast/20220119_090157_f67877f44bb32ae14fd380d9328691ec.jpg)
Episode description
Topics:
00:00 Intro
00:42 Joan's background
01:46 What attracted Joan to Jina as a company and product?
04:39 Main area of focus for Joan in the product
05:46 How does the open source model work for Jina?
08:38 Deeper dive into Jina.AI as a product and technology stack
11:57 Does Jina fit the use cases of smaller and mid-size players with smaller amounts of data?
13:45 KNN/ANN algorithms available in Jina
16:05 BigANN competition and BuddyPQ: a 12% increase in recall over FAISS
17:07 Does Jina support customers in model training? Finetuner
20:46 How does the Jina framework compare to vector databases?
26:46 Jina's investment in user-friendly APIs
31:04 Applications of Jina beyond search engines, like question answering systems
33:20 How to bring bits of neural search into traditional keyword retrieval? Connection to model interpretability
41:14 Does Jina support going multimodal, including images, audio, etc.?
46:03 The magical question of Why
55:20 Product announcement from Joan
Order your Jina swag: https://docs.google.com/forms/d/e/1FAIpQLSedYVfqiwvdzWPX-blCpVu-tQoiFiUJQz2QnIHU1ggy1oyg/ Use this promo code: vectorPodcastxJinaAI
Show notes:
- HNSW + PostgreSQL Indexer: [GitHub - jina-ai/executor-hnsw-postgres: A production-ready, scalable Indexer for the Jina neural search framework, based on HNSW and PSQL](https://github.com/jina-ai/executor-h...)
- pqlite: [GitHub - jina-ai/pqlite: A fast embedded library for Approximate Nearest Neighbor Search integrated with the Jina ecosystem](https://github.com/jina-ai/pqlite)
- BuddyPQ: [Billion-Scale Vector Search: Team Sisu and BuddyPQ | by Dmitry Kan | Big-ANN-Benchmarks | Nov, 2021 | Medium](https://medium.com/big-ann-benchmarks...)
- PaddlePaddle: [GitHub - PaddlePaddle/Paddle: PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)](https://github.com/PaddlePaddle/Paddle)
- Jina Finetuner: [Finetuner 0.3.1 documentation](https://finetuner.jina.ai/)
- [Not All Vector Databases Are Made Equal | by Dmitry Kan | Towards Data Science](https://towardsdatascience.com/milvus...)
- Fluent interface (method chaining): [Fluent interfaces in Python | Florian Einfalt – Developer](https://florianeinfalt.de/posts/fluen...)
- Sujit Pal’s blog: [Salmon Run](http://sujitpal.blogspot.com/)
- ByT5: [ByT5: Towards a token-free future with pre-trained byte-to-byte models](https://arxiv.org/abs/2105.13626)
Special thanks to Saurabh Rai for the Podcast Thumbnail: https://twitter.com/srbhr_ https://www.linkedin.com/in/srbh077/