Joan Fontanals - Principal Engineer - Jina AI

Jan 19, 2022 | 57 min | Season 1, Ep. 6

Episode description

Topics:

00:00 Intro

00:42 Joan's background

01:46 What attracted Joan's attention in Jina as a company and product?

04:39 Main area of focus for Joan in the product

05:46 How does the open-source model work for Jina?

08:38 Deeper dive into Jina.AI as a product and technology stack

11:57 Does Jina fit the use cases of smaller / mid-size players with smaller amounts of data?

13:45 KNN/ANN algorithms available in Jina

16:05 BigANN competition and BuddyPQ, with a 12% increase in recall over FAISS

17:07 Does Jina support customers in model training? Finetuner

20:46 How does Jina framework compare to Vector Databases?

26:46 Jina's investment in user-friendly APIs

31:04 Applications of Jina beyond search engines, like question answering systems

33:20 How to bring bits of neural search into traditional keyword retrieval? Connection to model interpretability

41:14 Does Jina allow going multimodal, including images, audio, etc.?

46:03 The magical question of Why

55:20 Product announcement from Joan

Order your Jina swag: https://docs.google.com/forms/d/e/1FAIpQLSedYVfqiwvdzWPX-blCpVu-tQoiFiUJQz2QnIHU1ggy1oyg/ (use this promo code: vectorPodcastxJinaAI)

Show notes:

- Jina.AI: https://jina.ai/

- HNSW + PostgreSQL Indexer: [GitHub - jina-ai/executor-hnsw-postgres: A production-ready, scalable Indexer for the Jina neural search framework, based on HNSW and PSQL](https://github.com/jina-ai/executor-h...)

- pqlite: [GitHub - jina-ai/pqlite: A fast embedded library for Approximate Nearest Neighbor Search integrated with the Jina ecosystem](https://github.com/jina-ai/pqlite)

- BuddyPQ: [Billion-Scale Vector Search: Team Sisu and BuddyPQ | by Dmitry Kan | Big-ANN-Benchmarks | Nov, 2021 | Medium](https://medium.com/big-ann-benchmarks...)

- PaddlePaddle: [GitHub - PaddlePaddle/Paddle: PArallel Distributed Deep LEarning: Machine Learning Framework from Industrial Practice (『飞桨』核心框架,深度学习&机器学习高性能单机、分布式训练和跨平台部署)](https://github.com/PaddlePaddle/Paddle)

- Jina Finetuner: [Finetuner 0.3.1 documentation](https://finetuner.jina.ai/)

- [Not All Vector Databases Are Made Equal | by Dmitry Kan | Towards Data Science](https://towardsdatascience.com/milvus...)

- Fluent interface (method chaining): [Fluent interfaces in Python | Florian Einfalt – Developer](https://florianeinfalt.de/posts/fluen...)

- Sujit Pal’s blog: [Salmon Run](http://sujitpal.blogspot.com/)

- ByT5: Towards a token-free future with pre-trained byte-to-byte models https://arxiv.org/abs/2105.13626

Special thanks to Saurabh Rai for the Podcast Thumbnail: https://twitter.com/srbhr_ https://www.linkedin.com/in/srbh077/
