AI Engineering Podcast

Tobias Macey•www.aiengineeringpodcast.com

This show is your guidebook to building scalable and maintainable AI systems. You will learn how to architect AI applications, apply AI to your work, and the considerations involved in building or customizing new models. Everything that you need to know to deliver real impact and value with machine learning and artificial intelligence.

Episodes

The Power of Community in AI Development with Oumi

Summary In this episode of the AI Engineering Podcast Emmanouil (Manos) Koukoumidis, CEO of Oumi, talks about his vision for an open platform for building, evaluating, and deploying AI foundation models. Manos shares his journey from working on natural language AI services at Google Cloud to founding Oumi with a mission to advance open-source AI, emphasizing the importance of community collaboration and accessibility. He discusses the need for open-source models that are not constrained by proprietary...

Mar 16, 2025•56 min•Ep. 48

Arch Gateway: Add AI To Your Apps Without Custom Development

Summary In this episode of the AI Engineering Podcast Adil Hafiz talks about the Arch project, a gateway designed to simplify the integration of AI agents into business systems. He discusses how the gateway uses Rust and Envoy to provide a unified interface for handling prompts and integrating large language models (LLMs), allowing developers to focus on core business logic rather than AI complexities. The conversation also touches on the target audience, challenges, and future directions for th...

Feb 26, 2025•31 min•Ep. 47

The Role Of Synthetic Data In Building Better AI Applications

Summary In this episode of the AI Engineering Podcast Ali Golshan, co-founder and CEO of Gretel.ai, talks about the transformative role of synthetic data in AI systems. Ali explains how synthetic data can be purpose-built for AI use cases, emphasizing privacy, quality, and structural stability. He highlights the shift from traditional methods to using language models, which offer enhanced capabilities in understanding data's deep structure and generating high-quality datasets. The conversation e...

Feb 16, 2025•54 min•Ep. 46

Optimize Your AI Applications Automatically With The TensorZero LLM Gateway

Summary In this episode of the AI Engineering podcast Viraj Mehta, CTO and co-founder of TensorZero, talks about the use of LLM gateways for managing interactions between client-side applications and various AI models. He highlights the benefits of using such a gateway, including standardized communication, credential management, and potential features like request-response caching and audit logging. The conversation also explores TensorZero's architecture and functionality in optimizing AI appl...

Jan 22, 2025•1 hr 3 min•Ep. 45

Harnessing The Engine Of AI

Summary In this episode of the AI Engineering Podcast Ron Green, co-founder and CTO of KungFu AI, talks about the evolving landscape of AI systems and the challenges of harnessing generative AI engines. Ron shares his insights on the limitations of large language models (LLMs) as standalone solutions and emphasizes the need for human oversight, multi-agent systems, and robust data management to support AI initiatives. He discusses the potential of domain-specific AI solutions, RAG approaches, an...

Dec 16, 2024•55 min•Ep. 44

The Complex World of Generative AI Governance

Summary In this episode of the AI Engineering Podcast Jim Olsen, CTO of ModelOp, talks about the governance of generative AI models and applications. Jim shares his extensive experience in software engineering and machine learning, highlighting the importance of governance in high-risk applications like healthcare. He explains that governance is more about the use cases of AI models rather than the models themselves, emphasizing the need for proper inventory and monitoring to ensure compliance a...

Dec 01, 2024•54 min•Ep. 43

Building Semantic Memory for AI With Cognee

Summary In this episode of the AI Engineering Podcast, Vasilije Markovich talks about enhancing Large Language Models (LLMs) with memory to improve their accuracy. He discusses the concept of memory in LLMs, which involves managing context windows to enhance reasoning without the high costs of traditional training methods. He explains the challenges of forgetting in LLMs due to context window limitations and introduces the idea of hierarchical memory, where immediate retrieval and long-term info...

Nov 25, 2024•55 min•Ep. 42

The Impact of Generative AI on Software Development

Summary In this episode of the AI Engineering Podcast, Tanner Burson, VP of Engineering at Prismatic, talks about the evolving impact of generative AI on software developers. Tanner shares his insights from engineering leadership and data engineering initiatives, discussing how AI is blurring the lines of developer roles and the strategic value of AI in software development. He explores the current landscape of AI tools, such as GitHub's Copilot, and their influence on productivity and workflow,...

Nov 22, 2024•53 min•Ep. 41

ML Infrastructure Without The Ops: Simplifying The ML Developer Experience With Runhouse

Summary Machine learning workflows have long been complex and difficult to operationalize. They are often characterized by a period of research, resulting in an artifact that gets passed to another engineer or team to prepare for running in production. The MLOps category of tools has tried to build a new set of utilities to reduce that friction, but has instead introduced a new barrier at the team and organizational level. Donny Greenberg took the lessons that he learned on the PyTorch team at...

Nov 11, 2024•1 hr 16 min•Ep. 40

Building AI Systems on Postgres: An Inside Look at pgai Vectorizer

Summary With the growth of vector data as a core element of any AI application comes the need to keep those vectors up to date. When you go beyond prototypes and into production you will need a way to continue experimenting with new embedding models, chunking strategies, etc. You will also need a way to keep the embeddings up to date as your data changes. The team at Timescale created the pgai Vectorizer toolchain to let you manage that work in your Postgres database. In this episode Avthar Sewr...

Nov 11, 2024•54 min•Ep. 39

Running Generative AI Models In Production

Summary In this episode Philip Kiely from BaseTen talks about the intricacies of running open models in production. Philip shares his journey into AI and ML engineering, highlighting the importance of understanding product-level requirements and selecting the right model for deployment. The conversation covers the operational aspects of deploying AI models, including model evaluation, compound AI, and model serving frameworks such as TensorFlow Serving and AWS SageMaker. Philip also discusses th...

Oct 28, 2024•58 min•Ep. 38

Enhancing AI Retrieval with Knowledge Graphs: A Deep Dive into GraphRAG

Summary In this episode of the AI Engineering Podcast, Philip Rathle, CTO of Neo4j, talks about the intersection of knowledge graphs and AI retrieval systems, specifically Retrieval Augmented Generation (RAG). He delves into GraphRAG, a novel approach that combines knowledge graphs with vector-based similarity search to enhance generative AI models. Philip explains how GraphRAG works by integrating a graph database for structured data storage, providing more accurate and explainable AI responses...

Sep 10, 2024•59 min•Ep. 37

Harnessing Generative AI for Effective Digital Advertising Campaigns

Summary In this episode of the AI Engineering podcast Praveen Gujar, Director of Product at LinkedIn, talks about the applications of generative AI in digital advertising. He highlights the key areas of digital advertising, including audience targeting, content creation, and ROI measurement, and delves into how generative AI is revolutionizing these aspects. Praveen shares successful case studies of generative AI in digital advertising, including campaigns by Heinz, the Barbie movie, and Maggi, ...

Sep 02, 2024•42 min•Ep. 36

Building Scalable ML Systems on Kubernetes

Summary In this episode of the AI Engineering Podcast, host Tobias Macey interviews Tammer Saleh, founder of SuperOrbital, about the potentials and pitfalls of using Kubernetes for machine learning workloads. The conversation delves into the specific needs of machine learning workflows, such as model tracking, versioning, and the use of Jupyter Notebooks, and how Kubernetes can support these tasks. Tammer emphasizes the importance of a unified API for different teams and the flexibility Kubernete...

Aug 15, 2024•50 min•Ep. 35

Expert Insights On Retrieval Augmented Generation And How To Build It

Summary In this episode we're joined by Matt Zeiler, founder and CEO of Clarifai, as he dives into the technical aspects of retrieval augmented generation (RAG). From his journey into AI at the University of Toronto to founding one of the first deep learning AI companies, Matt shares his insights on the evolution of neural networks and generative models over the last 15 years. He explains how RAG addresses issues with large language models, including data staleness and hallucinations, by providi...

Jul 28, 2024•1 hr 3 min•Ep. 34

Barking Up The Wrong GPTree: Building Better AI With A Cognitive Approach

Summary Artificial intelligence has dominated the headlines for several months due to the successes of large language models. This has prompted numerous debates about the possibility of, and timeline for, artificial general intelligence (AGI). Peter Voss has dedicated decades of his life to the pursuit of truly intelligent software through the approach of cognitive AI. In this episode he explains his approach to building AI in a more human-like fashion and the emphasis on learning rather than st...

Jul 28, 2024•53 min•Ep. 33

Build Your Second Brain One Piece At A Time

Summary Generative AI promises to accelerate the productivity of human collaborators. Currently the primary way of working with these tools is through a conversational prompt, which is often cumbersome and unwieldy. In order to simplify the integration of AI capabilities into developer workflows Tsavo Knott helped create Pieces, a powerful collection of tools that complements the tools that developers already use. In this episode he explains the data collection and preparation process, the colle...

Jul 28, 2024•48 min•Ep. 32

Strategies For Building A Product Using LLMs At DataChat

Summary Large Language Models (LLMs) have rapidly captured the attention of the world with their impressive capabilities. Unfortunately, they are often unpredictable and unreliable. This makes building a product based on their capabilities a unique challenge. Jignesh Patel is building DataChat to bring the capabilities of LLMs to organizational analytics, allowing anyone to have conversations with their business data. In this episode he shares the methods that he is using to build a product on t...

Mar 03, 2024•49 min•Ep. 31

Improve The Success Rate Of Your Machine Learning Projects With bizML

Summary Machine learning is a powerful set of technologies, holding the potential to dramatically transform businesses across industries. Unfortunately, the implementation of ML projects often fails to achieve its intended goals. This failure is due to a lack of collaboration and investment across technological and organizational boundaries. To help improve the success rate of machine learning projects Eric Siegel developed the six step bizML framework, outlining the process to ensure that ever...

Feb 18, 2024•50 min•Ep. 30

Using Generative AI To Accelerate Feature Engineering At FeatureByte

Summary One of the most time consuming aspects of building a machine learning model is feature engineering. Generative AI offers the possibility of accelerating the discovery and creation of feature pipelines. In this episode Colin Priest explains how FeatureByte is applying generative AI models to the challenge of building and maintaining machine learning pipelines. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea ...

Feb 11, 2024•45 min•Ep. 29

Learn And Automate Critical Business Workflows With 8Flow

Summary Every business develops their own specific workflows to address their internal organizational needs. Not all of them are properly documented, or even visible. Workflow automation tools have tried to reduce the manual burden involved, but they are rigid and require substantial investment of time to discover and develop the routines. Boaz Hecht co-founded 8Flow to iteratively discover and automate pieces of workflows, bringing visibility and collaboration to the internal organizational pro...

Jan 28, 2024•43 min•Ep. 28

Considering The Ethical Responsibilities Of ML And AI Engineers

Summary Machine learning and AI applications hold the promise of drastically impacting every aspect of modern life. With that potential for profound change comes a responsibility for the creators of the technology to account for the ramifications of their work. In this episode Nicholas Cifuentes-Goodbody guides us through the minefields of social, technical, and ethical considerations that are necessary to ensure that this next generation of technical and economic systems are equitable and benef...

Jan 28, 2024•39 min•Ep. 27

Build Intelligent Applications Faster With RelationalAI

Summary Building machine learning systems and other intelligent applications is a complex undertaking. This often requires retrieving data from a warehouse engine, adding an extra barrier to every workflow. The RelationalAI engine was built as a co-processor for your data warehouse that adds a greater degree of flexibility in the representation and analysis of the underlying information, simplifying the work involved. In this episode CEO Molham Aref explains how RelationalAI is designed, the ca...

Dec 31, 2023•58 min•Ep. 26

Building Better AI While Preserving User Privacy With TripleBlind

Summary Machine learning and generative AI systems have produced truly impressive capabilities. Unfortunately, many of these applications are not designed with the privacy of end-users in mind. TripleBlind is a platform focused on embedding privacy preserving techniques in the machine learning process to produce more user-friendly AI products. In this episode Gharib Gharibi explains how the current generation of applications can be susceptible to leaking user data and how to counteract those tre...

Nov 22, 2023•47 min•Ep. 25

Enhancing The Abilities Of Software Engineers With Generative AI At Tabnine

Summary Software development involves an interesting balance of creativity and repetition of patterns. Generative AI has accelerated the ability of developer tools to provide useful suggestions that speed up the work of engineers. Tabnine is one of the main platforms offering an AI powered assistant for software engineers. In this episode Eran Yahav shares the journey that he has taken in building this product and the ways that it enhances the ability of humans to get their work done, and when t...

Nov 13, 2023•1 hr 5 min•Ep. 24

Validating Machine Learning Systems For Safety Critical Applications With Ketryx

Summary Software systems power much of the modern world. For applications that impact the safety and well-being of people there is an extra set of precautions that need to be addressed before deploying to production. If machine learning and AI are part of that application then there is a greater need to validate the proper functionality of the models. In this episode Erez Kaminski shares the work that he is doing at Ketryx to make that validation easier to implement and incorporate into the ongo...

Nov 08, 2023•51 min•Ep. 23

Applying Declarative ML Techniques To Large Language Models For Better Results

Summary Large language models have gained a substantial amount of attention in the area of AI and machine learning. While they are impressive, there are many applications where they are not the best option. In this episode Piero Molino explains how declarative ML approaches allow you to make the best use of the available tools across use cases and data formats. Announcements Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to del...

Oct 24, 2023•46 min•Ep. 22

Surveying The Landscape Of AI and ML From An Investor's Perspective

Summary Artificial Intelligence is experiencing a renaissance in the wake of breakthrough natural language models. With new businesses sprouting up to address the various needs of ML and AI teams across the industry, it is a constant challenge to stay informed. Matt Turck has been compiling a report on the state of ML, AI, and Data for his work at FirstMark Capital. In this episode he shares his findings on the ML and AI landscape and the interesting trends that are developing. Announcements Hel...

Oct 15, 2023•1 hr 3 min•Ep. 21

Applying Federated Machine Learning To Sensitive Healthcare Data At Rhino Health

Summary A core challenge of machine learning systems is getting access to quality data. This often means centralizing information in a single system, but that is impractical in highly regulated industries, such as healthcare. To address this hurdle Rhino Health is building a platform for federated learning on health data, so that everyone can maintain data privacy while benefiting from AI capabilities. In this episode Ittai Dayan explains the barriers to ML in healthcare and how they have desig...

Sep 11, 2023•50 min•Ep. 20

Using Machine Learning To Keep An Eye On The Planet

Summary Satellite imagery has given us a new perspective on our world, but it is limited by the field of view for the cameras. Synthetic Aperture Radar (SAR) allows for collecting images through clouds and in the dark, giving us a more consistent means of collecting data. In order to identify interesting details in such a vast amount of data it is necessary to use the power of machine learning. ICEYE has a fleet of satellites continuously collecting information about our planet. In this episode ...

Jun 17, 2023•43 min•Ep. 19