In this episode, we'll explore how the data science team from Klaviyo developed a Generative AI Product that enhances experiences and enables efficient customer segmentation. We'll also discuss two key concepts in Generative AI: prompt chaining and few-shot learning. Based on their published tech blog, with the link provided here for your reference: https://klaviyo.tech/building-segments-ai-fe33d9cab822
May 13, 2024•16 min
In this episode, we will talk about the importance of remediating failed workflow jobs to reduce business infrastructure costs. We delve into Netflix's approach, which involves enhancing their existing rule-based error classifier with advanced machine learning models. This allowed for auto-remediation, improving the handling of memory configuration and unclassified errors, ultimately leading to substantial cost savings. Based on their published tech blog, with the link provided here...
May 06, 2024•14 min
In this episode, we will talk about what technical debt is in software engineering and the risks associated with it. We will also share a set of metrics for measuring the state of technical debt and ways to help companies quantify their progress toward better software engineering. Based on the tech blog published by Booking.com, with the link provided here for your reference: https://medium.com/booking-com-development/measuring-technical-debt-to-avoid-the-boiling-frog-syndrome-c44eb48b3ce1...
Apr 29, 2024•13 min
In this episode, we will discuss how the Science team at Amazon designs pricing experiments to improve experimentation power. We will cover concepts like the carryover effect and spillover effect, as well as the solutions the team developed to overcome those challenges. Based on their published tech blog, with the link provided here for your reference: https://www.amazon.science/blog/the-science-of-price-experiments-in-the-amazon-store...
Apr 22, 2024•16 min
In this episode, we will talk about what position bias is in recommendation systems, and how the applied scientist team at Uber tackled this challenge. Based on their published tech blog, with the link provided here for your reference: https://www.uber.com/blog/improving-uber-eats-home-feed-recommendations
Apr 15, 2024•14 min
In this episode, we will take a look at an analytical approach to decision-making called the Analytic Hierarchy Process (AHP). The New York Times team uses AHP to make better collective decisions about their technology choices for privacy. Based on their published tech blog, with the link provided here for your reference: https://open.nytimes.com/collective-decision-making-with-ahp-3ef819e5bc2a...
Apr 08, 2024•15 min
In this episode, we delve into a study conducted by Intuit on measuring the productivity impact of an internal GenAI tool on their data analyst team. It demonstrates positive productivity gains, offering an exciting outlook for many in the industry. Based on their published tech blog, with the link provided here for your reference: https://medium.com/intuit-engineering/how-intuit-data-analysts-write-sql-2x-faster-with-internal-genai-tool-c3b9d482208a...
Apr 01, 2024•12 min
In this episode, we discuss how the data science team at Microsoft designed perturbation analysis to understand the Large Language Model’s performance on commonly seen tasks. Based on their published tech blog, with the link provided here for your reference: https://medium.com/data-science-at-microsoft/perturbation-analysis-and-llms-how-sensitive-are-llms-to-their-input-91a8407a971f...
Mar 25, 2024•14 min
In this episode, we discuss how DraftKings leverages Monte Carlo simulation to predict tennis game outcomes and generate probabilities to power its online gambling service. Based on their published tech blog, with the link provided here for your reference: https://medium.com/draftkings-engineering/building-a-tennis-simulation-d6afdaa97d19
Mar 18, 2024•11 min
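To make the Monte Carlo idea in the DraftKings episode concrete, here is a minimal sketch. It is not DraftKings' model: it assumes a single fixed probability `p` that player A wins any point (ignoring serve, surface, and fatigue), best-of-three sets, and a tiebreak at 6-6. All function names are illustrative. The match win probability is estimated as the fraction of simulated matches that player A wins.

```python
import random

def play_game(p, rng, target=4):
    """Simulate one game (or a tiebreak when target=7); True if A wins it."""
    a = b = 0
    while True:
        if rng.random() < p:
            a += 1
        else:
            b += 1
        # A game is won by reaching the target with a two-point margin.
        if a >= target and a - b >= 2:
            return True
        if b >= target and b - a >= 2:
            return False

def play_set(p, rng):
    """Simulate one set: first to 6 games by two, tiebreak at 6-6."""
    a = b = 0
    while True:
        if a == 6 and b == 6:
            return play_game(p, rng, target=7)
        if play_game(p, rng):
            a += 1
        else:
            b += 1
        if a >= 6 and a - b >= 2:
            return True
        if b >= 6 and b - a >= 2:
            return False

def play_match(p, rng, sets_to_win=2):
    """Simulate one match; True if A is first to win sets_to_win sets."""
    a = b = 0
    while a < sets_to_win and b < sets_to_win:
        if play_set(p, rng):
            a += 1
        else:
            b += 1
    return a == sets_to_win

def estimate_match_win_probability(p, n_matches=10000, seed=0):
    """Monte Carlo estimate: share of simulated matches won by A."""
    rng = random.Random(seed)
    wins = sum(play_match(p, rng) for _ in range(n_matches))
    return wins / n_matches
```

Calling `estimate_match_win_probability(0.55)` shows how a small per-point edge compounds into a much larger match-level edge under these assumptions.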
In this episode, we discuss the insights shared by Expedia's Machine Learning Engineering team on how they leverage the two-tower neural network architecture in the candidate generation stage of their recommendation system. Based on their published tech blog, with the link provided here for your reference: https://medium.com/expedia-group-tech/candidate-generation-using-a-two-tower-approach-with-expedia-group-traveler-data-ca6a0dcab83e...
Mar 11, 2024•10 min
In this episode, we discuss how the Engineering team at Walmart created a customized Large Language Model (LLM) agent with Retrieval Augmented Generation (RAG) Technology to improve their agile planning efficiency. Based on their published tech blog, with the link provided here for your reference: https://medium.com/walmartglobaltech/an-autonomous-agent-for-agile-planning-98303e194e08...
Mar 04, 2024•14 min
In this episode, we discuss why and how Intuit leverages automated sanity checks to streamline its machine learning development process. Based on their published tech blog, with the link provided here for your reference: https://medium.com/intuit-engineering/how-to-streamline-ml-model-deployment-automated-sanity-checks-64a23166fdc5
Feb 26, 2024•11 min
In this episode, we take a look at the learnings from an online grocery tech company regarding their development of machine learning models for scalable demand forecasting. Based on the tech blog from Picnic International, the link is provided here for your reference: https://blog.picnic.nl/running-demand-forecasting-machine-learning-models-at-scale-bd058c9d4aa7
Feb 19, 2024•12 min
In this episode, we discuss the online-offline discrepancy problem in machine learning models and explore Pinterest's journey to tackle this challenge in their ads ranking system. Based on the tech blog from Pinterest, the link is provided here for your reference: https://medium.com/pinterest-engineering/handling-online-offline-discrepancy-in-pinterest-ads-ranking-system-8fd662da4c2d
Feb 12, 2024•14 min
In this episode, we explore why and how Instacart’s advertising team unified multiple browse click-through-rate prediction models by leveraging an advanced deep learning architecture. Based on the tech blog from Instacart, the link is provided here for your reference: https://tech.instacart.com/one-model-to-serve-them-all-0eb6bf60b00d
Feb 05, 2024•10 min
In this episode, we explore how ZipRecruiter, an online employment marketplace, leverages deep learning and encoder modules to develop their recommendation system, which is designed to better match job seekers with job listings. Based on the tech blog from ZipRecruiter, the link is provided here for your reference: https://medium.com/ziprecruiter-tech/multimodal-learning-for-employment-marketplace-recommendation-ee67bdbede53
Jan 29, 2024•13 min
In this episode, we look at how Oscar Health, a health insurance company, leverages the large language model (LLM) GPT-4 to develop its claim assistant feature. They have built a reasonable assessment system and deployed several strategies to guide GPT-4 in further improvements. Based on the tech blog from Oscar Health, the link is provided here for your reference: https://medium.com/oscar-tech/oscar-claim-assistant-built-on-gpt-4-5bd7eb4d6129
Jan 22, 2024•12 min
This episode shares how the applied science team developed an ad simulation system as a more efficient and economical evaluation method for the Uber Eats Ads business. This tool has proven instrumental in guiding and expediting product development and driving growth. Based on the tech blog from Uber, the link is provided here for your reference: https://www.uber.com/blog/unleashing-the-power-of-ads-simulation/
Jan 15, 2024•12 min
This episode shares how the HelloFresh data science team explores the use of generative AI to create new food recipes. Based on the tech blog shared by a Data Scientist from HelloFresh, the blog link is: https://medium.com/hellofresh-dev/recipes-and-generative-ai-6d74a107860c
Jan 08, 2024•13 min
This podcast episode delves into my journey of utilizing unconventional learning methods to fulfill my 2023 New Year's resolution of enhancing my chess skills. The high-level concept and philosophy apply to learning in general, and the episode aims to inspire others to reflect on their best learning strategies for various topics, such as Data Science and Machine Learning.
Jan 01, 2024•11 min
This episode includes a re-distribution of the interview I had with Daliana Liu on her podcast "The Data Scientist Show" (https://open.spotify.com/episode/0HzWSLIWpz6iHJiKXs0GNJ). There, I shared my thoughts about the difference in focus between machine learning engineering and product data science, some projects I worked on at Uber and LinkedIn, and other topics, such as why I transitioned from an individual contributor into a manager and the cultures I experienced in different tech compani...
Dec 25, 2023•1 hr 15 min
This episode discusses one company's journey to enable product personalization, moving from customer-cohort-based recommendations to a user-level personalized experience using a Learning to Rank algorithm. Based on the tech blog shared by Data Scientists from CARS24, the blog link is: https://medium.com/cars24-data-science-blog/personalized-buyer-listings-at-cars24-an-overview-83d8428bd7d9...
Dec 18, 2023•12 min
This episode discusses how to leverage the marketing mix model (MMM) to better understand and optimize a company's marketing strategies. Based on the tech blog shared by a Data Scientist from Haleon, the blog link is: https://medium.com/trusted-data-science-haleon/optimising-marketing-allocation-with-marketing-mix-models-382c9e471dde
Dec 11, 2023•11 min
This episode discusses how the team develops a Generative AI solution to create engaging email subject lines by combining a reward model and prompt engineering over ChatGPT APIs. Based on the tech blog shared by Machine Learning Engineers from Nextdoor, the blog link is: https://engblog.nextdoor.com/let-ai-entertain-you-increasing-user-engagement-with-generative-ai-and-rejection-sampling-50a402264f56...
Dec 04, 2023•12 min
This episode discusses what customer segmentation is and how to apply the k-means clustering algorithm to Recency, Frequency, and Monetary (RFM) features to generate customer segments algorithmically. Based on the tech blog shared by a Data Scientist from Microsoft, the blog link is: https://medium.com/data-science-at-microsoft/introduction-to-clustering-based-customer-segmentation-2fac61e80100...
Nov 27, 2023•11 min
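As a rough illustration of the clustering idea in the Microsoft episode, the sketch below runs a small k-means over standardized RFM features using only the Python standard library. It is not the blog author's code: the customer values are made up, the function names are illustrative, and the farthest-point initialization is a simple deterministic choice assumed here for reproducibility (k-means++ is the more common randomized option).

```python
from statistics import mean, pstdev

def standardize(rows):
    """Scale each feature column to zero mean and unit standard deviation."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [tuple((v - m) / s for v, m, s in zip(row, mus, sds)) for row in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-point initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next centroid: the point farthest from its nearest chosen centroid.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(mean(dim) for dim in zip(*members))
    return labels, centroids

# Hypothetical customers: (recency in days, purchase count, total spend).
customers = [
    (5, 20, 900), (7, 18, 850), (3, 22, 1000),   # recent, frequent, high spend
    (200, 2, 40), (180, 1, 30), (220, 3, 60),    # lapsed, infrequent, low spend
]
labels, centroids = kmeans(standardize(customers), k=2)
```

Standardizing first matters because monetary values are on a much larger scale than purchase counts and would otherwise dominate the distance calculation.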
This episode discusses how contrastive learning (a self-supervised machine learning technique) is used to enable in-video search and to facilitate the creation of promotional videos. Based on the tech blog shared by Machine Learning Engineers from Netflix, the blog link is: https://netflixtechblog.com/building-in-video-search-936766f0017c
Nov 20, 2023•12 min
This episode discusses what geo experimentation is and how synthetic control methods can make it more efficient, along with some tips and good practices shared by the blog post author. Based on the tech blog shared by Data Scientists from Mercado Libre, the blog link is: https://medium.com/mercadolibre-tech/harnessing-the-power-of-geo-experimentation-how-mercado-libre-measures-the-effectiveness-of-its-f68b38857c4b...
Nov 13, 2023•13 min
This episode discusses the importance of faster experimentation velocity and how one powerful methodology, CUPED (Controlled-experiment Using Pre-Experiment Data), can be used to achieve the goal. Based on the tech blog shared by Data Scientists from Statsig, the blog link is: https://www.statsig.com/blog/cuped...
Nov 06, 2023•13 min
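For listeners who want to see the core CUPED adjustment from the Statsig episode in code, here is a minimal sketch on synthetic data. It is not Statsig's implementation. It assumes a single pre-experiment covariate `x` (for example, the same metric measured before the experiment) and applies the standard adjustment `y_adj = y - theta * (x - mean(x))` with `theta = cov(x, y) / var(x)`. In a real experiment, `theta` and `mean(x)` would typically be computed on the pooled treatment and control data.

```python
import random
from statistics import mean, variance

def cuped_adjust(y, x):
    """Return the CUPED-adjusted metric values for outcome y and covariate x."""
    x_bar, y_bar = mean(x), mean(y)
    cov_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)
    theta = cov_xy / variance(x)
    # Subtracting the centered covariate leaves the mean of y unchanged
    # while removing the variance that x explains.
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]

# Synthetic example: the in-experiment metric is strongly tied to the
# pre-experiment metric, plus independent noise.
rng = random.Random(42)
pre = [rng.gauss(100, 20) for _ in range(1000)]
post = [xi + rng.gauss(0, 5) for xi in pre]

adjusted = cuped_adjust(post, pre)
variance_ratio = variance(adjusted) / variance(post)
```

The closer `variance_ratio` is to zero, the more the covariate helps: lower metric variance means an experiment reaches the same statistical power with fewer samples, which is where the gain in experimentation velocity comes from.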
This episode discusses how math and data science approaches make geospatial search easy. Based on the tech blog shared by Data Scientists from Walmart, the blog link is: https://medium.com/walmartglobaltech/geospatial-search-made-easy-52c0f213ea93...
Oct 30, 2023•10 min
This episode discusses how the New York Times cooking team made personalized recipe recommendations. Based on the tech blog shared by Data Scientists from The New York Times, the blog link is: https://medium.com/@timesopen/how-the-new-york-times-cooking-team-makes-personalized-recipe-recommendations-669c26aa4825
Oct 23, 2023•11 min