
Snacks Weekly on Data Science

This podcast is about making data science and machine learning knowledge accessible and less intimidating. Every week, I handpick one industry tech blog post and break it down. We discuss key data science concepts and machine learning algorithms, and how they are applied in real-world applications. Subscribe to the channel and enjoy Snacks Weekly on Data Science!

Episodes

Building Generative AI Product for Customer Segmentation [Klaviyo]

In this episode, we'll explore how the data science team from Klaviyo developed a Generative AI Product that enhances experiences and enables efficient customer segmentation. We'll also discuss two key concepts in Generative AI: prompt chaining and few-shot learning. Based on their published tech blog, with the link provided here for your reference: https://klaviyo.tech/building-segments-ai-fe33d9cab822

May 13, 2024 · 16 min

Machine Learning Solution for Failed Job Auto Remediation [Netflix]

In this episode, we will talk about the importance of remediating failed workflow jobs to reduce business infrastructure costs. We delve into Netflix's approach, which involves enhancing their existing rule-based error classifier with advanced machine learning models. This allowed for auto-remediation, improving the handling of memory configuration and unclassified errors, ultimately leading to substantial cost savings. Based on their published tech blog, with the link provided here...

May 06, 2024 · 14 min

Measure Technical Debt in Software Engineering [Booking.com]

In this episode, we will talk about what technical debt in software engineering is and the risks associated with it. We will also share a set of metrics for measuring technical debt and ways to help companies quantify their progress toward better software engineering practices. Based on their published tech blog, with the link provided here for your reference: https://medium.com/booking-com-development/measuring-technical-debt-to-avoid-the-boiling-frog-syndrome-c44eb48b3ce1...

Apr 29, 2024 · 13 min

Improving Price Experimentation at Amazon [Amazon]

In this episode, we will discuss how the Science team at Amazon designs pricing experiments to improve experimentation power. We will cover concepts like the carryover effect and spillover effect, as well as the solutions the team developed to overcome those challenges. Based on their published tech blog, with the link provided here for your reference: https://www.amazon.science/blog/the-science-of-price-experiments-in-the-amazon-store...

Apr 22, 2024 · 16 min

Tackle Position Bias in Uber Eats Feed Recommendation [Uber]

In this episode, we will talk about what position bias in recommendation systems is, and how the applied scientist team at Uber tackled this challenge. Based on their published tech blog, with the link provided here for your reference: https://www.uber.com/blog/improving-uber-eats-home-feed-recommendations

Apr 15, 2024 · 14 min

Decision Making with Analytical Hierarchy Processing [New York Times]

In this episode, we will take a look at one analytical approach to decision-making called the Analytic Hierarchy Process (AHP). The New York Times team uses AHP to make better collective decisions about their technology choices for privacy. Based on their published tech blog, with the link provided here for your reference: https://open.nytimes.com/collective-decision-making-with-ahp-3ef819e5bc2a...

Apr 08, 2024 · 15 min

Leveraging Generative AI to Boost Data Analyst Productivity [Intuit]

In this episode, we delve into a study conducted by Intuit that measures the productivity impact of an internal Gen AI tool on their data analyst team. The study shows positive productivity gains, offering an exciting outlook for many in the industry. Based on their published tech blog, with the link provided here for your reference: https://medium.com/intuit-engineering/how-intuit-data-analysts-write-sql-2x-faster-with-internal-genai-tool-c3b9d482208a...

Apr 01, 2024 · 12 min

Perturbation analysis of Large Language Models (LLM) [Microsoft]

In this episode, we discuss how the data science team at Microsoft designed perturbation analysis to understand Large Language Model (LLM) performance on commonly seen tasks. Based on their published tech blog, with the link provided here for your reference: https://medium.com/data-science-at-microsoft/perturbation-analysis-and-llms-how-sensitive-are-llms-to-their-input-91a8407a971f...

Mar 25, 2024 · 14 min

Monte Carlo Simulation to Predict Tennis Game Outcomes [DraftKings]

In this episode, we discuss how DraftKings leverages Monte Carlo simulation to predict tennis game outcomes and generate probabilities to power its online gambling service. Based on their published tech blog, with the link provided here for your reference: https://medium.com/draftkings-engineering/building-a-tennis-simulation-d6afdaa97d19

Mar 18, 2024 · 11 min

Two-Tower Neural Network Architecture for Candidate Generation in Recommendation System [Expedia]

In this episode, we discuss the insights shared by Expedia's Machine Learning Engineering team on how they leverage the two-tower neural network architecture in the candidate generation stage of their recommendation system. Based on their published tech blog, with the link provided here for your reference: https://medium.com/expedia-group-tech/candidate-generation-using-a-two-tower-approach-with-expedia-group-traveler-data-ca6a0dcab83e...

Mar 11, 2024 · 10 min

Large Language Model (LLM) with Retrieval Augmented Generation (RAG) Technology for Efficient Agile Planning [Walmart]

In this episode, we discuss how the Engineering team at Walmart created a customized Large Language Model (LLM) agent with Retrieval Augmented Generation (RAG) Technology to improve their agile planning efficiency. Based on their published tech blog, with the link provided here for your reference: https://medium.com/walmartglobaltech/an-autonomous-agent-for-agile-planning-98303e194e08...

Mar 04, 2024 · 14 min

Automated Sanity Checks to Streamline Machine Learning Deployment [Intuit]

In this episode, we discuss why and how Intuit leverages automated sanity checks to streamline its machine learning development process. Based on their published tech blog, with the link provided here for your reference: https://medium.com/intuit-engineering/how-to-streamline-ml-model-deployment-automated-sanity-checks-64a23166fdc5

Feb 26, 2024 · 11 min

Demand Forecasting with Machine Learning Models [Picnic]

In this episode, we take a look at the learnings from an online grocery tech company regarding their development of machine learning models for scalable demand forecasting. Based on the tech blog from Picnic International, the link is provided here for your reference: https://blog.picnic.nl/running-demand-forecasting-machine-learning-models-at-scale-bd058c9d4aa7

Feb 19, 2024 · 12 min

Handling Online-Offline Discrepancy in Ads Ranking System [Pinterest]

In this episode, we discuss the online-offline discrepancy problem in machine learning models and explore Pinterest's journey to tackle this challenge in their ads ranking system. Based on the tech blog from Pinterest, the link is provided here for your reference: https://medium.com/pinterest-engineering/handling-online-offline-discrepancy-in-pinterest-ads-ranking-system-8fd662da4c2d

Feb 12, 2024 · 14 min

Consolidate multiple machine learning models for a better performance [Instacart]

In this episode, we explore why and how Instacart’s advertising team unified multiple browse click-through-rate prediction models by leveraging an advanced deep learning architecture. Based on the tech blog from Instacart, the link is provided here for your reference: https://tech.instacart.com/one-model-to-serve-them-all-0eb6bf60b00d

Feb 05, 2024 · 10 min

Building Recommendation System with Encoder Architecture [ZipRecruiter]

In this episode, we explore how ZipRecruiter, an online employment marketplace, leverages deep learning and Encoder modules to develop their recommendation system, which is designed to better match job seekers with job listings. Based on the tech blog from ZipRecruiter, the link is provided here for your reference: https://medium.com/ziprecruiter-tech/multimodal-learning-for-employment-marketplace-recommendation-ee67bdbede53

Jan 29, 2024 · 13 min

Leverage ChatGPT to build claim assistant functionality [Oscar Health]

In this episode, we look at how Oscar Health, a health insurance company, leverages the large language model (LLM) GPT-4 to develop its claim assistant feature. They have built a reasonable assessment system and deployed several strategies to guide GPT-4 toward further improvements. Based on the tech blog from Oscar Health, the link is provided here for your reference: https://medium.com/oscar-tech/oscar-claim-assistant-built-on-gpt-4-5bd7eb4d6129

Jan 22, 2024 · 12 min

Ads Simulation to Accelerate Advertising Optimization [UberEats]

This episode shares how the applied science team developed an ad simulation system as a more efficient and economical evaluation method for the Uber Eats Ads business. This tool has proven instrumental in guiding and expediting product development and driving growth. Based on the tech blog from Uber, the link is provided here for your reference: https://www.uber.com/blog/unleashing-the-power-of-ads-simulation/

Jan 15, 2024 · 12 min

Leveraging Generative AI for Food Recipe Creation [HelloFresh]

This episode shares how the HelloFresh data science team explores the use of generative AI to create new food recipes. Based on the tech blog shared by a Data Scientist from HelloFresh, the blog link is: https://medium.com/hellofresh-dev/recipes-and-generative-ai-6d74a107860c

Jan 08, 2024 · 13 min

[Special episode] Different types of Data Science work and why I enjoy being a Data Scientist

This episode includes a re-distribution of the interview I had with Daliana Liu on her podcast "The Data Scientist Show" (https://open.spotify.com/episode/0HzWSLIWpz6iHJiKXs0GNJ). There, I shared my thoughts about the difference in focus between machine learning engineering and product data science, some projects I worked on at Uber and LinkedIn, and other topics, such as why I transitioned from an individual contributor into a manager and the cultures I experienced in different tech compani...

Dec 25, 2023 · 1 hr 15 min

2023-12-18 Product Personalization with Learning to Rank [CARS24]

This episode discusses one company's journey to enable product personalization, moving from customer-cohort-based recommendations to a user-level personalized experience using a Learning to Rank algorithm. Based on the tech blog shared by Data Scientists from CARS24, the blog link is: https://medium.com/cars24-data-science-blog/personalized-buyer-listings-at-cars24-an-overview-83d8428bd7d9...

Dec 18, 2023 · 12 min

2023-12-11 Optimising Marketing Allocation with Marketing Mix Models [Haleon]

This episode discusses how to leverage the marketing mix model (MMM) to better understand and optimize a company's marketing strategies. Based on the tech blog shared by a Data Scientist from Haleon, the blog link is: https://medium.com/trusted-data-science-haleon/optimising-marketing-allocation-with-marketing-mix-models-382c9e471dde

Dec 11, 2023 · 11 min

2023-12-04 Generative AI solution to create engaging email subject lines [Nextdoor]

This episode discusses how the team develops a Generative AI solution to create engaging email subject lines by combining a reward model and prompt engineering over ChatGPT APIs. Based on the tech blog shared by Machine Learning Engineers from Nextdoor, the blog link is: https://engblog.nextdoor.com/let-ai-entertain-you-increasing-user-engagement-with-generative-ai-and-rejection-sampling-50a402264f56...

Dec 04, 2023 · 12 min

2023-11-27 Clustering-based customer segmentation [Microsoft]

This episode discusses what customer segmentation is and how to use the k-means clustering algorithm over Recency, Frequency, and Monetary (RFM) features to generate customer segments algorithmically. Based on the tech blog shared by a Data Scientist from Microsoft, the blog link is: https://medium.com/data-science-at-microsoft/introduction-to-clustering-based-customer-segmentation-2fac61e80100...

Nov 27, 2023 · 11 min

2023-11-20 In-video search [Netflix]

This episode discusses how contrastive learning (a self-supervised machine learning technique) is used to enable in-video search capability and to help facilitate the creation of promotional videos. Based on the tech blog shared by Machine Learning Engineers from Netflix, the blog link is: https://netflixtechblog.com/building-in-video-search-936766f0017c

Nov 20, 2023 · 12 min

2023-11-13 Geo Experimentation [Mercado Libre]

This episode discusses what geo experimentation is and how synthetic control methods can make it more efficient, along with some tips and good practices shared by the blog post author. Based on the tech blog shared by Data Scientists from Mercado Libre, the blog link is: https://medium.com/mercadolibre-tech/harnessing-the-power-of-geo-experimentation-how-mercado-libre-measures-the-effectiveness-of-its-f68b38857c4b...

Nov 13, 2023 · 13 min

2023-11-06 CUPED explained [Statsig]

This episode discusses the importance of faster experimentation velocity and how one powerful methodology, CUPED (Controlled-experiment Using Pre-Experiment Data), can be used to achieve the goal. Based on the tech blog shared by Data Scientists from Statsig, the blog link is: https://www.statsig.com/blog/cuped...

Nov 06, 2023 · 13 min

2023-10-30 geospatial search made easy [Walmart]

This episode discusses how math and data science approaches make geospatial search easy. Based on the tech blog shared by Data Scientists from Walmart, the blog link is: https://medium.com/walmartglobaltech/geospatial-search-made-easy-52c0f213ea93...

Oct 30, 2023 · 10 min

2023-10-23 personalized recipe recommendations [The New York Times]

This episode discusses how the New York Times cooking team made personalized recipe recommendations. Based on the tech blog shared by Data Scientists from The New York Times, the blog link is: https://medium.com/@timesopen/how-the-new-york-times-cooking-team-makes-personalized-recipe-recommendations-669c26aa4825

Oct 23, 2023 · 11 min