Real-Time Exactly-Once Event Processing with Apache Flink, Kafka, and Pinot //Jacob Tsafatinos // MLOps Coffee Sessions #97 - podcast episode cover

Real-Time Exactly-Once Event Processing with Apache Flink, Kafka, and Pinot //Jacob Tsafatinos // MLOps Coffee Sessions #97

May 05, 202254 minSeason 1Ep. 97
--:--
--:--
Download Metacast podcast app
Listen to this episode in Metacast mobile app
Don't just listen to podcasts. Learn from them with transcripts, summaries, and chapters for every episode. Skim, search, and bookmark insights. Learn more

Episode description

MLOps Coffee Sessions #97 with Jacob Tsafatinos, Real-Time Exactly-Once Event Processing with Apache Flink, Kafka, and Pinot, co-hosted by Mihail Eric.


Join the Community: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠https://go.mlops.community/YTJoinIn⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠

Get the newsletter: ⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠⁠https://go.mlops.community/YTNewsletter⁠⁠


// Abstract
A few years ago, Uber set out to create an ads platform for the Uber Eats app that relied heavily on three pillars: Speed, Reliability, and Accuracy. Some of the technical challenges they faced included exactly-once semantics in real-time. To accomplish this goal, they created the architecture diagram above with lots of love from Flink, Kafka, Hive, and Pinot. You can dig into the whole paper (https://go.mlops.community/k8gzZd) to see all the reasoning for their design decisions.


// Bio
Jacob Tsafatinos is a Staff Software Engineer at Elemy. He led the efforts of the Ad Events Processing system at Uber and has previously worked on a range of problems, including data ingestion for search and machine learning recommendation pipelines. In his spare time, he can be found playing lead guitar in his band Good Kid.


// MLOps Jobs board  
https://mlops.pallet.xyz/jobs
// Related Links
Uber blog
https://eng.uber.com/author/jacob-tsafatinos/
https://eng.uber.com/real-time-exactly-once-ad-event-processing/

--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Mihail on LinkedIn: https://www.linkedin.com/in/mihaileric/
Connect with Jacob on LinkedIn: https://www.linkedin.com/in/jacobtsaf/

Timestamps:
[00:00] Introduction to Jacob Tsafatinos
[00:40] Takeaways
[04:25] Jacob's band
[05:29] Lyrics about software engineers or artistic stuff
[06:20] Connection of hobby and real-time system
[08:43] How to game the Spotify Algorithm?
[10:00] Data stack for analytics
[13:28] Uber blog
[16:28] Video mess up
[17:04] Considerations and importance of the Uber System
[21:22] Challenges encountered through the Uber System journey
[26:06] Crucial to building the system
[28:13] Not exactly real-time
[30:22] Design decisions main questions
[34:23] Testament to OSS  
[36:58] Real-time processing systems for analytical use cases vs Real-time processing systems for predictive use cases
[38:46] Real-time systems necessity
[41:04] Potential that opens up new doors
[41:40] Runaway or learn it?
[46:09] Real-time use case target
[49:31] Resource constrained
[50:48] ML Oops stories
[52:45] Wrap up

For the best experience, listen in Metacast app for iOS or Android