Streaming Audio: Apache Kafka® & Real-Time Data

‌

Episodes

‌

Confluent, founded by the original creators of Apache Kafka®•developer.confluent.io

Streaming Audio features all things Apache Kafka®, Confluent, real-time data, and the cloud. We cover frequently asked questions, best practices, and use cases from the Kafka community—from Kafka connectors and distributed systems, to data mesh, data integration, modern data architectures, and data mesh built with Confluent and cloud Kafka as a service. Join our hosts as they stream through a series of interviews, stories, and use cases with guests from the data streaming industry. Apache®️, Apache Kafka, Kafka, and the Kafka logo are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. No endorsement by The Apache Software Foundation is implied by the use of these marks.

Episodes

International Podcast Day - Apache Kafka Edition | Streaming Audio Special

What’s your favorite podcast? Would you like to find some new ones? In celebration of International Podcast Day, Kris Jenkins invites 12 experts from the Apache Kafka® community to talk about their favorite podcasts. Unlike other episodes where guests educate developers and tell stories about Kafka, its surrounding technological ecosystem, or the Cloud, this special episode provides a glimpse into what these guests have learned through listening to podcasts that you might also find interesting. ...

Sep 30, 2022•1 hr 2 min•Ep 235•Transcript available on Metacast

How to Build a Reactive Event Streaming App - Coding in Motion

How do you build an event-driven application that can react to real-time data streams as they happen? Kris Jenkins (Senior Developer Advocate, Confluent) will be hosting another fun, hands-on programming workshop—Coding in Motion: Watching the River Flow, to demonstrate how you can build a reactive event streaming application with Apache Kafka®, ksqlDB using Python. As a developer advocate, Kris often speaks at conferences, and the presentation will be available on-demand through the organizer’s...

Sep 20, 2022•1 min•Ep 234•Transcript available on Metacast

Real-Time Stream Processing, Monitoring, and Analytics With Apache Kafka

Processing real-time event streams enables countless use cases big and small. With a day job designing and building highly available distributed data systems, Simon Aubury (Principal Data Engineer, Thoughtworks) believes stream-processing thinking can be applied to any stream of events. In this episode, Simon shares his Confluent Hackathon ’22 winning project—a wildlife monitoring system to observe population trends over time using a Raspberry Pi, along with Apache Kafka®, Kafka Connect, ksqlDB,...

Sep 15, 2022•34 min•Ep 233•Transcript available on Metacast

Reddit Sentiment Analysis with Apache Kafka-Based Microservices

How do you analyze Reddit sentiment with Apache Kafka® and microservices? Bringing the fresh perspective of someone who is both new to Kafka and the industry, Shufan Liu, nascent Developer Advocate at Confluent, discusses projects he has worked on during his summer internship—a Cluster Linking extension to a conceptual data pipeline project, and a microservice-based Reddit sentiment-analysis project. Shufan demonstrates that it’s possible to quickly get up to speed with the tools in the Kafka ec...

Sep 08, 2022•35 min•Ep 232•Transcript available on Metacast

Capacity Planning Your Apache Kafka Cluster

How do you plan Apache Kafka® capacity and Kafka Streams sizing for optimal performance? When Jason Bell (Principal Engineer, Dataworks and founder of Synthetica Data), begins to plan a Kafka cluster, he starts with a deep inspection of the customer's data itself—determining its volume as well as its contents: Is it JSON, straight pieces of text, or images? He then determines if Kafka is a good fit for the project overall, a decision he bases on volume, the desired architecture, as well as ...

Aug 30, 2022•1 hr 2 min•Ep 231•Transcript available on Metacast

Streaming Real-Time Sporting Analytics for World Table Tennis

Reimagining a data architecture to provide real-time data flow for sporting events can be complicated, especially for organizations with as much data as World Table Tennis (WTT). Vatsan Rama (Director of IT, ITTF Group) shares why real-time data is essential in the sporting world and how his team reengineered their data system in 18 months, moving from a solely on-premises infrastructure to a cloud-native data system that uses Confluent Cloud with Apache Kafka® as its central nervous system. Wor...

Aug 25, 2022•34 min•Ep 230•Transcript available on Metacast

Real-Time Event Distribution with Data Mesh

Inheriting software in the banking sector can be challenging. Perhaps the only thing harder is inheriting software built by a committee of banks. How do you keep it running, while improving it, refactoring it, and planning a bigger future for it? In this episode, Jean-Francois Garet (Technical Architect, Symphony) shares his experience at Symphony as he helps it evolve from an inherited, monolithic, single-tenant architecture to an event mesh for seamless event-streaming microservices. He talks ...

Aug 18, 2022•49 min•Ep 229•Transcript available on Metacast

Apache Kafka Security Best Practices

Security is a primary consideration for any system design, and Apache Kafka® is no exception. Out of the box, Kafka has relatively little security enabled. Rajini Sivaram (Principal Engineer, Confluent, and co-author of “Kafka: The Definitive Guide” ) discusses how Kafka has gone from a system that included no security to providing an extensible and flexible platform for any business to build a secure messaging system. She shares considerations, important best practices, and features Kafka provi...

Aug 11, 2022•39 min•Ep 228•Transcript available on Metacast

What Could Go Wrong with a Kafka JDBC Connector?

Java Database Connectivity (JDBC) is the Java API used to connect to a database. As one of the most popular Kafka connectors, it's important to prevent issues with your integrations. In this episode, we'll cover how a JDBC connection works, and common issues with your database connection. Why the Kafka JDBC Connector? When it comes to streaming database events into Apache Kafka®, the JDBC connector usually represents the first choice for its flexibility and the ability to support a wid...

Aug 04, 2022•41 min•Ep 227•Transcript available on Metacast

Apache Kafka Networking with Confluent Cloud

Setting up a reliable cloud networking for your Apache Kafka® infrastructure can be complex. There are many factors to consider—cost, security, scalability, and availability. With immense experience building cloud-native Kafka solutions on Confluent Cloud, Justin Lee (Principal Solutions Engineer, Enterprise Solutions Engineering, Confluent) and Dennis Wittekind (Customer Success Technical Architect, Customer Success Engineering, Confluent) talk about the different networking options on Confluen...

Jul 28, 2022•37 min•Ep 226•Transcript available on Metacast

Event-Driven Systems and Agile Operations

How do the principles of chaotic, agile operations in the military apply to software development and event-driven systems? As a former Royal Marine, Ben Ford (Founder and CEO, Commando Development) is also a software developer, with many years of experience building event streaming architectures across financial services and startups. He shares principles that the military employs in chaotic conditions as well as how these can be applied to event-streaming and agile development. According to Ben...

Jul 21, 2022•53 min•Ep 225•Transcript available on Metacast

Streaming Analytics and Real-Time Signal Processing with Apache Kafka

Imagine you can process and analyze real-time event streams for intelligence to mitigate cyber threats or keep soldiers constantly alerted to risks and precautions they should take based on events. In this episode, Jeffrey Needham (Senior Solutions Engineer, Advanced Technology Group, Confluent) shares use cases on how Apache Kafka® can be used for real-time signal processing to mitigate risk before it arises. He also explains the classic Kafka transactional processing defaults and the distincti...

Jul 14, 2022•1 hr 7 min•Ep 224•Transcript available on Metacast

Blockchain Data Integration with Apache Kafka

How is Apache Kafka® relevant to blockchain technology and cryptocurrency? Fotios Filacouris (Staff Solutions Engineer, Confluent) has been working with Kafka for close to five years, primarily designing architectural solutions for financial services, he also has expertise in the blockchain. In this episode, he joins Kris to discuss how blockchain and Kafka are complementary, and he also highlights some of the use cases he has seen emerging that use Kafka in conjunction with traditional, distrib...

Jul 07, 2022•51 min•Ep 223•Transcript available on Metacast

Automating Multi-Cloud Apache Kafka Cluster Rollouts

To ensure safe and efficient deployment of Apache Kafka® clusters across multiple cloud providers, Confluent rolled out a large scale cluster management solution. Rashmi Prabhu (Staff Software Engineer & Eng Manager, Fleet Management Platform, Confluent) and her team have been building the Fleet Management Platform for Confluent Cloud. In this episode, she delves into what Fleet Management is, and how the cluster management service streamlines Kafka operations in the cloud while providing a ...

Jun 30, 2022•48 min•Ep 222•Transcript available on Metacast

Common Apache Kafka Mistakes to Avoid

What are some of the common mistakes that you have seen with Apache Kafka® record production and consumption? Nikoleta Verbeck (Principal Solutions Architect at Professional Services, Confluent) has a role that specifically tasks her with performance tuning as well as troubleshooting Kafka installations of all kinds. Based on her field experience, she put together a comprehensive list of common issues with recommendations for building, maintaining, and improving Kafka systems that are applicable...

Jun 23, 2022•1 hr 10 min•Ep 221•Transcript available on Metacast

Tips For Writing Abstracts and Speaking at Conferences

A well-written abstract is your ticket to conferences, but how do you write an excellent synopsis that will get accepted? As an experienced conference speaker, Robin Moffatt (Principal Developer Advocate, Confluent) often writes presentations that help the developer community to understand Apache Kafka® and its ecosystem. He is also the Program Committee Chair for Kafka Summit and Current 2022 : The Next Generation of Kafka Summit. Having seen hundreds of conference submissions, Robin shares bes...

Jun 16, 2022•49 min•Ep 220•Transcript available on Metacast

How I Became a Developer Advocate

What is a developer advocate and how do you become one? In this episode, we have seasoned developer advocates, Kris Jenkins (Senior Developer Advocate, Confluent) and Danica Fine (Senior Developer Advocate, Confluent) answer the question by diving into how they got into the world of developer relations, what they enjoyed the most about their roles, and how you can become one. Developer advocacy is at the heart of a developer community—helping developers and software engineers to get the most out...

Jun 09, 2022•30 min•Ep 219•Transcript available on Metacast

Data Mesh Architecture: A Modern Distributed Data Model

Data mesh isn’t software you can download and install, so how do you build a data mesh? In this episode, Adam Bellemare (Staff Technologist, Office of the CTO, Confluent) discusses his data mesh proof of concept and how it can help you conceptualize the ways in which implementing a data mesh could benefit your organization. Adam begins by noting that while data mesh is a type of modern data architecture, it is only partially a technical issue. For instance, it encompasses the best way to enable ...

Jun 02, 2022•49 min•Ep 218•Transcript available on Metacast

Flink vs Kafka Streams/ksqlDB: Comparing Stream Processing Tools

Stream processing can be hard or easy depending on the approach you take, and the tools you choose. This sentiment is at the heart of the discussion with Matthias J. Sax (Apache Kafka® PMC member; Software Engineer, ksqlDB and Kafka Streams, Confluent) and Jeff Bean (Sr. Technical Marketing Manager, Confluent). With immense collective experience in Kafka, ksqlDB, Kafka Streams, and Apache Flink®, they delve into the types of stream processing operations and explain the different ways of solving ...

May 26, 2022•56 min•Ep 217•Transcript available on Metacast

Practical Data Pipeline: Build a Plant Monitoring System with ksqlDB

Apache Kafka® isn’t just for day jobs according to Danica Fine (Senior Developer Advocate, Confluent). It can be used to make life easier at home, too! Building out a practical Apache Kafka® data pipeline is not always complicated—it can be simple and fun. For Danica, the idea of building a Kafka-based data pipeline sprouted with the need to monitor the water level of her plants at home. In this episode, she explains the architecture of her hardware-oriented project and discusses how she integra...

May 19, 2022•34 min•Ep 216•Transcript available on Metacast

Apache Kafka 3.2 - New Features & Improvements

Apache Kafka® 3.2 delivers new KIPs in three different areas of the Kafka ecosystem: Kafka Core, Kafka Streams, and Kafka Connect. On behalf of the Kafka community, Danica Fine (Senior Developer Advocate, Confluent), shares release highlights. More than half of the KIPs in the new release concern Kafka Core. KIP-704 addresses unclean leader elections by allowing for further communication between the controller and the brokers. KIP-764 takes on the problem of a large number of client connections ...

May 17, 2022•7 min•Ep 215•Transcript available on Metacast

Scaling Apache Kafka Clusters on Confluent Cloud ft. Ajit Yagaty and Aashish Kohli

How much can Apache Kafka® scale horizontally, and how can you automatically balance, or rebalance data to ensure optimal performance? You may require the flexibility to scale or shrink your Kafka clusters based on demand. With experience engineering cluster elasticity and capacity management features for cloud-native Kafka, Ajit Yagaty (Confluent Cloud Control Plane Engineering) and Aashish Kohli (Confluent Cloud Product Management) join Kris Jenkins in this episode to explain how the architect...

May 11, 2022•49 min•Ep 214•Transcript available on Metacast

Streaming Analytics on 50M Events Per Day with Confluent Cloud at Picnic

What are useful practices for migrating a system to Apache Kafka® and Confluent Cloud, and why use Confluent to modernize your architecture? Dima Kalashnikov (Technical Lead, Picnic Technologies) is part of a small analytics platform team at Picnic, an online-only, European grocery store that processes around 45 million customer events and five million internal events daily. An underlying goal at Picnic is to try and make decisions as data-driven as possible, so Dima's team collects events ...

May 05, 2022•35 min•Ep 213•Transcript available on Metacast

Build a Data Streaming App with Apache Kafka and JS - Coding in Motion

Coding is inherently enjoyable and experimental. With the goal of bringing fun into programming, Kris Jenkins (Senior Developer Advocate, Confluent) hosts a new series of hands-on workshops—Coding in Motion, to teach you how to use Apache Kafka® and data streaming technologies for real-life use cases. In the first episode, Sound & Vision, Kris walks you through the end-to-end process of building a real-time, full-stack data streaming application from scratch using Kafka and JavaScript/TypeSc...

May 03, 2022•2 min•Ep 212•Transcript available on Metacast

Optimizing Apache Kafka's Internals with Its Co-Creator Jun Rao

You already know Apache Kafka® is a distributed event streaming system for setting your data in motion, but how does its internal architecture work? No one can explain Kafka’s internal architecture better than Jun Rao, one of its original creators and Co-Founder of Confluent. Jun has an in-depth understanding of Kafka that few others can claim—and he shares that with us in this episode, and in his new Kafka Internals course on Confluent Developer. One of Jun's goals in publishing the Kafka ...

Apr 28, 2022•49 min•Ep 211•Transcript available on Metacast

Using Event-Driven Design with Apache Kafka Streaming Applications ft. Bobby Calderwood

What is event modeling and how does it differ from standard data modeling? In this episode of Streaming Audio, Bobby Calderwood, founder of Evident Systems and creator of oNote observes that at the dawn of the computer age, due to the fact that memory and computing power were expensive, people began to move away from time-and-narrative-oriented record-keeping systems (in the manner of a ship's log or a financial ledger) to systems based on aggregation. Such data-model systems, still dominan...

Apr 21, 2022•51 min•Ep 210•Transcript available on Metacast

Monitoring Extreme-Scale Apache Kafka Using eBPF at New Relic

New Relic runs one of the larger Apache Kafka® installations in the world, ingesting circa 125 petabytes a month, or approximately three billion data points per minute. Anton Rodriguez is the architect of the system, responsible for hundreds of clusters and thousands of clients, some of them implemented in non-standard technologies. In addition to the large volume of servers, he works with many teams, which must all work together when issues arise. Monitoring New Relic's large Kafka install...

Apr 13, 2022•38 min•Ep 209•Transcript available on Metacast

Confluent Platform 7.1: New Features + Updates

Confluent Platform 7.1 expands upon its already innovative features, adding improvements in key areas that benefit data consistency, allow for increased speed and scale, and enhance resilience and reliability. Previously, the Confluent Platform 7.0 release introduced Cluster Linking, which enables you to bridge on-premises and cloud clusters, among other configurations. Maintaining data quality standards across multiple environments can be challenging though. To assist with this problem, CP 7.1 ...

Apr 12, 2022•10 min•Ep 208•Transcript available on Metacast

Scaling an Apache Kafka Based Architecture at Therapie Clinic

Scaling Apache Kafka® can be tricky, let alone scaling a team. When he was first hired, Domenico Fioravanti of Therapie Clinic was given the challenging task of assembling a sizable tech team from scratch, while simultaneously building a scalable and decoupled architecture from the ground up. In addition, he wanted to deliver value to the company from day one. One way that Domenico ultimately accomplished these goals was by focusing on managed solutions in order to avoid large investments in eng...

Apr 07, 2022•1 hr 11 min•Ep 207•Transcript available on Metacast

Bridging Frontend and Backend with GraphQL and Apache Kafka ft. Gerard Klijs

What is GraphQL? And how can you combine GraphQL with Apache Kafka® to query data in real time? With over 10 years of experience as a backend engineer, Gerard Klijs is a Confluent Community Catalyst, a contributor to several GraphQL libraries, and also a creator and maintainer of a Rust library to use Confluent Schema Registry with Java client. In this episode, he explains why you want to use Kafka with GraphQL and how they work together to bridge the gap between backend and frontend to make dat...

Mar 29, 2022•23 min•Ep 206•Transcript available on Metacast