Event Sourcing with Hannes Lowette

Jul 31, 2025 · 1 hr 4 min

Episode description

How can event sourcing help your applications? Carl and Richard speak with Hannes Lowette about his work in helping developers utilize event sourcing patterns to build scalable applications. Hannes discusses moving away from the old habit of decomposing data from objects into rows, columns, and tables, as there's no reason to save that disk space anymore. Storing objects as event streams means you can always generate relational data if needed, but things run faster and scale better in the streams.

Transcript

Speaker 1

How'd you like to listen to .NET Rocks with no ads?

Speaker 2

Easy?

Speaker 1

Become a patron for just five dollars a month. You get access to a private RSS feed where all the shows have no ads. Twenty dollars a month, we'll get you that and a special .NET Rocks patron mug. Sign up now at patreon.dotnetrocks.com. Hey, guess what, it's .NET Rocks. I'm Carl Franklin. And I'm Richard Campbell. We're here again for the nineteen hundred and sixty first time.

Speaker 2

Well you.

Speaker 1

Okay, you're here for the eighteen hundred and sixtieth time.

Speaker 2

I think something like that. Second.

Speaker 1

I'm the new guy, remember? The new guy. Yeah, we are the OG podcast, certainly when it comes to development, and one of the first five that's still in existence since podcasting hit the ground running. We've been going since August two thousand and two. A long, long, long, long time.

Speaker 2

It's been a while. Yeah, how are you, Richard? I'm good. It's summertime, and summertime it's good. Sometimes it's beautiful on the coast.

Speaker 1

It's a little hot where I am, but it's not bad.

Speaker 2

It's not bad. It could be worse. Well, and she who must be obeyed is turning sixty, so I've well, at this point we'll have thrown that party.

Speaker 1

So wow, congratulations, Stace. All right, let's roll the crazy music, because I've got something for Hannes for Better Know a Framework. Awesome.

Speaker 2

All right, man, what do you got?

Speaker 1

Since this is episode nineteen sixty one, you can go to nineteen sixty one dot dot me and that brings you to this article from Menga, which is just somebody's blog, I think. I don't even know the name of the person, it doesn't even say. All right, but anyway, it's called computerized guitars and whether they're worth buying or not. I know Hannes plays guitar and makes guitars and, yeah.

Speaker 2

Builds guitars.

Speaker 1

Yes, he's a guitar guy, right, Yeah, So this article basically says nobody likes these electronic you know, computerized guitars. Really, nobody likes them. Nobody wants one.

Speaker 3

There are some weird experiments out there, like the Gibson experiment with the robotic tuners, and in theory that's great, and nobody seems to want that on their guitar.

Speaker 2

That's just it. You know.

Speaker 1

The sales say that they're cool, but nobody wants them, like the Gibson Firebird X, or Firebird ten, which is I think what you're talking about, and the Les Paul Robot guitar was another one. But this is without question the most computerized electric guitar on the market right now. It has everything on it. The only thing it doesn't have is sales. Nice. Nobody bought it, and Gibson actually ended up crushing, like with a bulldozer, about a hundred of these things because they were a liability to have on the books, because they had to get rid of their inventory, and because nobody was buying them.

Speaker 3

Weirdly, yeah, every guitar manufacturer that does strangely innovative stuff has been having trouble selling it. Like, my favorite guitar that I own is a Parker Fly, which was way ahead of its time, and they struggled to sell them. Yeah, although they were selling guitars that were actually worth twice what they were selling them for, but nobody bought them because they were so unconventional.

Speaker 1

Yeah, and so let me just point out the other ones here: the Fender VG Stratocaster, and I actually have one of these, but it was before it was the Fender VG. It's a Fender Strat, but it also has the Roland logo on it, because you can connect it to the Roland VG system and it's got the right pickup in there and all. But as a Strat, it's just kind of bleh, right? I have a Strat Plus that I love. I mean, it's hard to keep in tune and all that stuff, like a regular Strat is, but at least it's a good sounding Strat.

Speaker 2

Isn't the question here why people play guitar? Because most people who play guitar aren't making money playing guitar. No, that's true, sadly no.

Speaker 1

They also mention the Epiphone Les Paul Ultra, which has a USB port where you can connect the guitar to, you know, your computer. The problem with that is, he says here too, machines and computers just don't mix when it comes to guitars. An electric guitar is a machine that by nature is a very physical instrument. This isn't like a synthesizer, where it just sits there and you play it. With a guitar, you sit down with it, you stand up with it, you move around playing, and so on. But most players consider these computerized guitars not real instruments because, let's face it, the sound of playing a synth through a guitar is just, no, it's just bad. So they're out there. I think if I was going to get one, I'd just get an automatic tuning one and leave it at that.

Speaker 3

Actually, that system works, it's not that, yeah, yeah, it's just good. Well, and this blog post is thirteen years old. You'd think the technology has probably moved on a little from there.

Speaker 1

Yeah, a little bit, but I mean, not that much though. Those are still the models that, you know, people like, except for the Firebird ten, which is not available anywhere.

Speaker 3

Actually, the musician that has done the weirdest stuff to some of their guitars that a lot of people might know about is Matt Bellamy from Muse. He built in, like, X-Y controllers that control synthesizers that were controlled from his guitar, and built effects into the body of the guitar. Like, that had more electronics in it than a lot of other instruments out there.

Speaker 2

Right.

Speaker 1

Well, and then you got guys like Brian May, who just made a very unique sounding guitar working with phase cancellation and phasing and all that stuff. But, you know, he just built it by himself. Yeah, and his father. Yeah, out.

Speaker 3

Of very unconventional woods in the guitar world. He used things like oak, which nobody uses.

Speaker 1

Yeah. So anyway, that's what I got today. And I thought we would have a nice conversation about guitars, and we did so.

Speaker 2

Yeah. Richard, who's talking to us today? Grabbed a comment off of show nineteen thirty nine, which is the one we did with Jeremy Miller talking about vertical slice architecture, and I know we got into event sourcing and things with him at one point or another, too. This comment comes from cash bon Fields, who said: you mentioned the bus factor, which is how many people could be hit by a bus while the project still survives. I have been recently considering the opposite: how many people should be hit by a bus for a project to be effective? The bus should most often be used for selecting managers. So I think there's a little anger in there, Cash. You want to talk to someone? Don't hurt anybody. No, I get you. You know, there's definitely that effect, this, you know, how many cooks before you spoil the soup kind of thing, right? And we were six months behind on the project, so they added a developer. Now we're nine months behind on the project. People don't understand how to fix things. Hey, I got you the book Mythical Man-Month, but I actually got you two of them so you can read it twice as fast. Boom. Yeah, Cash, thank you so much for your comment, and a copy of Music to Code By is on its way to you. And if you'd like a copy of Music to Code By, write a comment on the website at dotnetrocks.com or on the Facebooks. We publish every show there, and if you comment there and I read it on the show, we'll send you a copy of Music to Code By. Music to Code By is still going strong.

Speaker 1

Twenty two chapters if you call them or episodes or tracks, I guess I.

Speaker 2

Kind of like chapters.

Speaker 1

Twenty two tracks, yeah, whatever you want to call them. There's twenty two musical tracks and they're all twenty five minutes long, perfect for your Pomodoro technique. musictocodeby.net. All right, before we bring on Hannes, what happened in nineteen sixty one, besides John F. Kennedy's inauguration, after which he immediately invaded.

Speaker 2

Cuba. Bay of Pigs thing. Yeah, I don't know, just parallel, maybe. Kind of a hairy year, no two ways about it, and without a doubt.

Speaker 1

Yeah, the establishment of the Peace Corps, and Yuri Gagarin became the first human to travel into space. And directly into orbit, too, you know. Yeah, he orbited, as opposed to, you know, the US first effort with Alan Shepard in the suborbital flight. Yeah, yeah. So, I know, the Beatles came in, you know, around there. I said the Peace Corps. Some nuclear testing was going on in the Soviet Union, and the thirty third Academy Awards took place, with The Apartment winning Best Picture. You know, just some random trivia. But you've got the real list, Richard. What really happened in nineteen sixty one?

Speaker 2

I mean, I always go off to the invention side, just because, you know, so many cool things are made. Sixty one is when the first industrial robot is deployed into a factory. It's called Unimate. It had been developed in the fifties by George Devol, and General Motors puts it into an automotive factory to remove metal parts from a die caster. So a die caster is basically a big hydraulic ram. You put sheets of metal into it and it slams them into the shape of, like, a door panel. But that force, molding it that quickly, makes the metal really hot. So when a person handled it, even with gloves and things, it was very easy to get burned. Like, it's dangerous work. And so having the robot deal with that made a lot of sense. Made sense. Yeah. And over the next decade or so, automation would come throughout the factory process. So it begins in sixty one with the Unimate. A little corollary to that: as they were celebrating the Unimate's success, it got an appearance on Johnny Carson and poured a cup of coffee.

Speaker 1

I think I do remember that. Yeah, yeah, it was nineteen sixty one. You were not born yet. No, but I saw the rerun. That was a famous, famous clip.

Speaker 2

Could be a famous episode. Nineteen sixty one is also the first integrated circuit in production. Now, I step into this very carefully because it is a hotly contested topic. In fact, it's got a Wikipedia entry just on the invention of the integrated circuit, because there's lots of arguments here. I mean, meaning who did it first? Yeah, and what represents an integrated circuit. I mean, we'd been messing around with semiconductors for a while. We'd made a bunch of different kinds of transistors. You've got, back in fifty two-ish, radio engineer Geoffrey Dummer making semiconductor circuits. There's also Harwick Johnson, Sidney Darlington, Yasuo Tarui. But it really comes down to two people, I think most people agree. Now, it's Jack Kilby of Texas Instruments, with a thing he called the hybrid IC, so some discrete components and some integrated in a containerized package with pins, and Robert Noyce of Fairchild Semiconductor, who's also the guy who later goes on to form Intel, with his monolithic ICs.

Speaker 1

So the integrated circuit means everything is on one chip, with the pins, and the transistors are mounted inside that.

Speaker 2

That's the idea, and there's a bunch. But if you're looking at just the packaging: so I have a package with pins on it, and it has a bunch of stuff in it, and it does a particular task. That's pretty integrated. But then you get into what's actually inside that package, and, you know, Noyce's design, what they call the monolithic IC, is actually layers of substrate silicon crystal that are doped into different potentials so that they can be etched together to make transistors and resistors and other components. I get it. Yeah, Kilby's approach was more discrete.

Speaker 1

So ICs continued to evolve and get more powerful. But you're calling that one the first because it was the first contained integrated.

Speaker 2

Well, I called it the first, really, because nineteen sixty one is when it was first made into a product, at least briefly. And, you know, the nexus of this is in two thousand, Jack Kilby was given the Nobel Prize in Physics for his, quote unquote, contributions to the integrated circuit, and it created a huge fuss around all that, because Kilby's design, the hybrid IC, doesn't last. Ultimately, Noyce's design is what makes integrated circuits to this day, although, admittedly, the technology has evolved a lot. But in nineteen sixty one, Texas Instruments started manufacturing a product called the Multivibrator five oh two. Inside of that package, some of it silicon substrate, some discrete, were two transistors, four diodes, six resistors, and two capacitors, to be used as a waveform generator, which was part of a miniaturized radio receiver for the US Air Force. Wow. So they made a few hundred of these and built the test articles, and it ultimately failed. Like, in the end, it wasn't sturdy enough, but it set the big path for integrated circuits and that kind of miniaturization, at that time called molecular electronics, because they were so small, even though they were not molecule size, in aerospace, and the Air Force would lead on this for quite a while. It leads to the ICBM and a bunch of other technology. But the first you can kind of hang your hat on: Texas Instruments produced this first thing they called the integrated circuit, even though it was a design that ultimately fell into obsolescence, and the monolithic IC would go on from there. Wow. Big one, though. Like I said, it's not simple. Yeah, it is.

Speaker 3

Something simpler that we may all remember: nineteen sixty one, Philips introduced the audio cassette, the Compact Cassette that we would all remember from our Walkmans. Wow. Throughout the eighties.

Speaker 2

The Walkman is still twenty years away at that point. But yeah, I know, I know. But wow. You know, I pulled up a graphic the other day on the Nakamichi auto cassette player, because it was the one that would flip the tape. I have one.

Speaker 1

It's right there. Actually, yeah, I have one. It doesn't work it, Yeah, flips the tape.

Speaker 2

Over, pops the thing would pop out, turn the tape over, and go back because they didn't want to try and adjust the head alignments to play both both sides of the tape. You know.

Speaker 1

Yeah, classics. Okay, all right, cool stuff. So I guess we can introduce Hannes officially. You've heard him talk there. But Hannes Lowette started his .NET career during college with Framework one point one. He continued this trend in his first jobs and has been using .NET ever since to deliver projects for many different clients. He has always had a passion for back end development. In his projects, he's always gravitated to problems that involve scaling, architecture, and complexity. Because of that, he developed a love for event driven architectures. That fits with what we're going to talk about. Surprise, surprise. Yeah, yeah. Since twenty eleven, he's been working for Axxes in Belgium.

Speaker 2

Is that how you say it?

Speaker 1

Yes, Axxes, currently as principal consultant. Aside from his technical role, Hannes has always been driven by helping other developers grow. He actively coaches people, runs workshops, gives conference talks, and is a course author on Dometrain. In his free time, you can find him on stage playing guitar for the Linebreakers. The Linebreakers, I guess that's Dylan's band.

Speaker 2

Right.

Speaker 1

When he's not doing things with his three children, he builds guitars, plays chess (badly, in parentheses, I didn't write this), and loves tasting whiskey, and who doesn't. To keep moving, he loves to mountain bike or swim. All right, Hannes. All of those things are true. Wow, yeah. And the floor is yours. I'm sorry, I'm pretty sure you wrote this.

Speaker 3

I'm just, yeah, I kind of sent this to Carl in advance. But hey, thanks for having me. Thanks for having me on the show. I'm embarrassed, almost, that we haven't had you before at this point. Well, this is your first appearance on .NET Rocks, even though we've hung out at conferences and seen each other talk. Yeah, many times. So welcome. Thank you, thanks for having me.

Speaker 1

The floor is yours? You're going to talk about event sourcing. I guess maybe other things, maybe other shoes. Yeah, let's see where the conversation goes. Should we start with the elevator pitch? What is event sourcing? If you've been hiding under a rock for the last fifteen.

Speaker 3

Years. Okay, elevator pitch. The default when we develop software systems still seems to be to use normalized databases. And basically, when you store state in a normalized database, what you're doing is storing the state of the system right now, and you lose all of the history that came before it. With event sourcing, what we actually try to do is the opposite. We store all of the events that led up to the current state, which means that if we replay the events, we can recalculate the state. If we only have the state, we cannot recalculate the history, because that's gone. Basically, if you switch your philosophy from storing state to storing events, you enable a whole lot of other things in your architectures as well. But the essence of event sourcing is basically that you're going to store events instead of the projected state.
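As a rough illustration of "replay the events to recalculate the state", here is a minimal Python sketch. The bank-account domain and the names `Deposited`, `Withdrawn`, `apply`, and `replay` are assumptions made up for this example; nothing like them is named in the episode.

```python
from dataclasses import dataclass
from functools import reduce


# Hypothetical events for an illustrative bank account.
@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


def apply(balance: int, event) -> int:
    """Fold a single event into the current state (here, a balance)."""
    if isinstance(event, Deposited):
        return balance + event.amount
    if isinstance(event, Withdrawn):
        return balance - event.amount
    raise ValueError(f"unknown event: {event!r}")


def replay(events) -> int:
    """Recalculate the state by replaying every event from an empty start."""
    return reduce(apply, events, 0)


# The stored history is the source of truth; the balance is derived from it.
history = [Deposited(100), Withdrawn(30), Deposited(5)]
balance = replay(history)
```

The reverse direction is not possible: many different histories produce the same balance, which is the point being made about storing only the current state.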

Speaker 1

You're never going to update. You're going to add a new record with the changes. Yes. Some people have done this by adding a couple of fields to every table. You know, update user and update date, insert date, insert user.

Speaker 3

Like auditing fields and stuff like.

Speaker 1

Yeah, like auditing fields, and that's only good for the last change that was made.

Speaker 2

Yeah.

Speaker 3

The whole idea for me that makes this interesting is if we look at where database normalization comes from. A lot of that was conceived when the first SQL standards were developed in the nineteen eighties. And what's relevant for that time is that your average hard drive cost a lot more than the computer that it was attached to, right?

So normalizing had a couple of benefits. First of all, it made your data consistent and you try to avoid any update conflicts or any data manipulation concurrency issues that you might have. You would also store every bit of information only once, which conserved.

Speaker 2

Disk space back back when that was important.

Speaker 3

Right, back when that was important. But nowadays, on your cloud bill, disk space is not typically the thing that costs the most. CPU cycles and memory usage and bandwidth, that's all going to be way more expensive than your disk space.

Speaker 2

Yeah.

Speaker 3

But what we have, I think, in our industry, maybe ingrained in our brains a little bit, is all the tradeoffs that we've made to normalize our databases. It comes with a whole bunch of tradeoffs, but we have just learned to work around all of those tradeoffs.

Speaker 1

Sure, So what are those trade offs with normalized relational database that we might have forgotten about?

Speaker 3

Well, the first thing that comes to mind is you are sacrificing CPU cycles to be able to store your data in a normalized shape, because whenever you have to fetch a meaningful amount of data to do work on, you're often hitting multiple tables, which means that you're doing joins, and joins are expensive. Yeah. And the problem with doing joins in a typical relational database is you also have no way of easily scaling that compute, because it's going to happen in your database engine, and you cannot just put a cluster of nodes in front of it that are going to do the joins for you. So that's the first thing. You're sacrificing a bunch of compute to make that happen, and you introduce locking problems, because whenever you touch a record, you have to make sure that there's no race conditions and no threading problems, so.

Speaker 1

You have to use transactions in case there are problems, and then you roll it back.

Speaker 3

Yeah, so either you're going to deal with that with optimistic concurrency or with pessimistic concurrency, but at some point you are going to deal with some kind of concurrency issues that you're going to have to solve in your database as well, because you're only storing that one bit of state, and everybody that touches that is going to have to touch the same record and update it or delete it or whatever.

Speaker 2

That we want to do with it.

Speaker 3

And we've taken all of that for granted, because if we look at even the modern versions of .NET, the first stack that they brought to Core included Entity Framework, and I'm a huge Entity Framework fan. But part of why we have ORMs is because we try to solve some of the issues that you have with this way of thinking, because almost never in a complex application are you touching just a single table. You usually need a whole graph of different related tables, a parent entity with some related entities that you're all fetching and then updating and then writing back and all that sort of stuff. And that's why you have ORMs, to make that a lot easier to do in code. But all of that is still working around the idea that we're only storing the truth at this very moment in our relational database.

Speaker 2

And part of storing the truth is actually creating data structures that keep the truth. You know, the classic one is yes, only one address per customer, and then they move, Yeah.

Speaker 3

Like you lose the history, Like what was the state of this entity last week, last year, or whatever. We've actually lost that. And that's where the whole idea of event sourcing comes from, is that if instead of storing the resulting state, we instead store all the business events that lead up to this state becoming what it is. We can always recreate the state because the state is just very cheaply replaying the events.

Speaker 2

Right.

Speaker 3

And in event sourcing, we assume that every event that we have written to our event stream was true at the time that it was created, so we don't have to doubt it, right? This was the truth, which means that your replay mechanism is actually going to be relatively cheap, because what you usually have is your root entity with all of its dependencies, and you're replaying all the events on that, and you don't have to doubt the event. So a lot of these methods are just going to be setting a couple of fields, a couple of properties that you're going to fill out or modify or whatever, and you can replay thousands of events in just a couple of milliseconds. But the cool thing about that is that replaying is not happening in your database engine, right? It's happening somewhere outside of it, in your .NET code, which is easier to distribute.

Speaker 1

So the light bulb just went off, because I was asking myself, when you're talking about the problems that it solves, this whole, you know, multiple joins and everything. Well, now you're really kind of talking about a flat document, right, instead of having all these joins. So is event sourcing something that you can use alongside a relational database, like your events are separate from the relational database? Or do you remove the hierarchy of a relational database when you're using event sourcing?

Speaker 3

Well, typically what an event stream looks like is an append only store. You're going to append new events to the store, and all of these events usually happen in the scope of a certain stream ID or a certain aggregate ID or whatever you want to call it, which typically, like with orders, kind of points to the root entity that this event happened for, right? So if you have a certain order number, like customer Richard is placing an order on my website to order a new guitar, and he's going through the whole process of speccing said.

Speaker 2

Guitar, it'll be a tuba though for Richard, totally.

Speaker 3

Totally hypothetical, and we could basically model Richard's configuration process of the guitar in multiple events. Now he selected the color, and he did this and he did that and whatever, and every event is going to point to Richard's order, which means that if you, Carl, at the same time are configuring a guitar as well, those two aggregates will have nothing that could potentially have a side effect on one another, because those will be two separate aggregates, or two separate streams, if you will. And when we append everything to the store, we have a couple of things that play into the key. One is the aggregate that the event happened for, but also the sequence number. And the sequence number is what's going to help us with concurrency, because at some point in an event sourced system you're going to want to scale out. You're going to want to have a way to deal with multiple commands coming in on the same aggregate, because they're all going to spawn events on the same aggregate, and you don't want them to be written to the database without knowing about the other events that might have happened in the meantime. So the stream ID, or the aggregate ID, and the sequence number, that's going to be the important bit of storing new events. And the event's payload is just an object, something that we serialize. It could be as deep as we want, it could be a whole object graph. It's just a single object that we have to be able to deserialize again. And that's basically the essence of what you need for the simplest event stream you could build.
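A minimal in-memory sketch of that append only store, keyed by stream ID plus sequence number, might look like the following. The class name `InMemoryEventStore`, the `expected_version` parameter, and the event payloads are all assumptions for illustration; a real event store would persist the data and enforce the check atomically.

```python
class ConcurrencyError(Exception):
    """Raised when a writer's view of a stream is out of date."""


class InMemoryEventStore:
    def __init__(self):
        # stream_id -> list of payloads; list position + 1 is the sequence number.
        self._streams = {}

    def load(self, stream_id):
        """Return a copy of every event recorded for one stream, in order."""
        return list(self._streams.get(stream_id, []))

    def append(self, stream_id, expected_version, payloads):
        """Append events only if nobody else appended since the caller loaded.

        `expected_version` is the number of events the caller saw when it
        loaded the stream. A mismatch means another command wrote first.
        """
        stream = self._streams.setdefault(stream_id, [])
        if len(stream) != expected_version:
            raise ConcurrencyError(
                f"{stream_id}: expected version {expected_version}, "
                f"found {len(stream)}"
            )
        stream.extend(payloads)
        return len(stream)  # the new version of the stream
```

Two orders being configured at the same time live in separate streams and never conflict, while two writers racing on the same order are caught by the version check, which is the optimistic concurrency role the sequence number plays.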

Speaker 1

So let's say the first person to log on after midnight. Right, Let's say you're going to get all the data from your relational database, et cetera and make an event that's like, hey, this is the first event today.

Speaker 2

Is that?

Speaker 1

Is that the idea, and then all the modifications that happen to it just work with that those event streams and when you when you're doing updates and everything.

Speaker 3

No, the idea is that if you need that root entity, instead of fetching that from the database, you are actually going to fetch all of the events that have happened on that entity, and then you're going to replay them in code. Okay. And that will bring you up to speed to the point where we are. But where do we start? I mean, we start with a database, right, that has all the data in it. Yeah, well, you know, your database could just be a single table with all of your events in it.

Speaker 2

Oh all right, so this is what I was asking before.

Speaker 1

You get rid of your relational database, but you still have your objects in memory, right? You define your objects in memory that you're going to turn into JSON, yes, and store in your event database.
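The "single table with all of your events in it, payload stored as JSON" idea can be sketched with Python's built-in `sqlite3` and `json` modules. The table layout, column names, and helper functions here are assumptions chosen for the example, not a schema described in the episode.

```python
import json
import sqlite3

# One table holds every event; (stream_id, sequence) is the key.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE events (
        stream_id  TEXT    NOT NULL,
        sequence   INTEGER NOT NULL,
        event_type TEXT    NOT NULL,
        payload    TEXT    NOT NULL,  -- the serialized event object
        PRIMARY KEY (stream_id, sequence)
    )
    """
)


def append_event(stream_id, sequence, event_type, data):
    """Serialize the event to JSON and append it as a new row."""
    with conn:  # commits on success, rolls back if the insert fails
        conn.execute(
            "INSERT INTO events VALUES (?, ?, ?, ?)",
            (stream_id, sequence, event_type, json.dumps(data)),
        )


def load_events(stream_id):
    """Fetch and deserialize one stream's events in sequence order."""
    rows = conn.execute(
        "SELECT event_type, payload FROM events "
        "WHERE stream_id = ? ORDER BY sequence",
        (stream_id,),
    )
    return [(event_type, json.loads(payload)) for event_type, payload in rows]
```

Because the primary key covers the stream ID and sequence number together, a second writer trying to reuse a sequence number on the same stream is rejected by the database with an integrity error.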

Speaker 3

The life cycle of how, let's say, an API call comes in, I think that's the easiest way to reason about it. An API call comes in, typically a command, something you want your system to do, and you're going to want to execute that. Usually your commands are going to touch a certain root entity. If I'm going to update the address for Carl, the root entity is going to be Carl, right? So I'm going to have to fetch your details, and then your current state is going to dictate what the result of the command is going to be, because what your current address is might affect what happens with this address update. So in order to get your current state, I'm going to fetch all of the events that have happened on entity Carl, replay them, get your current state, then run my command, and then append the new events to the stream.
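The load, replay, decide, append cycle described here can be sketched in a few lines of Python. Everything named below (`Customer`, `AddressChanged`, `change_address`, the in-memory `streams` dictionary, and the "ignore an unchanged address" rule) is an assumption for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AddressChanged:
    address: str


@dataclass
class Customer:
    address: Optional[str] = None

    def apply(self, event):
        # Replaying never validates: the event was true when it was written.
        if isinstance(event, AddressChanged):
            self.address = event.address


streams = {}  # stand-in event store: stream_id -> list of events


def load_customer(stream_id):
    """Steps 1 and 2: fetch the entity's events and replay them."""
    customer = Customer()
    for event in streams.get(stream_id, []):
        customer.apply(event)
    return customer


def change_address(stream_id, new_address):
    """Handle a command: the current state dictates which events result."""
    customer = load_customer(stream_id)
    if customer.address == new_address:  # step 3: decide using current state
        return []                        # nothing changed, so no new event
    new_events = [AddressChanged(new_address)]
    streams.setdefault(stream_id, []).extend(new_events)  # step 4: append
    return new_events
```

A production handler would also pass the version it loaded to the store when appending, so that a concurrent command on the same entity is detected rather than silently interleaved.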

Speaker 2

Again. Yep, I get that. I get that.

Speaker 1

I guess what I didn't get was that we're completely replacing our relational database well in this with a document centric database.

Speaker 3

For the event sourced stuff, yes. And what does happen a lot is that you will still have relational tables with some reference data and so on. There's no point in making every single entity in your entire system event sourced. I mean, you don't want to hurt yourself too much. And so if you have a table that just has a list of countries or whatever, you could make that event sourced if that is part of your core domain. Usually that kind of stuff is reference data

and you can just keep that in relational tables. But when it comes to the entities where you're doing you're running your actual domain logic, you're going to basically want to get your truth from the events. That's the whole idea.

Speaker 1

Yeah, that's great, and this is a great place to pause for a break, so we'll be right back after these very important messages. Don't go away. Did you know you can easily migrate ASP.NET web apps to Windows containers on AWS? Use the App2Container tool to containerize your IIS websites and deploy to AWS managed container services, with or without Kubernetes. Find out more about App2Container at aws.amazon.com/dotnet/modernize. And we're back. It's .NET Rocks. I'm Carl Franklin. And I'm Richard Campbell. And we're talking to Hannes Lowette about event sourcing.

Speaker 2

You know, I'm reminded that relational databases were developed for reporting, and it got sold on transactional velocity because that was easier to sell. But there is a case to be made for, after an event is logged, if you think it's reportable or it, you know, falls in a certain class of event, then you do decompose it asynchronously, not making the customer wait, into relational data tables for reporting purposes, because.

Speaker 3

That's what we call projections. Yeah, and so what you basically do, if you think about your CQRS architecture: let's say that you have a command side of your system, where all of the business logic is triggered when other systems or users queue commands, and you have your read side of the system, where you're querying data. It makes sense to prepare some data for querying, because if you regularly need to pull a list of all of

your customers with their current addresses, that's going to be very expensive to run if you have to project it from the source events, because what you're basically going to have to do is get all of your users and project them all one by one until you have the list of addresses. That's way too expensive. So basically, whenever we save an address-changed event, that's going to update that table. So we have little bits of business logic that are going to be

running and listening to certain events and then updating projected tables. Now, the big advantage of doing that is that it is also relatively inexpensive. You can look at the contents of your event and you know: okay, it's this entity, this is talking about Carl, and Carl has a new address. So all I have to do, without really thinking, because the validity of the event was already checked when we executed the command, so I don't

have to recheck that, is just take that and then update the view or the table that we are going to query on the query side of our system, and we can keep that in sync. And the cool thing about these projections is that, if we want, we can rebuild them. So whenever the logic to go from source events to whatever projected data we want to have changes, we just throw away the table and repopulate it.

We replay the stream from the beginning of time and, if you have a system with millions and millions of events, some of these projections might take a while to rebuild, so there are techniques to do that asynchronously before you replace the table, that sort of stuff. But you can basically get an updated view of your query data by just rerunning the events and replaying them, because you still have everything that has happened in the system ever since.
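
The projection idea described in this turn can be sketched in a few lines of Python. This is a minimal illustration, not how any particular framework does it: the `AddressChanged` event, the `CustomerAddressProjection` class and the in-memory dictionary standing in for a read-side table are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressChanged:
    """Hypothetical event: a customer moved to a new address."""
    customer_id: str
    new_address: str


class CustomerAddressProjection:
    """Keeps a read-side 'table' of customer_id -> current address."""

    def __init__(self):
        self.table = {}

    def apply(self, event):
        # No validation here: the command side already accepted the event,
        # so the projection only copies the new value into the table.
        if isinstance(event, AddressChanged):
            self.table[event.customer_id] = event.new_address

    def rebuild(self, all_events):
        # Throw the table away and replay the stream from the beginning.
        self.table = {}
        for event in all_events:
            self.apply(event)


events = [
    AddressChanged("carl", "1 Main Street"),
    AddressChanged("richard", "2 Coast Road"),
    AddressChanged("carl", "3 Harbor Lane"),
]

projection = CustomerAddressProjection()
for event in events:
    projection.apply(event)  # incremental update as each event is saved

current_addresses = dict(projection.table)
projection.rebuild(events)   # full rebuild gives the same result
```

Querying the list of current addresses then reads `projection.table` directly instead of replaying every customer's stream.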

Speaker 2

Yeah, so it means you could always regenerate any aggregate or resulting data set, or at the very least validate whether it's correct or not. But my concern is, as your number of events starts to build up, finding out how many orders we had last month can get costly.

Speaker 3

Well, not if you have a projection that updates that view on the fly, right? If your projection is caught up and you're not changing the way that projection works, in theory you should not have to rebuild it.

Speaker 2

So it should always be correct.

Speaker 3

It should always be correct. So you can flip this around. You could have a projection for every query endpoint on your query API that hits a single table on an index, and you can have that table populated from your event stream with a dedicated projection just for that table.

Speaker 2

But I appreciate that we're both saying "should", because things do go wrong, but they can also be fixed.

Speaker 3

Of course it can be fixed, and that's the big advantage. Because you have the whole history, you can rebuild anything. And in theory, one of the examples that's often used is: okay, you can

generate new insights from past events. And in theory that's true: if you have saved all of the events, you can basically generate new insights, new features even, based on events that have happened in the past, and you can repopulate the view by just letting that projection run until it's caught up. In practice, often when you're extending the feature set of an application, you're also introducing new events, and you will not have taken those into account in the past,

so that claim has limited validity in practice. But it's a cool selling point, right?

Speaker 1

Yeah. So you've been talking about some of the problems that event sourcing solves. Are there any new challenges with event sourcing that come into play?

Speaker 3

Oh, definitely. Some of the challenges are ones that, weirdly, you don't have with a normalized database, because if you normalize your database properly, every bit of information is linked to the primary key of the table that it's in, and it's only stored in one place. So if you do it right, you should end up with a pretty deterministic view of what your data looks like, and the way that you query it and update it or whatever.

Speaker 2

Like.

Speaker 3

All of that is coding that happens afterward, so you don't have to think about that too much. As soon as you start applying event sourcing, partitioning your system based on the root entity that the commands are going to happen in, basically determining the boundaries of your aggregates, becomes very important. But it's a problem that you also

see when you use a document database. For instance, if you switched over to something like MongoDB or RavenDB or whatever, you're going to see the same problem, because you have to really think about which bits of my data belong together and are a sensible scope to process commands. Because as soon as you have to start working with multiple documents and multiple streams at the same time, that's when things get tricky, right? But

Speaker 1

Generally you know what those things are, right? I mean, you generally have to find the tip of the iceberg, and then you know all the related data goes in there as well. So isn't that something that you would probably know if you're familiar with the data shape?

Speaker 3

If you're familiar with the data shape, yes. But for instance, I feel like most developers that come from the CRUD, normalized database world don't think that way, and they've never actually wondered: which are my root entities?

Speaker 2

Where does the stuff happen?

Speaker 3

And as soon as you start mapping out business processes and you start thinking about what is happening here in my business: When does a user or an external system queue a command? What is happening with that command? Which entity does that command hit? That's when all of that becomes very clear. So if you practice things like domain-driven design and you start using things like event storming

to map out your business processes with Post-its on a wall, you end up with not only a very clear view of how you're going to partition that system, but also with Post-its that you can basically convert one-to-one into classes in code, because those will become your commands and your events and all of that. And of course there's work to do, but at

Speaker 2

Least the the.

Speaker 3

Amount of information that can get lost when you try to translate that from that whiteboard full of Post-its into code is a lot lower than it would be if you're using a relational database. It also will have way fewer side effects.

Speaker 1

So if you create what you think is your hierarchy and your shape, and you get your root entity down, and you start saving some events and all that stuff, and then you realize, oh, I need this entity in here as well, does that change anything? Do those past events become invalid, or is it just that in the new events you've added this entity?

Speaker 3

Not really. There are definitely scenarios where you are going to touch multiple streams as a result of a single command. Think about your shopping cart, for instance: at some point you're also going to touch the stock system, so that might be another stream that lives in the same application, or it might be an external system that needs to get called, right? So

Speaker 1

You're not trying to make one stream that rules them all. This is what you're talking about with boundaries: where do you separate those things? Exactly.

Speaker 3

And you have to figure out where the boundaries are and cross them with a very conscious state of mind. I have to worry about where I am leaving the things that I can do on a single stream, and there has to be a very good reason for that. If all of your commands are doing that, then you modeled something really wrong.

Speaker 2

Yeah, I think single stream for everything is bad too, but I'm sure you end up with arguments about too many streams.

Speaker 3

So, there's a couple of patterns that you can use. For instance, the saga pattern is very

well known. If you have a command that comes into your system that's going to touch multiple streams, or even multiple systems if you have to make external calls, what you typically have is a quite long-running transaction that is going to be happening as soon as that command comes in, right? So you're going to save some intermediary states and let the whole thing play out, because with external systems it might be an outgoing call and then a callback coming in and whatever, and you're going to have

to be able to pick up the thread. That's a perfect way to model that, and it's not that dissimilar from how you model more complex transactions across multiple streams inside the system. Now, if you use frameworks, because a lot of it depends on what is backing your event sourcing framework. Like, you can

roll your own. A lot of it is not that complex, and you'd probably need about a day to roll something that can do a lot of the things. If you want to have something battle-tested and let somebody else think about the hard work: you've had Jeremy on the show and talked about Marten. His Marten framework is, well, is it a framework or a document database? It's an event store, right? Right. What Marten

calls it a web framework. Yeah, well, that's if you involve Wolverine and all the other things of the Critter Stack. But Marten is basically... I don't know if that's my story to tell. I'm sure Jeremy told it on the show.

Speaker 2

As well, Like he's been on a few times.

Speaker 3

They ran into trouble with RavenDB, if I remember correctly, and they needed something capable of handling the load, and they looked at Postgres. It's like: cool, Postgres can do a lot of stuff that we need when it comes to the document stuff, but the projection stuff that's in Raven, not so much. And that's why they ended up building a lot of what Marten

eventually became. And because that had documents in there and had projections in there, it became a very nice candidate for building an event-sourced system as well, because if you have a stream with all your documents that have all of your events that have happened in the past, and you can run projections on that, then you're already like eighty percent of the way there to have an event-sourced system where you have your event stream and

you have your projections that project into other documents or other tables or whatever. Which is why Marten now has very explicit support for event sourcing, and you get an out-of-the-box event stream that you can append to, and it supports all the things that you will run into at first when you start growing one of these systems. Because one of the typical problems is you have a certain aggregate that becomes very long-lived and it starts accumulating not thousands, but millions of

events, and that becomes very expensive to replay. So at some point you're going to want to start snapshotting that. Right: we're going to take a snapshot at event one million two hundred and sixty-four so that we only have to replay the last fifty, right? And all of that functionality is already built in. Somebody has thought about those things way harder than most of us ever will, and I think that's a very valid argument to start using one of those ready-made frameworks to do that

kind of stuff. But you do have to understand what goes on behind the scenes, because if you register a projection in there, that's going to spin off a background worker that's actually going to be querying your database and running code and writing back to the database. So all of that happens magically, but it's not

Speaker 2

Free, right. Yeah. I was just going back and looking at the Jeremy Miller shows, and realized the entire thing that you've just told, we've done as a series of shows with him about Marten on Postgres, then playing with event sourcing, and then Wolverine. And it's literally working through the problems you're describing: this works up to this point, and then you need to go over here, and like it's just, it's...
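
The snapshotting idea mentioned a moment earlier in this exchange can be sketched as follows. This is a toy Python illustration under stated assumptions: the `Snapshot` class, the `load` function and the use of plain integers as events are invented for the example and do not reflect Marten's actual API.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    version: int  # how many events are already folded into `state`
    state: int    # aggregate state at that version (a stock level here)


def apply(state, event):
    # In this toy example each event is simply a stock delta.
    return state + event


def load(events, snapshot=None):
    """Rebuild state, starting from the snapshot when one exists.

    Returns (state, number_of_events_replayed).
    """
    if snapshot is None:
        state, start = 0, 0
    else:
        state, start = snapshot.state, snapshot.version
    replayed = 0
    for event in events[start:]:
        state = apply(state, event)
        replayed += 1
    return state, replayed


events = [1] * 1000                       # a long-lived stream
full_state, full_replayed = load(events)  # replays everything

# Take a snapshot at version 950, then only the tail needs replaying.
snapshot = Snapshot(version=950, state=load(events[:950])[0])
snap_state, snap_replayed = load(events, snapshot)
```

Both paths arrive at the same state; the snapshot only changes how many events have to be replayed to get there.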

Speaker 3

Yeah, Wolverine has very nice saga support, right? So if you pair that with Marten, that works really well.

Speaker 2

Now we're back in the Critter Stack. Exactly. There's a reason that we have all these different critters.

Speaker 3

Yeah. But I think at this moment in the .NET space, Marten is probably one of the best ways to get started with event sourcing, because you get a lot of stuff out of the box, and the complexity of the setup for you, as the person developing the system, is relatively low, because what you need is a .NET process that connects to a Postgres database, right? And if you have multiple nodes doing that, they will actually communicate through

Postgres locks, so you don't need anything out of band for your nodes to make sure that they're not interfering with one another.

Speaker 2

What makes Postgres special in this situation? Like, I already have SQL Server, why wouldn't I use it? JSONB. Binary JSON. Binary JSON.

Speaker 1

Okay. No, well, SQL Server has a JSON field type now. Yeah, but it's more expensive.

Speaker 3

Yeah, but for a long time they didn't. And Postgres is still a lot more powerful than what SQL Server is trying.

Speaker 2

I also like the licensing on Postgres, which is to say free. Yeah, but.

Speaker 3

Postgres is ridiculously powerful and it does a lot of things very well. And JSONB is basically a way that they figured out to store JSON documents in a binary format in a way that they could still index on fields that are pretty deep in the document's tree, which is cool. An added benefit is that this works in an ACID transactional database. In Mongo, for instance, you have

transactions on document level, but not across multiple documents. And with Postgres, that's a thing that's there by default, because by default it's an ACID-compliant relational database. And because JSONB is so powerful, in a lot of scenarios it will outperform a lot of other document databases while basically being a relational database. And I get the irony: now we're going to use a relational database to store our event stream and our documents.

Speaker 1

But you know, but at the same time you get the relational stuff you need that too, So.

Speaker 3

Yes, and that's one of the cool things. If you want to project your events into actual SQL tables, you can do that in the same transaction. If you want, you can have some of your projections transactionally consistent, meaning that they will project when you're saving the event. That's the hard stuff: figuring all of that out in a framework that you're going to roll yourself is going

to cost so much time. And that's when Marten, for instance, shines, because it has

Speaker 2

All that built in.

Speaker 3

But JSONB is a lot of the reason why that works with Postgres.

Speaker 1

Do you think agentic coding with LLMs is going to help or hurt the efforts of event sourcing?

Speaker 3

I think, but of course I'm comfortable in this, I might be prejudiced, because a lot of the patterns that come from using event sourcing are relatively side-effect free. Your projection is going to touch the table that it's updating, but it's not going to touch anything else. It reads from the stream and writes to a single table. And the same with your commands: you're fetching a bunch of events and then you're appending to the stream.

You're not updating records, you're not locking records. A lot of that is actually easy to performance-tune down the line because that code is relatively side-effect free. The patterns that are used in event sourcing would actually be very, very adequate for generating code, because if you're trying to explain to an agent, hey, this is what my business process looks like, these are the commands,

and then this happens... Basically, it's the way that we think about the world, right? If we figure out our drink order at the bar, it's a conversation, it goes back and forth. The way our world works is the way that software systems work, and especially event-sourced systems. And I think that because there is less of a translation happening between the business problem and the actual code, I think this will actually lend itself very well to

getting generated systems from agents or whatever. Of course, we don't have to train them on codebases that deal with normalized databases, but that's not a hard problem to solve, right, if you're training LLMs.

Speaker 2

I haven't done any of this testing yet, but it makes a lot of sense to see: could we point an LLM at an existing CQRS implementation and say, is this a well-implemented event sourcing application? You know, because the first thing I want to do is be able to test: is it already doing this right, before I even ask you to generate any code? Sure. Yeah, I've not tried, no, but this is an interesting idea, to say, could these tools be really effective at this pattern-based

development approach? Yeah?

Speaker 3

CQRS/ES: part of the reason that this is gaining a lot of traction in the DDD world is the lack of translation from the business concepts to the actual classes that you're going to see for your events.

Speaker 2

I like the attitude, and I think I've said this before on the show, of just store the truth. Order comes in, store the order. What you do with it after that is secondary to the point. But the fact that you have a record of what happened, as complete as it was at the moment, I think is really powerful. And the only price there is disk space, and disk space is trivial.

Speaker 3

It's cheap, especially if it's structured data like JSON documents.

Speaker 1

Yeah, it's very easy, and I imagine the JSONB data takes up way less disk space than, oh yeah, ASCII JSON. ASCII JSON.

Speaker 3

But even ASCII JSON isn't that bad.

Speaker 2

It's not, yeah, and disk space is not the issue. I mean, parsing all that after the fact, you can discuss from there. But first and foremost, it's just a good idea to keep a record of the truth.

Speaker 3

You know, don't be mistaken: you're not getting auditing information out of the box, because that's a common misconception about event sourcing. If you want to have your full audit trail of what happened, for legal reasons or whatever, you're going to need to enrich that data a little bit, because who is important. Who made the command that came in? What was the command that came in? What events resulted from that? What changes did it

make to the entity? The last part is what you're getting for free with your event stream, but you also need to log your commands and add some metadata, like who did this and when did they do it.
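
One way to picture the enrichment described here is to wrap each event in an envelope that also records who issued which command, and when. The Python below is only a sketch under that assumption; `Envelope`, `record` and the field names are hypothetical and not taken from any framework.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Envelope:
    """Hypothetical wrapper that adds audit metadata around an event."""
    event: dict            # what changed (this is what the stream gives you)
    user: str              # who issued the command
    command: str           # which command produced the event
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def record(stream, event, user, command):
    # Append the event together with its audit metadata.
    stream.append(Envelope(event=event, user=user, command=command))


stream = []
record(
    stream,
    {"type": "AddressChanged", "customer_id": "carl"},
    user="richard",
    command="ChangeAddress",
)
```

The event payload alone answers "what changed"; the extra fields are what turn the stream into something closer to an audit trail.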

Speaker 2

So the only thing you're getting for sure is the results, but to keep the detail of how you got to this result, that's up to you. Exactly. Where does this struggle? Like, where does this get into trouble?

Speaker 3

Oh like GDPR becomes tricky?

Speaker 2

Oh yeah, oh interesting, mostly protecting personally identifiable information.

Speaker 3

Yeah, because, in a way, you want to have everything that happened in your system in your streams. And it's very easy: if Carl asks to have his information deleted, we're going to delete his streams from our event stream, but that might affect the next time that I'm running...

Speaker 2

Check the stuff out. Yeah, well, and he's not asking to delete the sales record. The sales record is the truth. It's the identifiable information. The identifiable information. And then you come into situations where there's certain information, like passwords for instance, that you're not going to want in your event stream, but it still needs to go somewhere, right? Yeah, so you might end up masking some data in the event stream that you're going to store somewhere else or

deal with elsewhere. Right. I don't know if it's compliant, but my instinct there is I never modify the stream, but I do flag we don't have access to this person's information.

Speaker 1

Anymore, so you never show it. Well, but is that the same as deleting it in in in.

Speaker 3

I think the GDPR legislation forces you to actually delete it, right? You can no longer have it on file.

Speaker 2

That's what I thought too. Yeah, so do you have to go back through all of your historical records that were burned to a DVD and delete those too? I love compliance. Compliance is the best. But that's

Speaker 3

One of those problems where event sourcing gets tricky.

Speaker 2

For instance, Yeah, this idea of keeping a record of the truth.

Speaker 3

Now to be honest, like any any system here, if you start thinking about.

Speaker 2

That, Yeah, and my niceing innolational database is I can this customer wants to be forgotten and so we erase their identifal inform infusiation from the record? I think I still keep the pointer record, like just the ID, but none of their information. Isn't that ID anymore? And that that also.

Speaker 3

Begs the question: okay, we still want to track Carl as one of the people that registered for our website, like, how many user sign-ups did we have? Right? And if we drop his stream, that event might

be gone. So then you have to start thinking: okay, if we want to model that particular bit of information, like new users signed up, we're going to have to have a user-signed-up event in another aggregate that does not get deleted when Carl's stream is removed from the system, so that that information is still accurate. Those things become more intentional.
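
The modeling choice described here can be illustrated with a small Python sketch. The stream names, the tuple-shaped events and the `forget_user` and `count_signups` helpers are assumptions made up for the example; the point is only that the sign-up count lives in a stream that carries no personal data and survives the deletion.

```python
# Hypothetical in-memory streams keyed by stream id.
streams = {
    "user-carl": [("UserRegistered", "carl"), ("AddressChanged", "carl")],
    "user-richard": [("UserRegistered", "richard")],
    # Separate aggregate holding anonymous sign-up facts only.
    "signups": [("UserSignedUp",), ("UserSignedUp",)],
}


def forget_user(all_streams, user_id):
    # Remove the user's own stream, including all personal data in it.
    all_streams.pop(f"user-{user_id}", None)


def count_signups(all_streams):
    # The statistic is derived from the anonymous stream, not user streams.
    return sum(1 for event in all_streams["signups"] if event[0] == "UserSignedUp")


signups_before = count_signups(streams)
forget_user(streams, "carl")
signups_after = count_signups(streams)
```

After Carl's stream is gone, the number of sign-ups is unchanged because it never depended on his stream.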

Speaker 1

Do you leave the name and erase everything else or replace it with not available or scrubbed or something like redacted or something like that.

Speaker 3

There are different approaches that you can take. Yeah. Marten, for instance, allows you to archive certain streams, which means that this aggregate is never going to get fetched back again.

Speaker 2

And then.

Speaker 3

In theory, you could just delete that from the database. You could execute a simple command saying, every event where we have the archived flag set to true, we can delete that. But then you have to really worry about, okay, which projections might that affect, and is that a bad thing? Is that a problem?

Speaker 2

Okay?

Speaker 3

And that's also a reason why you might want to rebuild stuff, because I might delete the data from the stream, I might delete the events, but the data might still be in the projection tables. So that's one of the reasons to periodically rerun or rebuild certain projections.

Speaker 2

And I'm so loath to delete data, just because, you know, are you sure you're deleting the right things? Like, I can also get in trouble with the government for not reporting sales because I had to forget this person and forgot their sales.

Speaker 1

Yeah, well that's true. Got to keep it around at least until you pay the sales.

Speaker 2

Tax or whatever that. Yeah, and probably have to keep it around forever you can. You know, you get audited, you still have to make sure your sales are correct. Yeah, there's a in the United States, there's like a ten year window or something after which you can delete anything you want. Yeah, but even then you I mean, it's smart to keep track of all of your sales, who you sold it to. That's identifiable information. You can delete that,

yeah yeah, yeah, yeah. I just suddenly have this vision of you have these streams marked GDPR redacted user yeah number you know eleven, that kind of thing.

Speaker 3

If certain events, for instance, contain identifiable information, you can mask that. You can say, this field in that type of event, I don't want it to be saved to the stream.

Speaker 1

And then you just... but if the federales say you have to delete it, you have to delete it.

Speaker 2

Yeah, answers the question of what are you deleting, right because at the same time I have another federale saying you can't delete sales records. Yea, I will send you to prison for hiding income.

Speaker 3

Another thing that becomes tricky over time is versioning of events. Events may evolve over time and become.

Speaker 2

Sure new version of the sotom more elaborate or more complex or simpler.

Speaker 3

A certain command might suddenly spawn two events instead of one. As long as your objects are backwards compatible with whatever is saved in your data store, that's not a problem. But at some point you're going to break that.

Speaker 2

Yeah, you a replayable state somehow. And the easiest thing is is to.

Speaker 3

To just have the two versions in your code. Just have two versions of that event in two different namespaces or with two different class names or whatever. And that works fine.

Speaker 2

But that because how sweet do you think we're only going to have two?

Speaker 3

That becomes a mess over time. And then you start practicing things like upcasting. But that's something that I try to steer away from as long as I can, and ideally never do: start updating the table and upcasting the events and resaving them to, ideally, new streams. But that's a tricky mechanism as well, because your new stream will have new eyelins to it. It's now incorrect, and it escalates, it escalates.
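
For readers unfamiliar with the term, upcasting means converting an old event shape into the current one. The Python sketch below shows one variant, converting at read time so the stored events are never rewritten; this is an assumption chosen for illustration, not the resaving approach cautioned against above. The event fields and the `upcast_address_changed` function are hypothetical.

```python
def upcast_address_changed(event):
    """Return the event in the v2 shape, converting v1 payloads on the fly."""
    if event.get("version", 1) == 1:
        # v1 stored one free-text address; v2 splits it into street and city.
        street, _, city = event["address"].partition(", ")
        return {
            "type": "AddressChanged",
            "version": 2,
            "customer_id": event["customer_id"],
            "street": street,
            "city": city,
        }
    return event  # already in the current shape


stored = [
    {"type": "AddressChanged", "version": 1, "customer_id": "carl",
     "address": "1 Main Street, New London"},
    {"type": "AddressChanged", "version": 2, "customer_id": "carl",
     "street": "3 Harbor Lane", "city": "New London"},
]

# Business logic and projections only ever see the v2 shape.
upcasted = [upcast_address_changed(event) for event in stored]
```

The stored list is left untouched, so the original history stays exactly as it was written.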

Speaker 1

I think what I hear you saying is: hire Hannes if you want to tackle these problems.

Speaker 3

Yeah, well yeah, And there's there's a bunch of other people out there that do great work in this space as well. Like if you think about Oscar do for instance, you've met Oscar right. Oscar does some great stuff with events sourcing, and the whole d d D community seems to have adopted this as their pattern of choice. So there's many many people out there. But like think before

you actually do this. And this works very well if you have complex domains that need to be modeled in code where you have high concurrency, for instance, because it's very good at dealing with that very explicit business logic. You don't want to have too many side effects. That's a very good reason to choose for this, But in general, as soon as you become proficient at this, you start using it for a lot more things because it's.

Speaker 2

Just so easy. Yeah, it's definitely. It sounds like a pattern that once you get the groove, you see places to use it. But it's not everything.

Speaker 3

It's just that you have to flip the switch in your brain. For me, it's that life cycle: a command comes in, we replay events, then we execute our business logic, and that spawns new events, which we now append to the stream again. As soon as that clicks in your head, it's not harder or easier than working with normal entities that you're pulling through Entity Framework from your ORM and then calling SaveChanges. I mean, it's a trivial amount of thinking to fit that in.
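
The life cycle described here (command comes in, replay, business logic, append) can be sketched in Python as follows. Everything in the block is a hypothetical illustration: the `InMemoryEventStore`, the tuple-shaped cart events and the optimistic `expected_version` check are assumptions for the example, not something stated in the conversation or taken from a specific framework.

```python
from collections import defaultdict


class InMemoryEventStore:
    """Toy append-only store: one list of events per stream id."""

    def __init__(self):
        self._streams = defaultdict(list)

    def load(self, stream_id):
        return list(self._streams[stream_id])

    def append(self, stream_id, new_events, expected_version):
        stream = self._streams[stream_id]
        # Optimistic concurrency check (an assumption of this sketch).
        if len(stream) != expected_version:
            raise RuntimeError("stream changed since it was loaded")
        stream.extend(new_events)


def replay(events):
    """Fold the history into the current cart contents: sku -> quantity."""
    cart = {}
    for kind, sku, quantity in events:
        if kind == "ItemAdded":
            cart[sku] = cart.get(sku, 0) + quantity
        elif kind == "ItemRemoved":
            cart.pop(sku, None)
    return cart


def handle_remove_item(store, cart_id, sku):
    history = store.load(cart_id)            # 1. command comes in, load stream
    cart = replay(history)                   # 2. replay events into state
    if sku not in cart:                      # 3. business logic decides
        raise ValueError(f"{sku} is not in the cart")
    new_events = [("ItemRemoved", sku, cart[sku])]
    store.append(cart_id, new_events, expected_version=len(history))  # 4. append
    return new_events


store = InMemoryEventStore()
store.append(
    "cart-1",
    [("ItemAdded", "mug", 2), ("ItemAdded", "shirt", 1)],
    expected_version=0,
)
handle_remove_item(store, "cart-1", "mug")
final_cart = replay(store.load("cart-1"))
```

Note that the handler never updates a record in place: it only reads the history and appends the new event to the end of the stream.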

Speaker 2

Well, Hannes, what's next for you? What are you doing? What's in your inbox?

Speaker 3

I'm leaving on holiday tomorrow. Or were you thinking about what's next for me professionally? Either. You know, where are you going? So European. I'm going to the Dutch coast with my family. We're going to go to a little house that's actually on the beach, much like the way that Richard lives, but not as fancy.

Speaker 2

Wow, no fancy, my place has been Okay, that's pretty we.

Speaker 3

We will have neighbors for instance, that we can see.

Speaker 2

The family.

Speaker 3

The family is going to the coast, and I'm going to recuperate for a little bit because I've had a rough year, like renovating a house and becoming an independent contractor for half of my time and making courses and all that sort of stuff. And what's next professionally, what I'm hoping to do in the fall: because I've just released two courses on event sourcing on

Dometrain, and for the second one, there's like two more chapters that I really want to make that aren't there yet. So that might be a thing that I'm going to do in the fall. Great. Yeah. And it's looking like conference season is going to be busy from November to the beginning of December, so I'll probably run into you guys a couple of times somewhere.

Speaker 2

All right. Well, man, this is great stuff. Thanks, Hannes. Yeah, great talking to you.

Speaker 3

I was happy to be on the show, and thanks for letting me rant a little bit about this thing that I'm really passionate about.

Speaker 2

Absolutely crazy explanation. Thanks so much, very clear, all right, and we'll talk to you next time on dot net rocks.

Speaker 4

.NET Rocks is brought to you by Franklins.Net and produced by PWOP Studios, a full-service audio, video and post-production facility located physically in New London, Connecticut, and of course in the cloud online at pwop.com. Visit our website at dotnetrocks.com for RSS feeds, downloads, mobile apps, comments, and access to the full archives going back to show number one, recorded in September two thousand and two.

Speaker 2

And make sure you.

Speaker 4

Check out our sponsors. They keep us in business. Now go write some code, See you next time.

Speaker 3

You got jam Vans

Speaker 2

And

Transcript source: Provided by creator in RSS feed: download file