#60 - Software Tradeoffs and How to Make Good Programming Decisions - Tomasz Lelek | Tech Lead Journal podcast

00:00

Software engineering involves a lot of decisions, and each decision has some trade-offs. So we have pros and cons and so on. It's not like one decision is always better than the other. Sometimes you are not aware of a trade-off at decision time, but later, after a year or two, it is costly. So maybe this cost could be alleviated. Hey everyone.

00:26

My name is Henry Suryawan, and you're listening to the Tech Lead Journal podcast, the show where I'll be bringing you the greatest technical leaders, practitioners, and thought leaders in the industry to discuss their journeys, ideas, and practices that we can all learn and apply to build a high-performing technical team and to make an impact in your personal work. So let's dive into our journey. Hello, everyone. Welcome back to another episode of the

01:00

Tech Lead Journal podcast with me, your host, Henry Suryawan. Thank you for tuning in and spending your time with me today, listening to this episode. If you're new to this podcast, please follow Tech Lead Journal on your podcast app and on social media: LinkedIn, Twitter, and Instagram. Also consider supporting the show by subscribing as a patron at techleadjournal.dev/patron, and support me to continue producing great content every week. Speaking about software

01:29

mistakes, I'm pretty sure that many of us have made such mistakes in our software projects in whatever shape or form. Or how about having to decide on important trade-offs to incorporate in your software design? If you look back at those experiences, those of you who got them right will be patting yourselves on the back for being able to choose the right solutions for your software problems. However, I also believe that a lot of us could point back in time and say, yeah, I think I

01:57

got that one wrong, and what a big mistake that was. Personally, I've made a number of those costly mistakes, and how I wish I could have had some guidance to teach me how to avoid typical and common software mistakes and trade-offs. Fortunately, our guest for today's episode has written a book on just that particular topic. Tomasz Lelek is the author of Software Mistakes and Tradeoffs: How to Make Good Programming Decisions. In this episode, Tomasz shared what

02:27

led him to write this book and also shared one of his past software mistakes, taken from his career experience. He also gave advice on how we as software developers should approach the potential software mistakes we could make, and explained some trade-offs that we typically face when making software engineering design decisions these days, such as code duplication versus flexibility, premature optimization versus optimizing the hot path, data locality and memory, and finally

02:56

delivery semantics and their trade-offs in building distributed systems. There is so much to learn from Tomasz about software trade-offs in this episode, and I hope you learn a lot from it as well. If you do, help the show by giving it a rating and review on your podcast app, or share some comments on the social media channels.

03:18

It may sound trivial, but those reviews and comments are one of the best ways to get this podcast to reach more listeners, and hopefully they can also benefit from all the content in this podcast. So let's start our episode right after our short sponsor message. Are you looking for a new cool swag? Tech Lead Journal now offers you some swags that

03:39

you can purchase online. These swags are printed on demand based on your preference and will be delivered safely to you all over the world where shipping is available. Check out all the cool swags available by visiting techleadjournal.dev/shop, and don't forget to brag about it once you receive any of those swags. Hello everyone, welcome back to another episode of the Tech Lead Journal podcast. Today I have with me someone named Tomasz

04:07

Lelek. He's the author of a soon-to-be-published book by Manning called Software Mistakes and Tradeoffs. When we talk about writing software, most of the time, yes, we do care about correctness. We do care about performance and things like that. But sometimes we also need to be aware of what kind of possible mistakes or trade-offs we

04:26

need to think about when we write software, so that it performs based on the context that we want to build it for, and also the business requirements, and maybe also functional correctness and performance. So today, I think Tomasz will discuss a lot about this, and hopefully we can learn from him about all the mistakes that he may have seen throughout his career, or some of the epic stories. Tomasz

04:48

currently works at DataStax, the company that builds databases such as Cassandra and a few other things. Tomasz, really happy to have you on the show, and looking forward to this conversation. All right, welcome. So, Tomasz, maybe in the beginning you can introduce yourself, telling us more about your career journey or any highlights or turning points? Yeah, sure. So I started at the Schibsted company.

05:11

It was on the Scandinavian market, and Scandinavian people were reading our content: newspapers and also websites providing news. But it was only two million people and users, so the scale was not so big. So then I moved to Allegro, the e-commerce website here in Poland. It's the biggest one; we have 20 million active users. There were a lot of interesting problems that are

05:38

big data problems. We were collecting user traffic and saving it for future analytics, machine learning, and so on. The data covered eight years or so, so there were petabytes of data. I was involved in streaming solutions using Kafka, and also in a microservices architecture, a real microservices architecture with hundreds of microservices living there. There were 700 of them, and each service was deployed to multiple nodes

06:06

that were auto-scaling, with rolling deployments, all of those patterns that are well known in this ecosystem. There were also big data solutions, based mainly on Spark, because it's advanced and has some advantages over Hadoop and those older solutions. After almost four years, I moved to DataStax, because here we have a much bigger scale. As you mentioned, we are providing an ecosystem around Cassandra and some tools around it.

06:32

For example, there is the Stargate API, which provides a variety of APIs so that our customers can speak to Cassandra not only directly via CQL but also, for example, via gRPC, GraphQL, and so on. I was also involved in a Kafka connector. It was based on the Kafka Connect framework created by Confluent. So we provided a way to ingest data into Cassandra from the Kafka platform. The load there was quite high as well, so we needed to take performance into consideration seriously.

07:03

I was involved in some performance testing as well, and so on. The main product was also the Java driver, the main entry point to Cassandra from the JVM ecosystem. There were also design trade-offs and interesting decisions that I was involved in as part of the team. So that's a summary of my journey. Besides that, I am also a technical trainer here in Poland. That's also an interesting experience, because I can teach folks from different companies, but I can also learn from them,

07:32

from their questions. By doing so, I get better at what I am teaching them, for example Kafka, caching solutions, performance tests, and so on. Thanks for sharing your journey. It seems like throughout your career, you have had to work a lot with big data problems, distributed systems, microservices, and also optimizing at the low level. You mentioned building connectors, drivers, and things like that. So obviously you have seen a lot of different types of software,

07:59

most of them with high-performance characteristics. But in the first place, why were you interested in writing a book about software mistakes and trade-offs? This is quite interesting, because I don't see many such books, except maybe some on anti-patterns. So, in the first place, software mistakes and trade-offs: why do you want to publish this

08:17

kind of book? Yeah. So from the beginning of my career, I started noticing that software engineering involves a lot of decisions, and each decision has some trade-offs. So we have pros and cons and so on. It's not like one decision is always better than the other. And of course, you can have more than two options as possibilities. When I was involved in this variety of systems, some strictly business e-commerce systems, big data, stream processing, and so on, the team

08:45

tackled a lot of those decisions and trade-offs at different levels: decisions at the low level, like code patterns, which every software engineer needs to make almost every day, but also more architectural ones that you discuss with your team each sprint or every two weeks, or even more high-level decisions that will influence the evolution of your software, its maintenance, flexibility, and also performance. During those

09:11

breaking points in my career, when we chose one solution over another, I started writing a personal list of those decisions and how they ended up from the time perspective. For example, how a decision impacted our software after one year, after three years, and so on, because I had the pleasure of observing those systems not only during development but also during maintenance, when the system was deployed to production and I saw actual usage of it.

09:41

After that, I had created this list of decisions. There were about 15 of them that were really important. After some time, I saw that those were quite generic, so I decided to share them with other engineers, architects, and so on, so that they will not make the same mistakes and will be aware of those trade-offs. Because sometimes you are not aware of a trade-off at decision time, but later, after a year or

10:07

two, it is costly. So maybe this cost could be alleviated after you read one of the chapters of my book. It's interesting that you started this by curating your personal list of decisions and trade-offs in your career. Maybe you can share one or two of the costliest mistakes that you have seen in your career, based on this list. Okay. So the first one, and I'm touching on that in chapter 10 of my book.

10:36

So we were building a streaming solution, and we needed some kind of scheduling library or scheduling framework that would work efficiently. We needed to handle around ten thousand requests per second, at least. Our system was distributed and was based on Hazelcast. Then we picked one solution, one library, without realizing all its trade-offs,

10:57

without properly analyzing its source code. On the one hand, the library's documentation said that it should work properly in a distributed system, but in practice, in reality, it turned out to be problematic if you have multiple nodes. There were some synchronization problems, the domain was totally different, and the library was not well prepared for that. So this was costly, because we needed to debug those problems.

11:22

And then after some time, it turned out that it was a lot easier to just implement a minimum viable product that provided that functionality for us, instead of using third-party software that we didn't fully understand and whose fixes we didn't have enough influence on. So I think that was one of the mistakes that could be generalized to the fact that we should know the threading model of the libraries that we are importing.

11:49

So, for example, right now there is a trend to implement microservices using non-blocking solutions, like Node.js and so on. If you are using libraries that are not well suited to that threading model, you may block your main thread's processing. It may turn out that it blocks your main flow and impacts your performance. So that should be found at the design stage.
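
The threading-model mismatch described here can be sketched in a few lines. This is only an illustration in Python's `asyncio` (Python 3.9 or newer for `asyncio.to_thread`), not the system discussed in the episode; `blocking_library_call`, `count_heartbeats`, and the timings are hypothetical names and values chosen for the sketch. A heartbeat task stands in for "everything else" the event loop should be doing while the library call runs.

```python
import asyncio
import time


def blocking_library_call(seconds: float) -> str:
    """Hypothetical stand-in for a third-party call that blocks its thread."""
    time.sleep(seconds)
    return "done"


async def count_heartbeats(call_library) -> int:
    """Run call_library while a heartbeat task tries to tick on the same loop."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    await call_library()
    task.cancel()
    return ticks


async def call_on_event_loop() -> str:
    # Blocks the single event-loop thread: nothing else can run meanwhile.
    return blocking_library_call(0.2)


async def call_in_worker_thread() -> str:
    # Offloads the blocking call so the event loop stays responsive.
    return await asyncio.to_thread(blocking_library_call, 0.2)


blocked_ticks = asyncio.run(count_heartbeats(call_on_event_loop))
offloaded_ticks = asyncio.run(count_heartbeats(call_in_worker_thread))
print(blocked_ticks, offloaded_ticks)  # first number is 0, second is positive
```

In the first run the heartbeat never gets a turn, which is the "blocked main flow" effect; in the second run the same call is moved off the loop and the heartbeat keeps ticking.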

12:13

And also the other way around: if you have quite a simple flow that works in a synchronous way and you are using an asynchronous library, you should be aware of problems that may arise, because you need to be aware of ordering, multithreading, and thread safety. When you are introducing multithreading to your code, it means that you need to be aware of that, and it means that you will have additional problems or maintenance costs.

12:37

Sometimes they may be hidden. For example, in chapter 5 there is an example of the Java Streams API and using the parallel stream abstraction. It hides the internal fork-join pool that does the processing in a multithreaded way. So at first glance, at the code complexity level, there is no change at all. You are just changing the stream method invocation to a parallel stream, but underneath you are creating a pool of threads

13:03

and your code is multithreaded now. I'm touching on that in two chapters of my book, basically. So, as a software developer, I know that a lot of the time, as developers, we tend to work based on just business requirements, just ensuring functional correctness, that the input produces the correct output. So in your opinion, what should be the attitude of developers or software engineers in terms of looking at possible software mistakes and

13:29

trade-offs? Is it something that they have to be aware of from the beginning, before they even write the software, or is it during the implementation? How would you advise software developers to look at all these potential software mistakes and trade-offs? Yes, I think there are two different approaches to that. The first one is when you are picking some framework or library that will shape and influence your code base a lot.

13:56

So, for example, in the JVM world, if you are picking the Spring Framework, well, it pervades your code base, and it will be hard to change it to something else in the future. Also, for example, if you are picking a queue solution and you are picking Kafka, you need to be aware of its trade-offs.

14:13

It favors availability over consistency, depending on the acknowledgement settings. With any distributed system that you are integrating with, you need to ask the same questions, like how it influences consistency, availability, and so on. So for those situations, I think it's better to find those trade-offs up front. I think it's possible to know them up front by doing good research and prototyping. On the other hand,

14:38

there are those low-level decisions, like refactoring the code and picking a specific pattern or design. In this situation, maybe it's better to experiment with the code and compare the two approaches. Maybe it's not so costly to, for example, implement a solution using inheritance, but also implement another solution using composition. This can even be implemented by two engineers, and then you can compare your solutions and

15:03

pick the better one. So basically, in the second scenario, the cost of experimentation is lower, so maybe you can postpone the decision about which one you choose. In the first one, the cost is higher, because once you have this in your application, it's a lot harder to migrate, so it's better to read about it and be educated about all those trade-offs up front. So maybe let's go into some of the software

15:27

mistakes and trade-offs that you covered in the book. Let's start with the first one, which is code duplication versus flexibility. Maybe tell us more about this kind of trade-off. What do you mean by duplication versus flexibility? So I'm considering this problem in two ways. The first one is at the architectural level, and I will focus on that first. In the microservices world, with multiple services, each service is responsible for some specific business domain,

15:55

ideally, and contains some self-contained functionality. So if you have, for example, two microservices, it is often the case that they share similar code in some way, like, for example, authorization setup, getting the token and submitting it, or maybe some kind of validation, or things like that. If you have this duplication between services, the first idea may be to just remove it and unify it using, for example, a library, and use it in both microservices, but it

16:24

also has trade-offs. For example, if you have the common code, you have tight coupling between the two microservices and this library. It means that it may impact the evolution of your code base. Also, the fact that this code was duplicated at the beginning doesn't mean that it was the same functionality. Maybe the two copies needed to evolve in different ways. In that situation, it is really hard to evolve it, because you will have some abstraction

16:49

that does not properly fit all these cases. You will need to add additional complexity to the library, to development, and so on, and suddenly you have tight coupling, and the flexibility of your software is lower. So you are not able to deliver your software as quickly as possible. Another solution would be to just extract this functionality to another

17:09

microservice. But that also has other problems, like additional network requests, response time, additional latency, the deployment that you need to create for this microservice, and so on. So at this architectural level, sometimes some code duplication is not a blocker. At the code level, for example, you have two components that are doing a similar job, and you slowly start seeing some abstraction in those two components, and you, for example, decide to use inheritance to reduce

17:36

the duplication. The components are suddenly dependent on this new component via inheritance, and the same situation can happen here: it may turn out that the abstraction doesn't fit the evolution of those two components. Maybe you decided to extract it too soon. At later stages, it may be really hard to get back to the previous solution and evolve

17:57

those components independently. Similarly, you will introduce tight coupling and impact the speed of delivery and the flexibility of your code. I think this is quite a common case, right? As software engineers, we are also kind of taught that you have to remove duplication. Don't repeat yourself, the DRY principle.

18:15

Also, when you see things on the internet, you have so many open source projects that are basically creating libraries, frameworks, reusable components, and things like that. So we are kind of taught that duplication might not be a good idea and that reusability is important. So in the first place, when we are tackling a particular problem, what is your advice for a software engineer on how to look at this?

18:39

Should you start with just a simple implementation first, and over time you'll see some kind of abstraction that could be built around it? Or should you start seeing the abstraction in the beginning, based on the particular context that you know? What's your approach here? I think the sane solution is not to start from the abstraction, but to wait some time until things settle down.

19:01

So if you have a couple of components and they have been living for some time, so they are in quite a stable state and there is no frequent evolution taking place, and you see some abstraction, then it's a good time to extract it. At the beginning, you may be doing that too soon, and it may be hard to unwind. But there's also this argument that if you don't start earlier, where you can reuse

19:23

some of this stuff, it can also be costly in the future, when you have to tweak your existing software and refactor it in order to make it more reusable, generic, and things like that. So, any thoughts around this? There are no simple answers here. Also, if you go to the extreme, your code will not be good. But yeah, I think it also comes down to the fact that you need to have good test coverage of your components.

19:46

So this is not an excuse not to create good quality code. You also need to have well-tested components; if a component is well tested, it is of course easier to refactor. But there are some functionalities that you can be sure will be reused, for example some simple string manipulation, like what you have in standard SDKs, some connections to external systems, and so on. Cool, so let's move on to the next trade-off, which is what you call premature optimization versus optimizing

20:15

the hot path. Almost all engineers will be familiar with the first term. Premature optimization is the root of all evil, they all say. The second term here, optimizing the hot path: what do you mean by that? So in chapter 5, I propose a new framework: if you have code that you don't want to optimize prematurely, how to tell whether it is an early optimization or a just-in-time optimization, and how to have enough data to optimize it. We need to have two things to make

20:43

a good decision about which code paths to optimize. The first one is the number of requests per second, so the expected throughput on your path. Assume that we have two endpoints. Let's say the first endpoint gets only one request per second, and the second one gets ten requests per second. Having that data, you might be tempted to just go and optimize the second endpoint, because it is executed ten times per second, but that is not enough, because we also need latency data.

21:09

So once we have data about latency, for example the average latency for the first endpoint and for the second one, you can multiply the number of requests by the average latency and get the distribution of overall latency over those two endpoints. Having that data, you can detect the path that is executed most often. I'm calling it the hot path. It means that it is executed for almost every user request

21:34

in the system. In the systems that I was working with, it was very often the case that they followed the Pareto principle distribution. So, for example, 80% of the value was delivered by 20% of the code, or it was 10/90, 30/70, and so on. It basically means that if you focus your optimization efforts on a smaller part of the code, you will gain a lot of benefits. So what I'm proposing there is how

22:00

to detect this hot path, the smaller part of the code whose optimization will result in huge benefits, and how to make this decision. So basically, you need to have the requests per second and a measure of latency. When you have that data, you can detect the hot path and dig deeper into your code. You can go to the specific parts of this hot path and measure each of those parts, and then you can focus your optimization efforts on a

22:26

specific thing. It is basically a way to limit the scope of your work, because in production systems it is not possible to optimize every possible path. If you detect the hot path, you can make a rational decision: it is a place where you can put in some additional effort, and it will really be worth it.

22:47

After optimizing the hot path, it may turn out that you have reduced its latency so much that it is no longer responsible for most of the overall utilization of your system. But it may also turn out that optimizing this hot path further will still give you better benefits than optimizing other paths in your code. Say, for example, the first endpoint, the one with one request per second, has an average latency of 500 milliseconds, and the hot path has only 50 or 10.

23:16

If you multiply that by the number of requests, you have the full context, and it may turn out which one is worth optimizing. Maybe a little bit of recap. So yeah, you should know the throughput of your system, the requests per second, and also the latency of each of those requests. Then you multiply them to get a sense of the distribution.
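
The multiplication described above can be written down as a small calculation. The sketch below is illustrative only: the `Endpoint` class and the function name are hypothetical, and the numbers are the example values mentioned in the conversation (one request per second at 500 ms versus ten requests per second at 10 ms), not measurements from a real system.

```python
from dataclasses import dataclass


@dataclass
class Endpoint:
    name: str
    requests_per_second: float
    avg_latency_ms: float

    @property
    def load_ms_per_second(self) -> float:
        """Total latency generated by this endpoint per second of traffic."""
        return self.requests_per_second * self.avg_latency_ms


def rank_by_load(endpoints):
    """Return (name, share of total load) pairs, heaviest first."""
    total = sum(e.load_ms_per_second for e in endpoints)
    ranked = sorted(endpoints, key=lambda e: e.load_ms_per_second, reverse=True)
    return [(e.name, e.load_ms_per_second / total) for e in ranked]


endpoints = [
    Endpoint("endpoint_1", requests_per_second=1, avg_latency_ms=500),
    Endpoint("endpoint_2", requests_per_second=10, avg_latency_ms=10),
]
ranking = rank_by_load(endpoints)
for name, share in ranking:
    print(f"{name}: {share:.0%} of total latency")
# prints:
# endpoint_1: 83% of total latency
# endpoint_2: 17% of total latency
```

With these particular numbers, the less frequently called endpoint dominates, which is why request counts alone are not enough to pick what to optimize. With a 50 ms latency on the second endpoint the two would tie, so the ranking is only as good as the measurements fed into it.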

23:34

Right? What's your P95, maybe your average, your maximum, and things like that. Maybe a few tips here, because sometimes when engineers write software, I don't think they have this in mind in the first place. What kind of tools or techniques should they introduce to their software in order to get all

23:51

these numbers easily? If your system has a defined SLA, so if there is some kind of SLA that states your system should be able to accept some number of requests per second within a specified latency, you have that data. If you don't have this data, then you need to measure it somehow. To get latency, you should write benchmarks that will validate your application. You can, for example, use Gatling or the wrk tool. There are a lot of them, and some stacks also have their own testing framework.

24:22

You can, of course, use it to extract the latency, but the number of requests per second you need to get from an SLA or from some product information: what kind of traffic you expect, or how the traffic will grow over time. Then you can adapt your tests and measure again. So sometimes also, I think these days people classify these

24:42

things as observability, right? So having good monitoring or metrics as part of your software, maybe instrumenting your code base so that you can get this data easily. Maybe even a profiler. Some tools these days can embed instrumentation code inside your code, so that you can get these profiling statistics easily. So, thanks for highlighting the importance of optimizing the hot path. Let's move on to the next one, which is what you mentioned about data locality and memory.

25:08

Maybe can share a little bit more about this. That's also in the Big Data Systems, but also all the data intensive systems. There is an important optimization that you should make. And all those recent remarks are doing those. Optimizations. But also the impact the way how the system's Big Data remorse on spark. How do a song are built and some complexities in them. If someone is new to a big data, ecosystem may think that those complexities are unnecessary or easily hard to enter point is

25:41

high. But data locality means that you are moving your computations to the data, and not the data to the computations. What does it mean? Let's take an example. You have your data saved on multiple nodes, like Hadoop nodes, file system nodes, or even database nodes. You have some big data processing or processing logic, like, for example, finding the average of some value, and your processing is on different nodes.

26:06

Then you are not fetching the data from the data nodes to perform operations on the computation node; instead, you are serializing the computations and sending them to the data nodes. Sending computations is fast, because they do not take a lot of data. This way, you are basically serializing your Java, Scala, or other language program into a binary format and sending it over the wire to the data nodes.

26:32

On the data nodes, some kind of executor also has to be running to deserialize the computation and apply it to the data that is on the local node. It can be done in a distributed way: there are multiple data nodes, each of them processes part of the data, and then there is another phase of coalescing the results. Sometimes this phase involves sending data, because, for example, it depends on the partition or partitioning scheme.
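
The averaging example can be sketched as a toy, in-process simulation. This is not how Spark or Hadoop implement it: the `DataNode` class is hypothetical, nothing is actually serialized or sent over a network, and "records moved" is simply a count of the items that would have had to cross the wire in each approach.

```python
class DataNode:
    """Hypothetical data node that holds one partition of the values."""

    def __init__(self, values):
        self.values = list(values)

    def fetch_all(self):
        # Data-to-computation: every record crosses the network.
        return list(self.values)

    def run(self, computation):
        # Computation-to-data: only the small result crosses the network.
        return computation(self.values)


def partial_sum_and_count(values):
    """The computation shipped to each node: a partial (sum, count) pair."""
    return sum(values), len(values)


nodes = [
    DataNode(range(0, 1000)),
    DataNode(range(1000, 2000)),
    DataNode(range(2000, 3000)),
]

# Approach 1: pull all records to the computation node, then average.
fetched = [v for node in nodes for v in node.fetch_all()]
average_by_fetching = sum(fetched) / len(fetched)
records_moved_by_fetching = len(fetched)

# Approach 2: send the computation, then coalesce the partial results.
partials = [node.run(partial_sum_and_count) for node in nodes]
average_by_shipping = sum(s for s, _ in partials) / sum(c for _, c in partials)
records_moved_by_shipping = len(partials)

print(average_by_fetching, average_by_shipping)              # 1499.5 1499.5
print(records_moved_by_fetching, records_moved_by_shipping)  # 3000 3
```

Both approaches give the same average, but the second one moves three small partial results instead of three thousand records. The coalescing step at the end corresponds to the final phase mentioned above.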

26:59

Sometimes you need to send some data, but the amount of data will usually be substantially reduced compared with sending all the data at the beginning. That's basically the simplest description of data locality. The implementation of it is provided within those distributed systems, and I'm focusing on Apache Spark because it's a recent technology that is widely available.

27:22

It's favored over the old Hadoop-based technologies, mainly because it does a lot of processing in RAM, not on disk. A good example of data locality could be the join operation. Say you are joining data sets, and you have one big data set and a second, maybe smaller one. For example, you have a list of accounts and all purchases made by those accounts. There will be a lot of purchases, and the number of accounts will be smaller. Leveraging data locality would also mean that

27:51

you are making decisions about which data is sent to other nodes. In that case, we would send the smaller data set to the bigger data set to reduce the network traffic, what in Spark is called network shuffling. That's one of the biggest things we need to focus on when writing data processing, to optimize the processing. But this can also be generalized to just normal applications, right? If you need to fetch some data,

28:17

you need to answer the question: maybe it's better to tell the system to compute that and then return only the results. Also, the second aspect here is partitioning, which is important, and picking the proper partitioning scheme for your processing. You need to optimize your partitions for your access patterns. For example, if you need to analyze data per month, then it will be nice if one data node keeps the data for the whole month.
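
The effect of the partitioning scheme on a per-month analysis can be sketched with a toy model. Everything here is an assumption made for illustration: the record layout, the four nodes, and the simple modulo placement (real systems typically use hashing or range partitioning). The sketch compares partitioning by month with partitioning by a different key, account ID, and counts how many nodes a single-month query has to touch.

```python
from collections import defaultdict

NUM_NODES = 4

# Hypothetical purchase records: (account_id, month, amount).
purchases = [
    (account, month, 10 * account + month)
    for account in range(8)
    for month in range(1, 13)
]


def partition(records, key):
    """Assign each record to a node based on the chosen partition key."""
    nodes = defaultdict(list)
    for record in records:
        nodes[key(record) % NUM_NODES].append(record)
    return nodes


def nodes_holding_month(nodes, month):
    """Which nodes must be contacted to analyze a single month?"""
    return sorted(
        node_id
        for node_id, records in nodes.items()
        if any(record[1] == month for record in records)
    )


by_month = partition(purchases, key=lambda record: record[1])
by_account = partition(purchases, key=lambda record: record[0])

march_nodes_by_month = nodes_holding_month(by_month, 3)
march_nodes_by_account = nodes_holding_month(by_account, 3)
print(march_nodes_by_month, march_nodes_by_account)  # [3] [0, 1, 2, 3]
```

When the partition key matches the access pattern, the monthly computation can be sent to one node and run locally; with a mismatched key, the same query has to touch every node.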

28:43

If that's feasible, then you will send a computation to this data node, and the computations will be local to the whole month, so they can be executed in a local way. On the other hand, if you have the data partitioned by, for example, account ID, and you need to extract data for the whole month, it would mean that you need to scan the data from each data node, and then it involves

29:04

network traffic. So yeah, I'm showing a couple of those examples in this chapter, and basically how to fit this partitioning to the processing that you plan to execute. So, thanks for sharing all these details about how big data distributed systems work, because I think for small data, people are normally not really aware of such trade-offs. But yeah, it makes a lot of

29:27

difference if you are processing a lot of data, and where you store the data matters, especially if your data cluster is large. So the thing that I picked up from your explanation just now: instead of sending the data to the computation, why not do it the other way

29:40

around? So you are sending the computation, which is your code, you can call it the functional code, serializing it to binary and then sending it to the data, doing some local processing there, and just sending the result back. Or maybe you do some kind of join and reduce operation, right? In terms of the Hadoop paradigm, MapReduce, you basically send all these map operations to different nodes, and then you reduce them and aggregate them into one result. So, looking at this data locality,

30:04

could you also apply this to microservices, that kind of architecture pattern? Is there something around that, or is this just applicable to big data? Yeah, I think it can, of course, also be applied to microservices. A microservice often needs to fetch a lot of data from other services to aggregate it and return a result for the customer, or maybe

30:25

for another microservice. If you start noticing that a microservice needs to fetch a lot of data from multiple services, and each service is returning thousands of records, and then you are just iterating over those records, filtering some subset of them, or even coalescing them into one result, maybe that's a situation where you could send a request to the second microservice to just compute this value and return this one value.

30:51

In that case, you will reduce the traffic: the computations will be done on this second microservice, and only the result will be returned. In that case, you will save network bandwidth, and also processing, CPU time, and so on. So, yeah, maybe a little bit on

31:07

this, right? Maybe some kind of a viewpoint, because in the traditional way we used to have database stored procedures to process database queries, to ensure that you can actually compute on the database node. But over time people moved on from that strategy and put the logic in the application code, so to speak, like a back-end API, right? Where you just query the data and you compute on the application back end. So what's your view on this?

31:30

I think in the past we used to do it near the data, which is the database itself. Maybe over time, because of scalability or maybe other considerations, people have been moving that kind of computation away from where the data resides. Do you have any thoughts around that? Good. Stored procedures hit a couple of problems: maybe testability, stability over time,

31:50

and also, it could be very vendor-specific, so you are coupling to a specific vendor, and other problems. So that implementation of it was not ideal. But the underlying direction had some pluses in those trade-offs, since you are keeping the computations on the node, as you mentioned. There is a trend to move computations back to the data, and all the data frameworks are doing that. But they are doing that on a slightly higher level, to provide some abstraction and to not be

32:17

tied to a specific vendor. For example, when you use Spark, it doesn't matter what format of data you have: it can be Avro, Parquet, or just plain text files on a file system, and so on, or Cassandra with a database connector that provides this abstraction, and all the logic is in an actual programming language. So it's easier to test and easier to maintain. There are some forms of these approaches, of course. Yeah, looking at the distributed data frameworks, I must say that.

32:45

It's also exciting. So I've used Apache Beam and Dataflow as well, like Spark. I think there are lots of innovations around that. I'm really glad that some people are actually writing this kind of software, so that the end users, so to speak, software engineers, just need to understand the paradigm and write the code appropriately. Let's move on to the next trade-off, which is what you call delivery semantics in distributed systems. So what's the importance of delivery

33:09

semantics? Or maybe, to begin with, explain to some of us: what are delivery semantics? So it's the guarantee that the system you are interacting with gives you on how the data will be delivered. It could be at least once, at most once, or exactly once. Basically, if you are interacting through a network call or with any distributed system, then those delivery semantics are important for you.

33:34

Even if you have the two simplest microservices and one is sending a request to the second one, there could be a network partition, for example, when the response is on its way back to the first microservice. In that case, the second microservice accepted the data, processed it properly, and maybe saved it to the database, but the first service doesn't know what happened. And in that case, you need to make a decision, from the perspective of the first service, about what to do. You can retry the request.

34:00

In that case, you will have a consistent view of your system if the request succeeds. But it means that you have submitted more than one request, so you may have a duplicate. This is at least once: you will deliver the message, but you could deliver it more than once. It gives good guarantees on both systems, but it also has some problems, because the second system needs to be prepared for that. It means that there could be duplicates. You can design a system to be

34:28

idempotent, so it doesn't matter if you are replaying a message: it will result in the same outcome. For example, deleting the value for a specific primary key may be idempotent, because it doesn't matter if you do it once or 10 times; it will just be deleted. But there are some operations that are harder to design to be idempotent, like updating, for example, some counter, because there will be

34:52

inconsistencies there. So we have at least once. In that situation, if the consumer system is not aware of this, those duplicates may leave the data in an inconsistent state. On the other hand, on the producer side, we might decide that in such a situation we are not retrying. But that also has a problem, because the fact that the response didn't come back to the first service can also mean that the request wasn't processed at all. It was a failure. In that

35:20

situation, the first service will not retry, and it leads to a situation where the second one didn't get the value at all, so it was lost. This is at most once: it means that the message could be delivered zero times or one time. There will be no duplicates, but there is the possibility of no delivery. Of course, as everyone is probably aware, that can cause problems, because you lost some data in transmission. There are some cases where this is

35:45

also good enough, for example, collecting telemetry that is collected every millisecond or something like this; if you lose some of it, it may not be problematic. Lastly, you can have effectively exactly once, because when the network is involved, you will need to make a decision about retrying at some point. It may be done even on the protocol level, the TCP layer, and so on. So, in effectively exactly

36:09

once, basically you have this deduplication hidden somewhere, or some kind of transaction built into your system. It's hard to implement because of some problems and complexities, and there is also a big performance overhead if you need to build it, so be aware of that. Also, what's important: if you have a microservice architecture, where there are multiple microservices connecting with each other, even if there is effectively exactly once between two of them, if the next one does not provide

36:37

effectively exactly once, there will be the same problems. So your whole pipeline needs to work in this way. So maybe just to recap again: there are three delivery semantics, at least once, at most once, and exactly once. From my experience in my career, most systems, especially distributed systems, will opt for at-least-once delivery semantics, because maybe it seems much cheaper to accomplish.
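The sender-side decision behind at most once and at least once can be sketched as follows. This is an illustration under stated assumptions, not code from the book: `FlakyReceiver` is a hypothetical stand-in for the second service that processes every request but loses the response to the first few calls, and a `TimeoutError` stands in for the missing response.

```python
class FlakyReceiver:
    """Hypothetical second service: it processes every request, but the
    response to the first `lost_responses` calls never reaches the caller."""

    def __init__(self, lost_responses):
        self._lost_responses = lost_responses
        self.processed = []  # everything the receiver actually handled

    def handle(self, message):
        self.processed.append(message)
        if self._lost_responses > 0:
            self._lost_responses -= 1
            raise TimeoutError("response lost in the network")
        return "ok"


def send_at_most_once(receiver, message):
    # Send once and never retry: no duplicates, but the sender cannot tell
    # a lost response from a lost request.
    try:
        receiver.handle(message)
        return True
    except TimeoutError:
        return False


def send_at_least_once(receiver, message, max_attempts=5):
    # Retry until a response arrives: the message gets through, but the
    # receiver may see it more than once.
    for _ in range(max_attempts):
        try:
            receiver.handle(message)
            return True
        except TimeoutError:
            continue
    return False
```

With two lost responses, `send_at_least_once` succeeds on the third attempt and the receiver ends up with three copies of the message, which is why the receiving side has to be prepared for duplicates.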

37:02

And also it's less complex in terms of how to guarantee that. So from your point of view, is it fair to say that for most use cases, we should design our systems using at-least-once delivery semantics? Yes, yes, that's correct. We can remove the need for deduplication by designing our system to be idempotent, which requires some thinking, for example in an event-based system, but

37:27

it's feasible. So, for example, you can send the whole state instead of incremental updates. If you have the whole state, you don't need to do these counter updates that I mentioned, so there will be no possibility of making the data inconsistent. So if you are designing your system to be idempotent, at least once will work very well. Also, you can build a deduplication mechanism.
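The difference between a counter update and sending the whole state can be shown in a few lines. This is a minimal sketch in which a plain dictionary stands in for the consumer's data store; the function names are made up for the example.

```python
def apply_increment(store, key, delta):
    # Not idempotent: replaying the same message changes the result again.
    store[key] = store.get(key, 0) + delta


def apply_full_state(store, key, value):
    # Idempotent: the message carries the whole state, so a replay
    # overwrites the value with the same value.
    store[key] = value


def apply_delete(store, key):
    # Idempotent: deleting an already deleted key leaves the store unchanged.
    store.pop(key, None)
```

Replaying `apply_increment(store, "clicks", 1)` twice leaves the counter one too high, while replaying `apply_full_state(store, "clicks", 1)` or `apply_delete(store, "clicks")` any number of times gives the same outcome as applying it once.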

37:49

So, for example, each event, each message, can have some unique ID. It can be a UUID, and on the consumer side you can keep some kind of table or persistent map where you map the ID to whether it was processed or not. But this also has its own problems, because you might operate with a distributed database, which also involves network partitions. That's why this update of the fact that an event was processed or not also needs to be done in an

38:16

atomic way. But yeah, I have this in my chapter as well. In that case, you'll need to apply this modification on the database itself: not fetch the state and do some intermediate operations, but do it as one atomic operation, for example some kind of compare-and-set on the database. NoSQL databases offer that as well, so you can implement deduplication.
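One way to picture this consumer-side deduplication is the sketch below. It is an assumption-laden illustration: SQLite from the Python standard library stands in for whatever database is actually used, and instead of a compare-and-set it uses a closely related atomic primitive, an insert-if-absent on a primary key (`INSERT OR IGNORE`). The table names and the `handle_event` function are invented for the example.

```python
import sqlite3
import uuid


def open_store():
    # In-memory database with a table of processed event IDs and one counter.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE processed (event_id TEXT PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE counters (name TEXT PRIMARY KEY, value INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO counters VALUES ('clicks', 0)")
    conn.commit()
    return conn


def handle_event(conn, event_id, delta):
    """Apply a counter update once per event_id; return True if it was applied."""
    with conn:  # one transaction: commit on success, roll back on error
        # Atomic insert-if-absent: the primary key rejects a second insert,
        # so there is no separate read-then-write step that could race.
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed (event_id) VALUES (?)", (event_id,)
        )
        if cur.rowcount == 0:
            return False  # duplicate delivery, already processed
        conn.execute(
            "UPDATE counters SET value = value + ? WHERE name = 'clicks'", (delta,)
        )
        return True
```

Calling `handle_event` twice with the same `str(uuid.uuid4())` value applies the increment only the first time, so a non-idempotent counter update becomes safe under at-least-once delivery.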

38:39

Another thing that is commonly related to delivery semantics is when you choose your messaging queue. I think these days a lot of messaging queues opt for at least once as well, but there are some products that offer different delivery semantics. From your point of view, any thoughts on what kind of messaging queue people should opt for, or maybe some kind of technologies that people should be aware of, with all these different delivery

39:00

semantics? In this chapter, I'm referring to Kafka, because I have the biggest experience in this technology. Kafka is nice in that it allows you to configure your producer and consumer behavior to work in either of those ways. You can have at least once or at most once, depending on your needs. You can tune your settings properly, commit offsets in the proper way, and design your end-to-end pipeline to work in either of those situations.

39:28

Because, as I mentioned, you can have use cases that are good enough with at most once, but also some use cases where you need at least once. Using Kafka, you can have both of those. It's only a matter of configuration, and of configuring consumers and producers properly. I have this in chapter 11 of my book.
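On the consumer side, the choice largely comes down to when the offset is committed relative to the processing. The sketch below models that ordering with a toy in-memory partition; it is not the Kafka client API, and `InMemoryPartition` and `consume` are hypothetical names. With a real Kafka consumer, this kind of control typically means turning off the `enable.auto.commit` setting and committing offsets manually.

```python
class InMemoryPartition:
    """Toy stand-in for one partition plus its committed offset (not the Kafka API)."""

    def __init__(self, records):
        self.records = list(records)
        self.committed = 0  # offset of the next record to read after a restart


def consume(partition, process, commit_first):
    """Read from the committed offset until the end, or until `process` raises."""
    offset = partition.committed
    while offset < len(partition.records):
        record = partition.records[offset]
        if commit_first:
            # At most once: the offset moves on before the work is done,
            # so a crash during processing skips this record for good.
            partition.committed = offset + 1
            process(record)
        else:
            # At least once: the offset moves on only after the work is done,
            # so a crash during processing replays this record after a restart.
            process(record)
            partition.committed = offset + 1
        offset += 1
```

If `process` crashes on a record and `consume` is called again to simulate a restart, the commit-first variant resumes after the failed record (possible loss), while the commit-after variant resumes at the failed record (possible duplicate).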

39:49

Thanks for sharing that. So for those of you who are interested in looking into more details about delivery semantics, make sure you check chapter 11 in particular, about messaging queues with Kafka and all the configurations that you can do. So, Tomasz, it's been a pleasure learning from you about all these different trade-offs, I'm sure. As software engineers,

40:07

I hope we can all upskill ourselves to be aware of what kinds of mistakes could possibly happen, and what kinds of trade-offs we should think about in the beginning, before we implement something, so that we don't make a costly decision that is quite difficult to retrofit, or sometimes nearly impossible to actually fix. Because when you have data involved, right, sometimes fixing data is not

40:28

so easy. So Tomasz, before I let you go, normally I have this one last question that I always ask all my guests, which is to share your three technical leadership wisdom for all of us to learn from, maybe from your journey or from your career. So, can you share your three technical leadership wisdom? So, first of all, leading by example. If you want your team to follow some specific standards, you

40:50

should give an example. For example, if you want to have good test coverage, you should also do it. If you want to have good descriptions of the work, you should also do it, and others will follow in the good direction. The second one is to always ask the question: why? If you are given something to do, to implement a specific feature that will result in complexity, you should always be aware of what value it will bring, whether it gives you any value or not.

41:20

What's the end-to-end flow of the system, how it will impact your customers, and so on. So be aware of this, not only of the technical things. Ask why we even need to implement that, because every piece of code is a maintenance overhead and a cost. So if you are able to solve the problem without code at all, that would be ideal. And the third one is to create a blameless culture, so that there is no blaming of a single person. If you are doing

41:51

some blunder, with some error or failure in production, there is a high chance that it was a problem of the process and not of a single person. So it means that you should tune the process, fix the process so as not to make the same mistake in the future, and learn from the mistakes. So set up a meeting to discuss how it happened, and try to incorporate that into a new process that will not have this problem. Thanks for sharing all this wisdom.

42:17

I find it interesting, especially the second one, always ask why, especially if it involves some kind of complex or big effort, before you actually go down the rabbit hole and realize: oh, actually it's not so important, or there's an alternative way where you actually don't need to write so much code. So, thanks again, Tomasz. For people who want to reach out to you or learn more about the book itself: when is it going to be published?

42:40

Where can they find you online? So, on LinkedIn and also Twitter. My Twitter handle is tomekl007, and it's the same for GitHub. I will respond to any questions, so feel free to ask them. Regarding the book, you can also join the Manning Slack channel. This week our book is the book of the week, which means that everyone can ask any question regarding the book, and I'm answering them. That's cool. That's cool. So, thanks again, Tomasz. I wish you good luck with the

43:09

publication of the book. Looking forward to that. Thank you for listening to this episode and for staying right till the end. If you highly enjoyed it, please share it with your friends and colleagues who you think would also benefit from listening to this episode. And if you're new to the podcast, make sure to subscribe and leave me your valuable review and feedback. It really helps me a lot to grow this podcast

43:36

better. You can also find the full show notes of this conversation on the episode page at the techleadjournal.dev website, including the full transcript, interesting quotes, and links to the resources and mentions from the conversation. And lastly, make sure to subscribe to the show's mailing list on techleadjournal.dev to get notified of any future episodes. Stay tuned for the next Tech Lead Journal episode. And until then, goodbye.

Transcript source: Provided by creator in RSS feed
