Write Great Code, Volume 2, 2nd Edition: Thinking Low-Level, Writing High-Level

Speaker 1

00:00

Hey, everyone, ready for another deep dive? Today we're going to look at assembly language.

Speaker 2

00:05

Ooh, assembly language. Going low level.

Speaker 1

00:09

Yeah, but you know, assembly language can seem a little bit intimidating, a little bit like a relic of the past for a lot of programmers. But I think that understanding it, even if you're not writing it directly, can make you a much better programmer in any language. What do you think?

Speaker 2

00:27

I completely agree. I think it's kind of like knowing how your car works. You don't need to be a mechanic to drive, right? But if you know how your car works, you're probably going to be a better driver.

Speaker 1

00:38

Right. You'll know when something's wrong. You'll be able to maybe fix it yourself, or you'll know when to take it to the shop.

Speaker 2

00:44

Absolutely.

Speaker 1

00:45

So to guide us on this journey today, we've got some excerpts from the book Write Great Code, Volume 2: Thinking Low-Level, Writing High-Level. And this book is really interesting. It busts some myths about compilers and about optimization, and it gives you some actionable insights that I think can level up your coding game.

Speaker 2

01:07

Yeah. I think one of the myths it busts is that compilers are so good these days that there's really no point in learning assembly language, and I think this book shows that that's not true.

Speaker 1

01:20

So why should we be skeptical of, let's say, compiler benchmarks? We see benchmarks all the time: this compiler's faster, this one's more efficient.

Speaker 2

01:29

Well, you got to think about who is making those benchmarks. Usually it's the compiler vendors themselves, so they're going to write code that makes their compiler look good.

Speaker 1

01:38

So it's like designing a racetrack for your specific car. It might not be a fair comparison to other cars, so the results might not reflect real-world performance.

Speaker 2

01:50

Yeah totally.

Speaker 1

01:51

And that's where I think assembly language can really shine. If you can understand how your code is running at the machine level, you can spot bottlenecks yourself. You can write more efficient code even if the compiler is not doing everything perfectly for you. So it's about developing that low-level intuition. Even if we never write assembly code ourselves, we can still benefit from that.

Speaker 2

02:13

Totally, and you realize pretty quickly that even a simple statement that you write in a high-level language, like just a simple calculation, can actually end up being multiple machine instructions when it gets translated down to assembly. For example, there's this example in the book, a Visual Basic statement. It says profits equals sales minus cost of goods minus overhead minus commissions.

Speaker 1

02:36

Yeah, I mean that's basic math. You'd think that would be one instruction.

Speaker 2

02:41

But at the machine level, at least on the 80x86 architecture, the CPU can only work with two operands at a time, so you need multiple subtract instructions in assembly to actually carry that out.
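
As a rough illustration of that point, here is a minimal Python sketch that performs the calculation one two-operand step at a time, the way a single accumulator register would. The function and parameter names are hypothetical, the statement is read here as profits = sales - cost of goods - overhead - commissions, and the assembly mnemonics in the comments are only approximate.

```python
def profits(sales, cost_of_goods, overhead, commissions):
    """Compute sales - cost_of_goods - overhead - commissions step by step."""
    acc = sales             # roughly: mov eax, sales
    acc -= cost_of_goods    # roughly: sub eax, cost_of_goods
    acc -= overhead         # roughly: sub eax, overhead
    acc -= commissions      # roughly: sub eax, commissions
    return acc              # roughly: mov profits, eax


result = profits(1000, 400, 150, 50)
```

Each line touches only two operands (the accumulator and one input), which is why one high-level statement expands into several instructions.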

Speaker 1

02:54

So even though it looks really simple in our high level code, it can be a whole process.

Speaker 2

02:59

It can be a whole dance routine, yeah, for the CPU.

Speaker 1

03:01

Yeah. So, speaking of the 80x86 architecture, what are some mind-blowing things that you've learned about the 80x86?

Speaker 2

03:10

One thing that might surprise people is that those familiar 80x86 registers, like EAX, EBX, ECX.

Speaker 1

03:16

Right, the general purpose registers.

Speaker 2

03:18

Yeah, those aren't actually all separate entities.

Speaker 1

03:20

I always imagined them as separate little boxes in the CPU.

Speaker 2

03:26

Yeah, but a lot of them overlap. The lower sixteen bits of EAX, that thirty-two-bit register, can be accessed as a sixteen-bit register called AX.

Speaker 1

03:36

Oh, interesting.

Speaker 2

03:37

And then AX is divided up into two eight-bit registers, AH and AL.

Speaker 1

03:42

So those registers are like a Russian nesting doll: registers within registers. I guess you've got to be mindful of how changing one register might affect the others.
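
To make the overlap concrete, here is a small Python model of EAX and its AX, AH, and AL views. It is only a sketch: the class name `Register32` is made up for illustration, and real hardware does this with wiring rather than bit masks.

```python
class Register32:
    """Toy model of a 32-bit register (EAX) with its overlapping sub-registers."""

    def __init__(self, value=0):
        self.eax = value & 0xFFFFFFFF

    @property
    def ax(self):                      # low 16 bits of EAX
        return self.eax & 0xFFFF

    @ax.setter
    def ax(self, value):
        self.eax = (self.eax & 0xFFFF0000) | (value & 0xFFFF)

    @property
    def al(self):                      # low 8 bits of AX
        return self.eax & 0xFF

    @al.setter
    def al(self, value):
        self.eax = (self.eax & 0xFFFFFF00) | (value & 0xFF)

    @property
    def ah(self):                      # high 8 bits of AX
        return (self.eax >> 8) & 0xFF

    @ah.setter
    def ah(self, value):
        self.eax = (self.eax & 0xFFFF00FF) | ((value & 0xFF) << 8)


reg = Register32(0x12345678)
reg.al = 0xFF          # writing AL also changes AX and EAX
```

After the write to `al`, reading `reg.eax` gives 0x123456FF: the upper bits are untouched, but the value seen through EAX and AX has changed.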

Speaker 2

03:52

Yeah.

Speaker 1

03:52

Yeah, it's a historical design decision they made to maintain backward compatibility. But it's really interesting. It's something you don't think about when you're writing high-level code. What other low-level secrets does this book reveal?

Speaker 2

04:07

Another one is indexed addressing mode, which is how you access array elements. How that works can really influence how you structure your code.

Speaker 1

04:18

So can you give us a quick rundown of what indexed addressing mode is?

Speaker 2

04:22

Imagine you have an array of numbers stored in memory, right, so each element of that array is going to be at a specific address in memory. Indexed addressing lets you calculate the address of a specific element by adding an offset to the base address of that array.

Speaker 1

04:38

So like, if I want to access the fifth element of an array, I add an offset of four.

Speaker 2

04:42

Yeah, with zero-based indexing, exactly. You add four, scaled by the element size, to the array's base address, and that gets you to the fifth element.
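
The address arithmetic can be sketched in Python by treating a `bytes` object as memory. This assumes 32-bit little-endian elements and an array that starts at byte offset 0; `ELEMENT_SIZE` and `load_element` are names invented for the example. Note that the index of four becomes a byte offset of sixteen once it is scaled by the element size.

```python
import struct

ELEMENT_SIZE = 4  # assumption: each element is a 32-bit integer

# Five integers packed back to back, the way an array sits in memory.
memory = struct.pack("<5i", 10, 20, 30, 40, 50)


def load_element(memory, base, index):
    """Read element `index` of the array starting at byte offset `base`."""
    address = base + index * ELEMENT_SIZE      # effective address
    (value,) = struct.unpack_from("<i", memory, address)
    return value


fifth = load_element(memory, base=0, index=4)  # fifth element, zero-based index 4
```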

Speaker 1

04:48

Wow, it's amazing how much is happening under the hood that we don't really think about at the high level. So we've talked about how compilers handle a lot for us, but they're not perfect. What are some things that compilers struggle with, especially when it comes to optimization?

Speaker 2

05:08

Well, one of the things they struggle with is they only have a limited amount of time to analyze and optimize your code.

Speaker 1

05:14

Right, it's like a time constraint.

Speaker 2

05:15

Yeah, exactly. The process of optimizing code can get incredibly complex. Yeah, so they have to use all these different techniques and they're often limited by time.

Speaker 1

05:24

It's like trying to solve a giant Sudoku puzzle when you only have a certain amount of time to do it. You might not be able to find the optimal solution.

Speaker 2

05:32

Yeah, exactly.

Speaker 1

05:34

So how do compilers, I guess, like, how do they approach this problem?

Speaker 2

05:38

They use a few different techniques. One of the biggest ones is data flow analysis, where they track how data moves through your code. And they also use basic blocks, which is where they break your code down into smaller, more manageable chunks.

Speaker 1

05:51

So the basic blocks are like building blocks for optimization.

Speaker 2

05:54

Yeah, you could think about it that way. Yeah, it makes it easier to analyze and transform the code.
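
One common way to form basic blocks is to start a new block at every possible jump target and right after every jump. The sketch below applies that idea to a made-up instruction list; the `(label, op, target)` tuple format and the opcode names are assumptions for illustration, not anything taken from the book.

```python
JUMPS = ("jmp", "jz")   # assumed set of control-transfer opcodes


def split_into_basic_blocks(instructions):
    """Partition (label, op, target) tuples into straight-line blocks.

    A new block starts at the first instruction, at any labeled
    instruction (it may be a jump target), and right after any jump.
    """
    blocks, current = [], []
    for label, op, target in instructions:
        if label is not None and current:
            blocks.append(current)     # a label ends the previous block
            current = []
        current.append((label, op, target))
        if op in JUMPS:
            blocks.append(current)     # a jump ends the current block
            current = []
    if current:
        blocks.append(current)
    return blocks


program = [
    (None, "mov", None),
    (None, "cmp", None),
    (None, "jz", "done"),
    (None, "add", None),
    (None, "sub", None),
    ("done", "ret", None),
]
blocks = split_into_basic_blocks(program)
```

Inside each resulting block, control enters at the top and leaves at the bottom, which is what makes the block easy to analyze on its own.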

Speaker 1

06:00

Yeah, that's fascinating. I mean, it's incredible to see how much is happening behind the scenes, even with something that seems as simple as compiling code.

Speaker 2

06:07

Oh, it's a whole world down there.

Speaker 1

06:08

This has been a fantastic first part of our journey into Thinking Low-Level, Writing High-Level. We've learned a lot already about this hidden world beneath our code, and there's so much more to come. We'll pick it up next time, so stick with us. Welcome back to our deep dive. We're continuing our exploration of Thinking Low-Level, Writing High-Level.

Speaker 2

06:29

Yeah, we're really seeing how these low level concepts can make us better high level coders.

Speaker 1

06:34

Absolutely. Last time, we talked about how even simple calculations can be way more complex at the machine level.

Speaker 2

06:41

Right, and how compilers do a lot of the heavy lifting for us, but they're not always perfect.

Speaker 1

06:47

Yeah. So let's dig a little deeper into some of those techniques that compilers use to optimize code, and how we can actually write code that plays nicely with those techniques. Make their job easier, exactly. So one of the most fundamental optimizations is constant folding. You know about constant folding?

Speaker 2

07:05

Oh yeah, constant folding. It's a classic, super simple but really powerful.

Speaker 1

07:10

Yeah, so just remind me, how does it work?

Speaker 2

07:13

So let's say you have an expression in your code, like you know, five plus three. A compiler that's doing constant folding, it'll actually evaluate that expression at compile time and just replace it with the result eight in the final code.
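
Here is a toy constant folder built on Python's standard `ast` module, offered as a sketch of the idea rather than how any production compiler does it. The class name `ConstantFolder` and the helper `fold` are hypothetical, only `+`, `-`, and `*` on numeric literals are handled, and `ast.unparse` needs Python 3.9 or later.

```python
import ast
import operator

# Only these operators are folded in this sketch.
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


class ConstantFolder(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)               # fold the operands first
        fn = _OPS.get(type(node.op))
        left, right = node.left, node.right
        if (fn is not None
                and isinstance(left, ast.Constant)
                and isinstance(right, ast.Constant)
                and isinstance(left.value, (int, float))
                and isinstance(right.value, (int, float))):
            folded_value = fn(left.value, right.value)
            return ast.copy_location(ast.Constant(value=folded_value), node)
        return node


def fold(source):
    """Return `source` with literal-only arithmetic evaluated ahead of time."""
    tree = ConstantFolder().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


folded = fold("x = 5 + 3")
```

The expression `5 + 3` is evaluated once, while the source is being transformed, so the rewritten program just assigns 8.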

Speaker 1

07:26

So it's like pre calculating all the easy stuff so the CPU doesn't have to waste time.

Speaker 2

07:30

Yeah, exactly. It's low-hanging fruit, but it can make a real difference, especially if you're doing that calculation over and over again.

Speaker 1

07:35

Right, in a loop or something. Very cool. So what other tricks do compilers have up their sleeves?

Speaker 2

07:43

Well, a close relative of constant folding is constant propagation.

Speaker 1

07:47

Okay.

Speaker 2

07:47

This is where the compiler basically sees that you've assigned a constant value to a variable, and then it just substitutes that value directly into expressions that use that variable. So, for example, you might have pi equals three point one four one five nine, and then later on you have circumference equals two times pi times radius. The compiler can just go ahead and replace that pi with its actual value.

Speaker 1

08:08

Throughout the code.

Speaker 2

08:10

Throughout the code. Yeah, you don't even need to fetch the value of pi from memory every time. It saves a little bit of time.

Speaker 1

08:17

Saves a little bit of processing power. Yeah. So I'm starting to see how understanding these optimizations can make us more efficient coders ourselves.
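
A minimal sketch of constant propagation, again using Python's `ast` module on a tiny program. It assumes straight-line code only (no branches, loops, or functions), and `ConstantPropagator` and `propagate` are names invented for this example; `ast.unparse` needs Python 3.9 or later.

```python
import ast


class ConstantPropagator(ast.NodeTransformer):
    """Replace reads of variables that currently hold a known constant."""

    def __init__(self):
        self.known = {}   # variable name -> constant value

    def visit_Assign(self, node):
        node.value = self.visit(node.value)    # substitute on the right side first
        for target in node.targets:
            if isinstance(target, ast.Name):
                if isinstance(node.value, ast.Constant):
                    self.known[target.id] = node.value.value
                else:
                    self.known.pop(target.id, None)   # no longer a known constant
        return node

    def visit_AugAssign(self, node):
        node.value = self.visit(node.value)
        if isinstance(node.target, ast.Name):
            self.known.pop(node.target.id, None)      # x += ... changes x
        return node

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load) and node.id in self.known:
            return ast.copy_location(ast.Constant(value=self.known[node.id]), node)
        return node


def propagate(source):
    tree = ConstantPropagator().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


result = propagate("pi = 3.14159\ncircumference = 2 * pi * radius")
```

In the output, the read of `pi` in the second statement has been replaced by the literal, while `radius`, whose value is unknown, is left alone.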

Speaker 2

08:26

Absolutely, it's like you're speaking the compiler's language, helping it out.

Speaker 1

08:30

So what other optimization techniques should we be aware of?

Speaker 2

08:34

So another important one is dead code elimination. It's pretty much what it sounds like: getting rid of dead code, any code that will never be executed. It's like spring cleaning for our program. So you might have a conditional statement, right?

Speaker 1

08:49

Like an if-else block, where one of those branches can never be true.

Speaker 2

08:52

Right, so it's never going to execute.

Speaker 1

08:54

It's just never going to happen. So the compiler can just go ahead and get rid of that code entirely.
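
The following sketch removes `if` branches whose condition is a literal constant, using Python's `ast` module. It is a deliberately narrow illustration: `DeadBranchRemover` and `eliminate` are hypothetical names, only module-level statements are considered, and a real compiler would also prove conditions false through analysis rather than requiring a literal.

```python
import ast


class DeadBranchRemover(ast.NodeTransformer):
    """Drop `if` branches whose test is a literal constant."""

    def visit_If(self, node):
        self.generic_visit(node)
        if isinstance(node.test, ast.Constant):
            # Keep only the statements from the branch that can actually run.
            return node.body if node.test.value else node.orelse
        return node


def eliminate(source):
    tree = DeadBranchRemover().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


source = """
if False:
    log = 'never runs'
else:
    log = 'always runs'
"""
cleaned = eliminate(source)
```

The rewritten program contains only the assignment from the `else` branch; the unreachable assignment and the test itself are gone.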

Speaker 2

08:58

That's great. It makes our program smaller, potentially faster.

Speaker 1

09:01

What else? There's common subexpression elimination. This is where the compiler sees that you're doing the same calculation multiple times with the same inputs, and it'll just do that calculation once and then use the result everywhere.

Speaker 2

09:14

So if I'm calculating the area of a circle over and over again with the same radius, it can figure that out.

Speaker 1

09:22

Yeah. It's basically like factoring out a common factor in an equation. It simplifies things.
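
A hand-written before-and-after pair shows what the transformation amounts to. The function names are made up for the example, and the second version is what a programmer (or an optimizing compiler) might produce from the first.

```python
import math


def circle_stats_naive(radius):
    area = math.pi * radius * radius
    # The same product is spelled out again instead of being reused.
    half_area = (math.pi * radius * radius) / 2
    return area, half_area


def circle_stats_cse(radius):
    area = math.pi * radius * radius   # common subexpression, computed once
    return area, area / 2


same = circle_stats_naive(2.0) == circle_stats_cse(2.0)
```

Both versions return identical results; the second simply does the multiplication once.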

Speaker 2

09:27

These optimizations are fascinating. I never really thought about how much is going on behind the scenes. One more that I think is really interesting is loop-invariant code motion.

Speaker 1

09:40

What is that? So you have a loop, right.

Speaker 2

09:43

And there's some code inside that loop that maybe it doesn't actually need to be inside the loop.

Speaker 1

09:49

Okay, so it's not changing every iteration.

Speaker 2

09:51

Exactly, it's loop invariant. It doesn't change, right. The compiler can actually take that code and move it outside the loop.

Speaker 1

09:57

Okay, so it only gets executed once.

Speaker 2

09:59

Yeah, exactly, and that can save you a bunch of time, especially if it's a loop that's running a lot of times.
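
Here is the same idea done by hand in Python, which is worth knowing because the standard CPython interpreter does not hoist such code for you. The function names and the choice of a cosine scale factor are assumptions made purely for illustration.

```python
import math


def scale_naive(values, angle_degrees):
    out = []
    for v in values:
        # Loop-invariant: depends only on angle_degrees, not on v.
        factor = math.cos(math.radians(angle_degrees))
        out.append(v * factor)
    return out


def scale_hoisted(values, angle_degrees):
    factor = math.cos(math.radians(angle_degrees))   # computed once, outside the loop
    return [v * factor for v in values]


data = [1.0, 2.0, 3.0]
matches = scale_naive(data, 60) == scale_hoisted(data, 60)
```

The results are identical, but the hoisted version calls `math.cos` once instead of once per element.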

Speaker 1

10:05

So with these optimizations, it's like we've been driving with the parking brake on all this time.

Speaker 2

10:10

Yeah, a little bit.

Speaker 1

10:11

Yeah, and now we're learning how to take it off and let our code really fly. So we've seen some of these compiler techniques, but what can we do as programmers to kind of write code that's more optimization friendly.

Speaker 2

10:26

Well, one of the first things you got to understand is operator precedence.

Speaker 1

10:29

Ah, the order of operations.

Speaker 2

10:31

Yeah, PEMDAS, right? We all learned that in school. But it turns out compilers have to follow those same rules too.

Speaker 1

10:38

So if our code isn't clear about the order of operations.

Speaker 2

10:41

Yeah, it can get confused.

Speaker 1

10:43

It can mess things up.

Speaker 2

10:44

Yeah, you can get the wrong results.

Speaker 1

10:45

So another important thing to consider is side effects. What are side effects?

Speaker 2

10:50

A side effect is basically when a statement in your code does something other than just calculating a value. So it might change the value of a global variable or write to a file.

Speaker 1

11:02

Right, it's like a hidden action happening in the background.

Speaker 2

11:04

Exactly. And this can make it really hard for compilers to optimize code, because they have to consider the order in which those side effects might happen.
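
A short Python sketch of why side effects block optimizations. The names `call_count` and `next_id` are hypothetical; the point is only that two calls to a function with a side effect cannot be merged into one.

```python
call_count = 0   # global state modified by the function below


def next_id():
    """Returns a value AND changes global state; that change is the side effect."""
    global call_count
    call_count += 1
    return call_count


# Looks like a repeated subexpression, but the two calls return 1 and 2.
total = next_id() + next_id()
```

An optimizer that rewrote this as `2 * next_id()` would compute a different total and leave `call_count` at a different value, so it has to keep both calls and their order.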

Speaker 1

11:12

So it's like trying to I don't know, choreograph a dance where some of the dancers are doing their own thing off stage.

Speaker 2

11:18

Yeah, that's a good analogy.

Speaker 1

11:19

So minimizing side effects and making them really explicit can help the compiler do its job better.

Speaker 2

11:24

Yeah, make its life easier.

Speaker 1

11:26

So we've got operator precedence, we've got side effects. What other tips can you give us?

Speaker 2

11:31

Well, let's talk about control structures like if statements and loops and switch statements, right?

Speaker 1

11:35

The things that control the flow of our code.

Speaker 2

11:37

Exactly. The book talks about how to write those control structures in a way that lets the compiler implement them efficiently. For example, with if-else statements, you can often figure out which branch is more likely to be taken, and if you structure your code so that the more likely branch is the default path, that can help the compiler generate more efficient code.

Speaker 1

12:01

It's like planning a route where you're probably going to take the highway instead of the back roads. You make it the most efficient path.

Speaker 2

12:07

The book also talks about switch statements, which can be optimized using things like jump tables, which are really interesting.

Speaker 1

12:13

Jump tables. What's a jump table?

Speaker 2

12:15

So imagine a table that maps each possible case value in the switch statement to the corresponding block of code. Instead of evaluating a bunch of conditions, it can just jump right to the right place.

Speaker 1

12:29

So it's like a super efficient lookup table for code execution.
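
In Python the closest analogue is a list of functions indexed by the case value. This is a sketch under that analogy: the handler names are invented, and a compiled jump table holds code addresses rather than function objects.

```python
def handle_start():
    return "starting"


def handle_stop():
    return "stopping"


def handle_pause():
    return "pausing"


def handle_default():
    return "unknown"


# Index = case value; entry = code to run for that case.
JUMP_TABLE = [handle_start, handle_stop, handle_pause]


def dispatch(case_value):
    # One bounds check replaces a chain of comparisons.
    if 0 <= case_value < len(JUMP_TABLE):
        return JUMP_TABLE[case_value]()
    return handle_default()


outcome = dispatch(1)
```

However many cases there are, dispatch costs one range check and one indexed lookup, instead of testing the cases one by one.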

Speaker 2

12:33

Yeah, you got it.

Speaker 1

12:33

Okay. What about loops? They're so essential, but they can also be a big performance bottleneck if we're not careful.

Speaker 2

12:43

The book talks about how to avoid writing code that's going to get in the compiler's way when it's trying to optimize loops. So things like minimizing the amount of work you're doing inside the loop and using loop counters effectively. It's about making that loop really predictable.

Speaker 1

12:59

This is like streamlining the assembly line to make it as smooth as possible.

Speaker 2

13:04

You got it.

Speaker 1

13:04

Wow, this is incredible. It's amazing to see all these optimizations that are happening that most of us probably don't even think about.

Speaker 2

13:11

Yeah, it's a whole world down there.

Speaker 1

13:14

So this has been a really insightful look into the world of compiler optimization.

Speaker 2

13:17

It's amazing what they can do, isn't it?

Speaker 1

13:19

And how we as programmers can actually help them out a little bit.

Speaker 2

13:22

Give them a hand, exactly.

Speaker 1

13:25

But we're not done yet. There's still more to explore in Thinking Low-Level, Writing High-Level. Stick with us for the final part of our deep dive. Welcome back. It's the final part of our deep dive into Thinking Low-Level, Writing High-Level.

Speaker 2

13:42

We've covered a lot of ground, haven't we?

Speaker 1

13:43

Yeah, it's been a real journey exploring how those low-level assembly concepts can inform the way we write high-level code.

Speaker 2

13:51

Yeah, it's all about building that intuition.

Speaker 1

13:53

Yeah, like understanding those fundamental principles, even if we're not writing assembly code every day.

Speaker 2

13:58

Right, it makes you a better programmer overall, no matter what language you're using.

Speaker 1

14:02

Absolutely. So for this last part, I want to focus on a specific data structure, one that we use all the time: strings.

Speaker 2

14:09

Strings. Ah, yes, those simple sequences of characters.

Speaker 1

14:13

Yeah, but as we'll see, you know, there's more to them than meets the eye.

Speaker 2

14:17

There's a lot going on under the hood. Different ways to represent them in memory.

Speaker 1

14:21

Right, and they can have a big impact on performance and memory usage.

Speaker 2

14:24

Absolutely. So let's talk about some of those different ways to represent strings.

Speaker 1

14:29

Yeah, refresh my memory. What are some of the common types?

Speaker 2

14:32

Well, the one you probably see most often is the zero terminated string.

Speaker 1

14:36

Oh yeah, the null terminator.

Speaker 2

14:38

Yeah, a classic. It's basically just an array of characters, but at the very end there's a special character, a null character, represented as zero, and that signals the end of the string.

Speaker 1

14:49

Simple but effective.

Speaker 2

14:50

It is. But it has one little quirk. Since that null character marks the end, you can't have any null characters within the string itself.

Speaker 1

14:58

Oh, I see. That could be a problem if you're trying to store certain types of data.

Speaker 2

15:03

Yeah, like binary data or text that might contain nulls. For that, you might need a different type of string.
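
The quirk is easy to reproduce with Python `bytes` standing in for raw memory. The helper `c_string_length` is a hypothetical name; it scans for the terminator the way C's `strlen` does.

```python
def c_string_length(buffer, start=0):
    """Count bytes from `start` up to (not including) the first zero byte."""
    length = 0
    while buffer[start + length] != 0:
        length += 1
    return length


hello = b"hello\x00"          # ordinary zero-terminated string
binary = b"ab\x00cd\x00"      # data with a zero byte in the middle

hello_len = c_string_length(hello)
binary_len = c_string_length(binary)   # stops at the embedded zero
```

The second buffer holds five bytes of data, but the scan reports a length of two because it cannot tell an embedded zero from the terminator. Finding the length also takes time proportional to the string's size.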

Speaker 1

15:09

Okay, so what's another option?

Speaker 2

15:11

Another one is the length-prefixed string.

Speaker 1

15:13

Length-prefixed. Okay.

Speaker 2

15:14

Instead of relying on a terminator, it stores the length of the string explicitly, usually as an integer right at the beginning.

Speaker 1

15:21

So it's like the string is carrying a little size tag.

Speaker 2

15:23

Exactly. It tells you exactly how many characters there are.

Speaker 1

15:26

So you can have any character you want within the string itself.

Speaker 2

15:29

Yeah, null characters, no problem. The downside is you need a little extra space to store that length information.
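
A minimal sketch of a length-prefixed layout in Python, assuming a 4-byte little-endian length field. The field width is a choice made for this example (formats differ on it), and `pack_lp` and `unpack_lp` are hypothetical helper names.

```python
import struct

LENGTH_FIELD = "<I"                            # assumption: 4-byte unsigned length
LENGTH_SIZE = struct.calcsize(LENGTH_FIELD)    # 4 bytes of overhead per string


def pack_lp(data: bytes) -> bytes:
    """Store the length first, then the raw bytes."""
    return struct.pack(LENGTH_FIELD, len(data)) + data


def unpack_lp(buffer: bytes) -> bytes:
    """Read the length field, then slice out exactly that many bytes."""
    (length,) = struct.unpack_from(LENGTH_FIELD, buffer, 0)
    return buffer[LENGTH_SIZE:LENGTH_SIZE + length]


packed = pack_lp(b"ab\x00cd")      # embedded zero byte is fine here
restored = unpack_lp(packed)
```

The embedded zero survives the round trip, and the length is available without scanning. The cost is the extra four bytes in front of every string.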

Speaker 1

15:35

Ah, trade-offs. Always trade-offs.

Speaker 2

15:38

The book also talks about HLA strings, which are specific to the HLA language.

Speaker 1

15:43

High Level Assembly. HLA strings.

Speaker 2

15:45

And they kind of combine elements of zero-terminated and length-prefixed strings. A hybrid approach, you could say.

Speaker 1

15:51

Interesting. Now there's another type of string that I've always been a little fuzzy on: descriptor-based strings.

Speaker 2

15:59

Ah, yes, descriptor-based strings.

Speaker 1

16:02

What are those?

Speaker 2

16:03

So instead of storing the string data directly, they use a separate data structure called a descriptor. That descriptor points to the actual string data, but it also contains information about the string.

Speaker 1

16:15

So it's like a file folder that tells you about the string.

Speaker 2

16:17

Yeah, good analogy. It gives you the length, the character set, all that stuff. The advantage is flexibility: you can share string data more easily. But there's an extra layer of indirection there, so it can be a bit slower.
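
One possible shape for a descriptor, sketched as a Python dataclass. The field names (`buffer`, `offset`, `length`, `encoding`) are assumptions chosen for illustration; actual descriptor layouts vary by language and runtime.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StringDescriptor:
    buffer: bytes            # the character data, possibly shared
    offset: int              # where this string starts in the buffer
    length: int              # how many bytes it covers
    encoding: str = "ascii"  # character set information

    def text(self):
        # The extra indirection: go through the descriptor to reach the data.
        raw = self.buffer[self.offset:self.offset + self.length]
        return raw.decode(self.encoding)

    def substring(self, start, length):
        """New descriptor over the same buffer; no characters are copied."""
        return StringDescriptor(self.buffer, self.offset + start, length, self.encoding)


data = b"low-level thinking"
whole = StringDescriptor(data, 0, len(data))
word = whole.substring(10, 8)
```

Here `word` and `whole` describe different strings but refer to the same underlying buffer, which is the sharing that the descriptor approach makes cheap.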

Speaker 1

16:29

I see. So each type of string has its pros and cons.

Speaker 2

16:32

Yeah, depends on what you're trying to do.

Speaker 1

16:34

Now, beyond these basic types, the book mentions some more specialized string representations, like reference counted strings.

Speaker 2

16:42

Ah. Yes, reference counting, that's a clever one.

Speaker 1

16:45

That's a memory management technique, right?

Speaker 2

16:46

It is. Each string keeps track of how many references to it exist, like how many pointers are pointing to it. When that count goes to zero, nobody's using it, and the memory gets freed.

Speaker 1

16:57

So it's like a self-cleaning system.

Speaker 2

17:00

Exactly. Helps prevent memory leaks.
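
A toy model of the bookkeeping, with explicit `retain` and `release` calls. The class and method names are hypothetical, and clearing `text` merely stands in for handing memory back to an allocator.

```python
class RefCountedString:
    def __init__(self, text):
        self.text = text
        self.refs = 1          # the creator holds the first reference
        self.freed = False

    def retain(self):
        """Another owner starts using the string."""
        self.refs += 1
        return self

    def release(self):
        """An owner is done; free the data when nobody is left."""
        self.refs -= 1
        if self.refs == 0:
            self.text = None   # stands in for freeing the memory
            self.freed = True


s = RefCountedString("shared")
alias = s.retain()     # two references now
s.release()            # one reference left, data still alive
```

The data stays alive while `alias` still holds a reference and is released only when the last owner calls `release`.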

Speaker 1

17:01

Okay, and then there's the whole world of Unicode strings. Unicode is great, but it also introduces some challenges, right?

Speaker 2

17:09

Yeah, Unicode can represent characters from any writing system, which is amazing. But those characters might take up more space than a simple ASCII character.

Speaker 1

17:21

Because Unicode has to handle a much larger set of characters.

Speaker 2

17:24

Yeah, exactly. So string operations might take a little longer.
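
The size difference is easy to measure in Python by encoding a few characters. The sample characters are arbitrary choices for this sketch, written as escapes so the code points are unambiguous.

```python
samples = [
    "A",            # U+0041, plain ASCII letter
    "\u00e9",       # U+00E9, e with acute accent
    "\u20ac",       # U+20AC, euro sign
    "\U0001F600",   # U+1F600, an emoji outside the Basic Multilingual Plane
]

# Bytes needed per character in two common Unicode encodings.
utf8_sizes = [len(ch.encode("utf-8")) for ch in samples]
utf16_sizes = [len(ch.encode("utf-16-le")) for ch in samples]
```

In UTF-8 these characters take 1, 2, 3, and 4 bytes; in UTF-16 they take 2, 2, 2, and 4. With variable-width encodings like these, finding the nth character can require scanning from the start, which is one reason string operations may slow down.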

Speaker 1

17:28

So it's something to keep in mind.

Speaker 2

17:29

Definitely, especially if performance is critical.

Speaker 1

17:32

This has been a really eye opening look into strings. You know, I use them every day, but I never really thought about the different ways they can be represented.

Speaker 2

17:40

It's a good reminder that even those basic building blocks have their own complexities.

Speaker 1

17:44

Absolutely. Now, before we wrap up, I want to touch on something else the book talks about, something that really surprised me: the impact of file organization on code efficiency.

Speaker 2

17:54

File organization. It might not seem obvious, but it can make a difference.

Speaker 1

17:58

Yeah, I never thought about it. So how does that work?

Speaker 2

18:02

So operating systems manage memory in chunks, and if you organize your code well, you can make it easier for the operating system to load and manage those chunks.

Speaker 1

18:12

I see. So it's like if you're packing a suitcase, you want to put all the heavy stuff together.

Speaker 2

18:16

Yeah, kind of like that.

Speaker 1

18:17

So you're not constantly digging around for things.

Speaker 2

18:20

The book talks about things like locality of reference.

Speaker 1

18:22

Locality of reference. Okay.

Speaker 2

18:24

It means if your program accesses one memory location, it's probably going to access nearby locations soon after.

Speaker 1

18:32

So keep related things together.

Speaker 2

18:33

Keep things close. It makes everything run smoother.

Speaker 1

18:36

Fascinating. Well, I think we've just about reached the end of our deep dive. It's been quite a journey. We've explored so much about how those low-level concepts can inform and empower our high-level coding.

Speaker 2

18:49

Even if we never write a line of assembly ourselves.

Speaker 1

18:53

Exactly. And it's not about becoming assembly language experts. It's about developing that low-level intuition, that ability to see how things really work under the hood.

Speaker 2

19:03

And write better code as a result.

Speaker 1

19:07

So keep exploring, keep learning, and keep pushing the boundaries of what's possible with code.

Speaker 2

19:11

Happy coding everyone, until next time.

Transcript source: Provided by creator in RSS feed.
