I'm so glad that you're fully plugged in and ready to go today. Ready. So we have two technical topics. Do we? Well, that's what's on deck right here. And I'm wondering if we should switch one to non-technical. We can do whatever makes you happy, man. The vibes. Right now, the two ones are, Scott, you'll teach Mark how to make his terminal colorful, and I'll teach you how to take a dump. I'm sorry, kick her down.
Hey, don't laugh at non-native speakers. It's not non-English speakers. You said it absolutely perfectly. It's always been funny. It's been funny for 30 years. Oh, really? I thought it was just me. No, no, no, it's not you. Do you know why it's funny? I don't know, actually. Explain that to me. You know what a process dump is, right? Yeah, I know that. Okay, so if you take...
A process dump, you take it from the process. But if you are living your life and then you're like, hey, I need to go take a dump, it means poop. Oh, okay. Well, this has been funny since the 90s. And this is something that is just one of those things that everybody giggles at when you say it. Okay. All right, let's do this. Do I use one, two, three? Did we start? We're starting now. Hey, friends. Welcome to Scott and Mark Learn To.
Mark, I've been coding for some decades. You've been coding for some decades plus one. And I think that amongst application developers, average Joe and Jane application developers, dump files, DMP files, are still kind of opaque. I think there's a certain kind of level where it's like, and now you're driving stick shift, you know, taking a dump, no pun intended, on Windows. There's a certain kind of person that just gets it.
How can we get more developers to get that? And is that a useful skill to have? It is a useful, if you're a programmer, it's an absolutely must-have skill to have. Especially if you're doing native code development, but even with C#. I think you've probably opened a lot of dump files. Sure. Yeah, a lot. But I think that this is still like, this is my opinion, and we'll see what people say in the comments, but I feel like it's a sub-10%-of-developers skill.
Which is probably not enough. I don't think it's enough. You're really handicapping yourself if you're not able to process a dump file to understand what went wrong when your app crashes. Yeah. Do you think it's because of the rise of managed code, that we've kind of obfuscated what's really happening and people are just making memory and the garbage collector's taking care of it? Even with that, so you get an unhandled exception, like looking at the stack with symbols and source code.
It's like you're in the debugger anyway. In fact, it's just kind of a static view of your app in a debugger. Okay. So if you can run the debugger, then you can open a dump file. One of your most popular blog series is The Case of the Unexpected Whatever, where you're just living your life and then a thing happens. And rather than just accepting it...
like a normal person, you immediately stop everything, you get nerd sniped, and then you solve it by usually doing some kind of low level debugging. Maybe you could take a dump for us today. And we could do some of that debugging on the show. Now, this, of course, will be interesting for those who are listening rather than seeing this on YouTube. So if you're listening to this podcast...
We'll try to do some audio description, but you'll probably get a better experience if you go and check it out on YouTube because we're going to share Mark's screen for this. So, Scott, you were talking about my case of the unexplained. Yeah. You didn't get the name right, which shows that you never really... No, not really a fan. Not a fan. Which is really disappointing because you missed out on some good times. What did I say?
I don't know, you said you babbled something. Yeah, okay. But one of the things that I always show in Case of the Unexplained is I'd get into crash dumps, and I'd focus on kernel dumps because a decade or so ago blue screens on Windows weren't all that uncommon. It's gotten a lot better now, and I very, very rarely encounter a blue screen or green screen. Do you ever have any?
Only time I get green screens is when I'm installing unsigned drivers for weird IoT stuff, for weird, you know, Arduino things I probably shouldn't have installed a driver for. I had a green screen a couple of days ago installing an unsigned open source ButterFS, Btrfs, file system driver, and it was my fault, and I removed it. So I would not call that out.
That was an error between the chair and the keyboard. Self-inflicted, yeah. All right. But in the course of normal usage of Windows, it's extremely rare these days. Well, even if, like in the old days, graphics drivers would be blue-screening you left and right, and now it just flickers your screen.
and you get a little, hey, something happened, but we figured it out. Yeah. I don't even see that anymore. It just flickers. But one of the things that I would do is show collections of blue screens out in the wild. And I haven't updated this in a long time. You have a folder of blue screens? Yeah, and they're just awesome. They're waiting for baggage. It's never going to come. It's never going to come. I was in Berlin, in the Berlin airport, when the entire...
airline industry blue-screened that day. Yeah, I guess it looked like this pretty much. Oh, wow. Yeah, yeah, yeah. You don't see that anymore very often. But lots of airport ones, I guess. This is just airport, airport, airport. That's before they added the friendly emoji that makes you feel better about it. Furniture store. Courtesy phone. USB. It's always USB. Customs. Airport. Airport. Airport. Airport.
It's starting to feel like it's the airport's fault at this point. Yeah. One of my favorites, though. Nice. There's a Mac one. Oh, wow. I always like it when the airplane video thing on the chair in front of you reboots, and then you're like, what version of Linux is this chair running? And it's like from 1997? Yeah, exactly. I like this one.
That's nice. But one of the things that I have on here is one of my favorites. It's the space shuttle. They have a blue screen on the space shuttle, and it's the recording of them talking to mission control. I've got a blue screen here. It's got white writing on it. And you hear the mission control people say, okay, why don't you just write down what you see and we'll take a look. Okay.
So when this happens, though, I mean, a lot of us feel helpless, but as developers, we do not need to feel helpless when a blue screen that generates a dump happens. No, you don't need to feel totally helpless, but you can get some idea about what went wrong. And here I can show you a tool that I've got that can let you practice if you're interested in debugging blue screens. And it's called Not My Fault. So I've just searched for it. And here it is. I'll download it.
and then run it. Requires admin privileges. And the reason it requires admin privileges is it's actually installing a device driver. And that explains its name, because when you have a blue screen crash, it has to be something in kernel mode, a driver, that is causing the crash, or the kernel itself. And so this GUI that we're looking at. So just to be clear, you have loaded a signed kernel driver that does bad stuff, but it's okay because you did it as admin. Yes.
And you can crash, you can hang the system. You can also leak kernel memory. Okay. So this is all aimed at trying to help you. Now, I'm not going to, like I said, crash my system. But I've got a dump file from before. And the way we're going to look at the dump file is using a tool called WinDbg. Windbag, the kids call it. Yeah, affectionately called Windbag. And this is the new one. This is actually really cool. It's got a whole bunch of new features.
The one I've been using for 20 plus years is not as pretty as this one. Yeah. The one that you get in the WDK or the SDK is the old one. Yeah. There was a new one that was included with the SDK and WDK called WinDbgX that's evolved into just WinDbg in the Microsoft Store. Yeah, it's super good. Yeah. So if we open this up
Here we go. And then I can type control D, which is load dump file. And this is for loading kernel mode or user mode dumps. And these are some dumps from Not My Fault that I've collected from those options that you saw. Here's a high IRQL or high IRQL. Dump? Yeah. Probably from loading your Sound Blaster card on the wrong interrupt. Well, no, this is, yeah. But this is from Not My Fault.
Not from anything else. I know, I know. Whenever I think IRQ, it's like, did you put your card on seven or five? That's true, because they're interrupt requests. Exactly. And IRQL is interrupt request level. There are certain rules about what kernel functions you can call at different IRQLs. If you call a kernel function at the wrong IRQL, you could get a blue screen, which is IRQL-related. And that's what the MyFault driver is causing here.
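As a rough illustration of the kind of rule being described, here is a toy Python model, not kernel code. It assumes just one well-known rule: pageable memory must not be touched at DISPATCH_LEVEL or above, because a page fault cannot be serviced there. The class and method names (`ToyCpu`, `touch_pageable_memory`, `BugCheck`) are made up for this sketch and are not Windows APIs.

```python
from enum import IntEnum


class Irql(IntEnum):
    # The three lowest interrupt request levels on Windows.
    PASSIVE_LEVEL = 0
    APC_LEVEL = 1
    DISPATCH_LEVEL = 2


class BugCheck(Exception):
    """Stands in for a blue screen in this toy model."""


class ToyCpu:
    """Hypothetical model of one processor's current IRQL."""

    def __init__(self):
        self.irql = Irql.PASSIVE_LEVEL

    def raise_irql(self, new_level):
        # Return the old level so a caller could restore it later.
        old = self.irql
        self.irql = Irql(new_level)
        return old

    def touch_pageable_memory(self):
        # Simplification: the toy treats every such access as fatal.
        # On a real system it only crashes if the page is not resident.
        if self.irql >= Irql.DISPATCH_LEVEL:
            raise BugCheck(f"pageable access at {self.irql.name}")
        return "ok"
```

At PASSIVE_LEVEL the access succeeds. After `raise_irql(Irql.DISPATCH_LEVEL)` the same call raises `BugCheck`, which is roughly the situation a high-IRQL test crash sets up on purpose.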
Now, you saw when I loaded it up, it loaded some symbols. It's using the Microsoft Symbol Server to download symbols for Windows itself. It doesn't have symbols for MyFault, though, because that's my own driver. Okay. But even without the symbols, we can understand potentially what happened here. So if I click on this !analyze -v, that is an automated, heuristically driven...
view of the dump that can try to pinpoint what happened. And one of the most reliable tools for figuring out what happened is just looking at the stack trace on the processor where the BSOD function was called. And if we look at the stack trace, you can see NT trap. So that's a fault. And then you can see the next frame up is the MyFault driver. So into the MyFault driver.
we ended up crashing at that location. So what are the hex values? There's five of them there and then a T in parentheses to the left. Yeah, so this is the stack frame. This is the return address at that stack frame, and here are parameters in the stack, the first three of them. Those are the addresses to the parameters? Yeah. Okay. And we'd have to know what...
the function that's being called. You have to know what the shape of the function is to figure out, to go and decode those. Yeah, but here's the actual line that crashed. So it's the dereference of this, indirect reference of that, that caused the crash. And that is a disassembly. So that's an x64 disassembly. If I was on an ARM machine, would it be ARM assembly? Yeah, this is actually 32-bit. This is that old, from an XP machine. Oh, because you pulled out a 2005...
I was wondering about the date on that, but I saw the EAX and I was like, okay, what would this be if it was 64-bit? So that's 32-bit. You can see x86. Yeah, there it is. Okay. But then what you would do, like you can say that it's pointing at, this is a likely culprit. Did it do like heuristics or did you just go back the frame? Yeah, heuristics. Okay. Here's the heuristics. Let's take a look at this frame. Oh, look, it's NT. That's me.
Okay. That's Windows. It can't be Windows. Oh, here, look. Something that's not Windows. That's very heuristic. Yeah. Stack frame minus one. Yeah, and then you can see information, crash test driver. So you can see this is... So you don't have the symbols, but if you did have the symbols for MyFault, you would be able to see those parameters, what they were, and exactly what function. It would be like debugging live. Right.
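The joke version of that heuristic (walk down the stack, skip anything that is Windows, blame the first thing that is not) can be sketched in a few lines of Python. This is only a toy: the real analysis is far more elaborate, and the frame strings, the `OS_MODULES` set, and the function names below are hypothetical illustrations, not output from the dump in the episode.

```python
# Assumption: modules this sketch treats as "Windows itself".
OS_MODULES = {"nt", "hal"}


def module_of(frame):
    # "myfault+0x403" -> "myfault"; "nt!KiTrap0E+0x2ad" -> "nt"
    return frame.split("!")[0].split("+")[0].lower()


def likely_culprit(frames):
    """Walk from the top of the stack and return the first frame whose
    module is not in OS_MODULES, or None if every frame is an OS frame."""
    for frame in frames:
        if module_of(frame) not in OS_MODULES:
            return frame
    return None


# Hypothetical stack, most recent frame first.
example_stack = [
    "nt!KiTrap0E+0x2ad",
    "myfault+0x403",
    "myfault+0x720",
    "nt!IopfCallDriver+0x31",
]
```

With `example_stack`, `likely_culprit` skips the `nt` trap frame and returns the first `myfault` frame. Without symbols for the driver, only the module name and an offset are available, which is why the frames for it have no function name.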
And that's the thing that I think people don't realize. A lot of folks that know that WinDbg exists but maybe haven't spent time with it don't realize that they could load these dumps into a Visual Studio-like environment and see a moment in time where the dump is preserved in amber, like a mosquito in amber from Jurassic Park, and they're looking at it at a moment. And then, of course, you could get into even more cool stuff like time travel debugging.
And then there's user mode dumps. More common. More common. And you debug those in your favorite debugger like Visual Studio. Or you can use WinDbg to debug those too. I'm going to show you a cool tool related to dumps that can help you capture your app when it's in states that you want to go look at, like high CPU usage or high memory usage if it's got a leak. Or if it's spiking every now and then, you want to catch it when it's spiking to understand what's causing it to spike in CPU usage.
or it throws a certain exception, you want to capture a dump. And the tool we're going to use for that is called ProcDump. It's also on Sysinternals. So if you just go back to Sysinternals. How is ProcDump different than me going into Task Manager, right-clicking on a process and saying create dump? So that is, you know, manually generating the dump on demand.
But what ProcDump does is lets you monitor the process to capture a dump at a specific point in time. I've got it installed here. And it comes with a slew of command line options. It's almost like conditional breakpoints. So here, like this -c lets you set a trigger, which is when the process that I point ProcDump at gets to this CPU usage, then generate a dump. And you can specify how long it needs to be at that CPU usage. You can specify what type of dump you want it to collect.
You can specify the format of the dump file, where the dump file goes. You can specify collect a number of dump files. You can specify that you want when CPU usage drops below a certain level. I mean, it comes with when the process is hung. The joke used to be that I put this in specifically for Outlook. Ouch. And actually I did because Outlook support team asked me to put it in for them.
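To make the "conditional breakpoint" idea concrete, here is a small Python sketch of a sustained-threshold trigger: it fires only once usage has stayed at or above a threshold for a given number of consecutive one-second samples. It illustrates the logic only and is not how ProcDump is implemented. The function name and the use of `>=` rather than a strict `>` are assumptions.

```python
def first_trigger(samples, threshold, seconds):
    """samples: one CPU percentage per second, in order.

    Return the index of the sample at which usage has been at or above
    `threshold` for `seconds` consecutive samples, or None if that
    never happens.
    """
    run = 0  # length of the current streak at or above the threshold
    for i, cpu in enumerate(samples):
        if cpu >= threshold:
            run += 1
            if run >= seconds:
                return i
        else:
            # Dropping below the threshold resets the streak.
            run = 0
    return None
```

For example, `first_trigger([12, 11, 9, 12, 13, 14], 10, 3)` returns `5`: the dip to 9 resets the count, so the trigger only fires after three more samples in a row at or above 10. A process hovering right at the threshold can therefore run for a long time without ever producing a dump.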
Oh, that's cool. But this is much more deterministic than just trying to right click and hit create dump whenever you feel that's right. So really right clicking and creating a dump from task manager is really only good for stuff that's clearly visibly hung. But this is far, far easier and far cleaner way to do things. So let's go generate a dump.
just to see this thing in action. So if I say -c and I say like 10, that means when CPU usage of the process exceeds 10% of the CPU of the system, then generate a dump. And I'm going to actually run a tool called CPU Stress, which is also a Sysinternals tool. This CPU Stress tool is designed just to do things like consume CPU. And so, if I do procdump -c 10 CPUSTRES, by default, you can see that when the CPU threshold goes above 10% of the system for 10 seconds,
then it's going to generate a dump. Here's the format of the dump name. Here's where it's going to go, into the tools directory. And what we're going to do now is create a bunch of threads in CPU Stress, select them, activate them, raise their activity, and then you can see... Oh, that's cool. It's right at the threshold right now, so it's dropping below every now and then. So if I create another one and set it to activate,
we'll probably stay above. Let me raise the activity level. Your computer's too good to make this thing get stressed out? Yeah. Because you need it to be above there for 10 seconds. Oh, look. We dropped below. Let me do another one. Work harder. Yeah. Okay, so it's thinking and it's... There we go. Dump initiated. It sounds very high-tech, doesn't it? It does. And now we can go back to WinDbg
and open that dump. Now, you could also theoretically open this in just Visual Studio, right? Yes, good. You can see here the thread, active thread here, stack.
There, CPU Stress all over. I think people get intimidated by these. I know that this looks just like the Matrix to you, and you just look at it and you can immediately identify stuff. You are fluent in this. But I think for a lot of application developers, they can follow Visual Studio's kind of visual debugger and spend some time in the stack
and looking at the stack. But I think that you as a fluent debugger expert maybe underestimate how few people spend time in this kind of place. But I think this is a good place to spend time. Yeah. It is. I mean, like I said, you're handicapping yourself if you're not familiar with at least the basics of this. Yeah. Now, I saw that it just went to the Symbol Server, like the public CDN for symbols.
Yeah. And brought symbols down. Those are going to be stored in your symbols cache. Now this, I mean, you can see here, by the way, which ProcDump inserted into the dump. The reason for the dump. Interesting. Is that a ProcDump thing? That seems like a uniquely ProcDump thing. I think people would be spoiled to have that otherwise. No, it's not. I mean, ProcDump is calling Windows APIs to generate the dump, and that API supports inserting a comment into the dump.
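For the curious, a comment like that lives in the minidump as its own stream, and it can be read back without a debugger. The Python sketch below walks the minidump header and stream directory and returns any comment streams it finds. The layout used (a 32-byte header starting with the `MDMP` signature, 12-byte directory entries, comment stream types 10 for ANSI and 11 for UTF-16) follows the documented minidump format. Whether ProcDump writes the ANSI or the Unicode variant is not stated in the conversation, so the sketch handles both. `make_fake_dump` is a hypothetical helper that builds a tiny synthetic file purely for demonstration.

```python
import struct

MDMP_SIGNATURE = 0x504D444D  # the bytes "MDMP", read as little-endian
COMMENT_STREAM_A = 10        # ANSI comment stream type
COMMENT_STREAM_W = 11        # UTF-16 comment stream type


def read_comments(dump):
    """Return the text of every comment stream in a minidump (bytes)."""
    sig, _version, count, dir_rva = struct.unpack_from("<IIII", dump, 0)
    if sig != MDMP_SIGNATURE:
        raise ValueError("not a minidump")
    comments = []
    for i in range(count):
        # Each directory entry: stream type, data size, data offset (RVA).
        kind, size, rva = struct.unpack_from("<III", dump, dir_rva + 12 * i)
        data = dump[rva:rva + size]
        if kind == COMMENT_STREAM_A:
            comments.append(data.decode("ascii", "replace").rstrip("\0"))
        elif kind == COMMENT_STREAM_W:
            comments.append(data.decode("utf-16-le", "replace").rstrip("\0"))
    return comments


def make_fake_dump(text):
    """Build a synthetic minidump holding one ANSI comment (demo only)."""
    body = text.encode("ascii") + b"\0"
    # Header: signature, version, stream count, directory offset,
    # checksum, timestamp, flags. The directory starts right after it.
    header = struct.pack("<IIIIIIQ", MDMP_SIGNATURE, 0xA793, 1, 32, 0, 0, 0)
    directory = struct.pack("<III", COMMENT_STREAM_A, len(body), 32 + 12)
    return header + directory + body
```

Calling `read_comments(make_fake_dump("hypothetical trigger text"))` returns a one-element list with that text. On a real dump you would pass the file's bytes instead, for example from `open(path, "rb").read()`.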
ProcDump inserts a comment about its trigger. And you can see here, it'll tell you the hottest thread. So if this is the thread that pushed you over the limit for the trigger, you can now go look at that thread in the debugger. Actually, do this. Do this. Take the dump that you just had, the CPU Stress one, and what does Visual Studio look like versus WinDbg, and why would I use one over the other? All right, let's see. And then we'll close out.
Okay, so what if I took that same dump file? Wait, wait, wait, wait. How do I? You literally just open it. Yeah. Open, open file. Yep. Watching you type slow is kind of like doing screen sharing with my dad, helping him debug his problems. You know what?
I'm like, isn't that in the clipboard? Can't you just control V your way to glory? I'm like watching him type C colon backslash. It's good that you have been so successful in your career as such a slow typist. I'm actually pretty fast. Is this a playback in slow-mo? This is a nice screen. This is attractive. Mini dump file summary. Look at that. I don't see the comment, though.
Which is lame. That's interesting. That feels like an opportunity for us there. You're absolutely right. Tell me what happened. I don't think Copilot's going to work on that one, dude. I don't think it sees many dumps right now. Nah, it's not going to see that. Try getting run diagnostic analysis, though, there.
Copilot for WinDbg, though, I think people have been talking about it. Managed debugging is not available for this minidump. Oh, because it's not a managed app. Yeah. It's not a managed app. Yeah, but you can go debug with native only. Yeah. All right, there we go. How this was unresponsive and terminal. Ask Copilot. That's cool. I forgot that they put that in there. Cool.
This error happens because it became unresponsive and was terminated by the operating system. In fact, not because it was terminated by ProcDump. But it couldn't know that. It just knows it got slapped by somebody. ProcDump injected this. Yeah, exactly. So you injected the hang to shut it down. Yeah. That's cool. All right, do you want to back up and do this again? I think this is good as it is. I don't feel anything. Are you making fun of me for typing slow?
Oh, that's hilarious. People love that stuff. That's going to get us more subscribers. Unless you're offended. I am offended, but, you know, whatever it takes to drive views. Do me a favor, go back over into Visual Studio again and hit View Disassembly. I'm just wondering, why would I pick VS versus WinDbg? Because I feel like there's things that are different in here
that are better, but I feel like I would probably end up using both of the tools. Yeah. And bounce between them. I mean, it's whatever you're comfortable with. You're just giving me nothing. What? You've just given me nothing. Yeah, you can't continue because you're frozen in amber. But look, it thinks you can. Well, it's giving you the illusion that you're in the middle of an active debug session, which is kind of a cool illusion.
Oh, look at this. Analyze. Oh. A unique thread? Yeah. Which is true. Look at that. Yeah. This is really good. Yeah. Here's the main thread. And then hit show external code at the bottom. And the call stack. Where's that? Down, down, down. A little bit more. What? Show external code. Oh. It's a button. There you go. So now you can see CPU Stress. Yep. And jump.
Boom, there it is. All right, so then you said other than John Robbins' book, which is quite a bit older. Well, there could be more recent books. It's just I haven't been paying attention to this space. Yeah, yeah, yeah. Learn. WinDbg. Getting started with Microsoft Learn. Getting started with WinDbg in user mode. Open Notepad. They actually have you debugging it in Notepad. Debugging Notepad, rather. So there's a whole tutorial over here
to go and do that inside of Microsoft Learn. So you can go into getting started with Windows debugging. They have both user mode and kernel mode. They have a whole thing about choosing 32-bit versus 64-bit tools, and they walk through an example where you actually stop Notepad, look inside of Notepad, and then run around and see what's going on. I didn't know that. What a great thing. That's up at Microsoft Learn, getting started with WinDbg. Cool.
Well, thank you so much for giving me that overview of WinDbg. I got to go back and re-familiarize myself with the state of the art in both kernel and user debugging. Have you ever used WinDbg? Oh, yeah. I was trying to play the foil so that you would feel smart and help me. You'd never doubt me. Do I have? Are you serious? Come on. I'm driving stick shift constantly. When's the last time?
It was the USB driver. It was a Chinese no-name Arduino thing that I was trying to debug. This was a week ago or whatever? No, this was a couple of months ago. And I was trying to figure out what was going on. Oh, I almost used it with TikTok. TikTok Live Studio. I wrote a blog post about this a couple of maybe a week and a half ago. It turns out that TikTok... installs a device on a virtual camera. And I was trying to figure out if it was like their own driver file or not.
It's called lsvcam, but it's a user mode thing, not a kernel mode thing. But I spent a little time in there and ended up just finding out what registry key they were using and deleted it. So I was close. I was close to doing some debugging, but didn't need it. I did use all the Sysinternals tools, though, to save myself. You did? Nice. I did. My most favorite one is... ZoomIt. ZoomIt.
Well, I do love ZoomIt, but I used to think about Filemon, Regmon, Procmon, but isn't there like a unified supermon now? Well, Procmon is Filemon, Regmon, yeah. Yeah. I'm so old school that I would open individual mons. Yeah.
Rather than the supermon, which is Procmon. But that's my most favorite thing in Windows, is to be able to go and see what process is writing to the registry and why. And then do all the filters and get them to highlight and go, that's the one. That's the one that saved me in this instance,
more than WinDbg. Yeah. There's a saying, when in doubt, run Procmon. That is true. And that's another thing, you know, I'll leave people with that, with that reminder that there's nothing being hidden from you.
There is always a feeling that the operating system is a big black box and I can't get in. But that is an illusion. That is a lie. If you want to know what's going on, go one level below your comfort zone. We've talked about this before. Everyone gets complacent. You know, if you're a web dev, you get
comfortable at ASP.NET and you forget that processes exist and ports get opened, and then when you get to the user mode, you forget that the kernel is a thing. So I always try to go one level below my comfort zone just to, like, get in the deep end of the pool, to mix metaphors. Nice. Cool. Thank you, sir. Appreciate your work. Over many years of taking dumps. We'll see you again next week.