Episode 383 – Securing Azure: Monitoring and observing your Azure estate | Microsoft Cloud IT Pro Podcast

00:03

Welcome to episode 383 of the Microsoft Cloud IT Pro podcast recorded live on August 23, 2024. This is a show about Microsoft 365 and Azure from the perspective of IT pros and end users, where we discuss a topic or recent news and how it relates to you. In this episode, we continue our discussion on security as we transition from Microsoft 365 to Azure. We kick things off with Azure Security Logging and Auditing before moving into Azure Monitor for

00:34

collecting, analyzing, and acting on telemetry data. We also explore how it can help you identify trends and anomalies to help with your threat detection. We got the Logitech MX keyboard, the MX mini. Are you enjoying it? I don't know. I have a problem. Sound very clicky clacky. It's not clicky. It is definitely not a clicky keyboard. It is better than the macOS keyboard except that it doesn't have my fingerprint reader. I do like the fingerprint reader on

01:04

my Mac keyboard. Like, they need to just sell a standalone fingerprint reader for logging in to, like, desktops. I too wish they would get there. But it's backlit, but it's definitely not like a NuPhy or a Keychron or a Clicky Mechanical. It's yeah. I would say it's a little bit more tactile than, like, the macOS keyboard, Little bit more key travel,

01:28

but not a mechanical keyboard. So if you like backlit macOS keyboards that are not, I would say better, and if you want space gray, they do not sell a space gray mini. I don't want the number pad. They don't sell a space gray one without the number pad. They only sell white. They don't have a black one with the number pad? They have a black one with the number pad, not a black one without the number pad. Go figure. Maybe we'll get one in September. Only pros use number pads. That's your

01:54

problem. You gotta upgrade. You gotta put more pro in the pro of Apple Pro. What do I need a number pad for? What don't you need a number pad for? It's there to like mishmash navigate. Turn it into like hot keys so that you can do, like, window management or something with it. When I hit 1, it goes to lower quarter. When I hit 9, it goes to the upper right quarter. No. That's what keyboard shortcuts are for.

02:18

I'm telling you. Who needs extra keys when I can push 5 keys at the same time? Use the numbers as a macro pad, and you'll be okay. That's what my stream deck is for, and it gets in the way of my mouse. Oh my goodness. Okay. Today. Yes. Real quick before we do that. I should be recording the correct mic. Do I not sound like I'm on the right mic to you? No. You're not. Really? Yeah. I blame Teams. I have an issue with Teams, though we're not getting into the topic today.

02:47

It has to do with its ability to select the right audio device even though the right audio device is selected. That's the Teams audio driver, Core Audio d. Horrible name because it's not actually Core Audio for, like, the system, but they named their audio driver Core Audio d for Teams, which is like, I don't know why you have a daemon with the same name as the other daemon that does the thing, and Microsoft's gotta Microsoft. But let's just talk about other ways

03:11

that Microsoft is gonna Microsoft. I don't know what other ways Microsoft will Microsoft. Loop 2.0 is out. That still doesn't allow you to secure a workspace with any form of group whatsoever. You're asking the wrong questions. The question isn't, can I secure it with a group? It's, why would I ever wanna do that? And then once you get over the hump of, you can't, you'll find another way through. Yes. You know what my other way through is

03:38

gonna be? I'm gonna go create an Azure Automation runbook that loops through all of my teams or my groups, my Microsoft 365 security groups. It's going to look at the name, it's going to create a corresponding workspace in Loop, it is then going to iterate through every member in my group, compare it with every member of my workspace, and then add or remove users appropriately, running every 10 minutes in an Azure Automation runbook. No. I'm not gonna do that. Super

04:08

easy. Barely an inconvenience. Workspaces should be securable by group. End of story. Especially given that there are now notes for Teams meetings. If I'm gonna do a Teams meeting and my meeting notes are going into a Loop component, said Loop component should be able to be assigned to a workspace so that said meeting notes can be in the same workspace with all my other Loop things for

04:33

said client. I mean, the way channels work, they can aggregate meetings and you can have multiple meetings in a channel, but it's technically still all in the same team and still Yes. Still in the same channel? Wouldn't that make sense? And then secured by the same group, security group,

04:47

yes. You tried to tell them how to fix the problem, not that it was actually going to be fixed, but let me know how much your Azure Automation costs you to run every 10 minutes. I think I can get it in for free. I figured out the math once. 500 minutes, if it runs for 1 minute, I can run it how many times a day. We're gonna have to figure that out, ask Copilot. So have you no. You're Scott, I've been up since 3 AM teaching, and I warned you I'm gonna be all over the place.
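The "how many times a day" math from the runbook joke can be sketched quickly. This is only a back-of-the-envelope calculation under stated assumptions: that the 500 minutes mentioned is a monthly free allowance of job run time, that each run counts as one full minute, and that a month is 30 days. None of those details are confirmed in the conversation, so check the current Azure Automation pricing before relying on it.

```python
# Assumptions (not confirmed in the episode): the 500 free minutes are per
# month, each run is counted as one full minute, and a month has 30 days.
FREE_MINUTES_PER_MONTH = 500
MINUTES_PER_RUN = 1
INTERVAL_MINUTES = 10
DAYS_IN_MONTH = 30

# How often the runbook would fire on a 10-minute schedule.
runs_per_day = (24 * 60) // INTERVAL_MINUTES

# Total job minutes that schedule would consume in a month.
minutes_per_month = runs_per_day * MINUTES_PER_RUN * DAYS_IN_MONTH

# How many one-minute runs per day would fit inside the free allowance.
free_runs_per_day = FREE_MINUTES_PER_MONTH // (MINUTES_PER_RUN * DAYS_IN_MONTH)

print(f"runs per day on a 10-minute schedule: {runs_per_day}")
print(f"job minutes per month: {minutes_per_month}")
print(f"runs per day that fit in the free allowance: {free_runs_per_day}")
```

Under those assumptions, a 10-minute schedule would use far more than the free allowance, so "free" would only hold at a much lower frequency.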

05:12

I keep trying to get you back into it. Thoroughly enjoying this. Back on topic. So back on topic, we chatted in the past about security, and we started getting into securing the modern workplace through the lens of identity and things that are available within Entra ID. So we talked about things like security baselines for identity within Entra and conditional access and some other stuff, and that kinda naturally leads into a conversation, which we had last time, about

05:44

Microsoft 365, Office 365 workloads. Just because you're in that identity space, you're all tied in, you're already in SaaS land, in that software as a service land. So you've got identity as a service, and you have software as a service things that exist out there like SharePoint Online and Exchange Online that are dependent on this identity as a service in the form of Entra ID. The other thing that Entra ID governs and becomes the identity store for is also Azure.

06:15

So we should really talk about the Azure side of the conversation, which isn't going to be constrained so much to identity. Identity is a component of it. Right? So, like, something for conditional access, the way conditional access can apply to your

06:32

SharePoint Online tenancy. Conditional access can also apply to things like the Azure portal or potentially workloads that you stand up, say, like a custom website that you secure with Entra ID and you're driving, like, OAuth authentication through there, then that's available to you as well. So if you're interested in, like, the ID side of it, you can go back and listen

06:52

to that episode. But for this one, I wanted to just talk more about Azure security in general, which starts to get a little bit weird because security is an open-ended conversation anyway. And then it gets even weirder because it's Azure, and what is Azure, but a set of components that all sit behind this governance scheme of things like subscriptions and management groups and ultimately that identity construct in Entra ID.

07:19

So, yeah, we should talk a little bit about Azure and some of the other things that sit over here. So things like security we talked about security baselines for identity. We should talk about security baselines for Azure and what those mean, and how as we start to decompose out of identity as a service and software as a service, and we get more into, like, platform as a service, infrastructure as a service components that are out there, what do we do, and how do we

07:42

think about that? Yeah. And real quick, because you mentioned identity too, it made me think of this. Just a heads up with identity, there is also a new identity in Azure. Starting in October, so, like, a month from when you'll probably hear this episode if you listen to it on release day, Microsoft is actually making MFA mandatory for all Azure users. Your break glass accounts. Come up with a plan for your break glass accounts because they are going to have to be MFA'd.

08:14

A great way to think about that is our friend, Mr. YubiKey or a FIDO key. Since you do have to have MFA, tying that to my admin today's phone number might not be the best way to go about that. Consider things like FIDO keys and plan for that because now you're going to have that for your break glass accounts as well. That I think that's probably more the automation and how am I getting in on the back end if something does go wrong scenario.

08:42

Yep. Because this is also then gonna roll out early 2025, gradual enforcement of MFA for Azure CLI, Azure PowerShell, Azure mobile app, infrastructure as code tools, like, everywhere. And with the YubiKeys, like, I would say see, this gets a little tricky because you also don't wanna just tie to one YubiKey because have you ever misplaced a YubiKey and then you can't log in? It's let's do this for multiple YubiKeys and probably have multiple administrators

09:11

of your Azure environment have these YubiKeys. Because I'm with you. I've had it tied to a phone or tied to my authenticator app in my phone and then you lose your phone or you reset your phone. That MFA does not come back reliably in my experience with the authenticator app. I don't know if you start switching to passkeys. Yeah. I like YubiKey for it. Phishing resistant,

09:33

secure. But that is one thing I did want to mention that people should absolutely start preparing for kind of in that whole Azure security and identity vein that has been announced. To your point of dual enrollment, even with a single key, you might wanna consider dual enrollment depending on, like, the devices that you might need to actually use to access. For example, I have a USB c YubiKey with NFC on

09:59

it. I've enrolled both the NFC component and the USB c component, and that lets me use USB c, like, on my desktop or my laptop when I need to get in. But if my desktop or laptop aren't available and I still have the key on me because it's on my Keyring, then I can still get in via my phone, which does support NFC. And it has to listen. There's the call out to multiple keys also to minimum every time. Any piece of hardware can die. It it

10:25

can crash out. It can crap out. I've had the YubiKey, particularly like the nano YubiKeys, the ones that stick in, the little nubbin. Those things seem to die on me all the time, and I don't know if it's just cause they get so tight in the USB c ports and something goes wonky when you're trying to pull them out. Whatever it is, like, those things have not been reliable for me. So I've completely moved over to, like, USB c plus NFC kinds of things so that I can make my

10:49

life just a little bit easier. Also, consider that if you're doing FIDO keys, they're going to come in various forms of connectivity, not just USB C and NFC, but also USB A. And what's the device I need to log in on? So you might actually want to have 2 or 3 of these, especially in the cases of, like, break glass accounts because it is super important kind of stuff. Like, it's worth investing $30 in a couple of keys and then distributing them out there and getting them to where they need to

11:16

be. 100%. I'm with you. I have 3 keys that I enroll in almost all of my services when I use FIDO keys, and they're for sure all enrolled in Azure AD. Same thing as you. One's USB a, couple are USB c, one of them has NFC as well. Multiple options, they're not all in the same spot. So if I lose one, I can go look for 1 in another spot and all that. So moving on from identity and security to where you started to take me is security all up in Azure. Where do you

11:46

wanna start with? Is this a broad, big, large topic? I think it's helpful to start at the top, consider what's the landscape of things that's available to you. So I I think one of the reasons this topic becomes so broad is you could say, I don't know, something like I'm a Azure SQL customer. Yep. And I'm running SQL inside of Azure Virtual Machines. I'm not running like SQL as a PaaS service, I'm running it as IaaS in a virtual machine. And in that world,

12:23

you're running virtual machines. And how do you secure virtual machines, and what does that look like? And that's a pretty constrained conversation. Maybe that takes us back to the CrowdStrike thing. But if you take a step back from that and you say, hold on, like, before I was a virtual machine customer, what was I? Oh, I was an Azure customer. So what are the things that are available to me in Azure that I should go think about lighting up and turning on and enabling based on my scenarios?

12:51

So in my mind, there's no reason that you would treat Azure in any way differently than you might like your Microsoft 365 environment. Right? So when you go into your Microsoft 365 environment, you don't first go into Exchange Online and start configuring security around Exchange Online. You start at the top and you say, oh, I'm an M365 customer. What can I turn on that gives me logging across the suite? And, hey, there's the audit log. So let me go start to light that stuff up.

13:23

What can I do to get visibility into the wider swath? And I think that's a good way to start with Azure as well because we've tied into the identity piece. So we talked about that and getting in, you can report on your sign in logs and start

13:38

to understand that stuff. And then you take that next step down to Azure, and before you even get into the services, there's this common base layer for Azure customers in the form of Azure Resource Manager and like the modern API surface, which is different than that old Azure service management service. At the ARM layer, you have these things like activity logs

14:04

that are available automatically. Right? So for every Azure resource that's deployed, that you interact with, in certain cases that you perform like listing operations with or things like that, you've got the activity log there by default. And the activity log is integrated directly into a service, called Azure Monitor. And what does Azure Monitor give you? Azure Monitor gives you insights into metrics about your environment. So metrics being numbers.

14:34

Right? Number of API calls, number of errors, number of sign-in attempts, like whatever it happens to be given the service you consume. So we start at the ARM layer, we start at the control plane. I think the first place we look is activity logs in general, because as you're getting into Azure, like you light it up and you have no resources, the very first resource you create let's say

14:59

you create a virtual machine. So you're gonna go into the marketplace, and you're gonna click, I wanna create a new resource, you're gonna search for virtual machines, and you're gonna hit that blade. Ultimately, you're gonna submit that deployment, and that deployment, the act of submitting that deployment and having it ingested by ARM, captures that entry in the activity log, and then

15:18

it's there for you. And then later when you come back and you interact with that virtual machine, let's say you come back and you add a disk to it, you're modifying that deployment, then that's captured in the activity log for you. And like I said, there's a whole slew of information that's captured in the activity log in general, but it's there for 90 days, 93 days by default.
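As a rough illustration of what that looks like in practice, here is a minimal sketch that filters a handful of activity-log-style records for write operations against one virtual machine and checks which ones still fall inside a 90-day window. The records, field names, resource names, and callers are hypothetical and heavily simplified; real activity log entries carry many more fields.

```python
from datetime import datetime, timedelta

# Hypothetical, simplified records (not the full activity log schema).
entries = [
    {"time": "2024-05-01T09:00:00+00:00",
     "operation": "Microsoft.Compute/virtualMachines/write",
     "resource": "vm-demo", "caller": "ben@contoso.com"},
    {"time": "2024-08-10T14:30:00+00:00",
     "operation": "Microsoft.Compute/virtualMachines/write",
     "resource": "vm-demo", "caller": "scott@contoso.com"},
    {"time": "2024-08-11T08:15:00+00:00",
     "operation": "Microsoft.Storage/storageAccounts/write",
     "resource": "stdemo", "caller": "ben@contoso.com"},
]


def writes_for(resource, records):
    """Return create/update (write) operations recorded for one resource."""
    return [
        r for r in records
        if r["resource"] == resource and r["operation"].endswith("/write")
    ]


def still_retained(record, now, retention_days=90):
    """True if the record is younger than the assumed retention window."""
    age = now - datetime.fromisoformat(record["time"])
    return age <= timedelta(days=retention_days)


# A fixed "now" keeps the example deterministic.
now = datetime.fromisoformat("2024-08-23T00:00:00+00:00")
vm_writes = writes_for("vm-demo", entries)
retained = [still_retained(r, now) for r in vm_writes]

for record, kept in zip(vm_writes, retained):
    print(record["time"], record["caller"], "retained" if kept else "aged out")
```

The older write, the original deployment in this made-up data, has aged out of the default window, which is the practical reason to export the log somewhere if you need to answer "who created this" more than about three months later.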

15:40

It's basically 3 months of data, both Azure Monitor for metrics and your activity log as well that you just have out of the box, ready to go. You don't have to turn

15:50

anything on. You don't have to configure anything, but you do have to know that it's there and that it's available to you because what's going to happen is, for that activity log, any interactions with the control plane, so things that happen through the API surface of Azure Resource Manager, that would be your CRUD operations for these resources, your creates, your reads,

16:11

your updates, your deletes. All that stuff is going into the Azure Activity Log, and it's just there, ready to go for you, which is nice. You didn't even have to turn it on. It's always weird to me that you have to turn on the audit log in Office in M365, but I get why they do it, but it's well, just turn it on by default. It's Right. They had a security incident. Right? That compelled them. That made them turn it on, but you're right. For a long time, multiple years, it was not on by default.

16:39

And I said the same thing. I'm like, that's stupid. It should just be on. Anyways, yes, it is. It's on. It's there. Use it. Step 1 is recognize the activity log is there. One of the things that I mentioned was the activity log is integrated with Azure Monitor. Azure Monitor is this whole Azure service that is a crosscut, and it allows ingestion from a bunch of other Azure services. But the goal is to give you access to that observability data. That comes in the form of the activity log, which is

17:16

effectively strings, right? We're talking about like words that go in, this was a create event, here was the name of the resource, this was the UPN that initiated that call, so you can do things like track, okay, was this resource created by a service principal? Was it created by a real human? Which human was it created by? And how does that start to tie back to your identity environment? That's all there for

17:38

you. And the other thing that it gives you is the observability part of the platform is it also gives you access to metrics. So you're probably, like, once you start deploying Azure services, and and it's not directly related to security, but it's important to know that it's there, is you're probably gonna wanna think about ways that you can tie metrics into that and things like metric alerts. Let's say I deploy

18:05

a storage account. So I deploy a storage account, one of the things that I might want to monitor as a storage customer is how many transactions or how many errors am I driving? Like, how many transactions are erroring out? What's the class of the error? Is it a 50x error, like a throttle? Is it a 40x? Maybe it's 403s or 404s, things like that. So that's all tracked

18:31

in monitor by default. You can just go into monitor, and you can select a storage account, or you can even graph multiple storage accounts if you want to. I think you can put up to 200 individual storage accounts in a single Azure Monitor workbook for crosscut reporting, and you can immediately get those insights in a visual form. Or you can go the next click stop and you can say, hey, here's the things

18:53

that are really important to me. Like, I might wanna know when I have more than n 503 errors in a given period. So in the last hour, if I had more than a thousand 503s, maybe I've got some segment of throttling going on that's a little bit higher than

19:08

I want it to be. So you can do things like configure alerts on top of that metric or those set of metrics, or you can start to combine the logs and metrics inside of alerts as well, and and you can start to layer that information together a little bit along the way, if that makes sense. Some people are like, yeah, platform metrics, my storage account, my

19:29

CPU resources, that type of stuff. Maybe not security, but I also think as you start developing a baseline for how these should behave, a storage account or a VM that you're monitoring these platform metrics on, and you set those alerts to alert you when you stray outside of your baseline,

19:46

it could indicate a security event. Like, someone got access to your storage account and started uploading and downloading a whole bunch of junk to it or somebody got into a VM and is running a bunch of processes on it that are and you see it because your VM CPU spiked and is running way higher than it normally runs. The SQL resource or SQL platform metrics, all of those things, I think, are very valid, and it's something that people would monitor

20:11

anyways on prem. If you're running a VMware host or some other servers on prem, you're typically monitoring CPU consumption, memory consumption, storage, disk, IO, all that type of stuff. So why would you stop monitoring it when you go to Azure? Because, ultimately, Microsoft doesn't care if you fill up a storage account or if you incur a bunch of costs or have some type of security incident that's really affecting

20:40

your data. Right? It's still that shared responsibility model of I'm running a VM in Azure. I'm still responsible for patching my OS. I'm still responsible for antivirus on the VM. Maybe even storage accounts, like, getting into some of the other security options maybe that we'll talk about, is what files are in my storage account that I have up in Azure. So I think from that perspective, like, platform metrics can still very much be an indicator of

21:09

security issues or other issues. I have a lot of conversations with customers in this area. There's both the control plane and there's the data plane. So there's interactions that are driven through Azure Resource Manager, like, in the case of a storage account, I wanna create a storage account. Now once your storage account is created and you do things like you stand up a container and you upload an object into that container, the upload of that object, the creation of

21:38

that container, those are data plane operations. So we've actually crossed over to a new API surface. But the thing to remember here is, like, in the context of, say, a storage account and just about every service I can think of off the top of my head that's in Azure, even though it's got a data plane to it, that data plane information was still available and monitored to me

22:01

alongside my control plane information. So it's a kinda have your cake and eat it too scenario, and it's important to recognize, like, where those things are coming in and how they're getting there and where they're coming from. Because in that scenario where I described monitoring a storage account for things like number of errors, those errors, those HTTP error codes like that 404, 403, those 50x errors, those are ultimately data plane metrics. But I didn't have to

22:29

do anything to turn them on. Like, they were just there for me out of the box, and they were presented to me through that observability layer in Azure Monitor where it was automatically combining

22:40

my control plane and data plane. So while Azure Monitor itself is not a security service, it has these components that are giving you all the observability and alerting capabilities, which, leveraged the right way, become a component of your strategy for thinking about how you monitor and secure your environment. Because a lot of security constructs

23:04

do come down to things like observability. If you think of even like the basics of an AV client, an AV client is there to watch for heuristics and new things to load and try and block them when they're malicious. So it's its own kinda like little observability

23:18

container. And, yeah, once you start to walk down that path and recognize that crosscut is there for you, I think it makes things a little bit easier, and it opens up that mindset of, how do I get out of just observability mode into translating my security needs into things that observability is going to inform and where does this pull me to? Do you feel overwhelmed by trying to manage your Office 365 environment? Are you facing unexpected

23:51

issues that disrupt your company's productivity? Intelligink is here to help. Much like you take your car to the mechanic that has specialized knowledge on how to best keep your car running, Intelligink helps you with your Microsoft cloud environment because that's their expertise. Intelligink keeps up with the latest updates in the Microsoft cloud to help keep your business

24:10

running smoothly and ahead of the curve. Whether you are a small organization with just a few users up to an organization of several thousand employees, they want to partner with you to implement and administer your Microsoft Cloud technology. Visit them at intelligink.com/podcast. That's intelligink.com/podcast for more information or to schedule a 30 minute call to get started with them today. Remember, Intelligink focuses on the Microsoft cloud so you can focus on your business.

24:46

So that covers activity logs and platform metrics, both collected by default, both of which you have access to. There's one other type of kinda core log that I would highly recommend you start monitoring, but this one is not on by default. Activity logs is I'm going in and I'm creating storage accounts, I'm creating VMs. Platform metrics, the metrics we just talked about. Resource logs are actually then insights into operations that are performed

25:17

within the Azure resource. So actions that are performed within a storage account, uploading, downloading files, actions within a VM. You can pull out event logs from VMs, actions within the SQL database. You have to turn on your diagnostic settings in these individual resources or potentially use something like Azure policy or something else to turn these on and then figure out where do I want these

25:46

logs to go. And these are going to very much vary resource to resource based on what types of logs you're collecting. Logs within IIS are very different than logs within storage, are very different than logs within VMs. You can pick and choose too. There's multiple logs within all of these. You can go in and say, I just want all the logs from this resource. Or based on the resource, you can go in and pick, I just want this certain type of information, these

26:12

certain logs from this particular resource. And usually, you have different options. You can go into storage accounts with these. Scott would love it if you put all your logs in his storage accounts. Typically, I think what people tend to do, and this is what I tend to do, is send these into log analytics. You can also I can't remember. What are the are there 3 or 4 options? Log analytics, a storage account, queues? There's 3. Log analytics, storage account, and event

26:39

grid. Yes. So you can send them back through through eventing. Taking a step back, when we talk about monitoring and observability and then tying observability back to things like what do I want to monitor to think about the security of my environment, You need to start answering those questions like service by service or area by area. So for a storage account, I could be really interested, like I said, in things like number of errors on a given subset of transactions that exist

27:10

out there. So that's my class of, hey, what are my 40x errors for GetBlob requests? Like, how many unknown requests am I getting? Is somebody randomly hitting me? Once you start getting to the place where you're like, show me the user who did that thing, or show me the URI for the resource that was impacted by that event,

27:30

those are logs. So a good way that you can rationalize this in your head, or at least it's a way that I found to work for me, is metrics are always numbers, and those are always gonna be in Azure Monitor, and they're always gonna be there for free for those 3 months. Outside of that, it's on you to retain them for longer. Once you start talking about things like, oh, was it a UPN? Was it a URI? What was the operation that was called? Things like that. Those are logs, and logs are strings.
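The numbers-versus-strings rule of thumb can be made concrete with a small sketch. The request records below are hypothetical and simplified, not the real storage resource log schema, and the alert threshold simply echoes the "more than a thousand 503s in an hour" example from earlier in the conversation.

```python
from collections import Counter

# Hypothetical, simplified request records for one storage account.
requests = [
    {"operation": "GetBlob", "status": 404, "uri": "/invoices/a.pdf", "caller": "anonymous"},
    {"operation": "GetBlob", "status": 200, "uri": "/invoices/b.pdf", "caller": "app@contoso.com"},
    {"operation": "GetBlob", "status": 403, "uri": "/private/keys.txt", "caller": "anonymous"},
    {"operation": "PutBlob", "status": 503, "uri": "/uploads/c.bin", "caller": "app@contoso.com"},
    {"operation": "GetBlob", "status": 404, "uri": "/invoices/z.pdf", "caller": "anonymous"},
]

# Metric-style question: the answer is a number.
# How many GetBlob requests failed with a 4xx status?
status_counts = Counter(r["status"] for r in requests if r["operation"] == "GetBlob")
get_blob_4xx = sum(count for status, count in status_counts.items() if 400 <= status < 500)

# Metric alert-style check: is the 503 count over a chosen threshold?
THRESHOLD_503 = 1000  # illustrative value only
throttled = sum(1 for r in requests if r["status"] == 503)
alert = throttled > THRESHOLD_503

# Log-style question: the answer is strings.
# Which URIs did anonymous callers probe and get rejected on?
probed_uris = sorted(
    r["uri"] for r in requests
    if r["caller"] == "anonymous" and r["status"] in (403, 404)
)

print("GetBlob 4xx count:", get_blob_4xx)
print("503 alert firing:", alert)
print("URIs probed anonymously:", probed_uris)
```

The first two answers could come from platform metrics alone. The last one needs per-request detail, which is the part that requires resource logs to be turned on.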

28:00

And if it's a string, it's always gonna be in a resource log if it's coming from, like, a native service perspective. So if you're just looking to rationalize, hey, do I need to turn on resource logs? If the thing you want to interrogate sounds and feels like and it is a string, yeah, you need resource logs to make it work. We

28:17

talked about the activity log for ARM. That absolutely is a bunch of strings that you can go in and query through Azure Monitor and light up scenarios around that without needing to enable resource logging on a given resource and standing up a log analytics thing or dumping your logs to a storage account and then worrying about, how do I download it? What's the schema of the JSON file? All those kinds of things that come along with it along the way. Yes.

28:43

Absolutely. Your audio wasn't going through the Discord. That was my fault. Yeah. Teams. It goes back to my Teams issue at the beginning. But, yeah, absolutely. Excellent points on all of that. Trying to think. Activity logs. We got platform metrics. We have resource logs. Which did you know? We have all that log data, Entra. Yes. And I'm gonna tie this back. The log data

29:03

in Entra has been there. I noticed it with global secure access and some of the enhanced logging you can do now, because this is a requirement for enhanced logging, is you can actually configure Entra diagnostic settings. So these resource logs we just talked about, Entra has resource logs where you can go set up Entra to send its essentially diagnostic

29:28

logs into log analytics as well. Mhmm. I don't know how long that's been there, but I haven't always had that set up because I haven't had a need to until I was playing with Global Secure Access. So for these things like Monitor, Entra ID, or like the activity log, Entra ID logs, things like that, where they're there by default, but maybe they're not pumping their data out to

29:48

another source like Log Analytics. The reason that you start to pump the data out to Log Analytics, or maybe you go and you look at security specific offerings like Sentinel or things like that, that can ingest that data for you, or they can act as a target or a sink for that data, is because you wanna retain it longer, or because you wanna start to get that crosscut visibility and insights into, do I need to correlate an event that happened in Entra ID to an event that

30:14

happened in the activity log to ultimately an interaction on the data plane in one of my resources. That might be something like, do we want the ability to track the malicious sign-in? Say there was like a phishing hack, and somebody got in with some admin credentials to your Entra ID tenant, and then from there, they created another user, they created a service principal, and they gave that some kind of elevated rights in your environment, and then they connected with that user. Great.

30:40

The login attempt up here is in Entra, and then the other interaction now is down in the activity log, and then all the way down to what they did while they were there. Oh, I can see that they went into my storage account, and they changed the configuration of a container on my storage account, and they enabled it for public access, like public anonymous access. And then I tie that back to a metric, and I see that they egressed some data or something like that along the way. You know, so it's, like,

31:09

a little convoluted, I think, if you're not, like, deep and, like, familiar with it and you haven't touched it all. But once you start to touch it all, it starts to make sense because you're just decomposing it into the various layers. So really, you're walking through, like, the various API surfaces, right, like, Entra and Graph, down to Azure and ARM, and the control plane, ultimately maybe back to, like, data plane within an Azure service, if it happens to have a data plane.
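The layer-by-layer walk described above can be sketched as a simple merged timeline. Everything here is hypothetical: the events, actor names, and field names are made up for illustration, and the real Entra, activity log, and resource log schemas each look different and carry many more fields. The point is only that once the layers land in one place, correlating them is largely a matter of tagging and sorting.

```python
from datetime import datetime

# Hypothetical, simplified events from three layers (deliberately unsorted).
entra_events = [
    {"time": "2024-08-20T02:20:00", "actor": "admin@contoso.com",
     "what": "created service principal sp-temp"},
    {"time": "2024-08-20T02:14:00", "actor": "admin@contoso.com",
     "what": "sign-in from unfamiliar location"},
]
activity_log_events = [
    {"time": "2024-08-20T02:40:00", "actor": "sp-temp",
     "what": "enabled anonymous access on container exports"},
    {"time": "2024-08-20T02:25:00", "actor": "admin@contoso.com",
     "what": "assigned elevated role to sp-temp"},
]
resource_log_events = [
    {"time": "2024-08-20T02:55:00", "actor": "anonymous",
     "what": "GetBlob /exports/customers.csv"},
]


def build_timeline(**layers):
    """Tag each event with its source layer, then sort everything by time."""
    merged = [
        {**event, "layer": layer}
        for layer, events in layers.items()
        for event in events
    ]
    return sorted(merged, key=lambda e: datetime.fromisoformat(e["time"]))


timeline = build_timeline(
    entra=entra_events,
    activity_log=activity_log_events,
    resource_log=resource_log_events,
)

for event in timeline:
    print(event["time"], event["layer"].ljust(12), event["actor"], "|", event["what"])
```

In a real environment the sorting and joining would more likely happen in a query over a central log store than in a script, but the shape of the investigation is the same: identity layer first, then control plane, then data plane.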

31:37

Not all services have data planes. Some of them only have control plane interactions. It's hit or miss depending on the service, so you do have to know what you're going for service by service. And someone in Discord was just asking about, like, being able to track who created, like, a function app and when they created it and stuff and where those would be. That would be more

31:57

of that Azure activity log. That's gonna be logging more of those control plane operations, like creating resources, which we had talked about a little bit earlier. But I do agree with everything you just said about logging, and that's why I personally like logging all of these to Log Analytics. We'll probably start talking about more security stuff in future episodes.

32:17

To your point that it's these different APIs, it's these different control planes, but then somebody gets into Entra, gets into a storage account, finds a file out there that has other login data, and then they go log in to SharePoint and pull information out of SharePoint. In my opinion, it's really important when you have this wide range of things to, like you said, bring all of those logs into

32:40

one central spot. So when you are investigating an incident or doing some type of threat hunting or if you wanna set up some type of alerting, all these logs are together, and you're not jumping between the Entra ID sign-in log and then over to the SharePoint audit log and then over to the Azure activity log and then jumping back into your diagnostic logs

33:02

for your storage account. It gets really hard to correlate everything when they're all in these disjointed places, and that's where I think Log Analytics and Sentinel, if we talk about Sentinel in the future, really give you a good security benefit when it comes to being able to pull all of this together. So it's an interesting conversation. I have it with customers quite a bit. There's the class of customers that is scrappy.

33:33

I absolutely understand the motivations, where they're coming from, and they want these things to be free. In that world where they want them to be free, they kind of stick with the free offerings. So they do these things where they end up building bespoke tooling, say, to use the Graph PowerShell cmdlets to query the sign-in logs and things like that from Entra. And then they go and they wire up a CLI command to extract metrics from over here, and they build these.

34:05

It's gonna come across as, like, derogatory. I don't mean it to be that way. You end up building these, like, Rube Goldberg machines that are just these mishmashes of things. And they accomplish their goals in many cases, like, they figured it out and they learned how to get there, and they think in their heads they did it on the cheap. But the reality is they had to invest all the time into building that bespoke tooling and doing all those things. And ultimately, you

34:28

still need to spin up the compute. And the compute is what costs you the money in these scenarios. It's, okay, I need to wire something up that can be online and available with networking and disks so it can talk to that other thing and bring it down and ingest it. I caution folks about this, because they get into their own heads very quickly. They look at the pricing of Log Analytics or Sentinel or things like that, and they go, too much money. I can just go send my engineer off, and they can take

34:51

3 months and build it for me. Your engineer that just built that thing for 3 months, depending on their salary, you might have been able to buy a year of Log Analytics at your ingestion rate. It all depends on your environment, like what rate you have and rate of change and churn and retention and those things that you want

35:09

along the way. But I think if you take a step back and you peel it back, like some of these things that look like they cost a lot of money when it comes to security, they're like a net wash when it comes to resourcing and operations and things like that. You do have to keep those things in mind. Oh, yeah. Another interesting question in the chat. So if you use a third party SIEM instead of Sentinel, would you consider Log Analytics aggregation doubling up on a function? So

35:37

you don't have to... Maybe. I think that's the... yeah. But I think this is another misconception, like, customers feel that they're locked in, and I get it. To a certain degree, you are locked in. Right? If you're an Entra customer, absolutely. You're using Entra. You really don't have another choice there. That is the identity provider for Azure and M365. So you can do things like have your IdP, Okta, or whatever, as the relying party, but the reality is you're using Entra

36:04

at the end of the day. Some stuff you can't fight. Some of these things you can. Do you have to use Sentinel? No. Do you have to use Log Analytics? Absolutely not. You can do things like build out that integration for those diagnostic logs, which can include metrics in them. They're not just resource logs. You can pump out the metrics as well. You can build out those integrations around things like Event Grid and pump those out

36:26

to any system that you want to. I have no expectation that every customer who comes to me who says, I'm a Splunk shop, is immediately going to go and flip the switch and learn Kusto and start doing Sentinel or Log Analytics tomorrow. That's not good for them. They're already a Splunk shop. Or you

36:43

might have another solution that's out there. So 90-ish percent of the time, and it's almost 100% of the time with, like, the really large providers out there, they're going to have some integration that is going to be able to hook into Event Grid and the eventing system. That's a core service within Azure. Right? It's available. It's scalable. It will do all those things for you, and it would allow you to put that data out to where it needs

37:08

to be. Your answer of "maybe" is classic consulting 101, but very true. There might be pieces of functionality that you could absolutely replicate and do yourself. It's just, to my point earlier, it's not

37:21

worth your time, right? Like the engineering effort that you'll invest to duplicate some of these solutions when they've already been done and built for you, I really do think you need to have that kind of rationalization moment of, do I want my engineer to take a day, a week, a month, a year, whatever it is to build this thing, or can I just buy it and be done with it? Let's be honest, like you're in a pay-as-you-go service anyway, everything's not gonna be free for you. Be smart and pick

37:48

and choose where you're spending your money. Yeah. And, right, it depends, too, because we're looking at this from a security lens. So I would say from a security lens, if you are sending everything to Log Analytics and everything to a third party SIEM, yeah, you're probably doubling up on a lot of stuff that you don't need to be. That being said, Log Analytics is also not just security. So I actually have some clients I've worked with as well that

38:14

have 2 Log Analytics instances: 1 that they send everything to from a security standpoint, or even certain resource logs that they wanna look at from a security perspective, and then they have another one for, like, operations because, for instance, Azure App Service.

38:31

If you're hosting a website in there, part of the resource logs or those diagnostic logs are also used for Application Insights for doing things like tracking website visitors and response time on your website, and those are things that operations, developers, your SEO folks, if they care about response time, certain people, depending on their role, may actually want some

38:55

of these logs. So you may still want a certain aggregation in Log Analytics, not from a security perspective, but from an operations monitoring perspective or from a debugging perspective or from a we just wanna know if there's an error on one of our Windows servers perspective, again, going back to some of the operations. So I think from a pure security perspective,

39:17

yes. There can absolutely be use cases for sending data to both, or even, again, sometimes sending data to 2 different Log Analytics workspaces. We mentioned the diagnostic logs. When you configure those, you can send them multiple places. You can send them to 3 or 4 different locations. It's not, I'm only going to send these logs to 1 or the other. Yeah. It's like level of effort and things like that, and I was saying Event Grid the

39:44

whole time. It's Event Hubs, mea culpa. So the other thing to keep in mind too is if you're a customer who's coming to Azure with an existing solution and you're, like, looking at this space and you're going, yeah, I know there's some native stuff, but maybe you are that customer who has a third party SIEM or something else. Maybe you use, like, a different firewall. Like, you're out there using FortiGates and you're, like, Azure Firewall is not my thing.

40:10

Those options are often available to you as well. So it's not like you have to completely ditch the ecosystems that you're in today. There's a pretty wide swath of partners and ISV solutions that are available to you. So like in the monitoring space, Elastic's there. So you can do like Elastic integrations, Elasticsearch, things like that. Datadog is another

40:36

one that I run into a lot. Like, I end up working with a lot of our cloud native customers and things like that, like custom dev shops who are spinning things up. So I see a lot of that. You can do things like Event Hubs has native Kafka integrations. So if you're pumping data out and you wanna pump it through Kafka and then ingest that over in, say, Databricks or do some Spark analysis on top of it, that's just all available to you. It's all possible. It's there. The hooks are there.
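To make the Kafka hook a little more concrete, here is a minimal sketch of the client settings commonly used to point a Kafka consumer at an Event Hubs namespace's Kafka-compatible endpoint. The helper name `event_hubs_kafka_config` and its parameters are hypothetical; the keys follow the librdkafka naming that clients such as confluent-kafka accept, and the namespace and connection string shown in the usage are placeholders. Treat it as a starting point to check against the current Event Hubs documentation rather than a definitive configuration.

```python
def event_hubs_kafka_config(namespace: str, connection_string: str, group_id: str) -> dict:
    """Build librdkafka-style settings for the Kafka endpoint of an Event Hubs namespace."""
    return {
        # The Kafka-compatible endpoint listens on port 9093 of the namespace host.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # With connection-string auth, the username is the literal "$ConnectionString".
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
        "group.id": group_id,
    }


if __name__ == "__main__":
    # Placeholder values; a real connection string should come from a secret store.
    config = event_hubs_kafka_config(
        "my-namespace", "<event-hubs-connection-string>", "log-readers"
    )
    print({k: v for k, v in config.items() if k != "sasl.password"})
```

The resulting dictionary could be handed to a Kafka consumer in whichever client library you already use, with the event hub name taking the place of the Kafka topic.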

41:03

It's worth checking through the documentation and seeing, hey, is there something that's already here that's in my wheelhouse that I'm familiar with so that you don't need to reinvent the wheel. And I think that's a big consideration, right? Because if you're reinventing the wheel or you're at it net new and you've never done it before, it's not like an immediate security hole, but it's a

41:23

gap for you. And the more gaps you have in that observability, the less comfort factor you have, and you start to go down the weird rabbit hole in your own head sometimes about it. Ultimately, like, the cloud is about plugging a bunch of pieces together.

41:36

I always think about it as like Lego bricks. If I gave you a bucket of Lego bricks and it had 100 different colors in it and I just spilled it on the floor and I said build me a house, you're gonna build something that looks like a house. Your house might have 4 walls, and the one my overachieving son builds is gonna have 6, whatever it is. But it's still gonna be a house, and that's okay. You're not gonna necessarily land on the same solution that

41:59

somebody else did. I see a lot of customers who are like, just tell me how the other customer did it. I'm like, I really don't want to. We're not gonna be bespoke, and it's not like everything's, you know, a unique snowflake that fell on the ground, but at the same time, we're all gonna be a little bit different. And that's okay. That's the place we're gonna land. Yep. Agree.

42:22

Scott, when we said at the beginning of this episode, we're gonna keep an eye on time and quit at 30 minutes. Yeah. And we didn't even make it past, like, metrics and resource logs. Who saw that coming? Not me. Maybe we need to rename this to, like, the Microsoft Cloud Security Podcast. Something observability. Yeah. We can probably cut it here. I think that's a good kinda grounding and an overview of logging, some of the observability pieces

42:47

and some things to think about there. We should definitely come back and revisit some of the native tooling that's there. I think it's worth talking about Sentinel and some of the other things there, Defender, which we've talked about, Microsoft Defender in the context of Microsoft 365. I don't know that we've ever talked about Defender for Cloud and some of the integrations that come on the Azure side. There's a bunch of third party

43:10

stuff out there. So, yeah, we can just keep running with this one for a while. I think that might be the plan. We'll see how long it goes. Not how long we can drag it out. We don't wanna drag it out, but how long it takes us to cover it to our satisfaction? 2025. Here we come. Perfect. Alright. Scott, go enjoy your weekend. Sounds good. Rest. Relax. As always. Thanks, Ben. Thank you. Alright. Have a good one. If you enjoyed the podcast, go leave us

43:37

a 5 star rating in iTunes. It helps to get the word out so more IT pros can learn about Office 365 and Azure. If you have any questions you want us to address on the show or feedback about the show, feel free to reach out via our website, Twitter, or Facebook. Thanks again for listening, and have a great day.

Transcript source: Provided by creator in RSS feed
