Welcome to the Azure Security Podcast, where we discuss topics relating to security, privacy, reliability, and compliance on the Microsoft Cloud Platform. Hey everybody, welcome to Episode 46. This week is myself, Michael. I'm here with Mark and with Gladys. Sarah is actually in California at a conference, so she won't be joining us this week. We also have a guest, we have Matt Egan, who's here to talk to us about Microsoft Sentinel and third party connectors and KQL queries in general.
But before we get to Matt, let's take a quick lap around the news. Gladys, why don't you kick things off? Actually, I'm going to do something a little bit different. Instead of giving news, I'm going to talk about something that is affecting many customers. I've been in calls with many customers that are trying to align their security to the executive order from the White House. For those that are not familiar with this, the White House has released an executive order on securing infrastructure.
That has prompted a lot of different organizations, like the DOE, the Department of Energy, the TSA, the Transportation Security Administration, CISA, the Cybersecurity and Infrastructure Security Agency, NIST, the National Institute of Standards and Technology, and others, to put out a lot of different guidance. While customers have been trying to align their security, they often call us asking, okay, we need guidance on Zero Trust and SASE. What is the difference between them?
Frankly, I often found it really hard to explain to customers what the difference is. I've been reading a lot of documentation, even from Gartner, in order to come up with definitions. Hopefully, this is going to help some customers. Mark and Michael, feel free to chime in with additional information, because we are receiving so many calls. It's just unbelievable.
Basically, from what I was reading, I found out that SASE is mainly a framework that aims to securely connect users and endpoints to applications and services, no matter where they are, whether they're on-prem, on the internet, at home, or wherever; the framework is supposed to secure that communication. Because it looks to consolidate all of that connectivity, it is focused heavily on network capabilities.
So you will see a lot of routing, content delivery, caching, and other network services being interconnected. Now, Zero Trust is a framework that assumes everything can be compromised, and as such, a layered defense approach is used to protect the environment. Zero Trust also relies on the implementation of least privilege. The key is to mitigate risk by proactively verifying every attribute available for that entity, whether that's an identity, endpoint, application, data, network, or infrastructure.
So the amount of verification required depends on the information available to be verified. When you compare both, SASE is a framework that aims to provide that secure connection, and in doing so it leans on a lot of network capabilities, while Zero Trust aims to provide security through verification, reducing risk.
I found that although both use similar infrastructure, SASE contains some capabilities, like quality of service and WAN optimization, that Zero Trust may not take advantage of. Meanwhile Zero Trust, because it implements least privilege, uses services like cloud infrastructure entitlement management, just-in-time access, and others that may not necessarily be used with SASE.
So in other words, both frameworks may be accomplished using similar infrastructure, but each of them has some extra components not necessarily needed to accomplish the other.
Yeah, if I can add there, Gladys, the thing that I've seen is that when we were first trying to figure this out, because I helped contribute to some of the slides and whatnot for Microsoft on this one, the original place that we started was a Venn diagram, where they have a lot more in common than they have differences, and that's very true. And so that's one of the ways to think about it. SASE also, we found, tends to be a little bit more implementation-specific.
And so like, zero trust is, there's kind of two zero trusts. One is the big strategy that drives modernization of access control and sec ops and governance and asset protection, and all those kind of things. But people also tend to say, zero trust, use it for the access control, modernization as well. So it's kind of like two zero trusts, like a big zero trust, and then a specific initiative, a smaller one.
And SASE tends to be pretty close to that access control piece and really focused on that, what is a secure access service edge. And so that's one of the ways that they sort of overlap. And SASE does tend to bring in a little bit more of a network bias we've seen. It's definitely embracing identity and sort of requires really identity as a service.
It's not always explicitly stated, but all these different CASB, secure web gateway things really kind of require an identity as a service in the middle to kind of bind it together and pull it together. So it has a little bit more of an availability piece, because it's talking about the service edge, not just in secure terms, but also in performance terms. And so Microsoft is kind of a fan of both frameworks because they shine light in two different directions on different problems.
Ultimately, the thing that we found that's kind of cool is, say you go down a SASE road, but you're still moving towards zero trust as a big picture kind of initiative too. So those are some of the things that I've seen in that space.
So be on the lookout for the links that we're posting with the Azure Security Podcast, because Microsoft has developed a lot of documentation that can help you not only align your security requirements to all this guidance, but also understand the differences. Actually, I don't really have a news link this week either. You can chalk it up to laziness or...
But one of the things that's come up is the cyber reference architecture has gone around a few times on LinkedIn and whatnot. And there is a SASE diagram in there, which I kind of neglected to mention earlier, but in the cyber reference architecture, there is a whole section on SASE and kind of how it does a comparison to zero trust.
One of the things I wanted to talk about a little bit that is also covered in the cyber reference architecture and something that we're working on some sort of deeper analysis and content on is a little bit more around security teams and roles.
One of the things that I've sort of come to a more crystal clear understanding about in the past, say about a month or so, is really that security has two operational functions, two sort of active interacting with the environment, interacting with the operational team like IT ops, DevOps, et cetera, functions. One is what most people are familiar with kind of in the reactive side sec ops or SOC, security operations center or security operations.
And that tends to be all about the live-fire incident, the actual attackers in the environment. And you definitely need a dedicated operational team that works with the IT ops teams to investigate, respond, and get that addressed. But there's also this sort of preventive operations, which most of the time usually shows up as just a vulnerability scanning team that asks the various different IT ops teams to go apply patches.
And the thing that's been interesting is watching that plus sort of all these cloud security posture management and secure score and all these kind of new cloud capabilities come in that allow you to have sort of real time on demand operational view into not just your software vulnerabilities, but configuration vulnerabilities and sometimes some operational practices and identifying some of those kind of things that could cause risk.
And so it's been sort of an interesting realization that we really need two different operational teams in security. One focusing on that hot, live-fire incident, but also one focusing on that security posture. We're calling this posture management at Microsoft, and it's really focused on the preventive side. Just like the SOC doesn't actually change the environment, right?
It works with the IT operations team to do it, but is kind of there, and the different parts of the teams are doing glass watching and then working with the teams, enablement of the teams as well. We've seen some organizations that apply this really well to application security.
So instead of just handing a scan and a 50-page file to your app development teams, they're much more engaged in the DevOps processes, building experts within the DevOps teams and then actively working with the IT operations teams as well. But ultimately, they found that the number of incidents that happen on one of these teams that has worked with this sort of application security help desk or enablement or evangelism team is much lower than on the DevOps teams that don't.
And so we're really seeing from a bunch of different directions that need for a formal focus and structure around that sort of preventive operations. And so we're pulling together as many learnings and best practices around that as we can lately. And we've sort of hinted at this in the MCRA stuff with the plan build run type of things there and we're looking to do even more. So I have a few items that sort of took my interest over the last few weeks.
The first one is, so Azure DevOps was supposed to deprecate TLS 1.0 and 1.1 like three days ago, but they actually rolled back the change because there were some compatibility problems with some big clients apparently. That's the extent of what I know. But yeah, so at some point in the future, there will be a rollout of the deprecation of TLS 1.0 and 1.1, so only TLS 1.2 and eventually TLS 1.3 will be supported. The next one is a product that's near and dear to my heart, and that is Key Vault.
We've now increased the service limits for all customers. Historically, Key Vault could only do, for example, a GET against, say, RSA software keys at around 2,000 transactions every 10 seconds. This has now been doubled to 4,000 GET transactions per 10 seconds. Now, before you start thinking, oh, that's fantastic, let's double the number of hits that we're going to make against Key Vault, I would urge against that. And there's a few reasons for it.
The first one is, Key Vault is not really designed as a massively transactional product; it's not like a database that's designed to have massive amounts of throughput. And to be honest, you're not really going to be rolling keys and secrets and certificates that often anyway. So the general rule of thumb there is to actually cache the values instead and, say, only hit Key Vault every five minutes or every 10 minutes, but certainly not 2,000 times a second or 2,000 every 10 seconds.
And the other reason is that if you do start hitting those limits, Key Vault will throttle your connection. So even though we have increased the limits, and that's fantastic, I would still caution against just opening the floodgates; that's not a good thing. The next thing is in Azure Database for PostgreSQL, in the Hyperscale version, we now have private endpoint support, private link and private endpoint support.
As I mentioned on multiple podcasts, we see this happening across more and more PaaS services. And so this is another thing that's great to see. So that's PostgreSQL Hyperscale, and that is general availability. The other one under general availability, also for PostgreSQL, is we now have a whole bunch more certifications. I'm not going to go through all of these things. Most of them seem to affect European countries, so AFM and DNB in the Netherlands.
I do not pretend to be familiar with those. In Switzerland, there's the financial market supervisory authority. So these apply to the Netherlands, France, Switzerland, Denmark, Belgium and Poland. So if you're in those countries or have customers in those countries that require these kinds of certifications and you're using PostgreSQL, then we have some good news for you. So that's basically all the news that I have this week. So let's turn our attention to our guest.
This week we have Matt Egan, who's here to talk to us about interconnecting Sentinel with third party connectors. Matt, hey, welcome to the podcast. Would you like to spend just a moment and explain kind of who you are and what you do? Sure. Thank you so much for having me. Really appreciate it. As you mentioned, my name is Matt Egan. I am what we call a global black belt here at Microsoft, which is a very fancy title for really just an SME on our security tools.
So I specialize in the Microsoft threat protection tools, so Defender for Endpoint, Defender for Office, as well as Microsoft Sentinel, where I have a lot of experience in building connectors for those systems as well as integrating with other third party tools. So really pleased to be here. So yeah, just for everyone out there, we may end up accidentally calling it Azure Sentinel. And if you hear that, we mean Microsoft Sentinel, just so you know.
So obviously a big part of Sentinel is ingestion. And here we are to talk about data connectors and so on. So do you want to give an overview of what it means to ingest this data? What does that take? And what are these sort of prebuilt data connectors that we have? And why would people use them? Sentinel has 120 plus prebuilt connectors.
These are ones that have already been set up to ingest data sources from either first party Microsoft solutions or third party solutions from a number of partners. We see things from other cloud services, from on-premises services, even generic data sources, so things along the lines of Syslog or things that are in Common Event Format, or CEF. All of those connectors are prebuilt. They're integrated inside of the system.
You can just deploy them from inside of the portal and start bringing in data from all those different sources. Now data is of course the lifeblood of any SIEM that's out there. So for anybody who doesn't know, Sentinel is a SIEM, a cloud-based SIEM. So we handle everything from the management of the servers, the clusters, data storage, maintaining that 90 days of rolling data that's part of the service.
You can go up to about two years if you wanted to, but we give you 90 days right out of the box, as well as the prebuilt data connectors. So you can start bringing things in. These data connectors could be anything from events on your firewalls to events in your Azure Active Directory or something from your on-premises servers. And like I said, that's really the lifeblood of the system.
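For a flavor of what becomes queryable once one of those connectors is flowing, here's a minimal hunting query. This is just a sketch, assuming the built-in CEF table (CommonSecurityLog) that the CEF connector populates:

```kusto
// Summarize the last 24 hours of CEF (firewall) events by vendor, product, and action.
CommonSecurityLog
| where TimeGenerated > ago(1d)
| summarize Events = count() by DeviceVendor, DeviceProduct, DeviceAction
| order by Events desc
```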
It's all about getting those logs in, those events, and then being able to do hunting queries, analytics, leveraging things like machine learning models to find anomalous behavior. And then raising alerts and taking response actions after that. So Matt, how would a customer bring data from other products or even threat intelligence into Sentinel if they're not included in those prebuilt data connectors?
If you're just looking to get in threat intelligence, actually there's a couple of ways that we can bring in that data. We actually have a prebuilt connector for STIX and TAXII. So we can support a TAXII feed of 2.1 and higher. And TAXII, I can never remember exactly what the acronym stands for, but it's the actual exchange of threat information between systems. And STIX is the Structured Threat Information eXpression, the format the indicators themselves are described in.
But the TAXII protocol is the way that those different records of threat intelligence are exchanged, and we actually can accept a connection to a TAXII server and bring in threat intelligence that way. You can also publish data on the graph. So if you're used to using graph.microsoft.com, you can put IOCs, indicators of compromise, onto the graph, and then those will be ingested into Sentinel.
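Once those indicators land in the workspace, they show up in the ThreatIntelligenceIndicator table and can be matched against other logs. A minimal sketch, assuming the standard Sentinel ThreatIntelligenceIndicator table and the built-in SigninLogs table:

```kusto
// Match active IP indicators from threat intelligence against recent sign-in activity.
let Indicators = ThreatIntelligenceIndicator
    | where TimeGenerated > ago(14d)
    | where Active == true and isnotempty(NetworkIP)
    | summarize arg_max(TimeGenerated, *) by IndicatorId;   // keep the latest version of each indicator
SigninLogs
| where TimeGenerated > ago(1d)
| join kind=inner (Indicators) on $left.IPAddress == $right.NetworkIP
| project TimeGenerated, UserPrincipalName, IPAddress, Description, ThreatType, ConfidenceScore
```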
If, however, there is a data source that isn't an already prebuilt one, or you have another source of threat intelligence or some sort of enrichment data that you want to put into the environment, we do actually have a full API that you can leverage to pull data in from other sources. Or rather, I should say push data in from other sources.
Leveraging that API, you can have a connection directly into the threat intelligence environment, or you can use the underlying log analytics workspace tables and push data directly into those. So you mentioned enrichment. This is actually a comment that Sarah has brought up in the past. I want to be honest with you, I'm not 100% sure what this enrichment stuff is. So could you just give us a brief overview of what you mean by enrichment and how that is different from any other kind of data?
Let's think about the data that we're bringing into the environment. We are seeing, for example, let's take a number of events that are coming in from, say, servers, right? And we see data about all these different servers. That record may contain something along the lines of the server name, an IP address, maybe an event ID if it's a Windows box, or, if it's something coming in from, say, Syslog, it could be a facility code, and maybe a severity that goes along with it.
And then some additional data that's traveling along with it. Well, for somebody who doesn't know what those particular machines are, if I just hand you a machine that is called Server 01, that's not very helpful. It's a little ambiguous as to what that box could be. So what I might want to do is I might want to enrich some of the data that I have in the environment. Maybe I have a reference to what all those servers are.
That might be a CMDB that I have somewhere outside of my environment that I'm going to use to pull in maybe who owns the server. What is it doing? Where is it normally talking to? That could be one type of enrichment. Another type of enrichment data could be something about a user inside of the environment. Maybe it's additional data about that user's role, or maybe where they work or something along those lines. Or it could be data about really anything.
As long as it's enriching my investigation, it's giving me a deeper insight, better views into the network at large, the environment at large, or maybe even just the threats that I'm seeing inside of my environment. It could come down all the way to adding in geographical information associated with IP addresses, which don't get me started on that one. I could talk about that for hours.
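As one concrete flavor of that last idea, here's a sketch of tagging sign-in events with geo data derived from the client IP. It assumes the geo_info_from_ip_address() function available in recent Kusto engines and the built-in SigninLogs table:

```kusto
// Enrich sign-ins with the country and city derived from the client IP address.
SigninLogs
| where TimeGenerated > ago(1h)
| extend GeoInfo = geo_info_from_ip_address(IPAddress)
| extend Country = tostring(GeoInfo.country), City = tostring(GeoInfo.city)
| summarize SignIns = count() by UserPrincipalName, Country, City
```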
But you might want to pull that sort of information in to say that, hey, this particular IP address is coming to me from a particular country. And I want to know that sort of information without my users having to go look it up. So I've actually been doing a little bit of work in that area just recently.
I was trying to trigger Azure Key Vault, or more accurately, Microsoft Defender for Key Vault, into sort of making sure that everything worked by using a Tor endpoint to access a Key Vault, or at least attempt to access Key Vault, I should say. Would that be an example of enrichment? So for example, if I had some database or some lookup or some API or something that had a list of ongoing sort of Tor exit points, would that be an example of enrichment? Certainly could be.
Yeah, it would be something where, getting back to what Gladys was saying about threat intelligence, it's really any data. It could be threat intel. I mean, generally, threat intelligence is normally limited to things like indicators of compromise. In other words, something that is definitely bad, meaning that this IP address is related to a bad actor that's out there.
Or this file hash is related to malware that somebody may have seen somewhere. Other types of enrichment data could be exactly that. It could be those Tor endpoints that you know about. They may not be bad in sort of the general parlance, but they're of interest to you. You want to know where those particular IP endpoints might be, or what service they may be related to. And that's very similar.
I mean, any sort of thing that you could bring into your environment that is of use to you, I would consider that to be an enrichment feed of data. I'm curious, for the enrichment data to work, do you have to actually copy or move the data into Sentinel? Or can it sort of do lookups in other databases? It depends. I always love giving that answer, because it depends on what you want to really do. You can, of course, copy the data in.
I personally think that that's one of the most efficient ways to do it, just because you can then take advantage of the scale that is behind Sentinel itself, in terms of speed, reliability, et cetera. You can, however, refer to data outside of Sentinel. It's actually a feature of the Kusto Query Language, or KQL. You have the ability to refer to data using an operator called externaldata.
externaldata will allow you to refer to, say, a CSV file that you have inside of Azure Storage blobs, or maybe a JSON file that's somewhere else in another service, or maybe it's even an API that can respond back with the information in a data type format. You can use that externaldata function to refer to that data in place.
Now, the nice thing about that is, if it's something that's being maintained externally, for example, we actually have some threat feed data that we actually have published inside of our GitHub repository, and we use the external data query to refer to it. That way, if it's being maintained outside of your environment, you can just go ahead and refer to it, get that information, utilize it at the time, and not ingest it into Sentinel itself.
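A sketch of that pattern, tying back to the Tor example from earlier. The storage URL and file here are placeholders, and the externaldata operator is assumed to be available in the Log Analytics workspace:

```kusto
// Reference a CSV of Tor exit nodes maintained outside the workspace, without ingesting it.
// The blob URL is a placeholder; externaldata requires a constant connection string.
let TorExitNodes = externaldata(ExitIP: string, LastSeen: datetime)
    [@"https://contosostorage.blob.core.windows.net/feeds/tor-exit-nodes.csv"]
    with (format="csv", ignoreFirstRecord=true);
SigninLogs
| where TimeGenerated > ago(1d)
| where IPAddress in (TorExitNodes | project ExitIP)
| project TimeGenerated, UserPrincipalName, IPAddress, AppDisplayName
```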
On the other hand, if you have data that you're maintaining, and you have the ability to pull it into Sentinel, I would say go ahead and do that. One example is I put together an Azure Function, actually, and this is a great example of how you can extend things out beyond the prebuilt connectors. I put together an Azure Function that does a Registration Data Access Protocol query, or an RDAP query. RDAP, if you're not familiar with it, is the new version of WHOIS.
WHOIS itself, a venerable internet service, has been around forever, but it was originally designed to be read by humans. If you do a WHOIS lookup on a domain or you look up an IP address, you get back information that's really, really easy for a human being to read, but not so easy for a machine to read.
The RDAP query, actually, this new protocol that they have, replaces that, and it makes it so that you get JavaScript Object Notation, or JSON, as a response back for the domain that you're querying on. If I go and do a lookup on microsoft.com, I can find out all the registration information about that domain. Where was it registered? Who owns it? When was it created? Et cetera.
All that information is returned back to me as a JSON file that I can either use inside of my query, or, the way that I've done it, I have the system go out and query for every domain that I see in the environment, and because the RDAP function is pushing those results back into Sentinel, I then have another analytics rule running over them.
I then run a regular analytics rule over it that says, hey, if I see a user has gone out to a domain that has been registered within the last 30 days, I want you to raise an alert for me. That's a newly registered domain. It's a little odd that somebody would be going to it that quickly, so I want you to actually just let me know that that's happened. There's nothing wrong, maybe, with the domain.
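A rough sketch of that kind of rule, assuming a hypothetical custom table populated by the RDAP function (Rdap_CL, with made-up column names) and the DnsEvents table from the DNS connector:

```kusto
// Flag DNS lookups of domains registered within the last 30 days.
// Rdap_CL, Domain_s, and RegistrationDate_t are hypothetical custom-log names.
let NewlyRegistered = Rdap_CL
    | where RegistrationDate_t > ago(30d)
    | project Domain = tolower(Domain_s), RegistrationDate_t;
DnsEvents
| where TimeGenerated > ago(1d)
| extend Domain = tolower(Name)
| join kind=inner (NewlyRegistered) on Domain
| project TimeGenerated, ClientIP, Domain, RegistrationDate_t
```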
It's not necessarily bad, so it doesn't actually qualify as an IOC, but it is something I want to look into. You mentioned something really interesting. I didn't know it existed. You said that you could call basically some code from KQL. Now, is that KQL that can do that, or is that something that's in Sentinel? And to what degree can you call something from KQL? So it's in KQL itself. It's not necessarily limited to just Sentinel, so you can actually use it from any log analytics workspace.
It's part of the language. It's designed to refer to data storage, right? So you can use it to call out to an Azure blob. You could use an AWS blob. It could be Google. It doesn't really matter. If your API that you're calling is able to return the value in something that it can utilize, then you can call that. So there's an IP lookup, actually, that I use all the time, and it returns the values back as JSON notation.
So what I can do is actually just call externaldata using what's called the let keyword. So I could say, let this value be equal to the return from this call out to this external source. Now, I do have to define what the data looks like. I can't just leave it open-ended; it won't understand the data on the other side, so I do have to know something about the data. So if it's returning IP info, maybe I have a city, a country, an ASN, or something like that that's assigned to it.
I have to pre-define that as part of the results. But as long as I can map those results back, it can then come into a data table that can then be used inline with my query. And that way, I don't have to store that information myself. Now, there is a drawback to it. That one drawback is that it doesn't accept dynamic connection strings. In other words, I basically have to be calling a static address. It can't be something that I put together by concatenating a string inline inside of Kusto.
I'd actually have to go ahead and have that as a pre-defined string. That's not totally insurmountable. It's really great, again, if I'm calling out to a fixed storage file or something like that, because that's what it was originally designed for. On the other hand, if it's something that I'm calling out to some sort of a dynamic API, then it's really not the best case for it. It's really not the best case from just best practices anyway.
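For what it's worth, here's a sketch of what that looks like, declaring the shape of the results up front against a hypothetical IP-info endpoint, with the constant-string requirement called out:

```kusto
// Declare the expected schema (city, country, ASN) up front; externaldata cannot infer it.
// The URI below is a hypothetical endpoint and must be a constant string,
// not something concatenated at query time.
let IpInfo = externaldata(ip: string, city: string, country: string, asn: string)
    [@"https://example.com/ipinfo/export.json"]
    with (format="multijson");
IpInfo
| take 10
```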
You want to try and reduce those calls as much as possible. And that's another reason why bringing that data into Sentinel can make sense, because we're actually keeping in line with those best practice recommendations. It seems that we keep bringing in a lot of capabilities for enriching that data. Can you talk a little bit about the codeless connector? The codeless connector is great. That's new functionality. It's in preview now.
And what it allows you to do is create a fully SaaS-based connector leveraging the capabilities that Sentinel has. So there's a number of different ways, like I said, that you can do a data connector. One could be that I could run it as a playbook inside of Sentinel. Playbooks are the automation or SOAR capability that's built into Sentinel, and they're built off of Azure Logic Apps. And they're great. They're very useful, very powerful.
But you can use those to not only take automated response actions; you could use them to actually ingest data from another source. You can also use Azure Functions, of course. Azure Functions are serverless compute that you can run. But the thing is that both of those actually require you to kind of put together some infrastructure in a way. I mean, the Logic Apps require that you actually put together the playbook for it and put together the structure.
An Azure Function could be putting together the PowerShell script that you want to run, or writing it in C#, or maybe it's in Perl or Python or something like that. Perl. That was weird. I don't know why I said Perl. But it's in Python. What the new codeless connector allows you to do is define what your service is using a JSON file again, and it becomes a fully codeless connector. You don't need to stand up anything. You tell us what the API is that you need to connect to.
You give us the information about how you want to connect to it, meaning authentication, tokens, secret keys, et cetera. And the system can then go out, make that connection, and start bringing that data into Sentinel. What's really cool about it, too, is it doesn't only support bringing in the data. It actually supports the new health monitoring as well. So we can monitor the health of that connection to tell you if it's good, bad, if it's up or down. Really, really, really nice system.
It's funny you should bring up Perl. I used to do a lot of Perl development back in the day before the days of PowerShell. I always used to joke that Perl is a write-only language. Once you've written it, you can never understand what you actually wrote. I still got a bit of a soft spot for Perl, but yeah, I think PowerShell's taken that position in my life. I'd love to get your opinion on geography-based or geo-based IP filtering, because I've got some opinions here.
But I want to hear your take on that one. Yeah, so basically, talking about GeoIP, it is sort of my personal issue. And it's not really that I have anything wrong with it. I think that it is a fine capability as long as people understand some of the potential limitations of it. And somebody can disagree with me. People always do. I'm more than happy to hear critiques about how I'm wrong.
But the problem I have with it is that it comes down to a really fundamental question about GeoIP-based blocking. And that's really, what is an IP address? And then what is geographical data? And how are the two attached to each other? When you look at an IP address, it is completely fungible to sort of steal the term of the day. It is not a non-fungible token. It's fungible.
You can replace an IP address with another IP address, because they are different things that are really just attached to the network for communications. And that's what they're there for. The geographic data, however, is fixed in place and time. Well, fixed in place, at any rate. It's a fixed geographical location on the planet to which the IP address is then tied.
For some reason, people think that it's now non-fungible, that it's never going to change, that it could never be wrong. And that's not quite true. We have to look at the fact that when GeoIP data was first sort of put together, it all came out of a database. IP addresses are assigned by IANA, the Internet Assigned Numbers Authority. And they have a database of all the different IP address blocks around the world.
And those different IP address blocks have information, interestingly enough, inside of WHOIS about to whom they've been assigned. And when we take that information, we can then discern at a certain level that this IP address, or at least this block of IP addresses, belongs to this country. Well, that's assuming that there's never a transfer of that IP address between the regional authorities, because they're not all in one major place anymore. I mean, IANA used to be like one org.
Now it's actually broken out into different groups around the world. There's ARIN here in the United States. There's RIPE in Europe. There's AFRINIC for Africa. There's APNIC for Asia Pacific. Each one of those has different IP address blocks that have been assigned to them. And sometimes they actually do inter-registry transfers, where there'll be a block of IP addresses that maybe was assigned to Japan at one point, and now it is assigned to the United States.
During that transfer, well, the transfer doesn't really take much time. It's an IP address, once again. But the registration information might take time to change. Also, it's only at the country level. The individual registrars might have information about to whom they've assigned a block, and that may give you information about the city, or at least the city that's on the registration record.
The further information about to whom it's been assigned inside of that country could be completely random at that point. It all depends on which ISP they were assigned to, or which registrar they were assigned to, and how accurate and detailed that data is. Now, that's how this all started: people went out and they mined all this data out of the different registries and used that as the basis for creating a lot of the GeoIP databases.
They've been augmented over the years by people doing things like war driving. War driving is going around with a Wi-Fi connection, and recording geographic information along with any of the networks that you see or can get connected to. And some of that has actually proven to be fairly useful. The only problem with that, though, is, again, it's only as good as the source data.
If there's not a lot of source data to go with it, or if that source data is inaccurate, it's now somewhat questionable. And, getting back to my original point, if you're using this as a source of truth, well, the question there is about truthiness. It's truthful only as far as the source data is truthful. I've seen, for example, a lookup that I did on one IP address. It came out of a certain country, and one source had it in the capital of that country.
And another source actually had it about 2,000 miles away from the capital of that country. And it's like, well, those are pretty far spaced pieces of information there. So if you're doing things, you know, or trying to do things down at that sort of tight geo level, it's going to be really, really difficult.
If you try and do it at the country level, maybe you have a need to block something from perhaps, maybe it's an ITAR restricted country or an OFAC person who might be associated with a country or something like that, those might be useful. But it's only as good as the source data itself. And even there, there's some potential issues around things like IPv4 and IPv6.
Yeah, like the way that I think about it, when there's always like one of these, you know, grand internet debates, I always fall back on sort of the cost of attack is, does it actually create cost or, you know, friction, pain in the butt, whatever you want to call it for the attacker, right? And if so, it has some value. And then the next question is, how hard is it or easy is it to do?
And so, you know, so the way I think about it is like, okay, if I've, it's definitely gonna create a little bit of annoyance if you block a particular country and you expect adversaries from that country, or you expect you might get adversaries from that country. And if you absolutely know that, listen, we're a small regional bank, for example, we don't have people coming in from outside of, you know, the US or something like that.
Okay, yeah, you can maybe put a detection on it or possibly a block, especially if you're getting a whole lot of noisy attacks from, you know, a given part of the world where you just don't do business. So like, I'm okay with like, if it's easy and it's not gonna break stuff, yeah, you're gonna add some friction. But like, would I bet my security posture and, you know, my critical crown jewels on it? Oh, no. Yeah, exactly. And that's kind of my point, right?
Is that it's a, it comes down to that level of, I think the truthiness also leads to the level of comfort that goes along with the data. As you pointed out, if you're a small entity that never does business outside of a certain area, it is just easier just to go ahead and say, I'm gonna block all these things. They never should be there. Although I do have to ask sort of the, what if question of well, what if one of your customers just happens to be traveling in an area that you've blocked.
Now we've completely disabled, you know, their capability to do business with us and we've made their life more inconvenient. If it's something of, you know, maybe I'm concerned about it from a data exfiltration standpoint, I don't want my data going to this other place. Well, from the point of putting a roadblock in front of the attacker, there are these things called VPNs. And people will use them to do things like get around all sorts of geo blocks.
You know, if they're willing to do it to watch a television show or a movie in another country, I'm fairly certain that an attacker is willing to do the same thing to get your data and get around your blocking. What I would say makes more sense to me, at least, and this is sort of along the lines of, you know, endpoint security and assume breach, is to assume attack, right?
Assume that anything that's going anywhere in the world is potentially a malicious actor who might be taking your stuff or might be trying to attack you. And if you see any sort of anomalous behavior, irrespective of where it is in the world, treat it as exactly that. It's anomalous behavior. It doesn't matter to me if it was from my own country, from, you know, another country, if it's from my neighbor next door, an anomaly is an anomaly is an anomaly.
And I need to treat them all sort of the same way. When we talk about, you know, geo-fencing and so on, the way I look at it is like application security. You know, people often talk about, you know, security through obscurity. And I've heard people say, oh, you can't do security through obscurity. And I'm like, no, you actually can do security through obscurity, so long as it's not your only defense. And as long as you don't tell anybody. Yeah, that's a good point.
But the point is when what we're doing here with this sort of geo IP isolation slash blocking slash fencing, whatever you want to call it, you know, it's okay just long as, you know, you're not betting the entire organization on, you know, on that one defense. Oh yeah, it's gotta be, it's gotta be layered. And correct me if I'm wrong here, but the way I look at it as you and Mark were talking about it, it sort of reminds me of an example of something that may be good enrichment data, right?
It's not, it may help you make a decision, but it's not, you know, it's not like got a security guarantee around it. Yeah, exactly. Well, and that's the thing about, I think I mentioned this a little bit earlier, talking about like the databases and, you know, all the information, the geo data that's out there. There's one thing to consider here too, is that as time evolves, some of this data becomes less and less useful and less and less capable really, from a detection standpoint.
What I mean by that is most of the GeoIP data that's out there that folks are using is based on IPv4. So that's your standard, you know, octets of data, your 192.168.1.1s. And what we are seeing now is that of course, because of the exhaustion of the IPv4 address space, we're moving everybody to IPv6.
Well, that creates a little bit of a problem, because, well, if you think about it, just from a mathematical standpoint, IPv4 is two to the 32nd power, so about 4 billion addresses that are out there. 4 billion is a huge number, it's a lot of stuff, but it's something that we can actually even think of mathematically in our heads as humans, and we can have a database that has 4 billion IP addresses in it and we can say that they belong to, you know, these particular groups or particular people.
Now, when we start moving over to IPv6 though, we're talking two to the 128th, that's 340 undecillion IP addresses. That is a number that is so big that it takes about, I think it's five commas to get into the quadrillions or somewhere along those lines. I'm not a mathematician, so don't hold me to it. But the problem then becomes you've got this now massive database of information. Now, of course, it can be done the same way that an IPv4 database can be done.
You can do it by, you know, octets and subnets and, you know, class A type, you know, information. That's still a lot of data. And in this modern world also where things are from an IP address standpoint, stuff isn't even fixed in a fixed place anymore. Mobile devices, mobile PCs, you've now got just this huge domain space that you've got to try and figure out.
And I think it lends itself towards that, you know, you have to assume attack all the time and just say, look, I'm not gonna really care too much about the IP address that this is coming from. And instead, I'm just gonna treat it all as potentially suspicious or potential attacks.
Yeah, when you look at it that way, I think what you just said is that, you know, we can have a database of four billion or so entries, but we can't have a database of a, whatever the number was, squintillion or something. 340 undecillion. Okay, that too. You just can't do that, right? So it becomes totally unmanageable. That's a good point, yeah. So now one thing we always ask our guests is if you had one final thought you'd like to leave our listeners with, what would it be?
So a final thought, I would say probably the biggest one would be don't forget the basics in security. By that, what I mean is it's not always going to be about the latest, greatest, shiniest security tool that you might see out there, or the flashiest thing that's out there. A lot of security comes down to still doing the basics, getting them right, operationalizing them, making them not just something that you do every once in a while, but something that you do all the time.
Something that you actually internalize: good cyber hygiene, good device and user and app security postures. It's not something that is easy to do. It's not something that you can necessarily flip a switch and make suddenly happen, but it will pay for itself quite a bit if you take the little bit of extra effort to put those things in place. All right, with that, let's bring this episode to an end. Matt, thank you so much for joining us this week.
I especially like your thoughts on geo-fencing and geographic restrictions based on IP addresses. So again, thank you so much for joining us and to our listeners out there. Thank you so much for joining us this week. Stay safe and we'll see you next time. Thanks for listening to the Azure Security Podcast. You can find show notes and other resources at our website, azsecuritypodcast.net. If you have any questions, please find us on Twitter at Azure SecPod.
Background music is from ccmixter.com and licensed under the Creative Commons license.