S5E28 - Azure Storage Accounts - Scalable, secure cloud storage for data objects | Let's Talk Azure! podcast

00:01

Hello and welcome to the Let's Talk Azure podcast with your hosts, Sam Foote and Alan Armstrong. If you're new here, we're a pair of Azure and Microsoft 365 focused IT security professionals. It's episode 28 of season five. Alan and I recently had a discussion around Azure storage accounts, which provide scalable, secure, and highly available cloud storage for various data types. We cover what storage accounts are and why they're useful, how you create them, what the configuration options are, and how much they cost. We've noticed that a large number of you aren't subscribed. If you do enjoy our podcast, please do consider subscribing. It would mean a lot to us for you to show your support to the show. It's a really great episode, so without further delay, let's jump in. Hey, Alan, how are you doing this week?

00:47

Hey Sam. Not doing too bad. How are you? Yeah, good, thank you. I don't think there's been too much excitement this week, has there? Especially not in the security world. No. We've seen some root cause analysis of the CrowdStrike side of things. True. I haven't looked at that. What are your thoughts so far?

01:12

My quick read of it, and I've not read the report itself, is that an extra data source or something was added in memory: there were only 20, and there were 21 references. So when it had to read the 21st one, it couldn't, and it crashed the machine. That's probably me butchering it completely, but in effect it tried to read something that didn't exist, and then it crashed the kernel, basically. So that's the initial part of that. Apart from that side of things, is there anything else? No, I don't think there has been, has there? I mean, I've been off this week from work and the world, so I've not been too close to any of it this week.

02:13

Yeah. It's going to be interesting to see what the long-term fallout of that is. I haven't personally talked to an organization that's had to recover from it, so I'm not acutely aware of the pain that organizations had to go through. Because we work on a different technology stack, it'll be interesting to see if it does affect renewals and how they retain customers. I'm not trying to be super negative or anything like that, but it's a pretty big incident for organizations to have to deal with, isn't it?

02:55

Yeah, definitely. And it's just unfortunate that it happened to CrowdStrike. It could have been any other AV provider that has that kernel-level access, including Defender, really. As I said before, I think maybe last week or the week before, we've seen Windows updates crash machines, haven't we? So it's a similar thing. It's just that we've built in the mechanisms to try and reduce the impact, with update waves or rings and things like that. So it's about bringing that framework, or methodology, into not just Windows updates and Windows products, but maybe moving it out to other suites of products to reduce impact. If it's possible, of course.

03:54

I think the thing with a security tool is that people are going to be really time sensitive to new threats emerging. So it'd be interesting to see if there is any real way to protect against this sort of thing, or whether it was just a freak one-off. Because, touch wood, this is certainly probably the biggest thing we've seen ever. I don't know, I've never seen something like that before.

04:27

Yeah, I think as well, people obviously latch onto the negative. Users, customers, organizations were maybe down for a couple of days, or a week. But realistically the product, CrowdStrike, may have prevented them from being attacked, which could have taken them out for months. When it's working, nobody really praises it too much, do they? It's only when it goes down that it hits the news.

04:57

I personally think it's going to cause some really complex discussions around business continuity planning. So, let's do a sweep of the environment for all the software in the inventory that's got kernel-level access, and ask how we protect against that going rogue. And I assume the people who were most caught out were people who hadn't planned for that eventuality, and maybe hadn't tested for it either. That's probably me being too harsh, but it definitely tested people, that's for sure.

05:35

Yeah. Cool. Okay, so what's this episode then, Sam? This week we're going to do Azure storage accounts, or just Azure Storage, I suppose. Yeah, because we haven't actually done an episode on it, and it's probably, I don't know, is it one of the most core aspects of Azure? Probably. Apart from RBAC and networking, maybe.

06:09

Definitely a well-established service, isn't it? This kind of storage mechanism is definitely seen in all three main clouds: S3 buckets, and the equivalent in GCP. So it is pretty core. I guess some organizations might not need to use it, but there are definitely some things in the background that default to using it. Digressing a little bit, take the Azure CLI that you can run in the console in the portal: you don't need a storage account for that anymore, as they've changed that, but storage is still pretty core. Okay, so let's dive into it. Do you want to start, Sam, with what Azure storage accounts are?

07:11

Yeah, so I suppose the umbrella term is Azure Storage, really, because Azure storage accounts are effectively a collection of different storage services which are administered within the same resource, if that makes sense. I would say it's more of a container for other types of Azure storage mechanisms, but it really is the core storage mechanism in Azure. As you mentioned, a lot of different things integrate with it, and it's used for many different scenarios. The idea is that this is a service that you purchase and create, and all the management of the solution is handled for you. So there's no infrastructure for you to worry about; it's all managed by Microsoft. You obviously pay for this, and we'll talk about licensing, but this is one of those core PaaS solutions that you can build your app on top of for data storage. It's not just apps; there are many other solutions that can be integrated. But the real benefits come from durability and high availability, and we'll go into those in a bit more detail. The idea is that you hand off that responsibility to Microsoft: they manage the data that you're storing, and they make sure that it's durable and highly available. It's replicated on your behalf, so it's not something you have to worry about, and there are different levels that you can go to for that durability. Security is a big part of storage accounts, and not just accessing them. We'll probably delve into security a little deeper later on, but it's really about at-rest encryption and RBAC on top of it (sorry, RBAC is role-based access control, if you're not in our world) in order to have rich control over who can access that data and how, which is a challenge with storage systems that you have to run yourself. 
Scalability is also really important. When you're looking at a more traditional on-prem NAS or SAN type of scenario, you have to provision your capacity. So you have to think about the actual hardware that you're procuring and your needs into the future over the lifecycle of that device. And then there's accessibility: making sure that you can access this data privately, and in theory publicly as well, via a multitude of different mechanisms, essentially REST APIs, CLIs, explorer applications, that type of thing. So no matter what you're building, there are as many options as possible to get access to this data.

11:04

Okay, cool. Yes, definitely. When you're talking about on-prem capacity planning or hardware planning and things like that, as you said initially, at a high level this is platform as a service, a PaaS solution. That's handled by Microsoft for you, so you don't necessarily have to worry about it or think about it up front, at least. Okay, so you talked about a storage account, or Azure Storage, being kind of a cluster of storage services. What are the different types that you can use within a storage account?

11:52

Okay, so Azure Blobs, or Blob Storage, is really where that storage account mechanism resides. Blob stands for binary large object, so this is things like audio and video files, images, documents, those types of things. It's not a file system, so to speak. It is a mechanism for you to store and retrieve objects. I'll say one at a time; you can list them and mass download them, but you generally tend to work on them one by one. You can submit a file to a Blob Storage account, it will be stored, and then you can access it in a variety of different ways. You can make those blobs publicly accessible if you want, so that anybody anywhere can access and download them. You have to be a bit careful about that, and we will talk about security later. You can also access it via REST APIs, Azure PowerShell, the Azure CLI, and client libraries. So if you're building a .NET app, a Node.js app, Java, Python, Go, and so on, you can get access to it as well. So there's a variety of mechanisms to interact with those blobs, to upload them and to retrieve them. There is also a layer on top of Blob Storage called Azure Data Lake, which is really about scalability and feeding into big data analytics on top of storage accounts, because the ability to store scales massively. Block blobs, I think, can go up to something like 190 terabytes. It's pretty crazy. I'm not sure about the maximum number of items that you can have, but I assume it's in the millions if not billions, or even unlimited; I'm not 100% sure. So you can build out hugely scaled storage systems with massive amounts of data. There are also different storage tiers, where you can say you want hot, cool, or archive storage. I think we probably need an episode on each different storage mechanism, so I'm not going to go into them in too much depth. 
But yeah, just think about massively scalable storage at an object level. There's also Azure Files, which is essentially a replacement for your more traditional SMB NAS shares. It's designed so that you can lift and shift your on-premises file shares straight into the cloud. It gives you the familiarity of what you're currently using, essentially your SMB clients, but allows you to put things like authentication in front of it, shift management of the files themselves into the cloud, and give cloud resources easier access to those files as well. So again, fully managed by Microsoft. Another couple that I want to mention: Azure Queues. Azure Queues sit in Azure Storage and give you a relatively simple messaging system. If you're used to things like Azure Service Bus, this is a queue storage system: you've got a queue that you can add messages to, and you can broker them to other applications. It's relatively lightweight and doesn't have the feature-rich functionality of Service Bus, but it can certainly get the job done. Each queue message can be up to 64 KB in size, and there are different architectural styles for how to consume from it as well. It's very easy to provision and very low cost, which is why it's quite popular, because other more traditional queuing systems can be expensive and hard to maintain. Azure Table Storage is a NoSQL storage mechanism, similar in ways to Cosmos DB, but again not at the same architectural level and scalability that Cosmos DB is at. Within a storage account you can have a table, a table can have a number of entities, and an entity is essentially an object. Entities can be up to 1 MB in size, so you can store relatively large JSON objects there as well. I probably didn't mention it on the queues, but I will mention it here, and it's applicable to both. 
There are client libraries to get access to these more, I would say, proprietary features, if that makes sense, because you do need special access into them. I'm not going to talk about the other types of storage data services because they become more enterprisey, I would say, but I will list a few so people are aware of them. Azure Elastic SAN is effectively a managed SAN inside of Azure; I haven't had any actual experience with that. Azure managed disks run in Azure Storage data services as well, for block-level storage attached to Azure VMs. There's Azure NetApp Files, which is enterprise file storage that is, as the name suggests, powered by NetApp, and which allows you to integrate large enterprise line-of-business applications. Again, not something that I've actually used; I believe you can access those via NFS and SMB. There's also a new type of storage data service called Azure Container Storage, which is for volume management, deployment, and orchestration for containers. One of the big challenges with container management is volume storage, backup, and those types of things, so that integration could be quite valuable to some.
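
To make the two size limits mentioned above concrete, here is a minimal Python sketch that checks whether a JSON payload would fit in a queue message (64 KB) or a table entity (1 MB). The function names are hypothetical, and measuring the compact UTF-8 JSON size is only an approximation: the real services may count encoding and property overhead differently.

```python
import json

# Limits as quoted in the episode, interpreted here as binary units (an assumption).
MAX_QUEUE_MESSAGE_BYTES = 64 * 1024      # "up to 64 KB" per queue message
MAX_TABLE_ENTITY_BYTES = 1024 * 1024     # "up to 1 MB" per table entity


def encoded_size(payload: dict) -> int:
    """Return the size in bytes of the payload as compact UTF-8 JSON."""
    return len(json.dumps(payload, separators=(",", ":")).encode("utf-8"))


def fits_queue_message(payload: dict) -> bool:
    """True if the payload is small enough for a single queue message."""
    return encoded_size(payload) <= MAX_QUEUE_MESSAGE_BYTES


def fits_table_entity(payload: dict) -> bool:
    """True if the payload is small enough for a single table entity."""
    return encoded_size(payload) <= MAX_TABLE_ENTITY_BYTES
```

A payload that is too big for a queue message could, for example, be written to a blob and only a pointer to it placed on the queue.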

19:54

Cool. There's definitely a few there, isn't there? And we, I should say, have touched definitely more on tables and blob storage, mainly, with some Azure Files stuff around Azure Virtual Desktop and FSLogix integrations and things like that, haven't we? But there are definitely a lot of uses there. I guess another question, because you'd maybe separate storage accounts for specific services: I think you can enable several of those services in the same storage account, can't you?

20:43

Yeah, exactly. It's more of a container for those different services. So it can be a little bit confusing, because it's a storage account with multiple different types of data in it. So yeah, it's a bit of a catch-all, really. Yeah. Okay, so we talked about putting data into the storage account in the various services there. So how can you manage that data within the storage account?

21:23

Yeah, so as I've mentioned as I've gone through, some of these storage services are going to use clients that you're already used to. Older protocols, I should say: SFTP, or what's the SSH one, SCP, sorry, and SMB, NFS, those types of storage connection protocols. But primarily with blob storage, and it's the same from many different vendors, it's a proprietary type of connection. People may have heard of S3 and S3 buckets. I suppose S3 was probably one of the first, if not the first, blob storage systems. In the early days of cloud I was more AWS than I was Azure, so I'm not going to call out who got there first. The S3 protocol is Amazon derived, but it is used by other people, so I assume it must be an open protocol or it's been reverse engineered; I'm not 100% sure. My point is that Microsoft has its own ways of connecting to a lot of these systems. You'll need to use an SDK, or you'll need to use a CLI. A lot of data management applications, the things that you would use to upload to FTP servers, SSH file shares, and so on, do support the authentication that's required for storage accounts. So just keep in the back of your mind that when you are using Blob Storage, you're going to need to integrate it into your app a little more closely than you would by just mounting a file share, if that makes sense. Now, that works relatively well for a lot of modern applications, because developers are used to integrating these SDKs to get access to data storage mechanisms. So that's not really much of an issue for newer applications; it's more of a legacy challenge. But there are also ways to get around that and keep the benefits. 
We talked about block storage for Azure virtual machines. So it might be that you run your application on an IaaS box in Azure and you still have access to highly durable and redundant storage in a storage account; you just access it in a different way. In one respect, that's why there are so many different offerings: there are so many different ways that organizations want to be able to store data. I think the odd ones out, to me, are table storage and queues, because they're a completely different type of data. A message in a queue or an entity in a table is essentially just a JSON object of data. It's not like a blob file, if that makes sense. It's like storing the contents of the file instead, and it's more structured in its approach. So that's the only thing I would say: it's just a little bit different for those two.
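
As a rough illustration of what "integrating an SDK" means for Blob Storage, here is a minimal Python sketch. It assumes the `azure-storage-blob` and `azure-identity` packages are installed and that the caller already has a suitable data role on the account; the account, container, and blob names are placeholders, and the endpoint suffix shown is the public-cloud one. It is one possible shape, not the only way to do it.

```python
def blob_account_url(account_name: str) -> str:
    """Build the public-cloud blob endpoint for a storage account name."""
    return f"https://{account_name}.blob.core.windows.net"


def make_client(account_name: str):
    """Create a BlobServiceClient that authenticates through Entra ID."""
    # Imported lazily so the URL helper above works without the SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.blob import BlobServiceClient

    return BlobServiceClient(
        account_url=blob_account_url(account_name),
        credential=DefaultAzureCredential(),
    )


def upload_and_read_back(account_name: str, container: str, name: str, data: bytes) -> bytes:
    """Upload one blob, then download it again (one object at a time)."""
    blob = make_client(account_name).get_blob_client(container=container, blob=name)
    blob.upload_blob(data, overwrite=True)
    return blob.download_blob().readall()
```

A call such as `upload_and_read_back("examplestore", "images", "profile.png", b"...")` would need a real account and network access, which is the closer integration described above compared with simply mounting a file share.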

25:11

Yeah. Just quickly, because I think storage accounts have been around for some time now: it kind of feels like this was the initial way to do these things. You said table storage was similar to Cosmos DB, and I'm not sure Cosmos DB existed at that time, or whether it was in Azure as a service. It kind of feels like this was the gen one of storing NoSQL, or something similar to Cosmos. Same with the queues. Maybe they were the first versions, before, like you said, Service Bus, and is it Event Grid, something similar, maybe not, came to light and into service.

26:01

Yeah, my gut is, and I might get the timeline wrong, that table storage did exist before Cosmos, because I feel like Cosmos is more recent. I believe Cosmos came after DynamoDB from AWS. I could be butchering it, but it felt like it was a response to DynamoDB, which is the Cosmos DB equivalent on the AWS side. I think that's how the timeline went, and it's probably unfair for me to call it out like that. The only thing I would say is, take queuing systems as an example: there are loads of open source queuing systems out there and they've existed forever. The way that Service Bus is delivered and managed is innovative, but the underlying concept isn't novel, if that makes sense. I don't really know about early Service Bus or when that came to life. But these two solutions, Table Storage and Azure Queues, are definitely more basic versions. With queuing systems, for instance, it's really around delivery: making sure that there's effective durability of messages, making sure messages aren't duplicated, those types of things. There aren't those levels of consistency with Queues, so they suit you if you don't need some of the more advanced features. Yes, the others are definitely the older siblings, if that makes sense, the more mature offerings, but they're kind of serving two different audiences, I think, personally.

28:17

Yeah. And like you said, they may be older, or more senior, but both definitely have a purpose in the ecosystem. So, yeah. Cool. Is there anything else, Sam, around managing the data in them? Sorry, I probably sidetracked us a little bit there.

28:42

No, there are too many services to go into them in detail. The main thing is you might need to use an SDK or a CLI to get access to it. So just keep that in mind when you're going to integrate.

28:55

Okay, cool. So onto the next area, which is really around security and securing access to data. As always, depending on what type of data you have in them, it could be quite critical to your organization, or bad actors may want to collect it for their own use. So what are some of the mechanisms we can use to secure the data?

29:25

Okay, so let's talk about some different scenarios, because there are a few different things here. Where should we start? Let's go for data protection to start off with: making sure that the data in those storage accounts is correct and safe. One thing that storage accounts are sometimes used for is storing uploaded files. This could be in your application. Let's say you've got a social network, and people on your social network can upload images. You're going to want to store those images somewhere, because they want them to be on their profile forever. So you might have a file upload control in your application, then you take that file and put it into a Blob Storage account so that you can always retrieve it via the application. One of the challenges with that is making sure that malicious files aren't uploaded: the spread of malware, executables, those types of things. You can do some of that filtering in your app itself, but there are also bolt-ons for that type of scenario. Defender for Storage, I believe it is; I don't know which tier, so we won't go into that. There is an element of anti-malware inside Defender for Storage to protect against those types of uploads. There is more to it than that: it looks at access attempts and raises alerts for exploitation of your storage accounts. That is quite a complicated thing to manage yourself. Initially, when I looked at Defender for Storage, it looked quite expensive, but the anti-malware capability is actually pretty powerful on that side of things. The other aspect is immutability and data protection. Storage accounts have the ability to turn on soft delete for different areas, so in a breach scenario, or if there's an issue with your application, you do have some protection of the actual data there. 
You can also have immutable blobs that can never be deleted, and you can specify a retention interval if you want to, because obviously your costs could spiral out of control. One of the big things that we try to drive is making sure that secure and modern authentication is used, where possible, for these storage accounts. Storage accounts do have the ability to have a SAS token, which is... is it fair to say that's basically like a password, in the most basic sense, Alan?

32:36

Yes. Just a key, isn't it?

32:38

Just a key, yeah. So when you use that type of access, it's hard for you to control that key. You've got to think about management of it: who has access to it, and what happens if it's lost or stolen? So big bonus points for using things like managed identities and more secure access mechanisms, so essentially Entra ID, to authorize access to that blob data. I'm just going to scroll down. I think the next thing that I want to talk about is networking. No, let's talk about access, actually. One of the big things that we see is anonymous read access to containers and blobs. Containers are like a folder for blobs, and the blobs are the actual files themselves. You have an account, a number of containers, and a number of blobs; that's the hierarchy. Time and time again we see this, because by default, and I think this is fair to say, storage account owners, or let's call them Azure administrators, can create storage accounts and can share containers publicly out to the Internet. Historically, the configuration of this was pretty poor, I would say, from a user experience point of view. The UI was very basic: do you want it private, or do you want it public? And depending on how you accessed it with other services in Azure, it could be quite confusing how you actually got access to these blobs. So we generally saw a lot of Blob Storage accounts that were publicly accessible. There are a couple of different levels to what 'publicly accessible' means. I won't go into them, but there is a possibility that people could enumerate your storage accounts and access publicly available blobs that you've never linked to anybody. I think there was a website or open source project that effectively did that enumeration. This isn't just a problem for Azure storage accounts. 
S3 buckets have also been scraped in a very similar way. So the big takeaway is to make sure that you've got good governance of those storage accounts and of the creation of any new storage accounts going forward, so you're alerted to that. A lot of businesses that don't have public-facing applications don't need or want anonymous read access to containers and blobs, because you can imagine how dangerous that could be. So, just a public service announcement: check your storage accounts. And to add to that point, file access logging. Azure activity, the activity of the admins, is pretty well logged, I would say, but you have to enable blob access logs. If you want to see who's accessing what and have an audit trail of it, you have to do extra configuration, and a lot of the time that configuration isn't done. Then networking: you can use private endpoints, VNet connections, and specific network limits, so if you want completely private storage accounts, you can do that. You can also force things like secure transfer, so forcing TLS, HTTPS, those types of things. So the key thing I want to raise around security is that you're potentially storing a lot of sensitive, important, and valuable data there. Just make sure that you've got the right access control and networking in place, and also blob data durability, and make sure you're securing them in the right way.
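
The earlier point that a SAS token is "just a key" can be illustrated with a small Python sketch. It shows the general shared-secret idea (an HMAC-SHA256 signature over a description of the grant): anyone holding the key can mint a valid signature, and the service cannot tell who did it. The grant string format, the demo key, and the function names are all invented for illustration; Azure's real string-to-sign has many more fields.

```python
import base64
import hashlib
import hmac


def sign(account_key_b64: str, string_to_sign: str) -> str:
    """Sign a grant description with a base64-encoded shared key."""
    key = base64.b64decode(account_key_b64)
    digest = hmac.new(key, string_to_sign.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")


def verify(account_key_b64: str, string_to_sign: str, signature: str) -> bool:
    """Check a signature; only proves the signer held the key, not who they are."""
    return hmac.compare_digest(sign(account_key_b64, string_to_sign), signature)


# Hypothetical key and grant format, for illustration only.
DEMO_KEY = base64.b64encode(b"not-a-real-account-key").decode("ascii")
grant = "read|images/profile.png|expires=2030-01-01"
token = sign(DEMO_KEY, grant)
```

Because the signature carries no caller identity, losing the key means losing control, which is why identity-based access through Entra ID and managed identities gets the "bonus points" above.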

37:18

Yeah, it's probably worth mentioning, because you mentioned Defender for Cloud in there around malware being uploaded, that the second part is data discovery. You can use the data security part of Defender for Cloud to see what are deemed to be publicly exposed accounts, which still need investigating to see if they're truly public. It's fair to say some are, isn't it? It can also help you understand if you've got any PII data in there.

37:56

Yeah, it's definitely a good start. I think the recommendations from Defender for Cloud about your storage accounts being public are probably the most valuable part to me, because you actually know whether it's public or not. It can give you an indication of PII, but that's not always the full picture, is it? It could be a load of .bak files of your SQL dumps in there. Basically anything could go in there. So for me it's all about mapping what is there, investigating it today if you think you've got a problem, and then putting the controls in place, either preventing people from creating public storage accounts, which you can do, or being alerted when somebody does so you can investigate. And those things aren't that complicated to set up in the grand scheme of things.
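
The "map what is there" step can be sketched as a small audit over an exported inventory. This Python example assumes the inventory is a list of dictionaries whose keys mirror the `allowBlobPublicAccess` and `supportsHttpsTrafficOnly` storage account properties; the export format, the function name, and the sample accounts are hypothetical, and a missing setting is deliberately reported so that it gets reviewed.

```python
def audit_storage_accounts(accounts: list[dict]) -> list[str]:
    """Return human-readable findings for accounts that need a closer look."""
    findings = []
    for account in accounts:
        name = account.get("name", "<unnamed>")
        # Anything other than an explicit False is treated as a finding.
        if account.get("allowBlobPublicAccess") is not False:
            findings.append(f"{name}: anonymous blob access is not disabled")
        # Anything other than an explicit True is treated as a finding.
        if account.get("supportsHttpsTrafficOnly") is not True:
            findings.append(f"{name}: HTTPS-only transfer is not enforced")
    return findings


# Hypothetical inventory export, for illustration only.
inventory = [
    {"name": "appdata", "allowBlobPublicAccess": False, "supportsHttpsTrafficOnly": True},
    {"name": "legacyshare", "allowBlobPublicAccess": True, "supportsHttpsTrafficOnly": False},
]
```

A report like this only tells you where to start investigating; preventing or alerting on new public accounts is the separate control discussed above.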

38:55

No. I think another key area, like you said, is getting Entra ID to do the authentication, because at least then you can identify which service is accessing it. Yeah.

39:03

Because in effect the SAS keys are just a key for anyone who has them. There's two of them, isn't there? I think so. So it's about identifying which service is doing it. If there's an issue, you can understand which service is actually trying to access it, or at least which account, so it gives you an idea of where you need to start targeting your investigation. Yeah, definitely.

39:28

Okay, so the data's stored. How do I know it's available to me? What are the high availability options we've got here around the storage? On-prem, I'd say I've got two NASes in two locations, as a scenario.

39:48

Yeah. So I suppose the big selling point of a remotely hosted and managed solution is that the responsibility for storage moves to somebody else. You could view that in a few different ways, because it's not in your control: what if things go down, or things go wrong? Let's talk about the different redundancy levels that you can have. Locally redundant storage is the base level; I think that's what you get out of the box. There are three copies of your data inside a single data center, which provides at least eleven nines of durability over a given year, so 99.999999999%. That protects you against failure of the server rack itself, but it doesn't protect you against things like fire and flooding, because it's inside a single data center. Zone-redundant storage is the next level up, which replicates your storage account synchronously across three Azure availability zones in a primary region. Each availability zone is a separate physical location with independent power, cooling, and networking. That gets you up to twelve nines of durability over a given year. What's great about that is you get zone redundancy: if an access request comes in and an availability zone is down, it can route to another zone for you. There are price differences between these two, and we'll talk about those later, but you massively increase the resilience of your storage there. To be clear, that is three individual copies spread across availability zones, so there's only one copy in each availability zone. So it's the same number of copies as the previous one. 
But if you want redundant storage in a secondary region, you can go up to geo-redundant storage, which has a primary and a secondary region with three copies of your data in each. If you go up to that, you get 16 nines of durability and you get that geo-replication, but you also get multiple copies stored in each data center to protect you against rack failure there as well. The next level up from that is geo-zone-redundant storage, which gives you zone-redundant storage in your primary region. So you still get the three availability zones in your primary region for lowest latency and good access, and then a full three-copy replication to a secondary region. When you're using ZRS, which is a single copy in each availability zone, you do get better performance and latency. So that again is 16 nines of durability over a given year, but it's set up in a different way. So there are four different tiers. With the first two, LRS and ZRS, there are three copies of your data, either in one data center or across availability zones. GRS and GZRS are six copies of your data across two regions. So you have some really good options for your storage availability there.
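
To put those "nines" into numbers, here is a small Python sketch that records the four options as described in the episode and estimates expected object loss per year. Reading "N nines of durability" as a per-object annual loss probability of 10^-N is a simplifying assumption for illustration, and the function names are hypothetical.

```python
# Copies, regions, and durability nines as quoted in the episode.
REDUNDANCY = {
    "LRS": {"copies": 3, "regions": 1, "durability_nines": 11},
    "ZRS": {"copies": 3, "regions": 1, "durability_nines": 12},
    "GRS": {"copies": 6, "regions": 2, "durability_nines": 16},
    "GZRS": {"copies": 6, "regions": 2, "durability_nines": 16},
}


def annual_loss_probability(nines: int) -> float:
    """Chance of losing a given object in a year, assuming 10^-nines."""
    return 10.0 ** -nines


def expected_objects_lost_per_year(option: str, object_count: int) -> float:
    """Expected number of objects lost per year for a redundancy option."""
    nines = REDUNDANCY[option]["durability_nines"]
    return object_count * annual_loss_probability(nines)
```

Under that assumption, a billion objects on LRS would give `expected_objects_lost_per_year("LRS", 1_000_000_000)` of roughly 0.01, and each extra nine divides the figure by ten.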

44:45

Am I right? And maybe I'm not, Sam, I just want to clarify this. I think whenever you go to an availability zone, in effect, like you said, there are always three copies in that single data center or availability zone. And like you said, if you do... No, not necessarily. No. It's only if you go for zone-redundant storage, and that's three copies in total, one copy in each availability zone. To get the redundancy in a second availability zone you need to go for geo-zone-redundant storage.

45:21

Well, geo-redundant storage in a second availability zone, but I thought that in each availability zone your data was still split three times because of that. No, only if you go for geo-redundant storage. ZRS, the second one, is only one copy in each availability zone. Right. Okay. Yeah. It's only the base locally redundant storage. You need to move up for geo-redundant storage. Let me just... I've got some screenshots I'll send to you and show you that.

46:01

That's cool. Okay. Yeah. So definitely ways to keep your data available and safe in various, like you said, different regions. Yeah. Or just locally in a region.

46:18

Yeah. And that is just for object durability. If you want availability for read and write requests, you essentially only ever get three nines of availability on that side of things. So you might need to think about your application and how you handle access failures in it, if that makes sense. They're only giving those incredibly high durability figures for the stored objects themselves.
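To put three nines of availability in context: 99.9% leaves room for roughly 8.76 hours of unavailability per year, so an application should expect the occasional failed request. A minimal sketch of one common way to handle that is retrying with exponential backoff. The helper names here are hypothetical, `ConnectionError` stands in for whatever error your storage client actually raises, and the Azure SDK clients have their own retry settings, which this does not model.

```python
import random
import time


def allowed_downtime_hours(nines: int) -> float:
    """Hours per year that an availability of N nines leaves unaccounted for."""
    return (10 ** -nines) * 365 * 24


def call_with_retries(operation, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call operation(), retrying on ConnectionError with exponential backoff.

    The delay doubles after each failed attempt, with a little random jitter
    so that many clients do not all retry at the same moment.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller decide what to do
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))


print(f"Three nines allows about {allowed_downtime_hours(3):.2f} hours of downtime a year")
```

In use, you would wrap a read or write in a function and pass it in, for example `call_with_retries(lambda: read_blob("report.csv"))`, where `read_blob` is a placeholder for your own storage call.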

47:07

Yeah. That's cool. Okay, so I guess the big question to sort of end with is how much does it cost?

47:20

Okay, I'm going to fire up the pricing calculator, because this is going to be a bit of a wild one. Right. Sorry, I just want to make sure I've got the right one and work out the best way of doing this. Okay, let's go with Blob Storage first, because that's probably the biggest one. So with Blob Storage you pay for data storage by the gigabyte. What's going to be the best way for me to represent this? There are about 15 tables covering all the different options, so I'm not going to talk about numbers, I'm just going to talk about how it's charged. It's charged per gigabyte, and you have different storage tiers: premium, which I believe has better IOPS and performance; hot, which is just live access; then cool, cold and archive. As you shift along those tiers, you essentially get different cost profiles for storing and accessing data. If you're in archive, your cost per gigabyte is incredibly low, but I believe you're charged to restore that data in order to read it. It's also not immediately available: I believe you have to request it and wait for it to be retrieved from archival storage, so there's a latency consideration you need to think about there. The only other thing I would say is that you're also charged differently for different durability levels. With geo-redundant storage, geo-zone-redundant storage and the others, you need to think about the extra storage, and I think that basically depends on the number of copies of the data that you have. You've also got to think about the data transfer price for replicating across regions, because I believe there is a cost for that as well.
So you do have to be careful with Blob Storage, because it can get very expensive very quickly. You also get charged per operation, so it's not just the storage, it's writes, reads and the other operation types. If you are planning a deployment of storage, definitely go in and do some calculations on it. Table Storage: let me just bring up the pricing for Table Storage. Sorry, they're split out on the Microsoft website, even though they're together in the UI. Table Storage can also go across zones, so there are different storage costs per gigabyte there. Even though it's a table's worth of data, you're not paying to have a system running constantly. There are also read, write, scan and list operations there, and there's different pricing depending on the durability. Another thing that you need to think about is Azure Queue. Sorry, I'm just looking these up live. With Azure Queue again there's data storage with different levels of durability and a cost per operation as well. But weirdly you don't pay for geo-replication data transfer for storage queues. Didn't know that. That's cool. So yeah, it's probably not worth going through all the pricing, because it is nuanced in the detail there.
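The shape of the bill described above (a per-gigabyte storage charge, a per-operation charge, and a geo-replication transfer charge where it applies) can be sketched as a simple calculation. Every price in this example is a made-up placeholder, not a real Azure rate, and the function name is hypothetical; take the actual figures for your tier, region and redundancy level from the pricing calculator.

```python
def estimate_monthly_cost(
    stored_gb,
    price_per_gb,
    operations,
    price_per_10k_ops,
    replicated_gb=0.0,
    geo_transfer_price_per_gb=0.0,
):
    """Rough monthly estimate from the three cost components in the episode.

    All prices are supplied by the caller; nothing here is a real rate.
    """
    storage = round(stored_gb * price_per_gb, 2)
    ops = round(operations / 10_000 * price_per_10k_ops, 2)
    # Set both geo arguments to 0 for LRS/ZRS, or where no transfer fee applies.
    transfer = round(replicated_gb * geo_transfer_price_per_gb, 2)
    return {
        "storage": storage,
        "operations": ops,
        "geo_transfer": transfer,
        "total": round(storage + ops + transfer, 2),
    }


# Placeholder numbers purely to show the arithmetic, not real prices.
example = estimate_monthly_cost(
    stored_gb=1000,
    price_per_gb=0.02,
    operations=1_000_000,
    price_per_10k_ops=0.05,
    replicated_gb=100,
    geo_transfer_price_per_gb=0.02,
)
print(example)
```

Even with placeholder figures, this shows why operation-heavy workloads can cost noticeably more than the storage charge alone would suggest.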

52:19

Cool. Yeah, there's definitely a load of combinations in there. Yeah, I think not only are people going to be accessing different types of data in different ways, they're going to have different durability and access latency requirements for that data. So I think they've got a big matrix of different combinations to cater for all of those different scenarios.

52:52

Yeah. Cool. Okay. Is there anything else then, Sam, that we want to talk about? We're at 53 minutes. Yeah, let's not go into... We've just done an overview of Azure Storage. I think we could do an episode on each different storage mechanism. We probably should. So yeah, we'll leave it there for now.

53:14

That's cool. Yeah, I think it'd be good to get use cases for some of them and things like that in a bit more detail. Cool. Okay. The next episode I'm going to do is on Defender External Attack Surface Management. So understanding what your risks are, in effect your attack surface out there. We'll just run through that and get some insight into what you can discover, and then how it feeds into some of the other products to help identify risks, vulnerabilities and things like that. I don't think we've really touched on it too much, have we, Sam? We've mentioned it occasionally, but I think it's probably worth diving in a little bit now.

54:04

Yeah, I don't think we have. No, yeah, sounds great. Cool. Okay, so did you enjoy this episode? If so, please do consider leaving us a review on Apple, Spotify or YouTube. This really helps us reach out to more people like yourselves. If you have any specific feedback, comments or suggestions, we have a link in our show notes, or you can comment on the episodes on YouTube. Yeah. And if you've made it this far, thanks ever so much for listening and we'll catch you on the next one.

54:32

Yep. Thanks, all.

Transcript source: Provided by creator in RSS feed
