Updates to CosmosDB Security

00:00

Welcome to the Azure Security Podcast, where we discuss topics relating to security, privacy, reliability, and compliance on the Microsoft Cloud Platform. Hey everybody, welcome to Episode 51. This week, it is myself, Gladys, and Mark. Sarah is actually busy with some work stuff. We also have a special guest, Thomas Weiss, who's here to talk to us about some of the changes in Cosmos DB security since the last time he was on the podcast, which was June 2020.

00:32

Yes, we've been going nearly two years. In fact, the next episode will be Episode 52, and that'll be two years. Before we get to the news though, I want to tell you a little story. It's actually a really cute little heartwarming story. So my mother-in-law is actually in assisted living, and my niece is actually in the same facility, but she's doing the books.

00:51

There's a gentleman in there by the name of Mark, and he suffers from a debilitating disease called Huntington's disease, which honestly I'd actually never heard of until they told me this little story. Mark's in his 50s, but the thing with Huntington's is it leaves you relatively motionless and talking becomes restricted as well. My niece found out that Mark was actually in cybersecurity at one point, and she called my wife and she said, hey, what's the name of Uncle Michael's podcast?

01:22

My wife gave her the name after asking me, because of course she had no idea what the name of the podcast was. So my niece found the podcast on Spotify on her phone, and she put the phone next to Mark. Mark started smiling and nodding his head as he was listening to this. Big shout out to Mark. Thank you for listening. We hope it brings you some joy. It was really kind of heartwarming when I heard that story. So with that little episode out of the way, let's turn our attention to the news.

01:51

Why don't we kick things off with Mark? Yeah, and a shout out to the other Mark too. So, doing my best to follow that: one of the things that I've been waiting on for a little while has finally been released. There is a new release, revision four of the 800-40 from NIST, the National Institute of Standards and Technology. This is NIST guidance focused on enterprise patch management. This work and the conversations for it started around the time just after the Petya and NotPetya type of attacks.

02:27

So we went through the government process to get an NCCoE project, a National Cybersecurity Center of Excellence project, around patch management. And we needed to revisit the higher level guidance as well that guides that work. And so the two documents, the 800-40 which is the overall guidance, including the scenarios that were tested in the lab.

02:52

And then the lab document itself and the results from a bunch of vendors came into the NIST labs and helped take the current best practices, current knowledge, and implement it and show how to implement it using their technology. And so these two documents both came out recently. And then one of the nice things that's in there is, it didn't really treat enterprise patch management as just an isolated thing.

03:21

It's actually sort of a normal part of how do you do maintenance, just like you can communicate with a business leader, like, hey, this is just like changing the oil and otherwise maintaining your fleet of trucks or planes or whatever. You need a little downtime, you need to do some maintenance on it. And so a nice focus on sort of the end-to-end view of patching, not just a technical practice, as well as some linkage to zero trust and how it fits into a zero trust concept and context rather.

03:49

And so really, really liking that document that came out there. So we'll send you links for both of those. And that's what I got for this week. So, shout out for Mark as well. I have three news items that I want to share. We actually just released, a few days ago, a few articles that talk about the work Microsoft did to help disrupt the ZLoader campaign. One of the articles was named Dismantling ZLoader:

04:24

How malicious ads led to disabled security tools and ransomware. I love this article because it's a perfect example of the partnership that Microsoft is offering, not only to customers, but overall to the security community. We are following our mission, right, to empower every person and every organization on the planet to achieve more.

04:51

So not only do we have a commitment to keep improving the capabilities within our tools to deal with new threats and risks, which helps our customers to protect, detect, respond and recover, but we also share our findings with the community, so other customers and vendors can improve their capabilities. So this is really important. The article basically goes into explaining how different capabilities, such as Microsoft Edge and SmartScreen, can be used to protect services and Windows overall.

05:32

It also covers how our services like Defender are using detection capabilities to protect and help recover if a customer is attacked with this campaign. The next news that I wanted to share is that this is the fourth consecutive year that Microsoft 365 led the independent MITRE Engenuity ATT&CK Enterprise Evaluation. Basically we successfully surfaced a comprehensive incident to the Security Operations Center for each of the simulated attacks that were performed.

06:13

In the spirit of assume compromise, yes, we provide the capabilities to protect. But what this shows is that we understand it is important to detect fast and respond as fast as possible. And we enable that by interconnecting our products. So in near real time, we can control the damage that attackers may do and recover from it.

06:43

So in the article, which is named Microsoft 365 Defender demonstrates industry-leading protection in the 2022 MITRE Engenuity ATT&CK Evaluations and is on the Microsoft blog, you will see where Microsoft 365 demonstrated complete technique coverage across attacks from groups such as Wizard Spider and Sandworm. Basically they were leveraging AI-driven adaptive protection. And you see the attack chain and how we were protecting against those different attacks.

07:24

Last, I wanted to mention the new unified Microsoft SIEM and XDR GitHub community. This is a good way that Microsoft is providing a lot of information that shows our efforts across SIEM and XDR, basically to enable the SOC team to centrally discover the latest hunting queries and analytics that can be used both in Microsoft Sentinel and Defender. So the first news item I have is we now have some new VM types under the Azure Confidential

08:03

Computing banner that use NVIDIA GPUs. Now the nice thing about this is that if you want to use the GPUs to accelerate artificial intelligence research, we can do it under these confidential compute VMs. Historically we have these VMs that run with various types of CPU. Now we have included NVIDIA GPUs as well. For those of you who are not aware, you can write code that runs across all the hardware threads of the GPU to parallelize the work. So it's great to see that.

08:34

Next one is Azure Data Explorer. We talked about Azure Data Explorer a few months ago and it's a way of offloading, for example, massive logs to something that's a little bit more cost-effective in the long term. That is Azure Data Explorer. Now we've added conditional access support to Azure Data Explorer. That's great to see because obviously log files can contain sensitive information.

09:01

So now we can apply conditional access policies. So for example, if someone comes in and some condition exists, someone wants to read this data, then they may be prompted for MFA as an example. Next one is Azure Load Balancer. Now it allows you to manage port forwarding for the back-end pool.

09:20

Going to be honest with you here, I understand what it all means, but I don't necessarily understand exactly the why, but managed port forwarding is now available in Azure Load Balancer at the back-end pool. So for those of you who care about that, you're welcome. Now this next one I do thoroughly understand and I'm very, very happy to see this. There's a kind of attack called a dangling DNS problem. And without going into all the horrible details, I will provide some links in the show notes.

09:53

But because in Azure, and same with other cloud platforms as well, you're bringing services up because of the whole agility of the cloud, you're bringing services up, you're tearing them down, you're bringing them up, you're tearing them down, then if you apply a DNS name to that, well, that DNS name can actually live longer than the service exists. And unfortunately, that opens it up to potential hijacking of that name. And the attacker could actually take on, essentially direct the

10:23

DNS name to their IP address. And Microsoft Defender for DNS will actually pick up on this as a potential issue. However, as an extra layer of defense, we now have this ability to have DNS reservations. So what that means is the DNS names can actually hang around for a lot longer. So you can actually register a DNS name. And if you bring a service up, pull it down, bring it up, pull it down, whatever, the DNS name will basically be associated with that

10:53

particular service for a much longer time. And that helps, it really does improve your security posture substantially. The DNS issues can lead to what's often referred to as a subdomain takeover. This can lead to things like cookie harvesting, phishing campaigns, cross-site scripting, cross-site request forgery, CORS bypass, and even loss of control over the

11:17

contents of that subdomain. So it's actually a pretty serious attack. But yeah, so one, Microsoft Defender for DNS will help catch this, but also we now have something a little bit more concrete in place as well. The next one is about Azure Monitor Logs. I don't want to go into this in too much detail because there's actually a lot of announcements here. And again, I'll include the links in the show notes. There's been a lot of feedback from customers around how they can optimize

11:41

their spend with logging. I've seen customers using, let's just say, not particularly cost-effective mechanisms for storing their log files. And there's lots of options that we have. But essentially what we're doing here is providing even more options for people to store logs in long-term storage, storage in such a way that it's optimized just for querying. So that's all available today.

12:05

Again, links in the show notes. But if you're finding you've got big log files and perhaps the cost of retaining and querying those log files is something that's of concern to you, then take a look at this. We've basically made a whole bunch of options available to you. And the last one is kind of interesting. So a good friend of mine, Ian Hellen, and he and I have worked together for a long time over the years. And he works on the MSTICPy team. So we talked about MSTICPy

12:29

some months ago as well. And MSTICPy is a Python library that basically contains some cybersecurity tools for hunting and investigation using Jupyter notebooks. Well, Ian has actually made available a course up on Pluralsight, certainly a short course, but well worth looking at if MSTICPy is something you're looking at using or kicking the tires on. So if you already have a Pluralsight subscription, then go and take a look at Ian's class. All right, now we've got the news

12:57

out of the way. Let's turn our attention to our guest. This week, we have our very first repeat guest. It's Thomas Weiss from the Cosmos DB team. Hey, Thomas, thank you so much for joining us this week. For our listeners who didn't hear the first podcast, would you care to introduce yourself? Absolutely. Thanks, everyone. Thanks for having me. So I'm Thomas Weiss. I actually used to be a Cosmos DB PM. I joined the Cosmos DB team close to three years ago. I used to be a PM focusing on

13:23

a flurry of different feature areas, security being one of them. The move that happened recently for me is that I migrated to a new database platform security and governance team. What that means is, technically, I changed titles and managers, but my day-to-day duties are pretty much the same. I am looking after security all around for Cosmos DB and I'm leading the delivery of any new security features on that great platform. So I'm going to get my question in before Michael starts geeking

13:50

out and I don't have a chance to get a word in. So tell me, tell us about some of the big changes that have happened since last we spoke on the podcast. Yeah, well, so much has happened, let me tell you. It's been 18 months, I think, and we obviously never stop either introducing new security controls or improving existing security controls. And I think one big driver for us is

14:16

to try to catch up with our big brother that is SQL. Those guys, they have a two to three decade head start, and actually it's nice for them to give us the security charter because they kind of paved the way on what is next that we should tackle. And among those things that SQL introduced a while back and that we were lacking was support for client-side encryption, what is called Always Encrypted on SQL. This is something that we identified as a gap as customers were

14:45

progressively reaching out to us, the Cosmos DB team, saying, look, that thing is great on SQL and this is something that I would need on Cosmos DB as well. They were looking for some kind of column level encryption, which obviously doesn't directly translate to Cosmos DB because, being a document database and a schema agnostic database, there is no concept of a column per se. Essentially, we understood that what customers were looking for is property level encryption.

15:11

So it was an amazing journey for us to start engaging with those customers, really understanding their requirements end to end, and also engaging with our SQL friends because it would have been crazy for us to tackle that by ourselves. And so the delivery of Always Encrypted on Cosmos DB is the result of a landmark collaboration between the Cosmos DB and the SQL teams. We collaborated on a design, our SQL friends reviewed our design and made sure that it made sense.

15:40

And we even collaborated very technically in that we actually ended up sharing code. And so the code that is running in our drivers to provide Always Encrypted to our customers is pretty much the exact same code that is running in the SQL drivers. And I think what's interesting, a quick note for our listeners, is that the way Cosmos DB encrypts data today through client-side encryption is pretty much ciphertext compatible with Always Encrypted on SQL. So that's how far we took it.

16:12

We introduced that public preview a while back. I think it was at Build last year, sometime in the May time frame. We got great customer feedback and response, a lot of high-profile customers for which it is very important to put an extra layer of encryption on top of extra sensitive data. Those customers obviously were very keen to get started. We went through that preview program with some changes we brought to the API based on the feedback we got from

16:42

the customers. And I'm more than happy to share that we were actually able to make that feature generally available last month. And so this is now available for all our customers with production support. And I think an important point is that the resulting cipher suite is the exact same cipher suite as SQL Server, right? Which is AES-256 with HMAC-SHA256. So it's just kind of funny, right? Because it's not really always encrypted at

17:13

all. It's actually always protected, right? Because the encryption is provided by AES. But there's also an HMAC in there that also makes sure the data has not been tampered with. That's right. That's right. So when he says the same data format as like the same cipher text format as SQL server, does that mean in theory that I could say pull some data out of

17:37

SQL server without decrypting it? Like say bulk export it and then through some code or something, turn that into a JSON document without decrypting anything and then insert it into Cosmos DB? So technically you can. This is not something that today we support as a turnkey feature. We haven't documented that to be honest. It is on our charter to provide that level of interoperability to our customers at some point. We just set the foundation for that to happen.
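
To make the earlier "always protected" remark a little more concrete, here is a minimal sketch of the tamper-detection half of such a cipher suite: an HMAC-SHA256 tag appended to an already encrypted blob and checked before use. It is only an illustration. The function names, the placeholder ciphertext bytes, and the simple "ciphertext followed by tag" layout are assumptions made for this sketch and do not reproduce the actual Always Encrypted wire format.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # HMAC-SHA256 produces a 32-byte tag


def protect(mac_key: bytes, ciphertext: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to an (already encrypted) blob."""
    tag = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    return ciphertext + tag


def verify(mac_key: bytes, blob: bytes) -> bytes:
    """Return the ciphertext if the tag matches, otherwise raise ValueError."""
    ciphertext, tag = blob[:-TAG_LEN], blob[-TAG_LEN:]
    expected = hmac.new(mac_key, ciphertext, hashlib.sha256).digest()
    # compare_digest avoids leaking where the first mismatching byte is
    if not hmac.compare_digest(tag, expected):
        raise ValueError("ciphertext was modified")
    return ciphertext


mac_key = os.urandom(32)
# Placeholder bytes standing in for real AES output (assumption for the sketch).
blob = protect(mac_key, b"opaque-aes-ciphertext")

# Flip one bit of the stored blob to simulate tampering at rest.
tampered = bytes([blob[0] ^ 0x01]) + blob[1:]
```

Calling `verify(mac_key, blob)` returns the original bytes, while `verify(mac_key, tampered)` raises, which is the property the speakers are pointing at: the client can tell that stored ciphertext was altered even though the service never held the key.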

18:04

We need to make some improvements on the tooling that would let our customers move the data around. But technically, I mean, we have done it in our labs and it works. Now the next step for us is to productize that support, which is something we are going to work on. Yeah. So I mean, it's impossible to talk about Always Encrypted in Cosmos DB without comparing and contrasting with SQL Server. Again, to your point, they had a massive head start.

18:30

Actually, there's pros and cons to that. From a pro perspective, it's mature technology. From a con perspective, SQL Server has this internal access model that bears no resemblance to anything that Azure knows about. So you end up having to map Azure identities and Azure access control models to work inside of SQL Server, whereas you guys were born in the cloud, right? So you

18:53

understand that from day one. Yeah, absolutely. But that said, Michael, I think it's no coincidence or no surprise that we named the feature Always Encrypted as well because essentially, for our customers, it's the same approach, it's the same concept, in that we let customers

19:10

create what we call encryption policies at the container level. So with those encryption policies, customers can say, I want property one to be always encrypted with key one in a deterministic fashion and I want property two to always be encrypted with key two in randomized fashion. So when the customer creates those Cosmos DB containers, we give them that option to define and provide that encryption policy. And also the way keys are handled is very similar to

19:37

what happens on SQL Always Encrypted, in that we let our customers create data encryption keys at the database level. Those data encryption keys are themselves wrapped with a customer

19:48

managed key that our customers today can manage from Key Vault. And so at the end of the day, the same security guarantees are provided today on Cosmos DB in that when customers encrypt data with Always Encrypted, they are guaranteed that we, Cosmos DB as a service, never ever see the plain text data and, not only that, but we never ever see the plain text data encryption keys. And so for customers, we want to have that guarantee that there is absolutely no way for a

20:16

Microsoft or a Cosmos DB operator to eventually decrypt the data. That's a guarantee that comes with the feature. Yeah, that's an important point. And that's something that I certainly talk about heavily with customers. Heaven forbid, but in the case of, say, a Cosmos DB environment being compromised, Cosmos DB doesn't have the keys. I mean, the attackers are just going to get ciphertext. The key is not there. I mean, it's not

20:40

like, you know, I can go and get the key and do it. No, the key is not there. I can't convince Cosmos DB to decrypt it. It doesn't have the key. It doesn't have access to the key at all. It is client-side encryption, right? So it's the clients that have access to the key, and the RBAC policies will restrict access, you know, through the client, rather than Cosmos DB having access. And it's interesting because a lot of customers kind of have to stop and

21:03

think about that. It's like, yeah, Cosmos DB is not doing the encryption or the decryption. It's all being done by the Cosmos DB client drivers, not by Cosmos DB itself. And that includes some kinds of queries, right? That's right. That's right. We did our best to let our customers leverage the full spectrum of database operations when they are using Always Encrypted. That said,

21:26

there were some limitations that today we had to introduce. And those limitations mainly impact the query ability when customers want to filter on encrypted properties. Our listeners who are familiar with encryption will probably understand that we don't have any support for, you know,

21:40

secure enclaves like SQL has today. Today, the only filter constructs that we support are equality filters, meaning that if I have defined an encryption policy where property one is encrypted and is encrypted in a deterministic fashion, then it is possible for our customers to write a query where, you know, property one equals ABC. That's totally supported. Anything

22:03

more fancy, anything fancier like, you know, range queries, starts-with, things like that, today we cannot support, because that would require, you know, having support for secure enclaves, or some kind of confidential computing, which is something that's definitely on our radar. It is on our backlog. But I would add that, you know, those constraints today, they probably

22:26

impact a class of workloads that couldn't be today translated to always encrypted. But we need to keep in mind that there is a very large spectrum of workloads for which those constraints would still be absolutely fine. We do have a fair amount of customers who are actually using Cosmos DB as a key value store. The only thing they do are document lookups. And for those operations, no problem at all. And our situation and our position today is that we GA'd that first milestone,

22:56

that first iteration of Always Encrypted. Now we are just opening our ears and listening to customers. And if we have substantial feedback from customers saying, look, I'm blocked because, you know, I need range queries or I need things that today are not possible, this is how we are going to fuel the next iteration and how we are going to improve the feature. You know, it's kind of funny, right? I've had these conversations with customers as well.
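
The difference between deterministic and randomized protection, and why only equality filters survive it, can be sketched with a toy model. Everything below is a stand-in chosen for illustration: the `POLICY` dictionary is a hypothetical shape (not the Cosmos DB encryption policy schema), the property names are invented, and `toy_protect` produces one-way HMAC tokens rather than real, decryptable AES ciphertext. The only thing it is meant to show is that the same input yields the same stored value in deterministic mode and a different one every time in randomized mode.

```python
import hashlib
import hmac
import os

KEY = os.urandom(32)  # stays on the client; the "service" never sees it

# Hypothetical policy: property name -> protection mode.
POLICY = {"ssn": "deterministic", "notes": "randomized"}


def toy_protect(value: str, mode: str) -> str:
    """Toy stand-in for encryption: one-way tokens, NOT AES, not decryptable."""
    # Deterministic mode uses no salt, so equal inputs give equal tokens.
    salt = b"" if mode == "deterministic" else os.urandom(16)
    digest = hmac.new(KEY, salt + value.encode(), hashlib.sha256).hexdigest()
    return salt.hex() + digest


def protect_document(doc: dict) -> dict:
    """Protect only the properties named in the policy; leave the rest as-is."""
    return {k: toy_protect(v, POLICY[k]) if k in POLICY else v for k, v in doc.items()}


# What the service would store: protected values only.
stored = [
    protect_document(d)
    for d in (
        {"id": "1", "ssn": "111-22-3333", "notes": "vip"},
        {"id": "2", "ssn": "444-55-6666", "notes": "vip"},
    )
]


def find_by_ssn(ssn: str) -> list:
    """Equality filter: the client protects the parameter, the store compares tokens."""
    needle = toy_protect(ssn, "deterministic")
    return [d["id"] for d in stored if d["ssn"] == needle]
```

In this sketch `find_by_ssn("111-22-3333")` finds document 1 without the store ever seeing the plain value, while the two identical `notes` values are stored as different tokens and so cannot be matched at all. A range filter has nothing to work with either, because the ordering of the tokens says nothing about the ordering of the original values.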

23:18

Here you've got, you know, you look at the job of a database, right? It's to query stuff. And it's to get data. And then the whole point of encryption is to not get data, right? That's the whole point. So here we've got these two things that are at absolute opposite ends of the spectrum. And the fact that we're actually doing something in the middle that will, you know, that will actually allow you to do queries, even if it's a substantially small, I should say,

23:41

subset of queries, which basically is equality or inequality. The fact that you can do anything at all over ciphertext without decrypting it is pretty cool. I mean, you know, I have customers say, well, you know, can I just turn on Always Encrypted and everything will just work? It's like, no, it probably won't work because you probably have some SQL queries that, as you say, are things like range queries, right? Something between the first of September

24:04

and the third of September, it's not going to work, right? Because you're doing the query over ciphertext. But the fact that you can do some kinds of queries, especially over, you know, super sensitive data, that's just fantastic. I mean, I've seen people essentially change their schema slightly to accommodate this so that they can

24:24

actually do some kinds of queries. But yeah, you know, this is a great example, you know, as we talk about in our confidential computing literature, you know, this is encryption of data while it's in use. And so we're actually using it and querying it. And it's just fantastic to see Cosmos DB, you know, adopt this, adopt this technology, which has been in SQL server for a

24:42

while. But as you also point out, SQL Server has been around for, you know, for longer. And they also have the secure enclave technology, which allows them to do more types of queries, because the query engine runs in a secure enclave. For those of you not aware, a secure enclave in the sort of Azure confidential compute world takes advantage of virtual machines that run

25:06

specific Intel CPUs that have what are called Software Guard Extensions. And essentially, what we do is we cordon off, at the CPU level, a chunk of memory, and then the query engine runs inside that little enclave. That's what's called an enclave. So that is not available today in Cosmos DB, but it's something where, like you say, you're sort of listening to customers to see if it's of interest to them. Absolutely. Absolutely. That's our current situation. And that technical solution

25:29

is complex, right? Let's say, from an implementation standpoint, it's a huge effort for us to embrace confidential computing and enclaves to add additional query support. And that's why we are cautious, right? We don't want to engage in work that our customers wouldn't necessarily be interested in. So yeah, anyone listening here, if you think that you would need additional support, by all means, you know, funnel that feedback to the Cosmos DB team and we'll take

25:56

action on that. So Cosmos DB can talk multiple, air quote, languages: MongoDB, SQL, Cassandra, and so on. Is this limited to the SQL drivers or can other drivers handle it as well? That's a great question. So today, we have GA'd Always Encrypted on our SQL API, or Core (SQL) API, which is the native API, the API that we control, right? Now, having support for client-side encryption on the other platforms involves support for the back-end

26:30

features that let the client do that encryption. And I'm glad you bring it up, Michael, because for example, MongoDB has had support for what I think they call field level encryption in their drivers for some time. And for some time actually, we couldn't support that in our MongoDB compatible API because we were missing support for some back-end level features that the drivers are leveraging in order to deliver field level encryption. But with our latest iterations and

27:00

our latest versions, we now support that. What that means is that Cosmos DB users that use the MongoDB compatible API and want to use the vanilla field level encryption that comes with the MongoDB drivers now can do that as well. Yeah, as long as people understand that, they understand some of the limitations. Are there any other limitations, like what sort of data can be encrypted or particular types of data that can't be encrypted? No, when it comes to our MongoDB API,

27:27

we support the full range of what field level encryption can do. Not supporting enclaves, we come with the same limitation when it comes to the type of queries, but in terms of data types, whatever you can do on MongoDB can also be done through our MongoDB compatible API. But that is separate technology though, right? It's not always encrypted. No, it is part of the vanilla MongoDB drivers. So yeah, it is not always encrypted.

27:53

So if you're a MongoDB person using Cosmos DB, you'll be familiar with that? Absolutely. Yeah, this is really exciting stuff. I don't know if you know this or not, Thomas, but last week, well, the last time I did the podcast, we had no news on purpose so that Mark could actually talk about his stuff. But I couldn't help it because Cosmos DB had just GA'd Always Encrypted. So I'm like, well, you know what? We have no news, but hey, you know what? I'm going to talk

28:19

about this one thing anyway, even if it means stealing some time from Mark. But yeah, this is really exciting stuff. I'm such a fan of it. It allows some really interesting zero trust environments as well because Cosmos DB doesn't have access to the keys. You've got strong auditing of the keys, of the key encryption keys at least, in Azure Key Vault. I just think it's a fantastic solution. Great to see it, fantastic engineering, elegantly simple as well, which is

28:50

not to say that it's easy to implement, but it's just so elegant. I'm a huge fan. And again, it's not just encryption. There's an HMAC in there as well, so the data can't be tampered with. So as expected, I sat there listening and learning. Anything else in addition to the Always Encrypted goodness? Yeah, absolutely. I think there is another thing I would like to mention, something that is a bit less new, something that we introduced last year, but that was

29:19

another major update to the security controls that we are exposing on Cosmos DB. And this is the new Azure AD powered data plane RBAC that we introduced for our customers. That's a workstream that was initiated based on customer feedback. Some of our customers, and arguably the most security conscious customers, were asking for an alternate way to authenticate

29:44

their Cosmos DB database operations. Since the beginning, the only authentication that we supported was key-based authentication, so customers can fetch a primary and a secondary key, and those are the keys that they can use when they connect to their Cosmos DB account, which is fine in plenty of use cases, but we had some customers who arguably wanted something different, and what they were actually asking for is Azure AD-based authentication. That was one requirement that

30:11

we initially tracked. The second requirement that came from customers is that they wanted some kind of granular access control. Another problem that comes with the primary and secondary keys is that it's open bar, right? Whoever has the key can do pretty much anything on the account, including management operations. So that puts a lot of pressure on our customers to

30:32

handle those keys in the right way, right? What I would like to mention here very briefly is that, although it is extremely convenient to connect with a primary key, because again, it's just one shared secret that lets you do anything, our customers and our listeners here should be mindful that with great power comes great responsibility, and those keys need to be handled appropriately if you want to have the best security posture. In particular, it is a best practice to regenerate

31:03

and rotate those keys on a regular basis. This is not a reflex that many of our customers have. So we regularly remind our customers that it is a best practice, when you use key-based authentication, to perform those rotations. Now, as a side note, we are actually introducing new APIs to let our customers query the date of the last generation or regeneration of those keys so that they can take action on keys that may be stale.
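
Once a last-regeneration date is available for each key, flagging stale ones is a small piece of logic. The sketch below assumes those timestamps have already been fetched into a plain dictionary; the key names, the 90-day threshold, and the dictionary shape are all hypothetical choices for illustration and are not taken from any Cosmos DB API.

```python
from datetime import datetime, timedelta, timezone

# Assumed rotation policy for this sketch; pick whatever your organization requires.
MAX_KEY_AGE = timedelta(days=90)


def stale_keys(last_rotated: dict, now: datetime) -> list:
    """Return the names of keys whose last regeneration is older than MAX_KEY_AGE."""
    return sorted(name for name, ts in last_rotated.items() if now - ts > MAX_KEY_AGE)


# Hypothetical timestamps, as they might look after being fetched for one account.
example = {
    "primaryMasterKey": datetime(2022, 1, 5, tzinfo=timezone.utc),
    "secondaryMasterKey": datetime(2021, 6, 1, tzinfo=timezone.utc),
}
check_time = datetime(2022, 3, 1, tzinfo=timezone.utc)
overdue = stale_keys(example, check_time)
```

With these sample dates only the secondary key is older than 90 days, so it is the one that would be queued for regeneration.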

31:34

So yeah, that's another piece of feedback we got from customers, right? They would like to have some granular access control and they would like to better configure the client to make sure that client A can only do a subset of operations based on their business rules. And the third and last requirement we were gathering at the time is that our customers were seeking better auditability of

31:58

what's happening on the data plane, right? There are situations where the customer would like to answer questions like, you know, who deleted my data, and when, and those are questions that you just can't answer if you are using just a key as the authentication mechanism because anyone could be using that key. So it is based on that customer feedback that we have built that brand new data plane RBAC. In terms of API, what we did,

32:26

maybe a bit lazily, but I think that was the best thing to do anyway: we copied over the concepts from Azure RBAC because that was a model that has been out there for quite a while. And so all the typical concepts that you can find in Azure RBAC, like, you know, actions that make up a permission model, role definitions, role assignments, those are concepts that we carried over to the Cosmos DB side. And we have introduced new

32:54

APIs on our control plane to let our customers do that. Right. So now what our customers can do is they can either use data plane built-in roles, we have a couple of new built-in roles, a read-only and a read-write built-in role, or, I mean, we also let our customers craft their own specific roles, and customers can go very far in

33:17

specializing those roles, right? If you want to create a role to let the client just, you know, insert data and not do anything else, not even being able to read back the data it has inserted, which makes sense in some scenarios, right? That's possible now, and it's something that wasn't possible at all before. If you want to create a role that, you know, only lets you

33:36

read from the change feed, you know, and nothing else, that's the kind of thing you can do now. So you can craft your roles, you can then assign your roles to any kind of Azure AD identity. We support obviously the full spectrum from user principals to service principals, including managed identities, which is something I'm very fond of, and I always recommend our customers have a good look at managed identities. Azure AD groups obviously are also on the spectrum.
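To make the two custom roles mentioned here more concrete, the following Python sketch models an insert-only role and a change-feed-only role as plain dictionaries, plus a toy permission check. The role names are hypothetical, the data action strings are written as I recall them from the public Cosmos DB documentation and should be verified before use, and the `is_allowed` helper is only an illustration of the idea, not how the service evaluates permissions.

```python
from fnmatch import fnmatchcase

PREFIX = "Microsoft.DocumentDB/databaseAccounts"

# Hypothetical custom role: may create items but not read them back.
INSERT_ONLY_ROLE = {
    "RoleName": "InsertOnly",
    "Type": "CustomRole",
    "AssignableScopes": ["/"],
    "Permissions": [{
        "DataActions": [
            f"{PREFIX}/readMetadata",
            f"{PREFIX}/sqlDatabases/containers/items/create",
        ]
    }],
}

# Hypothetical custom role: may only read the change feed.
CHANGE_FEED_READER_ROLE = {
    "RoleName": "ChangeFeedReader",
    "Type": "CustomRole",
    "AssignableScopes": ["/"],
    "Permissions": [{
        "DataActions": [
            f"{PREFIX}/readMetadata",
            f"{PREFIX}/sqlDatabases/containers/readChangeFeed",
        ]
    }],
}

def is_allowed(role, action):
    """Toy check: does any data action pattern in the role match the action?

    Wildcard patterns such as '.../items/*' are matched with fnmatch.
    """
    return any(
        fnmatchcase(action, pattern)
        for permission in role["Permissions"]
        for pattern in permission["DataActions"]
    )

create = f"{PREFIX}/sqlDatabases/containers/items/create"
read = f"{PREFIX}/sqlDatabases/containers/items/read"
print(is_allowed(INSERT_ONLY_ROLE, create), is_allowed(INSERT_ONLY_ROLE, read))
```

A definition like this would then be assigned to an Azure AD identity (a user, service principal, managed identity, or group) at the scope you choose.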

34:04

So yeah, that's the role assignment part of the workflow. And then once this is done, obviously, customers can finally upgrade their client connection to use what's called a token credential instead of the primary or secondary key. And so that token credential, the

34:22

authentication essentially is passed to our drivers. And that lets our drivers acquire an Azure AD token on behalf of the identity that the client wants us to use. And then we just use that token to do both the authentication and the authorization of the database requests.
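The flow described here, where a credential is handed to the driver and the driver acquires tokens on the identity's behalf, can be sketched conceptually in Python. Everything below is a hypothetical stand-in written for illustration: `AccessToken`, `CachedTokenProvider`, and the `fetch_token` callback are not SDK classes. In a real application the credential object would typically come from an identity library and be passed to the client in place of a key.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class AccessToken:
    token: str
    expires_on: float  # Unix timestamp in seconds

class CachedTokenProvider:
    """Hypothetical driver-side helper: reuses a token until it nears expiry."""

    def __init__(self, fetch_token: Callable[[], AccessToken],
                 refresh_margin: float = 300.0,
                 clock: Callable[[], float] = time.time):
        self._fetch_token = fetch_token      # asks the identity provider for a token
        self._refresh_margin = refresh_margin
        self._clock = clock
        self._cached: Optional[AccessToken] = None

    def get_token(self) -> str:
        now = self._clock()
        # Fetch a new token if none is cached or the cached one is about to expire.
        if self._cached is None or self._cached.expires_on - now <= self._refresh_margin:
            self._cached = self._fetch_token()
        return self._cached.token
```

The point of the design is that the application never handles a long-lived shared secret: each request carries a short-lived token tied to a specific identity, which is what makes per-identity authorization and auditing possible.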

34:38

And I know that when I see customers using, you know, essentially these secrets to access Cosmos DB, the first question I ask is, so okay, where are those secrets being stored? I don't mean the back end, I mean, you know, in your applications that have to access these secrets. It's the classic secret

34:54

problem, right? I mean, I had a conversation yesterday with a customer and they're like, they're accessing some back end mainframe and I'm like, so how do you access that? Well, we have an API key. Well, where's the API key? Well, it's stored in this configuration file. But where's the configuration

35:08

file? Well, it's over there. Well, how's the API key protected in the configuration file? Well, it's encrypted. So where's the key for the encryption? Well, it's over there. Well, how's that key encrypted? And then you can see them all looking at each other like, how's the final key, you know, how's that key encrypted? And they didn't know. The problem is like the five whys, but applied as five hows.

35:29

I know. And I look, I understand why it's there. It's been around for a long time. But you know, you've got to make sure those things are protected, especially the primary keys, right? Because they used to, sorry, if I got the name wrong, but the, you know, the essentially the admin key that lets you get into admin

35:47

things. So yeah, I'm a huge fan of, you know, AAD RBAC at the data plane. I don't know if you know or not, but recently, probably about a year ago now, we added that in Key Vault as well, right? They had their own authorization model, and now they're going to

36:03

use a data plane RBAC model. It's a little harder to configure compared to the old one, but it's also a lot more granular. And you have very strong auditing as well that goes in there. So again, you know, it's really great to see Cosmos DB listening to customers and adding this functionality.

36:20

It's great to see. Hey, you know, there's something else that you guys, or probably not you guys actually, released: Microsoft Defender for Cosmos DB. And by the way, this is completely off script. Oh yeah, that's an effort in collaboration with the Microsoft Defender team. It is not directly worked on by the Cosmos DB team, it is more of a partnership where we enable the Defender team to implement, you know, those threat models and threat

36:49

detections. And yeah, yeah, I'm also glad to share today that we have very recently kind of rebooted those efforts. This is something that used to be exposed to our customers as a preview under the name of advanced threat protection, ATP, that probably rings a bell to some of our listeners here. The ATP team went through both

37:09

a renaming, so they renamed to Microsoft Defender, but also at the same time, they took that opportunity to reboot the efforts. Because until recently, not only was ATP in preview only, but it was also only covering a small set of threats, right? I think

37:27

initially, what they were covering is suspicious connections, right? If suddenly you get new connections from IPs that you know you've never seen before from locations that you have never seen before, this is something that Defender used to cover as part of their preview.

37:42

And I think they also had a threat detection on data exfiltration, right? If suddenly you have massive reads or massive scans across the database, this could be a sign of someone trying to exfiltrate your data to some other place. And so yeah, for quite some time, the preview was running with only those two threats being covered.

38:01

And so over the past six to nine months, we kind of rebooted those efforts. And the Defender team has added very interesting new threat detections. They are now detecting key extractions, which is also a very common pattern.

38:18

So they are looking for access to those keys that we just mentioned, right? Which you can fetch through a list keys operation on the Cosmos DB control plane. And they try to detect anything suspicious here, right? If suddenly there is a new IP or a new identity that starts getting the keys, that may trigger an alert on their end.
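The idea of alerting on a new IP or a new identity fetching keys can be illustrated with a deliberately simple Python toy. This is not how Defender implements its detection; the event shape, the function name, and the baseline sets are assumptions made purely for the example.

```python
def flag_unusual_key_listing(events, known_identities, known_ips):
    """Toy detector: flag key-listing events from an unseen identity or IP.

    events: iterable of (identity, ip) tuples (hypothetical shape).
    Returns a list of (identity, ip, reasons) for each suspicious event.
    """
    alerts = []
    for identity, ip in events:
        reasons = []
        if identity not in known_identities:
            reasons.append("new identity")
        if ip not in known_ips:
            reasons.append("new IP")
        if reasons:
            alerts.append((identity, ip, reasons))
    return alerts

# Hypothetical baseline and events, using documentation-reserved IP ranges.
baseline_identities = {"deploy-pipeline"}
baseline_ips = {"203.0.113.10"}
observed = [
    ("deploy-pipeline", "203.0.113.10"),
    ("unknown-app", "198.51.100.7"),
]
print(flag_unusual_key_listing(observed, baseline_identities, baseline_ips))
```

A production detection would need a learned baseline, time windows, and far more signal than this, but the sketch shows why key listing on the control plane is a useful thing to watch.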

38:39

They are also, and I was actually pleasantly surprised that they managed to pull that off, they are also now detecting SQL injections, right? Which we didn't necessarily think about at the beginning, especially since the SQL dialect that is exposed by Cosmos DB is not ANSI SQL, right?

38:58

It is kind of a SQL adaptation that we crafted in order for it to be compatible with the specifics of JSON. But yeah, the team managed to actually craft a detection on potential SQL injections. So that's another great new capability that Defender is now exposing.

39:16

So all those new detections have been, as I said, relaunched, rebooted in public preview now. So available for all our customers to try out. And I believe that the Defender team is lining up the GA of Defender for Cosmos DB in the next couple of months.

39:32

I'm a huge fan of Defender. Actually, by the way, when I talk about Defender, I don't mean the video game. I mean Microsoft Defender. Although, actually, back in Redmond, I used to have a Defender stand-up console, an original one from the 80s. Nice. Yeah. Yeah, it cost me $2,000, worth every penny. No, in all seriousness, I just want to explain something really quickly about the SQL injection stuff, because I had a conversation with a customer about this a few days ago.

39:57

They said, the guy's like, well, hang on a minute. So you got this brand new product. Well, relatively new product called Cosmos DB, you know, relative to say SQL databases. How come you have SQL injection vulnerabilities? I'm like, well, hang on. It's not as simple as that. This is not a weakness in Cosmos DB. This is a weakness in client side code where you use string concatenation to build a SQL query where one of the strings comes from an untrusted source.

40:21

Cosmos DB just sees a SQL query. It doesn't know how it was constructed. It just sees the SQL query. And if it happens to be a SQL injection query, there's not a lot it can do. It doesn't necessarily know a good query from a bad query. You know, it's really a coding discipline at the client. It's got nothing to do with Cosmos DB whatsoever. It's not a weakness in Cosmos DB.

40:41

So I want to make sure everyone's aware of that. And if you take, for example, in C#, LINQ, Language Integrated Query, which can also do SQL queries against SQL Server, it is actually resilient against SQL injection vulnerabilities, but only because it is a client-side technology.

40:58

And it knows how to construct SQL queries from something that's not actually a real SQL query. So it is resilient to SQL injection vulnerabilities. But that doesn't mean that all of a sudden, magically, SQL Server or Cosmos DB or Oracle or DB2 or anything else is resilient to SQL injection. It's got nothing to do with any kind of weakness at the back end. But it's great that Defender is detecting some classes of SQL injection. That's really good to see.
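The client-side nature of the problem can be shown with a short Python sketch contrasting string concatenation with a parameterized query. The function names and the `c.name` property are hypothetical; the dictionary with `query` and `parameters` entries follows the general query-plus-parameters form used for Cosmos DB SQL queries, but treat the exact shape as an assumption to check against the SDK you use.

```python
def build_query_unsafe(name):
    # Vulnerable: untrusted input is spliced directly into the query text.
    return "SELECT * FROM c WHERE c.name = '" + name + "'"

def build_query_parameterized(name):
    # Safer: the query text is constant and the value travels separately.
    return {
        "query": "SELECT * FROM c WHERE c.name = @name",
        "parameters": [{"name": "@name", "value": name}],
    }

untrusted = "x' OR '1'='1"

# The concatenated query now contains attacker-controlled SQL.
print(build_query_unsafe(untrusted))

# The parameterized query text is unchanged regardless of the input.
print(build_query_parameterized(untrusted)["query"])
```

In the first case the database simply receives a query whose meaning has been changed by the input, which is why it cannot tell the difference; in the second, the input can only ever be treated as a value.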

41:28

Well, Thomas, thank you so much for all of that. I always enjoy listening to you talk about Cosmos DB. It's such a fantastic product. I think probably 90% of the customers I work with are using Cosmos DB in their workloads, which is fantastic to see when they have so many options available to them and they're choosing Cosmos DB.

41:47

Don't tell the SQL guys I said that by the way. So as you know from last time, one thing we ask our guests is if they have one final thought to leave our listeners with, what would it be? Well, Michael, I would say we have talked about new features that are now generally available. We believe those features are substantially improving the range of security controls we have on Cosmos DB and help our customers adopt a much better security posture.

42:16

I guess the call to action is just to go out there and start using them, especially that shiny new support for Always Encrypted. As a PM, I do want to hear from our customers. So by all means, kick the tires, start a quick prototype. It's quite easy to get started with our public documentation. Feel free to reach out to me if you have any feedback for me.

42:36

Thomas, thank you so much for joining us this week. I know that Cosmos DB is going through some exciting changes and I know that you're very busy. So again, thank you for joining us this week. And to all our listeners out there, thanks again also for listening. Take care and we'll see you next time.

Transcript source: Provided by creator in RSS feed