Episode 71: Azure SQL Database and Always Encrypted using Virtualization-Based Security Enclaves

Feb 15, 2023 | 28 min | Season 1 | Ep. 71

Episode description

In this special episode, Michael sits down with Pieter Vanhove about a new addition to the SQL Server 'Always Encrypted' family. The new addition, Virtualization-Based Security Enclaves (VBS), is now in Preview and allows for more scalability and lower cost when using secure enclaves compared to the current SGX-based enclaves.

Transcript

Welcome to the Azure Security Podcast, where we discuss topics relating to security, privacy, reliability, and compliance on the Microsoft Cloud Platform. Hey, everybody. Welcome to Episode 71. This is going to be one of those special episodes. I'm here with my colleague, Pieter Vanhove, to talk about a feature that is about to go into preview, which is Always Encrypted in Azure SQL Database using virtualization-based security. This is actually a really cool feature.

Again, we're going to talk about this in more detail with Pieter. But before we get stuck into it, Pieter, welcome to the podcast. Would you like to take a moment and just introduce yourself to our listeners? Yeah. Hi, Michael. Thanks for having me. So, yeah, I'm Pieter. I'm a program manager in the data platform security team.

That's the team that manages all the security features in Azure SQL, in Managed Instance, in SQL Server, but also all the other data platform services that we have, like Cosmos DB, PostgreSQL, and MySQL. And I'm the program manager for two features: for Ledger, which is all about data integrity, and for Always Encrypted. Very cool. So for those who are not aware, Pieter and I are actually colleagues. We're on the same team.

Pieter looks after, as he mentioned, specific security features, whereas I'm more on the sort of engineering side of things, sort of looking at software design and threat modeling and code quality and crypto design and all that sort of stuff. So our paths cross on a very regular basis. So this is actually really cool stuff. And it's a new feature that is, again, in preview for Azure SQL DB: Always Encrypted using a new form of enclave technology called virtualization-based security, or VBS.

So before we get stuck into that and position this new technology, Pieter, why don't we just go through the current options that we have today in both SQL Server on-prem or in a VM, as well as Azure SQL DB. So if you want to sort of rattle off those options and then we'll discuss them and then hopefully we'll lead into Always Encrypted. The first option that our customers have in SQL Server and in Azure SQL is TDE, or Transparent Data Encryption.

And like I mentioned, it is completely transparent for our customers. What the engine is doing is really encryption at rest. So everything that is written to the data files or written to disk is encrypted before it is written. And if you need to read something from disk, we fetch it from the data files, decrypt it, and load it decrypted into the memory of SQL Server.

So for the customers, for the applications, it's completely transparent. And like I said, it's really encryption at rest, so not in use. Another technology that we have is column encryption, which is a technology where you can encrypt specific columns using a certificate and symmetric keys as well, kind of like an older technology, I think, to be honest.

Then in Azure, we also have SQL Server that can run in a confidential virtual machine, which is a specific machine that uses a specific CPU under the hood, which means that the VM memory is completely encrypted and the integrity is protected by the CPU that has created the keys for that. Yeah, it's an AMD EPYC CPU or virtual CPU using a technology called SEV-SNP. It's actually pretty cool technology.

I see Intel just released something very similar as well and we're looking at that. I shall put a link to that in the show notes as well, just out of interest. But yeah, so the keys, the actual symmetric keys that are used to encrypt the VM's memory while it's actually in use, they're ephemeral keys that are actually managed by the CPU. So it's actually pretty cool stuff.

There's obviously a performance implication of that, but by the same token, it's very good for very, very simple lift and shift stuff. Yeah, it's cool stuff. If you look at these three technologies that we just mentioned, none of these protects your data from, for example, malicious DBAs. If you look at TDE, it is decrypted when it is loaded into memory, so everybody that has access to your database can clearly read the data. So we created a new technology.

Well, it's not that new anymore, which is called Always Encrypted. The motivation to create Always Encrypted was to protect your data from malicious DBAs or people that should not see your data. To give you an example, in my previous life, I was a DBA and when I came to customers, I always had full access to their databases, so I could just literally see everything.

And sometimes, of course, if you have very sensitive information, an external person like me as a consultant shouldn't have access to or shouldn't be able to read this data. So that was one of the motivations to start creating Always Encrypted: to enable our customers to confidentially store their most sensitive data in the cloud. So we started our journey in 2015 with an initial version of Always Encrypted. So as I said, the feature is not that new anymore.

So it's really a client-side encryption technology, so the data gets completely encrypted on the client side, within the client driver, before it is stored in the database. So we consider only the application side as trusted, and the network and database engine are considered untrusted. So it means that the data is never decrypted inside the database, which means it provides you strong encryption.

But on the other hand, it gives you quite some limitations if you look at computation on this protected data or these encrypted columns. And one of the limitations, for example, with Always Encrypted was that you could only use equality comparison on encrypted columns. And even then, the column had to use deterministic encryption. Do you know, Michael, the difference between deterministic encryption and randomized encryption? Or should I also explain this a little bit further? Yeah, I can explain it.

You know, as the sort of the resident crypto nerd. Yeah. Yeah. So Always Encrypted supports two ways of encrypting a column of data. By the way, it's always using the same key. The key is randomized for different tables, but it's the same key. If you use deterministic, then for those of you who know about crypto, for a symmetric block cipher, you have to have an initialization vector, or IV.

And the reason why an IV exists is so that if you encrypt two plaintexts that are the same, if you have a different IV, you get different ciphertexts, even though the key is the same. So for deterministic encryption, we actually have the same IV. The IV is actually derived from the plaintext. So if you have two identical plaintexts, it's always the same IV. With randomized, you have a unique IV, a unique initialization vector, even though the key is the same.

And that means that the resulting encryption operation produces different ciphertext. So same key, different IV, same plaintext will actually give you different ciphertext. So that's the difference between the two of them. The nice thing about deterministic is that it can be queried easily without using a secure enclave. But that's a discussion we're going to get into in a minute. So do you think that's probably enough about the difference between the two?
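To make the difference concrete, here is a minimal Python sketch of the two modes. It is not the Always Encrypted implementation: the keystream function is a toy stand-in for a real block cipher and is not secure, the single shared key is a simplification, and the names `encrypt`, `decrypt`, and the sample value are assumptions made up for illustration. The only point it shows is where the IV comes from in each mode.

```python
import hashlib
import hmac
import os

IV_LEN = 16  # 128-bit IV, the size a block cipher such as AES would use


def _xor_keystream(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def encrypt(key: bytes, plaintext: bytes, deterministic: bool) -> bytes:
    if deterministic:
        # IV derived from the plaintext: identical plaintexts get identical IVs.
        iv = hmac.new(key, plaintext, hashlib.sha256).digest()[:IV_LEN]
    else:
        # Randomized: a fresh IV for every encryption operation.
        iv = os.urandom(IV_LEN)
    return iv + _xor_keystream(key, iv, plaintext)


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    iv, body = ciphertext[:IV_LEN], ciphertext[IV_LEN:]
    return _xor_keystream(key, iv, body)


key = os.urandom(32)
value = b"123-45-6789"  # hypothetical sensitive value

det_a = encrypt(key, value, deterministic=True)
det_b = encrypt(key, value, deterministic=True)
rnd_a = encrypt(key, value, deterministic=False)
rnd_b = encrypt(key, value, deterministic=False)

# Deterministic ciphertexts match, so an equality lookup can compare
# ciphertext directly; randomized ciphertexts differ on every encryption.
print(det_a == det_b, rnd_a == rnd_b)
```

Because the deterministic ciphertexts are byte-for-byte equal, a server that never sees the key can still answer "does this column equal that value?", which is why equality comparison works without an enclave.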

Yeah, sure. No problem. So this first version of Always Encrypted was released in SQL 2016, and it is still available. If you look at the next versions, you can still use Always Encrypted like it was before. It is also available in Azure SQL DB, in Managed Instance, and also in Cosmos DB. So we released Always Encrypted last year in Cosmos DB because there was also a demand to have it in there as well.

So next, what we've done then is we wanted to expand the capabilities of Always Encrypted because you didn't have much flexibility there. And we introduced a new technology, like you already mentioned, Michael, which is called secure enclaves. And what does it mean? Well, when processing SQL queries, the database engine delegates computations on encrypted data to that secure enclave.

So it is kind of like an isolated region of memory, and all the data that is stored inside that enclave cannot be accessed from outside the enclave, meaning it cannot be accessed from the OS, by OS administrators, or by DBAs. So it's really, really a secure environment. So it means that the enclave then decrypts the data and can perform these computations on plaintext.
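The delegation described here can be sketched in a few lines of Python. This is a conceptual model only: Python cannot enforce memory isolation, the cipher is a toy and not secure, and `Enclave`, `client_encrypt`, and `engine_range_filter` are hypothetical names. In the real feature the keys reach the enclave over a secure channel from the client driver; here the key is simply handed to the constructor.

```python
import hashlib
import os

IV_LEN = 16


def _xor_keystream(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def client_encrypt(key: bytes, value: int) -> bytes:
    """Trusted client side: encrypt an integer with a random IV."""
    iv = os.urandom(IV_LEN)
    return iv + _xor_keystream(key, iv, value.to_bytes(8, "big"))


class Enclave:
    """Stands in for the isolated memory region; only it holds the key."""

    def __init__(self, key: bytes) -> None:
        self.__key = key

    def _decrypt(self, cell: bytes) -> int:
        iv, body = cell[:IV_LEN], cell[IV_LEN:]
        return int.from_bytes(_xor_keystream(self.__key, iv, body), "big")

    def less_than(self, cell_a: bytes, cell_b: bytes) -> bool:
        # Plaintext exists only inside this method; the caller gets a boolean.
        return self._decrypt(cell_a) < self._decrypt(cell_b)


def engine_range_filter(enclave, rows, encrypted_bound):
    """Untrusted engine: holds only ciphertext and asks the enclave to compare."""
    return [row_id for row_id, cell in rows if enclave.less_than(cell, encrypted_bound)]


key = os.urandom(32)
rows = [
    (1, client_encrypt(key, 52000)),
    (2, client_encrypt(key, 87000)),
    (3, client_encrypt(key, 61000)),
]
bound = client_encrypt(key, 65000)
matching_ids = engine_range_filter(Enclave(key), rows, bound)
print(matching_ids)
```

The engine function never touches the key or a plaintext value, yet it can evaluate a range predicate over randomized ciphertext, which is the capability the enclave adds.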

So like I said, it's done completely safely because the enclave is a black box to the containing database engine process and the OS. So not even DBAs or machine administrators or whatever can look inside that enclave. So this is incredibly important. This is really the whole linchpin of the whole thing, right? The enclave, the SGX enclave, so SGX stands for Software Guard Extensions. It's an Intel technology.

So the root of trust for that goes all the way down to the CPU or the virtual CPU. And that memory is completely isolated and is encrypted in use. So the actual symmetric keys that are used to encrypt that SGX, that SGX enclave, I should say, are actually managed by the CPU. They're not managed by Azure. They're not managed by the tenant. They're managed completely by the CPU. So it's actually really cool because that way it restricts who has access.

And again, like you mentioned, a malicious DBA does not have access to the CPU and they certainly do not have access to the internal hardware of the CPU to access the symmetric keys. So it's a very strong defense, incredibly strong defense. And on top of that, when you use secure enclaves, we can now support rich confidential queries, which means you can use pattern matching and range comparisons like larger than, smaller than.

You can sort on encrypted columns. You can put an index on it. You can group by, you can order by. So these are all extra, not functionalities, but yeah, you get this. This is now all possible by using these secure enclaves. And you can also use in-place encryption.

It means that, for example, you can encrypt existing data without moving the data out of the database, which is not possible with the first version of Always Encrypted, and which is now possible using the enclaves as well. Yeah, I forgot about that. That's actually pretty cool. And actually there's another thing that a lot of people don't realize exists. You can do this through SSMS, through SQL Server Management Studio, or from PowerShell scripts.

So when using Always Encrypted with secure enclaves, you can actually also do data encryption key rotation, not just key encryption key rotation. I mean data encryption key rotation while the database is, well, while the table is still being used. And you can sort of set up some parameters to say, hey, rotate 3% of the database at a time or the table at a time. And it will try to not sort of impinge on the business processes.
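As a rough picture of what rotating a few percent at a time means, the following Python sketch re-encrypts a list of cells in small batches. It is an illustration under stated assumptions, not how SSMS or the PowerShell cmdlets work internally: the cipher is a toy and not secure, `rotate_in_batches` and its `percent` parameter are hypothetical, and in the real feature the re-encryption happens inside the enclave rather than in client code.

```python
import hashlib
import os

IV_LEN = 16


def _xor_keystream(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def encrypt(key: bytes, plaintext: bytes) -> bytes:
    iv = os.urandom(IV_LEN)
    return iv + _xor_keystream(key, iv, plaintext)


def decrypt(key: bytes, cell: bytes) -> bytes:
    return _xor_keystream(key, cell[:IV_LEN], cell[IV_LEN:])


def rotate_in_batches(cells, old_key, new_key, percent=3):
    """Re-encrypt cells in place, `percent` of the table per batch.

    Yields the number of rows rotated so far after each batch, so a caller
    could pause between batches to let other work run.
    """
    batch = max(1, len(cells) * percent // 100)
    for start in range(0, len(cells), batch):
        end = min(start + batch, len(cells))
        for i in range(start, end):
            cells[i] = encrypt(new_key, decrypt(old_key, cells[i]))
        yield end


old_key, new_key = os.urandom(32), os.urandom(32)
table = [encrypt(old_key, f"row-{i}".encode()) for i in range(100)]
progress = list(rotate_in_batches(table, old_key, new_key))
print(len(progress), progress[-1])
```

Working in small batches keeps each unit of work short, which is the property that lets rotation proceed while the table stays in use.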

So being able to actually rotate data encryption keys while the database is in use is actually really, really cool. Yes, that's another real nice benefit that Always Encrypted brings to the table. Yes, exactly. This version of Always Encrypted with secure enclaves was released in SQL Server 2019 and improved in SQL Server 2022 and is also available in Azure SQL Database. Now if we look at Azure SQL Database, if you want to use secure enclaves, you have to configure a specific hardware configuration.

Like you said, Michael, we're using Intel Software Guard Extensions, or SGX, to manage the enclave. And if you want to use that, you have to configure what we call the DC-series in Azure. So when you configure your database, you have to select this specific DC-series, which under the hood is going to use that specific Intel CPU. Now we have a lot of customers that are using the DC-series, but it comes with some, let's say, consequences.

Like you said, there's no free lunch, right? It means that if you want to use the DC-series, you're limited to the vCore model. So you cannot use DTUs if you would like to. On top of that, you always have to use the provisioned compute model. So serverless is not available with the DC-series. And yeah, one of the limitations that we also have currently is that the maximum number of physical vCores that you can use is eight.

So we have some customers who use Always Encrypted, but not with secure enclaves, because they need more than eight vCores. And yeah, currently we do not have that functionality. So this is really a limitation in the DC-series. Another topic that I want to point out is that in Azure, for the moment, the DC-series is not available in every region. So it's limited to a number of regions. So that's also one of the limitations that we have with SGX.

But we're working on this. So we are making improvements on SGX as well for the moment. If we look at Always Encrypted with secure enclaves in SQL Server 2019, the enclaves used there are VBS enclaves, or virtualization-based security enclaves. Now what is VBS? VBS is a software-based technology that relies on the Windows hypervisor and does not require any specific hardware. So that's currently what we support in SQL Server 2019 and SQL Server 2022. I think it's an important point, the VBS stuff, right?

So we do have memory isolation, just as we do with SGX, but the isolation technology is different. So rather than being something that's done by the Intel CPU, this is something that's actually being done by a virtualization environment like the hypervisor. So the same sort of isolation that you'd get between, say, VMs as an example, that same kind of technology is being used to isolate a smaller chunk of memory.

And that smaller chunk of memory runs a part of the SQL Server query engine, but that memory is completely isolated from, say, the operating system that you're running on. So if you actually have a rogue admin on the machine, they can't actually get to the memory that's inside of this VBS enclave. And again, that's all enforced by the hypervisor.

So now the whole purpose of this podcast is that we introduce a new flavor, let's call it, of Always Encrypted with secure enclaves in Azure SQL Database, not using SGX enclaves, but using VBS enclaves.

We're launching the public preview today of VBS enclaves in Azure SQL Database, which gives our customers a lot more advantages compared to SGX because, as we mentioned, there is no hardware dependency, which means that our customers can configure whatever database they want and they can use Always Encrypted with secure enclaves. So they can choose to go for DTU or still go for vCores. They can go for the provisioned compute model or even serverless.

The Always Encrypted functionality will work. The limitation on the number of vCores, we also don't have. So you can specify as many vCores as you want. And currently I think the maximum is up to 128 vCores. So the limitation of the vCores also goes away. And this feature, VBS enclaves, is available in all Azure regions. So there are no limitations anymore for the regions. And it's much easier to set up as well for our customers. So we're going into public preview with that today.
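The trade-offs just listed can be written down as a small lookup. The Python sketch below encodes only the limits stated in this conversation (as of the preview announcement, so they may have changed since); the `EnclaveOption` class and `candidates` function are hypothetical names, not part of any Azure SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnclaveOption:
    name: str
    purchasing_models: frozenset
    compute_tiers: frozenset
    max_vcores: int
    all_regions: bool


# Limits as described in the episode; treat them as a snapshot, not a reference.
SGX = EnclaveOption(
    name="SGX (DC-series)",
    purchasing_models=frozenset({"vCore"}),
    compute_tiers=frozenset({"provisioned"}),
    max_vcores=8,
    all_regions=False,
)
VBS = EnclaveOption(
    name="VBS",
    purchasing_models=frozenset({"vCore", "DTU"}),
    compute_tiers=frozenset({"provisioned", "serverless"}),
    max_vcores=128,
    all_regions=True,
)


def candidates(purchasing_model: str, compute_tier: str, vcores: int = 1) -> list:
    """Return the enclave options that satisfy the requested configuration."""
    return [
        option.name
        for option in (SGX, VBS)
        if purchasing_model in option.purchasing_models
        and compute_tier in option.compute_tiers
        and vcores <= option.max_vcores
    ]


print(candidates("vCore", "provisioned", 4))
print(candidates("vCore", "serverless", 4))
print(candidates("vCore", "provisioned", 16))
print(candidates("DTU", "provisioned"))
```

Only the first request fits both options; serverless, more than eight vCores, or the DTU model each leave VBS as the only candidate.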

So I think this is important, right? So SGX, very powerful, but it's more complex to set up. It doesn't scale as well. It's not available in every region, but it's good at what it does. And then we have VBS, as you mentioned, which is now in preview, which offers a similar kind of mitigation. I mean, the implementation is different. But the key thing, and I think this is so, so, so important because it's been a very, very important pain point for customers, is the VMs.

For the underlying VMs that run the platform as a service that is Azure SQL Database, we had to use VMs with the specific Intel CPU. And the problem with that is that, again, there were memory limitations, there were virtual CPU limitations and so on. And so we're sort of busting that wide open to allow you to use VBS enclaves, which don't have those limitations. Yeah. And Michael, one thing we forgot to mention is the cost. If you configure a DC-series database, it's quite expensive as well.

So if our customers can go for VBS enclaves, they can reduce the costs as well, because you just pay the price of a normal database. So we don't charge any extra costs for using Always Encrypted with secure enclaves. This is really cool stuff.

I'm going to be honest, when I saw this was coming out, I got really excited because I think it's still going to offer many of the benefits of the SGX enclaves, but without the complexity, without the cost, with more scalability, and with more region support, which is great to see. So this is a really good example of the Always Encrypted family sort of growing. We have it in SQL Server on-prem, which uses VBS. We have Azure SQL Database, for example, in the cloud using SGX.

And now we have Azure SQL Database in the cloud using VBS as well. So it really is a pretty broad family. So you can sort of choose what you need in terms of cost and performance and security, which is great to see. Yeah. The Always Encrypted family is growing. That's for sure. My background is obviously application development. And I've worked with a couple of customers over the years who have looked at Always Encrypted.

It's an interesting issue because, look, I'm going to be honest, I think it's a very difficult proposition to take an existing system beyond a hello world environment, take an existing, say, Azure SQL Database environment, and just sort of flip the Always Encrypted bit. It's not as simple as that, but anyway, flip the Always Encrypted bit and expect the application to work. It probably won't, because certain queries may not work. Even VBS or SGX enclaves don't support the entire SQL syntax.

They certainly support a much bigger syntax than just equality and inequality, which is what Always Encrypted with no enclaves supports. So my own personal recommendation is sure, if you have an existing system, look at what it might take to migrate to Always Encrypted. But I think the sweet spot, and correct me if I'm wrong here, is going to be something new, something greenfield. In other words, you're designing something from the get-go.

So let's design it with Always Encrypted in mind. Is that a fair comment? Yeah, for sure. Yeah, definitely. You're right, it's not that easy to convert an existing application and immediately start using Always Encrypted. No, it is a complex topic. But that being said, if you are designing something from the get-go and you have some columns, for example, social security numbers is a great example in the US, right?

Or sensitive medical information, that kind of stuff. Those are candidates for Always Encrypted columns. And if you design something from the get-go and make sure that your developers understand the implications of Always Encrypted, the benefits are huge. The benefits from an attack perspective are huge.

Because if you take, for example, column encryption, as you mentioned before, column encryption and TDE, if an attacker gets onto the SQL Server or into the SQL Server through whatever, they get plaintext. Whereas if you compromise a system running Always Encrypted, the attacker gets ciphertext. And that's an incredible story, right? Because no matter how hard they try, even if the attacker is a rogue, say, SQL Server admin, they don't have the plaintext because they don't have access to the keys.

And the keys are not held in memory like they are, say, for example, with column encryption. So even if an attacker could actually get into the memory space of SQL Server, they can't get the data because the keys aren't there. In the case of SGX and VBS, the keys are held somewhere else where the attacker has no access because of either virtualization, in the case of VBS, or a secure enclave using SGX, in the case of, funny enough, SGX.

So it's an incredible defense, but you have to design around it correctly. Yeah, looking at my example in the beginning of this podcast where I said I was a consultant, I went to financial companies like banks, insurance companies, they always gave me system administrator permissions on their SQL servers. So I literally could go in whatever table I wanted and if I wanted to, I could fetch all the data and go home with it.

It was always encrypted to you, like you said, I didn't have the key so I couldn't see anything, but I could still do my job. I still have access to the databases, to the SQL servers, but I don't have access to the sensitive information. So there's clearly a distinction there between who owns the data and who doesn't own the data and cannot see it. You bring up an interesting point. I had a summer experience as well where customers wanted to make me admins of one kind or another.

It's funny because my answer has always been the same. It's like, no, I don't want to be an admin. I just don't. It's plausible deniability, right? If something bad happens, it can't have been me because I wasn't an admin.

So anyone who's listening there, if you're in a consulting capacity and your customer says, hey, just be an admin on our database or be an admin in Linux or be an admin in Windows, the answer really should be no. Get someone who actually works for the company to do that work rather than you. I would never recommend that. Actually, I want to just point something else out here that Always Encrypted doesn't fix that people should be aware of. So Always Encrypted is actually...

Actually the encrypted part is kind of a bit of a misnomer because it really should be always protected. If I had my way, I would have called it always protected because it's actually encryption and an HMAC, right? So it actually verifies the integrity of the data to make sure that data has not been manipulated. But what's interesting, and this is why it's always really important when you're considering any kind of defense, is: what does it actually fix?

So let's just take a hypothetical example where there's a table that's got the list of all salaries within Microsoft. I could actually lift...

If I had access to the data, obviously enough privilege to access the data and change data and what have you, and let's say the salary column was encrypted with Always Encrypted, there's nothing in Always Encrypted actually preventing me from taking the salary column out of the row or record for Satya, our CEO, taking that out as ciphertext and actually pasting it into my salary column.

Always Encrypted does not mitigate that because the data itself has not been tampered with. The HMAC protects the actual, let's just call it the cell, the cell of data. So if I try to change that, it won't compute correctly, it won't decrypt correctly. But there's nothing stopping me actually lifting that cell out and pasting it into my cell. Access control aside, of course, there's nothing stopping that at all. But that's where technologies like Ledger come in, right?
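The cell-swap scenario can be reproduced with a small Python sketch of encrypt-then-MAC applied per cell. Everything here is an assumption for illustration: the cipher is a toy and not secure, the key handling is simplified, and `seal`, `open_cell`, and the salary figures are invented. It shows the two cases from the discussion: changing bytes inside a cell is caught by the HMAC, while moving an intact cell to another row is not.

```python
import hashlib
import hmac
import os

TAG_LEN = 32  # HMAC-SHA-256 output size
IV_LEN = 16


def _xor_keystream(key: bytes, iv: bytes, data: bytes) -> bytes:
    """Toy cipher: XOR data with a SHA-256 counter keystream. NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + iv + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt one cell, then authenticate the IV and ciphertext with an HMAC."""
    iv = os.urandom(IV_LEN)
    body = _xor_keystream(enc_key, iv, plaintext)
    tag = hmac.new(mac_key, iv + body, hashlib.sha256).digest()
    return tag + iv + body


def open_cell(enc_key: bytes, mac_key: bytes, cell: bytes) -> bytes:
    """Verify the HMAC first; decrypt only if the cell is intact."""
    tag = cell[:TAG_LEN]
    iv = cell[TAG_LEN:TAG_LEN + IV_LEN]
    body = cell[TAG_LEN + IV_LEN:]
    expected = hmac.new(mac_key, iv + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("cell failed integrity check")
    return _xor_keystream(enc_key, iv, body)


enc_key, mac_key = os.urandom(32), os.urandom(32)
salaries = {
    "ceo": seal(enc_key, mac_key, b"9000000"),  # hypothetical figures
    "me": seal(enc_key, mac_key, b"100000"),
}

# Case 1: flip one bit inside my own cell. The HMAC no longer matches.
tampered = bytearray(salaries["me"])
tampered[-1] ^= 1
try:
    open_cell(enc_key, mac_key, bytes(tampered))
    tamper_detected = False
except ValueError:
    tamper_detected = True

# Case 2: copy the CEO's whole cell into my row. The cell itself is intact,
# so it verifies and decrypts; nothing binds it to the row it came from.
salaries["me"] = salaries["ceo"]
swapped_value = open_cell(enc_key, mac_key, salaries["me"])

print(tamper_detected, swapped_value)
```

The tag covers only the cell's own bytes, so the integrity check answers "was this cell modified?" and not "is this cell in the right place?", which is the gap a tamper-evidence feature such as Ledger is meant to cover.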

Because Ledger prevents that. Exactly. So that's why it's always really important to understand what you're actually mitigating with these technologies. So for many customers, it may actually be a combination of Always Encrypted with Ledger. And the nice thing about Ledger is there's nothing required from an application perspective. It's all done by the database engine, which is absolutely beautiful. All right, Pieter, this has been really, really useful. It's great to see VBS enclaves.

It just expands the population of users or developers or admins that would actually want to use this technology. But one thing that we always ask all our guests is if you had one thought to leave our listeners with, what would it be? Well, since VBS enclaves are now in public preview, I would love for our customers to start using it, start playing with it, doing POCs with it, and give us feedback.

Maybe we can still improve the feature, or our customers notice that something can be added to the feature or something is not working correctly. Yeah, we'd love to get the feedback of our customers that are trying VBS enclaves. All right, Pieter, hey, thank you so much for joining us this week. I know you've been really busy, especially with the release of VBS enclaves. So thanks for taking the time out. I sincerely appreciate it.

And to all our listeners out there, we hope you found this useful. As Pieter said, go kick the tires on VBS enclaves. It's actually pretty cool technology. And with that, stay safe and we'll see you next time. Thanks for listening to the Azure Security Podcast. You can find show notes and other resources at our website, azsecuritypodcast.net. If you have any questions, please find us on Twitter @AzureSecPod. Background music is from ccmixter.com and licensed under the Creative Commons license.

Transcript source: Provided by creator in RSS feed