Welcome to the Azure Security Podcast, where we discuss topics relating to security, privacy, reliability and compliance on the Microsoft Cloud Platform. Hey, everybody. Welcome to episode 84. This week our guest is Roberto Rodriguez, who's here to talk to us about topics related to attack simulation. But before we get to our guest, let's take a little lap around the news. Sarah, why don't you kick things off?
So I've just got one to talk about today, which is that Azure Container Apps now supports environment-level mutual TLS, or mTLS. It's in preview, but what it means is that when you need end-to-end encryption, mTLS will encrypt the traffic between the different applications within an environment. So that's pretty cool if you need things to talk to each other securely. So that's just my one bit of news this week.
Actually, before Mark picks up, I'm a huge fan of mutual TLS. I'm a fan of TLS in general as an authentication mechanism. Ignoring the channel protections for the moment, TLS by default gives you server authentication, but with mutual TLS you've got client authentication as well. So you're authenticating both ends, and that's really cool. So it's good to see.
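To make the client-authentication part concrete, here's a minimal Python sketch, assuming a hypothetical internal endpoint and locally issued certificate files, of a client presenting its own certificate so that both ends of the connection are authenticated. (Within an Azure Container Apps environment the platform handles this between apps for you; this just illustrates the underlying idea.)

```python
# Minimal sketch of mutual TLS from the client side (hypothetical endpoint and cert paths).
# With plain TLS, only the server proves its identity; with mTLS the client also
# presents a certificate, so both ends are authenticated.
import requests

CA_BUNDLE = "ca.pem"                         # CA used to verify the *server's* certificate
CLIENT_CERT = ("client.pem", "client.key")   # certificate + key the *client* presents

response = requests.get(
    "https://orders.internal.example/api/health",  # hypothetical service
    verify=CA_BUNDLE,   # server authentication (standard TLS)
    cert=CLIENT_CERT,   # client authentication (the "mutual" part)
    timeout=10,
)
print(response.status_code)
```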
Since we're talking TLS: TLS 1.0 and 1.1 are being completely disabled starting in October across Windows OSes. A couple of related pieces of news from my perspective. The Zero Trust Commandments are out and published, so we've got the links in the show notes for you. Those are essentially the rules of the road: what is and isn't Zero Trust, basically what good modern security looks like.
And I'm actually going to be in Houston in late October, just before Halloween, to speak about those commandments and the reference model that's going to be released shortly before that conference, plus a panel session that will include the CSA, and I believe we're trying to confirm NIST as well. So a number of open standards organizations getting together to talk about what Zero Trust actually is. So lots of good stuff coming there. Now I wanted to talk a little bit about Entra.
There are additional settings in entitlement management policies being added to Entra ID Governance. They provide more capabilities and allow the customer to choose not to have the policy create assignments, not to remove assignments, and to delay assignment removal. So that is a feature that has been asked for. Also in public preview, cross-tenant access settings now support custom RBAC roles and protected actions.
And what I mean is that cross-tenant access settings can be managed with custom roles defined by the organization. This enables you to define your own finely scoped roles to manage cross-tenant access settings instead of using one of the built-in roles. I'm including some links for more information. I have a few items. The first one is in general availability.
You can now use Azure Key Vault to securely store and retrieve the access key when mounting an Azure Storage account as a local share in an App Service. I realize that is incredibly specific, but if you are using Azure Storage in an App Service, you can now store the access keys in Key Vault. So that's a biggie if you care about it. Next one, in public preview, and this is one from my own backyard: you can now configure customer-managed keys on existing Cosmos DB accounts.
Historically, you had to create a new account and then migrate your data across, which is obviously pretty painful. Well now you can actually do it on an existing account. So that's great to see. Now in general availability, trusted launch as the default for a virtual machine that is deployed through the portal. Historically, trusted launch was not the default, but now it is. So trusted launch includes things like secure boot, a virtual TPM, measured boot, and also boot integrity monitoring.
This is great for helping mitigate malware-based rootkits and bootkits and so on, so this is fantastic to see. A colleague of mine, Andreas Wolter, has written a paper if you are endeavoring to provide more fine-grained permission management in SQL Server and Azure SQL Database. It's worth reading and is called "Delegating Permission Management Using Roles versus the WITH GRANT Option."
So if that's something that you've been worried about or have questions about, then that document from Andreas, who by the way knows more than just about anyone on the planet about this stuff, is certainly worth taking a look at. We now have support for Rust crates in Azure Artifacts. Now, you can argue why we even talk about this on an Azure security podcast. The reason is that I'm a huge fan of Rust, and that's the only reason.
But if you're using Azure Artifacts and you're building solutions in Rust, you can now store those crates as artifacts in Azure Artifacts. That's fantastic to see. The last one, there's a bit of a story behind this. I'm not going to go into the whole story, but if you're familiar with Azure Policy, maybe you're aware that there are some things in Azure SQL Database you can't control through Azure Policy. Because once you get inside of the SQL Engine, it's its own model.
It has its own SQL Server security model and access control model and so on. Well, one thing that's been asked for by a small number of very large customers who have requirements around securing their environments is the ability to block T-SQL CRUD operations, so for example CREATE TABLE, CREATE DATABASE and that sort of thing: create, read, update, delete. You can now actually block those T-SQL statements by setting essentially a policy in Azure itself.
So if you set this policy, there's a link to it in the show notes along with the process you have to go through. It's actually not part of Azure Policy, but it mimics policy. It will then block things like CREATE DATABASE, DROP DATABASE, ALTER DATABASE, and all sorts of other CRUD operations, which is actually really kind of nice. You must be an owner or a contributor on the subscription to set this.
But some customers have asked for this because it gives them much tighter control and helps restrict drifting away from a secure baseline, which is the whole point of Azure Policy. So really, really cool to see this. Again, it's a little bespoke, but for those who need it, I'm very happy to see it in there. All right. Now that we have the news out of the way, it's fantastic to have everybody here this week. Let's now turn our attention to our guest.
This week, as I mentioned, we have Roberto Rodriguez, who's here to talk to us about attack simulation and more. Roberto, thanks so much for joining us this week. Would you like to take a moment to introduce yourself to our listeners? Yeah, thank you very much, Michael. My name is Roberto Rodriguez. I'm a security researcher in the Microsoft Security Research Organization, which is part of the Microsoft Threat Intelligence Center. I used to be part of the R&D department.
I'm still doing R&D, but now as part of the whole Microsoft Security Research Organization. Usually I'm doing a lot of research, trying to use a lot of the services that we already provide to expedite some of our processes for validating security controls internally.
And then also diving a little bit into contributing back to the community, exploring some of the features or concepts that you can use, probably with some open source tools, to interact with large language models, and then just sharing some ideas, some proof of concepts of things people can do to get more familiar with that. That's what I've been focusing on a lot for the past couple of months. And yeah, happy to be here, and thank you for the invitation.
I am super stoked that you are here. Obviously, I have seen some of the really cool stuff you do. Obviously, you've done a lot of community contributions. So why don't we start with what are you most interested in? I'll let you decide where we go first. So I'm very interested in talking about, for example, you guys mentioned Azure Container Apps, right?
And it's something that I've been exploring a lot to automate many of the things that we want to do, all the way from a basic phishing scenario to a full business email compromise, an end-to-end scenario with a lot of components in between, where using Azure Container Apps has been super helpful to me. And I built a tool two years ago called Cloud Katana. It's still out there, and it's a tool that is based on Azure Function Apps.
And I love that because you could do a lot of chaining of actions, trying just to push all your code in the cloud, like, you know, serverless computing. That was great. But then when I started looking into Azure Container Apps, it was pretty clear that it's super powerful and flexible. And it fits very well into a lot of the components of a simulation, like what it takes to actually do some of this, right?
We usually see some basic stuff, like, let's say, download a file, execute it, and then get a callback to your C2. When you start thinking about something like phishing, for example, you have to deploy your own site, you have to make sure that you can adjust the timing of when you execute things, what happens next, how you bring things down, how you change your IP address right away, right? And it's something that is just fascinating to see.
Now, of course, the part also that is very interesting as a researcher is that a lot of this could be used by threat actors. So that's also part of the research is to understand some of the capabilities, some of the things that actually someone could do, right? So that's also part of this in parallel to talk also to the right teams and also see if there is anything that we could do to start enabling certain capabilities. So that's pretty interesting as well.
And then, of course, we can finish by talking about some of the open source tools that I've been using to share some examples and ideas of how someone can start using some of the capabilities of generative AI, interacting either with open LLMs for free or with some paid services if you wanted to, right? Let's start with Cloud Katana. Tell our listeners what it is, why they should go and play around with it, and the cool stuff it does.
Cloud Katana is a serverless cloud application that uses Azure durable functions in a way to orchestrate attack simulations. And a lot of this is against the cloud, pretty much. There are so many different APIs that you can use to execute, for example, if you want to send an email, if you want to delete an email, if you want to maybe, I don't know, add a secret or a certificate to an Azure AD application, probably authenticate as the application and then maybe read emails, right?
So there are so many things that we're trying to simulate to validate things, let's say, on a regular basis. So we were trying to look for something that would allow us to have those capabilities, right? Like event-driven, for example, technology. And Azure Functions does very well.
It comes with a lot of different runtimes that you can use, so it's flexible for people who, for example, want to write their own function in C#, in Python, or in PowerShell. Cloud Katana is just a way to set up something that you can then modify and start building your own orchestration on. It started as a proof of concept, because it's something that we wanted to do internally, just to understand how this automation could happen in the cloud.
But at the same time it was about how somebody could use, once again, Azure Functions to run a lot of these simulations. Because when you run something as an Azure Function, you might be using a managed identity. So what does that look like in the logs, for example? Then we'll write detections, right? So it's pretty cool, and it's something that I believe people can play around with in their lab environments. I don't recommend deploying it just like that in a production environment. The reason why is that currently the approach is to grant permissions to the cloud application in order to perform certain scenarios. Some scenarios do require high privilege, permissions that only a privileged user would be able to have or use in the environment. So we usually use it in these research lab environments, we just try to see what's going on in there, and then we use the same security tools that a company would use in their own organization.
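For context on the "running as a managed identity" part, here's a minimal Python sketch, assuming the azure-identity and requests packages and that the identity has already been granted the needed Graph application permissions (lab environments only); the point for defenders is that the resulting activity shows up in the sign-in and audit logs as the managed identity, not as a user.

```python
# Minimal sketch: acquiring a token as a Function App's managed identity
# and calling Microsoft Graph with it. Assumes azure-identity and requests
# are installed and the managed identity has the needed Graph app permissions.
import requests
from azure.identity import ManagedIdentityCredential

credential = ManagedIdentityCredential()
token = credential.get_token("https://graph.microsoft.com/.default")

# Any call made with this token is attributed to the managed identity,
# which is exactly what you want to study in the logs.
resp = requests.get(
    "https://graph.microsoft.com/v1.0/servicePrincipals?$top=5",
    headers={"Authorization": f"Bearer {token.token}"},
    timeout=30,
)
print(resp.status_code)
```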
The cool thing about this, too, is that, for example, something I'm very passionate about is making it flexible for someone to contribute. With some of the tools out there, you have to learn a new language, for example, to craft your own campaigns or your own scenarios.
So for me, as I mentioned before, having different runtimes like Python, C#, and PowerShell lets someone feel comfortable and say, hey, you know what? I do some of my simulations with these four scripts. They're all in Python. So then you could actually use something like Cloud Katana to execute a lot of that. And as I mentioned before, the durable function component is super powerful.
And now I'm experimenting with how we can take this idea and start using Azure Container Apps, or Azure containers in general. One cool service recently has been Azure Container Apps with Container Apps jobs, for example, which is similar to an activity function in Azure Durable Functions. And the beauty of that is that you can now containerize a bit more of the runtime that you could not control as much from an Azure Functions perspective, for example. And it has opened the door to a lot more scenarios where we don't just need to hit an API and write a script to, let's say, automate sending an email. Now we can start integrating phishing sites, maybe start playing with some of those other open source tools that a threat actor might use.
It's an easier way to containerize all these other components that a simulation requires to play with. Hey, so I have a couple of questions for you. One's security related and one's not, but I'll ask you both. The first one is, I'm kind of afraid to ask this question, actually. So what is a durable function? I mean, I know they've been around for a while. I've written a whole bunch of function apps over the years, but never a durable function. So what are the pros and cons? And what is that?
That's number one. And number two, what does Cloud Katana simulate? Can you give us some examples of what it actually does? Yeah, sure. So a durable function is, let's call it, a feature of Azure Functions in general. Usually when you create an Azure function, you have an HTTP trigger, for example. So you deploy your function and you say, if somebody hits this HTTP endpoint, this API, I want you to run this script as part of the Azure function. With that, you run it once, it stops, and then you get the output back. But what if you want to start orchestrating multiple of those functions and actually capture the state of every single execution? That's what a durable function does for you. It gives you the concept of an orchestrator function, for example, that is capable of going to sleep, waking up, and listening to what the other functions are doing.
For example, if you do a specific pattern, you could say, I want to run one step at a time from one to 10, and I want my orchestrator to start capturing the state of each one and maybe do something else when one finishes. And there's a lot of other patterns that you can use, but that's what a durable function will do.
It allows you to build workflows and orchestrate multiple functions, something that, if you're using only one function, you don't get: you only have that trigger capability, but you might not have the full workflow, the sequence of actions that you want to run over a longer period of time. And an Azure function by itself, last time I checked, can still only run for something like 10 to 30 minutes. A durable function's orchestrator can run for much longer.
So you might actually say, I have this big task, and you can split it into multiple activity functions; that's part of the durable concepts. And then you can use an orchestrator to start coordinating them. So that would be the reason, I guess, why I would use a durable function for attack simulations, because that's what an attack simulation is: you run something, you go to sleep for 20 minutes, and then you run something else.
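A minimal sketch of that orchestrator pattern in Python, using the Durable Functions extension; the activity names here (AddSecretToApp, AuthenticateAsApp, ReadMail) are hypothetical placeholders for whatever steps a simulation defines.

```python
# Minimal Python sketch of a Durable Functions orchestrator.
# The activity names are hypothetical placeholders for simulation steps; each one
# would be its own activity function with its own code and runtime limits.
from datetime import timedelta
import azure.durable_functions as df

def orchestrator_function(context: df.DurableOrchestrationContext):
    app_id = context.get_input()

    # Step 1: run an activity and capture its output (state is checkpointed).
    secret = yield context.call_activity("AddSecretToApp", app_id)

    # "Go to sleep" with a durable timer -- no compute is consumed while waiting,
    # and the orchestration can far outlive a single function execution.
    yield context.create_timer(context.current_utc_datetime + timedelta(minutes=5))

    # Steps 2 and 3: chain further activities, each fed by the previous result.
    token = yield context.call_activity("AuthenticateAsApp", secret)
    messages = yield context.call_activity("ReadMail", token)
    return messages

main = df.Orchestrator.create(orchestrator_function)
```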
That's where the durable function capability is super helpful. So one scenario would be, and even though this specific scenario might sound very simple, it takes some time for some of the steps to actually work. Let's say you want to simulate someone that has access to an Azure AD application, and that Azure AD application has, let's say, application role permissions, app roles assigned to the application itself.
If it has Mail.Read, for example, you can read email, and not just your own email; you can read email for the whole tenant, let's say, if the app has enough permissions, because it's acting on its own. So one of the things you can do is say, if I have access to this app, my first step will be to add a secret to the app. That way I can use that secret to authenticate as the app, and then use the app to do the next steps, such as reading email.
When you add a secret to an app, it sometimes takes a minute, sometimes five minutes, for the secret to be recognized. And if you want to create a new application and grant it more permissions, sometimes it can take even 10 or 15 minutes. So for me, using something like Cloud Katana means I can schedule some of these waits in between, and so on. Once I've given the secret to the app, I can authenticate with the secret.
Now I'm acting as the app with enough permissions to read email or do things with mail in my tenant. So next step would be let's just probably start reading some email or maybe send emails also to others. And now it turns into a potential business email compromise internally, where now you're using an internal source to start sending or interacting with mail in general, I guess with others in the tenant.
That's a basic use case: it takes the steps of adding the secret, authenticating, calling an API, waiting on mail, and maybe destroying the app or deleting the secret at the end. So there are multiple steps that you have to do. That's something you could do with Cloud Katana, for example: build your campaign, make sure that the Cloud Katana identity has enough permissions to do these actions, and then run it.
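As a rough, lab-only Python sketch of that sequence, with placeholder tenant/app IDs and a hard-coded token standing in for however the simulation identity authenticates: add a secret to an existing app registration via Microsoft Graph, wait for it to propagate, authenticate as the app with MSAL, then read mail with its application permissions. (Cloud Katana wires these steps into durable activities rather than a single script.)

```python
# Lab-only sketch of the scenario described above. IDs, the target mailbox,
# and the wait time are placeholders.
import time
import msal
import requests

TENANT_ID = "<tenant-id>"
APP_OBJECT_ID = "<application-object-id>"   # object id of the target app registration
APP_CLIENT_ID = "<application-client-id>"
GRAPH = "https://graph.microsoft.com/v1.0"

def graph_headers(token: str) -> dict:
    return {"Authorization": f"Bearer {token}"}

# Step 1: add a password credential to the app (the caller needs rights to do this).
caller_token = "<token of the identity running the simulation>"
new_secret = requests.post(
    f"{GRAPH}/applications/{APP_OBJECT_ID}/addPassword",
    headers=graph_headers(caller_token),
    json={"passwordCredential": {"displayName": "simulation"}},
    timeout=30,
).json()["secretText"]

# Step 2: wait for the credential to propagate (this is where a durable timer helps).
time.sleep(120)

# Step 3: authenticate *as the app* with the new secret.
app = msal.ConfidentialClientApplication(
    APP_CLIENT_ID,
    authority=f"https://login.microsoftonline.com/{TENANT_ID}",
    client_credential=new_secret,
)
app_token = app.acquire_token_for_client(scopes=["https://graph.microsoft.com/.default"])

# Step 4: with app-role permissions such as Mail.Read, the app can read any mailbox.
mail = requests.get(
    f"{GRAPH}/users/someuser@contoso.com/messages?$top=5",
    headers=graph_headers(app_token["access_token"]),
    timeout=30,
)
print(mail.status_code)
```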
One approach that we're trying to take is also to see if we can just allow the app to use my own permissions. And that's something that could be easily modified, so that you don't have an app with a lot of permissions, which is super powerful but also not the best thing to have in your tenant. But I hope that basic example kind of shows you the steps, right, that a simulation would take.
I'm glad that Michael asked what the normal use cases are for this tool, because recently I was playing with Cloud Katana and trying to implement it. Usually there are organizations that have a single tenant and then grant permissions on just individual subscriptions for test and dev environments, and that way they have the ability to view everything that is happening in those subscriptions. So I thought that having privileged permissions on the subscription was enough, but it seems that it's not. Can you talk a little more about that and describe what type of permissions are expected? Yeah, so for example, something that we do with Cloud Katana right now is to provide templates of the actions that you can take, or that you can have the tool take for you, and the flows of those actions.
Some of them do require more from the app; for example, there is one attack simulation that requires the app to be able to grant permissions or define app role assignments for a new identity in the tenant. And if you want to grant that to an app, you need to be a privileged user, or at least be able to grant permissions, and that doesn't happen at the subscription level. That's because the tool by default is trying to say, hey, this tool is capable of doing X, Y, and Z in the tenant. So it will say you need to have permissions to at least grant permissions to this identity. That's one of the reasons why I think something we could do is allow the user to choose which use cases they would like enabled by default, and not enable the other ones unless you have the permissions for them; then you could just run it at the subscription level.
That's something that, to be honest, I had not played with as much, because a lot of my research has been on the identity side of things, like Microsoft Entra, for example, and a lot of these scenarios do require a privileged identity. So that's the only reason, I guess, why it doesn't work with only subscription-level access. It's very powerful, and now we're just trying to make it more flexible, more dynamic, and trying to explore other scenarios.
Like I mentioned before, doing something like phishing is very interesting: deploying sites, bringing them down, trying to maybe trick MDO (Microsoft Defender for Office 365) sometimes, and continuing to learn about the things that are possible. And for something like that, you need to containerize applications; you need to build your own Docker container images, probably. So I'm exploring that a little bit more, and it seems to be super powerful.
One of the things I want to ask about is looking at this as someone that is definitely not an expert in this space. How would someone get started, get involved doing security research, engaging in the open source community? How would you recommend someone getting started to do that? Yeah, that's a good question. So in general, first it would depend on the type of research that the person wants to do.
I recommend to friends in the community that they get very comfortable with using something like Fiddler or Burp Suite, for example, and then be able to understand how, for example, an endpoint that has been joined to Azure AD works, what the communication looks like, and what happens from the client to the server when you authenticate. For those things, you need some of those tools, and there are free, community versions of them. So for example, if you want to take that path, I think it's a very interesting path. You just need to, once again, identify what it is that you want to do. That's for understanding how your computer communicates with the cloud. Just be curious about how things work and then try to use some of the tooling around; once again, a lot of those are community-version tools that you can use.
The other thing that I recommend a lot, to folks who want to do the kinds of things I've been doing for the past couple of years, is automation and deployment of research environments. At Microsoft, in my team, I'm enabling and helping others who would like to go through certain scenarios and say, hey, what would happen if a threat actor does X, Y, and Z? Well, to find out, we need to deploy environments. The same goes when you start working with, for example, right now, let's say, AI in security. If you're trying to understand how an application that uses an LLM, either as a chatbot or as a personal assistant that can do things for you, could be attacked, well, we need to deploy an app, and there will be some automation in between so that you can start testing a few things.
So I always recommend, if you're interested in research like this for simulations and exploring what would happen, that you start learning Terraform and Bicep. Bicep is from Microsoft, right? Bicep gives you a Terraform-like experience. Or learn, for example, how to use ARM templates so you can deploy things in Azure by using the Azure Resource Manager service. Or maybe you want to learn on AWS, so you might need to learn the language they have as well.
And a lot of these things are pretty easy to actually get into and learn. With Microsoft, for example, and I'm pretty sure with other providers too, I'm not trying to say do everything with Microsoft, but in general, if you're building your own little things, testing with one app and maybe with, I don't know, Sentinel, there are always these 30-day trials and three-month trials. Even with some limitations, they're really good for actually playing with things for a long time.
And I think that it's pretty interesting to start diving into some of those areas, but always having, to me, the goal in mind. My goal is what would happen if somebody uses this? So then I start bringing my tools. All right, so I need to bring my Bicep templates, my ARM templates. I need to start learning about container apps. So let's just maybe containerize an application locally. Once we feel comfortable, now how do we deploy it to Azure? How do we deploy it to other cloud services?
So that's what I would definitely recommend to people, just to start learning some of those tools that would allow you to start experimenting. And it's not expensive, to be honest, to do some of these basic testing. So cloud infrastructure and what happens from your client to the server, from your computer to the cloud, and exploring that, I think it will open the door to so many different topics.
I think that by itself would open the door to understanding how authentication works and what the different protocols are that are being used. And just take as many notes as you can, and then, once you feel comfortable, share it at a local event. Share it at a local BSides conference, for example. That will also push you to start structuring your research, structuring how you want to explain things.
The more you explain things to others, the more you feel comfortable, the more you learn and it just becomes this cycle that turns into a methodology and now you actually start identifying what works for you for certain research topics. Awesome. That sounds like a very intense version of continuous learning across a lot of topics.
Yeah. I mean, that's what it is, and that's what I love about my job. Security in general, I think there is so much to cover, but you realize pretty soon that you start doing similar things, right? As I mentioned before, deploying a research lab environment, whether you want to do it for Active Directory, for a hybrid environment, or everything in the cloud, and maybe now start testing new applications, like I mentioned before, AI applications: how can we do all that? You follow a similar methodology, right? You need to understand the fundamentals, build your little POC, test it a little bit, take some notes, as I mentioned before, figure out what you can do with it, and maybe build something cool. For example, I was recently trying to learn how we can use retrieval-augmented generation.
How can we use that, for example, so that my application, which is using a specific large language model, I don't want to say specific names, can bring in my own knowledge and make the bot actually know more than what it knew up to the moment it was trained? Because that's one of the limitations of some of these chatbots out there: they don't have knowledge of what's current, like today, right? Some of them already have some enrichments.
To me, I was thinking, if I'm trying to use a large language model into security and I bring it to my threat intelligence team, what can I do to provide that TI knowledge that sometimes it's only available for the company that holds it? So that was just an experiment, and I was thinking, all right, well, first thing, let's try to figure out what could be an open source threat intelligence sample, for example.
And then I thought, you know, MITRE ATT&CK has a lot of ATT&CK groups in their own database. When we are thinking about TI, we're tracking groups, we're tracking the tradecraft of the adversaries.
So my goal was, all right, let's investigate how we can bring all that data from MITRE ATT&CK and focus only on groups of MITRE ATT&CK and then see if we can put it into a database that then we can retrieve that knowledge as we ask questions and then make sure that we enrich our communications with our application that is using an LLM.
That's retrieval-augmented generation, for example, where you are providing additional information, making the application that you're building retrieve the additional knowledge it requires to be more accurate in its responses. And with zero knowledge of all of this, it was just taking those steps: what do I have to do first? What tools can I use that are open source? And then just start working your way to building a basic POC.
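Here's a small Python sketch of the retrieval half of that idea, assuming the requests, sentence-transformers, and numpy packages: it pulls the enterprise ATT&CK STIX bundle, keeps just the group (intrusion-set) objects, embeds their descriptions, and retrieves the most relevant ones for a question so they can be pasted into an LLM prompt as extra context. The exact libraries and model are assumptions; the pattern is what matters.

```python
# Minimal RAG-style retrieval over MITRE ATT&CK groups (the "R" in RAG).
# The retrieved text would then be placed into the prompt of whatever LLM you use.
import numpy as np
import requests
from sentence_transformers import SentenceTransformer

ATTACK_URL = ("https://raw.githubusercontent.com/mitre/cti/master/"
              "enterprise-attack/enterprise-attack.json")

bundle = requests.get(ATTACK_URL, timeout=60).json()
groups = [
    f"{o.get('name', '')}: {o.get('description', '')}"
    for o in bundle["objects"]
    if o.get("type") == "intrusion-set" and not o.get("revoked", False)
]

model = SentenceTransformer("all-MiniLM-L6-v2")
group_vectors = model.encode(groups, normalize_embeddings=True)

def retrieve(question: str, k: int = 3) -> list:
    """Return the k group descriptions most similar to the question."""
    q = model.encode([question], normalize_embeddings=True)[0]
    scores = group_vectors @ q  # cosine similarity (vectors are normalized)
    return [groups[i] for i in np.argsort(scores)[::-1][:k]]

context = "\n\n".join(retrieve("Which groups are known for spear phishing against banks?"))
prompt = f"Answer using only this threat intel:\n{context}\n\nQuestion: ..."
# 'prompt' would now be sent to the LLM of your choice.
print(prompt[:500])
```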
And then I presented that at a couple of conferences already, and it seems it has been super helpful for others just to see the methodology of how you get to something that you were not comfortable with maybe six months ago, but now you feel comfortable enough to share with others, or maybe even to have the internal conversations that you wanted to have with others who are already talking a different language.
Six months ago, when somebody was talking to me about this technique, RAG, retrieval-augmented generation, I didn't know what they were talking about. And I was like, man, I feel that I cannot contribute because I get stuck just at the conversation. So, yeah, that's just one example of what somebody can do with open source stuff for free. So, talking about open source, back in 2022 you came on and talked about SimuLand.
Can you talk a little bit about it, anything new, and how you could use it together with Cloud Katana, et cetera? Yeah, so SimuLand is still an open source project where we wanted to say there are specific attack paths that a threat actor can take. For example, the topic that we have in SimuLand is how you can abuse federation services in an environment that is connected to the cloud via AD FS, how you could compromise that environment, and what the different options are that you, or a threat actor, might have. And then, what are the security controls that exist around some of those steps in the end-to-end scenario? So SimuLand was a way to say: these are the scripts that you can use to deploy the environment.
There are some manual steps that unfortunately you have to do. But the idea was: this is the environment, deploy it; once you feel comfortable with it, these are the steps that you can take to go through the simulation. And we didn't want to, of course, come up with new ways to do things. We just took what the community had already shared. So there is a pretty good tool, AADInternals, for example, by one of my good friends in the community, Dr. Nestori Syynimaa from Secureworks.
And so it was nice to use something that is already out there and that we are already tracking in terms of how threat actors would use it, and then simply share the steps. The idea was also this: usually when you share TI around, let's say, a specific attack path, you share alerts. We share, hey, they ran these commands, maybe pay attention to these IPs or those permissions, those APIs being used, et cetera. But I think that when you actually run a scenario in your own research lab in a company, for example, you're exposing the security researcher to more context than a simple report, or a report in general, might provide. For example, when we did SimuLand and shared the end-to-end simulation, we actually found more logs that we had to add to detections we had already built in the past, because it exposed us to some data sources that we were not as aware of.
And we also feel that customers bring their own security stack, right? They might use something else. So why not see what else gets generated in your environment? So that was the idea. Run it in your environment, follow the steps. We were not providing a tool to actually execute everything. We were saying, these are our notes. This is how we will run it. And then you take care of it, you as a customer, right?
With Cloud Katana, what you can do is there are some scenarios, especially those that are only fully cloud-based scenarios, that you can just have a Cloud Katana flow or campaign to be running, for example, in a sequence. You can just run it automatically if you wanted to, right? Granting the right permissions to the app. So that's how you can mix it up a little bit.
And the news with SimuLand is that, for the past two years, I've been working internally on documenting more scenarios that we're using to validate security controls and create new detections, partnering with the defender teams to start providing more coverage in some scenarios. And we're going to be releasing more things at the end of the year, where there are going to be some pretty cool scenarios that align very well with some of the new capabilities our products have.
I'm not a salesperson. I'm just a security researcher, and for me it's very important to be able to say, yes, these are the 10 detections, these are maybe the 10 products that you can use to protect yourself, but let me also share how you could validate it and maybe test your SOC, for example. Bring your security team in for a day and say, let's run all these simulations in SimuLand and then see how we respond to that, right? Something like that. I think that's the goal as well.
To make people experience that and go beyond just alert, right? What else is in your environment that we might not even have because you might have a different tool? And what happens when this scenario triggers in your environment? That's the idea. So Roberto, obviously you do tons and tons of cool things, but I know you've been doing some things in the open source AI space. Do you want to quickly tell us about the kind of things you've been looking at there?
Yeah. So I touched a little bit on this a couple of minutes ago, but just to expand on it. The goal is that we're trying to share, we as the Open Threat Research community. I'm the founder and also part of this community, where we have a few folks, for example, those that are part of MSTICPy, the open source Python library from Microsoft.
Just to start brainstorming around to see how we can share Jupyter notebooks with some examples, once again, like some proof of concepts of what some of these skills, for example, would be interesting to learn. So let's say you want to build your own chatbot. Yeah, there is a lot of blog posts and videos now. Like there is a lot actually that gets released every week. Our goal is to say, how can you apply some of that into security, right?
If you're building a chatbot, can we share that basic example in a Jupyter notebook and say, let's build a chatbot that might help you query your database? Let's use maybe some open source tools like, I don't know, LangChain or LlamaIndex; there is a lot that you can use out there, and start exploring those capabilities. So we'll be sharing a couple of Jupyter notebooks soon.
I shared a few in my recent presentations where, for example, beyond the retrieval-augmented generation example that I just talked about, you can help the LLM have more context around the topic that you want to interact with, right? You need to provide that knowledge. So how do you do it? That's what we did with RAG.
One of the things that I was trying to do is: how can we also use some of these capabilities to query a SQL database, by having the LLM learn the schema of the SQL database, maybe, or of a graph database, and then start interacting with the database in a more natural-language way? So using natural language, asking regular questions, and getting that response back. Once you learn how to interact with structured data, a database or data in general, then how can we actually start doing this in a loop?
How can we say, ask a question, get a response, and then based on the response, maybe trigger another action?
So then we're diving into the concept of agents, for example, which are super powerful for building some of these workflows. Yes, you can automate a lot of these things with a couple of scripts, but having the LLM in the middle allows you to summarize a lot of these responses in a better way, and it expedites the way you tell someone, hey, can you put this in a YAML format or in a JSON format?
Or, for example, I'm a big fan of Mermaid, which allows us to create some beautiful graph visualizations with a couple of lines. It's not that hard to type, but when you want to write Mermaid for a big document, it's not that easy. So maybe having a tool that can easily do that for you, because it knows the schema, is what we're trying to test. And then, how do we take it to security? How can we go from an event log in XML or JSON to a data frame?
Yes, there are tools to do that, but how can you make it actually query that data frame based on, I don't know, a couple of questions, or by giving it examples like, hey, this is what lateral movement looks like for me; do you see lateral movement in this data frame? Things like that. I'm trying to build something like that and share it with the community and see if it inspires others to do the same.
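A rough Python sketch of that "ask the data frame in natural language" idea, where ask_llm() is a hypothetical placeholder for whichever chat model or API you use; the point is that the prompt carries the schema plus a small sample of rows and the analyst's own description of what lateral movement looks like.

```python
# Sketch of querying a DataFrame of security events in natural language.
# ask_llm() is a placeholder for whatever LLM/chat API you have access to.
import json
import pandas as pd

# Toy logon events (in practice: parsed from EVTX/JSON logs into a DataFrame).
events = pd.DataFrame([
    {"time": "2024-01-01T10:00:00", "event_id": 4624, "logon_type": 3,
     "src_host": "WKSTN01", "dst_host": "SRV01", "user": "alice"},
    {"time": "2024-01-01T10:02:00", "event_id": 4624, "logon_type": 3,
     "src_host": "WKSTN01", "dst_host": "SRV02", "user": "alice"},
    {"time": "2024-01-01T10:03:00", "event_id": 4624, "logon_type": 3,
     "src_host": "WKSTN01", "dst_host": "SRV03", "user": "alice"},
])

def ask_llm(prompt: str) -> str:
    """Placeholder: call whatever LLM/chat API you use."""
    raise NotImplementedError

schema = {col: str(dtype) for col, dtype in events.dtypes.items()}
prompt = (
    "You are helping a security analyst. Here is the schema of a DataFrame of "
    f"Windows logon events:\n{json.dumps(schema, indent=2)}\n\n"
    f"Here are the rows as JSON:\n{events.to_json(orient='records')}\n\n"
    "For me, lateral movement looks like one user making network logons "
    "(event_id 4624, logon_type 3) from a single source host to several "
    "destination hosts in a short window. Do you see lateral movement here? "
    "Answer briefly and list the hosts involved."
)

# answer = ask_llm(prompt)
# print(answer)
```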
I was very happy to hear some internal researchers, and some other friends in the community too, saying that they learned a lot from these recent notebooks and the presentations given at the past couple of conferences, like CIFCON, for example, in Poland. It was just nice to hear that it was helping others, so why not continue doing that and sharing all that information? So, Roberto, we usually ask for a final thought or maybe a recommendation for our listeners.
Do you have one that you could share? Yeah, yeah. So this might sound obvious, I don't know, but it's something that I always tell people: just continue being curious about all this stuff. Don't be afraid to jump into a new project, and maybe search for a new open source tool that you can use to do certain things. Don't feel that you're reinventing the wheel; actually, that's where you learn a lot, just trying to build a lot of stuff yourself.
Yeah, and just continue to be curious and share as much as you can with others. That also helps a lot. Hey, thanks for joining us this week. Really appreciate it. One thing I love about doing this podcast is that you get to talk to really interesting people and you always learn something, and this is, again, absolutely no exception. So again, Roberto, thank you so much for joining us this week. And to all our listeners out there, hopefully you found this of use too.
Stay safe and we'll see you next time. Thanks for listening to the Azure Security Podcast. You can find show notes and other resources at our website, azsecuritypodcast.net. If you have any questions, please find us on Twitter at Azure Setpod. Background music is from ccmixter.org and licensed under a Creative Commons license.