Welcome to the Azure Security Podcast, where we discuss topics relating to security, privacy, reliability and compliance on the Microsoft Cloud Platform. Hey, everybody. Welcome to Episode 79. This week it's just myself, Michael, and Sarah. Mark and Gladys are taking a little bit of time off. And this week we have a guest, Thomas, who's here to talk to us about MSTICPy. But before we get to our guest, let's take a little lap around the news. Sarah, why don't you kick things off?
Sure. So I've just got one little bit of news this time, which is the Azure Linux container host for AKS has now gone GA. So that's something you might have played around with in public preview, or something that you need to use. So if you want to use the Azure Linux container host in your AKS clusters, go and have a look because it's now GA, which means it's fully supported.
And yeah, we've had quite a few customers who've been running it for a while now in the public preview. Plenty of people saying good things about it. So we'll put a link in the show notes. So if that's something you might be interested in, go and have a look. The other thing I've been away for a couple of weeks. I've been doing some conferencing. I was super lucky to be at Black Hat Asia and also a conference called NDC, which is Norwegian Developer Conference, which is in Oslo.
There I've been speaking to devs about security and talking to them more about how they can secure things because we know that a lot of security issues can stem from how people build things. So it's great to go and talk about security not just to security people, but to the wider IT land. Something that was really interesting, Michael, I don't know, because you've done secure coding for a long time.
I asked the room who had had a bad experience with security in the past. Bear in mind, this is a room full of, you know, basically devs. Pretty much everyone put their hand up, which is really sad.
And the people sort of had negative experiences with their security team. I think this comes from, you know, the days of us just saying no and not being helpful. But I don't know. What do you think, Michael? You've had experiences with that before. You've been doing secure coding stuff for a long time.
I think the number one skill that I think a lot of security people actually miss is the ability to choose your battles, right? Some things are just not worth fighting. And some things are. I mean, you're willing to die on the hill for that particular position that you want to take. The other thing is, and I've sort of believed this for a long time now, is, you know, anyone can tell you how to secure something.
But I think it really takes a really experienced security person to know when you don't have to. Because sometimes you don't have to. And unfortunately, a lot of security people don't like to take that position. And hence, they end up just saying no, no, no, no, no. Whereas sometimes there's a very amenable middle ground. And a lot of security people don't want to take that middle ground. They want to just take, you know, no, no, no, no, no. And unfortunately, that doesn't get you anywhere.
Good point. Actually, I love talking to developers. I love it. I've got a few items. The first one, in public preview, is Azure Active Directory support for Azure Files SMB shares. This is cool because, I mean, historically, you know, if you talk about SMB, for example, on-prem, that was primarily using, you know, Windows-specific algorithms, transport algorithms and authentication algorithms.
Now, for SMB file shares using Azure Files, you can now use Azure AD. Now, that's really cool because you can use managed identities. And that's really cool as well because now you don't have to store credentials. You know, AAD takes care of the credentials. So now we can start building much more secure solutions because we're not persisting credentials anywhere. So great to see that.
The next one, in general availability, is Private Link support for Application Gateway. Again, you know, App Gateway was historically something that was just public, sort of public facing. Well, now you can lock it right down if you're using it, you know, with private endpoints. So this is another great thing to see. As I've mentioned on so many episodes of the podcast, we're seeing this huge wave across the company, and have been seeing it for a long time now.
So we're seeing more private endpoints and more use of managed identity and AAD authentication for client authentication. And the last one, also on managed identities and generally available, is managed identity support for capture in Event Hubs. So there's a capture feature in Event Hubs and users can now use managed identities when capturing event streams to storage services such as, say, Azure Storage or Azure Data Lake Storage Gen2.
So it enables users to do cross-subscription data capturing as well. So again, really great to see more products using managed identities and AAD authentication for client authentication. So with that little short news section out of the way, let's turn our attention to our guest. This week we have Thomas, who's here to talk to us about MSTICPy. Thomas, welcome to the podcast. Would you like to take a moment and introduce yourself to our listeners?
Sure. Thank you, Michael. So my name is Thomas Roccia and I'm currently working at Microsoft as a senior security researcher in the Defender team. So the goal of my team is basically to look for new threats and improve the detection for our products. So basically we are always looking for new techniques that try to bypass the antivirus engines and the security that we put in place in Windows and other Microsoft products.
So yeah, my goal is mainly to investigate malware, understand how it works and build detections and signatures against it. And yeah, I've also been part of the MSTICPy team for a year now. So I'm involved with some of the feature development and so on. And yeah, that's me. Thomas, obviously, well, you also live in Melbourne like me, so we have met in person a few times.
But what is your team's main goal? Because I know that we have a lot of research teams in Microsoft and they all do different things. We've had a couple of people on before who've done research stuff. So I'd be really keen to hear what your focus is. So obviously the team is super big. So there are different kinds of work groups and different research teams.
My team in Melbourne is focused on what we call advanced techniques, detections, and PUA and grayware research. So basically this is some of the research that we do against potentially unwanted applications and potentially unwanted software. But we are also looking at how these tools are used to install malware. And it could also be some research around adware, because it does have a big economy as well.
So the team is basically working on this kind of focus and tracking the cybercrime economy related to ad fraud, ad clicking and so on as well. You actually said something there that I'm not so au fait with, I'm not super familiar with. You said grayware. I don't know what grayware is. So it's basically software that appears to be legit. But in fact, it is playing on the border, and you don't really know if it's a legit application or if it's a malicious application.
And it's kind of a gray area because you don't really know and you have to investigate. And most of the time there is malicious stuff running in that space. Well, that's really cool. Now, I know that one of the things that you work on, and I wanted to talk about this quite a lot because I know of it, but I don't know enough about it, is MSTICPy,
which is a tool that's been around a little while. But can you tell me about the tool and, you know, the history of it, what we do with it, et cetera? So MSTICPy is basically a Swiss Army knife tool for threat intelligence. So it was created by Ian Hellen. I think it was in 2017 or something like that. And at first it was an internal project used to interact with a lot of Microsoft data and so on. Then Ian and Microsoft decided to release it as an open source tool for the community.
So MSTICPy is a Python library which you can use to analyze different kinds of data sources. You can use it for logs, but you can pretty much use it for any kind of data. And the advantage of MSTICPy is that you can use it in a Jupyter Notebook, which is basically a framework to create Python code and exchange the information using a notebook with your team.
So this is super powerful because you can combine Python's visualization capabilities and also Jupyter for exchanging a kind of workflow with your team. MSTICPy is basically, as I said, a Swiss Army knife for threat intelligence. So you can use it, for example, for querying different kinds of logs for investigation, forensic investigation, incident response and so on. You can also use it to enrich the data that you collect with multiple threat intelligence providers.
So, for example, there are modules to connect your data with VirusTotal or any other kind of threat intelligence provider. And you can also plug MSTICPy into Azure resources if you have some kind of information that you would like to investigate and analyze as well. And I think the most powerful feature of MSTICPy is the visualization. So there are some modules that you can, you know, directly use.
It's built in the application and you can directly use against your data to visualize different kind of trends, also creating some interactive timelines, make some process trees. And this is very powerful and useful when you are doing this kind of investigation, because in a single glance, you can have a tool that will help you to analyze all your different data, but also visualize them and enrich them with multiple tools as well.
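To make the process tree idea concrete, here is a minimal sketch in plain Python. It is not the library's own implementation and uses none of its API; the event fields (`pid`, `ppid`, `name`) and the sample records are made up for illustration, and PID reuse is ignored for simplicity.

```python
from collections import defaultdict

def build_process_tree(events):
    """Return indented text lines showing parent/child process relationships."""
    known_pids = {e["pid"] for e in events}
    children = defaultdict(list)
    for e in events:
        children[e["ppid"]].append(e)

    # A root is any process whose parent does not appear in the log.
    roots = [e for e in events if e["ppid"] not in known_pids]

    lines = []

    def walk(event, depth):
        lines.append("  " * depth + f'{event["name"]} (pid {event["pid"]})')
        for child in children[event["pid"]]:
            walk(child, depth + 1)

    for root in roots:
        walk(root, 0)
    return lines

# Hypothetical process-creation events, as they might look once extracted from a log.
sample_events = [
    {"pid": 100, "ppid": 1, "name": "explorer.exe"},
    {"pid": 200, "ppid": 100, "name": "winword.exe"},
    {"pid": 300, "ppid": 200, "name": "powershell.exe"},
    {"pid": 400, "ppid": 100, "name": "chrome.exe"},
]

tree_lines = build_process_tree(sample_events)
print("\n".join(tree_lines))
```

Seeing a scripting host nested under a document editor, as in this made-up sample, is the kind of pattern a tree view makes visible at a glance.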
So this is really, for me, one of the best tools today for doing threat intelligence research, because it is super versatile and also very easy to use. I'm going to ask, as someone who is not an amazing coder, how much Python do you need to know? Well, the thing with Python is that it's pretty easy to use. You know, it's probably the most used language in the cybersecurity industry. And the reason is because it's not so difficult to learn and to use.
So I would say if you have some basic knowledge of Python and if you know the basics, you know, such as the structure of the code and how to create some loops, some functions and so on, I think it's fairly easy to use. The documentation also helps a lot in understanding the features of MSTICPy and how to use it directly. And also there are a lot of examples in the GitHub repository. So there are some notebooks as well that you can directly reuse and adapt to your own data.
So you don't even need to really know how to code in Python. You can just reuse the code that we already built and just adapt it to your own data. So you mentioned you can query various types of data, including your own data. Can you give examples of data? So for example, I work in the Azure SQL Database team or Azure Database team, I should say. I do Cosmos DB, Azure SQL Database, MySQL, PostgreSQL and so on.
I mean, are those examples of logs that you could use, and, you know, W3C logs from a web server? I mean, what sort of typical logs are you looking at? Well, it really depends actually. So you can connect it, for example, to an Azure cluster and make some requests in KQL, for example. So with that, you can actually put any kind of data into the cluster and then request the information using MSTICPy. So it's really versatile. So I have actually a good example.
Last year, I'm not sure if you heard about the Conti leaks. Conti was a famous ransomware group and some of their internal chats leaked last year. And I did some experimentation using MSTICPy to analyze this kind of data. So it was basically JSON files with all the information related to the discussions of the members of the group. And what I did actually, I just used MSTICPy to load all this information and started to analyze it.
So I used MSTICPy to extract indicators of compromise, such as, for example, IP addresses, URLs, and Bitcoin addresses as well. I also used MSTICPy to enrich that information using threat intelligence providers, but also to navigate through the data and extract the information that I wanted. And the thing is with MSTICPy, you can use Jupyter Notebook. So you can also build your own dashboard. So with Jupyter, you can create, for example, buttons and so on. So it's part of Python and Jupyter.
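The extraction step described here can be sketched with nothing but the Python standard library. This is an illustration of the idea, not the library's extractor: the regular expressions are deliberately simple, the message field name `body` is an assumption about the JSON layout, and the sample messages are invented (they use a documentation-reserved IP range and domain).

```python
import json
import re
from collections import defaultdict

# Simplified patterns for illustration; production extractors cover many more cases.
IOC_PATTERNS = {
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "url": re.compile(r"https?://[^\s\"'<>]+"),
    "btc_address": re.compile(
        r"\b(?:bc1[a-z0-9]{25,59}|[13][a-km-zA-HJ-NP-Z1-9]{25,34})\b"
    ),
}

def extract_iocs(messages):
    """Collect unique indicators of each type from the 'body' of every message."""
    found = defaultdict(set)
    for msg in messages:
        text = msg.get("body", "")
        for ioc_type, pattern in IOC_PATTERNS.items():
            found[ioc_type].update(pattern.findall(text))
    return {ioc_type: sorted(values) for ioc_type, values in found.items()}

# Invented sample standing in for a chat-log JSON file.
sample_json = """
[
  {"from": "alice", "to": "bob", "body": "new panel is up at 203.0.113.7 port 8443"},
  {"from": "bob", "to": "alice",
   "body": "grab the build from http://example.test/payload.bin and send funds to 1BoatSLRHtKNngkdXEeobR76b53LETtpyT"}
]
"""

iocs = extract_iocs(json.loads(sample_json))
for ioc_type, values in iocs.items():
    print(ioc_type, values)
```

The extracted values would then be the input to the enrichment step, where each indicator is looked up against one or more threat intelligence providers.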
And you can basically create a notebook that is interactive with your data. I actually wrote a blog post on the Microsoft blog last year and there is a Jupyter Notebook available as well. So you can just reuse it and load the data. And for example, I built a kind of map of the connections between the different members where you can visualize the number of discussions between two members and so on. So pretty much you can really load any kind of data through MSTICPy.
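The connection map boils down to counting how often each pair of members talks to each other. A minimal sketch of that counting step follows, assuming each message carries hypothetical `from` and `to` fields; it is not the code from the blog post, and the sample messages are invented.

```python
from collections import Counter

def count_conversations(messages):
    """Count messages exchanged between each unordered pair of members."""
    pairs = Counter()
    for msg in messages:
        sender, recipient = msg.get("from"), msg.get("to")
        if sender and recipient and sender != recipient:
            # Sort the pair so that alice->bob and bob->alice count as the same edge.
            pairs[tuple(sorted((sender, recipient)))] += 1
    return pairs

# Invented sample messages.
sample_messages = [
    {"from": "alice", "to": "bob"},
    {"from": "bob", "to": "alice"},
    {"from": "alice", "to": "carol"},
    {"from": "bob", "to": "alice"},
]

edge_weights = count_conversations(sample_messages)
for (a, b), weight in edge_weights.most_common():
    print(f"{a} <-> {b}: {weight} messages")
```

Each pair and its count can then be passed to a graph-drawing library as a weighted edge, which gives the kind of map described above.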
And there is some built-in modules that help you to connect directly to some specific interface, such as KQL or Azure Data Explorer and so on. But you can also use your own data from an Excel file, a JSON file or any kind of other format of files. What about for things like on-prem like with Windows? Could you read Windows event logs or do they need to be extracted in a certain form first? You will need to extract them. But if you have them in some places, you can totally do that as well.
So, Thomas, I know that, well, I am very familiar with Microsoft Sentinel and there are some other products that use workbooks. Now, workbooks do do visualizations, but that's just based on queries against the logs that are within the product. But some people might be asking, what's the difference between a workbook and MSTICPy? So why would you use one over the other? I don't know if you've got any thoughts on that.
Yeah, I think MSTICPy is a bit more versatile to use, especially because you can use it with Python. So you can code your own modules, add your own information and your own features, something that you cannot really do with a workbook. And also with MSTICPy, you have some built-in features that can help you enrich the data and some features that are really specific to investigation.
Such as, for example, we have full modules used for pivoting, such as taking one piece of information and pivoting through that information to uncover more and so on. So MSTICPy is just a bit more versatile and it's not only dedicated to Sentinel, but to multiple kinds of data and information. Makes sense. Now, we've talked quite a lot about MSTICPy, but I also know that you are a very busy bee and do lots of other security research projects.
Now, the one that I wanted you to tell us a little bit about was your Unprotect project, because I'm not sure if people will be familiar with that. So can you tell us about it and what it is and what you do? Yeah, sure. So the Unprotect project is basically an open database that aims to document every malware evasion technique. So it's a very, very daunting task because there are a lot of different mechanisms and so on.
But basically we try to document that information, and in the database we are classifying the evasion techniques by different categories, such as, for example, anti-debugging and anti-disassembly. It could also be network evasion, sandbox evasion and so on. And the goal is to provide a fully detailed classification of these techniques.
So if you go to the portal and you search for a specific evasion technique, you will get information about this specific technique, such as the description. And we also provide code snippets that can be reused for detection purposes or even red teaming. And we also offer YARA rules, Sigma rules and capa rules when they're available. And the goal is really to help a malware analyst or an investigator or security professional to understand more about a specific evasion technique.
So the project is community centric, meaning everyone in the community can participate and contribute to the project. And we actually have at the moment 24 contributors from the community. And you can just make a submission, improve an existing description, or also just upload YARA rules or a code snippet and so on. So I started this project in 2015. And at that time I was working as an incident responder for different customers.
And each time I went on site, I realized that the customer didn't really understand why a specific piece of malware wasn't detected by the antivirus engine or wasn't really analyzed by the sandbox and so on. So I started to document the evasion techniques just to help them understand how malware can bypass the security in place and the analysis and so on. And so at first the project was just an Excel sheet and then it became a wiki, an open wiki.
And today it's a much bigger platform where we have all the techniques and all the information. One of my friends, Jean-Pierre Lesueur, is also one of the core contributors of the project. So we are actually two people maintaining this project at the moment. But since it's community-centric, there are also multiple people that are contributing to the project from time to time.
But basically this is the place to go if you want to learn more about malware evasion techniques and understand how to analyze it and improve your detection in place. So word on the street is that you've got a book out on this kind of topic. Do you want to talk a little bit about that? Yeah, sure. I just released a new book. I'm super proud of it. It was a lot of work, actually. So the name of the book is Visual Threat Intelligence, an illustrated guide for threat researchers.
And I really wanted to offer something different. So it's not your typical computer science book. It's more like a visual guide with illustration, graphics and so on. So I wanted to provide something which is very digestible to understand, easy to read, but also can serve as a reference in your daily job or for a starting learning point, for example.
So it's a book about threat intelligence. And in this book, I talk about the fundamentals of threat intelligence, such as, for example, the threat intelligence life cycle, the Traffic Light Protocol, the different types of intelligence, open source intelligence as well. I'm also talking about threat actors and operating methods. So I talk about tactics, techniques and procedures, and the ATT&CK matrix.
Also, the Unprotect project is part of this book. Then I'm also talking about how to track an adversary, a threat actor, using indicators of compromise, using pivoting techniques and so on. And I also talk about some different tools, such as YARA, Sigma and also MSTICPy in this book. And the last part of this book is about the notorious cyber attacks that have shaped the cyber security industry for the past decade
and some of the investigations that I did during my career. So I'm also discussing my past experience on the front line of some of the biggest cyber attacks of the past decade and so on. So it's really a practical book with visualization. I think it's really different from what currently exists on the market today. And this actually was my goal. I really wanted to propose something different.
I didn't want to have a big book which is very heavy to read and understand. This one is fairly easy to understand and to read because of the visualization, but because also of the specific structures that I put in place. And I think it's kind of innovative in our industry because I don't think there is a similar book in the market at the moment. You often find books that are either academic, very dry, hard to read. Then at the other extreme you've got Malware 101.
It's nice to see, by the sound of it, you're reaching a middle ground, which I'm actually a fan of, because in a lot of stuff that I work on, I tend to write a lot of that middle ground documentation to bridge the gap between people understanding it and people who are super duper technical. You want to bridge that gap on a regular basis. That way you're meeting a very large population of people. So yeah, that's great to see. I hope you do well. I hope the book does really well.
I've got a question for you, Thomas. Did you draw all the pictures in the book? Yeah, all of them. All of them. Yeah, all of them. For those of you, you should go and check out Thomas' book. Thomas draws really lovely visual diagrams and I'm very jealous of the skill that I do not have. Yeah, that makes two of us. It's very complicated when you... Sometimes when you see the final illustration you say, okay, it's very easy.
It's very complicated, the process and the reflection to think about a technical concept and think about how to illustrate that technical concept in a simple way without removing any important information. So it's kind of very complex to start with an idea and starting to draw the illustration and to have something really useful and really complete without removing some really crucial information.
So the process of reflection is super heavy and it's actually... Writing a book is super complicated because you have to write every day and go to your routine and so on. But adding illustration on top of that is another challenge as well because every illustration takes days to create and to put in place. And during the process of writing the book, I always recreated some part of the illustration because my mind evolved during the process of writing.
And at some point I get new ideas and so on. So I had to redo some part of some of the illustration and so on. So it was a constant work and at some point you have to say, okay, now it's okay. I need to stop and I need to release the book at that time. Well, Thomas, I know that when I need drawings doing, I know who I'm going to come to because clearly you're not busy enough already.
But yeah, that's probably it. I actually might hit you up for drawings. I'm not lying. That's a true thing, but not yet. All right, this has been good. So one thing we ask all our guests is if you had just one final thought to leave our listeners with. As a final thought, like if someone wants to start in threat intelligence, I think the best way is probably...
Today there is a lot of information on the Internet. So I think the best way is probably to be aware of what's going on in the world, in the cybersecurity world as well, but also in geopolitics because threat intelligence is very tied to geopolitics as well. And also just be informed, go to security conferences, meet people and so on.
And I think that's the best starting point today in this industry because if you meet different people, you will potentially have new opportunities and so on. So I think that's the best way to do so. Yeah, just be aware of what's going on. Meet people and share your knowledge. Thomas, thanks so much for joining this week. I kind of stayed on the periphery of MSTICPy,
but it's good to see MSTICPy being explained by someone who is actually actively involved. I certainly learned a great deal from it. And to all our listeners out there, we hope you found this episode of use. Stay safe and we'll see you next time.