#294: oso authorizes Python

00:00

When we think about accounts and security, we often think about identity, logging in and proving who we are. But for many applications, especially internal apps at large organizations, that's just step one. The next step is what can we do and what can we not do? On this episode, you'll learn about a new library called Oso. It's a declarative way to create policy code that maps your mental model for who is allowed to do what in your system.

00:22

We have two guests, Graham Neray and Sam Scott from the Oso Project to tell us all about it. This is Talk Python To Me, episode 294, recorded October 23rd, 2020. Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the ecosystem and the personalities. This is your host, Michael Kennedy. Follow me on Twitter where I'm @mkennedy and keep up with the show and listen to past episodes at talkpython.fm and follow the

01:00

show on Twitter via @talkpython. Before we get to the interview, let me tell you about a brand new course that we just launched. At Talk Python, we run a bunch of web apps and web APIs. These power the training courses as well as the mobile apps on iOS and Android. If I had to build these from scratch again today, there's no doubt which framework I would use. It's FastAPI. To me, FastAPI is the

01:23

embodiment of modern Python and modern APIs. You have beautiful usage of type annotations. You have model binding and validation with Pydantic, and you have first class async and await support. If you're building or rebuilding a web app, you owe it to yourself to check out our newest course, Modern APIs with FastAPI over at Talk Python Training. This is the first course in a series we're building on FastAPI. And for just $39, it'll take you from interested to production with Fast

01:51

API. To learn more and get started today, just visit talkpython.fm/FastAPI or click the link in your podcast player show notes. Sam, Graham, welcome to Talk Python To Me. Thanks for having us. Thanks. Yeah, it's great to have you guys here. I'm excited to talk about this whole managing what people can do on computers from a slightly different perspective from the authorization side of things, which I think gets underserved in programming in general. So that's going to be a lot

02:21

of fun. But before we get to all that stuff, let's start with your stories. How'd you get into programming in Python? Sam, you want to go first? Yeah, sure. So I think for me, it was probably the kind of typical programmer entry, which was I had a very monotonous data entry job, where I was like, surely there's a better way. I was young enough that I reached for VB macros. A few years later, though, I actually ended up picking up

02:46

Python, primarily through my math degree. I had a professor who was very into number theory and worked on the SageMath package. Oh, yeah. SageMath is fantastic. Yeah, I've had William Stein on. It is incredible. Oh, nice. Yeah, it's cool what they're doing. Yeah. I actually dug up, that was my first open source contribution. There's a SageMath ticket from a good nine years ago. Okay, cool. Yeah. What kind of math were you studying?

03:10

So that was during undergrad maths. And so that stuff was, yeah, that was kind of the number theoretical side of things. After that, I actually went on to do a master's in cryptography and a PhD in cryptography and security, which is... Oh, nice. ...how I got here, basically. Yeah, yeah. It's sort of indirectly. Roundabout leads you here. Yep. Graham, how about yourself?

03:29

I actually took an entry-level CS course when I was an undergrad, right at the end of my undergraduate experience, and it took me way farther than I ever would have expected it to. I don't get to do that much programming on a day-to-day basis, but I try to whenever I can, including at a recent company hackathon. So I still like to dabble when I can. Yeah, yeah. Super. These days, you both work at Oso?

03:54

Yeah. Give us the rundown on, well, I guess, maybe introduce what Oso is, your company, since you both work there. And then what do you guys do day-to-day? Yeah. So Oso is a company. What we're all about is putting security in the hands of developers. That's how Sam and I got to know each other. That's the thing that we really connected on as the thing that we want to do. And the way that we think about doing that is by building

04:19

consumer quality developer tools for security. And so the area that we're starting, which is the area that we'll talk to you about today, is authorization. But that's really sort of the ethos of the company. And what's nice for Sam and me, I think for a lot of founders, it's not always clear how to think about division of responsibility. But for Sam and me, it tends to be pretty clear. I take responsibility for the business side of the company. So sales, marketing,

04:45

financing, everything on the operational side. Sam is responsible for everything on the technical side of the business. So running the engineering team, Sam built the first versions of the product by himself, and we share responsibility for the product roadmap. Yeah. Sounds pretty clear. And I think it's a cool project. How long has the company been around for? It's not super old, right?

05:07

So we've been working together for a little over two years, but we only open sourced the project about 10 weeks ago. Okay. So in the open source side, it's quite new, but yeah, still two years is pretty young for a company. And it's easy to think of the stuff that you're building just as technology. And it's clearly like developer tools and APIs and things like that. But man, that marketing stuff and getting the word out and sales, without that, you just can't go, man.

05:34

Yeah, that's the hard stuff. Exactly. Give me some cryptography and some compiling and language interop, but don't make me write a landing page. Now, I'm serious, though, that is a super hard part of, such a critical part of technical companies, open source companies, and so on. And it's easy to overlook that side. Yeah, absolutely. But it's fun. Yeah, for sure. So we're going to talk about one of the three A's. I was recently told there might be four A's in this

06:04

whole identity authorization sort of story, but I don't remember what the fourth one is. So I'm going to go with the three A's. We've got authentication. We've got authorization. So authentication, who are you? Authorization. Okay, now I know who you are. What can you do? And then auditing, what have you done? Right? You guys, you're a fan of the middle A.

06:24

Yeah, that's right. Yeah. And I think you just about nailed it. And a lot of the products out there really focus a lot on authentication, which is, I think, the thing that most, for instance, consumer users would be most familiar with, like logging into it, getting a login page, having your username and password, doing things like password reset, or more recently, things like

06:46

two-factor authentication, how all that stuff is managed. That's the authentication bit, just making sure that you can get in the door. Even the sign-in with Google, sign-in with GitHub is really primarily about just, usually that's about two things. One, who are you? And then sometimes it's about what part of GitHub do you want to let this app access, or what part of this app do you want to let access your Google data? But it's not,

07:10

it doesn't work in the reverse way. It doesn't tell you what the user's allowed to do on that application. It's just connecting those two apps together from a data side. So even the social auth stuff is really just authentication. Yeah, those lines get pretty blurry, right? Because you're allowing some other website to access information about yourself so they can check who you are.

07:28

So there is an element of authorization going on between those two services. But you're right, like the result of that is authentication. Yeah, I guess when I say that there's not any really authorization, I'm thinking that that doesn't tell the app what you're allowed to do, but it does tell, like say Twitter, what, like, are you allowed, is this app allowed to tweet on your behalf according to Twitter? Yes, exactly.

07:50

Right. But it doesn't help within your app. Like if I want an app, I want to know like, okay, this user, they can view invoices, but they can't create invoices. Precisely. But they can't ever see the bank details of anyone like that. Right. There's no social auth that's going to help you with that side of the story is what I was thinking. That's exactly right. Right. That's fundamentally an authorization question, what you just asked. Yeah. Okay. Exactly.

08:13

Cool. And so this is the core problem that you guys are trying to solve. And we're going to talk about some of the open source stuff that you've done and the Python APIs and all that. But maybe let's just continue this part of the conversation by talking about some common access patterns. And what is out there? Like we talked about the social auth and what that means. We talked about creating users with usernames and passwords. And what are some of the patterns you all are seeing out there?

08:38

Yeah. So basically every application out there needs to let its users in some way, shape or form, see their data and do something to their data. And so fundamentally that's doing some kind of authorization. And then you'll have, so in a basic like social app, that might be like, what posts can you see and what posts can you edit or something like that? And then you'll have. Yeah. So in some sense, I guess there's like an implicit or default authorization that every

09:04

application has. And it's usually I can see my stuff and public stuff and that's it, right? Like, right. There's no rules like, okay, when I go to Twitter, I just see my stuff and public stuff, right? I could go to my profile, but there's no expectation I would ever be able to see someone else's profile and those sorts of things. So I guess if you don't do anything, that's generally the access pattern people have: I create an account and that account can see its stuff.

09:30

Yes, absolutely. And I think that's kind of like the common paradigm in particular in like a consumer application context, particularly when you start to look at like business to business applications, you end up with like very quickly, the patterns get a lot more complex. So could be an HR application or a CRM application or a medical records application. And very quickly, what people will do in building

09:57

these types of apps is they'll reach for a pattern called roles, where they'll group a set of permissions or capabilities together, they'll lump them together into something called a role. They'll say anyone that has this role, like admin or billing or nurse or whatever it may be, can do these sets of things. And then if you want to be able to do those things, you got to get assigned that

10:17

role. And that's kind of like a handy thing, because it means that every time you want to make sure that someone can do those things, you don't have to repeat that work, which is kind of nice, right? But also kind of limited, because effectively, what that's doing is sort of representing all your permissions as like a two by two. You basically have a bunch of roles on one axis,

10:37

and you've got a bunch of capabilities on another axis. And the truth is that for most apps, their underlying data model isn't a two by two. They may have all kinds of other things going on. They may want to represent hierarchical patterns, like to represent an organization, they may want to represent inheritance, they may want to represent some kind of graph. And so like, oftentimes, after you adopt a role model,

11:03

Let me throw an idea out here, and you can tell me what you think of that. Like, so for example, if I'm a manager at a company, I can see my work, right, plus my team's work. Exactly. Right. But I don't want to see another team's work. I don't want that person to be able to see everyone's work, just the people for whom they are the manager, that kind of scenario. And then, you know, you can't really easily manage that setup, right?

11:25

That's a perfect example, because that's not a role. That's like a manager in that context isn't a static thing that you just assigned to someone that's kind of dynamic, based on where you sit in the org, you might be a manager of one team, you might be a VP, in which case you're a manager of like five teams. And so it's not this thing that's assigned to you. It's more like a function of maybe some other data that sits elsewhere in your application. Yeah.
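To make the contrast concrete, here is a minimal sketch in plain Python (not Oso code). The role names, permission strings, and the `Employee` class are hypothetical and only illustrate the point made above: a static role table is a roles-by-capabilities grid, while "manager" is computed from other data in the application.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

# Static roles: a fixed grid of roles (one axis) and capabilities (other axis).
# These role and permission names are invented for illustration.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "admin": {"view_invoice", "create_invoice"},
    "billing": {"view_invoice"},
}


def role_allows(role: str, permission: str) -> bool:
    """Answer a question the grid can express: does this role grant this capability?"""
    return permission in ROLE_PERMISSIONS.get(role, set())


@dataclass(frozen=True)
class Employee:
    name: str
    manager: Optional[str] = None  # name of the direct manager, if any


def can_view_work(viewer: Employee, owner: Employee) -> bool:
    """'Manager' is not an assigned role here; it is derived from org data.

    A viewer may see their own work and the work of their direct reports.
    """
    return viewer.name == owner.name or owner.manager == viewer.name
```

The second function cannot be expressed as a cell in the role grid, because the answer depends on which two people are involved rather than on a label attached to one of them.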

11:49

So you end up having to do all kinds of crazy things to hack around the role model and make that work for your application, which is all the kinds of stuff that we see. That's where it starts to get fun. Yeah, exactly. So that's probably means like, instead of just having say, a decorator or some simple if statement that says, if they are a manager, there's usually like some custom logic

12:11

checking in that section, right? There's like code that's been written somewhere that checks, you know, basically does those things and looks, like, I'm a manager, but who do they manage and so on. Then all of a sudden, this roles idea sort of, it somewhat falls down and you're coding the logic into your app, right? Right. And there's like, you know, other examples of where the sort of roles model stops and other

12:34

things begin. And you sort of have to start just adding more, whatever it is, if statements, or maybe you bring the logic to some other part of the application, because that's where it makes more sense to you. But yeah, the example you gave is a perfect one. The thing about that kind of stuff that scares me is what if I forgot, right? I've got a web app

12:53

with hundreds of endpoints. One section is like an admin section. And if that doesn't do the proper checks, all sorts of badness is going to happen, right? Absolutely. So you put your code in there. I mean, I don't know. How do you guys feel about that? Because I'm always like, I'm triple checking it. And then I'm like, I got to go back and check this again. Like this is, this could be bad.

13:15

It's super, super common. And that's where we see, we see a lot, you know, at larger organizations, that's where often like a security team will spend a large portion of their time. I've heard security teams who have like their own little regex that they use to go and find like every method and see if it has the piece of code they're expecting to see, or they sit on every code review so they can make

13:33

sure that one's not missed. For people without that kind of security team though, then yeah, it's just a case of hoping you don't forget it.
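The "little regex" audit described here could be sketched like this. This is an assumption-heavy illustration, not something from the episode: it uses Python's `ast` module rather than a regex, and the decorator name `requires` and the sample handler names are hypothetical. It only reports functions that lack the expected decorator.

```python
import ast
from typing import List

# Sample source to audit. It is only parsed, never executed, so the
# hypothetical `requires` decorator does not need to exist.
SOURCE = '''
@requires("admin")
def delete_user(request): ...

def export_all_data(request): ...
'''


def unprotected_functions(source: str, decorator_name: str = "requires") -> List[str]:
    """Return names of functions that are not wrapped by `decorator_name`."""
    missing = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        found = set()
        for dec in node.decorator_list:
            # Handle both `@requires` and `@requires("admin")`.
            target = dec.func if isinstance(dec, ast.Call) else dec
            if isinstance(target, ast.Name):
                found.add(target.id)
            elif isinstance(target, ast.Attribute):
                found.add(target.attr)
        if decorator_name not in found:
            missing.append(node.name)
    return missing
```

A check like this can run in CI, but it only proves a decorator is present, not that the rule inside it is the right one.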

13:39

Yeah. Oh, you might also even do the, it might also be kind of the opposite of that, which is that like, you might be good enough to include the logic everywhere, but rather than trying to sufficiently extend your like roles model to account for all the different like intricate scenarios that you're trying to properly represent, you might just say, ah, we'll just like, let this person be an admin or something like that, just so that they can do the thing that they need to do inside your

14:04

application. And then all of a sudden what you end up with is all these scenarios where people, or for instance, like internal services are over-provisioned because that was the fastest way to make it possible for them to get done the thing that they needed to get done. Yeah. Which can be equally painful to undo later on or risky. Yeah. It definitely sounds risky. Like it's, it's easy to have the admin, non-admin flag and just

14:26

go with that. But that's probably fine for like a small team, but as you grow, it's no longer going to work. Talk Python To Me is partially supported by our training courses. Do you want to learn Python, but you can't bear to subscribe to yet another service at Talk Python Training? We hate subscriptions too. That's where our course bundle gives you full access to the entire library of courses for one fair price. That's right. With a course bundle, you save 70% off the full price of our

14:55

courses and you own them all forever. That includes courses published at the time of the purchase, as well as courses released within about a year of the bundle. So stop subscribing and start learning at talk python.com. Let's talk about some of the coding approaches in, like, current Python projects. And I guess I talked about

15:19

one, Sam, you put down as like a DIY, do it yourself. And that's the, well, if we got to have managers and people for whom they manage, we've got to, we just write some code and sort of put that logic in there. And maybe you've got that overlaid on some groups. Maybe give us some of the common approaches you might see in like common Python web apps. Yeah. So overwhelmingly common is that this is kind of seen as just the regular code in applications,

15:46

just the things you have to do in an app to build it. So, you know, in that case, it just ends up baked into, you know, every method you have is going to have a certain amount of this logic. Like I think either you're going to see people who are just sprinkling this throughout the code base, adding it where it's necessary. And that's just kind of handled, or considered something

16:04

they just deal with. Or sometimes people will try and go the approach of like stretching that out, pulling it out through into, you know, maybe something like a decorator, like you said. And that ends up becoming this like 500 line decorator, which has, you know, 10 levels of nesting and if statements and things like that. Yeah. Decorators are already hard to put your mind around. Although that's what I do on my stuff.

16:25

I'll have, like, a permissions decorator. At least the thing that I like about that, even though it has some disadvantages, the thing that I like is I can go to the functions and I don't have to read the function to know, is this thing being dealt with? Like, does it have the decorator? Then the function is okay. You know what I mean?
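A permissions decorator of the kind being discussed might look roughly like the sketch below. It is not the host's actual code; the `User` class, the `requires` name, and the permission strings are hypothetical, and a real web app would pull the user from the request or session rather than take it as the first argument.

```python
import functools
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class User:
    name: str
    permissions: Set[str] = field(default_factory=set)


def requires(permission: str) -> Callable:
    """Reject the call unless the user (first argument) holds `permission`."""

    def decorator(func: Callable) -> Callable:
        @functools.wraps(func)
        def wrapper(user: User, *args, **kwargs):
            if permission not in user.permissions:
                raise PermissionError(f"{user.name} lacks {permission!r}")
            return func(user, *args, **kwargs)

        return wrapper

    return decorator


@requires("invoices:create")
def create_invoice(user: User, amount: int) -> dict:
    # The body contains no authorization logic; the decorator line is the check.
    return {"created_by": user.name, "amount": amount}
```

The appeal is exactly what is described above: a reader can confirm a function is protected by looking at one line. The weakness is that anything dynamic, such as "manager of this particular employee", tends to pile up inside the decorator.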

16:41

Yep. Yep, exactly. And there's, so I mean that, you know, that is like one set of things and it's not like nothing exists to do these things we're talking about, like depending on the application, right? So a lot of the kind of stuff we've been describing doesn't have to happen in the code. It might be stored in, like, an identity management system or something like, you know,

17:00

Active Directory. Typically, this is a place where you can store all the information about users and you can add them to groups and assign them permissions to different things.

17:08

That is sort of like the manual admin approach, which probably a lot of people might be familiar with as a way to maybe like manage permissions inside an organization, but it's not really suitable for the kinds of things we've been talking about, like a B2B application or a consumer application where you don't want to have someone manually going into like an Active Directory thing and like assigning people roles and permissions. Right.

17:29

So that's kind of like one set. And then I think similarly, some of the Python frameworks out there, for example, have built-in things for similar patterns. There's Django has things like Django admin, which again is sort of baked in UI and system to manage like users and permissions,

17:43

but it's kind of more of like the AD flavor. It's like a UI where someone might manually go in and configure, you know, all right, Sam is in this group and this group can do these things or Sam can do these permissions, not for the sort of like, how am I going to provide a consistent interface to all of my end users? Like how do I do that sort of dynamic automatic configuration?

18:02

How do you feel about multi-tenant apps? So I've got a cloud service and maybe I've got one company and within that company, they have certain roles, but then other customers come along, they buy a setup for their system. You know, think like Slack or GitHub with organizations or something like that. So you just gave two very interesting examples. I think GitHub is a pretty good example of doing multi-tenancy and in a reasonable way in that you have your single user account and you can belong to

18:32

many organizations. You can have different roles inside the organizations. You can even have roles inside repositories, although it's not that obvious, but you can be a, you know, owner or a collaborator of a repository. Right. And it sort of kind of like handles all of those in the sort of reasonably consistent way that if you don't dig too deep, it kind of makes sense. But you can imagine, on the backend, to support that, what they're having to do is have a reasonably

18:55

complex data relationship model between users, organizations, repositories. If you go deeper, right, there's like teams, sub-teams, infinitely recursive sub-teams, things like that. Yeah. Yeah. Slack, I feel like they did the, at least initially, they kind of forced you to create an account for every workspace. So presumably on the backend, this looks like somewhat different where they didn't try and make it so that like you could map a user to multiple organizations with roles.
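The two shapes being contrasted can be sketched as plain data models. This is a loose illustration only; it makes no claim about how GitHub or Slack are actually implemented, and every class, field, and role name below is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class SharedAccount:
    """One account that maps to many organizations, each with its own role."""

    username: str
    org_roles: Dict[str, str] = field(default_factory=dict)  # org -> role
    repo_roles: Dict[Tuple[str, str], str] = field(default_factory=dict)  # (org, repo) -> role


@dataclass(frozen=True)
class WorkspaceAccount:
    """One account per workspace: the user exists only inside that workspace."""

    workspace: str
    username: str
    role: str


def role_in_org(account: SharedAccount, org: str) -> Optional[str]:
    """Look up the role a shared account holds in one organization, if any."""
    return account.org_roles.get(org)


def role_in_repo(account: SharedAccount, org: str, repo: str) -> Optional[str]:
    """Look up a repository-level role, which is tracked separately from org roles."""
    return account.repo_roles.get((org, repo))
```

With the first model, "which organizations am I in?" is a single lookup on one account. With the second, the same person is several unrelated records, which matches the login friction described next.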

19:20

It's like you have a user inside an organization and they have a role. Yeah. And this, I think this is probably, I can imagine this might be something based on how they originally did authorization. It sort of might've even painted them into a corner where they are now sort of stuck with a model that's hard to get away from. Yeah. Side note or sidebar: the Slack authentication model drives me crazy, in that I can't just log in and see what groups I'm in. I'm with you.

19:43

I got to remember the pre, the pre, the, the, the. It's brutal. First part of the domain that belongs to it. And then the password may or may not be the same. And it's like, what am I, why am I doing this? It's so brutal, man. Yeah. I'm not saying it's because the authorization approach was wrong, but maybe it was. Well, but I mean, it just, it shows that you need to be careful about this, like how you think about this. And yeah, the fact that we are talking about it at all means that it's

20:10

like an issue that users and people experience. In an ideal world, you shouldn't even think about it. You're just like, well, oh yeah, I guess I didn't really think about it, but yeah, it is restricting me to what I should be doing and it just works. Right. Yep. Yep. Yeah. And you bring up an interesting point. It's why people say architecture is like wet cement. That's right. For a while. Yeah. Yeah, absolutely.

20:34

Yeah. I was just going to say the idea about it being exposed to users as well. This is a really interesting one. It's, I think a lot of the struggles or frustrations people have with security, but it's done similar things as well, where you end up, you're trying to use a website or something and have these like crazy complicated permission role systems

20:50

where you're trying to like decide what you can do within the app. And when you talk to teams like that, you realize it's because they've like implemented an authorization system and they're basically just like exposing the internals of that to the end users. So you kind of almost need to understand how the app works in order to decide what you can do inside it. And that is just kind of crazy. Yeah, for sure. So you all saw a problem out there and Sam, you built this policy system,

21:14

this authorization system. And I guess one of the things we want to be clear about, and I see this as an advantage, is it is specifically not about logging in users, managing their passwords, or third-party auth stuff. It's about once you know who someone is, regardless, like you can have username and password, you can have Google login, whatever you feel like, once you've got that, now, what can they do? Right. That's exactly right. And in addition to that, there can be multiple different

21:41

ways that people authenticate or have identities, right? They might log in to the web app, they might have an API key. And authorization might depend on that. That might be a main input to these decisions. Yeah, interesting. Like so for our mobile apps, for the training courses, we have you log in with username and password, but then it actually exchanges an API key, basically, the login gets your API key. And

22:01

then from there on, it's exchanged with all the calls. So yeah, I hadn't even really thought about the API side of things as well. But that makes a lot of sense that you'd want to separate those. Yep. Yeah, it doesn't matter how you log in, if it's API key, or you do with the username passwords, like, all right, well, now we're going to figure out what they can do. So I guess tell me what problems

22:19

it was that you saw, you're like, well, we got to do this differently. And then tell us about Oso, and like how we can use it in Python and so on. Yeah, absolutely. So the, I think the biggest problem that we saw out there is that pretty much every single engineering team we've seen out there that we've spoken to has repeated this work themselves from scratch. In all the ways we just discussed, right, whether it's through their code,

22:43

as a decorator, whatever it is, like everybody's repeating the same work. And nobody's going to get it like perfect on the first attempt. And so they end up, you know, having to iterate and refactor over time and add things. So they don't quite get it right. Well, and another thing, the way I think about this stuff is this adds no value to your application in a like sort of unique feature aspect. It's like one of those things that it has to exist.

23:05

It's table stakes for being in the game, but it's not like somebody's like, I love that app because the auth is so, or the authorization is so... Like, it's only a drag, like molasses, if you have to do it yourself and you get it wrong, but it's not a bonus. So that's the reason I say it's not something you want to, like, try to innovate in or whatever, right? You just want it to work really well and get out of your way. Absolutely.

23:28

Exactly. It's one of those, if you do it right, they won't even know it's there kind of thing. Yeah, exactly. So yeah, so that's basically what we're out there to solve is to make that experience for everyone involved, like that much better for the developers who are building this and spending time on it and hopefully not getting it wrong to the end users who are like dealing with these crazy two by two matrices of permissions to navigate. So basically the way we solve this

23:50

is through an open source policy engine called Oso. And there's kind of two main pieces of this. There is the policy language called Polar. This is what you write your authorization logic in. I can speak about that a bit more in a sec. Yeah. But that's sort of piece one. And then piece two is the library, which is the policy engine itself. It reads in those policy files and basically has a very, very simple API, a single method effectively, to make an authorization decision.

24:14

Right. And, like, I was looking through some of your docs and, say, for Flask, you have some built-in integration. Yeah, that's right. Right. You create a Flask Oso thing and you just say initialize the app, you give it the Flask app and then you tell it what routes to authorize basically. Exactly. Yeah. So the library itself is actually available for multiple languages. So currently we support Python, Ruby, Node.js, Java and Rust. So that's kind of like the core

24:39

of this. And then we build additional framework integrations. So you're exactly right. You know, we have one for Flask and one for Django. And for each of those, we try and provide sort of framework idiomatic approaches to authorization. So, you know, I think, you know, Flask is pretty keen on things like decorators. With Django, it's more about middleware and automatically registering and making available like the underlying data models.

25:00

How hard would it be to add it to a new shiny framework that didn't make your list? Like, I'm super excited about FastAPI right now. Yeah. Yeah. Right. Looking really nice, but probably you don't have integration with that yet. It seems like it probably wouldn't be that hard to replicate what you've done with Flask or something along those lines. Each framework integration is effectively equivalent to how hard it is to add Oso to

25:22

your application in general. And we're talking about pip install oso, then creating a new Oso object, loading a policy file, and you're good to go. And so, you know, in the case of Django, for example, we'll register automatically like the Django models because the policy file can actually access objects and classes from your application. So, you know, there's a little bit of work there where we just like automatically register those for you. Right. Yeah. It's very Django-like to do that.
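The steps just listed (install, create an Oso object, load a policy, ask for a decision) might look roughly like this. Treat it as a hedged sketch: it assumes the open source `oso` Python package from around the time of this episode, where `Oso()`, `register_class`, `load_str`, and `is_allowed` are the calls used, and newer releases may differ. The `User` and `Expense` classes and the policy rule are hypothetical examples, not from the episode. The import is deferred so the module still loads if `oso` is not installed.

```python
from dataclasses import dataclass

# A one-rule Polar policy: a user may read an expense they submitted.
# (Illustrative rule; the class and field names are invented.)
POLICY = """
allow(user: User, "read", expense: Expense) if
    expense.submitted_by = user.name;
"""


@dataclass
class User:
    name: str


@dataclass
class Expense:
    submitted_by: str
    amount: int


def build_oso():
    """Create the policy engine and load the policy above."""
    from oso import Oso  # assumption: installed via `pip install oso`

    oso = Oso()
    # Register application classes so the policy can refer to them by name.
    oso.register_class(User)
    oso.register_class(Expense)
    oso.load_str(POLICY)
    return oso


def can_read(oso, user: User, expense: Expense) -> bool:
    """The 'single method' mentioned above: actor, action, resource in, bool out."""
    return oso.is_allowed(user, "read", expense)
```

In an application the policy would more likely live in its own `.polar` file loaded at startup, with `can_read` called from a view or a decorator.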

25:47

Exactly. Nice. Okay. So I talked about this complicated story of I want to be the manager, but there's only some people where I am the manager of, and I could be managed myself and so on. There's other people who I have no relationship with other than co-employee and so on. One of the options was we could write code in the application to do that. Yep. It sounds like this Polar Language policy file is where that would go here. Is that right?

26:15

Yeah, that's right. So the Polar Language is a declarative language. It actually takes inspiration from a logic programming language called Prolog. It's sort of this decades-old, pretty well-established logic programming language. But Prolog itself typically is, it's kind of been known to have a pretty high, like, barriers to entry, kind of hard to learn. Yeah. So we sort of, you know, that's what we started with, though, but then we basically pushed

26:35

ourselves to make it as easy to use as something like Python. And so, like, along the way, you know, we've added stuff that you would expect to find in Python. You can look up attributes on your Python objects. You can call methods. You can use variable and keyword arguments. The logic is written in letters as opposed to, you know, some arcane glyphs, things like that. Yeah.

26:53

So that's the language. So that's Polar. And then so, but basically, you know, by building on something like Prolog, it actually makes writing, expressing logic, like you just, like you said, around representing complex hierarchies or things like that, it's actually a very, very powerful way to represent those. You can write a recursive rule like that, which says you're a manager of an employee if you are the employee's manager or you're the employee's manager's manager or something

27:14

like that. You can write like a little recursive rule in a couple of lines. You can now use that like throughout your policy. So you could write another rule, which would say managers can read, or like a user can read some employee, some personnel data if they're the target employee's manager. Right. And, you know, those two rules now combine together. You have, you've just written those

27:33

like kind of in your flat policy. And the underlying engine is basically the one that kind of navigates through those, searches through, and tries to find if that's true or not. Cool. And I'm looking through the syntax here. It's clearly not Python, but it's not that far from Python. You know what I mean? It's like a Python person could jump in here and go, okay, well, I'm creating an object. Yeah. I got to put the new keyword, but new person and such and such. That seems pretty straightforward.
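For readers who think in Python, the two rules described above (a recursive "manages" rule and a "can read personnel data" rule that builds on it) can be mirrored in a few lines. This is a plain-Python analogue, not Polar and not Oso's engine; the `REPORTS_TO` data and function names are hypothetical, and the recursion assumes the reporting chain has no cycles.

```python
from typing import Dict, Optional

# employee -> direct manager (None at the top of the chain). Invented sample data.
REPORTS_TO: Dict[str, Optional[str]] = {
    "alice": "bob",
    "bob": "carol",
    "carol": None,
}


def manages(manager: str, employee: str, reports_to: Dict[str, Optional[str]]) -> bool:
    """True if `manager` is the employee's manager, or their manager's manager, and so on."""
    direct = reports_to.get(employee)
    if direct is None:
        return False  # reached the top without finding `manager`
    if direct == manager:
        return True
    return manages(manager, direct, reports_to)  # walk one level up the chain


def can_read_personnel_data(user: str, target: str, reports_to: Dict[str, Optional[str]]) -> bool:
    """Second rule, composed from the first: managers can read their reports' data."""
    return manages(user, target, reports_to)
```

In a declarative policy the engine does this search for you; here the traversal is spelled out by hand, which is the kind of code that otherwise ends up scattered through an application.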

27:56

Yeah. Yeah. Yeah. The rules look like methods. They have type specializers, which look exactly like Python type hints do, but they're actually enforced at runtime. Okay. Things like that. So how does data get from my application over into one of these policy execution instances, like calling and running the policy? Basically I've got like a user in my, I don't know, request session or something like that.

28:22

Yeah. So they just get, they get passed straight in, right? So with the Oso library in the Python app, you just pass it regular objects from Python. You pass in the request user, you pass in the thing they're trying to access. And basically there is a sort of an interface between the Python-specific library and the internal policy engine that lets Python deal with the objects, while the engine can do the sort of policy evaluation

28:45

over those objects. And it doesn't need to know what they are. It's just like, if you say user dot username, once the policy evaluator gets there, it'll be like, Hey, Python, what's the, what's the user's username? And it just says, Oh, it's this, it's Sam. And it continues on. It's like, cool. Now I know it's the string. Yeah. Nice. Super cool. So one of the things that's interesting about Python and some of the other languages you mentioned with the

29:07

integration, right? Like Ruby, for example, is that we have a REPL, a read-eval-print loop, right? If you just type python, hopefully that runs Python 3; if not, type python3. And you just get your triple greater-than prompt. And you can start typing in Python commands and going from there. And I do feel sometimes when people are learning Python, they lean too heavily on that. Like they don't just go create a file because it's like a pain to make any corrections

29:30

and stuff, but it is really nice as an exploratory mode. And what surprised me when we spoke first about this is you guys have a REPL for this Polar policy thing, right? We do. Exactly, we have a REPL, and we have a debugger as well. And a debugger. That's right. Because like, okay, as you said, right, it's clearly not Python, but it kind of looks like Python, but the underlying model is different to what we're expecting.

29:55

It's not imperative. It's logic based, it's declarative. And so we appreciate there is a degree of like having to learn how the language works. And so for us though, if you're going to build a language, you want the power of a language. You want to have a REPL that you can dive into and like test things out and check something as simple as syntax or just to sense check that you got the expected result back. And so we have a REPL, you can load in your policy files and it allows you to sort of

30:17

interactively query them. So, you know, you can just like dive in and make sure things are working as you'd expect. Yeah. I suspect that that would be hugely valuable. I mean, I haven't actually tried it, but you know, I think of things like YAML configuration files and stuff and you're just like, it's not working. Why does it not work? You know, it's like, it's those times you just want to yell at your computer. You're like, why don't you work? It looks right. You know, like, ah, indentation.

30:43

Yeah. Or something, right? Oh man. It's like, oh, it's coming back as a single key, not a list in the way you put it in YAML or whatever it is, right? Like having the ability to like step through it is really interesting. Yeah. The debugger is great fun for exactly that. You just drop into the debugger and just kind of like hammer next and just like watch it doing its thing. Yeah. Do you have, or have you dreamed of any like IDE integrations like VS Code or PyCharm

31:08

or more broadly IntelliJ? Yeah. So I think dream is partially the right, the right word for it. You know, we currently do have syntax highlighting available for VS Code. I think there's actually a Vim config out there as well. There is so much stuff we want to do with like the, you know, language server protocol and hooking up the debugger to IDEs like VS Code so that you get that experience like in your, in your IDE. Yeah. There's a ton of other stuff we want to do.

31:31

Cool. All right. Well, I've been beating my example of manager, employee to death. Give us some more use cases where you might see people using this kind of stuff. Yeah. So I think one of the, one of the kind of initial users we have for this is, is kind of your prototypical authorization use case. They are building electronic health record software deployed currently in hospitals and the stuff they come up with like every day just astonishes me.

31:59

Some of the authorization, but you know, there you can imagine the really classical stuff, which is like, a doctor can see a patient's records if they saw them in the last seven days, or if they have an upcoming visit, like that's the kind of level of granularity, like a hospital might want to go down to. Yeah. Yeah. That sounds like a perfect use case. You know, just here's a general heuristic for deciding whether some company or organization

32:21

might have a good use for this. If they use SharePoint, what do you think about that? That's a pretty good one. I feel like SharePoint is like, it's just like all these weird permissions and it's all about like, well, we can't really, I don't know. Just, I always feel like, all right, this is like messed up enough that you really need some help here.
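
The hospital rule mentioned a moment ago (a doctor can see a patient's records if they saw the patient in the last seven days or have an upcoming visit) can be sketched as a small predicate. This is plain Python under assumed data shapes, not how the health-record vendor or Oso actually models it; `Visit` and `can_view_records` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable


@dataclass
class Visit:
    doctor: str
    patient: str
    day: date


def can_view_records(doctor: str, patient: str,
                     visits: Iterable[Visit], today: date) -> bool:
    """Allow access for a visit in the last seven days or any upcoming visit."""
    window_start = today - timedelta(days=7)
    for visit in visits:
        if visit.doctor != doctor or visit.patient != patient:
            continue
        if window_start <= visit.day <= today:
            return True  # seen within the last seven days
        if visit.day > today:
            return True  # upcoming visit
    return False


# Hypothetical data for illustration only
today = date(2020, 10, 23)
visits = [
    Visit("dr_a", "pat_1", date(2020, 10, 20)),  # three days ago
    Visit("dr_b", "pat_1", date(2020, 9, 1)),    # too long ago
    Visit("dr_c", "pat_1", date(2020, 11, 5)),   # upcoming
]
```

Passing `today` in explicitly, rather than calling `date.today()` inside the function, keeps the rule easy to test.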

32:38

So I thought you're going to go maybe a different direction with that, which is if you would be sort of embarrassed or out of business, if some of your data was exposed, then you probably need this. Also. Yeah. Yeah, for sure. For sure. All right. So give us some more examples. So we, I mean, you spoke about like the social media one at the beginning, which is

32:55

kind of an interesting one because there's a, there's a few sides of that. And actually we recently did a bunch of like blog posts on a little social media feed app that we're putting together. But even some of the simple ones that, you know, that you mentioned can be reasonably complex.

33:06

Users can see posts they posted or they can see their friends' posts. They can see them, you know, if maybe if they were like tagged in a post, things like that, they can be pretty involved in like having to really look at like the post and where it was put and like who was referenced in it, things like that. There's kind of the two sides of that as well, you know, there's like the users of that. And then internally how a company like Twitter manages how employees can access things.
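
The user-facing side of the social media example (you can see your own posts, your friends' posts, and posts you were tagged in) might look roughly like this in plain Python. The `Post` class, the `friends` mapping, and `can_see` are assumptions made for this sketch, not taken from the Oso blog posts.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Set


@dataclass
class Post:
    author: str
    tagged: FrozenSet[str] = frozenset()  # usernames referenced in the post


def can_see(user: str, post: Post, friends: Dict[str, Set[str]]) -> bool:
    """A user can see a post they wrote, a friend wrote, or were tagged in."""
    if post.author == user:
        return True
    if post.author in friends.get(user, set()):
        return True
    return user in post.tagged


# Hypothetical friendship graph and posts
friends = {"sam": {"graham"}, "graham": {"sam"}}
own_post = Post(author="sam")
friend_post = Post(author="graham")
stranger_post = Post(author="michael")
tagged_post = Post(author="michael", tagged=frozenset({"sam"}))
```

Each `if` corresponds to one alternative in the spoken rule, which is why even a "simple" policy quickly needs to inspect both the post and the relationships around it.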

33:28

And obviously this was a pretty hot topic not too long ago. It was super hot. Yeah. There was a, gosh, I'm blanking on the details. Maybe you remember and can tell people, but yeah, there was, it had to do with celebrities, right? Right. It was, I think it was effectively the internal employees of Twitter were able to do way more than they should be, such as post on behalf of anybody. Exactly.

33:49

Yeah. Which no one should ever really need to have, but I imagine it's a pretty convenient way to build it. Well, you want to test how it looks. You see that a lot and you see a lot for legitimate reasons, right? Like your customer is having a problem. They're like, Hey, I'm unable to, maybe it's not tweet, you know, but I'm, you know, I'm unable to do this in my account. You want your support to be able to like step in and help them

34:06

out and be like, Oh yeah, like I can see your permissions aren't quite right. Let me try this. So like, you know, there are very legitimate reasons that you'd have that much power, but it just like overlays this like entire extra dimension. It's like your user permissions and then like your customer support reps behaving as if they're a user, but different, and so on.

34:23

And it's one thing if that's like internal data and okay, so they probably shouldn't be able to do this, but if you log in as them, they can like in the app, you could maybe do a little more within it, but it's another to have that on production in a live broadcast of the world. Like I can make this random politician or celebrity say this, do this thing. And to a varying degree, it'll be

34:46

believed, right? Yep. Yep. And then, yeah, so that's, so those are good ones. And you know, then beyond that, all of the kind of typical cases you can imagine, very common inside like an HR thing because of like the manager employee relationship you mentioned, right? Very organizationally driven access control. So in HR or payroll, things like that, you'll see this a lot.

35:04

Similarly in like banking and finance, like any of those cases where you imagine that the data is sensitive and you have some concept of groups or hierarchies, organizations is where this like comes up a lot. Yeah. Yeah. Cool. So you said the language is available or the API is available for a lot of different languages. And one of the things that's interesting here is you all decided to build it with Rust. That's right. Which is a pretty hot, neat language. And at some point though,

35:32

Rust has to talk to Python. So I've seen a few examples of people creating the traditional example is I'm going to create some lower level thing for Python. So I'm going to use C. Maybe I'll go crazy and use C++, but expose it as C. But I think I'd rather write Rust. How did you, how did you pull that

35:51

off? How did you do that integration? Yeah. So the ecosystem for sort of embedding Rust in Python or actually, or even vice versa, you know, embedding Python in Rust or calling into Python from Rust. There's actually, there's a few tools out there which solve that kind of specific problem, that have interfaces specifically for exposing a Rust struct as a Python class, things like that. We didn't really have that option available to us because we wanted to support multiple languages.

36:12

We sort of needed some sort of an API that was simple enough that it wouldn't matter the language or the runtime. You can bind the Python and Rust tightly together as an option. And probably if your goal is only to write it as a base for some Python thing, it might make sense. But exactly. That wasn't your goal, right? No, exactly. So instead we sort of, okay, one of the engineers on the team had this pretty great vision of how this would look, which is sort of like a kind of an event driven API.

36:39

So like all of the Rust code is driven through like a very, very simple API from like the host language side. So from like Python or Ruby, it's like a simple API, just kind of like do the next thing. Like the internal polar evaluation is done through virtual machines. It's like virtual machine, go do

36:54

your next instruction, go do your next instruction. And it kind of returns a JSON blob of data back to say, like I said earlier, Hey, what's the username field on object number one, which maybe is the user

37:05

type samples. So there's kind of this, this really nice like conversation between the Python and Rust where Python is like, Hey, do more work, do more work, do more work until Rust comes back and says like, I need more information, which means that like, there is no, when, when the policy is not running, there's no like background thread. There's no, there's nothing like running there. It's Python

37:23

is free to kind of pause that virtual machine for as long as it needs. In the case of things like Node, we haven't done this for Python yet, but in the case of Node, if you have asynchronous code, that all just works. It's like super nice. Yeah. Oh yeah. That's cool. That would be nice in Python. Also my mention of FastAPI earlier. Yeah. Yeah. That one's all about the async and await. So it's not quite yet a big deal,

37:44

but it's like some of the new frameworks are going down that path. At the same time, you could still use it. It just won't benefit from the async and await, I would suspect. Right. Exactly. So where this would matter, and this, and I mean, this would be essentially a pretty important thing to do for something like FastAPI. If, you know, as I said, the, because the policy language can call into the application to fetch data, that call might itself be resolving like a

38:05

database query. Yeah. And so maybe if you're running multiple threads, you're trying to serve multiple requests, like you're going to want that async so that the policy is asynchronously getting that data back. Yeah, for sure. Every time you can await some other external resource, you're just better off for doing it. Yeah. Exactly. Okay. Very cool. Well, it sounds like a neat integration and I guess it's a challenge I didn't

38:26

really expect. Like I figured you would have to integrate it with Rust, but I didn't expect that it needed to be sort of bi-directional communication and work across the different languages. That's a pretty good accomplishment. Yeah. Yeah. It was, I mean, it was a lot of fun to build. I have to say though, the hardest part of it all was the, the packaging and CI, funnily enough, because now we have this Rust library we're trying to add to

38:48

a, add to the Python package. We, I can't remember what number we're up to at this point, but we have maybe like 300 CI jobs for every operating system, Python version to test like every combination. So you want to do like build a wheel for like macOS on Python 3.8, macOS on Python 3.9 and so on. Yeah.
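
To see why the job count grows so quickly, here is a minimal sketch of how a build matrix multiplies out. The operating systems and Python versions listed are assumptions chosen for illustration; the transcript does not state which combinations the project actually builds or tests.

```python
from itertools import product

# Hypothetical axes of the build matrix (not the project's real configuration)
operating_systems = ["linux", "macos", "windows"]
python_versions = ["3.6", "3.7", "3.8", "3.9"]

# One wheel build (and one test job) per combination of OS and Python version
matrix = [
    {"os": os_name, "python": version}
    for os_name, version in product(operating_systems, python_versions)
]

job_count = len(matrix)  # len(operating_systems) * len(python_versions)
```

Every extra axis (CPU architecture, host language, build versus test stage) multiplies the total again, which is how a project can end up with hundreds of jobs.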

39:09

Exactly. So you have the, there's like the manylinux, I think it's a format or a spec or something, that we use, kind of the manylinux Python approach, so that we can build all these, pre-build all these wheels, including the Rust code. But once you get them out, you want to make sure that those 50 different wheels you built all still work. Yeah, exactly. Exactly. Is this interaction, this Rust Python interaction layer, part of the open source stuff that you have out there?

39:34

Absolutely. Everything is open source. Yeah. Yeah. Actually, we'll probably see a blog post coming fairly soon on how we built that. Cause I think it is, it's like a really nice, simple approach to doing this kind of thing. And I'm personally a huge, huge Rust fan. That's kind of a big reason why we're using it. And so I'd love to see people taking this approach to building this like cross platform, cross language, you know, Rust cores. I think that'd be awesome.
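
The event-driven exchange described earlier (the host keeps asking the virtual machine for its next event, and the core replies with a JSON blob whenever it needs a field from a host object) can be imitated in pure Python. This is only a toy model under stated assumptions: `ToyPolicyVM`, its event names, and the hard-coded rule are invented here and are not the real Oso core or its API.

```python
import json
from dataclasses import dataclass


@dataclass
class User:
    username: str


class ToyPolicyVM:
    """Stand-in for the policy core; evaluates the rule: actor.username == "sam"."""

    def __init__(self) -> None:
        self._state = "start"
        self._answer = None

    def next_event(self) -> str:
        """Advance one step and return a JSON-encoded event for the host."""
        if self._state == "start":
            self._state = "waiting"
            # The core knows only an opaque instance id, never the Python object.
            return json.dumps({"kind": "lookup", "instance_id": 1, "field": "username"})
        if self._state == "waiting":
            raise RuntimeError("host has not answered the lookup yet")
        return json.dumps({"kind": "done", "allowed": self._answer == "sam"})

    def answer(self, value) -> None:
        """Host hands back the value the core asked for."""
        self._answer = value
        self._state = "answered"


def is_allowed(actor: User) -> bool:
    """Host-side driver loop: 'do more work' until the core reports a result."""
    instances = {1: actor}  # host keeps the real objects, keyed by instance id
    vm = ToyPolicyVM()
    while True:
        event = json.loads(vm.next_event())
        if event["kind"] == "lookup":
            obj = instances[event["instance_id"]]
            vm.answer(getattr(obj, event["field"]))
        elif event["kind"] == "done":
            return event["allowed"]
```

Because the core only runs when the host calls `next_event`, nothing executes in the background between calls, which matches the point made about the host being free to pause the machine.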

39:55

Yeah. Neat. All right, Graham, let me throw it back to you with a business question. Sure. Yeah. So we've been talking about how cool it is that this is open source. And yet we started off a conversation saying you guys started the business two years ago. I'm really fascinated and admire companies that are able to make legitimate, meaningful open source things, and then use some interesting extra, the thing that you get more if you support them or if you buy some product or

40:23

service from them. And it sounds like that that is the kind of thing you all are building as well, right? Because the library and the debugger and the REPL and all that stuff is open source on GitHub, people can fork it today. And that's that, right? Absolutely. Okay. So what's the story? What is your specific plan here?

40:38

Yeah. So in the near term, we're focused on open source. And the reason for that is we believe that the right way to build this company and the right way to build this community is to put enough weight behind the body of people who are actually writing code in Polar and giving them everything that they need to be successful. And so that's like the focus for us for the next year, two years plus. When we think about, obviously, we're a company and we have every intention of being around for the

41:11

long term. And so the way that we need to do that is to create a sustainable business. So the way that we think about doing that is by offering a path for teams that want to run and secure Oso in production and giving them things that make that really viable and easy. So I'll give you some examples. You know, right now, Oso is packaged as a library. Imagine a scenario where you want to run a bunch of

41:33

Oso libraries in a microservices context. And we've had folks already ask us for this today. So now you've got a bunch of libraries with different policies running across a bunch of different services. And you want a way to ensure that those always have the most up-to-date policy, the most up-to-date version of the library. And you're doing that and they're all properly versioned and so on and

41:54

so forth. So an Oso service that would handle something like that is one way you could imagine monetizing. Right. Because if you're one of these complicated SharePoint sort of organizations, there's stuff everywhere, like everywhere. And it's so easy for like one app to get its policy out of sync with the other. And how do you know you've got them all? Like, it just sounds like a nightmare.

42:13

Yeah. Or security teams equally have asked us, I mean, Oso being a library on the critical path of every request puts it in a unique position to be auditing requests, which is something that you talked about back at the beginning. And this is something that a lot of security teams, surprisingly, really struggle with. It's not an easy problem to solve. And but it's something that

42:36

this particular piece of software is in a unique position to do. And so you could easily imagine Oso providing auditing capabilities to security teams in the future, showing them who was authorized to do what at what point. And because we're making the authorization decision in Oso, we can actually tell them why they were authorized. Oh, because they were in this role, because they sit in this department, and they report up to this person, stuff like that.
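
One way to picture the auditing idea is an authorization function that records every decision along with the reason it was reached. The following is a hypothetical sketch in plain Python; the `Decision` record, the two example rules, and the in-memory `audit_log` are assumptions for illustration and do not describe an existing Oso feature.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Decision:
    actor: str
    action: str
    resource: str
    allowed: bool
    reason: Optional[str]  # which rule granted access, or None if denied


audit_log: List[Decision] = []


def authorize(actor: str, action: str, resource: str,
              roles: Dict[str, Set[str]]) -> Decision:
    """Decide, then append the decision and its reason to the audit log."""
    reason = None
    if "admin" in roles.get(actor, set()):
        reason = "actor has role 'admin'"
    elif resource.startswith(actor + "/"):
        reason = "actor owns the resource"
    decision = Decision(actor, action, resource, reason is not None, reason)
    audit_log.append(decision)
    return decision


# Hypothetical role assignments
roles = {"graham": {"admin"}}
```

Since the same code path makes the decision and writes the record, the log can answer both "who was allowed to do what" and "why".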

42:58

Yeah, that's super interesting. Because you're right, you are already in the middle of all those exchanges. So it's easy for you to add that visibility. Yeah. So I mean, for us, as I said, like the philosophy is relatively clear, we want to give developers the tools that they need to be successful with Oso, period, and that technology will always be

43:17

open source. The way that we think about the technology that we'll use to sustain the commercial side of the business will be the sort of organizational pieces that larger businesses rely on in order to be secure, be compliant, run large operational teams and applications in production.

43:36

Yeah, cool. Well, I think that's a it sounds like a pretty solid idea, right? You've got this legitimate open source thing that's meaningful and useful, and grow that and get the companies that got the deep pockets who are often unlikely or unwilling or incapable of contributing back to open source, give them a thing that they'll pay for that it will indirectly basically give back to open source.

44:00

Absolutely. I mean, as an example, before this, I worked at an open source company called MongoDB for many, many years, and over that period of time, we invested several hundred million dollars worth of R&D into the database product, which came straight from the companies that were providing revenue to the business through the paid products. So it's a very clear tie between, you know, the companies paying money and the company building the product itself.

44:28

Yeah, I was going to ask you about MongoDB as well, like what inspiration you got there. Because I think one of the things that we're starting, at least me and the folks who I've spoken to are starting to realize is that it doesn't matter how much money a company has, they won't donate. The idea of a donation is like, I don't know where that goes into the accounting spreadsheet. It doesn't make sense. I can't tell my shareholders that we donate a million dollars to Django. Because I don't know,

44:56

it just doesn't make they just can't put that into their structure, right. But we pay for service level agreements, we pay for additional services, we pay for better support, like that fits into their accounting software. And I think that's the story that's going to work. And so if you can offer them something more, they're very likely to pay for it to get that like you are.

45:18

Yeah, absolutely. And I think for us, again, like whatever, there's books upon books upon books written about this topic, not in the context of open source software, but like in the context of like philosophy over like, in the like, you know, 16th, 17th, 18th century, you know, people writing about the tragedy of the commons, this is not a new topic, like in the world. Yeah.

45:36

But in the context of like, open source, our philosophy is we need to give something that's good enough for someone to be able to use on their own, where they wouldn't feel like they're going to be held hostage if they're not going to pay money. That's just not a sensible thing for anyone to do. We try to put ourselves in the shoes of our users, we would never adopt a product where we felt we'd be

45:54

at risk of being held hostage. But yeah, but instead, give them an opportunity where, hey, here's something that you can, that you can take advantage of, and that you can get value from. And then if you find yourself in this other scenario where you think you want to, you want to get something like auditing, you want to get additional visibility, you want this way to run something at scale, then we're going to be there for you. And we're going to provide a commercial product for you in

46:16

that situation. Yeah, that's fantastic. And I'm a huge fan of MongoDB. I told you this earlier, before we hit record, but all of our stuff runs on Mongo, and it has for five, six years. It's been beautiful. I actually just looked at the Stack Overflow developer survey from 2020. And under the most wanted database, MongoDB is out by like 5% above Postgres. And then it's like,

46:39

those are the two that are way out front. So pretty neat. What lessons did you take from that, your time at MongoDB that maybe you wouldn't have otherwise brought to this venture? By far, the number one thing that I learned there is focus on the developer. And I mean, if you look at the mission and vision of our company, it's we put security in the hands of the makers. That is

46:59

all we care about. We have a singular focus on developers. If you woke up anyone on the team and shook them at night and you ask them, who is the number one focus of this company, I guarantee you that anyone would say developers. And that has been clear from the beginning and will continue to be clear for us. And that was definitely the main thing I took away from my time at MongoDB. Yeah. Yeah. Super cool. All right. Awesome. Well, I think we're about out of time,

47:24

but what a neat project and I wish you guys good luck with it. Let me ask you the final two questions though, before you get out of here. Graham, I'll start with you first. If you're going to write some code, what editor do you use these days? So you do some, not a ton, but if you are, what are you using? It was definitely VS Code. All right. Right on. That's a popular one. Sam? Yeah. I think that's probably because of my love of VS Code and I've forced it on Graham.

47:48

Yes. Influenced by Sam for sure. He went through his computer, uninstalled all the stuff that wasn't VS Code. Well, you got to edit something. Here you go. TextEdit is gone. Sorry. Yeah. It's kind of stunned me how VS Code gradually seems to be taking over the world, but yeah. It's interesting. I find it surprises me a little bit because it came out of the whole Microsoft side of things. I thought that there would be a lot of communities that would go, just no. Yeah.

48:16

But it's somehow it's hit the right notes and people really love it. So yeah, it's definitely successful these days. And then a notable PyPI package, anything, Sam, you've come across maybe that like cool libraries like, oh man, you should really know about this. Or maybe we did our Rust integration with that or so on. So funnily enough. So one of the things I find always hilarious about Python is sometimes I'm not

48:39

even aware if I'm using a package or it's just built into the standard library. Yeah. So like, I'm a big fan of like all the typing stuff that Python has been like gradually adding in. I think partially that's because I went through my like Rust phase and now I'm back and running all the types. So definitely all the typing extensions.

48:55

I love the type stuff. I put it all like all the Python code that I write on the boundaries, like say not every bit of code, but where like some part of code is written and some other part is going to be sort of externally consuming it. Types go on that straight away. And yeah, I think it's another standard library stuff, but I love all the metaprogramming stuff you can do with Python. It's, you can really do some crazy stuff with it, but it's, it's kind of fun.
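
The habit described here, annotating code at the boundaries where other code will consume it, might look like the following small sketch. The function names and data shapes are hypothetical and chosen only to show an annotated public function next to an unannotated internal helper.

```python
from typing import Dict, List


def visible_post_ids(user_id: int, posts_by_author: Dict[int, List[int]]) -> List[int]:
    """Public boundary: annotated so callers and type checkers know the contract."""
    return sorted(_collect(user_id, posts_by_author))


def _collect(user_id, posts_by_author):
    # Internal helper: left unannotated, since not every bit of code needs types.
    return posts_by_author.get(user_id, [])
```

A static checker can then verify that external callers pass the right shapes, while the internals stay free to change.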

49:18

Yeah. Awesome. I'll throw two things out there for people that are like really related there. So mypy is a static type checker that will verify all the types you put in are consistent. And then there's mypyc, which will actually compile some of your Python to native code based on those types, as I understand it. I haven't done anything meaningful with it, but anyway, a bunch of fun stuff around the

49:40

types out there. Indeed. Yep. Yep. All right. Final call to action. People are interested in letting someone else handle their authorization in some library, maybe putting that into one of these Polar files. How do they get started? What do they do if they want to get started with your projects? You go to osohq.com, probably the fastest way. There's a big button on the front that'll take you to the quick start. I would check that out. Awesome. All right. Well, thank you both for being

50:03

on the show and for working on this project for the last couple of years. It looks really helpful. Thanks for having us. Awesome. It was great to be here. Thank you. Yep. You bet. Bye-bye guys. See you. Bye. This has been another episode of Talk Python To Me. Our guests in this episode were Graham Neary and Sam Scott. It's been brought to you by us over at Talk Python Training. Want to level up your Python? If you're just getting started, try my Python Jumpstart by Building 10 Apps

50:29

course. Or if you're looking for something more advanced, check out our new async course that digs into all the different types of async programming you can do in Python. And of course, if you're interested in more than one of these, be sure to check out our everything bundle. It's like a subscription that never expires. Be sure to subscribe to the show. Open your favorite podcatcher and search for

50:48

Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it. Now get out there and write some Python code. We'll see you next time.
