Python 3.8 is coming soon. It's scheduled for release at the end of October 2019, and you can already download test versions today. Given that Python ships on an 18-month cycle, it's time to talk about what's coming for us Python developers in the fall. On this episode, I meet up with Łukasz Langa and Anthony Shaw to chat about the highlights of this upcoming version of Python. Also, quick show note, we recorded this on location in Cleveland at PyCon 2019. There may be a small
amount of background noise, but I think you'll barely notice. This is Talk Python To Me, Episode 214, recorded May 5th, 2019. Welcome to Talk Python To Me, a weekly podcast on Python, the language, the libraries, the ecosystem, and the personalities. This is your host, Michael Kennedy. Follow me on Twitter, where I'm @mkennedy. Keep up with the show and listen to past episodes at talkpython.fm and follow the show on Twitter via @talkpython. This episode is brought to you by Microsoft.
Be sure to check out what they're offering during their segments. It really helps support the show. Łukasz, Anthony, welcome back to Talk Python, both of you guys. Hello. Hi, Michael. Good to be back. Yeah, it's great to have you back. We are not recording over Skype across far distances, but in fact, we're right here at PyCon. And who knows how the audio for this is going to come out, but hopefully it'll sound good
enough, but it's great to be here on site in Cleveland with both of you. It's a unique experience to actually see you doing the recording. Yeah, it's really good to be here in Cleveland. The sun has finally come out today after the third day of PyCon. The fog was pretty incredibly epic and that was pretty special. And yeah, Łukasz, the latency on this video call is incredible, man. It's so, it's lifelike. Yeah. So this video call is like 3D. It's amazing.
Yeah, it's amazing. So thank you both for being here. We're going to talk about Python 3.8 and maybe looking a little bit beyond that, some of the PEPs that are out there and all the cool stuff there. But, you know, let's just start with PyCon. Like, how's your experience this year? It felt a little different to me. I'll say why in a minute, but you go first. I'm Łukasz Langa. I've been co-chairing
the Language Summit this year with Mariatta, and this is an absolutely new experience for me. So that was interesting, actually herding all the cats at the right time to the right spots, you know, pre-selecting talks, actually making sure that everybody has the opportunity to speak, making sure that everybody is engaged. That was new. So I was very happy to have it over with after that day. I've heard it went pretty well. So I was happy about this.
The rest of PyCon was also rather intense so far for me. I've had a talk about Black and then an extended Q&A that I just performed during the poster session on Sunday morning. That's really awesome. I feel like Black is one of these things that has just taken off. I often ask people what module or package they recommend, or what is special that they've seen. And way more than any other single answer, it's Black. That's awesome. Congratulations. Yeah, thank you.
I'm very happy about it. In fact, I think there were at least five talks that mentioned, you know, using Black as a good thing in them. So it seems like, you know, we've solved an issue that people had. We've solved a problem that people had. Yeah, we've had these linters and they tell us what you're doing wrong. And we're just like, can't you just fix it? Like, I don't want to be told what's wrong. I just want it to be better.
Yeah, well, the difference between Black and other autoformatters, of which we have a few in the Python community, is that Black was kind of brave, obnoxious enough to tell you that it's going to just be done in one way. It's not really configurable. But that kind of changes it to a workflow tool where if you decide to use the tool, now the question of autoformatting sort of disappears. The question of how to format your code is no longer a problem when you're developing your own projects.
Yeah, and if you're on a team, you don't have to have this debate anymore of how we do stuff, right? You just run Black and that's how you do it. Absolutely. Awesome. Okay. Anthony, how's your PyCon going? Like, what do you notice this year that's special? Yeah, this is the first PyCon where I've actually given a talk. So that was definitely a milestone for me. Yeah, what was the talk? It was on code complexity. And I talked about Wily, which is a project I've been working on.
And then the principles of code complexity and why complexity is bad. So how awesome is cyclomatic complexity? I actually talk about how it's inevitable: the more users you have, the more cyclomatic complexity you have to have.
People ask me how the talk went, and my response so far has been, I don't remember, because I was so nervous that it was just a sort of adrenaline-fueled dream. So that was, yeah, really cool. And then yesterday there were mentored sprints. That's new, right? Which is a new thing this year, and that's probably the highlight, actually, of PyCon so far: just being in there and mentoring people to contribute to the different projects, and working with someone who ended up being able to
send a pull request through to CPython by the end of the day. And I think it was their first one, which is awesome. Yeah, that's super cool. That's got to help adoption and contributions, right? Because even me personally, I think about, okay, well, I'd love to do something, but I really don't even know how to get started, how to build it, what the right rules or expectations are. Like, it's a lot of work.
And if somebody can sit down who's done that or is at least knowledgeable about that and walks you through it, like the second time onward is much more likely to be smooth. Yeah, and there was a proper amount of time allocated as well. I think it was just over four hours on the second day in the afternoon, so it's enough time to actually sit and work on a proper issue or a proper feature and actually work through it from start to finish.
Yeah, you actually got a PR in place, not just started or whatever. Yeah, exactly. So yeah, that's awesome. To me, it felt like when I walked into the expo hall that the booths were a little bit bigger, there were a little bit more people. It just seemed like a little bit, I don't know, there are a little more people, a little more energy, even than last year. And last year was amazing. I don't know. Did you have this feeling?
Yeah, I was particularly impressed with the portrait of Guido at the Capital One booth, which you've got to see online if you haven't seen a picture of it already. Yeah, their booth was really artistic. It was pretty cool. Nice. All right. So it's great to be here at PyCon, but let's talk about the future. Let's start with the
idea of when is Python 3.8 going to be out? How precisely do we know that? And Łukasz, maybe just say a little bit about like, you're mostly in control of this at the moment, right? Like the release schedule and management and whatnot, right? Currently, I'm serving as the release manager of Python 3.8. And I wrote the schedule for where things are going to go. Traditionally, Python has been released every 18 months, which puts Python
3.8 at the end of this year. Like, you know, and in particular, like, you know, give or take, it should be the end of October. The reason why I'm not saying a particular date, even though it is in the PEP, is that those things tend to be a bit fluid. Like in the past three alphas that I released, like we've been a day early or a day late pretty much every time. Yeah, would you be willing to like hold up a release
if there's some important feature that's two days away? In fact, like this time, there was a small issue, well, but small in size, but big in significance that I held up Alpha 4 that I had a bit of time for early in the week, which kind of makes me late for my own schedule now. And I have to release Alpha 4 tomorrow at the start of sprints. But those things are more important to get right than to get on time. And this is something. Yeah, especially when it's 18 months.
If you were shipping monthly updates, like whatever, just ship what you got, right? But if it's 18 months, that couple of days, that means a big, big deal, right? This is exactly the point. We try to make the consecutive releases, even on the alpha level, consecutively better and not introduce a breakage that is going to be later reverted since every release, including alphas, is being tested by our users.
And we're happy to see actually alpha releases being increasingly used by the community to test their own libraries and applications. That used to be our problem, that it was only after the beta releases. And in fact, RCs very often were the first releases that users would see. Nowadays, with like PyPI working better, with CI options being out there for pretty much everybody, we see more adoption of alpha releases, which is great.
But that is also a bigger responsibility on the alpha releases because you are no longer free to just make a breaking change and then later revert it. Well, it's alpha anyway. Well, no, it isn't. Didn't you see what it was? It said alpha. Yes, now we would need to have feature toggles that look at the alpha version to see whether a particular bug exists or not. And we don't want that. This might be a slightly political question, but have the releases of Python recently become more stable?
It feels to me that people have almost zero trepidation or worry about just adopting 3.7 when they had 3.6 or 3.6 when they had 3.5 these days. And it seemed like there was more concern about what's going to break when I go to the next, you know, semi-major version. Has that changed or is that like perception? For a number of years, the only version that people used was 2.7. And that was for a long time.
So it created this false perception that like Python is infallible, that every particular update of it like never introduces issues that were not there before. Because they were just minor fixes. Exactly. That was, you know, at the same time a blessing and a curse, since every particular version of Python, like including Python 2.4 to Python 2.5, Python 2.5 to Python 2.6, did introduce internal changes that made large projects actually complex to migrate.
The biggest example from the Python 2 world was when Python 2.5 was released. Zope at the time created such a problem for itself that it took them enough time that they migrated to Python 2.6 directly. So the previous version worked on Python 2.4, whereas the next one worked directly on Python 2.6. They did not migrate in time for 2.5. Right, it took so long to migrate. They're like, let's just aim for the next one, huh?
So this is all just like a long, long introduction to just let you know that in Python 3, there are also changes. Many of those changes are deliberate, right? We are changing how the internal memory representation of objects looks like. We are introducing and removing bytecodes. We can change how modules get initialized.
You know, there's multiple things that might break real applications, but breaking them in ways where it is impossible for us to guarantee eternal compatibility with those things. Sure, but this greater adoption of alpha versions and testing and CI probably doesn't hurt, right? Some kind of crashers, right? Downright bugs and whatnot. That I feel like we are doing a better job these days with. Also just because Python 3 finally gets enough adoption that those versions get vetted much better.
Right, absolutely. When I started contributing, we were just working on Python 3.2. So for a number of years, almost none of the things I worked on were actually very heavily used in the industry. Nowadays, the situation is different. So fortunately, the time between a change and actually having real users report on it is way shorter, which makes the quality just better. Yeah, that's awesome. Anthony, let me ask you what your perception of that's like.
You work for a pretty major company that probably has stability in mind and stuff. What's your perspective and what are you seeing at Dimension Data and places like that? It's definitely become easier for people to install newer versions of Python, which is really helping. So in terms of, I guess, moving towards things running as microservices on Docker, for example, there's this, it's not a single system running a specific version that's got to support all these different applications.
Right, it's not cross your fingers and upgrade the server. Yeah, we're not running on the mainframe anymore. So I think that that's really helping in terms of there's more automation for systems deployment and there's a lot more tooling being used to automate like the building of new environments. There's this kind of idea of immutable infrastructure now where you basically create infrastructure in the cloud and you build it on a specific version. You don't change it, right, ever.
Yeah. So I think that has actually made it easier to move to newer versions because you can just spin up new infrastructure with the new version, test how it works, and you can inspect it properly. Whereas like 10 years ago, you're talking about we need to go and buy a million dollars of hardware to build an identical environment to see if this new version is going to work for us. Right. And maybe you have like downtime the whole weekend, the team stays all night and they do the
testing and the rollout. And, you know, these days when I go to websites and I see we're down for maintenance or we have even like we have scheduled maintenance over this like two to four hour window. I'm just like, what are they doing? Like what possibly could take four hours to upgrade? Like four hours, it doesn't take anyway. It just seems pretty wild when you see it. But that used to be common, right? So I guess, yeah. So it doesn't matter as much. Yeah, it still matters for some
really big applications. I know some of the biggest software vendors still have four-hour maintenance windows every Saturday. We definitely have to live through some of that pain at the moment. But more and more, I think people are using this sort of automated deployment and automated infrastructure, which is making it a lot easier to upgrade. That's awesome. What about the beta version or the alpha version even of 3.8? You guys said late October for the main release.
For the main release, but if we work that back, when will we see stuff that we can start playing with? Can we already? I know, Anthony, you build it from source a lot and then play with it, and you guys do as well. But when does the average person who just wants to install a beta or something like that get access to it? Any alpha release is released in the form of sources that you can freely build.
If you're cloning the repository from GitHub, there's tags that tell you exactly when a particular release was made. But also for alphas, betas, and later release candidates and actual versions that we release, we do have binaries, right? So for both macOS and Windows, we have plenty of binaries that you can use to test out your software. I would advise doing it as early as possible. Especially as 3.8 was kind of like a shy release.
Right, because we had the whole governance thing, all the major changes were sort of put in. Yes, so now we are just four weeks before the first beta, which is the feature freeze for us, right? From beta 1 to the main 3.8.0 release, we are just fixing bugs. In some unlikely cases, maybe even reverting features that we identified are not ready for prime time, unlikely to happen. What is more likely to happen is this is the time where you know that breaking
changes are no longer being, new breaking changes are no longer being accepted. So it's a great time to actually start using your CI to test your libraries, your applications on Python 3.8 as well. Expect problems. There's things that we have not identified, even though we have our rather extensive regression suite of tests. But it's great to be able to identify those things early. So by the time the distributors come in and package Python 3.8, its quality is good and we're transparent.
We just can run your application with minimal churn. That's pretty cool. Can you test it with tox or what's the best way to test on 3.7 and 3.8 beta or whatever? There's many possible ways depending on what particular operating system you're using. For open source projects that are already using a CI system like Travis, there are ways to just utilize the latest development version of Python. And at the moment, it's 3.8.
So just by saying that you would also like to run your tests on the development version of Python, you're going to get the beta version in like a month. Currently that is going to be a form of alpha. So that is probably easiest because you don't have to actually install anything locally on your computer, which tends to make things complex when you have many interpreter versions with you. So Homebrew, in particular, likes to default to one Python 3 version, one Python 2 version.
So there's pyenv that you can use to have multiple installations. And obviously, yes, there's tox. They actually work rather well together. So you can set up your matrix of tests that later are run online, but you can also run them locally, which is what I'm doing. Yeah, okay. That seems pretty easy. All right, Anthony, let's kick off this PEP section and talk about the actual features, which, as we all know, appear as Python Enhancement Proposals and they go through a lifecycle and whatnot.
If people are wondering what PEPs are out there, what might make it into 3.8 and so on, where would they go to find that out? So I think on the python.org website, there's a list of PEPs. There's also a PEP index on the list of PEPs. I've also made a small web app called PEP Explorer, where you can go and search and filter and pull specific Python versions and get the status of the PEPs.
So I use PEP Explorer because I spend time looking at PEPs and reading about them and trying to understand what's coming in future versions. So yeah, if you're just curious, I'd say the PEP Explorer is probably a good way to go. Yeah, the PEP Explorer is pretty awesome. It's just a nice little grid. It's on GitHub Pages, right? Yeah. Yeah. And I'll just link to that. And of course, people can go to python.org. But yeah, it's really nice to just keep track of that. And I find that super helpful.
All right. So let's kick it off. Maybe since you're holding the mic, we'll go with you first. You know, what's like one of the notable things that's coming that you want to talk about, what PEP or feature? So I thought I'd cover off the two that change the language first. Okay. So there's assignment expressions, colloquially known as the walrus operator. The walrus operator. Yeah, so this is PEP 572.
Yeah, PEP 572. So in Python, if you want to assign a value to a variable, you use the equals symbol, and that doesn't return anything. So if you just do a = 1 in the REPL, then that won't return anything in the REPL. An assignment expression is basically a way of combining the assignment of a value to a variable and returning the value back again.
So the reason you would want to do that is in some statements, for example, within list comprehensions, within while statements, within if statements. In the if statement's comparison, for example, you can actually do the assignment inside the comparison, and it just removes some additional code that you might otherwise have to write. And also, there's a few other examples in list and dictionary comprehensions where you can do some fairly smart things inside the comprehension.
Yeah, when I first saw this, I thought, interesting. I don't know if it's really needed, but I wasn't super against it or anything. But certainly seeing it in the list comprehension space and seeing it used in other places as well, I think I'm pretty positive on this language change. It's pretty nice. Certainly anytime you need an expression, right? Within like some kind of comprehension, maybe a lambda or something like that. Like this often is the only way to, you know, do it.
Like, so if you want to create a variable, but also test it in a list comprehension, and that might be the response of a function. Like you might have to call that twice: once when you test it and once when you put it into the list. Now you can assign it and then test it, right? So these things get simpler. Yeah, they get simpler.
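As a rough illustration of the patterns just described (a comprehension that would otherwise call a function twice, plus the while and if cases), here is a minimal sketch. It assumes Python 3.8 or later; `score` and the sample data are hypothetical names made up for this example, not anything from the episode.

```python
import io
import re


def score(x):
    """Hypothetical stand-in for a function that is costly to call."""
    return x * x - 10


data = [1, 2, 3, 4, 5]

# Without an assignment expression, score() runs twice for every kept item.
called_twice = [score(x) for x in data if score(x) > 0]

# With the walrus operator, the result is bound to y once, tested, and reused.
called_once = [y for x in data if (y := score(x)) > 0]

# In a while statement: read fixed-size chunks until read() returns "".
stream = io.StringIO("abcdefgh")
chunks = []
while chunk := stream.read(3):
    chunks.append(chunk)

# In an if statement: assign and test the match object in one expression.
if (match := re.search(r"\d+", "PEP 572")) is not None:
    pep_number = int(match.group())

print(called_once)  # [6, 15]
print(chunks)       # ['abc', 'def', 'gh']
print(pep_number)   # 572
```

Both comprehensions produce the same list; the second simply avoids the repeated call.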
I think looking at the syntax, people's initial response is often, I can't see where I would use that, but it takes a while for these types of pieces of syntax to become common because once you know the patterns in which you would use it and you've memorized them, then you start to use it more and more over time. This portion of Talk Python To Me is brought to you by Microsoft and Azure Pipelines. Azure Pipelines is a CI/CD service that supports Windows, Linux, and Mac.
It lets you run automatic builds and tests of your Python code on each commit or pull request. It is fully integrated with GitHub, and it lets you define your continuous integration and delivery pipelines with a simple YAML file. Azure Pipelines is free for individuals and small teams. If you're maintaining an open source project, you'll even get unlimited build minutes and 10 concurrent pipelines. Many Python projects are already using Azure Pipelines.
So get started for free at talkpython.fm/microsoft. And then the second PEP that, I guess, changes the language slightly is positional-only arguments. And basically this is PEP 570, which has also been accepted and merged into Python 3.8. It wasn't part of alpha 3, so it'll be in the alpha 4 release, I believe. And basically this one is you add a forward slash in the list of parameters in a function definition so that it says that the parameters before it are positional-only in this function.
The reason for that is basically to protect an API to ensure that people only use positional arguments and they don't start to use them as keyword arguments. Yeah, it's pretty interesting. It's like the anti-keyword-only argument one, right? So with the keyword-only argument one, I don't know that many people actually know about it, but it's pretty cool.
So if you say, you know, function, parenthesis, star, comma, argument, argument, argument, those all have to be explicitly called as keyword arguments. This is like, I want to make it impossible, at least in this section of the parameters, to call them as keyword arguments, right? Yeah, it's a cool feature. And it's also going to help with a lot of the standard library. That's the other justification.
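To make the two markers concrete, here is a small sketch, assuming Python 3.8 or later. The function names `clamp` and `connect` are hypothetical and exist only for this illustration: a slash makes everything before it positional-only, and a bare star makes everything after it keyword-only.

```python
def clamp(value, low, high, /):
    """Parameters before the slash are positional-only (PEP 570)."""
    return max(low, min(value, high))


def connect(host, *, timeout=10, retries=3):
    """Parameters after the bare star are keyword-only."""
    return (host, timeout, retries)


print(clamp(15, 0, 10))                   # 10
print(connect("example.org", timeout=5))  # ('example.org', 5, 3)

# Both of these calls use the wrong style and raise TypeError.
rejected = []
for bad_call in (
    lambda: clamp(value=15, low=0, high=10),  # keywords for positional-only
    lambda: connect("example.org", 5),        # positional for keyword-only
):
    try:
        bad_call()
    except TypeError as exc:
        rejected.append(str(exc))

print(len(rejected))  # 2
```

Because callers cannot name `value`, `low`, or `high`, those parameter names can later be renamed without breaking anyone, which is the API-protection argument made above.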
And there's a lot of the Python standard library where the API needs to be protected so that it can be iterated on, where this feature is basically going to help lock that down. And also in 3.7, there were some improvements to the performance of method calls. That performance improvement doesn't work with keyword arguments. I see. So basically, you could potentially use this as a way of enforcing that performance improvement. Okay, that's interesting.
The example that I saw, I think if I remember this correctly, was just like range. Like even just knowing when you see stuff, if you quickly read it, like you could have range and say stop and then start and then step. Or you could have start and then stop and then step. Or you could have step and then start. And just seeing, I mean, even though the words are kind of similar. And it's going, no, I want you to always just say start and then stop and then step or whatever, right?
Like just requiring them to not have this sort of almost arbitrariness of the order of the parameters seemed like an interesting idea there as well. Yeah, an additional detail is the fact that, you know, many of the functions that are implemented
in C don't implement keyword arguments. So they're effectively positional only by the sheer fact that they are just being implemented in C. And this just enables us to express those same APIs in Python faithfully, so that alternative implementations recreate the API in exactly the same way. Oh, yeah, that's interesting. So you don't want like the leaky abstraction of the C
implementation to leak out and maybe break PyPy or something like this? Yeah, so currently, like the issue is actually the opposite, where PyPy does not necessarily care that, you know, some argument is positional only in CPython. So they allow for keyword use of it. And then that piece of code is problematic going back to CPython. So that is just, you know, caring of, you know, making your library code, your application code, sort of exchangeable between runtimes. Yeah, interesting.
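The mismatch described here can be sketched as follows, assuming CPython 3.8 or later, where the built-in `len` is implemented in C and rejects keyword arguments. The helper names `loose_len`, `strict_len`, and `accepts_obj_keyword` are hypothetical, invented for this example.

```python
def accepts_obj_keyword(func):
    """Return True if func can be called as func(obj=...)."""
    try:
        func(obj=[1, 2, 3])
    except TypeError:
        return False
    return True


def loose_len(obj):
    """A naive pure-Python rewrite: silently gains a keyword form."""
    return len(obj)


def strict_len(obj, /):
    """A faithful rewrite: the slash keeps obj positional-only."""
    return len(obj)


print(accepts_obj_keyword(len))         # False on CPython
print(accepts_obj_keyword(loose_len))   # True
print(accepts_obj_keyword(strict_len))  # False
```

Code written against `loose_len(obj=...)` would fail against the C version, which is the portability problem between runtimes that the slash syntax lets a pure-Python implementation avoid.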
What about Cython? Cython is its own kind of thing because it's a language that is being compiled or rather transpiled to a bunch of C or C++, which is then compiled to a C module. They are kind of free to do a lot of modifications that Python itself is not free to do because they're compile time modifications. Right, their transpiler can make the adjustment it needs anyway, right? Yes, the source code that you're reading is not the source code that is being executed.
Interesting. All right, what's the next PEP that you want to feature, you want to talk about? Let's cover a few of them. In fact, the slew of PEPs is all related to typing. Let me start with something old, which is PEP 544, protocols. That PEP should have been accepted a long time ago, but it was not because of the governance situation. So protocol is this kind of like interface inheritance type of thing? Like, what's going on? Protocols essentially is a way to
introduce duck typing to static typing, to type checkers. So you can have interfaces, well, or protocols, as they are called across the Python documentation too, which is why we're using that name. You can have essentially implicit interfaces that are being implemented by a class, by a type, and then the type checker is able to act on them when you express a need for one as an argument to a function.
For example, if your function accepts anything that has a read method, now you can express that type. That's really cool. I'm super excited about this. Because if you take two things, like maybe a set and a dictionary, but you want to express, I'm going to have those types and I want to work with them. But really all I care about is I can iterate it. Or that's probably not the perfect example. But you know, like it's hard to kind of make the type system express that now.
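The "anything that has a read method" case can be sketched like this, assuming Python 3.8 or later, where `typing.Protocol` is available. `SupportsRead`, `first_line`, and `FakeFile` are hypothetical names for this illustration. Nothing is enforced at runtime; a type checker such as mypy is what would act on the annotation.

```python
import io
from typing import Protocol


class SupportsRead(Protocol):
    """Structural type: anything with a compatible read() method."""

    def read(self, size: int = -1) -> str: ...


def first_line(source: SupportsRead) -> str:
    """Return the first line of whatever source.read() yields."""
    return source.read().splitlines()[0]


class FakeFile:
    """Never inherits from SupportsRead, yet matches it structurally."""

    def read(self, size: int = -1) -> str:
        return "hello\nworld"


print(first_line(io.StringIO("a\nb")))  # a
print(first_line(FakeFile()))           # hello
```

Neither `io.StringIO` nor `FakeFile` declares any relationship to `SupportsRead`; having the right method is enough, which is duck typing expressed in a form a static checker can verify.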
And this just says, well, if it has an add and a pop method, we're good, like whatever, right? Is that protocols? Yes, so protocols is the answer to a question that we received a lot early on when PEP 484 came out, the original formulation of static typing for Python: isn't static typing in direct opposition to what we have been telling everybody to do for all those years, which is duck typing? Like, if it
quacks like a duck and looks like a duck, it is a duck. We don't care if isinstance is working. We just care that the calls find the right methods with the right arguments and everything is fine. So now with protocols, you can actually structurally express this, that all you care about is a given field or a given method. I like it. I know you're a proponent of type hints and mypy and all that kind of stuff. How do you see the state of that these days? Well, we're definitely on the rise
there. In my time at Instagram and at Facebook, we've seen a lot of improvements, both in terms of security and team velocity, as well as just being able to comprehend the source code, when types were introduced to the biggest PHP component of facebook.com. And so since, I guess, 2013, I wanted to see something similar in Python.
So PEP 484 came out, you know, soon enough, and Python 3 started getting adopted more and more. And this is when annotations, which are the nice way to express types, started gaining adoption. And these days, from what I've heard at the conference now, 90% of functions in the Instagram code base, which is north of 2 million lines of code at the moment, are covered in types, which is amazing. That is a big achievement. So definitely, this trend is on the rise, which I am very happy about.
Yeah, that's awesome. Anthony, what are your thoughts on type hints, type annotations? And do you like them? Have they changed your code? Do you use them? I actually use them very rarely. In 3.7, there's the type annotations, the delayed evaluation of type annotations. Yeah, it's gotten a little nicer in that way. In 3.7, which makes it a bit easier in terms of what you have to import and when.
But the only reason I use them seldom is because I mainly work on libraries which I publish to PyPI, which are used by people who have Python 3.5, 3.6, and some 2.7 as well. So I really have to cover the lowest common denominator when it comes to users because they're mainly utility libraries that I work on, not sort of single deployment applications. Yeah, or something like Black that doesn't really get consumed directly, but it's more executed, right? Like Black or pytest or something.
Yeah, interesting. What's the next one? All right, so the next PEP would be PEP 585 that I actually wrote. Well, it's still in draft form. So to kind of set the stage for the PEP, what Anthony said is like there's plenty of cases where currently typing that was added rather carefully to the language requires you to import names that you're later using as types. There's some situations where you are introducing names to your global scope just for type aliasing or to introduce type variables.
Right. For example, if you've got a function and you want to say its return type is this object, you now have to import at the top. If you had never actually had that part called, maybe it would have never been imported until lazily. There's changes in behavior because of that, right? Yeah, this is often problematic, right?
Like what is even more cumbersome for the user is that there are plenty of either built-in types or abstract base classes that have their equivalents in the typing module, meaning if you want to express that some argument is a list of strings, you have to import an uppercase List from the typing module and say uppercase List of str. And I always found that clumsy, right?
I always found that it is something new that you have to explain to new programmers that are first interacting with typing. And there's not really a great reason for that. We wanted the actual lowercase list to be orthogonal and not know anything about the static typing concept, which is mostly used by an external type checker. It does not have a big runtime component. It does have a little, right? Because you can inherit from generic types.
So you can actually create your own data structure where you say that this is, I don't know, a collection of types T, right? So that this is a possibility. But for very many cases, this runtime component is just a hindrance. It's something that you have to remember to import. The names look different because they're uppercase and lowercase. They might look exactly the same in form of set, but they actually mean something else now. Because the point being, so that's the first issue.
And the second issue is that this is something that sits in memory, right? This is something that you spend time on when you're starting up your program. So I always felt like this is something that we can maybe live without. Hence PEP 563, which actually postpones evaluation of annotations. That was introduced in Python 3.7.
And stemming from that, you know, having that foot in the door that like now the annotations are not evaluated anymore, we can regain some of the usability that people expect just by the fact that this can be still valid Python syntax, but it doesn't have to be valid at runtime. So we can get away without importing things from typing. You know, the type checker will know exactly what you mean anyway, right?
We can come back to using a lowercase list of str instead of an uppercase List of str, and a few other things. You still do the bracket of string? Yes, yes. But on the lowercase list type? Yeah, so we will never do pointy brackets for that, like in Java or C++, because our LL(1) parser is unable to deal with that case. Maybe if we switch to a different one, of which there is discussion, maybe then that would be possible. But at that point, it will still be way too late.
Yeah, I think it's fine the way it works. It's different, but it's just totally... It's different, but it's a way of expression. There's nothing that makes the angle brackets in templates or generics necessarily the right way, yeah? Yes, exactly. It's like as long as humans understand what those things mean, the goal has been achieved.
So yeah, the rest of PEP 585 is just an attempt to reform some of the pre-existing constructs in the typing module, like creating new types, casting, aliasing, or type variables, into variable annotations so that they are also not evaluated at import time, which enables, again, usage of types that are not imported and some of those tricks with syntax like lowercase list and dict and whatnot. So that's 585.
While we're on this performance and type annotations and stuff, what's the story of mypyc? Oh, this is actually a very interesting story. So mypy has traditionally been slow, to the point where running it over the entire Instagram code base was taking over five minutes, right? So this was a thing that you could do in continuous integration, but you absolutely could not run it in an editor or whatnot. We had some hacky workarounds to at least make people in the editors happy.
I wrote a silly flake8-mypy plugin at some point that kind of got us somewhere; it was useful for a while. But all of that was just not very great. So in the meantime, mypy started implementing incremental type checking, meaning the parts of the graph of your modules which did not change can be cached so that with every change, like most of your computation is already pre-done. And that is evolving to this point now. With a well-populated cache, that cuts the time to around 40, 50 seconds.
So it's like a six, seven times improvement. Yeah, it's a big improvement. So that's good. But still the cold type checking was like rather slowish. In the meantime, Facebook started developing its own type checker for Python. Well, more with the goal of creating a static analysis tooling that just uses types. So the type checker part was only the base of the static analysis that was being performed on that very code with the important use case of doing security checks.
And one of the goals of that, you know, new type checker was like, we have to be faster than mypy, right? Yeah. Like, so that created competition and competition is always good. So in the meantime, like, Jukka Lehtosalo, like, revived his original idea that, hey, if we have types, we can actually try to compile the Python code in a way that runs it way faster now. What does it compile to? So that's interesting, right? So the mypyc compiler actually creates a C extension.
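As a rough illustration of what is being described here (a sketch, not taken from the PEP text): with the postponed evaluation of annotations from PEP 563, a lowercase list of str annotation is stored as text and never evaluated at runtime, so nothing has to be imported from typing. The function name `shout` is made up for the example.

```python
from __future__ import annotations  # PEP 563: annotations are stored as strings


def shout(names: list[str]) -> dict[str, int]:
    """Map each upper-cased name to its length (hypothetical example function)."""
    return {name.upper(): len(name) for name in names}


# The annotation is kept as plain text; a type checker reads it,
# but the interpreter never evaluates it at import time.
print(shout.__annotations__["names"])
print(shout(["ada", "guido"]))
```

The `from __future__` line has to be the first statement in the module for this to work.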
It actually transpiles to C. This sounds weird until you think about the C API that Python provides. And the Python C API is meant to be consumed by C. So it is just natural that you would have a generator that emits valid C for your given use case. And it turns out that with just a few constraints on how your program works, you can achieve 20 to 30 times performance boosts with that. So that's great. And in a real production application like mypy, it's consistently four times faster.
This portion of Talk Python is sponsored by Microsoft and Visual Studio Code. Visual Studio Code is a free, open-source, and lightweight code editor that runs on Mac, Linux, and Windows with rich Python support. Download Visual Studio Code and install the Python extension to get coding with support for tools you love like Jupyter, Black formatting, Pylint, pytest, and more.
And just announced this month, you can now work with remote Python code bases using the new Visual Studio Code remote extensions. Use the full power of Visual Studio Code when coding in containers, in Windows Subsystem for Linux, and over SSH connections. Yep, that's right. Auto completions, debugging, the terminal, source control, your favorite extensions. Everything works just right in the remote environment. Get started with Visual Studio Code now at talkpython.fm/microsoft.
Do you see use cases for that outside just mypy, like random person doing data science that needs their Python parts to go faster? Currently, mypyc tries to limit its scope, since they've seen the attempts by previous projects that meant to speed up Python. Those attempts failed mostly on trying to be 100% compatible with every single feature of Python. So they're focusing on a subset, but they're growing that subset as much as they need.
And the big missing piece currently is there is no async/await support. And with that support, I could actually have Black compiled, which could also significantly speed up the formatter, which is already pretty performant, already does pretty well. But that would just make it so much better for the users. So in fact, I think I managed to get Sully, the core developer of mypyc, rather excited about the prospect of having Black as the next production customer of mypyc. So we'll
see. I have my fingers crossed. Yeah, that's exciting. All right, Anthony, what's the next one on our list of cool features in 3.8? So this one is actually still in draft. It hasn't actually been decided and potentially might be deferred to a later release if it gets accepted. But when I've talked about features, at least proposed PEPs, this one gets quite a bit of attention, and it's runtime audit hooks. And basically the PEP is a way of setting a callable that gets invoked when certain
system methods within the Python standard library get called. For example, opening a network socket or requesting a URL or opening a file or lots of different cases, I guess, of sort of low-level standard library functions or methods. When they get called, then you get notified.
That's super cool. So like if for some reason I'm in a lockdown environment, I want to use some package or write some app and it's, we think it's not talking to the network or the file system, but it turns out all of a sudden it's opening sockets or DNS stuff. That might be something to inspect. Yeah. So potentially you could lock down a Python distribution or a Python process to not be able to open certain URLs or open network sockets under certain circumstances. And it's
cool. So with the hooks, do I get to say, I saw what you did and okay, or I saw what you did and no, you don't get it. Is it like a place to stop it? Yeah. The default is just as an FYI. But if you wanted to raise a runtime error or something else in line, then it would actually stop the request through to the function.
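A minimal sketch of the "FYI by default, raise to stop it" behavior described here, using `sys.addaudithook` as it eventually shipped in Python 3.8 under PEP 578. The `.secret` suffix rule and the `seen` list are assumptions invented for the example, not part of any real policy.

```python
import os
import sys

seen = []                   # every path passed to open(), recorded as an FYI
BLOCKED_SUFFIX = ".secret"  # hypothetical rule: refuse to open these files


def audit(event, args):
    # The hook receives every audit event; only "open" is of interest here.
    if event != "open":
        return
    path = args[0]
    seen.append(path)
    if isinstance(path, str) and path.endswith(BLOCKED_SUFFIX):
        # Raising inside the hook aborts the operation that triggered it.
        raise RuntimeError(f"blocked open of {path!r}")


sys.addaudithook(audit)

open(os.devnull).close()    # allowed, and recorded in `seen`
```

Note that a hook added this way cannot be removed again for the lifetime of the interpreter, so the sketch keeps the blocking rule deliberately narrow.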
That's pretty awesome. I think this is pretty interesting. I know there's some restricted environments and even like app stores and stuff that maybe it would be cool to package this up and use it. So yeah, definitely nice. Lucas, what do you think about this one? Well, I actually think this
is very important. If you ever worked for a big enough organization, very often the audit trail of what actually happened is important, not just for security reasons. Very often, cascading errors that end up with an entire site being down are very hard to foresee.
Like, you know, the very easy to make mistakes are like long fixed, they're all patched. Like, you know, there's not a big red switch that if you just press the button, you know, the site goes down. It's very often something that was hard to, you know, combine. And having the trail of, oh, this happened first and then another thing happened later, like that is very valuable. So I see this feature not only as a security feature but as just, you know, like a post-mortem kind of feature as well. That's …
Anthony, do you envision this might enable a different set of tooling? We have visual debuggers now. Could you maybe have other types of analysis and tooling and whatnot? Yeah, in terms of tooling, I guess there's a lot of things in the standard library that you might want to add hooks in. And also an easy way of putting hooks into additional modules as well. And then having people to catch those and deal with those separately.
I can definitely think of a few examples of libraries, deserialization libraries, not naming any specifically, that have … Rhymes with sickle? That have security backdoors, just in terms of the way they work. So unless you explicitly specify to load it with a safe mode, then you can actually run … That was a different one, a different rhyme maybe. Oh, okay. Yeah, XML as well is another one. Yeah, yeah.
There are sort of known, I guess, security backdoors in certain libraries, and basically this could be a way of protecting against those. Okay, yeah, that's great, because it should not be doing these operations while loading this file. Yeah, if you're loading a YAML file or an XML file, it shouldn't be opening network sockets. Yeah, probably not, or issuing subprocess commands or any of these not-so-lovely things. All right, Lucas, what's next on our list?
I would like to just say that there's quite a few PEPs that are still in draft form, and the authors have like an entire four more weeks in which they can decide to finish their PEP and publish it, so things might change. But the ones that I'm like personally interested in are always, of course, typing. So let me just cover two more. Like, the first one is PEP 586, so that's literal types, and the second one is 589, which is TypedDict. Both of them are kind of an
example of, you know, our type system kind of starting conservatively and then growing based on need, right? So literal types are very interesting because there are a bunch of calls where the behavior, like the return type or the cascading other arguments that you're going to use in the function, depends not on the type of an argument, but on the actual value that you are passing. Like a positive integer, negative integer, something like this, or what?
So parametric types are kind of hard, but what we're doing with literal types is something like the open built-in. Like with the open built-in, you have a certain number of modes, right? And depending on whether you're saying "rb" or "r", the resulting IO is either bytes or strings. And currently, there is certain hackery both in mypy and Pyre to just, you know, work around this. But it would be good if the actual type system supported this feature.
So Literal is all about being able to express those types so that if you pass None here, it's going to behave differently. If you pass a string here, it's going to behave differently if that string is "rb" or is "wb" or is "r" and so on and so on. So that's literal types. There's some very, very interesting edge cases and deep thought in that PEP. It is surprisingly long and complex. I'm not going to go into this now. The other one, though, is TypedDict.
Originally, dictionaries have been envisioned in the static typing as, oh, there's this key value store. So there's keys of a certain type and there's values of a certain type. What happens in practice is that a lot of pre-existing Python applications do not use named tuples, do not use attrs or data classes, which are very new. So instead, they use kind of lightweight classes in the form of dictionaries that have keys and values of various types.
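To make the open example concrete, here is a small sketch of how literal types can tie the return type to the value of the mode argument. It assumes `typing.Literal` as it later landed in Python 3.8; the helper `read_all` is hypothetical and only mimics the shape of the real open signature.

```python
import os
from typing import Literal, overload


@overload
def read_all(path: str, mode: Literal["r"]) -> str: ...
@overload
def read_all(path: str, mode: Literal["rb"]) -> bytes: ...
def read_all(path, mode):
    # At runtime there is a single implementation; the overloads above
    # exist only so a type checker can pick the return type from the value.
    with open(path, mode) as handle:
        return handle.read()


text = read_all(os.devnull, "r")   # a type checker infers str
raw = read_all(os.devnull, "rb")   # a type checker infers bytes
```

A checker such as mypy would reject `read_all(path, "x")` here, since "x" matches neither literal.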
So there can be name, which is a string, but there can be a birth year, which is going to be an int, right? And based on the actual name of that key, you're going to have different types. So that was very cumbersome to express in the previous form. Almost like a schema, yeah. Yes, it's like very much like schemas.
So now like there is a way to describe a typed dictionary in the form of like a data class-like type, where you just express it like, you know, class-like, saying, this dictionary is going to have keys that are like this, and this key is going to be a string, this other key is going to be an int. That solves already a lot. But like then the interesting part is when those things start nesting.
That actually enables you to construct like rather complex schemas that can be used directly in JSON or, you know, in other forms of serialization. So that PEP alone is also very useful in practice. Even though you could just say like people are doing it wrong, they should be using, you know, named tuples or other forms of typing instead. Well, you kind of have to be pragmatic, you know, like you see pre-existing valid use cases of this and you have to just adhere to those. Yeah, interesting.
I guess since you're really into typing and you're on the core dev side of things, what do you think about libraries, especially I'm thinking of like web frameworks that use typing for like serialization and stuff? So like Molten, for example, you can have a class that has fields, but also those fields have types. And then you say this web function takes this class, but it's like really a form submission, and it'll like convert stuff to integers or like validate against the types.
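The name and birth year example, including the nesting, might look roughly like this with `typing.TypedDict` as it later shipped in Python 3.8. The `Address` and `Person` classes and their fields are made up for illustration.

```python
import json
from typing import List, TypedDict


class Address(TypedDict):
    city: str
    postcode: str


class Person(TypedDict):
    name: str                 # this key is a string
    birth_year: int           # this other key is an int
    addresses: List[Address]  # nesting: a list of other typed dictionaries


# At runtime this is an ordinary dict, so it serializes to JSON directly.
person: Person = {
    "name": "Ada",
    "birth_year": 1815,
    "addresses": [{"city": "London", "postcode": "W1"}],
}
encoded = json.dumps(person)
```

The schema only exists for the type checker; nothing validates the keys or value types at runtime.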
Is that in your mind awesome or is that an abuse of the type system? So this is interesting, right? Because obviously as long as the type system is kind of an extension of the type system we're using for type checking, or maybe it's even exactly the same, like that is using a shared vocabulary. That's great. Like, you know, we support that. Like we would wish to see type hints in more places.
In fact, in 3.7, I extended singledispatch so that now you can just use annotations on arguments instead of saying, you know, register of a type, and like you can just say register and, using the annotation of the first argument, it'll behave like, you know, as you expect. So you can use type annotations at runtime for whatever you want, as long as the type system is kind of, you know, the same with what we're using it for. Some use cases use annotations, function annotations in incompatible ways.
And that creates issues because an increasing amount of tooling like Visual Studio Code, you know, PyCharm and whatnot, like gets confused by seeing something that, you know, is clearly not a type in the place where types are expected. So I kind of, yeah. Yeah, an example of that. So this example I gave you with Molten seems like it's consistent. The type checker says it takes an int. It's actually an int at runtime.
But I can't remember, but some of the other frameworks, maybe REST frameworks, they would say like you could say that this parameter is a header. And what it actually is the string value with that name out of the header. It's like the value comes from the header. But at runtime, it's not a header. It's not a dictionary. Whatever the header is, it's actually a string or an int or something. Yeah, I see.
It seemed like it was really cool and clever, but also like incongruent with what Python intended. A certain amount of those things are like valid use cases. Like, let's say in the case of attrs, attrs like creates valid classes for you from minimal information that you provide in source code. So this class is being fully functional at runtime. However, the type checker does not know this. It just sees like just some magic decorator and just this minimal set of attributes on it.
And it does not know that a certain amount of built-in methods have been created and a certain amount of functionality within it and whatnot has been added. So at least in the case of mypy and Pyre, additional functionality had to be implemented in those type checkers to understand that those types actually behave a bit differently from regular classes. But that's just something that users want, something that users need. And so based on that, we're going to be extending that.
All right, cool, cool. Anthony, what do you got next on our list here? That's actually everything we have for 3.8. All right, so ship it, we're good? Yeah, I think we're done now. So actually that PEP in particular, the TypedDict PEP, I've been thinking, if anyone's seen the JSON Schema project, it's really cool. It's basically a way of defining a schema for JSON documents.
You can definitely see that if this PEP gets accepted, somebody will build tooling to integrate between the JSON schemas and this new TypedDict type. Yeah, and it makes perfect sense. I mean, dictionaries are so similar to JSON in a sense, and they both have this sort of dynamicness but mixed types. I mean, they're very, very sort of mappable, yeah?
Yeah, so I think it's just a matter of time until someone builds a library where you give it a JSON schema and it will generate a TypedDict class, and then you use that class. Similar to the way that, in an ORM, you would describe like a data class, and then you would deploy it as a database. It's basically like a similar way of reflecting documents. That's cool. I can definitely see it for serialization.
Like you say, this function takes a TypedDict, but what it actually is is a form post or a JSON post or like a REST call or something like that. That's cool. So let's see. Did we talk about multiprocessing? Is that coming in 3.8 or is that beyond? That's in 3.8. So one particular thing that is not PEP-worthy, but it's still a very interesting new feature is that traditionally multiprocessing, which has been created to solve the GIL problem, has solved it partially.
What I mean by this is that, yes, there is a master process that creates a bunch of children and then delegates work to it. So you can just call Python functions and those Python functions actually are executed on the other side in the child process. But the way this is achieved was that function call has been pickling the arguments of the function you're calling. That ended up being on the child side. That child unpickled the arguments. It did the computation it needed.
And then if there was a return value it wanted, it actually had to pickle that return value again and pass it back to the master process. And the master process unpickled the return value again. And if that's big, it's very slow, for example. Exactly. So like for small things, that was mostly fine. But if you had like a gigantic haystack and you were looking for a needle in it, just, you know, pickling that haystack and then unpickling on the other side was taking a lot of time.
We were going to run that on all six cores. So here's six copies of our like 10 megabyte, whatever. That is actually annoying because like if you had like a master process that, say, gets web requests, right? In the time that you're spending on pickling that haystack, nothing else can actually be done in Python because the GIL is still there on that master process. So you are solving the GIL problem only partially.
So now multiprocessing introduces this new fantastic feature where you can declare a shared memory segment and share that memory between parents and children. What that does is you can actually get away without a lot of serialization and deserialization. So for certain kinds of tasks like search, like filtering, this will decrease the churn just needed to pass data around. Meaning it will bring us way closer to the world we want to see, which is that, yes, there are certain Python processes.
They still have the GIL, but it does not matter because we can use as many of those processes as we have cores and everything is fine. Yeah, and you don't have the replication of memory and the copying and all that. That's awesome. Yes. So I'm really excited when I saw that come out. I'm like, oh, this is going to be great. So that's in 3.8? Yes, that is already in. Cool. And what about the subinterpreters, PEP 554? That's beyond? Oh, so that's interesting. That is kind of related.
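Going back to the haystack example for a moment, a minimal sketch with the `multiprocessing.shared_memory` module from Python 3.8 could look like the following. The helper names `create_haystack` and `search` are invented here; in real use `search` would be the target of a `multiprocessing.Process`, and only the segment's short name would be passed to the child instead of the pickled haystack. To stay self-contained, the sketch attaches by name from the same process.

```python
from multiprocessing.shared_memory import SharedMemory


def create_haystack(data: bytes) -> SharedMemory:
    # The parent copies the data into a named shared memory segment once.
    shm = SharedMemory(create=True, size=len(data))
    shm.buf[:len(data)] = data
    return shm


def search(shm_name: str, size: int, needle: bytes) -> int:
    # A worker attaches to the existing segment by name; nothing is pickled.
    shm = SharedMemory(name=shm_name)
    try:
        # bytes(...) makes a worker-local copy purely to keep the search simple.
        return bytes(shm.buf[:size]).find(needle)
    finally:
        shm.close()


haystack = b"hay" * 1000 + b"needle" + b"hay" * 10
segment = create_haystack(haystack)
try:
    position = search(segment.name, len(haystack), b"needle")
finally:
    segment.close()
    segment.unlink()  # the creating process is responsible for freeing it
```

The size is passed explicitly because some platforms round a segment up to a whole number of memory pages.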
Yeah, they're in the same category of things. Yes, however, the multiprocessing feature does have limitations, right? Like the shared memory segment is not right for any arbitrary Python object. There's like, you know, restrictions on what types you can use. That was actually a complex functionality to be added, you know, within particular operating systems, shared memory handling is way different.
So you have to understand how those differences work and which process is now responsible for creating that shared memory segment and shutting it down and freeing that memory when everything is, you know, shutting down. So that is all great work by Davin Potts. Like, multiprocessing is one thing, but subinterpreters is what if you had this multiprocessing API and actually just had one process and just used many Python interpreters within it, each with its own GIL.
To achieve that, many changes in the Python C API have to be made. Like, you know, a lot of cleanup internally in terms of what constitutes local and what constitutes global state has to be done. Eric Snow is working hard on that. As far as I can tell, this is deferred to Python 3.9. I'm eagerly awaiting that. I think this is going to be a great improvement. Yeah, it could definitely change the threading story.
Multiprocessing is strong in Python, async/await is super cool for IO-bound stuff. But threads have always been a kind of, well, sometimes they're helpful, sometimes they're not. It depends. And this could be awesome, right? You could just dedicate a sub-processor, a sub-interpreter, excuse me, to each thread, right? And really get free of that. I agree. Cool. All right. Well, thank you both for sharing what's coming. Pretty excited about 3.8. Cool. That was a pleasure.
Yeah, it's going to be great. Yeah. And on that topic as well, I guess we've got 3.9. So, some of the PEPs are being deferred to 3.9. And on the topic of sub-interpreters, the unpacking of the startup sequence, and also the initialization configuration, there's two proposals for that. One is PEP 432, and the other's PEP 587, which are interrelated because if you have sub-interpreters, you want the interpreter startup time to be fast and also the configuration to be flexible.
So I think 3.9 will definitely see some more proposals related to that. Awesome. Which are going to hopefully improve the startup time of Python 3. As we know, it's a little behind where Python 2 was for various reasons, but that'll be a great step forward. Yeah, that'd be really awesome. And then it also might make this sub-interpreter stuff better if those little sub-interpreters can get created faster as well. I don't know how related
they are, but pretty cool. Pretty cool. All right. Well, we've used up almost all our time, so I won't keep you guys much longer, but especially Lucas, let me ask you this, like, will there be a Python 4 and does it matter? I mean, on one hand, like we've got stuff that's 0.1, 0.2 versions that have been around for 10 years with a hundred releases. We've got Python
3. If we don't have like major breaking changes, is there a reason to start calling it 4 and 5 and 6, or is that just going to like scare people with the history or like, what do you, what's your perspective here? We are at Python 3.8 now. We're about to release 3.9 later, you know, another 18 months later. Historically, Guido expressed his distaste with numbers after the decimal point that have more than one digit. And so he disliked the notion of
having 2.10, 2.11, and same with 3.10, 3.11. However, we have both philosophical and technical challenges with just releasing a Python 4. Well, the obvious philosophical one is that the transition between Python 2 and Python 3 was very, very challenging, right? It took us a lot of effort. And there's a lot of fatigue, I think, in the community to just not go through that again for a while. Absolutely. Our closets are still full of skeletons. So we are really trying hard
not to make that mistake again. It's not only a problem for the users, it was also unpleasant and a problem for the core developers. So we are really careful to make changes in a very incremental manner now and communicate them well and make them gradually so that we are disrupting our users the least, which just means calling something Python 4, well, would probably be just scary on
its own, just on the power of that number. But just more practically speaking, because of this Python 2 and 3 transition, there is a ton of code in the wild that does checks exactly for the number
3 in sys.version or sys.version_info. And those checks would, you know, become invalid if we introduced Python 4, which is one of the reasons why, you know, Linux had problems when it suddenly became Linux 3, and why we have Windows 10 now. Just for that practical reason, I do expect that we're going to see Python 3.10 first, at least, before we ever decide to call the next release Python 4. Yeah, yeah. So 3.10 is way more likely. Maybe we should call it Python 6, because then it's like two times three.
Well, I was wondering if there's a proposal to introduce calendar versioning to Python. Oh, yeah. What do you think about calendar versioning? If Python 2.7 was called 2014.1, then maybe people would reconsider. Really? 2014? What's up here? We just upgraded from 2013. It would certainly remind people how old their Python distribution is. So maybe they'll upgrade faster. That's funny. Well, I'm in no power to just make that change. I could be in power to create a PEP about it.
But this is probably not a sword I'm willing to fall on. However, let me tell you this, all of my private projects do use calendar versioning. That's the only versioning that I am familiar and comfortable with. There's obviously semantic versioning, but I don't know about others, but at least I don't see myself being as strict and consistent with applying semantic versioning every time. What does it mean to push the major version versus minor version?
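The kind of version check being described, and why both a Python 4 and a Python 3.10 can trip it up, can be sketched like this. The function names are hypothetical; the patterns are the point.

```python
import sys


def is_python3_brittle(version_info=sys.version_info):
    # Common Python 2/3-era check: true only when the major version is exactly 3.
    return version_info[0] == 3


def is_python3_or_newer(version_info=sys.version_info):
    # Tuple comparison keeps working for a hypothetical (4, 0) and for (3, 10).
    return tuple(version_info[:2]) >= (3, 0)


def at_least_3_8_brittle(version_string):
    # Compares the first three characters as text, so "3.10.0" is read as "3.1".
    return version_string[:3] >= "3.8"


print(is_python3_brittle((4, 0, 0)))    # the "not Python 2" branch is skipped
print(at_least_3_8_brittle("3.10.0"))   # 3.10 is treated as older than 3.8
```

Comparing `sys.version_info` against a tuple avoids both problems, which is why it is generally preferred over parsing `sys.version`.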
Yes, so obviously there's rules, but the devil is in the application. Do you apply those rules consistently and every given time? I wrote an auto formatter because I was not able to apply rules of code styling consistently and every time. So I don't trust myself enough to do the same for semantic versioning. And if I'm not doing that, then my users cannot depend on what they expect from semantic versioning.
Hence, just using calendar versioning is way easier, adopted by many popular projects like Ubuntu, like Twisted, like others. Yeah, I love the calendar versioning. I don't know that it makes sense for like the main Python. Maybe, maybe it does, maybe it doesn't. It would be effective on showing how old some stuff is. But certainly I feel like semantic versioning requires, like on libraries, it requires some expertise in that library.
Like I depend on library A. It depends on library B. I see that when I pip installed it, it's 0.1.3. Six months later, is that out of date? I have no idea. Like I don't even know like roughly how old that is. But if I saw the calendar version on all the dependencies or stuff I'm not super familiar with, I'd be like, oh, yeah, this is actually, this is pretty much new or it's old. Like, it just, it makes it easier for newcomers, I think. Yeah. All right, guys.
Thank you for being on the show and sharing all this and looking forward to when you actually release 3.8. I'm looking forward to that, too. Thank you very much. I can imagine. Thanks, Michael. Yeah, bye. This has been another episode of Talk Python To Me. Our guests in this episode have been Lucas Lenga and Anthony Shaw. And it's been brought to you by Microsoft. If you're a Python developer, Microsoft has you covered.
From VS Code and their modern editor plugins, to Azure Pipelines for continuous integration, and serverless Python functions on Azure. Check them out at talkpython.fm/microsoft. Want to level up your Python? If you're just getting started, try my Python Jumpstart by Building 10 Apps course. Or if you're looking for something more advanced, check out our new async course that digs into all the different types of async programming you can do in Python. And of course, if you're
interested in more than one of these, be sure to check out our everything bundle. It's like a subscription that never expires. Be sure to subscribe to the show, open your favorite podcatcher and search for Python. We should be right at the top. You can also find the iTunes feed at /itunes, the Google Play feed at /play, and the direct RSS feed at /rss on talkpython.fm. This is your host, Michael Kennedy. Thanks so much for listening. I really appreciate it.
Now get out there and write some Python code.
