Today, I'm excited to host again the great Osvaldo Martin. Osvaldo collaborates with numerous open source projects including ArviZ, PyMC, Bambi, PyMC-BART and PreliZ, helping shape the tools that make Bayesian methods accessible and powerful. He's also an educator teaching Bayesian statistics at UNSAM in Argentina and a problem solver for businesses.
In this episode, Osvaldo shares his insights on Bayesian Additive Regression Trees, or BARTs, explaining their versatility as non-parametric models that are quick to use and effective for variable importance analysis. Osvaldo also introduces PreliZ, a package aimed at improving prior elicitation in Bayesian modeling, and explains how it enhances the Bayesian workflow, particularly in education.
From interactive learning tools to future developments in PyMC-BART and ArviZ, Osvaldo highlights the importance of making Bayesian methods more intuitive and accessible for both students and practitioners. It is always a pleasure to have Osvaldo on the show, because he does so many things and he's a dear friend. You will also hear from a special surprise guest during the episode, but I am not gonna spoil the surprise for you. This is Learning Bayesian Statistics.
Episode 123, recorded October 17, 2024. Welcome to Learning Bayesian Statistics, a podcast about Bayesian inference, the methods, the projects, and the people who make it possible. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. For any info about the show, learnbayesstats.com is the place to be: show notes, becoming a corporate sponsor, unlocking Bayesian merch, supporting the show on Patreon, everything is in there. That's learnbayesstats.com.
If you're interested in one-on-one mentorship, online courses, or statistical consulting, feel free to reach out and book a call at topmate.io/alex_andorra. See you around, folks, and best Bayesian wishes to you all. And if today's discussion sparked ideas for your business, well, our team at PyMC Labs can help bring them to life. Check us out at pymc-labs.com. Hello my dear Bayesians!
Today I wanna thank Mike Longkarick and Corey Abshire for joining our Patreon at the Full Posterior tier or higher. Mike and Corey, thank you so much for your support. This truly makes the show possible; you know, I pay for editing with your contributions, and I make sure you like what you're hearing and seeing, and I can improve the show and keep investing in it and its quality. So thank you so much, and I can't wait to talk all things Bayes with you in the Slack channel.
See you there guys, I hope you will enjoy your merch, and now onto the show. Osvaldo Martin, welcome back to Learning Bayesian Statistics. Thank you very much. It's a pleasure to be here. Last time you were on the show, well... Do we have t-shirts or some gift for us? Yes, you're gonna get a bottle of wine, but when it's your 10th appearance, so... Sounds like I'm, you know, inventing the rules, but... No, I'm not, I'm not. And so yeah, last time you were here was for episode 58, March 21st, 2022.
And you came to talk with Ravin Kumar and Junpeng Lao about your new book at the time, Bayesian Modeling and Computation in Python. I guess we'll talk a bit about that again today. We'll reference it at least, because that's a really good book and I think it's still quite useful. Of course, the very first time you were here was the very first Learn Bayes Stats episode, episode 1, when we talked about lots of things and your background in open source, in Bayes, in bioinformatics.
So yeah, lots of things have happened in the meantime. So I thought it would be good to have you back on the show, because you always do so many things. So, for the listeners, maybe tell us what you're doing nowadays and what's new in your professional life since you were last on the show. Okay, let's try, because I'm in a transition period, so it's not super easy to say what I'm doing at the moment.
Well, last time I was here, I worked for CONICET, the National Scientific and Technical Research Council. I don't work there anymore; I resigned. But I'm still teaching at the university, the Universidad Nacional de San Martín, a university in Buenos Aires. They have a data science program, so I'm teaching Bayesian statistics there. And the good thing is, it's not the first course in Bayesian statistics; it's actually the second course.
That's super weird for me, and probably for many people, that you are the second person talking about Bayesian statistics to students. Like, weird, but I am. So it's very interesting when you mention something and they say: yeah, we already saw that. That's a new experience for me. I have also been doing some consulting for PyMC Labs, and some educational material for Intuitive Bayes. Probably we're going to talk about that. And of course I keep doing work on open source.
That is actually my main motivation to wake up every day. At least work-wise, my main motivation. Yeah, I was gonna say, I'm sure Romina, Arille and Bruno are gonna be happy to hear that. Yeah. So yeah, basically that's what I'm doing now. Yeah, so quite a lot of things, as listeners might have understood. Well, I'll try to touch on a lot of those topics.
One I think will be very interesting to start with is BART, Bayesian Additive Regression Trees, because that's one of the things you've developed quite a lot between the publishing of the book in 2022 and now. You've actually spearheaded a sub-package of PyMC called PyMC-BART, dedicated to Bayesian Additive Regression Trees with PyMC. Maybe, what do you think? Should we... Yeah, let's define very quickly what BART models are and why...
Maybe why you split PyMC-BART off from the main PyMC package. Okay. Bayesian Additive Regression Trees, BART, is a non-parametric model, a Bayesian model. The main idea is that you try to approximate functions by summing, by adding, trees. The number of trees is something you have to decide; usually people use something like 50, 100, 200 trees. The important thing is that a sum of those many trees is a single point, a single function.
And then you're going to have a distribution, because you're Bayesian: you're going to have a distribution over sums of trees. And that's what PyMC-BART is computing. The main motivation for PyMC-BART was not only bringing BART models to PyMC; it was also trying to make them really work for probabilistic programming languages.
If you check the literature about BART, you're going to see that most of the time people discuss BART as standalone models. So you have a package that can fit one or two variations of a BART model, and then there is a new paper describing a new variation, and sometimes there is a package for that, and so on and so on. It's very much this thing where you have one model and then an inference method tailored to that model.
So what I wanted to do with PyMC-BART was to have something completely independent, or at least try to. Essentially, you can mix BART models with whatever you want. The main thing is probably that you want to switch between likelihoods. So if you want, you can have a normal likelihood, which is the typical thing,
but you can have a gamma likelihood, or a negative binomial, whatever you need, and you can mix BART models with linear models or maybe Gaussian processes, whatever you want, in a single model. So that's the main motivation to create PyMC-BART.
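For readers who want a concrete picture, here is a minimal sketch of the kind of mix-and-match Osvaldo describes, following the pattern from the PyMC-BART examples (X and y stand in for your own data; the log link and negative binomial are just one possible choice):

```python
import numpy as np
import pymc as pm
import pymc_bart as pmb

# X: (n, p) covariate matrix, y: (n,) counts -- placeholders for real data
with pm.Model() as model:
    # Sum-of-trees prior; m is the number of trees (often 50-200)
    w = pmb.BART("w", X=X, Y=np.log1p(y), m=50)
    # Any PyMC likelihood can sit on top of the BART output
    mu = pm.Deterministic("mu", pm.math.exp(w))
    alpha = pm.Exponential("alpha", 1)
    pm.NegativeBinomial("y_obs", mu=mu, alpha=alpha, observed=y)
    idata = pm.sample()  # PyMC assigns the tree sampler to w automatically
```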
And one of the reasons to split it off was probably just a matter of organization, in a sense. Essentially it's super easy to have BART models in PyMC, because you reuse a lot of the technology, a lot of the tools that are already implemented in PyMC. The things you need to add, at a technical level, are a distribution, or something that works like a distribution — there's a BART distribution that behaves similarly to a gamma, a normal, whatever —
and then a sampler, some way to get samples from a BART model. The reason you need a specific method to get samples is that trees are discrete, or a kind of weird discrete thing, so you cannot use Hamiltonian Monte Carlo or methods like that. You need a special method, and sampling from trees can be tricky. But essentially, once you have those two elements, you can add it to PyMC and everything should work. But then you say, okay, it's not enough.
Maybe you want some extra functionality, like plots for diagnostics and plots for variable importance. That's something you can do with BART models: you can not only fit a model, obtain a posterior, and make predictions; you can also try to analyze which of the variables in your model are more important than the others, which ones are relevant, that kind of thing. So that requires extra functionality.
That kind of functionality is too specific to have in a package as general as PyMC, so it makes sense to split it. BART was first inside PyMC, I think for the book. Then we moved it to pymc-experimental, a package that has methods that are not experimental in the sense that they don't work, but experimental in the sense that they are extra functionality. Then we decided, okay, maybe we need a specific package.
So that's the current status: we have a specific package for that. And so, to make sure listeners understand well, when would you recommend using a BART model? A BART model? I would recommend a BART model when you are super lazy. I think that's the best way to use a BART model. And the reason is that there is plenty of literature showing that you get good results with BART models without putting too much thought into what you are doing. So in some sense they are competitive
with things like Gaussian processes or splines. Usually for those you need to think harder. For instance, for splines you need to think about where you are going to put the knots, things like that. For Gaussian processes, there are a lot of things you can actually tweak. For BART, there usually aren't too many options. So you just...
Define the likelihood and the prior, or, as I said, if you have something else like a linear model, you put it together, and you get something that is usually super reasonable. That doesn't mean you cannot do better; usually, if you put in a little more thought, you can. And for that case Gaussian processes are excellent, because you can mix and match different kernels and that kind of thing, which gives you more interpretability, a more custom model.
But I think BART excels when you want something relatively quick, or when you don't have a lot of domain knowledge to build a more tailored model. That's probably a very good case. And I think another one is when you want information about variable importance. There are quite a few examples in the literature where the main goal of the analysis is to understand which variables are more important.
So maybe it's not that you want predictions; you want to understand which variables are the most relevant for your problem. And for that case BART is very good. And the thing is, we have a particular method in PyMC-BART that I have not seen elsewhere, which makes variable importance a little easier to interpret. The usual approach to variable importance is essentially counting: you fit a lot of trees.
The trees are usually very shallow, so each usually incorporates one, two, three, probably no more covariates than that. So essentially you count how many times the first covariate is used, how many times the second, the third, etc., and then you plot that information as relative frequencies. That's the usual approach.
But it's not super easy to understand how to interpret that, because you get relative frequencies, so you don't know where to cut and say: this is important, this is not important. There are some heuristics out there that try to help with that; it's not that people haven't thought about it. But I think we have something much easier in PyMC-BART. Essentially, what we do is compare: we generate predictions from submodels.
So we generate predictions from the whole model, and then we prune the trees to generate submodels. Let's say we have a model with 10 covariates; that's our reference model, and we generate predictions from it. Then we simulate that we only have nine covariates and generate predictions from that, and so on, until we have only one covariate. And we plot that information, so you get a curve that saturates at some point.
And that's much easier to interpret, because it's easy to spot the smallest submodel that generates predictions closest to the reference model. Actually, that's an idea we took from other places; it's the same thing that, for instance, projpred does in R, or kulprit in Python. We differ in the way we generate the submodels and compute the predictions, but the idea of comparing to a reference model, that's the key point, I think.
So you generate predictions from a reference model and then compare the predictions of your submodels to that reference. And that's super cheap to do, because you don't need to refit: you only fit the BART model once, and afterwards we just approximate the submodels. So it's super cheap once you've fit the BART model. Nice. Yeah, thanks for that great summary. I definitely recommend people check out the chapter of your book about BART.
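A sketch of the variable-importance workflow Osvaldo describes above, using the utilities shipped with PyMC-BART (the exact call signature has moved between releases, so treat this as an approximation and check the docs):

```python
import pymc_bart as pmb

# Rank covariates by how closely pruned submodels reproduce the
# reference (full) model's predictions; recent releases split this
# into a compute step and a plot step
vi_results = pmb.compute_variable_importance(idata, w, X)
pmb.plot_variable_importance(vi_results)
```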
I'll put the link in the show notes, because you guys and your editor have been kind enough to make the online version free on the website. I'll put a link to that, and of course, if you are interested in the whole book, which I definitely recommend, I recommend buying a hard copy, because I often find myself referencing it when I forget something about BART; and the time series chapter is also very interesting, I really love it.
I always forget the nuances of time series models, because there are a lot of things to tweak and remember. And the BART chapter is also really, really well written, well explained, and very clear. So I definitely recommend that. Yeah. We now also have a paper on arXiv. It's kind of a mix: a writing style close to a typical scientific paper, but with some elements more tailored to practitioners. So we try to provide some recommendations on how to choose the number of trees.
And we explain what I just explained about variable selection and how to do it, that kind of thing. So that's also something I think is easy to read. You're talking about the chapter of the book, or is that something else? No, I mean our arXiv paper. It's a paper on arXiv. Yeah. Okay, I see. Yeah, so we definitely need to link to that too. I wasn't aware you had that. Yeah, definitely, we want that in the show notes. Feel free to add it to the document I'll share with you.
Also, one of the reasons I mention this arXiv paper is that, as you said, when we published the book, I think we had one of the earliest versions of BART in PyMC. Things have changed since then; I mean, the theory of course is the same, and the examples still work, and we have actually updated the examples in the book. So if you go to the repository, you're going to see the examples updated to one of the newest versions of PyMC. So it's still worth reading.
And we introduced a few changes in the API of PyMC-BART; we tried to make it a little easier to use, more flexible. Actually, now there is Gabriel Stechschulte; he's a core developer of Bambi and he's doing a superb job of speeding up PyMC-BART. He is rewriting PyMC-BART in Rust. We followed some of the advice and experience from nutpie, which you have already discussed on the show. And we have seen a speedup of 10 times or so. So we are super happy.
I think that's going to be ready by the end of this year, probably. Okay. December, maybe. So maybe that will be out by the time we publish this episode. Yeah, that'd be great. What should I link to, just the GitHub repo of PyMC-BART? Right now it's living in a separate repo, but it's going to be the same thing. So users should not notice anything, except that the models are going to run much faster. It should be super easy for them to just run PyMC-BART. Nice! Well, awesome!
That is really cool. Maybe we should get Gabriel on the show to talk about that, and also some Bambi stuff, because I know he does a lot of things on the Bambi side. He's doing tremendous work for Bambi. I didn't know he was doing that for PyMC-BART, but I'm happy to hear it. He's also a patron of the show, thank you. Triple thank you, Gabriel. Yeah, one of the things we have in PyMC-BART is this plot... dependence plot...
Partial dependence plot; that's the other P, I was missing one P. Yeah, partial dependence plots. That's a cool way to add a little interpretability to models. And I mention this because Gabriel has worked on something similar in Bambi,
specifically tailored for linear models, or the models that are available in Bambi. This is the interpret module; it computes things that are kind of similar. And the point is: how can I better understand my model? And the answer is: make predictions from the model and make plots. Usually that's easier than trying to understand parameters in linear models, and of course for non-parametric models like BART,
the parameters that you fit don't make much sense, because they are branches of trees, things like that. Yeah, I mean, of course I agree with that. My personal experience working on models is also, yeah, really: first make in-sample predictions and plot them, then make out-of-sample predictions and plot those. I spend most of my time writing code so that the plots look good and I can understand them, much more than on the actual model code.
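For reference, the partial dependence plot mentioned above is a single call in PyMC-BART, continuing the earlier sketch (argument names follow recent releases; verify against the docs):

```python
import pymc_bart as pmb

# Partial dependence of the fitted BART component "w" on each covariate
pmb.plot_pdp(w, X=X, Y=y)
```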
I don't know about you, but yeah, most of the work is actually handling the shape issues when you're plotting, and then handling the shape issues when you're doing out-of-sample predictions. Yeah. And that's one of the things Bambi tries to simplify a lot, because its interpret module has some numerical summaries and also plots. And usually the plots are super useful just out of the box.
Yeah. Yeah, I mean, I use Bambi all the time, especially when I'm developing a model. Then, when I know the final structure of the model, or when I add some more complicated structure to it, especially time series, like structural time series, for instance, or Gaussian random walks, those don't exist in Bambi, at least not yet. So then you have to go back to PyMC and write the full model.
For the iterative process of just, you know, trying a bunch of stuff, it's really awesome to have Bambi, because it's really easy to iterate on the covariates. And then you have those plots, as you were saying, which just work out of the box. And the best part is really the out-of-sample predictions, where you don't have to handle any of the shape issues. That's absolutely fantastic. I don't think you can do BART with Bambi yet. No, but that's one idea.
That's something that we would like to have, because I think it fits very well into the Bambi ethos. As I said, many times you want BART as something quick to explore a model, something like a baseline. Sometimes it's enough; I mean, people actually use BART as their final model, there are a lot of papers using just BART. But even if you don't end up entirely happy with the result, you can then iterate and create something more flexible and more tailored.
And I think that fits quite well into Bambi's ethos. Yeah, definitely. I mean, Gabriel, you're the right person to do that, working on BART, working on Bambi. I'm almost disappointed you haven't done it yet, Gabriel. What are you doing? No, yeah, I completely agree. That'd be fantastic. Actually, I'd be super happy to give a hand on that, because I think that'd be super fun. Also adding time series and state space methods and so on. That would be fantastic.
We're gonna have Jesse Grabowski on the show in a few weeks; he's been spearheading a whole submodule in PyMC Experimental doing state space methods. And yeah, I think it'd be great if Bambi could plug into that state space submodule, and then you'd have state space methods in Bambi. That'd be awesome. So, since we're talking about Bambi and BART, I thought, you know what, we should bring in someone who actually knows the nitty-gritty of Bambi. What do you say?
Do you already have Tomás? Yeah, but I mean right now, during the show, during the conversation. Yeah, yeah. Okay. Like, yes, you know... you probably know Tomás, Tommy Capretto. He was on the show a few episodes ago; I'll put that episode in the show notes. But yeah, last time you were on the show, for the first episode, Tommy was actually in the office, like, just... listening to you? Yeah. Just eavesdropping.
So, you know, like I... Then I think we should just... just invite him. What do you say? Well, look at that! He didn't even ask for permission! Why are you laughing? What's happening here? Hi Tommy! Yeah, thanks for the... Yeah, yeah, I'm not at home, I'm on a motorcycle. Yeah, so Tommy, now... So you don't have to eavesdrop anymore? You can just join the episode without even asking. That's awesome. Kidding aside, thanks for joining, Tommy. It's great to have you both here.
I think you had a few questions for Osvaldo. Yeah. So, to be honest, I don't know what you've been talking about; I didn't have the chance to listen this time, but I can imagine some of the topics. Yeah, we were talking about Bambi, and Osvaldo was saying it was not that good. I was saying, yeah, it's getting better; now it's going to become usable. Yeah, yeah, yeah. Some people find it useful. I don't know, Osvaldo, if you talked about the teaching activities that you have this semester.
Have you talked about that? No, I briefly mentioned that I'm working at UNSAM... And I briefly mentioned that it's actually the second course in Bayesian statistics, so I'm super surprised that I'm not the first person talking to the students about Bayesian stats, or Bayes factors, or, I don't know, sampling from a posterior, that kind of thing. Okay. So, since you mentioned that, this is the second course.
Do you already have in mind what the third course would be? Or is it so specific that it's more like pick your own adventure and go deep into that? I don't know. I think there are many options. One course that could be a third course is something more based on solving problems.
That could be an option: like a workshop where you work on problems, discuss them, and students share their findings and improvements, that kind of thing, so that you can put into practice a lot of the ideas of the workflow, of diagnostics, of going back and forth, and put things in context. That was something I tried to do in this second course, but it's usually kind of difficult when the examples are simple, I don't know.
This is difficult because usually when you teach, things tend to be very linear, and when you work on a model, things are not linear at all. So working on that difference is something we could try. Another option could be... there are still plenty of topics to discuss, like survival models, or putting more emphasis on linear models. The students in this second course don't really know much about linear models.
So we briefly talk about them and discuss a few things about doing linear models using Bayesian statistics. But linear models are so rich, so vast, that you could just have a course that says: okay, we are going to discuss linear models again, more in practice, with the things we were discussing before, like making predictions from linear models so you can understand them, that kind of thing. I don't know.
So there are many things we could do for a third course. I really like that you mentioned that case-study-based approach: like, okay, this semester we're going to work on these four problems. I think that approach would be a very good opportunity to invest time in prior elicitation. I don't know if you talked about prior elicitation before.
I know you have been working a lot on PreliZ, which is a great tool that's been around for some time, and I see that you are constantly adding things to it. And I think that we as a community are not using it enough. At least myself, in my workflow, I end up reinventing many of the things that are already implemented in PreliZ. Yeah. What analysis do you have of the situation regarding tools for prior elicitation?
Because I think it's something that's mentioned a lot, but I know PreliZ, and I don't know many other tools, or people advocating for prior elicitation tools. I think it's also worth introducing PreliZ and talking about where it's at, where you'd like to take it, things like that. Yeah, sure.
The first thing is that I'm a co-author of a paper with many other people, from Aalto University, the University of Helsinki, and other universities, in which we discuss prior elicitation. And this paper asks: okay, where are we with prior elicitation? What do we need to do? What are the approaches and tools out there? And one of the sections discusses prior elicitation in the framework of the Bayesian workflow.
So, do you always want to do prior elicitation? Is it something you always want to do at the beginning, that kind of thing? And of course the answer is no: sometimes you don't do prior elicitation at the beginning, sometimes you do it after a while. And sometimes you want to spend a lot of time doing prior elicitation, and sometimes you just want default priors, like the priors in Bambi or whatever. So there is a section discussing that.
There is also a section about prior elicitation software. That's something that is lacking, and PreliZ is actually an answer to that; in a sense, it's an answer to that paper. We say there: okay, we don't have enough tools. The tools that we have are very scattered, in the sense that they are not in the same place, so it's not easy to discover them.
And many of these tools are not integrated, or not integrated well, with probabilistic programming languages. Maybe a tool is just, I don't know, a webpage where you can do some prior elicitation, but then you get something and you have to manually move numbers into PyMC or PyStan or whatever. So that's one of the goals of PreliZ: trying to answer all these issues. Essentially, I started working on PreliZ when I was working at Aalto University and we were writing the paper.
So it has been a while. I'm still not super happy with PreliZ, so maybe I should mention what we have in PreliZ now and what we don't have. Essentially, PreliZ is a package for prior elicitation, or for distribution elicitation if you want; sometimes you also want to elicit likelihoods, not only priors. It's a sister project of ArviZ.
That's the reason it has a similar logo and a similar name, and it actually lives inside the ArviZ-devs organization on GitHub. And the focus, as I said, is distribution elicitation. So one of the things that PreliZ provides is a collection of distributions, like the distributions you may find in PyMC or in SciPy. Compared to SciPy, one difference is that the parameterization is a little more... statistically inclined.
Usually the names of the parameters are the same names you're going to find in textbooks and in probabilistic programming languages. And unlike SciPy, we support alternative parameterizations: for the gamma you have alpha and beta, but you also have mu and sigma; the same goes for the beta distribution, etc. And then, on top of the distributions, we make it easy to define a distribution and to plot it.
So we have functions to plot the CDF, the PDF, and also interactive functions: you can just call the distribution and move sliders, so you see what happens if you increase this parameter or decrease this other parameter, how the distribution changes. So there are many functionalities at the level of the distribution.
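A taste of that distribution-level functionality, as a short sketch using PreliZ's documented API (the interactive part is meant for a Jupyter notebook):

```python
import preliz as pz

# Textbook-style parameterizations, with alternatives where they exist
dist = pz.Gamma(mu=2, sigma=1)  # instead of SciPy's shape/scale
dist.plot_pdf()                 # static PDF plot
dist.plot_cdf()                 # and the CDF

# Interactive exploration with sliders for each parameter
pz.Gamma(mu=2, sigma=1).plot_interactive()
```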
And that, I think, is already something super useful, because, as we were saying, when you teach statistics, people are usually familiar with distributions like the normal, maybe the gamma or the binomial. And then you mention the beta distribution, or something else, and they say: so this is how you use this distribution; let's play a little bit. And they get familiar with the distribution. So now we are also adding documentation specifically tailored to distributions.
So you have a gallery of distributions, and you can go there and see a short description of the properties of each distribution, how the distribution is used, that kind of thing. All these things are super simple in a sense, but I think they are already useful. And as you say, if you don't have these things, you have to invent them yourself, you have to write them yourself, because you use them in practice. So it's nice that someone has already done that for you.
And then, because we have the distributions, we try to build on top of them. For instance, we have methods that can modify a distribution. We have one method that is called maxent, for maximum entropy.
Essentially, it's a function: you pass a distribution, whatever distribution you want, and then you say, okay, I want this distribution, but bounded so that 90% or 80% of the mass is between this value and this value. And the thing is, if you only ask for that, for many distributions you can in principle have infinitely many distributions as answers.
So we add this extra restriction: I don't want just any distribution, I want the maximum entropy distribution, the most spread-out distribution that satisfies these constraints. Computing that is, I think, a simple idea, an old idea, but it's very useful to have in practice. After the distributions themselves, it's probably the function I use most from PreliZ, all the time.
And there is also a bit more flexibility, because you can fix parameters if you want. Say you have a Student's t distribution and you want to fix nu at seven: you can fix nu at seven and then do the rest with maximum entropy, or fix whatever parameter you want. Some parameters make sense to fix, others don't make too much sense, but you can do it. Usually it's not going to complain; it's just going to give you the distribution that satisfies the restrictions.
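In code, the maxent call he describes looks roughly like this (a sketch; nu is fixed by the user and the remaining Student's t parameters are found by maximum entropy):

```python
import preliz as pz

# The most spread-out Student's t with nu fixed at 7 and
# 90% of its mass between -2 and 2
dist = pz.StudentT(nu=7)
pz.maxent(dist, lower=-2, upper=2, mass=0.9)  # fills in mu and sigma
```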
Then we have some more functionality. One method that is commonly used in practice is called the roulette method, roulette like in a casino. The name comes from the analogy that you have a certain amount of chips you want to bet, in a sense: where do you want to place the chips?
And of course you want to place the chips where you think the distribution is most likely to be, something like that. So that's one way to see it, and that explains the name. But if you look at this functionality, you'll see that what you are doing is essentially drawing a histogram. Essentially you have a grid, okay?
On this grid you can activate cells, and then you get something that looks like a histogram. Then you can pick from a pool of distributions and say: okay, if my distribution looks like this histogram, which distribution fits best? You can consider all the distributions in PreliZ, or select just one or two, whatever; it's very flexible. And that's it: you get the distribution and you can stop there.
And that's something that's very common in the prior elicitation literature; that method is mentioned many times. There are actual papers and tutorials with protocols for how to use that method when you have many experts. And that's something you can do in PreliZ. So, for instance, the three of us can go and use that elicitation method.
We can then collect our three elicited distributions, put them together into a single distribution, and add weights. We say: okay, Tomás is probably an expert in this field, so I want to give him more weight; Alex, I don't really know anything, so he gets a smaller weight, something like that. And you can do that. So that's also something super useful.
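A sketch of that roulette workflow (run in a Jupyter notebook; the pooling helper named in the comment is an assumption on my part, so verify it against the PreliZ docs):

```python
import preliz as pz

# Draw a histogram by placing "chips" on a grid; the widget then fits
# candidate distributions to whatever you draw
pz.Roulette(x_min=0, x_max=10)

# Several experts' grids can then be pooled into one distribution with
# weights, e.g. via something like
#   pz.combine_roulette(responses, weights=[0.5, 0.3, 0.2])
# (function name assumed; check the PreliZ documentation)
```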
So there is some functionality, but the thing that is missing at this point is what is called predictive elicitation. All the things I have been talking about are elicitation at the parameter level: you know you have something like a slope, something in your model, and it's usually one-dimensional. There are methods for the multi-dimensional case, but anyway, the point is that you're working on one part of your model at a time.
So if you have a lot of parameters in your model, you need to go one by one, and that's super annoying. It's still useful, because sometimes, as I said, you start with default priors and there are only one or two priors you want to pay attention to; for those cases these tools are super useful. But sometimes you want something more automatic. So one idea is: why don't we make predictions from the prior predictive distribution?
If you have predictions from the prior predictive distribution, then you can see what your model is trying to do. So that's super useful, right? There's some functionality in PreliZ to do that at this point. For instance, you can pass a Bambi model, or a PyMC model, or a PreliZ model, because you can kind of define models using just PreliZ.
You get samples from the prior predictive distribution, and you get boxes or sliders so you can tweak the values and see how the prior predictive distribution changes. Again, it's a simple idea that tries to ease this iterative process where you have a model, you generate predictions, you plot, you go back to the model, you make predictions, you do another plot. It's trying to speed up that process, to make it a little more interactive.
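A sketch of that interactive prior predictive exploration: you write a generative function with PreliZ distributions, and its arguments become sliders (the toy model and its names are illustrative):

```python
import preliz as pz

# A toy generative model; the function arguments become sliders
def toy_model(mu_loc=0, mu_scale=10, sigma_scale=1):
    mu = pz.Normal(mu_loc, mu_scale).rvs()
    sigma = pz.HalfNormal(sigma_scale).rvs()
    return pz.Normal(mu, sigma).rvs(200)

# Sliders plus a live prior predictive plot, inside a Jupyter notebook
pz.predictive_explorer(toy_model)
```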
So that's something that is already there. It should work for many models; I'm not sure it will work with anything you throw at it. The idea is that it should work for anything, but probably it won't, you know, we have to test it better. But the functionality is there. And then there are a couple of functions that try to be more automatic, in the sense that you provide the model, a model as you defined it, with some default priors.
And then, in one of these functions, you also provide a target distribution. That target distribution is supposed to be what you think the prior predictive distribution should look like, and the function tries to make the two as close as possible. This is the most experimental method in PreliZ at the moment, because essentially that's kind of an inference problem, right? I mean, you are not trying to reinvent Bayesian statistics;
you want to do it in a way that is super cheap and super fast, that kind of thing. So there's some functionality for that. It works for models that are super simple, and it provides something that makes sense. And it's also kind of experimental in that, if you provide a model with, I don't know, a normal and a half-normal prior, for instance,
maybe it returns a normal and a gamma, because it decides it's better to have a gamma as a prior and not a half-normal, because the target is super shifted, something like that, so the half-normal is not a good fit. That's something that is already there. Again, it's super experimental; I play with it from time to time, I'm not sure I would use it, and I'm not sure it will work for very complex models, but it has worked. At this point it has actually worked for some models that are not that simple, like hierarchical models, that kind of thing.
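The experimental function Osvaldo alludes to here is ppe (prior predictive elicitation); a hedged sketch of how it is meant to be used, with a toy model (the API is explicitly experimental and may have changed, so treat every name here as an assumption):

```python
import pymc as pm
import preliz as pz

# A toy PyMC model with default-ish priors
with pm.Model() as model:
    mu = pm.Normal("mu", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    pm.Normal("y", mu, sigma, observed=[0.0] * 100)

# Ask PreliZ for priors whose prior predictive distribution
# approaches the stated target (return value and signature may differ)
new_priors = pz.ppe(model, pz.Normal(174, 20))
```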
And then there is another method for prior elicitation that I think is not very good in practice, but could be good for teaching. I'm still not sure whether it's actually useful or not; that has yet to be decided. It's a method where you just provide a model,
and it's going to generate samples from the prior predictive distribution. You see some of these samples in a 3x3 grid, and you click on the ones that look like what you think the predictive distribution should look like. When you select them, it adds those samples together.
So you have another plot below where you can see what happens when you collect all those individual samples, and you keep clicking, and every time you click it is, in a sense, trying to learn. It's super silly inside; it's not like I'm using generative AI, nothing like that, no neural networks, just comparing distributions. Essentially, the method learns that you like distributions with some shape, some mean, some variance.
So it tries to show you more samples like those, and internally it also automatically adds samples that you didn't select manually but are super close to the ones you selected, to enrich the population. Anyway, it's the same idea: we're trying to match what you have in your mind, as a target distribution, to what you are seeing as the prior predictive distribution. The other method was automatic; this method uses your brain as the filter.
I'm not sure it's super useful, but at least it's fun. It could be fun to show it to students, and they can play and try to match distributions. So at least for teaching it should be useful.
And of course, because you are generating distributions, you get a sense of whether your samples are too far off from what you have in mind, or close, or too narrow, or too wide, that kind of thing. But you also get to interact and play a little with your distribution, and maybe waste a lot of time clicking with your mouse instead of doing actual work. But anyway. So, okay, that's PreliZ. Yeah, damn, thanks a lot. That was quite a performance.
Like a one-man show about PreliZ, ladies and gentlemen. You should drink some water. I'm surprised you don't have any mate with you; I can see that Tommy has, of course, his mate. No, I usually only drink mate in the mornings. Okay. Or maybe if I have to drive... For anybody who hasn't tried mate, I definitely encourage trying it. Of course, it's better to do it for the first time under supervision, so go to Argentina or Uruguay and have a proper mate. Thanks a lot, Osvaldo.
I think Tommy has another question for you; I'll give him the floor. Yeah, I just want to say that, of course, I'll put the link to the PreliZ package in the show notes, and again, I definitely encourage people to check it out, because I use it all the time in my modeling workflow. What I really love is that it marries very well with PyMC: you can do the prior elicitation and then just ask PreliZ to output the result as a PyMC distribution, directly in your PyMC model.
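A sketch of that hand-off; the to_pymc conversion helper is my assumption about the current PreliZ API, so double-check the docs for the exact name:

```python
import pymc as pm
import preliz as pz

# Elicit a prior with PreliZ...
prior_sigma = pz.Gamma()
pz.maxent(prior_sigma, lower=0, upper=5, mass=0.9)

with pm.Model() as model:
    # ...and drop it straight into a PyMC model
    # (to_pymc is assumed here; verify against the PreliZ docs)
    sigma = prior_sigma.to_pymc("sigma")
    mu = pm.Normal("mu", 0, 1)
    pm.Normal("y", mu, sigma, observed=data)  # data: your observations
```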
So that makes the workflow way more transparent, and super reproducible. Oh yeah. And as you heard Osvaldo say, there are so many things you can check out here, so very probably there is something you need; go check it out. And also, you know, ask Osvaldo if he needs any help on the repo. Yes! If you want to provide help, I'm super happy; the answer is always yes. A very short final question, which is something I'm curious about.
It's a question I thought I knew the answer to, but I realized my answer was extremely silly. So, you said PreliZ is a sister project of ArviZ. And the name PreliZ makes total sense to me, because it's prior elicitation: you glue both words together and you get PreliZ. But what's the origin of ArviZ? What does it mean? It's RV like in random variates. You know, when you write random variates, you write RV, and then an S: RVs.
Yeah. We said that with an S, and then we said, okay, we could just use a Z, because it sounds like viz, like visualization. Okay, do you want to know what my internal answer used to be? Of course. So: this is a visualization library, and it was created by a guy in Argentina who's very patriotic. So, ArviZ, of course. No, it didn't make sense. Very, very Argentinian, a very Argentinian hypothesis from you. Yeah, yeah, yeah. There was a time in Argentina when everything was Ar-something.
You remember? Yeah, yeah, yeah. That's why I supposed that could be the reason. Yeah, I was wrong. Like you were trying to promote it as an Argentinian package or whatever. It had to be that. And I learned only after naming it that it could be read that way, because I think Colin made that assumption, I don't remember, I think it was Colin Carroll. Like, okay, it's because you are Argentinian or something. But no, no. Good, thank you.
Yeah, Tommy, feel free, I don't know if you have other things, but feel free to stay and ask other questions. Otherwise, if you need to go, of course you can, but thank you so much for dropping in and making that surprise appearance for Osvaldo. Actually, talking about ArviZ, I think you guys are working on ArviZ 1.0, Osvaldo, right? Yes, that's right. Is it something you can share with the world? No, it's top secret. I mean, that's the main... LBS is kind of like the CNN of the Bayesian world,
you know, it's like... this is classified information. No, actually, yes, we are completely rewriting ArviZ. Completely rewriting, actually... One of the things is that we are splitting it into three sub-packages. And again, when this gets released, users are probably not going to notice, unless they want to install them separately; you will be able to call all the functionality from a single place.
But from time to time we get people who want, for instance, to compute, I don't know, R-hats, but don't want to do any plots. So we are splitting it so that for those people it's going to be easier to install ArviZ without needing to install plotting libraries. That makes a lot of sense when you're working with clusters; universities are the typical scenario where you're working on a cluster.
The person in charge of the cluster says: okay, install only the things that you really, really need. So we are splitting that functionality. Another thing we're working on is making plots much easier to customize. We're still going to have these batteries-included plots where you just call something like plot_ppc and you get something and don't need to do anything else. But it's going to be much, much easier, once you call the plot, to tweak it.
So we have a thing that is reminiscent of the grammar of graphics, in a sense. It's not exactly that, but, Tomás, this is something Oriol proposed, and I think the first time you saw it, you said: okay, this looks very close to the grammar of graphics. So there's some inspiration there. Essentially, you're going to be able to call a plot and then tweak it quite a lot; you are even going to
be allowed to do completely nonsensical things, super easily. And probably unexpected things that are useful, too. And for us, it's going to be much, much simpler to add new functionality. Actually, this past week I was working on some functionality that is not available in current ArviZ: prior sensitivity and likelihood sensitivity checks. Essentially, I added a few things to compute the statistics and so on.
When I went to create the plot, I started thinking: okay, I need to do all this work, super complex things. And then I realized, no, I just need to manipulate the data a little, call a function we already have, and voila, I have a plot with a lot of functionality, and I didn't have to write anything. Even I was surprised; Oriol was surprised; Andy, who is also working on it, was surprised.
It was so easy to create something entirely new. Another thing that's probably going to be very useful for people, if you want to do something much fancier, is that we provide new objects, something called a PlotCollection. In this new ArviZ, everything is an xarray Dataset or DataTree, that kind of structure; it's built around taking advantage of the properties of xarray data structures.
So this PlotCollection is going to allow you to build your own kinds of plots. I hope that will be useful also for researchers, or people who want to push boundaries: not reuse some existing plot, but create something new. I think that's going to be much easier to do with this interface. So that's something we are quite excited about. And of course we are adding new methods that were not available before.
As I said, these prior sensitivity checks, and other methods that are not available in current ArviZ; we are starting to move or add those methods to ArviZ 1.0. We want it to be ready as soon as possible; actually, it is already usable, you can go and check it out. Probably we should add the address of the repository to the show notes. And you will see that you already have things like plot_forest. Actually, I love the plot_forest in the new ArviZ. It looks super amazing, super clean. I don't know.
I just want to look at it; it's like having a drawing or a good picture, it just looks nice. We already have a lot of functionality; everything is there,
and some functionality is going to be only in the new ArviZ and not the old one. But again, I think in a couple of months, once this episode gets published, it's going to be much, much more usable. Ah, one thing that is super nice is that internally we have a better way of handling different backends. In current ArviZ we support matplotlib and Bokeh, and in ArviZ 1.0 we also have Plotly; we have matplotlib, Bokeh, and Plotly.
Again, it was a pleasant surprise when Oriol added Plotly, because it was relatively easy work and suddenly we had a completely new backend working almost perfectly out of the box. There are still some things we need to polish, but it was crazy, even for us, that you can just add a couple of things and get something that works. At this point it probably works better than Bokeh. It's super nice. Yeah. Damn, that's super exciting. For sure.
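For the curious, the split Osvaldo describes ships as three packages, arviz-base, arviz-stats, and arviz-plots; a hedged sketch of the new plotting layer with a pluggable backend (check the ArviZ 1.0 docs for the current call signatures):

```python
# pip install arviz-stats   -> diagnostics only, no plotting dependencies
# pip install arviz-plots   -> the batteries-included plots
import arviz_plots as azp

# Same one-call convenience as before, but the backend is pluggable
azp.plot_forest(idata, backend="plotly")  # or "matplotlib", "bokeh"
```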
Let's put that in the show notes, the link to ArviZ 1.0. That sounds amazing. Yeah, I can't wait to see the forest plot, because I use forest plots all the time with ArviZ. It's one of the best plots, especially when you have models with a lot of dimensions, because you get a lot of information in just one plot.
Something I always have difficulty with is diagnosing and then visualizing models where you have one or several dimensions which are huge, you know. For instance, for my main job with the Marlins, my models have a ton of players in them. If you have a model with 10k players, it's a challenge to visualize a trace plot or even a forest plot. Yeah, for this I'm still trying to find my go-tos. But forest plots are definitely one of them. Love it.
Yeah, definitely. I'm still a fan of KDEs, because I like how they encode information, but I'm becoming more and more enthusiastic about plain point intervals, as in plot_forest: just a point and some interval, and that's it. Because, as you say, when you have a lot of things, it's much easier to compare. And you don't have to worry about things like the bandwidth you're using. There are many things that make them useful.
Of course, each tool has its use, but I think plot_forest, and in general the use of point intervals, is super useful. Yeah, definitely. And also, if at some point you find good plots and visualizations for models with huge dimensions, please let me know. Something I want to talk about with you too is Intuitive Bayes, because you've done some work over there, mainly writing some really good educational content.
Can you briefly talk about that? And we should definitely put that in the show notes. Yeah. The first thing I did for Intuitive Bayes was Practical MCMC. Essentially, it's a course for people who... I mean, you don't want to become an expert in sampling methods. Actually, the promise of probabilistic programming languages is that you don't need to worry about the sampling; you only need to worry about defining your models.
But of course, in practice it's usually a good idea to at least have a conceptual understanding of what MCMC methods are doing, because with that conceptual understanding it's easier to diagnose problems and to know what you can do to fix them when you have things like bad trace plots or divergences. So essentially that's what we do in Practical MCMC; it's very conceptual, and we try to use a lot of animations.
And that was one of the things I was most excited about, because I'm used to teaching by writing books or tutorials, things that are static. And when I teach live, I try to have something that looks closer to an animation, but usually very simple, rudimentary ones. With Practical MCMC, we were able to do something that looks a little fancier, a little more entertaining.
So we spent quite some time on that, making those animations actually useful, so students can learn the concepts in a faster and easier way. Yeah, essentially it's a course for people who want to do things in practice and want to get some understanding of MCMC methods and how to deal with them in practice.
And then we wrote a workflow guide, which was also something interesting, because, as I said earlier, talking about the workflow in a practical way is sometimes not that easy. It's kind of challenging because of the nonlinear nature of the workflow, all these branches and different things you can do in actual work. Anyway, we tried to provide a short guide. This one is a PDF; the other project was videos, this one is a PDF.
It's a kind of short guide that provides the general ideas of the workflow. At one point we have a kind of description of a linear workflow; we clarify, okay, usually things are not linear, but we're going to put one step below the other, and we try to show some of the things you usually want to pay attention to at each step.
Then we provide an actual example where we go and iterate: we play with the prior predictive distribution, do some prior elicitation, then check the model, then decide whether we want to go back or not and repeat something, that kind of thing. And actually we do find issues and go back.
As I said, it's a short guide, and it was kind of a proof of concept: we wanted to see if we were able to write these relatively short booklets for people who already know what they don't know, who already know what they need to improve. It's short because we assume you already know what modeling is, you already know what samplers are, you already know your tools, but you need this extra step: how do I proceed in actual modeling?
And we also have some work in progress on prior elicitation. We want to create a booklet and/or, we're still trying to decide, video lessons for prior elicitation. Again, prior elicitation in the context of the Bayesian workflow and probabilistic programming: when do you want to do prior elicitation? How much time should you spend on it? What kinds of priors can you think of? It's probably not a surprise to the audience
that we really like weakly informative priors. So we tried to center the discussion around: what are weakly informative priors? We tried to provide a practical definition of that, so you can aim at working with weakly informative priors.
So yeah, I have been super happy working with Intuitive Bayes, as I said, because in all the projects we have been trying to do something new. I already have some experience teaching Bayesian statistics to people, but in each project we try to do something new in that sense: something new in how we are going to teach a concept that I have probably taught many times before.
How to approach it in a different way, or how to collect all the things you know you tried in the past that didn't work and do them differently. So it has been a super fun project. Yeah. And I definitely recommend listeners check those out, all of them. I've already put your Practical MCMC course in the show notes. And please add, well, at least the first booklet, I know it's out; can you add the link to the show notes?
The second one, as you were saying, I think you're still working on. But maybe by the time this episode is out, we'll add it, and I definitely recommend people check that out, because like a lot of your work, I'm guessing it's going to be very to the point, with enough technical detail but not too much. That's definitely the kind of thing I like in your work, and in your writing in particular.
You give the readers enough technical detail to make them understand, but at the same time you're not drowning them in a sea of technical details that are not necessary, at least to start with these kinds of methods and actually use them in the wild. So well done on that; I love it. Thank you. And on that note, by the way, as always, there is a lot of humor in your writing, because you're quite a funny guy, I have to say to the listeners in case they didn't notice.
And actually, if people want to read more of you, you have your book, right? That was the first book I read when I started learning Bayesian statistics. Now you have... your first book... The title is Introduction to Bayesian Statistics or something like that? No, Bayesian Analysis with Python. Yes, Bayesian Analysis with Python, with Packt. And that's the third edition, right? Yes, exactly. Damn, congratulations.
I know how hard it is to write one book, so three... I don't know. And thanks to you, actually, you managed to have your editor give some goodies to the listeners. So we're going to have a handful of ebooks to distribute to the patrons of the show. We're going to do a random draw in the Learn Bayes Stats Slack, and the winners are going to hear, I'm guessing, from Packt. How many books are we able to give away this time? I'm not sure, really.
Okay, so we'll get back to you on that. And for the rest of you, there's a discount code that we'll also put in the show notes when the episode comes out, so you can buy Osvaldo's book at a discount. And again, we recommend anything Osvaldo writes, because it's usually really well done. So thanks again, Osvaldo, for setting that up with Packt. Okay, thank you. So we're already at one hour, more than one hour, so I don't want to take too much of your time.
So I'll ask you again the last two questions I ask every guest at the end of the show, because, you know, that's cool, you get another stab at that. But first, a personal curiosity I have. I talked to you about that a bit, but I thought it would be interesting to talk about it on the show: BART for time series. Is that possible? How is that possible? Is it just something where, you know, you just add a BART prior to, I don't know, a model you're already doing, kind of a linear regression on which you have a temporal element, like a Gaussian random walk? Can you also add a BART to that? How does that work? Is that possible?
Yeah, you can do that. There are probably one or two examples I've seen from Juan Orduz, because he has worked a lot with time series. I personally have not worked a lot with time series, so that's something that is usually off my radar, but yeah, in principle you can do it.
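For listeners who want to see what that could look like in practice, here is a minimal sketch of the idea: a BART term plus a Gaussian random walk added together in a single PyMC model. The data and every parameter value below are made up for illustration; this is not an example from the episode.

```python
import numpy as np
import pymc as pm
import pymc_bart as pmb

# Toy data: a nonlinear covariate effect plus noise (illustrative only)
rng = np.random.default_rng(0)
T = 100
X = np.linspace(0, 1, T)[:, None]
y = np.sin(6 * X[:, 0]) + rng.normal(0, 0.2, T)

with pm.Model() as model:
    # BART handles the (possibly nonlinear) covariate effect
    mu_bart = pmb.BART("mu_bart", X=X, Y=y, m=50)
    # A Gaussian random walk adds an explicit temporal component
    trend = pm.GaussianRandomWalk(
        "trend", sigma=0.1, init_dist=pm.Normal.dist(0, 0.1), shape=T
    )
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("obs", mu=mu_bart + trend, sigma=sigma, observed=y)
    idata = pm.sample()  # PyMC assigns the PGBART sampler to the trees
```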
As I say, probably one difference with Gaussian processes, when you work with Gaussian processes for time series, or other methods for time series, is that you kind of construct your time series by adding different terms that encode different kinds of information at different levels. And in that sense, BART is a little bit more dumb. But it's something that I have been thinking about: how to add a little bit more structure to BART models.
And there's actually some literature about that, for instance, telling your BART model that one or more variables are always increasing or always decreasing, or whatever. So I think there is something to work on in how to encode more prior information into BART, and also how to combine BART models. Like, maybe you can combine different BART models together and try to build a more complex function that way.
I think that's something that is missing from the literature. And I think that's partially because of this idea of having BART models as standalone models. The moment you move away from that and you say, no, a BART model is not a model, it's a stochastic process, or a distribution if you want, then you start thinking, okay, maybe I can do much more general stuff and much more complex stuff. So, I don't know, I'm recruiting people. Actually, my hope is that people using PyMC-BART try to do that for me.
So I don't have to do it; they do it with PyMC-BART, they show me how they did it, and so I can learn from them. That's my ultimate hope with PyMC-BART. Yeah, I can guess that. Okay, so that's interesting. Basically, you're saying that, yeah, BART is more black-boxy in a way. So it's kind of like you use a BART or you don't, but it's not something that's modular, in the sense that a component of your linear regression, for instance, could be a BART.
And in addition to that, you have what you were saying for Gaussian processes, for instance, where you can model a trend, you can model the seasonality, you can model short-term variations, and each of these three components you could do with a Gaussian process. Or you can also have structural time series, as we'll talk about with Jesse Grabowski, where you model your trend, you model the seasonality, each with two different, well, not models, but let's say methods, in the same model. And you can also use something like an ARIMA to then model the rest, the residuals, once you've taken the trend and seasonality into account. What you're saying is that you cannot really do that with BART.
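As a rough sketch of the additive decomposition described here: in a GP setting you can literally add kernels, one per component. The period, lengthscale priors, and data below are purely illustrative assumptions, not values from the episode.

```python
import numpy as np
import pymc as pm

# Toy monthly series: slow trend + yearly seasonality + noise (illustrative)
rng = np.random.default_rng(1)
t = np.arange(120, dtype=float)[:, None]
y = 0.02 * t[:, 0] + np.sin(2 * np.pi * t[:, 0] / 12) + rng.normal(0, 0.3, 120)

with pm.Model() as ts_model:
    # Long-lengthscale kernel for the slow trend
    ell_trend = pm.HalfNormal("ell_trend", 50.0)
    cov_trend = pm.gp.cov.ExpQuad(1, ls=ell_trend)
    # Periodic kernel for the 12-month seasonality
    ell_seas = pm.HalfNormal("ell_seas", 2.0)
    cov_seas = pm.gp.cov.Periodic(1, period=12, ls=ell_seas)
    # Kernels add, so both components live in a single GP prior;
    # the observation noise then absorbs the residuals
    gp = pm.gp.Marginal(cov_func=cov_trend + cov_seas)
    sigma = pm.HalfNormal("sigma", 1.0)
    gp.marginal_likelihood("y_obs", X=t, y=y, sigma=sigma)
```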
What I'm saying is that it's more difficult to give priors to BART, in the sense that the priors are super general: you pick the number of trees, and then you say there is some probability of how deep the trees can be. But it's not that easy to encode a lot of information into the priors, like you can do, for instance, with kernels in Gaussian processes. That's what I'm saying. But I think it's not something that is intrinsic to BART; it's more about the methods that are available.
And probably I'm saying this and maybe some listeners will say, hold on, but I know about a model where you have blah blah blah, because there are a lot of things out there. For instance, there are BART models where each leaf node returns a Gaussian process. Okay? So the result of the model is a sum of Gaussian processes. Super complex, and you can do a lot of things with that.
What I'm saying is that at this point PyMC-BART is much more flexible than other methods, in the sense that you can use it as part of a model. For instance, I know Chris Fonnesbeck, from PyMC, has used models that include BART and GP components in a single thing. So you can do that. What I'm saying is that we need a little bit more work to be able to encode more prior information directly into BART, so we can restrict it.
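To make concrete how general those priors currently are, here is a sketch of the knobs PyMC-BART exposes: the number of trees and the depth-controlling parameters. The data and the specific values are again just placeholders.

```python
import numpy as np
import pymc as pm
import pymc_bart as pmb

# Placeholder data, only so the sketch runs end to end
rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
y = X[:, 0] ** 2 + rng.normal(0, 0.5, 200)

with pm.Model() as bart_model:
    # m sets the number of trees; alpha and beta shape the prior
    # probability that a node keeps splitting as depth grows,
    # which is roughly all the prior information you currently give
    mu = pmb.BART("mu", X=X, Y=y, m=20, alpha=0.95, beta=2.0)
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("obs", mu=mu, sigma=sigma, observed=y)
```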
The thing is that BART models are super flexible, and when you have something that's super flexible, it's always a good idea to be able to say, okay, I want to restrict you to be more of this type or that type, because that's a way to encode prior knowledge. Right. Okay. Yeah, I get that. Super fun. I mean, I can't wait to play with all these different methods we just talked about throughout the episode.
I'm already playing with all of them, but, you know, always learning. That's really cool. And yeah, it's part of the job, right? Sometimes it's a bit like, oh my God, I don't know anything. And most of the time it's like, oh, okay, now I understand that thing, but now I have to go learn the next step of that method, or learn that new method to actually combine it with another one I know about.
Yeah, that's really the challenge and also the joy of that line of work, let's say. So I put in the show notes the blog post from Juan Orduz that you talked about. I think it's the one looking at a cohort retention analysis with BART, so it sounds like it's a time series with BART. I'll definitely give it a read, because that sounds super interesting.
And since you mentioned GPs, I also put in the show notes two new, well, newish, they're a few weeks old now, tutorials that Bill Engels and myself wrote about HSGPs, so Hilbert space approximations of Gaussian processes, which I definitely recommend people interested in GPs look at. Because honestly, HSGPs, if you're doing one or two dimensions, sometimes three, but at least one and two dimensions, are absolutely amazing.
And that changes a lot of things, because they're way faster to compute, way more efficient. And so if you want to get started with HSGPs, I recommend these two tutorials. The first one guides you through the basics, and the second one demonstrates two more advanced use cases. Yes, I have to teach GPs in a couple of weeks. I'm going to steal all your material for my class. Perfect. Awesome. Yeah, I can't wait. I love these methods.
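For the curious, the HSGP approximation is exposed directly in PyMC. Here is a minimal sketch for one input dimension; the number of basis functions m and the boundary factor c are illustrative choices, not recommendations from the tutorials.

```python
import numpy as np
import pymc as pm

# Toy one-dimensional data (illustrative only)
rng = np.random.default_rng(3)
X = np.linspace(0, 10, 200)[:, None]
y = np.sin(X[:, 0]) + rng.normal(0, 0.1, 200)

with pm.Model() as hsgp_model:
    ell = pm.InverseGamma("ell", mu=2.0, sigma=1.0)
    eta = pm.HalfNormal("eta", 1.0)
    cov = eta**2 * pm.gp.cov.ExpQuad(1, ls=ell)
    # m basis functions and boundary factor c control the
    # accuracy/speed trade-off of the Hilbert space approximation
    gp = pm.gp.HSGP(m=[25], c=1.5, cov_func=cov)
    f = gp.prior("f", X=X)
    sigma = pm.HalfNormal("sigma", 0.5)
    pm.Normal("obs", mu=f, sigma=sigma, observed=y)
```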
So I really love hearing that they get propagated even more in Argentina, which is dear to my heart, as you know. So I think it's time to let you go as well. It's been a while; you need to get some alfajor in your blood. But before letting you go, I'm going to ask you again the last two questions I ask every guest at the end of the show. So, first one: if you had unlimited time and resources, which problem would you try to solve?
Yeah, I think I'm going to say exactly the same. I'm still motivated to work on Bayesian methods. As I said at the beginning, for me it's really something; I'm super happy when I write some code and get something like a forest plot or something like that, or the first time you see PyMC-BART running and it's actually fitting something that at least looks reasonable, or talking with other people, like Gabriel, and he says, okay, I sped this up ten times,
or whatever, or with Tomás about Bambi. So yeah, I think I'm still super interested in working on Bayesian methods and making Bayesian methods more useful to people. I think it's even more interesting to me to make methods so other people can do stuff than to do the stuff myself. It's this kind of general approach. So yeah, I will keep doing the same, and if I have more resources, I will have more time and more people to help with this. That sounds good. I love it.
And who would your guest be if you could have dinner with any great scientific mind, dead, alive, or fictional? I don't know, that's... It would be easier if all the participants of your podcast had a single banquet with all the people, so we could switch tables and talk to many people. We should try to organize something like that at some point. I don't know, it's super difficult to think about a single person. The last time, I think I mentioned Harold Scheraga, whom I had the opportunity to have lunch with. He was from my previous life as a bioinformatician, biophysicist, or whatever. For statistics, I don't know. I don't know if I have some hero, some statistical hero, yet. I see, I see. I know some people I have already had the opportunity to have lunch with, so, I don't know. I think, you know, to get back to Argentina, I think Jorge Luis Borges would be an interesting person to have dinner with.
I don't know, it's super intimidating. So, not technically a scientist, but it could be interesting. Though super intimidating, you know? He's a guy that... I don't know if listeners have read him, but even when you're reading his fiction, he always kind of transmits this idea that he's saying something super profound and true, something like that. I don't know, it's just poetry, it's fiction, but he has a very interesting style of writing.
Yeah, yeah, definitely. I mean, The Garden of Forking Paths, things like that, I think it's definitely scientific-ish, let's say, you know, so it'd be interesting to think about how he came up with these ideas. No, he has many things that look scientific, in a sense. There's fiction of his that looks like that; it could almost be science fiction. Yeah, definitely. Awesome! Osvaldo, that was a pleasure. Thank you again for taking the time. It's always a great pleasure to have you on the show.
Of course, the show notes of this episode are gonna be huge because you do so many things. But yeah, I've already added a lot of things; feel free to add other links that we mentioned today, and of course links where people can follow you. Yeah, and maybe before we close up the show, can you tell people where they can follow you, where they can support your work, something like that? Yeah, I am still on Twitter. I'm still calling it Twitter, but probably I'm going to migrate.
I'm also on Mastodon; it's all the same, it's aloctavodia, the same handle for everything. I'm on Bluesky. Maybe we can also share my personal webpage; that is essentially a place you can go to see what I'm doing at the moment. And now I'm on LinkedIn. I have to admit that I hate LinkedIn, but I'm there for some reason. I still don't understand what people are doing there, but I see a lot of people posting interesting things, so I started to follow those folks.
Perfect. Well, I'm sure people will connect there. Thank you again, Osvaldo, for taking the time and being on this show for the third time. Thank you. I only need seven more. This has been another episode of Learning Bayesian Statistics. Be sure to rate, review, and follow the show on your favorite podcatcher, and visit learnbayesstats.com for more resources about today's topics, as well as access to more episodes to help you reach a true Bayesian state of mind. That's learnbayesstats.com.
Our theme music is Good Bayesian by Baba Brinkman, feat. MC Lars and Mega Ran. Check out his awesome work at bababrinkman.com. I'm your host, Alex Andorra. You can follow me on Twitter at alex_andorra, like the country. You can support the show and unlock exclusive benefits by visiting patreon.com/LearnBayesStats. Thank you so much for listening and for your support. You're truly a good Bayesian. Change your predictions after taking information in,
and if you're thinking I'll be less than amazing, let's adjust those expectations. Let me show you how to be a good Bayesian. Change calculations after taking fresh data in. Those predictions that your brain is making? Let's get them on a solid foundation.