
#100 Reactive Message Passing & Automated Inference in Julia, with Dmitry Bagaev

Feb 21, 2024 · 55 min · Season 1 · Ep. 100

Episode description

Proudly sponsored by PyMC Labs, the Bayesian Consultancy. Book a call, or get in touch!


In this episode, Dmitry Bagaev discusses his work in Bayesian statistics and the development of RxInfer.jl, a reactive message passing toolbox for Bayesian inference. 

Dmitry explains the concept of reactive message passing and its applications in real-time signal processing and autonomous systems. He discusses the challenges and benefits of using RxInfer.jl, including its scalability and efficiency in large probabilistic models. 

Dmitry also shares insights into the trade-offs involved in Bayesian inference architecture and the role of variational inference in RxInfer.jl. Additionally, he discusses his startup Lazy Dynamics and its goal of commercializing research in Bayesian inference. 

Finally, we also discuss the user-friendliness and trade-offs of different inference methods, the future developments of RxInfer, and the future of automated Bayesian inference. 

Coming from a very small town in Russia called Nizhnekamsk, Dmitry currently lives in the Netherlands, where he did his PhD. Before that, he graduated from the Computational Science and Modeling department of Moscow State University. 

Beyond that, Dmitry is also a drummer (you’ll see his cool drums if you’re watching on YouTube), and an adept of extreme sports, like skydiving, wakeboarding and skiing!

Our theme music is « Good Bayesian », by Baba Brinkman (feat MC Lars and Mega Ran). Check out his awesome work at https://bababrinkman.com/ !

Thank you to my Patrons for making this episode possible!

Yusuke Saito, Avi Bryant, Ero Carrera, Giuliano Cruz, Tim Gasser, James Wade, Tradd Salvo, William Benton, James Ahloy, Robin Taylor, Chad Scherrer, Zwelithini Tunyiswa, Bertrand Wilden, James Thompson, Stephen Oates, Gian Luca Di Tanna, Jack Wells, Matthew Maldonado, Ian Costley, Ally Salim, Larry Gill, Ian Moran, Paul Oreto, Colin Caprani, Colin Carroll, Nathaniel Burbank, Michael Osthege, Rémi Louf, Clive Edelsten, Henri Wallen, Hugo Botha, Vinh Nguyen, Marcin Elantkowski, Adam C. Smith, Will Kurt, Andrew Moskowitz, Hector Munoz, Marco Gorelli, Simon Kessell, Bradley Rode, Patrick Kelley, Rick Anderson, Casper de Bruin, Philippe Labonde, Michael Hankin, Cameron Smith, Tomáš Frýda, Ryan Wesslen, Andreas Netti, Riley King, Yoshiyuki Hamajima, Sven De Maeyer, Michael DeCrescenzo, Fergal M, Mason Yahr, Naoya Kanai, Steven Rowland, Aubrey Clayton, Jeannine Sue, Omri Har Shemesh, Scott Anthony Robson, Robert Yolken, Or Duek, Pavel Dusek, Paul Cox, Andreas Kröpelin, Raphaël R, Nicolas Rode, Gabriel Stechschulte, Arkady, Kurt TeKolste, Gergely Juhasz, Marcus Nölke, Maggi Mackintosh, Grant Pezzolesi, Avram Aelony, Joshua Meehl, Javier Sabio, Kristian Higgins, Alex Jones, Gregorio Aguilar, Matt Rosinski, Bart Trudeau, Luis Fonseca, Dante Gates, Matt Niccolls, Maksim Kuznecov, Michael Thomas, Luke Gorrie, Cory Kiser and Julio.


Transcript

In this episode, Dmitry Bagaev discusses his work in Bayesian statistics and the development of RxInfer.jl, a reactive message passing toolbox for Bayesian inference. Dmitry explains the concept of reactive message passing and its applications in real-time signal processing and autonomous systems. He discusses the challenges and benefits of using RxInfer.jl, including its scalability and efficiency in large probabilistic models.

Dmitry also shares insights into the trade-offs involved in Bayesian inference architecture and the role of variational inference in RxInfer.jl. Additionally, he discusses his startup Lazy Dynamics and its goal of commercializing research in Bayesian inference. Finally, we also discuss the user-friendliness and trade-offs of different inference methods, the future developments of RxInfer, and the future of automated Bayesian inference.

Coming from a very small town in Russia called Nizhnekamsk, Dmitry currently lives in the Netherlands, where he did his PhD. Before that, he graduated from the Computational Science and Modeling department of Moscow State University. Beyond that, Dmitry is also a drummer, you'll see his cool drums if you're watching on YouTube, and an adept of extreme sports like skydiving, wakeboarding, and skiing. Learning Bayesian Statistics, episode 100, recorded January 25, 2024.

Dmitry Bagaev, welcome to Learning Bayesian Statistics. Thanks. Thanks for inviting me to your great podcast. Really, I feel very honored. Yeah, thanks a lot. The honor is mine. It's really great to have you on the show. So many questions for you, and we're also going to be able to talk again about Julia, so that's super cool. And I want to thank, of course, Albert Podusenko for putting us in contact. Thanks a lot, Albert, it was a great idea. I hope you will love the episode.

Well, I'm sure you're going to love Dmitry's part, and mine is always... more in the air, right? And well, Dmitry, thanks again, because I know you're a bit sick, so I appreciate it even more. So let's start by basically defining what you're doing nowadays, and also how you ended up doing what you're doing. Yes. So I'm currently working at Eindhoven University of Technology, in BIASlab, and I just recently finished my PhD in Bayesian statistics, essentially.

So now I supervise students and do some of the projects there, and BIASlab itself is a group in the university that primarily works on real-time Bayesian signal processing, and we do research in that field. The slogan of the lab, let's say, is "natural artificial intelligence", and it's phrased specifically like that because there cannot be natural artificial intelligence. So it's a play on words, let's say.

And the lab is basically trying to develop automated control systems or novel signal processing applications, and it's basically inspired by neuroscience, you know. And we also opened a startup with my colleagues, which is called Lazy Dynamics. And the idea is basically to commercialize the research in the lab, but also to find new funding for new PhD students for the university.

But it's still quite young, so we are still less than one year old, and we are currently in search of clients and potential investors. But yeah, my main focus still remains being a postdoc at the university. Yeah, fascinating. So many things already. Maybe, what do you do in your postdoc? So my main focus, primarily, is supporting the toolbox that we wrote in our lab, of which I am the primary author. We call this toolbox RxInfer.

And this is an essential part of my PhD project. And basically I love to code, so more or less my scientific career was always aligned with software development. And the RxInfer project was a really big project, and many other projects in BIASlab depend on it. And it requires maintenance: bug fixing, adding new features, performance improvements. And we currently have several sub-projects that we develop alongside RxInfer.

And that's just the main focus for me. And as something else, I also supervise students for this project. Yeah, yeah. Of course. That must also take quite some time, right? Yes, exactly. Yeah. Yeah, super cool. So let me start basically by diving a bit more into the concepts you've just named, because you've already talked about a lot of the things you work on, which is a lot, as I guess listeners can hear.

So first, let's try and explain the concept of reactive message passing in the context of Bayesian inference for listeners who may not be familiar with it, because I believe it's the first time we really talk about that on the show. So yeah, talk to us about that. Also because, from what I understand, it's really the main focus of your work, be it through RxInfer.jl or Lazy Dynamics or BIASlab. So let's start by having the landscape here about reactive message passing. Yes, good.

So yeah, RxInfer is what we call a reactive message passing based Bayesian inference toolbox. And basically, in the context of Bayesian inference, we usually work with probabilistic models. And a probabilistic model is usually a function of some variables, and some variables are being observed. And we want to infer some probability distribution over unobserved variables. And what is interesting about that is that if we have a probabilistic model, we can actually represent it as a graph.

For example, we can factorize our probabilistic model into a set of factors, such that each node is a factor and each edge is a variable of the model, like hidden states, and some of them are observed or not. And message passing by itself is a very interesting idea of solving Bayes rule for a probabilistic model defined in terms of the graph. It does it by sending messages between nodes in the graph, along edges. And it's quite a big topic actually.
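To make the idea concrete (an illustration added for reference, not a quote from the episode): for a model that factorizes as

$$p(x, y, z) \propto f_a(x, y)\, f_b(y, z),$$

the sum-product message from factor $f_a$ to variable $y$ is

$$\mu_{f_a \to y}(y) = \int f_a(x, y)\, \mu_{x \to f_a}(x)\, dx,$$

and the posterior marginal of $y$ is the normalized product of the messages arriving on its edge, $q(y) \propto \mu_{f_a \to y}(y)\, \mu_{f_b \to y}(y)$.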

But the essential thing to understand here is that we can do that, right? We can reframe the Bayes rule as something that passes these messages on the graph. Reactive message passing is a particular implementation of this idea. Because in traditional message passing, we usually have to define an order of messages, like in what order do we compute them. That may be very crucial, for example, if the graph structure has loops.

So there are some structural dependencies in the graph, and reactive message passing basically says: okay, no, we will not do that. We will not specify any order. Instead, we will react on data. So the order of message computations becomes essentially data driven, and we do not enforce any particular order of computation.
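For a flavor of the reactive substrate, here is a minimal sketch using Rocket.jl, the reactive programming package that RxInfer builds on; the stream and the callback are illustrative, not RxInfer's internal machinery:

```julia
using Rocket  # reactive streams package underlying RxInfer

# A Subject is a stream that values can be pushed into at arbitrary times.
observations = Subject(Float64)

# Subscribing registers a reaction: the callback fires whenever a new value
# arrives; no fixed update schedule is imposed anywhere.
subscription = subscribe!(observations, lambda(
    on_next = (y) -> println("new observation $y, recompute affected messages"),
))

# Data arriving at unpredictable moments simply triggers reactions.
next!(observations, 0.3)
next!(observations, 1.7)

unsubscribe!(subscription)
```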

OK, so if I try to summarize, that would be something like: usually, when you work on a Bayesian model, you have to specify the graph and the order of the graph, in which direction the nodes are going. In reactive message passing, it's more like a non-parametric version, in a way, where you just say: there is this stuff, but you're not specifying the directions, and you're just trying to infer that through the data. How wrong is that characterization? Not exactly like that.

So indeed, the graphs that we work with don't have any direction in them, right? Because messages can flow in any direction. The main difference here is that reactive message passing reacts on changes in data and updates posteriors automatically. Right? So there is no particular order in which we update the posteriors. For example, if we have some variables in our model, like A, B, C, we don't know which will be updated first and which will be the last.

It basically depends on our observations. But it works like that: as soon as we have a new observation, the graph reacts to this observation and updates the posteriors as soon as it can, without explicitly specifying this order. And why would you do that? Why would that be useful? So it's a very good question. Because in BIASlab, we essentially try to work with autonomous systems. And autonomous systems have to work in the field, right?

So, in the real-world environment, let's say, right? And the real-world environment is extremely unpredictable. To be more clear, let's say we try to develop a drone which tries to navigate the environment, and it has several sensors, and we want to build a probabilistic model of the environment, such that the drone wants to act in this environment, and its sensors have some noise in them. So essentially,

we cannot predict in what order the data will arrive, right? Because you may have a video signal, you may have an audio signal, and these devices that record video, let's say, also have an unpredictable update rate. Usually it's maybe 60 frames per second, but it may change, right? So instead of fixing the algorithm and saying: okay, we wait for a new frame from the video, wait for a new frame from the audio, then we update, then we wait again...

Instead of doing that, we just simply let the system react on new changes and update the posteriors as soon as possible. And then, based on new posteriors, we act as soon as possible. This is kind of the main idea of reactive implementations. In traditional software for Bayesian inference, for example, we just have a model, and we have a data set, and we feed the data set to the model, and we have the posterior, and then we analyze the posterior. And it also works really great, right?

But it doesn't really work in the field, where you don't have time to synchronize your data set and you need to react as soon as you can. Okay, okay, I see. So that's where, basically, this kind of reactive message passing is extremely useful: when you receive data in real time that you don't really know the structure of. Yes, we work primarily with real-time signals. Yes. Okay, very interesting. Actually, do you have any examples, any real-life examples that you've worked on, or...

You know, where this is extremely useful to work on with RxInfer.jl, or just in general this kind of reactive message passing? Yes. So I myself usually do not work with applications. My primary focus lies in the actual Bayesian inference engine. But in our lab, there are people who work, for example, on audio signals. Right.

So you want, for example, maybe to create a probabilistic model of the environment to be able to denoise speech, or it may be a position tracking system or a planning system in real time. In our lab we also very often refer to the term active inference, which basically defines a probabilistic model not only of your environment, but also of your actions, such that you can infer the most optimal course of actions.

And this might be useful in control applications, also for the drone, right? So we want to infer not only the position of the drone based on the sensors that we have, but also how it should act to avoid an obstacle, for example. I see. Yeah, OK, super interesting. So basically, any case where you have really high uncertainty, right, that kind of stuff. OK, yes, super interesting. And so what prompted you to create a tool for that? What inspired you to develop RxInfer.jl?

And maybe also tell us how it differs from traditional Bayesian inference tools, be it in Python or in R or even in Julia. If I'm a Julia user, used to using a probabilistic programming language in Julia, then what's the difference with RxInfer? This is a good question. But there are two questions in one. About inspiration: so, I joined BIASlab in 2019 without really understanding what it was going to be about, or how difficult it would really be.

And the inspiration for me came from the project that I started my PhD on. Basically, the main inspiration in our lab is the so-called free energy principle, which kind of tries to explain how natural biotic systems behave. Right? They define the so-called Bayesian brain hypothesis and the free energy principle.

So they basically say that any biotic system defines a probabilistic model of its environment and tries to infer the most optimal course of action to survive, essentially. But all of this is based on Bayesian inference as well. So, right, it's a very good idea, but at the end, it all boils down to Bayesian inference. And basically, if you look at how biotic systems work, we note that there are very specific properties of these biotic systems.

So they do not consume a lot of power. Right. It has actually been shown that our brain consumes about 20 watts of energy, right? It's an extremely efficient device, if we can say so, right? It does not even compare with supercomputers. It's also scalable, because we live in a very complex environment with many variables. We act in real time, right? And we are able to adapt to the environment. And we are also kind of robust to what is happening around us, right?

So, if something new happens, we are able to adapt to it instead of just failing. Right. And this is kind of the idea. So the inspiration is that this Bayesian inference toolbox needs to be scalable, real-time, adaptive, robust, super efficient, and also low power. Right. These are the main ideas behind the RxInfer project. And here we go to the second part of the question: how does it differ? Because this is exactly where we differ, right?

Other solutions in Python or in Julia are also very cool. There are actually a lot of cool libraries for Bayesian inference, but most of them have a different set of trade-offs or requirements. And maybe, to be super clear: we are not trying to be better. We are trying to have a different set of requirements for the Bayesian inference system. Yeah. Yeah, you're working on a different set of needs, in a way. Yes, yes. And it's application-driven.

Yeah, you're trying to address another type of applications. Exactly. And if we directly compare to other solutions, they are mostly based on sampling, like HMC or NUTS. Or maybe they are black-box methods like ADVI, automatic differentiation variational inference, or BBVI. And these are great methods, but they tend to consume a lot of computational power, or energy, right? So they do a very expensive simulation. It may run for hours, maybe even days in some situations.

And they are great, but you cannot really apply them in these autonomous systems where you need to... Like, if we're again talking about audio, it's 44 kilohertz. So we need to perform Bayesian inference at an extremely fast rate, and these methods are not really applicable in this situation. So... Yeah, fascinating. And you were talking... well, we'll get back to the computation part a bit later. Maybe first I'd like to ask you: why did you do it with Julia?

Why did you choose Julia for RxInfer? And what advantages does it offer for your applications of Bayesian inference? The particular choice of Julia was actually driven by the needs of BIASlab at the university, because all the research which we do in our lab is done in Julia, and that decision was made by our professor many, many years ago. Interestingly enough, our professor doesn't really code. But Julia is a really great language. So if I were to choose myself,

I would still choose Julia. It's a great language. It's fast, right? And our primary concern is efficiency. Python can also be fast, if you know how to use it, if you use NumPy or some specialized libraries, but with Julia it's really easy. It is easier. In some situations, of course, you need to know a bit more. My background is in C and C++, and I understand how compilers work, for example.

So maybe for me it's a bit easier to write performant Julia code. But in general, it's just a nice, fast language. And it also develops fast, in the sense that new versions of Julia come up every several months, and it really gets better with each release. Another thing which is actually very important for us as well is macros, macros in Julia.

For people who are listening: macros basically allow us to apply arbitrary code transformations to the existing code. And they also allow you to create a sublanguage within a language. And why this is particularly useful for us is that specifying probabilistic models in Bayesian inference is a bit hard or tedious. We don't want to directly specify these huge graphs.

Instead, what we did, and what Turing also did, and many other libraries in Julia, is come up with a domain-specific language for specifying probabilistic programs. And it's extremely cool. It's much, much simpler to define a probabilistic program in Julia than in Python, in my opinion. And I really like this feature of Julia. Yeah, basically this building-block aspect of the Julia language. Yeah, yeah, I've heard that. There are other aspects of Julia I can mention.
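For readers curious what such a domain-specific language looks like, here is a minimal sketch in the spirit of RxInfer's getting-started example (a Beta-Bernoulli coin model; the exact `@model` and `infer` signatures may differ between RxInfer versions):

```julia
using RxInfer

# A generative model in RxInfer's DSL; the @model macro
# transforms this into a factor graph behind the scenes.
@model function coin_model(y)
    θ ~ Beta(2.0, 2.0)          # prior over the coin bias
    for i in eachindex(y)
        y[i] ~ Bernoulli(θ)     # one likelihood factor per observation
    end
end

# Run message-passing inference on a small dataset.
dataset = [1.0, 0.0, 1.0, 1.0, 0.0]
result  = infer(model = coin_model(), data = (y = dataset,))
println(mean(result.posteriors[:θ]))
```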

By the way, maybe I can also make an announcement regarding Julia: the next JuliaCon is happening in Eindhoven, the city where I currently live. And it's going to be very cool. It's going to be in the PSV stadium, the football stadium. Right? A technical conference about a programming language, on a stadium. So, another aspect of Julia is this notorious dynamic multiple dispatch.

And it was extremely useful for us, in particular for the reactive message passing implementation. Because, again, if we think about how this reactiveness works and how we compute these messages on the graph: in order to compute a message, we wait for inputs. And then, when all inputs have arrived, we have to decide how to compute the message. And computation of the message is essentially solving an integral.

But if we know the types of the arguments, and if we know the type of the node, it might be that there is an analytical solution for the message. So it's not really necessary to solve a complex integral. And we do it by multiple dispatch in Julia. So multiple dispatch in Julia helps us to pick the most efficient message update rule on the graph, and it's basically built into the language. It's also possible to emulate it in Python, but in Julia it's just fast and built-in, and it works super nicely.
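The mechanism itself is plain Julia multiple dispatch. A toy sketch (illustrative names, not RxInfer's actual rule API): an analytical update rule is picked when the incoming message types allow one, and a fallback handles everything else:

```julia
using Distributions

struct AdditionNode end  # hypothetical factor node computing z = x + y

# Analytical rule: the sum of two independent Gaussians is Gaussian,
# so the outgoing message has a closed form and no integral is needed.
message(::AdditionNode, a::Normal, b::Normal) =
    Normal(mean(a) + mean(b), sqrt(var(a) + var(b)))

# Generic fallback: no closed form for arbitrary inputs, so an
# approximation method would have to be invoked here instead.
message(::AdditionNode, a::Distribution, b::Distribution) =
    error("no analytical rule; an approximation method is required")

message(AdditionNode(), Normal(0.0, 1.0), Normal(1.0, 2.0))  # Normal(1.0, √5)
```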

Yeah, super cool. Yeah, for sure. Super interesting points. And I'm very happy, because it's been a long time since we've had a show with some Julia practitioners, so that's always very interesting, to hear what's going on in that field. And yeah, I would be convinced just by coming to the PSV Eindhoven stadium. You don't have to tell me more. I'll be there. Let's do a live show in the stadium. Yes, I will be there. Yeah. Yeah, that sounds like a lot of fun.

And actually, I'm myself an open source developer, so I'm very biased to ask you that question: what were some of the biggest challenges you faced when you developed RxInfer? And how did you overcome them? I guess that's like the main thing you do when you're an open source developer: putting out fires. This is an amazing question, I really like it. And I even have some of the answers in my PhD dissertation, and I will probably just go ahead.

I'll probably just quote it, but I don't remember exactly how I framed it. I took it from the book which is called Software Engineering for Science. And basically it says that people usually underestimate how difficult it is to create software in a scientific research area. And the main difficulty with that is that there are no clear guidelines to follow.

It's not like designing a website, with clear framework rules, where you just need to split tasks between people in a team. No: new insights in science, in the area where we work, happen every day. Right? And the requirements for the software may change every day.

And it's really hard to come up with a specific design before we start developing, because requirements change over time. You may create some software for research purposes, and then you find out something super cool which works better or faster or scales better, and then you realize that you actually have to start over, because this is just better: we just found out something cooler. It also means that a developer must invest time into this research.

So it's not only about coding: you should understand how it all works from the scientific point of view, from a mathematical point of view. And sometimes, if this is cutting-edge research, there are no books about how it works, right? So we must invest time in reading papers, and also in being able to write good code, which is fast and efficient.

And all of these problems also occurred when we developed RxInfer. Even though I'm the main author, a lot of people have helped me, right? I'm very thankful for that. And for RxInfer in particular, I also needed to learn a very big part of statistics, because when I joined the lab, I actually didn't have a lot of experience with Bayesian inference and with graphs and with message passing. So I really needed to dive into this field.

And many people helped me to understand how it works. A lot of my colleagues spent their time explaining. And on top of this stack of difficulties, at the end, or maybe not at the end, the software that we build, we would like it to be easy to use, user friendly. So we already have these difficulties: we don't know how to design it, we have to invest time into reading papers.

But then, at the end, we want to have functional software that is easy to use, addresses different needs, and allows you to find new insights. So the software should be designed such that it does not impose a lot of constraints on what you can do with it, right? Because scientific software is about finding new insights, not about doing some predefined set of algorithms. You want to find something new, essentially, and the software should help you with that. Yeah, yeah, for sure.

That's a good point. What would you say are the key challenges in achieving scalability and efficiency in this endeavor, and how does RxInfer address them? Basically, we are talking in the context of Bayesian inference, and the key challenge is that the Bayes rule doesn't scale, right? The formula looks very simple, but in practice, when we start working with large probabilistic models...

Just blind application of the Bayes rule doesn't scale, because it has exponential complexity with respect to the number of variables. And RxInfer tries to tackle this by having essentially two main components in the recipe, like maybe three, let's say three. So first of all, we use factor graphs to specify the model. So we work with factorized models.

We work with message passing, and message passing essentially converts the exponential complexity of the Bayes rule to linear, but only for highly factorized models. And "highly factorized" here is a really crucial component, but many models are indeed highly factorized. It means that variables do not directly depend on all other variables: they directly depend on maybe a very small subset of variables in the model. And the third component here is variational inference,

because it allows us to trade off computational complexity with accuracy. So if the task is too difficult or it doesn't scale, basically what variational inference gives you is the ability to impose a set of constraints on your problem, because it reframes the original problem as an optimization task. And we can optimize up to certain constraints. For example, we may say that this variable is distributed as a Gaussian distribution.

It may not be true in reality, and we lose some accuracy, but at the end it allows us to solve some equations faster. And we can impose more and more constraints if we don't have enough computational power and have a large model, or we may relax constraints if we have enough computational power, and then we gain accuracy. So we have this sort of slider, which allows us to scale better.
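In RxInfer, this slider is exposed through constraints on the variational posterior. A minimal sketch in the spirit of its `@constraints` macro (the variables `μ` and `τ` stand for hypothetical model variables; exact syntax may vary by version):

```julia
using RxInfer

# Mean-field factorization constraint: assume the joint posterior over
# μ and τ factorizes into independent marginals, trading accuracy for speed.
constraints = @constraints begin
    q(μ, τ) = q(μ)q(τ)
end

# The constraints are then passed to inference alongside model and data, e.g.
# infer(model = my_model(), data = (y = dataset,), constraints = constraints)
```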

But here's the thing, right? We can always come up with such a large model, with so many variables and so difficult relationships between variables, where it still will not scale. And this is fine. But RxInfer tries to push this boundary for scaling Bayesian inference to large models. And actually, so you're using variational inference quite a lot in this endeavor, right? So can you discuss the role of variational inference here in RxInfer, and maybe any innovations that you've incorporated in this area?

So the role, as I touched upon a little bit, is that it acts as a slider controlling the complexity and the accuracy of your inference result. This is the main role. Of course, for some applications this might be undesirable. For some applications, you may want to have a perfect posterior estimation. But for some applications, it's not a very big deal. Again, we are talking about different needs for different applications here.

And the innovation that RxInfer brings, I think, is that it's one of the few implementations of variational inference as message passing, because variational inference is usually implemented as a black-box method that takes a function, like a probabilistic model function, and maybe does some automatic differentiation or some sampling under the hood.

And message passing by itself has a very long history, but I think people mistakenly think that it's quite limited to the sum-product algorithm. But actually, variational inference can also be implemented as message passing. And it's quite good, so it opens up the applicability of the message passing algorithms. And also,

as we already talked a little bit about this reactive nature of the inference procedure, it's also maybe even the first reactive variational inference engine, which is designed to work with infinite data streams. So it continuously updates the posteriors, continuously does minimization. It does not stop. And as soon as new data arrives, we basically update our posteriors.

But in between these kinds of data windows, we can spend more computational resources to find a better approximation for the variational inference. All other solutions that do variational inference basically require you to wait for the data, or wait for the entire data set, feed the data set, and then you have the result, then you analyze the result, and then you repeat. So RxInfer works a bit differently in that regard.
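A rough sketch of what this streaming mode looks like, loosely based on RxInfer's online-inference documentation; the `@autoupdates`, `@initialization`, `datastream`, and `autostart` names follow the docs but should be treated as assumptions that may vary across versions:

```julia
using RxInfer, Rocket

# One-step Beta-Bernoulli model: the prior parameters a and b are
# refreshed automatically from the previous posterior on every new datum.
@model function online_coin(y, a, b)
    θ ~ Beta(a, b)
    y ~ Bernoulli(θ)
end

# After each update, feed the posterior parameters back in as the next prior.
auto = @autoupdates begin
    a, b = params(q(θ))
end

init = @initialization begin
    q(θ) = Beta(1.0, 1.0)
end

# A (here finite, in principle infinite) stream of named observations.
datastream = from([(y = 1.0,), (y = 0.0,), (y = 1.0,)])

engine = infer(
    model          = online_coin(),
    datastream     = datastream,
    autoupdates    = auto,
    initialization = init,
    autostart      = true,
)
```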

Yeah. Fascinating. And I'm guessing you have some examples of that up on the RxInfer website; maybe we can put a link to that in the show notes for people who are interested to see how you would apply that in practice? So, some of it does not really require reactivity, but because RxInfer is easy to use and fast, students can do some homework for signal processing applications with it. What I already mentioned is that we work with audio signals and with control applications.

I don't really have a particular example of RxInfer being used in the field or by industry. So it's primarily a research tool currently, but we want to extend it. It's still a bit more difficult to use than Turing, let's say, which is also written in Julia, because message passing is maybe a bit more difficult to use, and it is not as universal as HMC and NUTS: it still requires some approximation methods.

Yeah. So we still use it as a research tool currently, but we have some ideas in the lab for how to expand the available set of probabilistic models we can run inference on. And yes, indeed, in our documentation we have quite a lot of examples, but these examples are, I would say, educational in most of the cases, at least in the documentation.

So we are at this stage where we have a lot of ideas for how we can improve the inference, how we can make it faster, such that we can actually apply it to real tasks: for real drones, for real robots, for real speech denoising or something similar. Yeah, well said. That would be super interesting, I'm guessing, for people who are into these topics and also just want to check it out. I have been checking out your website recently to prepare for the episode. Actually, can you now...

So you've shared the overview of the theory, how that works, what RxInfer does in that regard. Can you share what you folks are doing with Lazy Dynamics, how that's related to that? How does that fit into this ecosystem? So yeah, Lazy Dynamics: we created this company to commercialize the research that we do at our lab, to basically find funding to make RxInfer better and ready for industry. Because currently, let's say, RxInfer is a great research tool for our purposes, right?

But industry needs some more properties, in addition to those that I have already mentioned, right? For example, the Bayesian inference engine must be extremely robust, right? It is not allowed to fail if we really work in the field. And this is not really a research question. It's more about the implementational side, right? Like good code coverage, great documentation. And this is what we also want to do with Lazy Dynamics.

We want to take this next step and create a great product for other companies, especially ones that can rely on RxInfer, maybe in their research or maybe even in the field. Right? And maybe we create some sort of tool set around RxInfer that will allow you to debug the performance of your probabilistic program or your probabilistic inference, right? It's also not about research.

It's about making it more accessible to other people: finding bugs or mistakes in their model specification, making it easier to use. Or maybe, for example, we could come up with some sort of a library of models, right? So you would want to build some autonomous system, and it may require a model for audio recognition, it may require a model for video recognition. And this kind of set of models can be predefined, very well tested, have great performance, be super robust.

And basically, Lazy Dynamics may provide access to this kind of library, right? And because these are not research-related questions, it must be done in a company, with very good programmers and very good code coverage and documentation. But for research purposes, RxInfer is already a great toolbox. And basically, many students in our lab already use it.

Yeah, because we are all sitting in the same room, let's say on the same floor, we can kind of brainstorm, find bugs, fix them on the fly, and keep working like that. But if we want RxInfer to be used in industry, it really needs to be a professional toolbox with professional support. Yeah. Yeah, I understand, that makes sense. I'm surprised you can... I don't know when you sleep, though, between the postdoc, the open source project, and the company.

So yeah, it's a great comment, but yeah, it's hard. Yeah, hopefully we'll get you some sleep in the coming months. To get back to your PhD project, because I found that very interesting: your dissertation will be in the show notes. But something I was also curious about is that in this PhD project, you explore different trade-offs for Bayesian inference architecture. And you've mentioned that a bit earlier already, but I'm really curious about that.

So could you elaborate on these trade-offs and why they are significant? Yes, we already touched a little bit on that. So the main trade-offs here are kind of computational load, efficiency, adaptivity, low power consumption... Magic. Yeah. And another aspect, actually, which we didn't talk about yet, is structural model adaptation. So these are the requirements that we favor in RxInfer. And these are the requirements that were central to my PhD project.

And all of these properties are not just coming from a vacuum. They are coming from real-time signal processing applications on autonomous systems. We don't have a lot of battery power, we don't have very powerful CPUs on these autonomous devices. Because essentially, what we also want to do is be able to run very difficult, large probabilistic models on a Raspberry Pi. And a Raspberry Pi doesn't even have a GPU.

So we can buy some small sort of GPU and put it on the Raspberry Pi, but still, the computational capabilities are very, very limited on edge devices. For example, one may say: let's just do everything in the cloud, which is a very valid argument, actually. But in some situations, the latencies are just too big.

And also, maybe we don't have access to the internet in some areas, but we still want to create these adaptive Bayesian inference systems, like a drone that may explore some area, maybe in the mountains or something, where we don't really have internet, so we cannot really process anything in the cloud. So it must work as efficiently as possible, on a very, very small device that doesn't have a lot of power, doesn't have a lot of battery, and it still should work in real time.

Yeah, I think these are mostly the main trade-offs. And in terms of how we do it: we use this variational inference, and we sacrifice accuracy with respect to scalability. Reactive message passing allows us to scale to very large models because it works on factor graphs. Yeah. And I think these are very important points to make, right? Because always, when you work on and build an open source package, you have trade-offs to make.

That means you have to choose whether you're going for a general package or a more specialized one, and that will dictate, in a way, your trade-offs. In RxInfer, it seems like you're quite specialized, a specialist of message passing inference. So the cool thing here is that it informs your choices, because you're like: no, our main use case is that, and so we can use that.

And the variational inference choice, for instance, is quite telling, because in your case it seems to be really working well, whereas we could not do that in PyMC, for instance. If we removed the ability to use HMC, we would have quite a drop in the user numbers. So yeah, that's always something I try to make people aware of when they are using open source packages: you can't do everything. Yeah, exactly.

Exactly. So actually, when I have a need, I really enjoy working with HMC or NUTS-based methods, because they just work, just like magic. But here's the trade-off, right? They work magically in many situations, but they're slow in some sense. Let's say they're not slow, but they're slower than message passing. So here is this trade-off. So user friendliness is a really, really important key in this equation. Yeah, and what do you call user friendliness in your case?

So what I refer to as user friendliness here is that a user can specify a model, press a button with HMC, and it just runs, and the user gets a result. Yes, the user needs to wait a little bit more, but anyway, the user experience is great: just specify a model, just run inference, just get your result. With RxInfer, it's a bit less easy, because in most of the cases, message passing works in a way that favors analytical solutions on the graph.

And if an analytical solution for a message is not available, basically a user must specify an approximation method. It actually can also be HMC, just in case. But still, RxInfer does not really specify a default approximation method.

Maybe in the future we will find a default approximation, but because it does not define one currently, if a user specifies a complex probabilistic model, it will probably throw an error saying: okay, I don't know how to solve it, please specify what I should do here and there. And for a new user, it might be a bit unintuitive how to do that, what to specify. For HMC, there's no need to do it: it just works. But with RxInfer, it's not that easy yet. That's what I was referring to as user-friendliness.
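As an illustration of what "specify an approximation method" means in practice, RxInfer documents a `@meta` block for attaching approximation methods to nodes without analytical rules. A sketch (the nonlinearity `g` is hypothetical; `DeltaMeta` and `Linearization` follow the docs for deterministic nonlinear nodes, but exact syntax may vary by version):

```julia
using RxInfer

# Hypothetical nonlinearity appearing somewhere in a model, e.g. z ~ g(x).
g(x) = sin(x)

# No closed-form message rule exists through g, so we attach an
# approximation method to its node; Unscented() would be an alternative.
meta = @meta begin
    g() -> DeltaMeta(method = Linearization())
end

# Passed to inference as: infer(model = ..., data = ..., meta = meta)
```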

Yeah, that makes sense. And again, here, the interesting thing is that the definition of user-friendliness is going to depend on what you're trying to optimize, right? What kind of use case you're trying to optimize for. Yes. Actually, what's the future for RxInfer? What are the future developments or enhancements that you are planning?

So, we have already touched a little bit on the Lazy Dynamics side, which tries to make a real commercial product out of RxInfer, where we have great support. This is one side of the future, but we also have a research side of the project. And the research side of the project includes structural model adaptation, which in my opinion is quite cool.

What it basically means, in a few words, is that we may be able, in the future, to change the structure of the model on the fly, without stopping the inference procedure. And you may need it for several reasons. For example, the computational budget changes, and we are no longer able to run inference on such a complex model. So we want to reduce the complexity of the model.

We want to change the structure, maybe put in some less demanding factor nodes, and we want to do it on the fly, without actually stopping the inference. Because for sampling-based methods, if we change the model, we are basically forced to restart, because it's quite difficult to reuse the previous result if the structure of the model changes. With graphs, it's actually possible.

Another point for why we would need that in the field: imagine different sensors, so we have different observations, and one sensor all of a sudden just burned out, or glitched, or something like that. So essentially, we no longer have this sort of observation. So we need to change the structure of our model to account for this glitch or breakage of the sensor.

And this is also where reactive message passing helps us, because, since we do not enforce a particular order of updates, we stop reacting on this observation because it's no longer available, and we also change the structure of the model to account for that. Another thing for the future of RxInfer, in terms of research, is that we want to natively support different update rates for different variables.

And what I mean by that is: if you imagine an audio recognition system, let's say, or an audio enhancement system, and you have modeled the environment of a person who is talking among several persons, their speech signal arrives at a rate of 44 kilohertz, if we are talking about a typical microphone. But their environment, where they are currently sitting, doesn't really change that fast, because they may sit in a bar, and it will still be a bar an hour later.

So there's no need to infer this information as often as their speech: it changes very rarely. So we have different sets of variables that may change at different scales, and we also want to support this natively in RxInfer. That way, we can also make it easier for the inference engine, so it does not spend computational resources on variables which are not updating fast. We also want to be able to support non-parametric models in RxInfer, and this includes Gaussian processes.

And we have research on that: currently, we have a PhD student in our lab who is working a lot on that, and he is making great progress. It's not available in the current version of RxInfer, but he has experiments and it all works nicely. At some point, it will be integrated into the public version. And... yeah, and then just, you know, maintenance and fixing bugs and this kind of stuff, improving the documentation.

The documentation currently needs improvement, because we have quite some features and additions that we have already integrated into the framework, and we happily use them ourselves in our lab for our research, but they are maybe poorly documented, let's say. So other people in theory can use this functionality, but because they cannot come to my table in the office at the Eindhoven University of Technology, they cannot ask how to use it properly.

So we should just put it into the documentation, so that other people can use it as well. Yeah, yeah. That makes sense. That's a nice roadmap for this year. And looking ahead, what's your vision, let's say, for the future of automated Bayesian inference, the way you do it, especially in complex models like yours? Yeah, what's your vision about that? What would you like to see in the coming years? Also, what would you like to not see? A good question.

So in my opinion, the future is very bright, the future of automated Bayesian inference. A lot of great people are working on this, and more people are coming, right? So many toolboxes in Python and Julia, like PyMC, Turing in Julia, and in R, and in C++, like Stan. So many implementations, and it's only getting better every year.

Right. But I think, in my opinion, the future is that there will be several applications, like in our case these autonomous systems, or maybe something else. And these packages will basically not really compete. In a sense, for different applications you will choose a different solution, because all of them will be kind of great in their own application.

But I'm not sure if there will be some super ultra cool method that solves all problems of all applications in Bayesian inference. Maybe we'll have it, who knows. But in my opinion, there will always be these trade-offs in different applications, and we'll just use different methodologies. Yeah. Yeah, that makes sense, in a way.

I like your point here, about all these different methods cooperating in a way, because they are addressing different workflows or different use cases. So yeah, definitely, I think we'll have stuff to learn from one type of application to the other. I like this analogy: we don't cut bread with a fork, but it doesn't really make a fork a useless tool. I mean, we can use a fork for something else, and we don't eat soup with a knife, but that doesn't make the knife a useless tool.

So these are tools that are great, but for their own purposes. RxInfer is a good tool for real-time signal processing applications, and Turing in Julia is a great tool for other applications. So we'll just live together and learn from each other. Yeah. Fascinating. I really love that.

And well, before closing up the show, because I don't want to take too much of your time: a question I really like asking from time to time is whether you have any favorite type of model that you always like to use and want to share with listeners? You mean a probabilistic model? Sure, or it can be a different kind of model. But yeah, a probabilistic model. Yeah, I mentioned a little bit that I do not really work from the application point of view.

I really work on the compiler for Bayesian inference. So I don't really have a favorite model, let's say. It's hard to say. Yeah, that's interesting, because that's always an interesting position to me: you really work on basically making the modeling possible, without usually being one of the people using that modeling platform yourself. Exactly. Yes. Yeah. That's always something really fascinating to me.

Because me, I'm kind of on the bridge, but a bit more on the applied modeling side of things. So I'm really happy that there are people like you who make my life easier, and even possible. So thank you so much. That's cool. Awesome, Dmitry, that was super cool. Thanks a lot. Before letting you go, though, as usual, I'm going to ask you the last two questions I ask every guest at the end of the show. First one: if you had unlimited time and resources, which problem would you try to solve?

Yes, I thought about this question. It's kind of an interesting one. And I thought it would be cool, if we had an infinite amount of time, to try to solve some sort of unsolvable paradox, because normally we only have limited time.

So one of the areas which I never worked with, but am really fascinated about, is astronomy, and one of the paradoxes in astronomy which I find kind of interesting, though maybe it's not really a paradox, is the Fermi paradox. In a few words, it tries to explain the discrepancy between the lack of evidence of other civilizations, even though apparently there is a high likelihood for their existence. Right?

So this is maybe a problem I would work on: if I had an infinite amount of resources, I could just fly into space and try to find them. That sounds like a fun endeavor. Yeah, for sure. I'd love the answer to that paradox. And if people are interested in the physics side of things, there is a whole bunch of physics-related episodes of this show, so for sure, refer to those. I'll put them in the show notes, my whole playlist of physics episodes. Yeah, I know.

And I know also you're a big fan of Aubrey Clayton's book, Bernoulli's Fallacy. So I also put the episode with Aubrey Clayton in the show notes for people who have missed it. If you have missed it, I really recommend it; that was a really good episode. No, I know. I know this episode. Yeah, awesome. Well, thanks for listening to the show. Dmitry, thanks a lot. That was really a treat to have you on.

I'm really happy because I had so many questions, but you helped me navigate them. I learned a lot, and I'm sure listeners did too. As usual, I put resources and a link to your website in the show notes for those who want to dig deeper. Thank you again, Dmitry, for taking the time and being on this show. Yeah, thanks for inviting me. It was a pleasure to talk to you. Really, super nice and super cool questions. I liked it.
