Approximate Bayesian computation with surrogate posteriors

00:00

And Julyan, over to you. OK, well, hi, everyone, and thanks for the invitation and for the too generous introduction. So this is joint work with Florence Forbes, Hien Nguyen and TrungTin Nguyen. What I'm going to talk about today is approximate Bayesian computation, that is ABC in short, with surrogate posteriors. It's a new way of doing ABC that we find pretty interesting.

00:39

As a disclaimer, I'd like to say first that none of us is really an expert in ABC, but anyway we find the new method particularly interesting. Here are my collaborators: Florence is the head of the Statify team at Inria, Hien is at La Trobe University in Australia, and TrungTin is at a university in France. I think that all of them are here. By the end of my presentation, you should be more familiar with ABC, I hope. I will start by presenting the vanilla algorithm.

01:24

That is the one you can use in ABC, called rejection ABC, and then I will move on to the semi-automatic approach that was proposed by Fearnhead and Prangle in 2012. What we do is build on this semi-automatic ABC. For this, we have a preliminary learning step where we build what we call surrogate posteriors. These surrogate posteriors are built with an inverse regression approach called GLLiM, and so we call our ABC approach the GLLiM-ABC procedure.

02:02

So I will present some theoretical properties of it and a number of illustrations on inverse problems, and then I will conclude. To quickly give you some context: when we are interested in doing Bayesian inference and the likelihood is intractable, we are in the context of approximate Bayesian computation. So you have a data generating model as follows.

02:36

The parameter is denoted theta, and we have a prior pi(theta); given theta, the likelihood is denoted f_theta, and the data are d-dimensional. An important condition of ABC is that we know how to sample from this likelihood. We also know how to sample from the prior. One goal in statistics is estimation of the parameter given some observed y.

03:12

In Bayesian statistics, this is done by forming a posterior distribution pi(theta | y) that is proportional to the prior times the likelihood. The question in ABC is what to do when the likelihood is intractable, so that it's not possible to evaluate it, maybe because it's too costly or just because it's not available. One way to proceed, the simplest one, is as follows.

03:48

You want some theta values, parameter values, from the posterior, but you're not going to have them exactly from the posterior: in the ABC setup, it's going to be approximate. The way to proceed is as follows. You need to sample quite a large number of pairs of parameters and data, (theta_i, z_i).

04:17

This is simple to do if you know how to sample from both the prior and the likelihood: you first sample theta values from the prior and, conditional on these theta values, sample data values from the likelihood. The key idea of rejection ABC is that you keep the parameter values for which the simulated data are close to the actual data.

04:47

This is done by a comparison with some metric, capital D. As soon as the distance is small enough, you decide to keep the parameters that you have sampled. This distance D can take several forms. In the simplest case, it's the Euclidean distance between the vectors of true data and simulated data. But as we will see, it is more often the Euclidean distance between summaries of these vectors.
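A minimal sketch of the rejection scheme described above, in plain Python. The toy model (uniform prior, Gaussian likelihood in dimension 2), the threshold and all function names are assumptions made for illustration; they are not taken from the talk.

```python
import math
import random


def rejection_abc(y_obs, sample_prior, simulate, distance, epsilon, n_sim, rng):
    """Keep the prior draws whose simulated data fall within epsilon of y_obs."""
    accepted = []
    for _ in range(n_sim):
        theta = sample_prior(rng)          # theta_i drawn from the prior
        z = simulate(theta, rng)           # z_i drawn from the likelihood given theta_i
        if distance(y_obs, z) <= epsilon:  # accept when D(y, z_i) <= epsilon
            accepted.append(theta)
    return accepted


def euclidean(y, z):
    """Euclidean distance between two data vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, z)))


# Toy model (assumed): theta ~ Uniform(-5, 5), z | theta ~ N(theta, 1) in dimension 2.
rng = random.Random(0)
y_obs = [1.0, 1.2]
accepted = rejection_abc(
    y_obs,
    sample_prior=lambda r: r.uniform(-5.0, 5.0),
    simulate=lambda theta, r: [r.gauss(theta, 1.0) for _ in range(2)],
    distance=euclidean,
    epsilon=0.5,
    n_sim=20000,
    rng=rng,
)
abc_mean = sum(accepted) / len(accepted)
```

The accepted values are an approximate posterior sample. Swapping `euclidean` for a distance between summaries, or between surrogate posteriors, leaves the loop unchanged.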

05:28

Already we are faced with a number of questions: what choice for D, the distance, for the summaries s, and also for the threshold epsilon? In this talk and in our work, we don't really discuss the choice of the threshold, but we do discuss the choices for D and for the summaries.

05:58

There are a number of strategies for these choices of D and s. The starting point is the realisation that you cannot really use this simple distance efficiently in high dimension, because you would get too much variability in your procedure. So it's important to reduce the dimension. This can be done in two ways. In the first category of procedures, the effort is put into the summaries.

06:39

In this family of approaches, D is a standard distance, and the fact that you use summaries reduces the dimension and induces a smaller variance. But the problem is that you lose some information, and the choice of the summaries is arbitrary if you don't have expert information on how to do it.

07:13

The work that I already mentioned a couple of minutes ago, by Fearnhead and Prangle in 2012, provided a first solution to this problem. The semi-automatic ABC framework relies on a preliminary learning step where you learn the dependence between the parameter and the data in a generic way. But one of the limitations is that it requires a modest dimensionality for the data.

08:02

The second category of approaches is the one based on data discrepancy. It has been an active research line in the last five years or so, where the idea is to replace the distance between vectors by a distance between empirical distributions. In this sort of approach, you view your data vectors as empirical distributions. So here, with some abuse of notation,

08:42

the vectors are seen as these empirical distributions, and then you can use distances between empirical distributions. A number of distances have been proposed in the literature and are listed here. A clear advantage is that you no longer rely on summaries. But a problem is that it requires moderately large samples: you need replicates of samples for the same parameter theta. And in many inverse problems, you don't have these replicates.

09:24

In the problems we are interested in, you only have one observation. It can be a long observation, but you have one observation to invert for every parameter of interest. OK, so one reason why these ABC methods are interesting, one of the reasons why we can count on them, is that they have well-behaved limits when epsilon goes to zero. This is what I want to present here.

10:06

So here the posterior distribution is written with this intractable likelihood in red. Since it's intractable, ABC replaces it by this blue quantity, which is an integral of the likelihood against the indicator function here in red. Using this approximate likelihood induces an approximate posterior, here in red, that is proportional to the prior times the approximate likelihood.

10:49

The reason why this quasi-posterior converges to the true posterior is fairly simple to see. It relies on the fact that when epsilon goes to zero, the distance between the data vectors also goes to zero. So the set of accepted vectors converges to the singleton made of only the true data, and so the approximate posterior converges to the true posterior. The details of this proof can be found in the references below.
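The slides are not part of the transcript, so the following is only a sketch, in standard notation, of the ABC quasi-posterior the speaker appears to be describing. The symbols \pi_\epsilon, f_\theta and D are notation assumed here.

```latex
\pi_\epsilon(\theta \mid y) \;\propto\; \pi(\theta)
  \int \mathbf{1}\{ D(y, z) \le \epsilon \}\, f_\theta(z)\, \mathrm{d}z ,
\qquad
\{ z : D(y, z) \le \epsilon \} \;\longrightarrow\; \{ y \}
\quad \text{as } \epsilon \to 0 .
```

The integral plays the role of the approximate likelihood, and the shrinking acceptance set is what drives the convergence to the true posterior.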

11:30

One of the starting points of the work is actually a realisation by Florence, if I may mention her, that this condition that the set of accepted z's converges to the singleton {y} is somehow too strong an assumption. You can rely on something weaker for the convergence to still hold. Let me explain in which sense this is true. We can write

12:08

the Bayes formula for the quasi-posterior in a slightly different way. We replace here the joint of theta and data by the same joint, written here in blue, but using the chain rule the other way round. It uses the posterior and the evidence of z. So this is a sort of first realisation. The second, sort of bold, endeavour is to replace the distance between the vectors y and z by a distance between posterior distributions.

12:59

Here there is an overload of notation: D is not the same, but we use the same notation. In this integral, we want to replace the distance between vectors by a distance between distributions. With this, we are forming a new quasi-posterior, written in blue, that is the same as before but where the indicator function is evaluated at the distance above.
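As a sketch of the two steps just described, in notation assumed here (q_\epsilon for the new quasi-posterior, \pi(z) for the evidence): first the joint is factorised the other way round, then the distance between data vectors is replaced by a distance between posteriors.

```latex
% Step 1: chain rule the other way round, \pi(\theta) f_\theta(z) = \pi(\theta \mid z)\, \pi(z)
\pi_\epsilon(\theta \mid y) \;\propto\;
  \int \mathbf{1}\{ D(y, z) \le \epsilon \}\, \pi(\theta \mid z)\, \pi(z)\, \mathrm{d}z

% Step 2: compare posteriors instead of data vectors (D is overloaded)
q_\epsilon(\theta \mid y) \;\propto\;
  \int \mathbf{1}\{ D(\pi(\cdot \mid y), \pi(\cdot \mid z)) \le \epsilon \}\,
  \pi(\theta \mid z)\, \pi(z)\, \mathrm{d}z
```

In practice the exact posteriors inside the indicator are unavailable and are replaced by surrogates.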

13:41

OK, and a first theorem that we prove in the paper is the fact that the quasi-posterior converges to the true posterior in total variation when epsilon goes to zero. Actually, the proof is very similar to the intuition of the original proof for the ABC quasi-posterior. When epsilon goes to zero, you get that the discrepancy between the posteriors also goes to zero.

14:18

That means that the posterior evaluated at z converges to the one evaluated at the true data. In terms of quasi-posteriors, that means the quasi-posterior converges to the true posterior. What we see in blue here is that the convergence, now in terms of the sets in the indicator functions, is a convergence to a set that is potentially slightly larger than the singleton {y}, the set in blue here. It contains y, but not necessarily only y.

14:59

So if you have followed me well, you may ask yourself why it is legitimate to use this unknown quantity in what we do. Of course, in practice, we need to use approximations to these posteriors, and these are what we call the surrogate posteriors. I'm moving now to a few words on the approach proposed by Fearnhead and Prangle. What they do in semi-automatic ABC is to replace the choice of summaries by

15:46

the posterior expectation, the posterior mean. Of course, the posterior mean is a quantity that isn't available, by definition of your problem: this is one of the things that you are looking for. But they suggest using a preliminary linear regression to learn this mapping between theta and z, and this is done by first simulating a large

16:19

number of pairs of parameters and data, simply sampled from the joint distribution. It's the same procedure as in ABC, but done as a preliminary step, so in the end you're going to do it twice. We have a number of contributions in the paper that we call variants in this presentation. I realise that I should not use the word variant, which is too contested these days. The first one was already suggested.
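The preliminary regression step of semi-automatic ABC can be sketched as follows: simulate pairs from the joint, regress theta on the data by least squares, and use the fitted value as the summary. This is a stdlib-only illustration on an assumed toy Gaussian model. The helper names are hypothetical, and the original method may regress on richer features of the data.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_linear_summary(thetas, zs):
    """Least-squares fit of theta on (1, z_1, ..., z_d) via the normal equations."""
    rows = [[1.0] + list(z) for z in zs]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * t for r, t in zip(rows, thetas)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def summary(coefs, z):
    """Learned summary s(z): the fitted approximation of E[theta | z]."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], z))


# Toy joint (assumed): theta ~ N(0, 1), z | theta ~ N(theta, 1) in dimension 3.
rng = random.Random(1)
thetas = [rng.gauss(0.0, 1.0) for _ in range(5000)]
zs = [[rng.gauss(t, 1.0) for _ in range(3)] for t in thetas]
coefs = fit_linear_summary(thetas, zs)
s_obs = summary(coefs, [0.8, 1.1, 0.9])
```

In this toy model the exact posterior mean is the sum of the data divided by four, so the fitted slopes should be close to 0.25. The rejection step then compares `summary(coefs, z)` with `s_obs` instead of comparing raw data.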

16:58

Variant number one has already been implemented in several papers, for instance by Wiqvist and co-authors, and it was already suggested in the original paper by Fearnhead and Prangle: it's about using something other than a linear regression, neural networks for instance. Actually, we can also use our own inverse regression to implement variant number one. Variant number two is to realise that not only the means could be used, but also some higher order moments like variances. This was already suggested by Jiang, but not implemented.

17:39

We guess that the reason why it was not implemented is that it requires your procedure to be able to provide those moments at low cost. The main contribution, variant number three, is to replace summaries by a good approximation of the posterior. This requires two things. It requires a learning procedure that is able to provide these approximate posteriors, and for this we use the GLLiM model, Gaussian locally linear mapping, of Deleforge et al., 2015.

18:17

Then, once we have posterior approximations, we need a way to compare them, and this requires a metric between distributions. I'll stop maybe for a couple of seconds to ask whether there are any questions on what we've seen so far, before I move to the proposed framework. I should move on. So the surrogate posteriors that we propose are built as mixtures of Gaussians. This is one of the babies of the first co-author of the work, Florence.

19:04

The idea of GLLiM is to capture the relationship between data and parameters as a mapping that we learn beforehand. The mapping works as a mixture of Gaussian distributions that are parametrised by two sets of parameters, I'd say: the first set of parameters governs the weights of the mixture, here below, while the second set of parameters governs how the

19:49

Gaussians are parametrised, and there is this affine relationship for the means, an affine dependency in y. To fit these GLLiM models, we need a preliminary learning set, just as the semi-automatic approach does. For this, we sample from the true joint and get capital N pairs. Then the GLLiM relationship is learnt using an EM algorithm. So we estimate a phi star, with K the number of components in the mixture, on this data set.

20:39

We estimate phi star, and then all of the procedure that follows can be done with this single value of phi star. In our case, the three variants that I presented before take the following form when we rely on GLLiM. Variant number one uses the posterior mean as a single summary statistic. Means are available in closed form; everything is in closed form with these Gaussian mixtures.

21:23

So this is the mean that we use. Variant number two is the suggestion that we can add some higher order moments. It happens that the variances are also available in closed form, and they take this form. And then variant number three is the idea that we can use the full surrogate posteriors, which in the case of GLLiM are mixtures of Gaussians.
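To make the closed forms concrete, here is a small sketch for scalar y and theta. It assumes the surrogate posterior is a Gaussian mixture whose weights depend on y and whose component means are affine in y, as described in the talk. The dictionary keys and the numerical values are hypothetical, not fitted GLLiM parameters.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def surrogate_posterior(y, components):
    """Return [(weight, mean, var)] of the mixture approximating p(theta | y).

    Each component is a dict with hypothetical keys:
      pi: prior weight; c, gamma: mean and variance of y in that component;
      a, b: affine map giving the component mean a * y + b; sigma: variance of theta.
    """
    unnorm = [k["pi"] * normal_pdf(y, k["c"], k["gamma"]) for k in components]
    total = sum(unnorm)
    return [(u / total, k["a"] * y + k["b"], k["sigma"]) for u, k in zip(unnorm, components)]


def mixture_mean_var(mixture):
    """Closed-form mean and variance of a scalar Gaussian mixture."""
    mean = sum(w * m for w, m, _ in mixture)
    second_moment = sum(w * (v + m * m) for w, m, v in mixture)
    return mean, second_moment - mean * mean


# Two symmetric components (illustrative values only).
components = [
    {"pi": 0.5, "c": 0.0, "gamma": 1.0, "a": 1.0, "b": -2.0, "sigma": 0.1},
    {"pi": 0.5, "c": 0.0, "gamma": 1.0, "a": -1.0, "b": 2.0, "sigma": 0.1},
]
mix = surrogate_posterior(0.5, components)
mean, var = mixture_mean_var(mix)
```

With these values the surrogate is bimodal with modes near -1.5 and 1.5, yet its mean is 0. That is the kind of situation where a mean-only summary is uninformative and comparing full surrogate posteriors may help.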

21:51

If we want to compare these as they are, we need a metric for Gaussian mixtures, and that's precisely the work that was done in the Delon and Desolneux paper, where they propose a Wasserstein-based distance for mixtures of Gaussians. The distance is referred to as MW2. But other distances can be used, and we also implement an L2 distance between mixtures.
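As an illustration of why Gaussian mixtures are convenient here, the L2 distance between two scalar mixtures has a closed form, because the integral of a product of two Gaussian densities is itself a Gaussian density evaluated at the difference of the means. The sketch below covers the scalar case only, with mixtures stored as lists of (weight, mean, variance). The full MW2 distance would additionally solve a small discrete optimal transport problem between the component weights, using the Gaussian-to-Gaussian cost shown in the last function; that step is not implemented here.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def cross_term(mix_a, mix_b):
    """Integral of the product of two scalar Gaussian mixture densities."""
    return sum(
        wa * wb * normal_pdf(ma, mb, va + vb)
        for wa, ma, va in mix_a
        for wb, mb, vb in mix_b
    )


def l2_distance(mix_a, mix_b):
    """Closed-form L2 distance between two scalar Gaussian mixtures."""
    squared = cross_term(mix_a, mix_a) - 2.0 * cross_term(mix_a, mix_b) + cross_term(mix_b, mix_b)
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative rounding errors


def gaussian_w2_squared(ma, va, mb, vb):
    """Squared 2-Wasserstein distance between two scalar Gaussians."""
    return (ma - mb) ** 2 + (math.sqrt(va) - math.sqrt(vb)) ** 2


# A bimodal mixture and a single Gaussian with the same mean and variance.
bimodal = [(0.5, -1.5, 0.1), (0.5, 1.5, 0.1)]
unimodal = [(1.0, 0.0, 2.35)]
gap = l2_distance(bimodal, unimodal)
```

The two example distributions share their first two moments, so a mean-and-variance summary cannot tell them apart, while the L2 distance between them is clearly positive.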

22:28

OK, so this is a recap of the proposed algorithm. Remember that we have a first, preliminary learning step where we sample a large data set of size capital N, and we learn GLLiM on this data set. We learn this functional relationship between theta and y by getting phi star, an MLE-type parameter estimate. This gives an approximation of the true posterior. Then there is the second step, which is computing distances.

23:09

We need another simulated data set, of size capital M. For a single observed y, we can take two different approaches. The vector summary approach is either variant one or two, with the expectation, or with the expectation and variances, or log variances.

23:33

The functional summary variant consists in directly comparing the surrogate posteriors, either by MW2 or by L2. Then the sample selection is the usual thing: you only retain the samples with the smallest distances, by choosing epsilon, as usual, as a quantile of the empirical distribution of the distances. OK, so I'm moving now to some asymptotic properties of our procedure. I can take questions if there are any.
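The selection step described above, with epsilon set as a quantile of the empirical distribution of the distances, can be sketched like this. The function name and the toy inputs are assumptions for illustration.

```python
def select_by_quantile(thetas, distances, quantile):
    """Keep the parameters whose distance is among the smallest `quantile` fraction.

    Returns the kept parameters and the implied threshold epsilon.
    """
    n_keep = max(1, int(round(quantile * len(distances))))
    order = sorted(range(len(distances)), key=lambda i: distances[i])
    kept = order[:n_keep]
    epsilon = distances[kept[-1]]  # largest distance among the retained samples
    return [thetas[i] for i in kept], epsilon


# Toy inputs: 1000 simulated parameters with distinct distances 0, 1, ..., 999.
thetas = list(range(1000))
distances = [float((i * 37) % 1000) for i in range(1000)]
kept_thetas, epsilon = select_by_quantile(thetas, distances, 0.01)
```

With a 1 percent quantile and 1000 simulations, ten samples are retained. The same function applies whether the distances come from vector summaries or from a distance between surrogate posteriors.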

24:15

Otherwise, I'm happy to move on. Looks like there's a question from Jeff. Jeff, if you'd like to just unmute yourself and go. Yeah, certainly. I just want to check my understanding: the y that is there is quite high-dimensional. It's not just a data point, it includes the whole data set, is that right? Or is it just a data point? So the object is a d-dimensional vector, and d is not that high in our applications.

24:57

I'm going to comment on it later on, but it can be up to one hundred or one thousand dimensional. So it's a full data set, in the sense that you have multiple copies of the data? Actually, not really. In the inverse problems that we are interested in, the specificity is that you have one observation, and you want the associated parameter for this single observation.

25:36

Great, thanks very much. Thank you. OK, so I've already mentioned the first theoretical result that we have, which is that the quasi-posterior converges to the true posterior when epsilon goes to zero. This is not really an applied result, because it relies on the fact that we use the exact posterior, which we actually don't have. The next one is more of a practical theoretical result, because we plug the actual surrogate posteriors

26:16

into the quasi-posterior that we are working with. I have to acknowledge that this result only holds on a restricted class of target and surrogate distributions. We actually need compactness to be able to prove the result: compactness of the joint space of theta and y, as well as compactness of the parameter set that contains the parameters defining a family of mixture components.

26:57

Under these assumptions, we build K-component mixtures from the family capital H. We use this learning set D_N and compute, just as before, the MLE parameter as the quantity that maximises the likelihood in phi, and the surrogates are built as the mixtures evaluated at this phi star.

27:31

What we can prove within this framework, under some additional standard assumptions that I detail here, is that the Hellinger distance between our approximate posterior and the exact posterior converges to zero, in some measure lambda with respect to the true data y, and in probability with respect to the sample D_N.

28:06

An important caveat of our result is that GLLiM actually does not satisfy these compactness assumptions, but we hope that some version with a mixture of truncated Gaussian distributions could actually meet these restrictions. So what I want to say is that this theoretical result does not directly apply to GLLiM. So I'm moving to a couple of... there is a question, yes.

28:58

So is it important to work with the Hellinger distance? I guess it's important to work with a distance that you know how to deal with, but Hellinger is a strong distance, so maybe you only need a weaker distance. Yes, I see. Yeah, it's a good point, which might avoid the compactness assumption. For sure we were not able to avoid this assumption; we did not see how to avoid it, even with other choices of distances.

29:45

But it's probably a direction to investigate more. Yes, it's a good point. OK, thank you. So I'm moving to the illustrations, where we have two examples with multimodal posteriors. One point to keep in mind is that our approach is expected to work best in the case of multimodal posteriors, so this is the reason why we focus essentially on these examples.

30:32

In both examples, we have a 10-dimensional observation, and it's a single observation. Maybe as a follow-up to the question: in the case where the actual observation is a very, very long observation, maybe it can just be summarised to a smaller-dimensional observation. The first example is a synthetic sound source localisation with 2D parameters, and the next one is a real problem with 4D parameters.

31:08

We compare four types of ABC methods: the one based on GLLiM with only the expectation; GLLiM with the expectation and the variance; GLLiM with functional summaries, compared either by L2 or by MW2; and then the Fearnhead and Prangle semi-automatic ABC. For each of them we rely on R packages, and for GLLiM we rely on the xLLiM package that was proposed by Florence and co-authors.

31:48

Sorry, may I ask: so you're not comparing to the surrogate itself? Because in a sense it is already your posterior. Yeah, we do compare to it as well. That's a pretty good point, I should have listed it, because it's a totally legitimate question. You start with an approximation of the posterior, so maybe you could stop there. And we can see that we refine this approximation with the ABC step,

32:23

at least in our examples. So this is the setting, and we do a rejection ABC; we expect that we could also use other sorts of ABC algorithms. The numbers are as follows: capital N is ten to the five, the number of ABC simulations is ten to the five as well, or ten to the six, and epsilon is the 0.1 percent quantile.

32:58

The first application arises in sound source localisation: you want to infer the location of a sound source, so it's just a 2D parameter, the (x, y) you want, based on a number of sound measurements. One way to get the sound measurements through some devices is the ITD. I guess that this is a bit how the ears work.

33:40

You have a pair of microphones, and from this pair of microphones located at m1 and m2, you're able to compute this function, which depends on the parameter theta. The problem, or maybe it's not a problem but that's the way it is, is that you have a hyperbola of solutions, actually a two-sheet hyperboloid of solutions. So how do we sample? This is a simulated example, and we sample observations in the following way.

34:17

We assume a single theta, and we assume that we observe y's that are Student-t noised versions of the ITD. So this is the ITD plus some Student noise with quite a small variance and nu equal to one degree of freedom. So this is a really, really bad noise, with no expectation. And the dimensionality is 10.
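A rough sketch of how such an observation could be simulated. The microphone positions, the noise scale and the omission of any constant factor (such as the speed of sound) in the ITD are assumptions made for illustration; only the structure follows the description, namely a difference of distances to two microphones plus Student-t noise with one degree of freedom, which is Cauchy noise.

```python
import math
import random


def itd(theta, mic1, mic2):
    """Difference of the distances from the source theta to the two microphones."""
    return math.dist(theta, mic1) - math.dist(theta, mic2)


def simulate_observation(theta, mic1, mic2, scale, d, rng):
    """d noisy ITD readings; Student-t noise with 1 degree of freedom is Cauchy noise."""
    centre = itd(theta, mic1, mic2)
    # Inverse-CDF sampling of a standard Cauchy variable: tan(pi * (U - 1/2)).
    return [centre + scale * math.tan(math.pi * (rng.random() - 0.5)) for _ in range(d)]


# Hypothetical geometry and noise scale, chosen only for illustration.
mic1, mic2 = (-0.5, 0.0), (0.5, 0.0)
rng = random.Random(2)
y = simulate_observation((1.5, 1.0), mic1, mic2, scale=0.1, d=10, rng=rng)

# Mirror-image sources give the same ITD, so the parameter is not identifiable.
mirror_gap = itd((1.5, 1.0), mic1, mic2) - itd((1.5, -1.0), mic1, mic2)
```

The zero `mirror_gap` is a one-line illustration of why the posterior is multimodal in this problem: every point on the same level curve of the ITD explains the data equally well.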

34:53

Actually, what I show here is not exactly this illustration; it's a slightly different one where we still have one true position to discover, but instead of using one pair of microphones we use two pairs. The microphones are located with one pair on the x axis and the other pair on the y axis. The likelihood in such a model is an equal-weight mixture of the two single-pair components.

35:33

This is the shape of the true posterior; we can easily find the shape by working with the ITD function. It exhibits four symmetric hyperbolas. We can also use a Metropolis-Hastings algorithm to sample from the posterior, but we see here that it's not doing really great; maybe this is because we didn't tune it well enough. So let's look at the results now. Let's start with the mixture in red at the bottom left.

36:22

This is to reply to the previous question. This is what we get with only the preliminary learning step. We can probably see the Gaussian components here; we have a number of them, maybe something like eight or a bit more, but it's not a perfect representation of the posterior.

36:46

Then I want to move to the two occurrences of variant number one, that is GLLiM-E-ABC and the semi-automatic ABC: they are not doing OK at all. Then we have the last three, which are doing quite OK. Of the three, the expectation-variance one is doing maybe something a bit intermediate, due to the values that we see in the middle.

37:21

It's a more spread-out posterior. So this is very interesting. And variant three, the functional one, is doing really, really well, I think, and quite similarly in both cases. Can I ask another question? There is something I don't understand. You construct a first table to learn your GLLiM. Yes. And then you use your second table

37:52

to do the ABC. Yes. If you had merged the two tables together and had done just one GLLiM, you'd get something better than the left-hand side at the bottom. So would you do as well? What do you think? Because you would have a lot more data. Yes. This reminds me of discussions that we had with the co-authors; maybe I can have some help from the co-authors in the chat.

38:28

So I don't know. Actually, one of them was replying to your first question, so maybe you can have a look at the chat for that. And yes, Florence says that you can also use a single table in the learning step. I guess she did try the mixture obtained by merging the two. I agree that you're going to get something better than what is represented here, but you're not going to get something as precise as the ABC version, the functional one.

39:12

In a sense, if you want to save some computation time, I guess that you could also do both steps on the same learning set. But maybe that wouldn't be very Bayesian. OK, so I have a second and last illustration that comes from planetary science. It is an inverse problem, and the idea is to recover parameters of the surface of a planet,

39:49

for instance the surface of Mars, from what are called reflectance measurements. This is a typical inverse problem because the direct model is easy: you know how to get the reflectance y based on some low-dimensional parameter x, and you get noisy measurements of these quantities. In the application, we focus on a small number of parameters, these four.

40:30

I have to say, I don't know what they mean. The reflectance is high-dimensional, but you can do with only 10 geometries of this reflectance, so you can really compact your observations to something quite small. In this case, the settings that are used are given above: 40 components in the GLLiM mixtures, capital N and capital M both equal to ten to the five, and epsilon chosen in the same way as before.

41:11

This is also a simulated example, where the parameters were set equal to these values. They are meaningful for this particular application, and we set them this way because Florence is also working with these scientists on other projects, so she knows that these values make sense.

41:42

The example is devised in such a way that the theta value has this symmetry between two potential values: both 0.15 and 0.42 make sense for the model. If we look at the results for the marginals of each of the four parameters, what we have here is the GLLiM expectation variant, both functional variants with L2 and MW2, and the semi-automatic one in green.

42:23

For most of the parameters they are doing very similarly. They are maybe slightly more peaked for the w parameter, both the blue and the black, meaning the two functional ones. What is interesting is that, again, when there is multimodality in the posterior, these two procedures, the functional ones, are the ones that seem to recover it best.

42:52

The black has this bimodality, and the blue is also getting this bimodality, which red and green do not really have, or maybe less pronounced. We only show marginals, but the same can be seen from joint representations.

43:20

OK, so I guess I need to conclude. What we have worked on builds on the semi-automatic framework of Fearnhead and Prangle, but with this shift of paradigm in the sense that we use surrogate posteriors to compare observations instead of comparing summaries of observations. This requires a tractable and scalable model to learn the surrogates, and GLLiM is one such possible model.

43:57

It works well, as I was saying, with observations of hundreds or thousands of dimensions, and it can handle missing data and variables. Then we need metrics to compare the surrogates, and we've used L2 and MW2. A few first results are that we no longer need summary statistics, and we have convergence results to the true posterior, with the caveat that they only hold in a restricted class of models.

44:38

We have good performance when the posterior is multimodal. And it seems that the quality of the surrogate posteriors is not critical in the experiments that we have run. GLLiM is doing OK, as we have seen: it's not a perfect approximation of the posterior, but it's good enough for our procedure to do something good.

45:05

On some of the experiments that we have run, the Wasserstein-based distance seems more robust than L2. There are a number of perspectives. It's still very young, very fresh work, so there are a lot of improvements that we can think about. For the choice of K, for the moment we have no information criterion to select the number of components; we could think about one. And we haven't assessed the computational cost,

45:44

but that's probably something to do, as well as doing more experiments and illustrations. Other metrics than L2 and MW2 could be thought of, or other learning schemes than GLLiM. One option that I was discussing with Jeff before the talk is, for instance, normalising flows; that would be interesting to check.

46:14

Also other ABC schemes than the vanilla rejection: you can think about importance sampling, MCMC or sequential Monte Carlo. We haven't spoken about the threshold level, nor about extending this to more than just one observation. This is clearly something that we want to think about. So we have a number of critics to thank them, can come and work and sit around with it.

46:42

And very quickly, I would like to pass on this message. I don't know if there are any students here, on top of Franscisco from the 2017 PhD cohort, but we have these open postdoc positions. The subject can be anything related to the themes of the team, for instance Bayesian statistics. The condition is that the defence is before the end of the year, and the application deadline is really soon, before May 21st. So you can write to us if you're interested.

47:17

I'll just finish with this slide of references, including ours, and thank you very much for listening. Thank you very much. You're getting a few virtual rounds of applause from the audience, I can see. We've got a few minutes for questions. So if anyone would like to ask a question, just put your hand up and unmute yourself. Yes, I can see you first, and then we'll have Judith after.

47:55

Thanks very much, Julian, very interesting. So my question is about the surrogate posterior, as I think you call it. Is it necessary and sufficient, is it necessary, or is it sufficient, for the surrogate posterior to converge to pi? Is it necessary for the surrogate to converge in order for the distance to converge? Yeah, I guess it's necessary.

48:30

Yes, I would say so, and no idea whether it's sufficient. But I was guessing you were going to say the other way around: that it's obviously sufficient, but perhaps not necessary. Yeah, maybe if I can add something? Yeah, yeah, please. Yeah, I think you're right, Jeff. We just need the surrogate posteriors to be sort of discriminative enough on the parameters.

49:06

I mean, let's say we have a biased estimation of the posterior. I don't know if we can say that, but if the bias is, like, constant somehow, then when you compare two biased estimations, the distance between those biased estimations would be the same as the distance between the two posteriors. And then I think it would work. But the problem is that, in practice, we don't know how to formalise this idea.
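
The "constant bias" intuition above can be checked in one very special case, sketched below. This is only a toy illustration under strong assumptions: both posteriors are one-dimensional Gaussians, the bias is the same translation of the mean for every surrogate, and the distance is the 2-Wasserstein distance, for which a closed form exists between Gaussians. The numbers and variable names are hypothetical, and this does not formalise the general idea, which the speakers say is open.

```python
import math

def w2_gauss_1d(m1, s1, m2, s2):
    """2-Wasserstein distance between N(m1, s1^2) and N(m2, s2^2)."""
    return math.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)

# Two hypothetical "true" posteriors (mean, std) for two different data sets.
post_a = (0.0, 1.0)
post_b = (2.0, 0.5)

# Surrogates with the SAME shift of the mean (the "constant bias" assumption).
bias = 0.7
surr_a = (post_a[0] + bias, post_a[1])
surr_b = (post_b[0] + bias, post_b[1])

# Surrogates with DIFFERENT shifts, for contrast.
uneven_a = (post_a[0] + 0.7, post_a[1])
uneven_b = (post_b[0] - 0.3, post_b[1])

d_true = w2_gauss_1d(*post_a, *post_b)
d_surr = w2_gauss_1d(*surr_a, *surr_b)
d_uneven = w2_gauss_1d(*uneven_a, *uneven_b)
print(d_true, d_surr, d_uneven)
```

With a common shift the distance between the surrogates matches the distance between the posteriors, while a bias that differs between the two surrogates changes it.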

49:42

Yeah, I don't know if you have suggestions, but... And Judith, would you like to? Can you go back to your conclusion? Because I can't remember my question without that. I just have to wait. So you said that you want to extend to i.i.d. observations. I don't quite know what you mean by that, because somehow the y that you consider as your observation could be anything.

50:16

It could be i.i.d. observations, it could be a long vector, or a long time series, or replications of time series, or whatever. You don't need structure. The only thing that you need is that you want to approximate the posterior by whatever method you're using. So here it is y, but y could be one observation, or y could be a big one. Yeah, no, what we do...

50:43

It heavily relies on the fact that for one parameter we have only one observation, and an i.i.d. sample of observations makes it a little different, a different setup. And then, I don't know. So is the reason that you wanted to ask whether you had some asymptotics as well, in the dimension of the observation somehow? Maybe that has to do with the theory.

51:17

But I think even in the actual implementation of what we do, we would have to think differently for building, for fitting GLLiM, for instance. Yes, it seems like GLLiM is related to the fact that you have a specific structure in your y, in a sense. That's what you're saying? Yes. Yes.

51:39

So, yes, we would have to fit a model with, in a sense, a y that is no longer just one observation but a table of observations. But, I mean, for me, if you have n i.i.d. observations, you can look at them as one observation as well, which is a big vector of observations. And at the end of the day, you're constructing your table: for each theta, you have a big z, and then you...

52:17

You'll fit your conditional: you will estimate the conditional distribution of theta given z, in a sense. That's the model, which is an estimate of the distribution of theta given z. In a sense, yes. So whether z is one observation or a vector of i.i.d. ones, I mean, in practice, in the fitting, in the way the methodology is derived, the principle should be the same. Yeah. Maybe you lose something in doing it this way.

52:53

So I see Florence saying that you'd probably want to do something more clever. No, you're right, in the sense that we could stack all the i.i.d. observations in one big vector. But then it's a pity not to let the model know that they are i.i.d. Yes. So you want to take this structure into account? Because if you use, for instance, a discrepancy-based method, that's exactly what they do. They know the data and they know that they all come from the same distribution.
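
To make the point about stacking concrete, here is a small sketch of what is lost when an i.i.d. sample is treated as one long ordered vector. Everything in it is an assumption for illustration: the toy Gaussian sample, the helper names, and the use of sorted samples as a simple order-invariant discrepancy (for two one-dimensional samples of equal size, this is the 2-Wasserstein distance between their empirical distributions). It is not the method used in the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
y_obs = rng.normal(0.0, 1.0, size=n)   # observed i.i.d. sample (toy data)
y_sim = rng.permutation(y_obs)          # the same values in a different order

def stacked_distance(a, b):
    """Root mean squared difference between samples stacked as ordered vectors."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

def sorted_distance(a, b):
    """Same formula after sorting, so the order of the draws is ignored."""
    return float(np.sqrt(np.mean((np.sort(a) - np.sort(b)) ** 2)))

d_stacked = stacked_distance(y_obs, y_sim)
d_sorted = sorted_distance(y_obs, y_sim)
print(d_stacked, d_sorted)
```

The two samples carry exactly the same information about the parameter, yet the stacked comparison sees them as different while the order-invariant one does not.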

53:30

So, yes. And so it would not be in favour of GLLiM not to know that. Mm hmm. And also, I think it's easy to adapt. The current implementation is not made for it, but it's just an algorithm, so it's not too difficult to adapt it. And also for computational reasons: if you have a very, very big vector, for instance, at the moment we cannot deal with two million or whatever. For those using discrepancy-based methods, the data may not be that large, but they have a lot of repetitions.

54:11

It's a bit reminiscent of something that she was doing some years ago, when she was using mixtures as a way to approximate the density, in a sense, within the algorithm. I remember the talk from some years ago. I know. Yeah, we have to check, because indeed Sylvia worked a lot on mixtures of experts. Yeah, and so I wonder how related it is to what you're doing.

54:42

And I think in her case she could compute the likelihood, so it's a bit different in this respect. But she was definitely using these mixtures to approximate the posterior density at some stage in her algorithm, and then maybe as a proposal. Yeah, I can't remember whether it was a proposal or whether, in the end, she didn't bother, getting rid of the proposal, and was just looking at these mixtures as an approximation.

55:04

I'm not sure. It's worth checking, but yeah, they could be related, these approaches. Because she has a full book on mixtures. And she has an army of people working on that. Yeah. I see that we are close to the end of this, but there was also another slot for one-on-one discussion. So I'm happy to stay connected. It's another Zoom link, I think. Really? Yeah. Yeah. So if anyone can open this one, I'll stay there. So happy to continue the discussion there.

55:51

But if anyone would like to chat with Julian one on one, just pop me a quick email and we can set this meeting up to continue the discussion. But otherwise, unless there are any really quick questions, I think we will call it a day here. So thank you, everyone, for attending. And thank you, Julian, for a great talk. Thanks for the invitation, and my pleasure. And everyone can look out for next week's talk and who the speaker is.

56:31

But there'll be, of course, another small seminar for the next few weeks. So thank you again, Julian. Thank you. All right.

Transcript source: Provided by creator in RSS feed
