[Auto-generated transcript. Edits may have been applied for clarity.] So. If you haven't. Okay. Um, I'd like to welcome everybody to this chance is screen Lecture, which is our distinguished lecture. Um, before I introduce the speaker, one of our sponsors of progressive content, uh, who has been hosting this series since 2015. And if any students or researchers want to meet that afterwards, you can do that during coffee. Um, one last administrative thing up at the top.
We're going to have a Q&A here and we'll have some microphones running around. So if you want to ask a question, um, we'll do it that way. Okay. So I am super delighted to introduce Chef Nicole Foster's business about, um, the best speaker I could actually ever imagine. I would think there's a conference in the room like this where the other people concurring with this opinion.
Uh, she is a professor at Berkeley, um, and the former director of the Simons Institute for Theoretical Computer Science. Actually, she's just stopped me now, but she retains her role as research director of the. We're still you have to support, uh, Simons seeds. Um, sound. In addition to that, she is a professor at M.I.T. and also emeritus advisement. So she's done very many locations. Um, when I decided what to tell you, her the world's accomplishments are so significant.
And what I decided to do is to read you just a tiny bit from the citation of just a few different major awards. And I'm going to start with the prize in 1993. So Jaffe was awarded that prize for her paper, The Knowledge Complexity of Interactive Proof Systems. And I've shortened these a little bit tonight because of her time.
But the citation points out that she until the colleague introduced the concept of interactive systems, providing a rich new framework for addressing the question of what constitutes a mathematical proof goes on to say that this led to exciting developments, including the resolution of several major problems, about the difficulty of funding European institutions to comfort level optimisation problems.
Um, so a little prize would be pretty much enough for anybody, but, um, Jasmine's got two of them. Yeah, she had another kernel prize in 2001 for her paper, Interactive Approach to the Hardness of Approximating twigs. Uh, with, uh, if I didn't know much difference.
And, um, here the citation says that they developed a completely new understanding of what an algorithm property was in terms of art, with their deep insights and powerful results evoking different factors for many problems science, for example, computation and photography. The last one of them generated was Turing Award, which is pretty much the biggest prize you could ever get in science.
Um, so I'm looking forward to 2012, along with, um, the quality, um, um, citation says for transformative work that laid the complexity theoretic foundation for science, cryptography and in the process pioneered methods for efficient prosecution of mathematical proofs. But so Shafi is a member of the National Academy of Sciences, engineering and USA. Um, the Russian um Israel Science um, a former member of the um Russian side, Peter, and a bunch of others.
She's also got a long list of honorary degrees, including here at Oxford. Um, we are completely up for today. Um, speaking to us about privacy verification and robustness, a cryptographers perspective on on them. So thank you very much. Uh, discussion. So if you want a darker I. So. Hi everyone, and I'm very honoured to be invited for your lecture and also to have received an honorary degree here in Oxford. It's so beautiful. It's a little hotter than I thought. Uh, so okay.
Um, so the title is a little different than what I gave originally, but most of the same when you look at the content of the talk. So I'm going to call it constructing and deconstructing Trust in I totally a cryptographic perspective. So, um, you know, I've given some version of this talk without most of the results that are here, uh, maybe, uh, five, six years ago. And at that point, I don't know that I would have used this slide.
But at this point, it's very clear that machine learning revolution is really the scientific breakthrough of our times. And, um, that started with deep learning. Now with the large language models, it's kind of achieved an entire level altogether. And when Leslie was saying about the greatest honour in computer science, I was thinking about these Nobel Prizes. Oh, for machine learning people.
Um, so, you know, obviously it raises these very deep questions about human versus machine intelligence and whether large language models, this project that Michael and I'm working on will even give us the ability to translate non-human communication people. This is all talking about AGI and superhuman intelligence. They're betting markets about when that's going to happen. Um, you have uh, already the the oh one doing very well on competition math, competition code, PhD level science questions.
There's a whole benchmark of new questions out there which we don't know the answer for. We all agree. I think with this slide, and you don't need me for that. The leading is to the fact that, uh, whether or not we believe that there will be that this is the biggest scientific revolution,
it's going to solve the Riemann hypothesis, whatever. Uh, we do know already that is huge, uh, problem in not only promise, but, uh, for applications, whether it's health, drug discovery, infrastructure, natural language processing, policing, uh, financial markets. And all of this indicates, really, that there's a major shift of power, which we're all aware of.
For those who have data and those who have algorithmic expertise, and one can argue in, uh, in debate where the data is more important or algorithms are more important. But what is very clear is that's where the power lies and all that's beautiful. Uh, the problem that, uh, this talk is about is should we really trust models to the extent that we are very rapidly, uh, trusting, uh, we don't understand and we don't control.
And, uh, I think at least for myself, maybe because I'm of a certain age, you know, that makes me very nervous. But I think it should make everybody nervous, uh, when we're talking about critical applications. So, uh, since we talk about trust and since I'm a cryptographer, I want to tell you a second something about cryptography. So often people think about cryptography, basically just an arsenal of tools.
You know, there's encryption, there's digital signatures. People talk about obfuscation, zero knowledge proofs, metamorphic encryption and so forth. Okay, so the beautiful constructions out there, some of them are used in practice. Some of them will be used in practice. Uh, but if you want to sort of zoom out from all these tools, essentially all of them aim at the following. And that is how to use technology that you don't trust, uh, as if you should trust it.
So even though you are assuming that there are adversaries, you know, somebody is trying to listen to your messages, somebody is trying to pretend to be you to to, you know, have access to your bank account. Somebody might want to, um, reverse engineer your program. Is in the case of application, you want to say, I don't want to think about it. Well, I'm using this.
So what cryptography does. It comes up with abstractions and tools that enable you to sort of pretend as if these adversaries don't exist. So it is sort of a, uh, um, uh, a zooming out view. It's a way to establish trust in technology. And not only that, it's, um, this is true for each one of these, uh, tools that I've mentioned, but there's sort of almost at this point, uh, um, a cookbook recipe when you want to solve a crypto problem.
You kind of define the task that you're talking about. Uh, whether it's encryption or digital signatures and so forth. And then a very, very important step is you model what's called an adversary. So again, an adversary would have powers. He might be a passive adversary. Just be listening. It might be an active it might be changing messages going on the line and so forth. But you're supposed to model it mathematically. Precisely. Then you define what it is a secure solution.
So it might not be 100% what you'd like. What what you define it. At least it's mathematical definition of what security of solution would be. And then you construct something and most importantly, you have a security proof. So you want to say if, um, an assumption, you know, all of a sudden assumptions come in.
So we'll say something about that in a minute. But you say under very well specified assumption, if they hold, then this, uh, primitive that you constructed is secure according to the definition that you've laid out. Okay. And it's sort of the history of cryptography in the last, at least since I've been involved in it, is that these reductions between well specified assumptions and proofs of security is what the game is about.
Um, so without proofs, we're saying we have nothing, really, because we don't know if somebody is breaking the scheme and not telling us, then what's the point of the whole thing? So the proofs are really the main thing in our disposal. Now what kind of assumptions? I say the word assumptions because they are used all over cryptography and a lot of of course the work is to try to minimise them. They really fall into sort of uh, some categories.
One, which is the one that we mostly hope for, is that we believe that certain problems are hard. Certain computational problems. I'll talk about one of them during the talk. Usually people know about factoring integers. Here's a problem. We don't know how to do it efficiently. Let's assume there is no such algorithm. And now build a primitive and prove that it's secure under that. There are other assumptions you could talk about.
There's several entities in this system, and maybe they or each one of them might be adversarial, but they don't all collude, uh, against you. Another type of assumption would be a physical assumption. Uh, a physical assumption might be that, uh, you know, if you're talking about quantum physics or something that there are quantum computers, uh, might be some other assumptions of distance between that you're blocking, uh, two entities or they can't actually transfer information.
And these days, people talk about trusted hardware. The point of view of this line is to say that, uh, we would like to minimise we don't wanna use all of these assumptions. We'd like use none. Sometimes we'd like to use 1 or 2. But important part is really to emphasise this, uh, security proof. So what does this have to do with machine learning? Oh, by the way, I just want to say that sometimes you show that it's impossible.
So achieve security, you define a security, and then you show that maybe even also under the assumption that any scheme would break, you know, and um, okay, so machine learning. So the proposal and in some sense this sort of the idea of this whole program of research is to use the same kind of recipe, uh, for machine learning, uh, meaning that, uh, if there's a task, we'll define it first of all, properly.
And that's already, you know, I mean, I assume that a lot of machine learning researchers here, and certainly there are people who know more about machine learning than I do. Um, but I have I think we all have to agree that if you read any of these very fascinating papers with incredible ideas and breakthroughs, they very rarely define what they're doing. So, um, so if you talk about alignment, you know, I would like I want to work on alignment, but and I will work on alignment.
But I have to define it myself. And you might not be consistent with other people's understanding, but there are no definitions. Uh, there's words, uh, and um, words are, I guess, definitions also words. But in any case, and then in very important part is also who is the machine learning adversary. You know, maybe in cryptography it's more obvious. It's not necessarily the case here, but we'd like to define it.
And then we want to define what it would be. Since we're talking about trust what would be a trustworthy solution. So give, uh, a definition. And then finally you might try to come up with a solution. And we want to focus on the fact that, again, we would like to have some kind of proofs that under maybe some assumptions, this solution that you have constructed is trustworthy with respect to your your definition of trustworthy.
So maybe maybe this is more important slide than anything that it would be. It would be good to be able to have a definition. Maybe it's not going to be the right definition. Maybe later on there will be a more general one, or maybe they'll be another one that you could show equivalent, uh, but that there will be some kind of formal thing that you're trying to to satisfy. Uh, and again, what assumptions? It could be the cryptographic assumptions. It could be others.
So this is sort of the mindset for this talk. Okay. So um, and again, as I said before, possibly you what you might be want to achieve, like you define what what alignment is you you define what your adversary is. Uh, you might not be able to achieve the definition you came up with, and you might even be able to prove that you cannot achieve it. Exactly. So what do people want from you? Or you could change your definition, or you could change your solution.
And I'll show you in this talk a couple examples where you can actually, uh, achieve, uh, what we want and couple in one assumption where you can prove that it's impossible to achieve it, and then you have to contend with us. Okay, so, uh, just one word before. Sort of uh, go into these examples which are verification, robustness and privacy, like in the title that was advertised.
Um, I want to say that the first, I think, um, gut level reaction by people in machine learning is or at least it used to be, uh, is that whereas cryptography is designed with an adversary in mind, if there's no adversary, there's no problem. I mean, why should you encrypt if nobody's listening, you know, why should you protect, uh, accessing, uh, Amazon if nobody's pretending to be you?
But machine learning, that's not the mindset. You didn't come up with machine learning solutions because you're trying to get over an adversary. But the fact is. So maybe this whole adversarial thinking is not the right one. But the fact is that right now, AI systems and for the foreseeable future are very attractive targets. It's a huge payoff to break them in many different ways, whether it's on a national level or a commercial level.
And therefore, if you think about an adversary, it's a good way to think, because if you could show that things work in the presence of an adversary, you should feel safer than if you assumed, uh, a particular behaviour on, on, uh, on part of the adversary. So my second point is we don't when we talk about adversaries, we would like to prepare for worst case adversary. So we we we don't want to say, oh, the adversary uses a particular strategy.
Let's say, uh, those people are familiar with, um, you know, adversarial examples. There are solutions that say, well, if the adversary does the following to an image, say, then I can protect against it. If it's use restricted, the ability is restricted to some transformations of images. But why should the answer be restricted that they can put a patch on a stop sign? And I'm saying things we've all seen in other talks. Um, so we'd like to, if possible, to prepare for worst case adversary.
But one thing we do throughout cryptography, and also throughout this talk, is that we do assume that the adversary is not is computationally unbounded. So in other words, it cannot factor integers if you like, but more it doesn't have more than it doesn't have exponential time. So there is some bound on the running time of the adversary or other resources as well. The resources these days could be samples, you know, there's a distribution or there's data and there's some at this cost.
So there is some bound on how much data, how much time, how much. And again, when we talk about bound I mean more than polynomial okay. Um so. Uh, what are the challenges that I will talk about, which are sort of an example for this approach? Um, so here it looks really pale. You can actually see the screen. Yeah. Okay, fine. Um, good. So there will be I'll talk a little bit about privacy.
Uh, kind of aiming at this issue there. A lot of the power, at least initially, came from data of individuals and the, um, the training data. And, and we might not want to give over medical records and so forth, or might not be able to because of legal restrictions. Uh, then I'll talk about verification. And this aims at the fact that these models I'm not training my own model. You know, I'm using a model that was trained by others, whether it's DeepMind or, or, um.
OpenAI or whatever company, and we would like to have some way to verify that these models that are presented to us are good. Okay. So they haven't been tampered with for whatever, uh, for whatever reason. And lastly, uh, I'll mentioned about robustness. So that is uh, we have the data today. We train. It looks good. It's often with respect to some benchmarks or some holdout information.
But what about distribution shifts and data of the future? Uh, how can we trusted the models robust with respect to change in data. Okay, so those are the three, uh, areas that I'll talk about. And you can already see that there is if you had to sit down and say, okay, so who's the adversary? It is very clear the adversary in the first might be the training algorithm that's getting our data.
The adversary in the second might be this company that's training a model and might be not doing it the way that they claim they're doing it, or maybe saving on resources and therefore doing something different. And the adversary here is in some sense even nature. Uh, so distributions do shift. And what do you do then. But he could also be much worse adversary okay. So just to say um, very simplified version. So we'll talk a little bit about what happens during the collection and training phase.
Then post uh, development of a model so called the audit phase. And then when it's being used and uh, again, very, very high level, I'll think of machine learning just in these two, uh, places. I'm not looking at, um, you know, reinforcement learning or anything. Just think of a training algorithm gets inputs from a distribution labelled x, y and outputs a model H. So h for hypothesis a model and once a minimise loss according to some loss function.
And then the other phase will be there is already the training. Uh there's already the model H and now we are getting some individuals data and getting some prediction probability. But more than that this could be a a language model. So this is a query and this is an answer. Um, so not just about prediction but also generation. And um it could be distribution over sequences or sequences. So let's be very general in kind of thinking about what I'm aiming at. So I'll start with privacy.
So in some sense privacy is kind of the easiest, um, problem, not necessarily the easiest to achieve in terms of the computing power that's necessary, but it's the easiest to define because we have so much experience, you know, especially as cryptographers in trying to protect privacy of data. Um, so there are lots of tools, you know, the secret sharing, uh, some of these tools, I will mention it, but just kind of is a laundry list here.
The longer one, private information retrieval, secure computation and one morphic encryption and so forth, there's lots of tools to our disposal. And in some sense, we can use all of them. Uh, and they're all defined to have certain, uh, secure, achieve certain security under certain assumptions. And they kind of naturally fit to some of the problems that come up naturally. Uh, in, in machine learning, uh, in particular, we're going to use homomorphic encryption just in the next two slides.
So homomorphic encryption is this little chart here. It's an encryption method that has some special property that not only that you can encrypt and then uh, decrypt, but you can also do something we call evaluate. And that is you could take the encryption without understanding what is x, just what the ciphertext is. And if you run an evaluation on it so that at the end you get, uh, a new encryption.
But what's being encrypted is whether it be encrypted X, it'd be encryption of F of X. And you could apply that encryption step on this new C prime and get back uh f of x. So that's really and the question is which FS are being supported. Could be addition could be multiplication. It could be any circuit. Um and that's one of the biggest, probably the biggest invention in the um the last 20 years in cryptography. There's around 2008, the first homomorphic encryption scheme was proposed.
And it would be extremely useful also for what I'm trying to do here. So for example, just to kind of get us warmed up, suppose we want to actually, uh, apply it to the training data problem. Uh, again, this is sort of, I think a homework problem once you understand the definitions to, to do this slide in a fairly low level homework problem. But, you know, this is, uh, just to set the stage. Um, so let's say that all the data is encrypted now and who encrypted.
That's a good question. But let's say you encrypt it for some kind of, uh, trivial example. Each person encrypt their medical file, or each doctor or each hospital encrypts all the medical files in their disposal. They all come together, they give it to some, uh, trainer, uh, and the trainer works on it. Now instead of working on the, uh, data itself directly on X and Y, they, uh, they're working on the encryption.
So what you need is an algorithm that is able to do whatever program that we're going to run on the X and Y's in the clear to do it on the encryption so that if you, uh, what we just said a minute ago is, uh, called the encrypted compute stage. If we have in our disposal something that can take X and. Ys. Those are the X over there. And the labels wise with the medical file is x. The label of whether you survive or don't survive might be why?
Um, then um, everything would be done under encryption. We'll talk about cost in a minute, but in an ideal world we have this encryption method. We get the encrypted model. Now we want to use it though comes in model. We unnecessarily use it as a piece of information which is encrypted. So at this stage we have to decrypt it. So how are we going to decrypt it. So the idea here is there's no single party.
It's certainly not open. AI is going to have the decryption key because then they could just equipped crypto or medical record information. But there is uh, a decryption key metaphorically look like three people here that all each have a part of a key. And, uh, they are. There's and then they can apply their share in order to decrypt, uh, this encryption of each such that eventually, after each one goes within steps or together, voila!
H appears in the clear. Okay. Uh, so what are the assumptions here? The assumptions are, first of all, there is an encryption scheme that has the properties I want. And that's based on something called learning with errors, which I'll put up in a slide in a minute. So it's not a factoring problem of integers, but it's a different problem comes from sort of geometry of numbers. Um, and that these guys or these shareholders, uh, don't collude. So there are two assumptions here.
One is that you have given the pieces of the key to P to entities that don't all collude together in order to break the scheme, and the scheme is secure. Now, those people maybe worked on differential privacy. How many people have heard that? Yeah. Many, uh, anyway, so you could say that this is no good because H, for example, could be just the list of everybody's data. If it's a list of everybody data, then H itself reveals people's data.
So maybe you want to have some requirement from H. But that's sort of beyond this is a that's a different problem. So one problem is a design an H two does not leak the information of X and y. The other is how do you actually compute h. You know, without looking at the x's and y's in the data. Um, and the problem with differential privacy, there are lots of old papers about their training under differential privacy. I'm not going to touch that. Okay. So it didn't give you a definition here.
But the definition really would be that whatever algorithm you had that was working in the clear okay. If you look at that algorithm versus the encrypted algorithm, essentially you have you will learn whatever you can learn from the encrypted one is what you could have, um, is. Um, so the other way, whatever you could learn from um. You shouldn't learn more. I can't think right now. I went to sleep at 130. What is the definition you would want?
Let me ask you. So the definition we would want from this scheme is that we are, uh, you going to find out more about the data, what the x and y is more than H provides. So h we say is okay, everybody should get h at the end. Okay. Well open the at least you get h at the end. But we want that whatever it is you recover from H about x and y's is um is holds regardless of whether you had access to this encryption or not. So in other words, H obviously gives you things okay.
But you don't want to give you you don't want the fact that you looked at the encryption of the medical file to give you more than H. And that's the definition that would be satisfied. So what does this learning with errors. This thing that I said, that is this is the assumption based on which we build encryption scheme with this extra property that they can compute on ciphertext. So this is the problem. Um, essentially let's think of S as a vector in uh execution.
And uh, and then we're given and that's the secret. So we don't know what the values of these, uh, SES are, the components of the vector, um, and so on and so forth here. Uh, and is for but we're given a lot of equations in these variables. But of course, if we're just given equations, uh, without the, we could do Gaussian elimination and figure out what's is. So we're given noisy equations. So in other words on the left hand side are the coefficients.
And on the right hand side we add some noise. So this aid is not really the right solution. That's already the solution. Plus some uh noise for some Gaussian distribution. So it turns out that this problem. So give me a bunch of uh, coefficients in uh, in a noisy right hand side is, um, is as hard as decoding random linear code, as hard as, uh, approximating the size of the shortest vector in a worst case N-dimensional integer lattice.
So it's a it's not only problem that seems that we don't know how to solve, but it's a problem that people have shown that average on the average case distribution is as hard as the worst case problem, which we don't know how to solve. And when I say don't know how to solve, I don't mean that I don't know how to solve it. I mean that the best known algorithm and that is actually true even for quantum computers today takes exponential time.
So two to the n and it was the was the length of the vector. So this is a problem that has been considered for many years, especially in the lattice version. Um, and uh, this is the best we know how to do. Now, this doesn't mean that tomorrow some genius is going to come up, maybe from Oxford, and come up with a polynomial time algorithm, in which case maybe it's even better than, uh, having a secure FCE because they've sold some fundamental problem.
But it's what we call the win win. So we say if you could attack the scheme. The encryption scheme. Then we gave you actually a constructive reduction from that attack to solving, uh, this worst case approximate, um, this worst case, uh, shortest vector to lattice problem. Okay. So say you we need a way. Either you have a good scheme or you have a bad scheme, in which case you have a good algorithm. Um, and that is what was achieved by the proof.
Okay. So this is, um, this is all beautiful in theory, but. Oh, often when people talk about, uh, that is four years since it was invented in 2009. And the development people were saying, well, it's very nice in theory, but it's impractical. So first of all, I hate that sentence because it never really holds with time, because whatever is impractical in the past usually becomes practical in the future.
Um, so specifically here, uh, there is a library, an open source for polymorphic encryption library by many different people and, uh, certainly, um, it, uh, it's open. So, uh, it is possible to run this in practice. Now it's all a question of scale. So when I say scale, really, if you think about large language models and you want to, uh, train foundation models and fine tune and so forth, you would like to know how really what is the concrete efficiency of this, of these schemes.
And really the way to think about is this sort of three different axis. There's the adversary, which we sometimes think is very curious. Okay, but he's honest. He doesn't tamper. Then it might be, uh, actually an active malicious adversary as a change things also in his quest to break your encryption scheme. Um, the adversary here is the machine learning training algorithm. Right. Because they get the encryption. It could also be the people providing the data, and that's called poisoning.
It's not part of this particular part of the talk. Um, then there's a question of what is a computation. Could be linear regression. Could be logistic regression. It could be deep nets. Neural nets could be language models. Each one of these is a big jump in terms of complexity with computation. So where is we. We can use homomorphic encryption for linear. No problem. We know we have things for logistic when we get into large language models is obviously not there.
Okay. So what do people do. So there's another axis here which is what are the assumptions. So one assumption I talked about was learning with errors of factoring. So these are the computational assumption. Another assumption might be that you have more than one party privy to the computation. But they don't talk to each other okay. And the third assumption is trusted hardware. And that's really kind of the big player these days. Uh, so when people talk about trusted hardware.
A. What are they? What do they mean? So they may mean many things, but mostly what they mean is this new chip of Nvidia, which is not that new at this point, few years h100 where the idea is that they in some sense say, here's this, here's a secure, uh, box and give it put encryption in there. The box holds the secret key and it, uh, inside of it, it will decrypt, it will do some computation, and then it will encrypt and send it out.
Okay. So in a sense, you're trusting that this box works, uh, without leaking anything about the secret key or the message that came in encrypted. And then there's also another usually attestation, uh, module which says, not only that, but you're going to get approved. What was done inside there under the auspices of this box was the correct computation.
And this is, uh, there used to be, uh, Intel SGX, uh, which was similar, but it wasn't a GPU based, but uh, and it had similar kind of, uh, boxes and similar kind of belief that it just works. Um, so I guess the sponsors are hiding this line here. Just second. Okay, so beware whenever you talk about, uh, side channel attacks that, um, sorry, whenever you talk about secure hardware. Ah, that there are these what's called side channel attacks.
It could be that because of the amount of power consumed or because two people use the same chip, and then something was left from the previous, um, uh, from the previous, uh, encryption that was in the box. Um. There are some, um, bugs. I guess maybe you don't trust the companies blindly. These are things that there's a whole field of people that are out there working on it, on on secure hardware.
But for our sake, we could use it. But we just have to understand that that's part of the assumption. All right. So has this been used. So this is all like a, um it's like a blueprint. So mostly the usage is have been sort of in finance is a lot of crypto use, but the use more of a machine learning type use of privacy. Uh, it's been mostly on, uh, in medical and in actually in genomic, uh, uh, field. So why why do it in, um, when you talk about, uh, genomes?
Because you need a lot of data. So it's a place where complexity comes in and has large amounts of data because otherwise you have no effect. Right? So if you think about things like Crohn's disease or, uh, breast cancer, uh, in schizophrenia, you know, you only when you go to large enough numbers that you can actually find signal in the genome that's related to the phenotype.
And, um, so it's a, it's a perfect example of a place where privacy is required because you're talking about medical records, but, uh, size is, uh, in size is important. Uh, so how do you achieve it? So this was a place so 1 in 2020. We had a paper doing Gwas. Uh, this is also a company that I'm a co-founder.
And so also part of it where we use exactly this type of blueprint like I talked about, except what you're training is, you're running a dual genome wide association study analysis under homomorphic encryption. And then at the end you're getting whatever the answer is. And you'd like to check the accuracy of the answers so that it worked correctly. And also with the time performances. And at the time.
So this is 2020, the extrapolation was 100,000 individuals, 500,000 snips in this amount of time on a single server. This is the kind of application that doesn't have to work in real time. Today, the numbers would be much better. What's another example? Another example. Uh, and just a point to make here is that who is holding the keys in the case of Gwas, it's actually you can think of you have different medical centres.
If there's the Broad Institute and there's Oxford, I don't know, there's there's UCSF and they all each have their database. They could be the ones holding the keys, the parts of the key because they don't they don't want to share with each other, but they're okay. Uh, um, I guess the as a key holder, there is a partial key holder themselves and then they come together and reconstruct H. All right. So that was 20 1020 then 2023.
There's this is an application working on real world uh uncle oncology called data analysis. So this is actually a hospital that has a lot of, uh, cancer patients and wanted to do something which was, um, uh, combining actually, the novelty here was that you the data was, um, you know, rather than having lots of data, it looks the same, but for, for different patients. You had one medical centre that had some, uh, there are electronic medical records over here.
Is that another medical centre that had their, uh, results? Uh, once they were cancer patient. And you wanted to. So we think of it as sort of same patient, different data held in different places. So you need sort of joint operations to sort of identify this, the same patient enjoy order and all this under encryption because you're not allowed to send each other data. Um, and uh, what was used for that is also something called a threshold effect, but doesn't matter.
The main thing is you can support a lot of statistics that people do in medical packages, like, uh, you know, Kaplan-Meier, um, survival curves, uh, and lots of other very common statistical operations. Okay. So the last thing that I'm going to say about privacy is that now in 2024, the thing of secure hardware has come into the picture.
So often what you see is medical studies where you have different, uh, let's say different, uh, in this particular example, it's different hospitals, a, you know, minute of say, well, the application was each hospital trains sort of their own, um, a model locally. And then they combined using something called Federated Learning to come up with a bigger model that takes all the information of the different hospitals into account.
And you use secure hardware because let's say and this so there's both this thing called federated learning. So this is the federation and also fully homomorphic encryption plus what's called trusted execution environment which is hardware. Okay. Um why why use hardware if we were doing so well in the oncological study and the Gwas is because in this particular application, it was to take a whole bunch of, uh, a pathological slide images.
So they're very, um, uh, communication heavy to send this over the line. So you wanted to do things locally and then just send sort of updates of models A and even the updates, you wanted them to be secure. And that's why you use the secure hardware. Okay. There's a lot here. I just wanted to show the only thing to retain here is this. Privacy in some sense is easy to define because we've worked on it.
We have the tools, so a lot of it is about putting the tools together in an efficient manner, you know, for different applications. So it's at the stage where we're privacy matters. It is worth taking a look at this type of technology and making it work. And that's being done. Okay, let's go on. Talk about verification. What time is it? How much time? I've only been talking. Wow. Okay. Um.
Okay. I started a little late. Okay. So, as I said, machine learning is really should be thought of as a service. I'm not doing it in my own office. Somebody else is doing it, and I use it so I can think of it as a there's a client, it gives, uh, and it gets this model. And it could be even the client supply the data. So forget about privacy now. Somehow the machine learning trainer, the service provider, uh, got data and came up with a model and gave us back to the client.
Okay. So now the question is what do I know about this service providers. As you see they have like ears and a tail like the devil. Maybe not the devil, but the devil incarnate. Um, you know, these are sort of the signs, you see when you drive from Berkeley to San Francisco. There's a company called hive, and they keep on changing their signs, and the current sign is, um, uh, or maybe this was a few months ago. We make eye work. Uh, don't just trust us.
Test us. I don't know what they mean, but. But they themselves are confessing that you shouldn't trust. Um, okay. Okay. So, um, so what would we like to do is we would like to sort of post development, verify properties of this model. So what are the properties you want to verify? There could be many. For example you could verify you want to verify the model is accurate. That's the thing that most people concentrate on right.
They have they have their own data and they test the model on and see that they get the right result. Um, you want to say that it's correct? Maybe it's not good enough that it's, well, good on the average, but you want it to be correct on the prediction that it gives you. Uh, you want to talk about robustness. What happens when you're a distribution change. You might want to talk about fairness and the United States, you don't use that word anymore.
But you know, it used to be a big field a few months ago. So whether it's fair to different, uh, subpopulation, I think it is one of the words is stricken. I'm not sure, but, um, it should be. Now, what do you want to talk about? Safety. Um, a or maybe later down the, uh, down the line. When the regulations are set, you want to prove that it satisfies regulations, whatever they are. And obviously you could retrain by yourself. So that doesn't make any sense.
Uh, you have to have some restriction on this, uh, on this module that verifies properties, which is that it will be cheaper than retraining on your own. Cheaper? How cheaper? Maybe using fewer data samples, maybe lower quality data, maybe being much more efficient, maybe just using black box access to the model because they don't necessarily going to give you all the model weights and tell you how the model works. Exactly.
Okay. So this is like a big goal. How do you verify properties of the model that you were given. So I will show you, uh, a couple of results about this. So first of all, as I said, accuracy is the sort of simplest thing that to define with respect to a given distribution. And what people do today is that they have benchmarks, right? They have benchmarks. And if you pass benchmarks, then you then you're golden. Um, so could you have like a more rigorous definition?
Because the problem with benchmarks is everybody knows the benchmarks, uh, a benchmarks in general. It's hard to define what it is that they are achieving for you. Uh, so, um, so this is a paper from a few years ago, uh, which we call interactive proofs for verifying, uh, should say, verifying, uh, the quality or accuracy of machine learning.
And, um, and this is just getting warmed up in some sense, this particular paper, uh, so the definition we gave there is on one hand, there's valiant, uh, PAC learning, I'm sure that everybody here knows. So you give examples. And what does it mean that you're eventually your model, uh, it learns is that, uh, you know, it's accurate with high probability. And, uh, you could talk now about probabilistic and approximate verification, not just learning.
So whereas in PAC, you talked about learning. Now I want to verify. And here's a possible definition. What do I want to verify I want to verify the let's say this is the model designer. And they gave me a model. Let's say it looks like a neural net here, but just h in general. Now I could ask the questions and get back answers could go interactively.
And at the end I want to be with very high probability, uh, assured that, um, this model is within an additive error of the most accurate model possible for a given distribution. So what does that mean? So let's unpack it a little bit more. Um, we say that let's say this to this H is um, is in a class. So it could be neural nets of a certain depth. It could be decision trees. It could be, uh, some other machine learning algorithm.
Uh, but there's a class of, uh, uh, that is that your model belongs to. And you would say that this class is verifiable if, uh, something here. Some. There's something that was underneath this. Never mind if the following one a it's verifiable if for every. Um. So that for every other, uh, hypothesis h prime in the class, the loss of your H is within an additive error of that, uh, or the loss of of H prime.
So it's not worse than the best by except for it by an additive value from the burst from the best hypothesis in that class. Um, and, uh, of course, you want again that in addition to the fact that your loss is most additive, uh, is that double efficiency and that is it. The verify is much more efficient in sample size or quality than than the model designer. And can you get that. So we have various a bunch of results. So here is one. This is sort of very technical uh for general talk.
But this is uh the class of functions here, uh, which are t sparse. So if you look at their Fourier representation, there's only a constant number of non-zero coefficients, which are large, uh, and uh, this captures decision trees, the formulas a C0 circuits. And the theorem shows that let's say that, uh, the designer was modelling with h a a ground truth that obeyed these properties.
Okay. Then you can verify that the model that they gave you is as good as, uh, within additive from, from the best one, uh, using a lot less, uh, data than what the model designer used. So in particular, you use essentially, um, only random samples. So it's well known that in order to learn these kind of functions, you need query access, um, to the samples. So you need to choose X and then get well, you can't just get random XYZ whereas the verifier you can just use random samples.
So what does that say? It means the verifier still has to do work, but he's working with a lot less. Uh uh uh uh, availability of data and availability of access to data and yet into now verify that the model that it was given is indeed as good as basically, except for the error term, uh, any model of the form that he got. And there are a lot of other results. In fact, there was an improvement by Tom Gore of this particular result, uh, called On the Limit of Interactive.
So this is another paper, but Tom Gore's result is actually quite interesting. He improve the efficiency significantly. Okay. Now what about worst case guarantees. So what this talks about is somebody gives me a model and I want to check that the model does well okay. And I want to get like a normal benchmarking. But with respect to this definition for example it does as well as any other model in the class that I provided.
But what about worst case guarantees. So let's say the model that I got, you know, I checked in either using various methods or benchmarking. And 99% of the time it's good on the task. Um, but then a particular input comes in. And you get a particular output. How do you know that? Actually the answer you got, whether it's a prediction or you in the context of a large language model and you ask it how to I don't know,
it's a maybe you gave it's your medical record in there. And he told you that you have five years to live. You'd like to know whether that's accurate. Um, not wait five years. So, so so, um, uh, we don't know anything. Really. We know there's no answer. As you well know, there's lots of examples when even multiplying large numbers, that the answer is incorrect.
So what we'd like is really ultimately to be able to not just know that the model overall has 99% accuracy, but we'd like to know where the particular answer on our particular question is. Correct. So I think if this is where the average case to worst case, even in the worst case, the hardest x, do you give get any kind of certainty that the answer is correct? And um. This is in a paper with two of my students, actually.
So I used to be my student, but now he's a faculty already for many years. Uh, it's addresses, actually, not prediction, but a large language model. So generative models. And we asking, could you actually train a model since these models supposedly can be trained to do everything, so you ask it a question gives you an answer. Math, science, literature, um, psychological advice. Maybe they can also generate proofs in the same way that they generate answers.
So you would like them to give you not only the output y, but also a proof that Y is correct. Of course. What do we mean by correct here? If it's a math problem as well, define, you know, um, when we're talking about natural language and so forth, I think it's a fascinating question. How are we going to define what correct means? Do we have a database of correct facts that we can show that, you know, there's a contradiction to whatever was generated with those facts.
Let's leave that aside. Let's say that we have a notion of correct, let's say within mathematics. So two numbers being multiplied. There is one answer. That's correct. It could be multiple answers that are correct, but we have at least a sound definition of what correctness means. So now that we have that, can we actually teach a language models to come up with proofs as well. So um, that's the in this paper, um, and we first need a model.
So here is the, the, the let's say the language model, you know, whatever ChatGPT whatever. And what we're going to add in is something we call a verifier. So we say let's take design an algorithm that's not part of the machine learning process. We design it, we call them the verifier and it's fully verified okay. Meaning that I know I trust this this particular verifier.
And now um, on a when an input x comes in, he also comes in to the verifier, maybe the even the verifier generated it and he gets the answer y. But now we want to be convinced that y is correct. So we want to prove that Y is correct. What we allow here and this abstraction is the proof could be interactive.
So it could be that an incorrect for some relation, it could be that, uh, the verifier can ask questions, get answers back and forth at the end there where say is I accept y is correct or I reject y is incorrect. Okay. So this is just, uh, a model. Now what are the requirements? What we'd like is first of all that this is sound. What does that mean? That if Y is incorrect, we reject with high probability. So first of all we say, you know I want to reject incorrect height wise.
Now of course I could just reject all the time so soundness would be satisfied. So we want also some utility out of this. And there we um essentially what we want. So the standard thing for machine learning is the probability over x is that we'll get a correct Y is high. What I'd like in addition is to say is that the model can also be, uh, produce these, uh, answers to my question so that at the end I will be convinced.
So we call it self probability by p. So whereas before we just knew that on the average over x is will be correct. Now I'm saying incorrect Y's will always be rejected. And also with high probability correct Y's I will be convinced to accept. So right now, we're sort of happy that 99% of the time we get the right answer. We're saying we want also that 99% of the time, or 98% of the time, not only we get the answer, but we also get a proof.
And, uh, that's the model. And, uh, there's a theorem which is sort of a, uh, um, um, a under some. Convexity Lipschitz assumptions that says how to train a. When can you train a model to give you a proof? And the idea here is actually to use, uh, stochastic gradient ascent actually, but some variant of stochastic gradient, uh, uh, descent. And uh, also the new thing about this theorem, which we don't really have time because I do want to get to my other stuff to get into it, is that.
What you are given. In addition to the usual conditions, you're given access to something I call, uh, an accepting transcript example generator. So the only thing that's new on this, uh, um. On this line is this concept. What does that mean? It means you have examples of proofs of like theorems, if you like. So if we say that multiplication okay, there's a let's say there's a proof that the result is correct. What would be the result is correct. So one is that you do it yourself.
We don't want that. We want it to be shorter. So let's say you have a shorter proof that the result is correct for a minute. What I want is to have lots of examples of x and y which have been multiplied. And these type of proofs, if I, I have these are the examples in in a sense I show that the, the, the large language model. So it learns how to generate proofs of this sort for the new x, x, y that you want to multiply.
Because again, those are sequences. If you think about what these proofs look like, they're sequences. I'm giving a lot of examples of sequences which are a number of x and y, and then a proof that an alleged product and a proof that that product is correct. So can we teach, uh, how to, um, how to generate these proofs? So I think that for mathematicians that probably are a subset here, at least, the idea that proofs are just a sequence is kind of insane. But, uh, the fact is that, um.
We have, uh, a theorem that tells you that if you had access. And then in a minute we'll have experiments to a lot of examples of proofs, okay, of a certain type. Then you will be able to generate these proofs yourselves in such a way that the verifier will be able to reject those which are incorrect. So in other words, that's a proof if you can reject incorrect proofs with extremely high probability.
That's a notion of a proof that at least we accept when the small probability is under your control of of of of mistake. And there's of course a complexity to training the model. And these, uh, the number of iterations that have to be of, uh, the gradient descent that have to go through. And interestingly, I think this is the most interesting part, is that the size of the this interaction, the question and answer. So this is the overall, uh, actually answer size.
Length is a very important, uh, part of how long it takes you to train a model to be able to come up with proofs. So the longer those proofs are, the more time it takes. Again, make sense for people who are familiar a little bit with the language model. Uh, uh, space. Okay. So we did do it. Um, lab experiments for the simplest algorithm. You know, when we teach introduction to algorithm, we we always say that Euclid's algorithm is the first algorithm ever invented.
Uh, do you say that here? We say in Greece. I'm sure they say, but in any case. So, so so, uh, um, and the algorithm, you know, this GCD, you know, you have two numbers. You want to find the greatest common divisor. And what's nice about it is there actually are these coefficients. Right. If you can find these two numbers such that you could take the combination of uh, so let's say x1 and x2 where what you're trying to find that you see this row and z1 z2 are these coefficients.
Then this linear combination, um, if it divides both x1 and x2, that's actually a proof that this linear combination is the greatest common divisor. Um, so this is an ancient we all know the existence of these coefficients. And indeed Euclid's algorithm, when they compute the greatest common divisor of x1 and x2, they also compute those coefficients. But when the large language model I gave it x1 and x2 and it gives me, it says, hey, here's the greatest common divisor.
Maybe it's true, maybe it's not. I don't know if it use Euclid's algorithm. I don't know what it does, okay. But it certainly doesn't output these coefficients.
So you could think of, uh, a particular case is that I'd like to find I'd like to train the model not only to give me that solution to greatest common divisor, but also give me these coefficients, which are essentially a proof, their proof, because if I then can if I get the coefficients, I could just do this one multiplication plus another multiplication and check that that is equal to the answer they gave me. And if it is, that is actually 100% proof that the answer was correct.
So, um, that's a exactly what we do. And uh, we kind of give it a lot of examples of. X1 y1 z1, z2. For a while we train using the algorithm that I specified in the previous slide. And at the end indeed, you get a model which outputs these coefficients in addition to the answer. And. If GPT is doing 99.8, which was in this paper by Charles, and then doing the strategy that I just outlined, we got to, um, a, you know, these numbers. So the verifiability is only 60%. That's a much less.
But it's better than nothing. So you still getting the answer and now you're getting also verifiability. And it turns out that with other methods we can actually get up to 96%. But I won't get into it. What is the point of this? I'm done with this part of the talk. The point here is that verifiability of accuracy on the average and accuracy in the worst case, are some things that can be formally defined.
Whether you like my definition or what I propose another one, but they should be defined and then achieved. And uh, this whole I just, I think my student or was here and gave a talk. So some people may have gone. So there's a vast literature right now about using some kind of an interactive proof, uh, in the context of machine learning. Not in this, not our definition, but others. This could be within mass assisted proving.
Uh, there's actually going to be a workshop at Simons, which I think will be very interesting is is going to be the Simons Institute and the emissary on the Hill at Berkeley, which is all about math assisted proofs. So it's really run by mathematicians mostly and mostly focussed on lean. Uh, but there are other things in safety alignment where they sort of the concept of a verifier, sometimes a verifier themselves are a machine learning model, in which case it's like turtles all the way.
You know, this is what that means. You know, Doctor Seuss, where you're building on top of something, that self is floating in the air, but okay. Um, all right. I want to get to my last part I have. I'll take five minutes. Okay. So we talked about privacy. We talked about verification. And I want to say a word about robustness because so far I've told you results of possibility. I said look it's possible to protect privacy.
It's possible to maybe check verifiability. Uh, it's a worst case guarantees. And I want to show you one place where I could show an impossibility. Uh, using cryptography also. Um. Okay. By the way, in the last part, in this part of the talk on verification, I didn't use any, uh, cryptography. I didn't use hard problems like encryption, digital signatures, whatever. What I use was this paradigm of interactive proofs, which also comes from the field of cryptography and verification.
Um, so that's the inspiration there. So we're back to this, uh, ML as a service. But in the previous slide, the machine learning guy, they wanted to design a good model and to prove to you that the model was good. Okay, so in some sense you are training the model to do the come up with proofs. So they were collaborating with you even though you were verifying their answers at the end. But let's think of a worse scenario. Let's think that there is, uh, you know, this dark cloud.
So the service provider really is evil, or at least has evil people working in there, which I think is to be assumed. It's ridiculous not to assume that, you know, at least coming from a country like I do is like, yeah, of course. Uh, okay. Um, so, um, I think it's true about the world at large, but in any case. So what's the, um, kind of American? Uh, um, uh, what, uh, example. So we have a bank, and the bank doesn't want to give me a loan unless they think that I'll pay it.
So, uh, they look at the loan application. I don't know if you've ever tried to get a mortgage in the United States, but in some sense, the more money you have in, the more assets you have, the less likely that they actually give you a mortgage, because there's so many things to check, and it's all about checking, uh, in any case.
So they say, okay, now there is this company, they're going to give them all the old loans, whether they gave them the loan, whether the loan was returned, uh, and they get back a model. Okay. And then we say, um, well, uh, wonderful. This model, from now on, loans come in and it says, give it or don't give it. Okay. Beautiful. Um, so what does the. What do people in the company say? They say, that's very nice, but I want to get a loan sometime down the line.
So what I'm going to do is I'm going to put in some sort of back door into the model, and I keep the key. So here the model changed a little bit by that little yellow square. And I keep the key. And um. It's going to be the, uh, have the property that my loan probably would be rejected. Okay. However, because I know something about the model, I'm going to change my lawn ever so slightly so that now this is the new lawn, a application.
It's changed slightly because I know something about the model. And from and now my my loan gets accepted so I can change inputs. In a way that the machine to fool the machine learning algorithm that I myself designed actually in this case, so that even though a natural, uh, input would have been rejected, I will make it accept it. And you can think of this much more generally if you think about a large language model is not supposed to tell us how to produce atomic bombs.
You could make a change, say, well, maybe there's a screening alignment, but I will make my queries such that they will break the alignment, uh, the fence or the safety, uh, guardrails and so forth. And what we show is in this paper, uh, is that, uh, in fact, not only this is a, uh, a fantasy of, uh, of science fiction, but if digital signatures exist, cryptographic digital signatures, then you can plant backdoors in any neural net.
And furthermore, this backdoor will be undetectable and non replicable. So I'll explain what I mean. Non replicable I mean that even if somebody saw examples of modified loan uh in modified loans in the past, they wouldn't be able to come up with another one and undetectable. I'll define in a second.
But what I want to say undetectable is that if you look at the model and you are, uh, so if you first you have input output access to the model, you could put lots of inputs and outputs and try to see whether there is a backdoor in there and show that that will be impossible again under this assumption of digital signatures. Now, the second phase would be not just feeding inputs and outputs like black box, but actually white box.
Looking at the model, that's much harder. So for right now, this theorem stands just by assuming that all you have in your disposal to try to figure out whether this model has a backdoor in it or not, is the ability to run it on inputs of your of your choice. So what does it really look like? It's something like this. First of all, we know that if, uh, digital signatures, um, if this problem from lattices is a is indeed a hard problem,
the LWR that I showed you there is a digital signature scheme. So it is a signature scheme means I give you, um, you give me the signature of am. Nobody else can do it unless they knew the secret key. They can't even produce new and prime signature of in prime. And this is a neural net that implements it. It's very, very small. It's four layers. And then you could, um, take any your neural net. Okay. Whatever it is, that big box and add a little box to it.
Okay. Which is it's actually not the digital signature. Uh, it signs is the verification that it's a correct digital signature. So when the input comes in, the person who knows the key will take the input and modify it. So now an input and a signature. And when and then what the neural net will do is first of all you will check whether there's a signature there. Okay. Uh, so it verify there is a signature there of the rest of the, of the input.
And if there is then it accepts the loan okay. Otherwise it rejects. So it's a very simple change. Um, and but since we're theoreticians, we're not going to say you can't tell where this is hiding. We say if there's a black box access, you cannot tell. So this is sort of the notion of, uh, backdoor, um, detectability. Uh, is that, um, okay. Well, whoever wants the Q&A, of course, you could talk about white box. What if the what if what if you could also look at the weights of the model?
Is it true now that it's undetectable? So we could do this even that for very restricted models like one layer, this, uh, a learning curve around the Fourier features. Um, what's interesting here is that whereas before the idea was to add a digital signature, the idea here is to use random numbers rather than random numbers on the weights.
And the pseudo random numbers are from a distribution which is hard to distinguish from random based on crypto again, and they will if you know what they are, you know how to kind of tweak your inputs so that they will be, uh, accepted. Okay. So. Again, you could say always. Oh, but this is only a theoretical, crazy thing that somebody would do. It turns out that people believe theories when it talks. When we talk about attacks much more than when we talk about solutions.
Uh, and uh, indeed, now there's a big program, I think, in DARPA or something where they want people to attack. So they say it's, uh, give models with backdoors and, um, so that we know what to fight. So, uh, they want to see lots of examples of backdoors to be able to try to identify them. So this is almost the last slide. What's the takeaway from this is that we shouldn't be terrified.
No. Uh, I think the takeaway is that we should not trust the models and that we should assume that there are, that they have problems and we should post-process them. So rather than just using them sort of as they are, you, uh, you know, it's sort of like, you know, we don't know that I necessarily have germs, but I still wash my hands.
Uh, you you wash your hands. Uh, and so sort of the messages that we would like to try to do mitigation of problems without necessarily detection because sometimes it's undetectable. But you can prove that regardless of what problem was there, you could mitigated by some more work. And this is a recent paper is going to appear in the next few months, uh, in stock. And there's we take we show that for certain ground truth type of uh, you could actually mitigate without detect.
And again, the one of them is for us for a function ground truth. Like I mentioned before, another one is for like linear regression and polynomial regression. So say somebody gives you, uh, in a regression, a model A that has a backdoor in it. So it will give you the wrong results. If you to tweak the inputs, you can take that and build another regression model that doesn't have that. You provably have removed the backdoor.
Um, so to end, I want to say that this whole thing about verifying or mitigating, um, I think I, Roslyn, proposed the name Verifiable Data science. I think it's important. I think there's a lot of work on it by, uh, other people as well as me. And I guess the main thing is that, uh, safety is a big challenge. Um, uh, it's a quote I can remember for more and more, uh.
You think it was in this website that they're saying, uh, technological progress can excite us, politics can infuriate us, Wars can mobilise us. But faced with the risk of human extinction, that the rise of artificial intelligence causing, we have remained surprisingly, surprisingly passive. I don't know, it sounds good, but I think it is true that this is a huge, huge, huge tool and we have to be very mindful of the fact that, uh, that, uh, safety is important and we should work on it.
Thank you.
