Welcome to the Proteomics in Proximity podcast, where your co-hosts, Cindy Lawley and Sarantis Chlamydas from Olink Proteomics, talk about the intersection of proteomics with genomics for drug target discovery, the application of proteomics to reveal disease biomarkers, and current trends in using proteomics to unlock biological mechanisms. Here are your hosts, Cindy and Sarantis. Hey there. Welcome to Proteomics in Proximity, where Sarantis and I will be talking to Cornelia today.
I'll have Sarantis introduce her in a moment. But first I wanted to announce a very exciting advance at Olink, where we have now merged with Thermo Fisher Scientific. So we're part of the Proteomics Services division within Thermo Fisher.
And we're definitely going to be talking about the ability to sort of sequence the proteome as well as genotype the proteome in future episodes, because these technologies are incredibly complementary under the umbrella of this exciting Thermo Fisher Scientific parent company. And with that, I'm going to allow Sarantis to introduce our guest for today. We're super excited to have Cornelia here. Sarantis, please. Thank you very much, Cindy, for the introduction.
Thank you very much, Cornelia, for joining us. It's the last episode before the summer holidays. We are really excited to have with us Professor Cornelia van Duijn. She's a professor of epidemiology in the Population Health Department of Oxford University. And today we're going to talk about your exciting work, mainly dedicated to aging and age-related diseases. Cornelia, would you like to start by telling us a little bit about your background and your scientific interests and expertise?
Thank you very much for joining our group. Sure, it's my pleasure to be here. It's a great pleasure. But, yeah, my background: I work in epidemiology. I started as an epidemiologist, now some 30 years ago, working on dementia, which was at that time still a forgotten epidemic. I think everybody is swamped with dementia in their families now, I guess, particularly in parents and grandparents. But in those days, people had hardly heard of the disease.
Definitely not of Alzheimer's disease, and they had difficulty grasping what Parkinson's disease is. I started out doing the epidemiology, but then figured out pretty soon that the only risk factor we could find in those days was family history. So I switched to genetics. And for a long time during my PhD I was just waiting for the good old markers, the genetic markers, to do the linkage analysis in the families and find the genes. So this was waiting for months for six RFLPs to arrive.
And then two had failed and I went into another cycle of waiting and waiting. So at the end of the day we found genes, and at the end of the day I was more than happy that the technology emerged to do larger-scale studies. And then I went into genome-wide genetic association studies. And that was millions and millions of genetic variants to study in millions of people, and now we have finally arrived, via metabolomics, at the age of proteomics.
So that's the background and back to epidemiology, not anymore in Rotterdam but now in Oxford. So in epidemiology, I think about this as a challenging field because often you're dealing with population level data sets of community data that are imperfect, that are messy, that aren't as clean as I imagine some of the genetic data sets enable.
Is that a factor in how you've evolved your career in bringing in these omics, that now you have something to associate with that maybe is more... I don't know, it's just really hard to collect environmental data, right? And epidemiology is plagued with this.
Well, I totally agree with you, because if you look at epidemiology, and I'm not only a data analyst, I have also been able to set up different epidemiological studies, and one of them was the Rotterdam Study, with elderly people really followed over time. And it's hard. It's a lot of effort. And sometimes I wonder whether young people who are dealing with all these data now think, why haven't they done it better? But it's a huge effort.
Not only the Rotterdam Study; we set up large family-based studies, like the Erasmus Rucphen Family study, and last but not least Generation R. I wasn't leading there, but I was working there on setting it up, a study of little children followed from in utero. And it is hard. It's hard to really get a grasp on how you capture what people are exposed to. And then, of course, if you think about people, the exposures that you have over time are changing; they're ever changing.
Your smoking habits, your alcohol habits, what your weight is and what you're eating change incredibly over time. And I think it's the availability, the cost, but also definitely what you know is healthy and unhealthy. So there is growing insight, and what we're learning now is that it's important to do these studies. And they have been incredibly helpful in making the genetic studies happen.
They have enabled it; it would not be at the stage where it is now without them. But definitely also a lot of the future of proteomics will be in these studies, so we're depending on them. That's a great... No, go ahead, Sarantis. I just wanted to follow up on this question that you have posed, you know, for genetics and proteomics. Nowadays, for example, for these complicated diseases like Alzheimer's, do you think one omic is enough, and how do you see multi-omics in this field?
How do you see the challenge that people are facing with data integration? What is your feeling on this? Well, I think we learned a lot from genetics, and I think you can't deny that. People have had trouble with it, but once we started doing it at scale with genome-wide association studies, we were just going to the moon. And then beyond; we were almost going to Mars, just finding new pathways in the disease process. And then of course, people said,
“well, we knew this, we always get this.” But we finally have it established. And that is what you do with genomics. I mean, you can hypothesize about the complement system as one of the immune systems, one of the defenses against that invasive swarm of bacteria and viruses; you can have the hypothesis. And it was there already before; the theory was that it's implicated in the pathogenesis, the development of dementia. But, you know, genetics nailed it. It benchmarked it.
It says, well, if you have genes that are not sufficient there, you're not going to make it. In the genetics of dementia, we went through the whole series: we thought it's a neuronal disease, because your neurons don't function anymore and therefore you're demented. You forget things, you can't even comprehend things, how to put your shoes on and where you should put them. You put them on your head.
All the things for your brain to work; of course it's the neurons that die and give you the disease that will make you forget things and not understand things. But at the end of the day, what we learned from GWAS is that the microglia, the helper cells of your neurons, were much more important. So definitely we learned a lot from it. But there is what we did not learn, and as a scientist, for young scientists, that's even more important, right?
It wasn't the endpoint, because what we learned from genetics, for instance, is that the apolipoprotein E4 variant more or less splits the population in half, who gets the disease, and determines who gets the disease early or late. But, you know, it doesn't tell you whether you get it at 60, 70 or 80. That is so important for people, and for that you need these proteins or the metabolites that will tell you. And what we see now is that p-tau is telling you that.
But we also see that other proteins, like GFAP and NfL, which you can measure easily, are also doing that. And that is incredibly important, and that is what we need to know. And that is what we need to take further. So I'll ask a question now along that genetics line. Along with the Rotterdam Study, Generation R, certainly the CHARGE initiatives, and all the cohorts that are involved in that, you have been involved in a lot of really pivotal work in that population health area.
One of the other biobanks I've seen you involved with is the China Kadoorie Biobank. That's incredibly important for our understanding of East Asian populations and how they're very different from what we see in the UK Biobank, as just another example. And I just saw in Oxford a presentation given by, I believe it was Alfred, who talked about GWAS, leveraging proteomics in the context of genomics with the clinical data that are available for these cohorts.
Can you talk a little bit about the outliers and liars that we talked about there? And just explain how proteins are showing signals about lifestyle factors, which I think is pretty compelling. Yeah, sure. I think what has been a breakthrough in that, not with my head as a geneticist, but with the other head as an epidemiologist, because after all, I'm a genetic epidemiologist by training,
is that what the proteomics is giving us is really the mirror of what happens if you have an exposure. In the case of smoking, I think nobody doubts anymore that it is shortening your lifespan and giving you an increased risk of cancer, but also lung diseases, cardiovascular diseases, and definitely in the end it's also related to many neurological and neurodegenerative diseases like dementia. But measuring these exposures is a nightmare.
And it's difficult for smoking. And there are people specialized in how to assess how much you smoke. But it's quite a difficult task. So you have to ask: when did you start smoking? When did you stop smoking? How much did you smoke over time? Because everybody thinks, oh, I smoke half a packet or a packet a day. I have only smoked 24 cigarettes today; I do have to take another one, don't I? So it's an approximation. Nobody lives like that.
People stop smoking when they're pregnant or the first child is born. You think, I have to be more healthy now. It's quite an effort. And don't get us started, as epidemiologists, on something more complex like alcohol use. Because with alcohol use, we have the months in which we're all asked to be sober, sober October or dry January. And that becomes even more difficult. Definitely there is the pregnancy issue. Definitely, once you get older, you can't deal with it anymore as well as before.
So what do we do now? Well, we really ventured out and targeted smoking, because it is the major determinant of your life expectancy and of all the diseases that you'll encounter in old age. So the question was, what is really the proteomic profile associated with smoking? And Sihao [---] really ventured out on this; he had an interest in cancer. And of course lung cancer is very well known as the major outcome.
And what we did was a very simple experiment, seeing whether we could discriminate those who were never smokers, or told us that they were never smokers, from those who were currently smoking and had been honest about that. We saw that we could separate the two, I mean, really quite well using the proteomics, and then you're really talking about a discrimination of 0.95. You hardly see that in any epidemiological setting. Well, that was fantastic. But we still saw overlap between the two groups.
And I know that is the major question. So if you are a never smoker, you declare yourself a never smoker, and you still have a proteomic profile that looks like you are quite a heavy smoker, it raises questions, and that is the fantastic thing. So we thought, it may be that these people have not been fully honest, or they forgot that they ever smoked, or they didn't want to be reminded of the fact that they ever smoked. And that is certainly the case.
And we noticed, for instance with alcohol, that people say, I'm not drinking alcohol. And they turned out to be ex-users who had to stop because of some problem related to alcohol, for instance with the liver. But there are also alternative explanations. And that was the important thing: we found out really soon that if you look at this profile, at least half of it in the general population is determined by smoking.
It is smoking that determines how high your score is in what we call P-SIN, how much you have sinned in terms of your smoking habits. But if you really look at other factors that may determine this score, how can it be? If you talk to a genetic epidemiologist, we look at the genes, and there's some contribution of the genes, but not big. If you look at the exposures, well, we see all the exposures.
So one of the most fantastic things is that we found that your maternal smoking, whether or not you're a smoker, was popping up. Whether you were passively smoking popped up. How much air pollution was around you popped up. But there are also all these factors where we thought, hey, obesity also pops up. And if you know a little bit about smoking, one of the strange things is that if you smoke, you'll usually have a lower weight than nonsmokers.
If you stop smoking, a lot of people say, I'll get obese and I don't want that, I won't fit in my dress anymore, and I won't look as beautiful as I did before. So that is affected. That did not surprise us. And if you think about how to explain this, we also started seeing that there are probably common pathways which lead to aging and age-related diseases, with overlap, for instance, for obesity and smoking. That is really what you expect also, because we don't think that smoking has a unique pathway.
It may be in your lungs, I mean, in direct exposure. The oesophagus also, right? We all know that is a problem. But really, if you start thinking about how it causes aging, of course, we all know that if you ask your pathologist, well, you will not ask your own pathologist, but that of another person. And if you look at the skin, really, if you look at it under the microscope, you really see something awkward in the smokers. The skin ages, and we all see that in your throat, your voice.
Usually, with people who are 80 years old and have smoked all their life, you hear, oh, this is a coarse voice. So we do see differences. But the processes that are ongoing in your body overlap. So of course we think some people don't tell us whether they smoke and how much they smoke. But we also think that there are other reasons. And some of the reasons, you know, we can't put our finger on.
But the other common ones, like obesity, it's the major problem worldwide, so we see it. I'll also correct myself. It wasn't Alfred. Alfred talked about GWAS in the China Kadoorie Biobank, but it was Sihao who actually presented this. Sihao is a PhD student who has been working with these data, looking in the UKB data as well as corroborating in the China Kadoorie Biobank data. Super, super interesting.
So that piece, and this idea of having a smoking signature and an ability to determine, and maybe it's, you know, secondhand smoking, heavy secondhand smoking or something like that. But I think being able to parse this out and corroborate the genetics and the proteomics in any way with the epidemiological data, and vice versa, is super exciting.
And then, of course, we've talked on this podcast before about using genetics to corroborate proteomics, and proteomics to corroborate what we're seeing in the genetics that have maybe supported drug programs, for example. So, and this is Sarantis' absolute area of expertise, can we transition to aging? That's a great point, actually. You know, I'm intrigued by the fact that, as we say, when mothers are pregnant and smoking, you see effects on the babies.
There are a lot of studies like that. That means that, apart from genetics, there are a lot of other factors, probably epigenetics, that may influence all of this. And we know that for measuring aging, epigenetic clocks are really the gold standard so far. But proteomics is getting a lot of attention and can really nail down the details of aging and aging-related disease, right? And you have seen this with your own data and with the amazing work we did together with Austin.
And it will soon be published. Would you like to say a few words about biological age and how proteomic clocks enable the study of biological age? That'd be great. Yeah. I think one of the golden grails we're all looking for is how to live long and how not to become older looking than you are, right? And it's a golden grail.
And in this longevity research, what has always baffled me, and I've really been working on aging and dementia for 30 years or more now, is that there was a lot of progress in the field of animal-based experimental studies. And they had wonderful findings, whether it was telomeres, or on the basis of protein homeostasis or metabolites. IGF-1 was a notorious one. And all these things seemed to fit, right?
If you look at the animal kingdom, except for the birds, the smaller animals live longer than the larger animals, and there are the wonderful studies in dogs: a big dog has a life expectancy of 6 to 8 years, if it's a Great Dane or a big pointer; a small dog has a very long life expectancy of 15 years, 20 years. But it never translated to humans, and that has bothered me forever. So even something like telomeres, again, the Nobel Prize, right?
So there was a Nobel Prize for it, and in the animal it works. But in humans, you do see associations, you do see suggestions, but you don't see a lot if you translate it to diseases. That has been the breakthrough: if we look at the proteomic clock now, and if you look at how it predicts, projects to diseases, it's phenomenal. And in that sense, if you compare it with the methylation clock.
Well, the first thing I said was, well, whatever we're going to do, compare first what the overlap is with the methylation clock. And my understanding really was that whatever you find in methylation also very much goes to this. I was already up to date that, you know, in the cancer field and methylation there was huge progress; it's seen as a very helpful and promising field.
But I was actually surprised how little evidence there is for direct links between the proteomics group and diseases, and definitely, as with so many diseases as we see now with the proteomics. So we were talking a lot with the methylation folks, and we were just arguing like, okay, we worked a bit on it, and definitely [---] worked on it in relation to psychiatric diseases. And we were a little bit amazed that the overlap between the proteomics and the methylation clocks isn't big.
But what you also saw with the methylation clocks is that you usually have to tweak them: the methylation clocks only associate with disease if you are focusing on the methylation that is related to protein-coding genes that are known to be involved in diseases. It's not so strange, because if you really start thinking about what methylation does, it is agnostic. It's just going all over the genome. The CpGs.
And of what we know of the genome, only a small fraction is involved in coding proteins. Now of course we all think that translation and RNA regulation are important in the development of disease. But at the end of the day, it's still the protein that does a lot of the job. Exactly. In Alzheimer's and dementia and vascular dementia, the proteins are the most important there. But I think what we are seeing is that the proteins are also mentioned in cardiovascular disease.
And it's not unexpected, is it? It's more, I expect, that the metabolome, for instance, did much less than the proteome. And that brings us back to the point that this is probably the field to be in. It feels like it's the druggable aspect of the omics as well. So the fact that we do have antibody therapies that are able to target pathways, I think, means that the translation feels like it will be more straightforward. But I think we're only scratching the surface.
I think, well, what I tell all young people in my group, and also others that I come across now, is that you really have to invest in this. And I confess to you and to the world, I was always a metabolomics fan, and I thought that was going to make it happen.
And that is the place to be because it's the active compounds, it's the activated part and if you compare that now to the development in proteomics, I do agree with you, Cindy, it's more the druggable part in it, but it's also the part that explains for us, the thing is, and that makes you wonder a little bit what's happening. It's the phenotype, right? The proteins are really depicting the real phenotype. Yeah, definitely.
If you go to CpGs, they are more upstream, more going to the mechanistic. There will be other factors that may influence. But in the end, the end point is the protein. The real phenotype is what happens at the protein level, right? And that's the real picture. What worries me also a little bit is if you are looking at expression data in the brain now, there's often not a correlation between the two. And they often go in opposite directions.
So that makes us worry a little bit about what's going on there. I mean, we should ask ourselves what will be the hype of the day in five years. But the idea now is that it's the proteomics that matters more than anything else. Exactly. It's nice to hear that it's adding value to the data sets we've got already. I think there's the in-depth pathway analysis, trying to dig into why RNA would go one direction and proteins would go the other direction.
If we can at least come up with some hypotheses for any given system why that would be, for example, maybe the products are being cleared out to move to a different place where they're being used. Maybe they're in vesicles or something like that. Being able to sort of dig in to provide hypotheses for testing the mechanism is exciting.
And it means that if people are listening to this podcast thinking they wanna go do their PhD, there are so many questions to answer and they should consider going to Oxford, I will say. Definitely, definitely. So I echo that. I think that, I noticed that, and it really is the same as genetics. I mean, when we were doing the genome-wide association studies, we had found three genes for diabetes.
And then people said, oh, we got to find out what these genes do, and this is probably it. There's no other genes to be found. Well, afterwards we found hundreds more. I mean, that is what we are at the stage with the proteomics. I mean, this is the start. It looks fantastic. It looks great.
But we are at the start. This will be an effort of 10, 15 years, like it was with genome-wide association studies. We've been working on it, and we still haven't finalized it, but now that we have genetic risk factors that we all add together, the picture is becoming more and more clear. And it is work in progress. I mean, we know that from the genetics. We were staring at the genome-wide association studies.
We said, oh, we don't see amyloid at all in the genome. Five years later we went into GWAS and the first pathway was amyloid. The second pathway, the third one was pathway. And we asked, what is happening here? I asked my friend's colleague and he said, well, we looked at it too, but what happened is that people did more research in the biochemistry and started linking those genes to amyloid completely. Now we can go the reverse way.
We can look at the proteins associated with the disease. And of course we are now checking whether they also associate with the genes of the disease and the exposures related to the disease. So it's one of the most exciting tangles, if you are interested in the disease and understanding disease, but also predicting disease. It's the breaking point, but don't see it as the end point yet. We are still on the way. It's a journey. We're moving up.
I think, you know, the genetics payoff in the pharma space has been pretty clear. I think it's Matthew Wilson, I shouldn't say the name, but I think his publication outlined that when you have genetic evidence going into a program, you're more than twice as likely to have a successful exit of that target.
So I think we're still in the early days with proteomics, but I'm very optimistic that having proteomics evidence will further help us with demonstrating that a program is likely to be successful. So then we'll have to be able to juggle all these hugely successful programs and get them out into the market with a health care system that may be unprepared to pay for them. But we'll see. But those are different problems for different health care systems. But yeah.
So both of you, I'd love to understand where you see an ability to have a subset of proteins that really help us understand biological age, and how biological age may not be reflective of chronological age. How might that actually be useful in the future as a clinical tool, or as a direct-to-consumer tool? It's a great point. If the Ancestry.coms or 23andMes of the world build something like this, how might people use it? What are your thoughts there?
And also to add something here before Cornelia, you're of course the best person to answer this, but also to add the fact that now we're not talking about single proteins or single genes; we're talking about pathways, we're talking about signatures in the end. And we see a lot of inflammation coming with aging. And I think we probably have to deep dive a little bit more into the inflammation mechanisms to understand aging.
But yeah, I'm happy to hear your thoughts on how you see it going to the clinic, or how you see it going to prognosis, for example, from your perspective. Well, I think, again, we learn from the genetics. I think the 23andMe people are interested in their genes, either in the risk of disease, but also in their heritage. I think if you look in the UK, we have the ZOE program, where people are very much interested in their microbiome. Again, it's a field in action.
I can't believe that people are getting the final tools there, but they get an impression of how well their gut microbiome is functioning, based on the state of the art and the truth on that. So I definitely think that in the direct-to-consumer field, this is exciting. This will be interesting. I can imagine that if you link your microbiome to your aging profile, it's going to be even more interesting. And that is where I see the field also going.
What we are trying to do is start out with the smoking data. What we have to try out now is to what extent you can revert your aging profile. And to me, based on what my gut feeling is there specifically, you can probably halt the processes as long as you intervene early. At old age, it's not clear, but I think we have to find that out now. We don't know. Does it pay off, if you are 90 plus, to start doing physical activity?
Well, if you ask me, there are also dangers associated with it. I mean, we all know that if you have a broken hip after the age of 85, it's one of the strongest predictors of dying. But I think that is what we are facing. I think, well, the beauty of our analysis is that it will give you a readout of interventions that we always missed. I mean, one of the interventions that has been well pursued is, of course, caloric restriction.
Now, we all know that that is quite a harsh job, because you really have to eat less than you're supposed to eat, like a third or something. It is quite harsh. And it really goes to this idea that small animals live longer than large animals. Really, small men and women live longer than tall men and women. And there is a point to that, and that is really targeted at this system. It's IGF-1 signaling. And in all animals that is a problem for living long.
So I think that is one of the outcomes. But I think it gives us hands and feet now to have a readout. Think about the monkey studies in caloric restriction. There are only three or four done. You have to wait for ages before these monkeys age. And now we have a readout that is a little bit closer. The readout seems to work already by age 40, and probably also age 20, 30. So hey, that must accelerate research also.
And it must give us an insight into whether interventions work. Stopping smoking: don't wait for it, just do it. Too much alcohol: stop that too. But physical activity, if you talk to people in the aging field, some people are saying, well, maybe good, but wait a minute, if you're doing physical activity, you're also generating a lot of oxidative stress; is that not also a cause of aging? So I think we can read it out now. It doesn't look that way in our hands.
So it means that, you know, some physical activity is good, not only for the vascular system but also for the brain. And I think there is a multitude of that kind of opportunities to use it now as an outcome. We have to prove it, but it looks like it is working. Well, you heard it here. Smoking, stop smoking, drink less alcohol, eat less food, and do exercise, but not to the extreme, right?
Well, going back to the point of Sarantis, I think inflammation, we're all interested in it. But we now also get other proteins. That's also interesting, and that is the other thing that is pushing us. And I definitely think that this is the start for a lot of diseases and aging, age-related diseases, but also exposures, you know, the plastic exposure. Nobody knows what it does.
I would like us to have a readout for that, one that will inform us a little bit about what goes on in the body and how worried we should be. Yeah, PFOS, PFAS, these sort of forever molecules. Would you like to comment a little bit on the drug interventions? I mean old drugs. Old dog, new tricks, like rapamycin, for example. Hg2 inhibitors, now we hear that they are players or... What is your feeling about that? Is targeting everything actually targeting aging? Or vice versa?
Why do you mention this, Sarantis? Because we want to study that. So we have this week a break for what is our low-hanging fruit. Because I knew, if you join this field, it's not for the faint-hearted. There's big competition, stiff competition, and we've always been reasonable about it, in that we say, okay, we already see a publication. What is our lease? What is the low-hanging fruit?
And we definitely have everything lined up there with Sihao and Austin to do this aging clock. But one of the things that we are moving to as a field of interest is also the clinical application. We have already done a study showing that liver and alcohol are a big problem. A big problem is also that people don't know how much alcohol they use, and they don't want to know how much alcohol they use. And the produce.
So can we distinguish liver diseases? Can we not use this profile for that, and then predict how long we will do this? And of course, if you use lots of alcohol you get the usual diseases, but you also get cirrhosis and you get liver cancer. So here you go, that is what we take as a benchmark. The other benchmark we definitely want to use is certain drugs: how do they act, what do we know of that, but also what are the unexpected actions.
So these will be negative side effects. But we all know that for some drugs, think about statins, you know, and we are working here in the group that did the most statin research, except that some people get some muscle pain, and some very severe, there is quite an argument to almost put it in the drinking water, right? So of course you shouldn't do that.
But there are also positive side effects of the drugs, which were never in the notes you get when you take the drugs, and that is very interesting. It's very interesting how these act, for instance, on inflammation. So definitely that is in part our target, and that also fits the way we're working in population health, and we should really resolve these issues.
There's so much opportunity to understand mechanism. Rapamycin, like Sarantis mentioned, we don't really understand. Metformin has some beneficial effects, but it can also alter how exercise is benefiting us too. So understanding the mechanism of that, G... what are the GLP-1s? I mean, those are acting in the brain. That's fascinating, right? We're really just parsing all that out, and it's already almost in the water for many populations. Right.
There's just so much opportunity where I hope proteins can help, or at least, like I said, point to some hypotheses that can then be tested by groups like yours, Cornelia. So definitely that is a field of interest. But on the other hand, there are the exposures that shouldn't be there: the plastics that are built, the pesticides. I think we see them. We see that, you know.
And the fact that there's air pollution in the region pops up as having a similar effect to smoking. And that is not good. So I think there are a lot of opportunities, and we need a lot of hands, but also a lot of brains to do that. And technologies. And technologies to do that. And definitely, we need more of the proteins. We know that there are lots more proteins. We need more of the different isoforms.
We need to know more about phosphorylation and the processing of these proteins. But isn't that a field that, you know... I'm not young anymore, but I think, yeah, the future will definitely tell us a lot about what we have always been wondering about. To that point, around the needs for this area: what are the cohorts that come to mind that are collecting environmental information, ones that you think we want to highlight and promote?
Because, like I said, it's hard to collect these sorts of environmental variables. Are there ones that you particularly like, that you want to make sure are successful in the future and continue to collect data, that sort of thing? I think that there are many cohorts now that, of course, have really dedicated their life to looking at multiple exposures. I really favor the epidemiological setting.
And the reason for that is that, if you single out one exposure, right, it's unlikely that in your life you only have one exposure. You need a broader picture. So I was brought up in a department in Rotterdam where we always tried to look at the complete picture, with the view that, at the end of the day, you're asking yourself: what is the effect of smoking? Oh, but, you know, if you smoke, you are often more likely to drink a lot of coffee, a lot of alcohol.
You're more likely to use oral contraceptives. Hey, there are a lot more things you do. And I think that these studies have been incredibly powerful and incredibly important. The UK Biobank is a fantastic example of that, where data have been gathered, you know, they have been adding data stacked onto each other. And that allows you to do multi-omics studies in a very valid way, but also to weigh in exposures.
Now, one of the examples I would give, which convinced me totally that you have to look at multi-omics, is that what we did is look at metabolomics, and we started thinking, why? Well, the idea is that metabolomics is genetically determined, but the environment is also an active component. And you really get quite overwhelmed if you look at how strongly medication also influences metabolomics.
We're now going back and doing the same, Sarantis, on proteomics, and for some it's really overwhelming how medication is influencing your proteome. Now, look, most of the epidemiologists have been wise and have been gathering data on a lot of exposures, and that will be helpful. And definitely the medication, you need to take that into account. But, on the other hand, they should look at medication. Smoking also turned out to be a confounding factor for that.
But, you know, the fact that both metabolomics and, even more so, proteomics are associated with medication suggests what we had already hypothesized: that a lot of medication is somehow targeting the proteome. Yeah. It's the messy part of the data, right? But it's because we are collecting it across ideally large numbers of people that signal can emerge, even though there are challenges in collecting those data. I think more and more we should also include proteomics in trials.
We should do that. And that is in clinical trials in which you test medication. But please, if we do these intervention trials, also show me that you have an effect on the proteins that develop the disease. And there our aging work is important, but there are a lot more profiles that we need for dementia in the early phase. So not the fact that you have p-tau, which is just a signal that your head is full of tau, and if your head is full of tau... It's only one biomarker, right?
We need something earlier. We need more. I think that if physical activity protects you against dementia, you shouldn't start with it at age 85. You should start with that early. And I've read studies that show that. That convinced me. And of course, the... Yeah, the little advice that we have on the exposures is interesting, but we need much more. We need much more.
On what we are exposed to nowadays, even the fact that our sleep is different, that we are exposed to light at night that we were never exposed to before. There's a lot to be learned. And I think for that type of trial, there are two things about trials for exposures. First of all, they have to be big, even for caloric restriction. You see all these smaller studies where people lose weight. I mean, we need a better outcome, right? Of course you lose weight if you don't eat the calories.
It is obvious that that will happen. But we need the readouts that show us that it really stops aging. And the trials: I came to the Oxford department partly because the trials are so big, but partly because I like the spirit about the trials here, that they have to be big in order to show things, because the effects sometimes, I mean, are still subtle. I think we got used to that in genetics too.
Of course you have genes with big effects, but a lot of them will not have that big an effect. It's the aggregate of all the genes, and if it's the aggregate of the genes, it has to be the aggregate of the proteins too. Otherwise the effects of all these genes don't make sense. So I think that is what we're facing: we have to start thinking of trials with complex outcomes. And we have had a lot of benefit from that. Coming to Oxford, I really wanted to start looking at machine learning too.
And that also gave us very much of a boost, I should say. I'm not saying that machine learning solves everything, definitely not. You don't hear me say that. But if you look at even simple machine learning models, they can deal with the complexity a bit more easily. And I think we nailed that down.
And for strong associations like the proteomic age clock, it really doesn't matter whether you take a more classical approach like elastic net, or gradient boosting, which is kind of a random forest, or you take a neural network. But at the end of the day, it may be that some of these methods are more powerful at picking up these aggregates and also at translating them back into something you can get into your hands: which protein is doing what?
If it becomes completely obscure in the neural network what has done what, are you really going to invest hundreds of millions to develop a therapy for that? No, you want to first know, with not too many proteins, what is doing what. Tell me, right? And that is where you have to be able to start out. And machine learning, it's giving us a lot. Yeah, yeah. But we have to be careful about overtraining. But that's where having this growing field of machine learning is informing us.
But I think parsing out: what's the genetic contribution from ancestry? What's the contribution from gender? Yeah. What are the signals in the proteins that confer gender, that you can then use to stratify? There's so much complexity that machine learning is helping us to parse out. Yeah, and in that we were lucky. I think it has been wonderful here in Oxford that we have all these multiple cohorts. We have the China Kadoorie Biobank. Yeah.
You saw that; it's fantastic what they are setting up. We have bigger cohorts in the Million Women Study, but we also have the large trials that have been done. And, you know, even in a trial, you can now start thinking of in silico experiments: if the trial has been done with a certain drug that you want to repurpose, you can just measure in that trial what the effects on the proteins are. I think you really have to go. We have to be intelligible.
And more intelligent on how to repurpose and reuse the studies that we had. But I totally agree with people who say that if you split your data into a training and a test set, and there's structure in your data, then it's in both your training and your test set. In my early days using machine learning in gene discovery, we figured that out the hard way.
We had the test set and training set replicated, but when we finally looked at what the neural network was using, it was using missing data to predict things. Like, how is it possible? How is it possible that you can predict with missing data? There must be something that you can't impute well in that region, or there must be a reason for that. But if the problem with the missing data is in your training set, it's also in your test set.
So for us, it's so important that you can use data across studies: that we can use UK Biobank as a powerhouse and a powerful tool, but that we can replicate findings in other studies that are completely independent. That will be important in genetics. It will also be important in exposures. And there are more and more studies that I know of with Olink proteomics, of course, that are integrating these data with genetic data, and that will offer opportunities for collaboration.
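The point about missing data and independent replication can be sketched in a few lines of Python. This is a toy simulation, not the analysis discussed in the episode: the cohort sizes, the missingness rates, and the function names (`simulate_cohort`, `fit_missingness_rule`, `accuracy`) are all assumptions made up for illustration. In the simulated discovery cohort, a measurement is missing more often in cases than in controls, so a rule that looks only at missingness replicates in a random test split of that cohort but should fall to roughly chance level in an independent cohort where missingness is unrelated to the outcome.

```python
import random


def simulate_cohort(n, p_missing_case, p_missing_control, rng):
    """Return (is_missing, outcome) pairs. The measured value itself is
    ignored here: the only information available is whether it is missing."""
    rows = []
    for _ in range(n):
        outcome = rng.random() < 0.5  # True = case, False = control
        p_missing = p_missing_case if outcome else p_missing_control
        is_missing = rng.random() < p_missing
        rows.append((is_missing, outcome))
    return rows


def fit_missingness_rule(rows):
    """Learn the majority outcome among missing and among non-missing rows."""
    rule = {}
    for flag in (True, False):
        outcomes = [y for m, y in rows if m == flag]
        rule[flag] = sum(outcomes) * 2 >= len(outcomes) if outcomes else False
    return rule


def accuracy(rule, rows):
    """Fraction of rows where the rule's prediction matches the outcome."""
    return sum(rule[m] == y for m, y in rows) / len(rows)


rng = random.Random(0)

# Hypothetical discovery cohort: missingness is tied to the outcome
# (an artifact of how the data were collected, not biology).
cohort_a = simulate_cohort(4000, p_missing_case=0.8, p_missing_control=0.2, rng=rng)
rng.shuffle(cohort_a)
train, test = cohort_a[:3000], cohort_a[3000:]

# Hypothetical independent cohort: missingness is unrelated to the outcome.
cohort_b = simulate_cohort(4000, p_missing_case=0.5, p_missing_control=0.5, rng=rng)

rule = fit_missingness_rule(train)
internal_acc = accuracy(rule, test)      # same cohort, random split
external_acc = accuracy(rule, cohort_b)  # completely independent cohort

print(f"internal test accuracy: {internal_acc:.2f}")
print(f"external cohort accuracy: {external_acc:.2f}")
```

Because the training and test sets are drawn from the same cohort, they share the same artifact, so the internal split cannot reveal it; only the independent cohort does.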
Yeah. Awesome. Well, I think this is a great place for us to sort of wrap up. I'd love to give you a chance to share any last thoughts, Cornelia. No. Anything that it's... The only take-home is, well, I already mentioned the race is not the realm. I think it's not technology. You guys are still finding better ways to quantify the proteins. You are finding better ways to describe the proteome. That will be ongoing. I think it's quite exciting to be in this field.
The other way around is that for us, in data science and epidemiology, I think there's a lot of work to do. A lot of thinking to do on how to analyze the data, how to integrate it over the exposome, the genome. And then it's going to be very exciting in that field, telling people how to prevent the disease better, giving them tools to monitor. And nobody would have thought 20 years ago, first of all, that we would not go outside of our house without taking the phone.
But, you know, that we would be having Apple Watches. And Fitbits. Yeah, how many computers do we carry? For them it was difficult to understand that we landed on the moon, and to accept that. But, you know, nowadays this is the field that's going to develop, and we're also going to be boosted on the data analytics, on the integration of data, the use of machine learning, the abuse, but also correcting that again. And how to translate that back.
You know, it's not only the data science that is relevant. At the end of the day, what is relevant is what you do, the impact that you have in curing people, in preventing the disease, because, you know, if anything in your life, you don't want to become diseased, you want to prevent it. Definitely with dementia, but also with many other diseases. It's that translation that counts. And that is important. And it's important that we all keep that in mind. Couldn't agree more.
And you all heard it here: Oxford is the place to go for large data sets. There are amazing cohorts here and amazing scientists to work with. So for those of you who are thinking about post-docs or PhDs, think about that. Sarantis, I'll give you a chance, please. Any last thoughts? It was great. It was great to hear from Cornelia about aging and aging-related diseases. I think proteins play a really important role in that.
The plasma proteome is in the spotlight now, and using this plasma proteome, we can understand the biology of disease from different tissue types. We can also understand the phenotypes of different tissue types by screening only the plasma proteome. I think that's the take-home message here. And really nice to have you, Cornelia. Great, I enjoyed it a lot. Thank you very much, Cindy. And yeah, I mean, the last word is for you, Cindy. Super fun. Super fun!
So then I'll just go back and double-click on a few things. I mentioned the study that demonstrated that having genetic data going into a clinical trial helps improve success by at least two times. That was actually Matt Nelson, so apologies for that, in 2015, and that was a Nature Genetics paper, a really pivotal paper. And then of course AstraZeneca has also published on their ways of filtering and leveraging genetic data in different ways.
That gives them a seven times improvement in clinical trial outcomes, which I just wanted to highlight. And then we also mentioned Austin Argentieri. We mentioned Sihao Zhao, a PhD student, and then we also talked about the China Kadoorie Biobank, but we didn't mention Zhengming. So I want to give a shout-out to the amazing biobank that he's built. As I understand it, really, a lot of the UK Biobank structure was founded on how Zhengming built out the China Kadoorie Biobank.
So those two are great ones. And people use them a lot for corroboration and combining data. And Austin is a great example of someone who's done that so well. Well, in the future, we'll get him on the podcast, perhaps once that paper comes out. And that paper will probably be out by the time we get this podcast published, I hope. So we can use this as an opportunity to promote that important work. And so, with all of that exciting content that we've talked about today.
And I want to thank you, Cornelia, so much for agreeing to come on and trust us with some of your story. Thank you very much. Thank you for having me. Well, that wraps up this episode of Proteomics in Proximity. Huge thanks to our guests and authors of such impactful publications. I also want to thank you for tuning in. Really appreciate you being here.
If you enjoyed the content of this episode, please think about sharing it with friends or colleagues you think might be interested in the content. In addition, if you'd be willing to head over to Apple or Spotify or wherever you digest your podcasts and give us a rating and review, this will help others find the podcast when they're searching for proteomics or precision medicine podcasts. And mostly I want to say we would love to hear from you.
So we have a dedicated email address, pip@olink.com. Please reach out. Let us know what you're interested in hearing about, what you care about, and any feedback on the episodes that we have already done so far. This is all about you, and so we're really keen to make sure that we're meeting what you like to hear about. Thank you so much and we'll see you soon.
