Viruses are in the air we breathe, in the water we drink. They're in the ground we walk on, they're on our skin, they're in our bellies. They have us surrounded, and the wild thing is we've only identified a fraction of them. In other words, not only are we surrounded and permeated by viruses, we're surrounded and permeated by viral dark matter, by viruses that we don't even know exist.
We have lots of viruses in us and we have no idea what they're doing, and potentially in that dark matter, there are some answers to the questions on what are they doing there.
I'm Jacob Goldstein, and this is Incubation. Today, on our final episode of season two, we're going out to the scientific frontier to talk about all the viruses we don't know about in the world and in our bodies. In the second half of the show today, I'll be speaking with a researcher who has recently discovered hundreds of families of viruses that live inside the human gut, and he's found a link that suggests some of those viruses could
actually help kids stay healthy. But first I'm going to talk with Ken Stedman. He's a professor of biology at Portland State University. He studies viral dark matter, which basically means he goes looking for viruses in wild places. To start, I asked him, how do you look for a virus that nobody knows exists?
A couple of different ways. All viruses that we know of, by definition, have to have a host that they infect. What we do is we'll go and collect samples in the craziest places we can find, usually volcanic hot springs, and then we bring them back to the lab and see if they infect our favorite microbes that also happen to grow in these hot springs.
I've read a little bit about your work at Lassen Volcanic National Park in northern California. So tell me about what's going on there. Tell me about Boiling Springs Lake.
So, Boiling Springs Lake I like to describe as the biggest hot spring in the world that nobody has ever heard of. It's a slight exaggeration. The low temperature in the lake is about one hundred and thirty, one hundred and forty degrees Fahrenheit.
And so what does that mean for finding weird viruses?
Well, hang on just a second, that's the temperature. I haven't told you about the pH yet, have I?
Wait a minute. If you like the temperature, you're gonna love the pH.
Exactly. So the pH is about two.
A pH of two means it's acidic. It's highly acidic. So not great for soaking, is what you're saying.
We've seen people walking up there in their swimming gear, and we tell them it's not a real good idea.
So you go to this hot, acidic lake, and what do you do there?
We just took about two hundred liters' worth of water from the lake and then purified all of the virus-sized particles in it, then determined what their genetic sequences were, what we call the metagenome. But basically, of all the viruses, what genes do they have in them.
So you're basically just, what, pouring this acid into a machine and saying, tell me all the genes that are in here?
More or less. Yeah. So one of the things about viruses which makes viruses incredibly unique is they have what we call a virion. It's the virus structure. So the lunar lander module kind of thing.
Right, your classic virus looks like a little lunar lander, like a pod, and then little legs coming out, right?
Absolutely, and it's relatively small.
And that's a phage, right? That's the classic phage. That's the thing that lands on the bacterium and then inserts its genetic material.
Injects it, exactly. But even if you think about, you know, SARS-CoV-2, the virus that causes COVID-19, it also is a little bag which has genes on the inside of it. So you break open the bag and you throw it into the machine, and then it gives you back hundreds of thousands of sequences, in our case now millions of sequences with the newest technology. So millions of genes, hundreds of thousands of genes. But they're not genes, they're
gene fragments, they're little pieces. Now, at first you just want to look at what those little pieces are relative to known sequences.
Uh huh.
The dark matter is going to be, you know, those little pieces that don't match anything, and the light matter is going to be stuff that does. Ninety-plus percent of the sequences that we got back, of our hundreds of thousands of sequences, didn't match anything.
And what did you think when you saw that?
Oh, it's like other environments. Other people have seen very similar things. So you do this with seawater, you do this with things you find in soil. Ninety-odd percent, plus or minus, don't match anything.
Does that mean that we don't know about ninety percent of the viruses that are out in the world? Is that broadly what that implies?
That is exactly what it implies.
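The "match against known sequences" step described above can be sketched in a few lines of Python. This is only a toy illustration, not the lab's actual pipeline: the function names, the shared-substring matching rule, and all of the sequences are made up for the example, and real studies use dedicated sequence-search tools against large public databases.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def dark_matter_fraction(fragments, known_sequences, k=8):
    """Fraction of fragments that share no k-mer with any known sequence.

    Assumption for this sketch: a fragment "matches" if it shares at least
    one exact substring of length k with the reference set.
    """
    known = set()
    for ref in known_sequences:
        known |= kmers(ref, k)
    unmatched = [f for f in fragments if not (kmers(f, k) & known)]
    return len(unmatched) / len(fragments)


# Hypothetical toy data: one "known" virus and four sequenced fragments.
known_sequences = ["ATGGCGTACGTTAGCCTAGGA"]
fragments = [
    "GCGTACGTTAGC",      # a piece of the known sequence: "light matter"
    "TTTTCCCCGGGGAAAA",  # no match: "dark matter"
    "CACACAGTGTGTCA",    # no match
    "GGATCCGGATCCTT",    # no match
]

print(dark_matter_fraction(fragments, known_sequences))
```

In this made-up sample, three of the four fragments match nothing in the reference set, so three quarters of it counts as dark matter.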
And it's not just in a weirdo boiling acid lake. How about just in the dirt? If I just went into my yard and dug up some dirt and sent it to somebody who could put it in one of your machines, what percentage of the viruses in my backyard are known to science?
Roughly twenty percent.
Wow, eighty percent are dark matter, are unknown. I love that.
It keeps us employed.
Yeah, so okay, so you get this result back: ninety percent is unknown. And so what you have is like a genetic mess that you don't know what to do with, because it's not like each little fragment is like, oh, that's a new virus. It's just, these are weird fragments that we don't understand.
Yeah, exactly, weird fragments that we don't understand. But one of the other things that we found is some of the fragments that we could actually identify didn't look like sequences that we should have found. Meaning not only are they different than anything that's been found before...
They are like too weird. They're like, wait, that doesn't make any sense. How could that even be?
Exactly.
Did you think you had made a mistake of some sort, or that the machine was broken?
We thought that we had absolutely screwed up in this case. So for the genetic material of viruses, you've got RNA viruses, you've got DNA viruses, right?
So basically a virus is just like a bag with genetic material in it. Some viruses have DNA and some viruses have RNA. And even though these are like two types of viruses, sort of historically, evolutionarily, they're really different from each other.
Right, DNA viruses and RNA viruses we always thought were completely different relative to each other. And if you think about the evolutionary relationship between RNA viruses and DNA viruses, there basically seems to be almost none.
Like how big is the gap? Sort of whatever evolutionarily, how different are DNA and RNA viruses?
So the difference between DNA and RNA viruses is probably billions of years evolutionarily speaking.
Okay, I was gonna say, like, it's like as big as the difference between mammals and reptiles, but it's way bigger than that.
It's probably more like the difference between, you know, bacteria and people. Bacteria and people, exactly. Much more like that in terms of evolutionary difference.
Wow. Okay, So there are these profoundly different things.
So we sequenced a bunch of DNA, put it into our machine, you know, said, hey, get some DNA sequences. And then, of the approximately couple of thousand sequences that actually matched something, some of those sequences were things that looked like RNA viruses in terms of their sequence.
But it's DNA that you're sequencing?
Yeah, we sequenced DNA. And when I say we, mostly a graduate student working in our group, Geoff Diemer. He then started to try and put some of these pieces together. What he found was those pieces that looked like RNA viruses were connected genetically to sequences that looked like DNA viruses.
Okay, and connected like physically, like they were physically almost like one piece of a chain of genetic material?
Exactly. And then what we did is we went back to the samples that we collected from Boiling Springs Lake, and instead of pouring them into the machine to get the sequences, we made many, many copies of whatever this piece was. And this was to show that the pieces were actually connected to each other. So there are these, what we're now calling cruciviruses, that appear to have evolved by DNA viruses and RNA viruses coming together.
Okay, so we thought these were like totally different kinds of viruses, but now you have discovered this new kind of virus that's kind of like a cross between the two of them, right? What does that mean? Like, what does it mean for how we think about RNA viruses and DNA viruses?
It means that there's communication between them, and there's this recombination. So it's not billions of years of evolutionary difference, which is what we thought. Now it looks as if they can be exchanging genetic information with each other, which is really kind of revolutionary in terms of thinking about virus
evolution. And what it means is, we always thought DNA viruses evolved like this and RNA viruses evolved like this. But if they can exchange genes with each other, that kind of throws a lot of what we think about virus evolution out the window. Turns out that these viruses in and of themselves are just so different from any other virus anybody's ever seen before, in terms of their shape, in terms of their genes, what is in them.
So you and your colleagues found this crucivirus in the boiling acid lake. I know that since then a number of other cruciviruses have been found. So just give me the landscape. Give me what we know so far of, like, where are they, what are they doing, et cetera.
We do not know what they're doing. Cruciviruses have been found in Boiling Springs Lake, in Antarctic lakes, in deep sea sediments off the coast of Greenland, in Korean air samples, in isopods off the coast of Oregon, in monkey feces, in dragonfly guts, in soil just outside the lab at Portland State University. Basically anywhere that we have looked, we've found these cruciviruses. Very low amounts of them, but they seem to be very ubiquitous. So where are they? Everywhere.
Love it.
What are they doing? We don't know.
Are they in my body right now?
Probably in your body right now.
So these things are all around us, all over the world, possibly in our guts, and nobody knows what they're doing.
That is exactly correct. I love it. Me too.
So what do we know about, like, what they're doing?
We're trying to figure out what they infect. We think they're infecting microbial eukaryotes, so things like fungi or protists, these paramecia things, you know, swimming around in lakes.
Are there also organisms like that in our bodies?
There definitely are.
Is that part of the microflora?
Yeah, we have a eukaryotic microflora. Mostly these are going to be fungi, some kinds of yeasts, et cetera, but there are many others. And again, this is something which has not been very well studied. So environmental viruses have not been well studied. These microbial eukaryotes have not been very well studied. So you put those two together: extremely poorly studied.
Very dark. It's very dark matter.
Very dark matter, but at the same time really exciting because there's so much to discover.
Like why does microbial dark matter matter?
Besides being cool, I think it's an area where we can make discoveries. There's so much we don't know. We have lots of viruses in us and we have no idea what they're doing, and potentially in that dark matter there are some answers to the questions on what are they doing there. So I think that that's a very important thing to think about.
Not just how are they making us sick, but how are they keeping us healthy? How might they get out of balance at times and contribute in indirect ways to sickness? Certainly seems plausible. We know that happens with the bacteria in our gut.
Yeah, I think that that's a very reasonable thing to think about. And then just in a larger ecological sense, you know, understanding the ecology. There's still so much that we don't know. I think understanding viruses' role in not just us, but also in life on our planet, I think understanding that dark matter will really help us understand what's going on with all of these different viruses.
I appreciate your time. It was a fun conversation.
Yeah, it was a fun conversation for me too. I learned things, so thank you for that.
Good.
Ken Stedman is a biology professor and extreme virologist at Portland State University. His work and his team's work are expanding our idea of what a virus can be. In a minute: discovering hundreds of kinds of new viruses that live in the human gut.
I'm going to go out on a limb and say the most underrated viruses are phages. Phages are the viruses that infect bacteria. They're the most abundant biological entity on Earth, and they're killers.
Every other bacterium on Earth gets killed by a virus every day.
Actually, that's wild to think about.
It really sucks for them.
Shiraz Ali Shah studies the phages that live inside people. He's a researcher on a project called COPSAC, the Copenhagen Prospective Studies for Asthma in Childhood. The project is following hundreds of kids from birth into childhood to try to understand the causes of asthma. Shiraz focuses on the human virome, the universe of viruses that live in the human gut, and he told me that studying the virome from birth is really important.
In the first year of life, the baby has an immune system that has not yet matured, so it does not know how to distinguish friend from foe. What happens in the first year of life is that the immune system is still trying to get to know what is it supposed to attack and what is it not supposed
to attack. And it seems that there's more and more evidence showing that if you are not exposed to a diverse array of good bacteria in the body and on the body within the first year of life, then the immune system is not properly trained, and then you're way more prone to chronic inflammatory or immune diseases in the future, like asthma, like allergy, even stuff like depression and anxiety. Inflammation-linked heart disease? Most definitely. Cancer? Most definitely. Diabetes? Most definitely, yes.
So okay, so now you're getting into some of what you study, right, tell me about your work on this.
So this is a place called COPSAC, the Copenhagen Prospective Studies for Asthma in Childhood.
It's a place where they're trying to understand how asthma works in kids.
Exactly. And so the way that they do this is basically, they have a bunch of kids that were born in twenty ten, and they've been following them since the moms got pregnant, and today they're like fifteen years old.
Right.
What they're doing is they're recording as much data on these children as humanly possible, like where do they go to daycare, how many siblings do they have, but also blood tests, you know, which chemicals do they have in their bodies, in their pee, what bacteria do they have in their poop, in their lungs, et cetera, et cetera. So we have like gigabytes upon gigabytes. Also their own genes, their own genomes, we also have.
And so, just to be clear, the idea of doing all this and starting before the child is even born: is the question they're trying to answer, why do some people get asthma and others don't?
Exactly. Because even though asthma is such a common childhood kind of disease, it's very poorly understood. And this is not only the case for asthma. It's also the case for all the other chronic diseases, basically, that kill adults, like cancer, heart disease, diabetes, you know, chronic respiratory disease,
multiple sclerosis, you know, all of these. And so maybe by collecting all of this data on the children, we can start predicting, based on the data, who's going to get which disease, and based on that, maybe we can figure out, okay, if we do this, this and this, maybe we can avoid that and that and that chronic disease. Every time the kids visit us, and they do so once a year, we take as many samples as we possibly can.
Right, so you have this whole poop library going over the kids' whole lifetime that you can sort of examine over time. Yes, and how many kids are in this cohort?
So we have two cohorts, and what I'm going to talk about today, the data is from the COPSAC twenty ten cohort. So they were born in twenty ten. They're like fourteen years old now, right? And the twenty ten cohort is seven hundred kids.
So the cohort you're following is seven hundred kids who were born in twenty ten. You're coming into this as a person who has been studying viruses that attack bacteria, for our purposes here. And so when you get there, what do you do?
I get there and then my boss basically explains to me some of the studies that they've been doing on the bacteria in the gut so far. And one of the major studies that they did, just like one year before I came, was that they found that in one-year-olds, when you're basically still a baby, the bacteria that you have in your gut end up determining whether or not you get asthma as a five-year-old. And I was like, what, I mean,
how is that even possible? And so the general picture is that if you have only a few different bacteria in your gut when you're one year old, then you have a much higher risk of getting asthma as a five-year-old, right? But if you have like loads and loads of different bacteria in your gut when you're one year old, then you're much more protected from asthma as a five-year-old. And so basically that got me thinking, wow, that means that most bacteria are
actually good for us. I mean, there are few bacteria, maybe one hundred species in total, that can cause infection. But the total number of bacteria in nature is like one hundred million species at least. So those other one hundred million are not causing disease. It's just one out of a million bacteria that is bad.
And that one in a million gives them a bad name. And so go on.
So I was thinking, Okay, if that's the case for bacteria, then what about viruses. What if it's the same for viruses. What if the only viruses that we know about are the ones that cause disease and there are loads of other viruses that are actually good for us. That's what
I was thinking back then. But the funny thing is that this other guy called Dennis Nielsen, who is a professor at Copenhagen University, because he's an expert at figuring out which viruses are in a sample, he basically said, okay, you guys found this thing with bacteria, why don't we look at the viruses in the gut, and maybe we can find something similar or even cooler. And so when I started at COPSAC, this data set was already in the process of being generated. Dennis had taken seven hundred fecal samples, extracted viral particles, and then basically put them through a sequencer, and we were getting in sequences from each child.
Sequences, meaning genetic sequences that allow you to determine what viruses are there.
Yeah, exactly.
So you get there in twenty seventeen, and another researcher is already just starting to look for what viruses are in the fecal samples of these kids in the study. How do you get involved? What do you do? What happens back then?
What people used to do when they got gut virome data is that they would take all the DNA sequences that came out of that and they would then BLAST it, is what it is called, against a public database of viruses, viruses that scientists have already discovered and know about, so that you can figure out which viruses are in those samples. The problem is that most of the viruses in the human gut at that time were unknown to science.
I love it.
So by doing that exercise, you're only going to get like a list of contents of maybe ten viruses, whereas the actual diversity in each sample is going to be like maybe ten thousand, or maybe a thousand or something.
Right, but the problem is you don't know what you're looking for, right? You just have these random strings of genetic material, and if you're trying to find newly discovered viruses, well, how do you even do that? In fact, how do you do it?
So what we first do is we assemble all the sequences like pieces of a puzzle, and they get extended so that you get larger and larger fragments of DNA that must have come from the same virus.
You have this weird set of little chains and you need to put together like, ah, here is a virus and here is a different virus.
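The puzzle-assembly idea can be sketched as a toy greedy overlap assembler in Python. This is a minimal illustration under stated assumptions, not the method used in the study: the function names, the minimum-overlap threshold, and the reads are hypothetical, and real assemblers handle sequencing errors, repeats, and both DNA strands.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 if the longest such overlap is shorter than min_len.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=4):
    """Repeatedly merge the pair of sequences with the largest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]


# Hypothetical reads: two fragments each from two made-up viruses.
reads = ["CGTACGTTAGC", "TTTTCCCCGG", "ATGGCGTAC", "CCCGGGGAAAA"]
for contig in sorted(greedy_assemble(reads)):
    print(contig)
```

Here the four short reads collapse into two longer sequences, one per made-up virus, which is the "here is a virus and here is a different virus" step in miniature.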
Yeah, exactly, and so that's then what happens. Now we've got a bunch of DNA sequences from each child, and then what I do is I annotate all the protein-coding genes on these strands of DNA, so that I know which proteins are encoded on each DNA fragment. And by looking at those proteins, what kind of functions those proteins have, I can start making qualified guesses in terms of, okay, this one is a virus and this one must not be.
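A first step toward annotating protein-coding genes is finding open reading frames: stretches that start with a start codon and run to a stop codon. The sketch below is a simplified Python illustration, not the annotation pipeline used in the study. The function name, the minimum-length cutoff, and the example sequence are assumptions, it scans only the forward strand, and deciding what a protein does requires comparing it to proteins of known function, which this sketch does not attempt.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=3):
    """Return (start, end, sequence) for forward-strand open reading frames.

    An ORF here runs from an ATG start codon to the next in-frame stop
    codon; min_codons counts the codons before the stop.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return orfs


# Hypothetical fragment with a single short ORF.
fragment = "CCATGAAACCCGGGTAACC"
print(find_orfs(fragment))
```

Each candidate found this way would still need to be judged by a person or a classifier, which is the manual part described in the conversation.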
Are you like actually looking at sequences, like one looks at jigsaw puzzle pieces on a table?
Yeah, I guess you could say that. I mean, I can look at the protein-coding genes that are encoded on each cluster, and I manually looked through ten thousand clusters of sequences, and out of those ten thousand, around three hundred of them were the ones that I could confidently say were viruses, and they correspond to viral families.
So when you're saying you're manually looking through ten thousand, is that like years of work?
Yeah, it took five years, actually four years.
Yeah, and so you do this work, you spend four or five years going through this data. How many viruses do you find that live commonly in the human gut?
In the children who we looked at, and that's all we can really say anything about, there are ten thousand species of viruses distributed in around two hundred and fifty viral families.
So you discover all these new viruses. Does that mean you get to name them?
Super good question. This was actually a huge issue for us. So now we're finding two hundred and fifty new viral families. How are we gonna present this in a paper?
Right, it can't just be like A, B, C. You're gonna run out of letters.
Exactly. And so a lot of different suggestions were on the table. Pokemon was one of them.
Did you have a Pikachu in mind? That's the first question. Who gets to be Pikachu?
Exactly, like Pikachuviridae, you know, Charmanderviridae, et cetera, et cetera. And then a colleague of mine, Jonathan, who's the third author of this paper, he suggested, why not just name them after the kids?
After the kids in the study? The kids whose poop had the viruses in it?
Exactly. So we shuffled all the names and then we just distributed them over the two hundred and fifty viral families.
So what are some of the names?
Christianviridae, Lucasviridae, Josephineviridae.
Yeah, so you do this work, you identify all of these previously undiscovered viruses that live in the guts of these kids. Do you then start to try and understand the health implications of different viromes, et cetera?
That was the entire purpose of this exercise, right? So the bacteriophages, which were also by far most of all the families...
The viruses that infect bacteria.
Exactly.
Those bacteriophage families can be divided into like two broad categories. They are the virulent bacteriophages and the temperate bacteriophages. The virulent bacteriophages, they just kill the bacteria, whereas the temperate bacteriophages, they integrate themselves as prophages in the bacterial DNA.
So first you look at the viruses that infect bacteria, and then you divide those into two categories, and you say, there's the viruses that just destroy the bacteria, and there's the viruses that infect the bacteria but don't destroy it.
Exactly.
Does that tell you anything clinically?
Yeah.
So Christina, who was the first author of that paper that came out in Nature Medicine earlier this year, she found that it was the temperate bacteriophages that were predictive of later asthma. For some reason, the children that end up developing asthma by age five, they had way more temperate bacteriophages in their gut at age one.
Uh huh. And so the key data set is you're looking at the virome of the kids at age one and trying to understand, is it predictive of asthma by age five? And what answer do you and your colleagues find to that question?
What we find is that there are more temperate phages in the kids who end up developing asthma later. Then we look at the temperate phages specifically, and we look at which families of temperate phages are predictive of disease. And then what we find, which is kind of surprising and funny, is that of the two hundred and fifty families we had in total, two hundred and thirty
of them were temperate, and for nineteen of them, if you look at the amounts of those nineteen families in the children, you can actually distinguish between kids that end up developing asthma as five-year-olds or not. And what's interesting is that the kids that develop asthma as five-year-olds have less of these nineteen families than the healthy ones.
Aha.
So is it right that these nineteen families of viruses seem to maybe be protective against asthma? Like, having more of these particular viruses is correlated with a lower risk of asthma?
Exactly.
That's very interesting. Now I get nervous that even though it passes some set of statistical tests, this is going to be a fluke finding. You know, it's going to be due to random chance.
And so what I really want you to do is go run this test on some other kids at age one, make your prediction, and have it come true by age five. Is that a reasonable thought?
That is super reasonable, I have to say, Jacob. And this is also something that Nature Medicine asked us to do, and we said, well, nobody else has virome data for so many children. Unfortunately, such a cohort does not exist. You know, COPSAC twenty ten is one of the most deeply phenotyped cohorts in the world, so we were not able to replicate it in another cohort.
Yeah. So you have this finding that certain families of viruses seem to be protective against asthma. Are you able to understand anything about what causes a kid to have or not have these apparently protective families of viruses in their gut?
Super good question. I don't know. I think it has a lot to do with different environmental factors that end up determining for random reasons, which viruses end up in the guts of these children.
I mean, when you say you don't know, does that mean there's no way in your data set to investigate the question?
There definitely is, and this is what we're doing. It's ongoing, basically. So what we do see is that there's a huge correlation with, for example, where the kids live, whether they live in a rural environment or like a city environment. The ones that really live in a rural environment have a much more diverse, you know, ecosystem in the gut, in terms of the bacteria. We haven't looked at the viruses directly yet, but we have an
intuition that the same might apply for viruses as well. Also, there are huge, you know, kind of links to the diet, the kind of food that you eat, whether it's very processed food or whether it's like whole foods. Whole foods are generally associated with way, way higher diversity. So if you want to increase your chances of having the good viruses in your gut, then it's a good idea to live, you know, rurally, or at least spend
some time in nature. It's a good idea to eat whole foods instead of processed foods, et cetera.
Okay, so that's based on what we know about bacteria and what you suspect is true also for viruses. Let me ask you this: when you think about the future, what do you hope we know about the virome in five, ten, twenty years that we don't know now?
I'm hoping in the future that we have a much better overview in terms of what kinds of chronic diseases are caused by deficits in which viruses, but also in bacteria, so that maybe ten, twenty, thirty years from now, we can prevent a lot of the diseases that cause a lot of problems today, that those can just be prevented by giving babies, or even adults, viruses or bacteria.
Thank you so much for your time. It was great to talk with you.
Good to talk to you too.
Shiraz Shah is a senior researcher at the Copenhagen University Hospital Gentofte. Thanks to both of my guests today, Shiraz Shah and Ken Stedman. Incubation is a co-production of Pushkin Industries and Ruby Studio at iHeartMedia. It's produced by Kate Furby and Brittany Cronin. The show is edited by Lacey Roberts. It's mastered by Sarah Bruguiere, fact checking by Joseph Friedman. Our executive producers are Lacey Roberts and Matt Romano. I'm Jacob Goldstein. Thanks very much for listening
to this season of Incubation. I hope we'll be back next year with Season three.