How do you sequence the genomes of 70,000 species?

Oct 30, 2024 · 14 min

Episode description

Welcome back to the new series of the Oxford Sparks Big Questions Podcast! We are here to answer weird and wonderful questions about our world, with the help of science. And we’re starting with a very big question! How do you sequence the genomes of 70,000 species?

 

Dr Liam Crowley, from the Department of Biology, tells us about the ground-breaking Darwin Tree of Life project, which aims to sequence the genomes of over 70,000 species in Britain and Ireland. Discover the challenges and technological advances that make this monumental task possible, and explore the potential applications in fields like conservation genetics and evolutionary biology.

 

Tune in to find out how this project could revolutionise our understanding of biodiversity and the future of life on Earth!

Transcript

>> Emily Elias: A genome tells us the genetic building blocks of what makes something something. It took over a decade to figure out the human genome, and now a group of researchers are thinking bigger. On this episode of the Oxford Sparks Big Questions Podcast, we are asking: how do you sequence the genomes of 70,000 species?

Hello, I'm Emily Elias, and this is the show where we seek out the brightest minds at the University of Oxford and we ask them the big questions. And last time we spoke to this researcher, it was about bed bugs. So, hopefully, this time around, there will be less anxiety and creepy crawlies on your skin.

>> Liam Crowley: Hello. So, I'm Dr Liam Crowley, and I am a postdoctoral researcher in the Department of Biology at the University of Oxford. And I am working on a project called the Darwin Tree of Life project.

>> Emily Elias: What is the Darwin Tree of Life project?

>> Liam Crowley: So, the Darwin Tree of Life project is a very exciting, ambitious project, which is a collaboration between lots of different institutions, including universities, but also museums and botanic gardens. And we have the ultimate aim of trying to sequence the full genome of every single species of eukaryote in Britain and Ireland.

>> Emily Elias: That sounds like a lot of species.

>> Liam Crowley: Yeah, ambitious is definitely a good word to describe the project. So, based on our current list of what we expect there to be in Britain and Ireland, there are more than 70,000 species of animals, plants, fungi and protists.

>> Emily Elias: Maybe we should start from the basics. What exactly is the genome that you're sequencing?

>> Liam Crowley: So, the genome is all of the genetic material held within the cells of these organisms. So as well as everything inside the nucleus, with all the chromosomes, it's also everything outside the nucleus. So there's some DNA held in things like mitochondria. So we want to take every single sequence of those four bases that make up DNA (adenine, thymine, guanine, cytosine): the order of those bases across all of the millions of base pairs that comprise that genome.

>> Emily Elias: But why would you want to do this?

>> Liam Crowley: Well, it's a very good question. So the first way you could kind of answer that is actually just to say, because we can. Because actually, this is the first time in all of human history where something like this would even be feasible. And one of the first genomes that we produced was our own genome, the human genome. That took about a decade and billions of dollars to do just one genome, but actually that revolutionized all sorts of different fields of research and medicine. So now it's the turn of everything else, the rest of biodiversity. We want to try and eventually sequence all of the DNA on the planet, and it will give us a much greater understanding of all these different species. But also there's loads of different applications for genomic science.

>> Emily Elias: You're into insects. How would you apply this, then, to an insect? What sort of thing would you be able to take away?

>> Liam Crowley: Yeah, that's right. I'm an entomologist. I am focusing on trying to find all the different species of insects that we have at Wytham Woods. So then we can prepare those specimens, extract the DNA, and sequence them. And there's lots of different things we can do from this. So I categorize it in two different ways, the first of which I would describe as discovery science. So we don't know what we don't know. So, actually, just by looking at all this data, we can start to find patterns and interesting things going on with the genomes themselves. And we can also see how these different species are evolving, how perhaps they're related to other species, and how they're evolving: is there convergence, or are there unique mutations and adaptations that are arising within specific genes or gene families in these genomes?

Then the other thing that we can do is what we like to call enabling science. So we have this really kind of grandiose sentence that we can say, that genomes are fast becoming an essential component of a 21st century biology toolkit, meaning that more and more, genomes are becoming a fundamental prerequisite to then allow us to do a whole range of other scientific inquiries. So, a really good example of this would be for conservation, and conservation genetics. So, if you want to see how related a vulnerable or isolated population is to each other or to different populations, we can do various sequencing to kind of see that genetic diversity. But before we can do any of that, we need to actually have that original reference genome so we know what part of the genome to look in to sequence, because we can't sequence an entire genome every single time, but we can very quickly and easily sequence just very small snippets. So it's all about knowing where to look.

>> Emily Elias: How long does it take to sequence something? I mean, the human genome took, what, like, 13 years to do. So, like, I can't imagine this is a quick turnaround.

>> Liam Crowley: Yeah, that's right. The first few, because we were kind of figuring out the process, did take a very long time. But actually, both the time it takes and the cost it takes have decreased beyond exponentially, which is really quite impressive. We can sequence a genome very, very quickly because this is all happening at scale. It's hard to say how quick one particular genome might take. But with whole batches of genomes going through this process, best case scenario, we could actually go from collecting a beetle from a log in Wytham Woods to actually publishing a full, high quality genome within a matter of weeks, potentially.

>> Emily Elias: That's insane.

>> Liam Crowley: Yeah.

>> Emily Elias: So we've gone from years down to weeks. Is that, like, the power of AI, or is it something else at play?

>> Liam Crowley: It's to do with how we actually sequence it, so find out the order of those bases, and then the software and the programs and the way that we put those sequences together. It's a little bit like doing a jigsaw puzzle. If you had a jigsaw puzzle with a few large pieces, it's a lot easier to put it back together than if you had one with lots and lots of small pieces. So new modern sequencing technology is called long read sequencing. And that's exactly what we're doing, where we're actually starting from larger original fragments of DNA. And that's really helpful because DNA is actually really repetitive. So it's really difficult to know, actually, which section that bit of DNA actually originally came from. So it's like doing a jigsaw with loads and loads of pieces where they're all gray and they've all been shoved into one giant bag, and you're trying to work out what on earth goes where.

>> Emily Elias: So I guess that you're kind of in the process of making this giant library of genomes. What would be the hope that somebody would be able to do with it? Would it be like, "Oh, I'm really curious in this beetle. Let me go take out the beetle book and see what's been happening with these guys"?

>> Liam Crowley: Yeah, absolutely. So one of the big pillars of this project is it's all completely open and available at every single stage. So all of the draft data and everything is all made available, kind of with the caveat that, yes, it's not finished. There may be mistakes. Hopefully, the finished project will be brilliant. And so far, the quality has been unbelievably good. It's almost like we're building a library and we're putting the books onto the shelves, and then anyone around the world can come and they can take these books and they can do whatever research they want to do from that. So we already have examples of people using our genomes, mostly in genomic science, but also in other fields as well, with some really exciting projects.

>> Emily Elias: Do you have any examples of what people have done when they take those books off the shelf and what they're using it for?

>> Liam Crowley: Yeah, so, there's actually some really nice examples from the insects, particularly in conservation genetics.

>> Emily Elias: Oh, you would say that. You love your insects.

>> Liam Crowley: Yeah, not that I'm biased at all. But there's this one insect, a butterfly. It's called the large blue butterfly, and it actually went extinct in the UK, and then there was a reintroduction from some Swedish individuals. And it's now doing really well thanks to some quite intensive conservation efforts. And it's actually spreading. But because you have these kind of metapopulations across these areas and they're potentially quite restricted, it's really important that we know how related they are, and actually, do we need to intervene? Perhaps we could translocate individuals or just kind of look after healthy genetic diversity for these populations. But we won't be able to do any of that conservation genetics before we have that original reference genome. So, yeah, we worked very hard and managed to get permits and permissions in place to take a couple of individuals to sequence, which would have no impact on the population. They're taken from one of the sites where it's doing best. And we are now producing that genome. So as soon as that's finished, we have a direct application of people ready and raring to go to then do some really important conservation genetic work.

>> Emily Elias: And how do you produce a genome? Like, do you just send it to the genome factory and then it spits it out like a box of crackers? Or how does that work?

>> Liam Crowley: Yeah, so, the one word answer to this would be teamwork.

So we have a huge team and process which we have been perfecting over the last four or five years. But essentially you have someone like me, who goes out and finds the species, identifies them and then preserves them. So everything is flash frozen at -80 to preserve that high quality DNA. That then goes to the Sanger Institute in Cambridgeshire, where they essentially break open all the cells and extract that DNA. That DNA can then be checked for quality, and then if it looks like it's good, it can be loaded onto the sequencing machines. They then determine the order of those bases, and then all of that data goes on to the assembly step, which is bioinformatic processes which then reconstruct the jigsaw. And then there's various kinds of post-assembly checks of quality. We have other techniques going on at the same time to make sure that all the scaffolds and some of the high level structures of the genome are all being put together correctly. And then we're even doing some annotation, so actually sequencing some of the RNA alongside the DNA to see where the genes are. And actually, can we label genes on the genome and try and have some work related to that? So, yeah, a lot of different people are part of this big pipeline. But there seems to be a very good process where we're kind of trying to link back to each other and make sure we keep track of specimens and everything is working, all to a very high quality.
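Editor's illustration: the "jigsaw" point from the conversation, that repetitive DNA makes small pieces hard to place while larger pieces are easier, can be sketched with a toy example. Everything here is an assumption made for illustration: the toy genome is invented, the reads are idealised and error-free, and the function names are hypothetical. It is not the project's assembly software, only a minimal sketch of why read length matters.

```python
# Toy illustration (not real assembly software): how many reads could be
# placed at more than one position in a genome that contains a repeat?

def reads_of_length(genome: str, k: int) -> list[str]:
    """Every substring of length k, one per start position (idealised reads)."""
    return [genome[i:i + k] for i in range(len(genome) - k + 1)]


def count_placements(genome: str, read: str) -> int:
    """Number of start positions where the read matches the genome exactly."""
    n = len(read)
    return sum(1 for i in range(len(genome) - n + 1) if genome[i:i + n] == read)


def ambiguous_fraction(genome: str, k: int) -> float:
    """Fraction of length-k reads that fit in more than one place."""
    reads = reads_of_length(genome, k)
    ambiguous = sum(1 for r in reads if count_placements(genome, r) > 1)
    return ambiguous / len(reads)


# Invented toy genome: the same 12-base repeat appears twice,
# separated and flanked by short unique stretches.
REPEAT = "ACGTACGTACGT"
TOY_GENOME = "TTGACC" + REPEAT + "GGATCA" + REPEAT + "CATTGG"

if __name__ == "__main__":
    for k in (4, 8, 20):
        frac = ambiguous_fraction(TOY_GENOME, k)
        print(f"read length {k:2d}: {frac:.0%} of reads fit in more than one place")
```

In this toy case, short reads that fall entirely inside the repeat match several positions, whereas reads longer than the repeat always include some unique neighbouring sequence and so can only go in one place.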

>> Emily Elias: Okay, so you guys have got a goal of 70,000. Where are you at in that process?

>> Liam Crowley: Yeah, it's a big number. And the first thing to say is, actually, at the start of the project, we didn't even really know how many that is. So that's kind of our best guess, because actually, we've had hundreds of years of taxonomy and natural history in Britain and Ireland, and we still haven't named every single species we think is here.

And we're finding there's new cryptic species, or perhaps we got some stuff wrong in the past. So it's a big ask to kind of complete taxonomy for a nation, but we're doing pretty well. We have collected more than 10,000 species in the initial phases of the project, and we have been sequencing a large number of these, and we've released more than 1,300 genomes so far. But that rate of new genomes coming out is going up all the time.

>> Emily Elias: I hate to be that guy, but, like, what does this mean for the future? If you are able to sort of, like, perfect this process, get all of this information, build up this massive library of books, these species books, that we don't even know how big it could be, what could this mean?

>> Liam Crowley: Well, at our launch meeting back in 2019, Professor Mark Blaxter, who's one of the lead investigators for the project, kind of stood up at this internal meeting and said, "This project is going to change biology." And at the time I thought, oh, that's kind of just trying to tee everyone up and get everyone enthusiastic. But actually, as it's gone on, more and more I kind of agree, actually. This is revolutionary. So, as I alluded to before, there's a whole range of different investigative techniques which are unlocked. We kind of can call it genome enabled research. So there's all these various different things, from sorting out taxonomy and resolving phylogenies to the study of evolution and how genomes themselves, as well as the organisms and genes and gene families, are evolving. And then there's the actual applied applications, like the conservation genetics and even potentially biodiscovery. Again, we don't know what we don't know. There could be all sorts of amazing biological compounds held within the organisms, which are all encoded within the genomes. So having that is really exciting and is a really important step. And the ultimate goal is, yeah, we want to sequence everything on the planet. Particularly in the face of the mass extinction event and unprecedented biodiversity loss, it becomes even more important. And, you know, there's even potential science fiction applications kind of thing in the future, with de-extincting species. Although that's a whole other tangent.

>> Emily Elias: We'll save that tangent for another day. This podcast was brought to you by Oxford Sparks from the University of Oxford, with music by John Lyons and a special thanks to Liam Crowley. Tell us what you think about this podcast. We are on social media @OxfordSparks. Or you can go to our website, oxfordsparks.ox.ac.uk. It's a pretty cool website. I mean, it's got stuff on it. I'm Emily Elias. Bye for now.

Transcript source: Provided by creator in RSS feed