Ideas for a Complex World - Anna Seigal | The Secrets of Mathematics podcast

00:12

Welcome back to the Oxford Mathematics Public Lectures Home Edition. My name is Alain Goriely and I'm in charge of external relations for the Mathematical Institute. As usual, special thanks to our sponsors, XTX Markets. XTX Markets are a leading quantitative-driven electronic market maker with offices in London, Singapore and New York. Their ongoing support in this time of crisis is crucial in providing you quality content.

00:39

It is a great pleasure for me to welcome today Anna Seigal, one of our youngest and brightest researchers in the Mathematical Institute in Oxford. Anna read mathematics in Cambridge and received a Ph.D. from Berkeley just over a year ago, for which she was awarded the prestigious Richard C. DiPrima Prize from the Society for Industrial and Applied Mathematics. She's currently a research fellow with us.

01:05

Anna works in an exciting area of mathematics at the interface between pure mathematics, applied mathematics and data science. She uses algebraic and geometric tools to understand data and applies her ideas to many interesting problems. And I have long been fascinated by her work. Every day we are confronted with a deluge of numbers, and somehow we have to navigate through them.

01:29

While it's easy to understand that one number is bigger than another one, in many instances we are given a string of numbers to characterise a complex situation. This is especially true in the current crisis. For instance, you may be given the number of cases, number of deaths, number of hospitalisations, prevalence, reproduction number and so on. How does one understand all these numbers at once? How do we combine them?

01:55

Well, we need the appropriate tools, and we are very fortunate that Anna has brought the toolbox with her tonight. In her talk, Ideas for a Complex World, Anna will share with you some of the mathematics that you can use to make sense of the world around you. So thank you very much, Anna, for doing this. Please start now.

Hi, everyone, I'm Anna Seigal, and I'm a research fellow at the Mathematical Institute in Oxford and a junior research fellow at The Queen's College.

02:29

It's an honour to be here this evening to give this public lecture, and I'd like to thank the organisers, Alain and Dyrol, for inviting me. And I'd like to thank you for coming today. I'm going to be talking about ideas for a complex world and how these ideas fit in with current topics in mathematical research. So let's get started. Firstly, what do I mean by ideas for a complex world?

03:04

Well, let's start with the complex world part. So here's our complex world with many things to understand: disease, vaccinations, elections, racial justice, global warming and many more. And then as a society, we have quantitative tools, so here's a cartoon picture of our quantitative tools. And these tools enable us to approach these different topics and to understand them better, to build our understanding of the world.

03:48

And these are quantitative tools because they're based on algorithms and models and data. All right. And then there's us, individual human beings, and we also encounter implicit complexity. We have many complex situations that appear in our day to day lives and we come up with ideas for how to approach them. So how do these three different pieces fit together? We've got our complex world, our quantitative tools and then us human beings in our day to day lives.

04:34

Well, quantitative tools have an ever increasing impact on our lives. Even a year ago, it would be difficult to imagine a mathematical model deciding if we're allowed to meet up with friends and family. But on the other hand, there's a disconnect. These worlds seem very separate: the way we do things in our day to day lives seems very far removed from the inner workings of our quantitative toolbox.

05:09

And with this disconnect, our impression of quantitative tools ends up being shaped by polarised opinions. So let's see some examples of what I mean by this. Here's a shot from the recent movie Tenet. So here we have the protagonist and a character called Priya.

05:35

And I don't want to ruin the movie for anyone who hasn't seen it, but in the film we have an algorithm being portrayed in a very negative light, so that the characters have to come up with a way to defend the world against this algorithm. I hope I haven't said too much. And here's another example. Here we have Boris Johnson back in the summer saying, "I'm afraid your grades were almost derailed by a mutant algorithm."

06:06

So Boris Johnson here was referring to the A-level results algorithm, which was supposed to be a good replacement for the exams that students weren't able to take because of the pandemic. And here, Boris Johnson's phrasing puts the blame on the algorithm for being at fault. All right, and here we have Kamala Harris, the future vice president of the U.S., saying I trust the word of scientists.

06:37

And she's saying this here in the context of whether she would trust Trump if he said a COVID-19 vaccine was safe. And she's saying this to position herself in opposition to people who don't trust scientists or people who trust other people like Trump for their medical information. So in all of these examples, we can see a polarised opinion emerging. Either we have quantitative algorithms being portrayed in quite a negative light, as the baddie, as something to blame when things go wrong.

07:15

Or they're portrayed in a very positive light as something we trust, something we believe in and put our faith in. But both of these opinions are quite limiting. They disconnect us from the quantitative toolbox because they don't enable us to see how it works. But in fact, there are many, many connexions between us and the quantitative tools far beyond having these polarised opinions of viewing them as either good or bad.

07:48

There are many, many ways that we're closely connected, and in this talk, we'll see how ideas from our day to day lives translate to give us some ideas in our quantitative toolbox. All right. So next, we'll think about a process that's familiar to us from our day to day lives, and then later we'll see how this process translates over to give some quantitative tools. All right.

08:25

So here's a picture of a person. Think of someone you know, maybe someone you know well, and lots of information may come to mind about that person. So maybe their name, their appearance, their likes and dislikes, what they said to us recently or key memories that we've shared with this person in the past. And this is all complex data that we have in our heads about this person.

08:57

So here's a picture of our complex data over here in cartoon format with these different coloured squiggles, and we can use this complex data to come up with some summary of the person. So for example, if I wanted to summarise this person's personality, I could think of all of the complex data that I know about them and come up with some traits to use to describe them. So maybe this person is creative and rebellious. All right.

09:32

And we can do this not just for one person, but for more than one person as well. So maybe think of these six people or six people that you know, then for each of them, some complex data will come to mind. What we know about the different people and we can process these complex data to come up with summaries of all these different people's personalities. So maybe this person in orange over here is sociable and worldly, and this dark blue person here is empathetic and organised.

10:15

Right, so as humans, we're quite good at coming up with summaries of complex data for other humans. But we can also do this for other things as well, like countries. So we could think about some countries that we know. Then again, some complex data will come to mind about these different countries, so we might be thinking about people we know there, or time that we've spent there if we've visited, or maybe something we've heard about that country in the news, or

10:52

the country's response to the pandemic, for example. And again, we can come up with some summary. So, to fix on this example of COVID, here's a summary of each of these countries that we're able to obtain from the complex data that we might know about. So, for example, New Zealand, over here in green: we know lots of things about New Zealand, maybe. And one thing is that there have been 25 deaths in total.

11:27

Or we could think about Mexico, and maybe we know some different things about Mexico, and we can extract the key thing to bear in mind, which is that the COVID cases so far have peaked in August. All right, and we can use these summaries to think about differences and similarities between different countries. So we've seen this for people and for countries, but we can also do other examples like breeds of dog.

11:57

If we're thinking about getting a dog. Or neighbourhoods of a city, if we're thinking about moving house. We can think about some key similarities and differences between different people or countries or whatever it is.

12:17

So what I'm trying to say here is that all of us process complex data in our day to day lives, but it becomes difficult when we want to process a thousand people or a hundred countries. As the number of things that we're trying to compare grows, it becomes more difficult for us to keep all of this information in mind. And that's why scientists use quantitative tools to scale things up, to have some approach that will work when we're trying to understand many,

12:49

many different things rather than just a few. All right, so where does mathematics come into all of this? Well, you might think mathematics lives over here, that mathematics is the quantitative tools that we use to understand the world. Or you might think mathematics lives over here in the world of people, that mathematics is something that people have. Maybe it's a gift that only certain people have and certain people don't.

13:24

But neither of these is true. Mathematics is about noticing a pattern and boiling it down to get at its key principles. So in this way, mathematics enables us to abstract human ideas into quantitative tools. But it does more than this. It also enables us to go the other way, to use quantitative tools in order to gain insights that are helpful in our lives. So next, we'll see how to take this kind of

14:04

data processing that we're familiar with from our day to day lives and how, with the help of mathematics, we can turn that into a quantitative tool that will be very useful in many different applications. All right, so here was our data processing situation: we have these six different people and complex data about each of them. And before, we used this data to extract some key personality traits of each of the people.

14:35

But we could also think about some particular personality traits and wonder for each person to what extent they match that personality trait. So, for example, we could think about nice and friendly then for each of our six people, we can think about how nice they are and how friendly they are. And maybe for simplicity, we'll say that we can summarise the niceness or their friendliness just by a single number.

15:10

Then we could go a step further and think about plotting each of our people on this plot here, where the x axis is their niceness and the y axis is their friendliness. All right, so then for each person, I can think about their niceness value and their friendliness value, and then I can plot the people over here and maybe I'll get something that looks a bit like this.

15:35

So here are the six people on the plot. For example, we can see that the red person is very nice and very friendly, and the orange person over here is pretty nice too, and maybe even a little bit more friendly. OK. And we can do this for other personality traits as well if we wanted. For example, we could compare how shy someone is compared to how outgoing they are and plot that against how serious they are compared to how funny they are.

16:11

So then we can go through the same process as before. For each person we can think about where they lie on the x axis, how shy they are compared to how outgoing they are, and then how serious they are compared to how funny they are. So then each person will be plotted somewhere over here. So, for example, let's see, this green person here is a little bit shy and a little bit serious. And this orange person here is a little bit outgoing and quite funny.

16:48

All right, and let's do this for one more personality trait. So we've got shy versus outgoing still on the x axis, but maybe on the y axis we can plot how emotional someone is compared to how rational they are. OK, so we go through the same process of thinking for each person, what's their value on the shy versus outgoing axis and what's their value on the emotional versus rational axis? And then we can plot our people and maybe we get something like this.

17:22

So we see that this yellow person, for example, is quite shy and also quite rational. All right, so why are we doing this? Well, one thing that we can see is that there are better and worse choices for which personality traits to choose. Let's say we want to identify differences between our friends in order to buy them personalised presents that match their personalities. Then some choices of personality trait will be more useful than others to be able to tell them apart.

18:02

So in this first example over here, where we plotted nice and friendly, everyone was pretty nice and friendly. So here we see that the data is all bunched up. In this second example here, where we plotted shy versus outgoing and serious versus funny, the data were a little bit more spread out, but still mostly concentrated along this line.

18:31

And in this final example here, where we plotted shy versus outgoing against emotional versus rational, all the six different people were quite spread out in all directions on this plot, so they're well spread out. And this is our idea: that some personality traits are more useful than others because they allow us to see differences between people.

18:58

If we were thinking about these personalised presents, it would be more helpful to keep these personality traits in mind when deciding which present to allocate to each different person than nice and friendly, where it would be difficult to tell them apart, or than these traits over here, which are very closely related. So in this third example, all our different people are well spread out, and we can identify the key personality traits.

19:34

So, yeah, these first two are not so helpful when it comes to identifying differences between people. But this third example here is helpful because all our different people are spread out on the plot, and we'll call these key measurements to highlight the fact that they're allowing us to see differences between the people.

20:00

So our idea is that of a key measurement. A key measurement is something that spreads the data points out in all different directions: the data points are not bunched up, they're not along a line, they're all spread out. And these offer a good way to summarise differences between people or countries or breeds of dog and so on. So all of us find key measurements all the time. We probably don't think about them in the context of a plot like this.

20:34

But whenever we're thinking about the key way to describe a difference between two different people, we're thinking about key measurements. And key measurements are also the idea behind many different quantitative tools. So we can think about labelling our axes here by key measurement number one for the x axis and key measurement number two for the y axis to highlight the fact that we've identified some key measurements.

21:08

All right. So I said before that quantitative tools are useful because they enable us to take a human process and scale it up so that it can be used at much bigger scales. So let's see how that would work for this example of looking at people's personality traits. All right, so here's our six people and their complex data again.

21:36

And we can think about a personality trait like we did before, for example, nice, and then for each person, we can think about a score to give them that says how nice they are. All right. And this column of numbers here is called a vector. And we can repeat this for other personality traits as well. For example, friendly: we could give each of our people a score that says how friendly they are and record it in this column of numbers here, or this vector here.

22:17

So, for example, this person in red is doing very well. They've received a nice score of five and a friendly score of five as well. All right, and then we can repeat this for our other personality traits, so outgoing versus shy, funny versus serious, emotional versus rational. And these columns of numbers all together form what's called a matrix. All right, so each row of the matrix corresponds to a particular person. So, for example, the first row here corresponds to the red person.
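As a rough illustration of the vectors and the matrix described above, here is a minimal sketch in Python with NumPy. Only the red person's scores of five for nice and five for friendly come from the talk; every other number, the 1 to 5 scale, the colour labels and the names `people`, `traits` and `scores` are assumptions made up for illustration.

```python
import numpy as np

# Row labels (people) and column labels (measurements). The labels are illustrative.
people = ["red", "orange", "yellow", "green", "light blue", "dark blue"]
traits = ["nice", "friendly", "shy_vs_outgoing",
          "serious_vs_funny", "emotional_vs_rational"]

# Hypothetical scores on an assumed 1-5 scale. Only the red person's
# nice = 5 and friendly = 5 are taken from the talk.
scores = np.array([
    [5, 5, 2, 3, 1],   # red
    [4, 5, 4, 5, 4],   # orange
    [4, 4, 1, 1, 5],   # yellow
    [5, 4, 2, 2, 2],   # green
    [4, 4, 5, 4, 2],   # light blue
    [5, 5, 3, 3, 5],   # dark blue
])

# One column of the matrix is a vector: every person's score for one trait.
nice_vector = scores[:, traits.index("nice")]

# One row of the matrix is a person: all of their scores.
red_row = scores[people.index("red")]
```

Reading down a column gives one measurement for every person, and reading along a row gives every measurement for one person.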

22:57

And each column is a personality trait or something that we've measured about the person. And then we saw before that we can plot certain columns of this matrix against each other. So in this plot over here, we've plotted shy versus outgoing against emotional versus rational, so we've plotted column number three against column number five.

23:25

And these we identified as our key measurements because they exhibited a nice amount of spread between all the different people that enabled us to see the similarities and differences. OK. And all of us translate complex data to key measurements all the time. We don't make a plot like this one over here, and even more certainly we don't build a matrix like this one in the middle, so we miss out these intermediate steps, but we're often taking complex data to think about some key measurements.
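One way to put a number on "bunched up", "along a line" and "well spread out" is sketched below. This is an assumption about how one might score a pair of columns, not a method stated in the talk: it takes the smallest eigenvalue of the 2 by 2 covariance matrix of the two columns, which is the variance in the direction where the points are least spread out. The scores are the same hypothetical numbers as in the earlier sketch, and `spread_in_worst_direction` is an illustrative name.

```python
import numpy as np

# Hypothetical 1-5 scores: rows are six people, columns are
# nice, friendly, shy/outgoing, serious/funny, emotional/rational.
scores = np.array([
    [5, 5, 2, 3, 1],
    [4, 5, 4, 5, 4],
    [4, 4, 1, 1, 5],
    [5, 4, 2, 2, 2],
    [4, 4, 5, 4, 2],
    [5, 5, 3, 3, 5],
])

def spread_in_worst_direction(matrix, col_a, col_b):
    """Variance of the points along the direction in which they vary least."""
    cov = np.cov(matrix[:, [col_a, col_b]], rowvar=False)  # 2x2 covariance
    return float(np.linalg.eigvalsh(cov)[0])  # eigvalsh sorts ascending

# Zero-based column indices, so columns three and five are indices 2 and 4.
pairs = {
    "nice vs friendly": (0, 1),
    "shy/outgoing vs serious/funny": (2, 3),
    "shy/outgoing vs emotional/rational": (2, 4),
}
spreads = {name: spread_in_worst_direction(scores, a, b)
           for name, (a, b) in pairs.items()}
best_pair = max(spreads, key=spreads.get)
```

With these made-up numbers, the pair built from columns three and five scores highest, which matches the choice of key measurements in the talk, but a different choice of hypothetical scores could give a different ranking.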

24:04

But now we'll see how these intermediate steps enable us to scale this process up to turn it into a quantitative tool. All right. So we started with six people. Then we built a matrix, which had six rows, one row for each person. And then we obtained a plot which had six points on it. So each person was a single point on our plot, but there's nothing special about the number six. We can equally do this for a thousand people, at least in principle.

24:41

Then we'd have a matrix with a thousand rows and then we'd have a plot with a thousand points on it. So let's see what this would look like. Here's a thousand people, or an artist's impression of a thousand people. And from these thousand people, we can build a matrix where each row of the matrix corresponds to a person and each column is some measurement, something we've measured about each of our people. So before, these measurements were personality traits, but they could be something different.

25:21

And in most applications they're very likely to be something different from a personality trait. All right, and this is our bigger matrix, and then we can build a plot of each of our points, so we can plot key measurement number one on the x axis and key measurement number two on the y axis. And now we start to see the importance of having this plot, because for six people,

25:48

we can think about the similarities and differences between each of the people. But for a thousand people, it becomes a lot more difficult. And this plot of our key measurements allows us to see, well, to start to see, maybe groups of people that exist or some people that are more similar than others.

26:08

But an important question at this point is how do we choose the key measurements? In our example before, we chose them by trying out a few different personality traits and seeing which ones looked the most spread out on the plot. But how would we do this at these much bigger scales? Well, one of the most widely used tools for finding key measurements is called principal component analysis, and principal component analysis finds them using linear algebra, the theory of matrices.

26:52

And the key measurements will be some combinations of columns of our matrix that enable us to spread out the data the most. So, for example, key measurement number one would be the combination of columns, the combination of measurements, that spreads out the data the best, and key measurement

27:15

number two will be the combination of columns that spreads out the data the second best, and we can use linear algebra, the algebraic theory of matrices, of grids of numbers like this, to enable us to find these key measurements. Right, so let's see some applications of this. First, we'll think about treating disease or coming up with different ways to treat disease. So now maybe our thousand people are hospital patients. Then our matrix of data could record their genetic information.
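The description of principal component analysis above, finding the combinations of columns that spread the data out the best and the second best, can be sketched with a singular value decomposition. This is a minimal sketch rather than a full implementation: the data are synthetic random numbers, and the function name `principal_components` and its return values are assumptions chosen for illustration.

```python
import numpy as np

def principal_components(data, n_components=2):
    """Return key measurements, column weights and variances via the SVD."""
    centred = data - data.mean(axis=0)              # subtract each column's mean
    _, singular_values, directions = np.linalg.svd(centred, full_matrices=False)
    weights = directions[:n_components]             # each row: a combination of columns
    key_measurements = centred @ weights.T          # one point per row (person)
    variances = singular_values[:n_components] ** 2 / (len(data) - 1)
    return key_measurements, weights, variances

# Synthetic stand-in for "a thousand people": 1000 rows, 5 measurements,
# with the columns deliberately given different amounts of spread.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 5)) * np.array([3.0, 1.0, 0.5, 0.2, 0.1])

key, weights, variances = principal_components(data)
# key[:, 0] is key measurement number one, key[:, 1] is number two.
```

Each row of `weights` says how much of each original column goes into a key measurement, which is why a key measurement is generally a combination of columns rather than a single column.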

28:01

So this matrix will have a thousand rows if we have a thousand hospital patients, and if we have a recording for each of their genes, it will have 20,000 columns. So this is now a really huge matrix that it would be impossible to understand and find structure in by hand. We have to use some tool to be able to extract information from this matrix, and then we can find our key measurements. So for this example, they'll be key genes, so we'll have key genes

28:35

number one on the x axis and key genes number two on the y axis. And I say genes rather than gene because our key measurements will be some combinations of the columns of our matrix; they may not be one exact column plotted against another exact column. All right, and then on this plot, we can start to see similarities and differences between the different hospital patients.

29:03

So maybe we can identify a key group over here circled in purple, and this group of points on the plot corresponds to some rows of our matrix. And that then corresponds to some people in our cohort of hospital patients. And maybe these people are patients who may be suitable for a new treatment. So our key measurements shouldn't be relied on entirely to suggest patients for a new treatment.
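The chain described above, from a circled group of points on the plot, to rows of the matrix, to people, can be sketched as follows. Everything here is hypothetical: the coordinates, the patient identifiers, and the choice to model the circled region as a disc with an assumed centre and radius.

```python
import numpy as np

# Hypothetical key-measurement coordinates, one row per patient.
coordinates = np.array([
    [0.1, 0.2],
    [2.9, 3.1],
    [3.2, 2.8],
    [-1.5, 0.4],
    [3.0, 3.3],
])
patient_ids = np.array(["P001", "P002", "P003", "P004", "P005"])  # made-up labels

centre = np.array([3.0, 3.0])   # assumed centre of the circled region
radius = 0.5                    # assumed radius of the circled region

# Points inside the circle ...
inside = np.linalg.norm(coordinates - centre, axis=1) <= radius
# ... correspond to rows of the data matrix ...
group_rows = np.flatnonzero(inside)
# ... which correspond to people.
group_patients = patient_ids[inside]
```

A selection like this only proposes a group. As the talk goes on to say, interpreting what the group means still needs medical expertise.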

29:38

But they allow us to see similarities and differences between the different patients and to identify key groups where we can then use our or someone else's medical expertise to interpret what these key groups might be. OK. And it's difficult to do this by hand, as I said, but it's important to be able to do this because we don't want to be trialling new treatments on everyone. We want to identify some particular group of patients who may respond well to it.

30:19

All right, and let's see another example: exam results prediction. Now maybe our thousand people are students, and now our matrix of data is some information about their schoolwork. So we have schoolwork information number one plotted on the x axis and schoolwork information number two plotted on the y axis. And now we've identified some group of students over here. And maybe these are students whose predicted grades may be too low.

30:52

So it's useful to be able to identify this group of students because we can have a second look at the predicted grades and make some judgement over whether they are indeed too low. And this is not practical to do for all the students on the plot. We want to identify some particular group of students who we think need some extra scrutiny. All right. And here's a third example in personalised advertising.

31:26

So now maybe our thousand people are some social media users, or in this example, it's likely to be many more people. And then the data matrix is information about what each of these users likes to click on. So now in our plot over here, the key measurements are key clicks. We have key clicks number one on the x axis and key clicks number two on the y axis.

31:55

And this plot enables us as before to identify perhaps some important groups of different users in terms of how they interact with this platform. And maybe we think that these users over here may respond particularly well to an advert, and that's useful because it costs much less to advertise to fewer people. So you want to make sure you select people who are likely to be persuaded by a particular advert rather than going to the cost of advertising to many different people.

32:34

OK, so I said before that quantitative tools are often either labelled as being good or bad. So what about principal component analysis? Is it good or bad? Well, we've seen various contexts in which it could be applied. So we saw that it could be used to give personalised presents to people based on their personalities. That seems quite nice. Or it could be used to identify patients who may be suitable for a new disease treatment.

33:13

This seems like a good use of the quantitative tool, although care would need to be taken to make sure that it was able to be interpreted medically. And we also saw that it could be used to identify groups of students whose predicted grades may be too low. Again, this is a situation where we would find it helpful to have the input of some quantitative tool, but we'd have to make sure it wasn't unfairly prejudiced against certain students.

33:50

And then we saw the example of personalised advertising of identifying users who may respond well to a particular advert, and this might seem like a good way to use this tool. It's helpful to see adverts for things that we might be interested in buying. Or it could be seen as a problem, as a potential issue with data privacy and has the potential to be used in a way that we might not like so much.

34:23

So, for example, principal component analysis was said to be a key tool that was used by Cambridge Analytica in Donald Trump's 2016 presidential campaign to send targeted adverts to people based on their social media platform habits to be able to send them very persuasive adverts.

34:52

And whether you think of this as a tool for psychological manipulation or as just a good way to get across relevant political information maybe depends on which campaign you support or which campaign is using this approach. So already we've seen quite a spread of possible uses of principal component analysis along this spectrum of good to bad.

35:21

But it can get even worse. So principal component analysis originated from a theoretical perspective in this paper called "On lines and planes of closest fit to systems of points in space" by Karl Pearson in 1901. So here's a picture of Karl Pearson here, and here's a quote by him. He says: In Germany, a vast experiment is in hand, and some of you may live to see its results.

35:53

If it fails, it will not be for want of enthusiasm, but rather because the Germans are only just starting the study of mathematical statistics in the modern sense. So this was Pearson in 1934 praising the Nazis for their eugenics programme, which culminated in the Holocaust. And Pearson, as well as being a mathematician and a statistician, was also a eugenicist, a student of eugenics, the racist and discredited attempt to improve the human race by identifying racially superior groups.

36:36

And Pearson made these comments at around the same time my grandfather cut, pictured here in the light blue, was leaving Germany as a refugee. So we've seen that principal component analysis, this same tool can be used for good and it can be used for bad. And in this middle ground between giving people presents and eugenics, how can we identify if it's a suitable tool to use? Well, we can see this from looking at its limitations, the limitations of a tool.

37:18

These limitations are very informative, and they can be seen from the limitations of the underlying idea, of the idea that gave rise to that tool. So for example, here our tool is principal component analysis, and the idea is that of a key measurement. So what are the limitations of the tool and how can we see this from the limitations of the idea? Well, to go back to our example with the people, we had these six people and complex data about each of them, and we can encode that information into a matrix.

38:02

So this is our matrix. Each row of the matrix corresponds to a person, and each column is a measurement. So here we can imagine that this is some grid filled with numbers, and we saw before that our key measurement is some combination of columns of this matrix. And here we can start to see a possible problem. Let's imagine that instead of allocating presents to our friends, we're allocating jobs to people.

38:39

Then the complex data that we have in mind for each of the individual people may fall into some different types. So let's say that the data comes in blue, green and red, so the blue information is about someone's personality. Maybe the green information is about their CV, their professional experience or educational background, and then the red part is the demographics.

39:11

So, for example, their gender or their race. Then in our matrix of data over here, some of our columns will correspond to these different types of data. So maybe the first few columns over here are the personality of each person. And then next, we have the professional details about someone. And then finally, the demographic information, so we can think about splitting our matrix into these three different pieces. We've got the personality part, the CV part and the demographics.

39:51

Or we could imagine that these are A-level students and we're trying to come up with a prediction of their grades, of what they would have got in the exam, so we can use that as the grade. Then maybe the first part of the information is about the individual student, and then the green information could be about the teacher and the red information could be about the school. So all of these different types of information are important to keep in mind. But do we want to keep them together or separate?

40:35

We could also imagine separating the data, so we have student information completely separate from teacher information and school information. So this is our problem: the key measurement is a measurement of which data? Is it a key measurement we get by combining all our different measurements together, so we may combine school and teacher and student all together? Or do we want to find individual key measurements of these individual pieces?

41:04

And then we'll end up with many different key measurements and not this nice plot of just key measurement number one against key measurement number two. We'll have key measurements one and two for blue, green and red, and it all gets much more complicated.
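The two options just described, one pair of key measurements from all the columns together or a separate pair for each type of data, can be compared in a short sketch. The data are synthetic random numbers, and the column layout in `blocks` and the helper name `top_two` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(6, 9))   # synthetic: 6 people, 9 measurements

# Hypothetical layout: which columns hold which type of information.
blocks = {
    "student": slice(0, 3),
    "teacher": slice(3, 6),
    "school": slice(6, 9),
}

def top_two(matrix):
    """First two key measurements (principal components) of a matrix."""
    centred = matrix - matrix.mean(axis=0)
    _, _, directions = np.linalg.svd(centred, full_matrices=False)
    return centred @ directions[:2].T

# Option 1: combine everything, giving a single plot that mixes all types.
combined = top_two(data)

# Option 2: keep the types apart, giving one plot per type of information.
separate = {name: top_two(data[:, cols]) for name, cols in blocks.items()}
```

In the first option a key measurement can be dominated by school columns even when the question is about a student. In the second there are three plots to read and nothing links them.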

41:21

And combining all of these different types of data together into our key measurement is a big problem, because then we may end up deciding what grade someone should have got in their A-levels, not based on information about the student, but based on information about the school. But on the other hand, we don't want to throw away that contextual information because we'll lose a lot of information that way.

41:47

All right, so we need new tools that can overcome these limitations, and luckily, there's a wide, rich world beyond principal component analysis. So here's our data matrix. Here, we've got our six rows again, corresponding to our six different people. And then we have different columns of the matrix, which are different information for each of the people. We've got the student information, teacher information and school information.

42:25

All right, and we want to keep these different pieces together, but also separate. And one way we can do that is to separate them into three matrices, three grids of numbers like this. And in many cases, it's possible to combine this together to get a three dimensional grid of numbers, which is called a tensor.

42:51

So here's our tensor over here, where each row of our three dimensional grid is still a person, but now we have three layers corresponding to the school information, teacher information and student information. All right. And we saw before that we can design tools for matrix data using linear algebra, using the theory of matrices. And similarly, we can design tools for tensor data using multilinear algebra, the theory of tensors.
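The step from three separate matrices to one three dimensional grid can be sketched with NumPy's `stack`. The sketch assumes the three matrices have the same shape, which is one of the cases where the combination is possible; the numbers are synthetic and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# Three hypothetical matrices of the same shape: 6 people by 4 measurements.
school = rng.integers(1, 6, size=(6, 4))
teacher = rng.integers(1, 6, size=(6, 4))
student = rng.integers(1, 6, size=(6, 4))

# Stack them as layers along a new third axis to form a tensor.
tensor = np.stack([school, teacher, student], axis=2)   # shape (6, 4, 3)

# Each "row" of the tensor is still one person: a 4 by 3 slice.
first_person = tensor[0]

# Each layer is still one type of information, kept separate from the others.
teacher_layer = tensor[:, :, 1]
```

The different types of information sit together in one object, yet each can still be pulled out unchanged.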

43:31

OK, so it might seem a little bit strange to store data in this way in the form of this three dimensional grid. But there are some examples of tensors which are very, very familiar to us. For example, colour photographs. So in a colour photograph, we have different pixels and each pixel is made up of three different numbers. We've got a number for how red it is, how green it is and how blue it is.

44:00

So a data structure like this is really a colour photograph. Now, instead of the rows corresponding to people, they correspond to the vertical location of a pixel and the columns correspond to the horizontal location of a pixel, and then at each location we have three numbers: how blue, how green and how red it is. OK, so the tensor enables us to keep these different types of information together, but also separated, and we can design tools using the theory of tensors.
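The colour photograph example can be written down directly as a small tensor. This is a toy sketch: the 2 by 3 image size, the red, green, blue layer order and the 0 to 255 value range are common conventions assumed here, not details given in the talk.

```python
import numpy as np

height, width = 2, 3   # a tiny made-up image: 2 pixels tall, 3 pixels wide

# Rows are vertical positions, columns are horizontal positions,
# and the third axis holds three numbers per pixel (assumed order: red, green, blue).
photo = np.zeros((height, width, 3), dtype=np.uint8)

photo[0, 0] = [255, 0, 0]   # top-left pixel: as red as possible
photo[1, 2] = [0, 0, 255]   # bottom-right pixel: as blue as possible

red_layer = photo[:, :, 0]  # how red every pixel is, as an ordinary matrix
```

Slicing out one layer gives back an ordinary matrix, so the matrix tools from earlier in the talk still apply to each colour on its own.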

44:38

But maybe we want to understand more complex interactions between these different types of information. So maybe we think that the school has some complex influence on the teacher and then subsequently on the student. And in this more general setting, we can design tools for data using applied algebra. OK, so these different areas of linear algebra, multilinear algebra and applied algebra are ongoing topics of interest in the mathematical community, to me and other people.

45:16

But at this point, you might be wondering, well, who are these other people who are also interested in types of algebra? And I'm happy to report that there's a world beyond Karl Pearson. So here's a photo of my colleagues and me from the Society for Industrial and Applied Mathematics Conference on Applied Algebraic Geometry from July 2019.

45:44

So the world of mathematics has come a long way since the days of Karl Pearson, and it's a fun and diverse and sociable community of people with an ongoing project to improve its diversity over time. So to anyone young who might be watching, I highly recommend a career as a mathematician. We get to travel around different parts of the world and meet up with each other. So, for example, this photo was taken in Switzerland.

46:25

Well, now we have virtual conferences as well, but hopefully we'll be able to travel and see each other in person soon. All right, so in summary, we've seen how these three different parts of our world, our human day to day lives, our quantitative toolbox and applications in the world, all connect together. So, well, specifically, we've seen how an idea, via mathematics, can be translated to some quantitative tool.

47:03

And then that quantitative tool can be used in applications. And insights from these applications can help us to better inform our new ideas. And we saw this in a particular example where the idea was that of a key measurement. And then the mathematics that we used to scale it up to a quantitative tool was linear algebra. And then the quantitative tool was principal component analysis.

47:34

But the story is by no means finished, so we're in need of many new ideas and new areas of mathematics in order to turn these ideas into new quantitative tools. And that's all I wanted to say. Thank you very much for listening.

Transcript source: Provided by creator in RSS feed
