Thank you very much, Crystal. It's really nice to be here. It's very strange to be in a glass box, so I hope this works, and you'll have to bear with me if it's a bit strange. Thank you very much for the invitation to come to the department. So the subject of my talk is indeed statistical ethics. It's a topic which I approach with some humility, since I know that we have different experiences and views regarding ethics.
In part, that's because statisticians work within a very wide range of different economic, cultural, legal and political settings, and also because we work within different branches of our discipline, each involving its own techniques and procedures and its own ethical approach. Can everybody hear me OK? Can I just check the technology? It's being indicated that I should speak up. I'm speaking quite loudly. We can hear you online. You can hear me online. That's good. That's good. Great.
The other reason why we have a variety of different experiences with respect to ethics is that many statisticians work in close partnership with professionals from other disciplines, and those other disciplines have existing conventions. In such situations, statisticians should make their own ethical principles clear, but should also respect the ethical principles of their collaborators. They might be medics, they might be economists, et cetera.
So even within the same setting and branch of statistics, individuals may have very different moral precepts which guide their work. So in this talk, I'm not seeking to impose a rigid set of rules with which statisticians should comply. And I think that's really important; we could come back to that in discussion, if you like. My own background, as Crystal said, is as an applied social statistician.
I've worked in central government in this country, in the United Nations, and for a brief period in the NHS. My more recent roles in academia have involved supporting graduate students and researchers in social science and medicine. But my interest in ethics began very early in my career, in my twenties. I chaired the Ethics Committee of the Social Research Association, and we developed professional guidelines, which were subsequently adopted by the International Statistical Institute.
The aim was to support the social researcher/statistician in making individual ethical judgements and to inform them of shared values. They're called guidelines for a very important reason, and that is because, as the introduction to the guidelines says, it isn't appropriate to draw up a list of regulations, not least because context is important. So I'm going to be drawing on those guidelines, and you can reference them here.
The International Statistical Institute Declaration on Professional Ethics. I'm also going to be referring to the United Nations Fundamental Principles of Official Statistics. And let me just start off by quoting from those fundamental principles. Official statistics provide an indispensable element in the information system of a democratic society, serving the government, the economy and the public with data about the economic, demographic, social and environmental situation.
To this end, official statistics that meet the test of practical utility are to be compiled and made available on an impartial basis by official statistical agencies to honour citizens' entitlement to public information. I would say to empower the public as well. That, I think, underpins my talk, and I'm going to structure my talk around four things: ensuring honesty and integrity, avoiding overclaiming, promoting transparency and delivering public goods.
So I'm not covering the whole area of ethics, but focussing on these. So theme one. Ensuring honesty and integrity. Let me quote from the ISI declaration. While statisticians operate within the value systems of their societies, they should attempt to uphold their professional integrity without fear or favour. They should also not engage or collude in selecting methods designed to produce misleading results or misrepresenting statistical findings by commission or omission.
We're all familiar with the damage wreaked by misinformation, which has, to be quite honest, undermined trust in statistics: from the false claims made about the windfalls that Brexit would deliver to the NHS, to denying the reality of national PPE shortages, to the more recent lies relating to, say, the Starmer issue. The guiding principle of many has been to lie first and avoid questions later.
And research, unfortunately, shows that once the lie is made, it gains a life of its own and can be extremely difficult to counter. I rather like the saying that lies move round the world and back whilst truth is doing up its shoelaces. And, of course, social media and echo chambers amplify the lies.
The U.N. Fundamental Principles state, I think rather strangely, in Principle 4, that statistical agencies are entitled to comment on erroneous interpretation and misuse of statistics. I find it rather strange because of course they're entitled to. I would prefer the word "entitled" in this to be much, much strengthened. I think we should be aiming for correcting misinterpretations of any data for which we have responsibility.
We can't do it across all data, of course, but we can for any for which we have responsibility. The ISI declaration says that the statistician should consider the likely consequences of collecting and disseminating various types of data and should guard against predictable misinterpretations or misuse. So that's rather difficult. You've got to anticipate the misuse, according to the ISI declaration.
Of course, here in the United Kingdom, the UK Statistics Authority in particular tries to correct misinformation. For example, in their recent intervention, they rebuked the prime minister and the Home Office for claiming erroneously that crime is falling.
Unfortunately, they don't have teeth, and the prime minister repeated very recently the claim that there are more people in work in the UK now than there were two years ago, even though he was rebuked by the Statistics Authority some time ago for that incorrect statement. World leaders and famous people are vectors for disinformation if they condone and normalise false claims. We have particular problems in the UK, what I call the land of the performance indicator.
Our government confuses setting targets with delivering improvements. So problems arise when governments or other organisations are both monitoring and being monitored by indicators. And ranking by the indicators exacerbates the problem because it raises the pressure. There is an absolutely wonderful report from a few years ago by the Royal Statistical Society called The Good, the Bad and the Ugly, about the benefits and the disbenefits of performance indicators.
One of the difficulties we have is that when a measure becomes a target, it ceases to be a good measure. It's rather like David Boyle's paradox: if we don't count something, it gets ignored; if we do count it, it gets perverted. Misinformation refers to claims that are false, but are not necessarily created with an intent to mislead. So you can counter misinformation. It basically requires explaining what is wrong and why it's wrong. Disinformation is created expressly to mislead.
And it requires that, as well as showing what's wrong, you investigate who's behind it and why. And of course, that can be very difficult because some disinformation campaigns can be very sophisticated. It is sometimes possible to show that a false claim is being shared online in a coordinated manner by a number of social media accounts, particularly where they share similar traits or behaviour and perhaps are all linked to certain political campaigns.
And there are some tools like CrowdTangle that can be used to investigate patterns of behaviour in such accounts. I'm grateful to the Reuters Institute for the Study of Journalism here in Green Templeton College for these explanations and for their work on fake news, and I would encourage you to look at their communications about the topic of fake news.
I'd like to see statisticians being much more proactive in speaking out about the importance of maintaining trust in statistics and more involved in the science which tackles misinformation and disinformation. It would also be good to see greater cooperation between fact checkers and statisticians. Students of this department could do internships at organisations such as Full Fact. Full Fact was established in 2010 by Will, in part because of Peter Oborne's wonderful book The Rise of Political Lying.
The initiatives of the Royal Statistical Society in establishing a journalism prize and in training journalists in the sound use of statistics are also to be applauded. Some media organisations are also building expertise, interestingly, sometimes known under the label of digital verification. And I was extremely pleased when the BBC appointed a head of statistics. He has a particular challenge to correct the issue of false balance in reporting.
It's also positive that journalists are more likely to say now when politicians are lying, and there are discussions taking place about whether, if an MP lies, colleagues can ask the Speaker to have a statement checked by the House of Commons Library. So I think those are positive developments. There are also significant benefits when statisticians work in partnership with scientific communities that are committed to the integrity of reporting.
This has been illustrated over the last couple of years by the strong involvement of statisticians in the Science Media Centre during the pandemic, which I think has been really positive. Difficult as the issue of maintaining trust in data is in the UK, it's significantly worse in countries where the official statistics agency is not independent from the government of the day.
Having worked in the UN, I can give you numerous examples of what I call political numbers. When the statistical system is badly under-resourced, when the evidence base for such numbers is poor or even non-existent, when statisticians have no power and no external support system, they're really not free to produce inconvenient truths. So often the data are produced to satisfy the political masters. Freedom to speak truth to power has been seen as a characteristic of democracy.
The need for a set of principles governing official statistics became apparent at the end of the 1980s, when countries in Central Europe began to change from centrally planned economies to market oriented democracies. It was essential to ensure that national statistical systems in such countries would be able to produce appropriate, reliable data that adhere to professional and scientific standards.
However, since then, I think we have realised that the principles are of much wider global significance. And there are examples, even in Western Europe, of direct manipulation of statistics. So we could look at Greece, for example, at the manipulation of the economic data in order that Greece could join the EU, and then further manipulation in order that Greece could join the euro currency. There are just too many incentives for governments to misreport, and I'll return to this later.
This slide shows some of the characteristics of a statistical system which help to protect its independence: the autonomy of the statistician; whether there is statistical legislation; the existence of an independent board overseeing statistics; the development of codes of conduct, with breaches of the code being identified, investigated and, importantly, publicised. So it's not just that it happens, but the perception is that it happens.
The employment of the senior statisticians, particularly the national statistician, is removed from the political process. There's involvement with users, and the users should be involved in setting the agenda and asking the awkward questions. There are external audits of processes, and any audit body doesn't report to the government of the day, but reports more widely to parliament.
A red flag is the dismissal of a national statistician, as has recently happened in Fiji and in Turkey. Let me read from the RSS website just today. The Royal Statistical Society and the American Statistical Association note with concern the dismissal of the head of the Turkish Statistical Institute by Turkey's president at a time when the country is experiencing high levels of inflation.
The United Nations Fundamental Principles of Official Statistics clearly set out the professional and scientific standards for official statistics to ensure reliable and robust information to aid decision making. As learned societies with memberships of government statisticians across the world, the RSS and the ASA look to highlight the importance of these principles in maintaining public trust.
The RSS and the ASA condemn political interference in the production of official statistics and urge President Erdogan to reassure those statisticians working for the Turkish Statistical Institute that they are free to produce objective statistical information that serves not just the government, but the Turkish public. This is essential to ensure a healthy democracy in Turkey and to maintain international credibility in statistics. My second theme is the avoidance of overclaiming.
The ISI declaration talks about statisticians depending upon the confidence of the public, and says that they should, in their work, attempt to promote and preserve such confidence without exaggerating the accuracy or explanatory power of their data. So it's important that we shouldn't overclaim for our results. In particular, we should not claim representativeness beyond the population we've studied. To do so risks bringing statistics into disrepute.
I could quote many examples, but perhaps one of the most frequent problem areas is the reporting of opinion polling, where the data are assumed to be subject only to sampling error, ironically even when probability sampling has not been employed, and where non-sampling error is completely disregarded. Indeed, in relation to the analysis of many data sets, we see the possible effects of non-response ignored.
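As a rough illustration of why sample size does not protect against bias, here is a minimal simulation sketch. Everything in it is an assumption made for illustration and does not come from the talk: the population size, the 50% true share, the differential response rates, and the function name `simulate`. It compares a small simple random sample with a much larger self-selected sample in which people holding one opinion are more likely to respond.

```python
import random


def simulate(pop_size=200_000, true_share=0.5, srs_n=1_000,
             resp_yes=0.30, resp_no=0.20, seed=0):
    """Compare a small random sample with a large self-selected one.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)

    # Hypothetical population: 1 = holds the opinion, 0 = does not.
    population = [1 if rng.random() < true_share else 0
                  for _ in range(pop_size)]
    pop_share = sum(population) / pop_size

    # Small probability sample: every unit has the same chance of selection.
    srs = rng.sample(population, srs_n)
    srs_est = sum(srs) / srs_n

    # Large self-selected sample: response depends on the opinion itself,
    # so the error is bias rather than sampling error.
    volunteers = [y for y in population
                  if rng.random() < (resp_yes if y else resp_no)]
    vol_est = sum(volunteers) / len(volunteers)

    return pop_share, srs_est, vol_est, len(volunteers)


if __name__ == "__main__":
    pop_share, srs_est, vol_est, n_vol = simulate()
    print(f"population share:            {pop_share:.3f}")
    print(f"random sample (n=1000):      {srs_est:.3f}")
    print(f"self-selected (n={n_vol}): {vol_est:.3f}")
```

Under these assumed response rates the self-selected estimate settles near 0.6 however many people respond, so a margin of error computed from its sample size alone would be misleadingly narrow.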
And there's an overemphasis on sample size with little understanding of bias. I love these two quotes. One is from Brad Efron. He talks about scientists having misled themselves into thinking that if you collect enormous amounts of data, you're bound to get the right answer. The other is Nate Silver responding to Chris Anderson, the editor of Wired magazine, who claimed that the sheer volume of data would obviate the need for theory,
the need for statisticians, I guess, and even for the scientific method. Nate Silver argues that these views are badly mistaken, and says the numbers have no way of speaking for themselves. A related ethical issue is that statisticians have a responsibility also to say what we don't know. Indeed, that can be very informative, as it often highlights priorities within our society. Deaths in old people's homes at the start of the pandemic is an example which springs to mind.
So I've talked about the importance of avoiding overclaiming, but there is another side to this story. The other side is that statisticians are often seen as being too cautious. It's unhelpful if we resist drawing conclusions from our data. I have a wonderful T-shirt that I got at the NASA conference that says being a statistician means never having to say "I'm certain". And as David Spiegelhalter said in last week's Desert Island Discs, statistics do not speak for themselves.
We imbue them with meaning. Incidentally, isn't that a wonderful sign of the profile of statisticians being raised: a statistician on Desert Island Discs. I don't know if he was the first statistician, but I certainly can't remember one before. A topic that we might discuss is whether statisticians should ever make policy recommendations, which was a live issue here during the pandemic. I'm not someone who sees a clear line between the objective and the subjective.
What we choose to study, the questions we choose to ask, the timing of our research, et cetera, et cetera: these are all decisions that we make, and they are going to impact the findings. Somebody else doing a similar study may come up with different results. So it's very difficult, in my view, to say what's objective and what's subjective.
I'll come back to this in relation to openness. The other reason why I think we should have a discussion about whether statisticians should ever make policy recommendations is that many statisticians become real experts in the substantive areas they study. I could choose examples from this department. I think you've got a number of statisticians in this department who are real experts in substantive areas, and it's a great pity if they feel overly constrained in what they can say.
Of course, statisticians are frequently put under pressure by the media to step beyond their area of expertise. And it's easy to get sidelined by a clever interviewer. We need to take account, too, of the weak statistical literacy in our society. The public's lack of understanding of the use of modelling and of variability led to an over-interpretation of these as predictions.
Incidentally, I'm reminded of Fiddler's statement that forecasting is very difficult, especially if it's about the future. We do need better skills as statisticians in how to operate responsibly and helpfully with the media, especially in an environment where we're urged to demonstrate greater impact of our work. And there's a great temptation for academics to become media celebrities. In particular, I think we need to improve the communication and the understanding of uncertainty.
My third theme is openness, or openness and transparency. The ISI declaration says that statisticians are frequently furnished with information by the funder or employer, who may legitimately require it to be confidential. Incidentally, elsewhere in the declaration, they talk about the importance of being open about who you are funded by and employed by. Statistical methods and procedures that have been utilised to produce published data should not, however, be kept confidential.
I can't stress enough the importance of transparency of methodology and openness about uncertainty. But I also want to champion the moral imperative to share data. Crystal didn't mention that I used to be director of the UK Data Archive, so this is something that is very close to my heart. The scientific principle is that data should be available for others to refute, to confirm, to clarify, to extend, to enhance the results.
That's part of public accountability. We have a responsibility to society, but of course also to our funders. It's key to use resources efficiently, and it's important to reduce response burden. Steve Fienberg, Miron Straf and Sue Martin, I think it's Sue Martin, wrote this wonderful paper some time ago, 1985, gosh, about the importance of access to data, arguing that data from publicly funded research are a public good: produced in the public interest, they should remain in the public realm.
Of course, there are constraints, as they identified in this quote. So I would argue also, it's important to archive data because many of the issues we study change over time and most datasets can't be reconstructed. So where possible and consistent with confidentiality, data should be shared and at the individual level to facilitate replication.
So not just aggregate data. Great progress has been made with respect to open access journals, but the importance of integrating data with the associated publication has been paid too little attention. The concept of data stewardship is gaining recognition. I define this as the responsible use, collection and management of data in a participatory and rights-preserving way, informed by values and engaging with questions of fairness.
Ways are being explored to allow people to gain increasing levels of control and agency over their data, from being informed about what's happening to data about themselves through to being empowered to take responsibility for exercising and actively managing decisions about data governance. The GDPR requires such consent to be specific, informed, unambiguous and given freely requiring affirmative action by the user.
Now, those are fine words, but I'm conscious that there are difficulties in applying them in some circumstances. We as statisticians often rely on secondary sources. As combining data sources becomes more prevalent, record linkage in particular can pose privacy challenges.
Similarly, obtaining informed consent, or any consent at all, from units for access to and linkage of their data from non-survey sources such as administrative data, which has been collected for an entirely different purpose, continues to be challenging. And of course, increasingly we are using administrative data in our studies.
In cases where a statistician has been granted access to administrative or medical records or other research material for a new enquiry, the custodian's permission to use the records should not relieve the statistician from having to consider the likely reactions, sensitivities and the interests of the subjects concerned, including their entitlement. I don't think there are easy answers to these challenges, but I don't think we as statisticians discuss them enough.
So as a society, we're seeking to chart a way between a data free-for-all, where people feel powerless about how data is being used, and a situation where opportunities for beneficial research are lost because data isn't shared. Data stewardship can build trust. Despite the problems of using secondary data sources, I'd argue that we shouldn't fund from public finances new primary studies which are carried out in ignorance of what has previously been researched.
When I was at the Data Archive, I worked with the ESRC to develop a data policy, and that data policy was that deliberate replication is to be applauded, but ignorant duplication wastes resources, adds to response burden and shouldn't be funded. Four: responsibilities to society, or statistics as a public good, which I've referred to a couple of times. The ISI urges statisticians to act in the public's interest.
It says that their obligations to employers, clients and the profession can never override the public interest, and fellows should seek to avoid situations and not enter into undertakings which compromise this responsibility. So I want to focus on how the agenda for statistics is set with respect to the issues we address. It's vital that statisticians are free to collect information which might be uncomfortable to those in power.
Statistics give visibility and they can hold a mirror up to our societies. We also need to acknowledge that those on the fringes of society are often unrepresented or underrepresented in our studies. I'm talking about people without permanent homes, I'm talking about people in institutions, I'm talking about the poor. Those working in academia may have more freedom than those working in government to decide what to study, though we have seen an increase in directed research funding.
And some of us worry that our funding bodies are not always politically independent. But working in government, I've been aware of some potential interference in what's collected, so I can tell you tales of when I was the secretary to the Census Committee and Margaret Thatcher tried to influence the questions being asked on the census.
As every statistician knows, gathering the data is only half the story. The other half is getting it out to those who need it, and the story which accompanies the data is important but is more easily manipulated. So when I worked in the NHS, the Department of Health tried to stop me producing an interpretation of some statistics that showed a rise in caesarean births year on year. So you do get interference, and the freedom to collect information and the freedom to report on the data are critical.
Resistance to such interference has been very dependent upon the professional leadership, and that professional leadership needs to understand the nuances of the situation and needs sometimes to be brave. The Royal Statistical Society has also been crucial.
It has argued for published dates and times when statistics will be released, to avoid the release of sensitive data being deliberately timed so that they get less coverage, and for a reduction in the number of people who have prior access to official data and in the length of time they have access to it. And I've got a couple of rather old slides here showing you the sorts of things we wanted to avoid. So, Gus O'Donnell. He says he was misquoted, so I have to add that.
But he was quoted as having said that he wanted the Office for National Statistics to be boring, to put out the plain facts and nothing but the facts, on clear and predictable deadlines. He said it would then be for politicians and government press officers to interpret the figures.
I'd add that the RSS wrote to him the same day, and I know because I helped write that letter, saying it's clearly the task of statisticians to interpret the figures in a statistical context, to facilitate understanding and avoid misunderstanding.
The Code of Practice of the UK Statistics Authority explicitly states that official statistics, accompanied by full and frank commentary, should be readily accessible to all users, and that all UK bodies that are responsible for official statistics should prepare and disseminate commentary and analysis that aid interpretation and provide factual information about the policy or operational context of official statistics.
This is the issue that I was talking about, of having predictable deadlines. Some of you may remember this: it is a statement that was made on 9/11, where a senior government adviser suggested it was a good day for burying bad news. So fortunately, we've been able to counter some of these issues, though not in all countries.
I think I've got time just for a quick aside. When I was working in the U.N., I was at a meeting where the National Statistician of Canada talked about how no politicians, no government ministers, had access to statistics, except for the government minister with responsibility for those statistics. And he or she was given access one hour before, in a closed room, without any access to technology to distribute information.
So you didn't get his or her spin first. And the head of statistics for Russia was listening to this, and then asked for the floor and said he had listened with respect to the distinguished head of statistics from Canada, but what he wanted to know was what happened when the minister wished to change the statistics.
So that illustrates the differences between countries. When I was working as a statistician in the U.N., I was concerned about pressures on poorer countries to collect data that had been determined by wealthier countries, by donor agencies: what I would call data colonialism. There is, in my view, an overemphasis on cross-nationally comparable data at the expense sometimes of locally specific data, which might be used to develop relevant policies that can make a difference in people's lives.
Instead, we have countries getting obsessed with targets and their positions in international league tables. That was the topic of my RSS presidential address, which I hesitate to say was 21 years ago. Scary. A thread running through the whole of my talk has been the importance of understanding the impact of incentives. I mentioned incentives particularly in relation to official data, but we also need to understand the unintended consequences of the system of academic rewards.
In response to the UK Science and Technology Select Committee's recent call for evidence on science reproducibility, many early career researchers expressed concern about how they were being measured. And their impressions were backed up by the Wellcome Trust survey of research culture. Forty-three percent of the respondents thought that metrics were valued over research quality. Nearly a quarter of early career researchers had felt pressurised by a supervisor to produce a particular result.
And a common response was that the criteria for hiring and firing young researchers needed to be modified, with respondents arguing that we must be careful when we use proxy measures of quality, such as number of publications or amount of grant funding. Instead, we should try to reward work that's conducted in a reproducible and rigorous fashion.
Clearly, something's wrong in a system where so many young researchers feel that there's a mismatch between doing good science and having a successful career. Dorothy Bishop, here in Oxford, argues that we should trial giving researchers an automatic and flexible research budget, say £10,000, targeted at replication. No reviews, no university overheads. Just pre-registration of the hypotheses to be tested,
à la that promoted by Ben Goldacre, and open science. Flexibility would allow researchers to pool their grants to undertake a larger piece of work in voluntary, uncontrived collaborations. Crucially, researchers would switch from asking "Is this fundable?" to "Is this true?" I think that's a rather wonderful note on which to end my talk. Do we measure what we treasure? A question for you. Thank you.
