Saturday, April 18, 2020

Data Decides?

According to the headline of an article published this morning by a data journalist about when New Zealand will move out of the lockdown, the Prime Minister Jacinda Ardern said, “The data will decide”. I presume that she did not actually say that, because it is quite a stupid statement. I spent thirty years of my career working with data, and I never saw a piece of data make a decision about anything.

People make decisions, not data. A person can use data to help them make a decision, but it must be done carefully. Often, there will be numerous data series that are relevant to a decision, and they sometimes give conflicting messages, because they measure slightly different things.

The decision-maker must analyse the data to know how each piece of data was measured and how reliable it is. They need to understand what the data is indicating. Has the trend actually changed, or are the last few observations just random noise?

Usually there will be various data sources that are relevant to a decision, and often the data that is really needed is not available, or not up-to-date. The decision-maker has to decide how much weight to give to each piece of data, which is a human judgement. Using data wisely takes a great deal of wisdom.

The decision-maker needs to understand the context behind the data. Information about numbers of new Covid19 infections only means something in the context of an understanding of epidemiology (how viruses transmit), knowledge of geography (where infected people live), and psychology (how they will behave).

So, I hope the leaders making the decision about the lockdown will use the data wisely. I also hope that they will understand that they are making the judgement and must be accountable for their decision.

Data rarely points clearly in one direction, so the people deciding usually need a great deal of wisdom and insight. Of course, the benefit of claiming that the data will make the decision is that if the decision turns out to be wrong, they can blame the data.

A problem with politicians is that they tend to use data badly to justify their decisions, and sometimes to scare people into complying with them. An example is the Prime Minister’s claim that if a severe lockdown was not introduced, 8,000 to 13,000 people would die. It has now emerged that those numbers came from a Covid19 model that was poorly specified. The same model produced an estimate of 500 when more plausible parameters were used. The big numbers were a worst-case scenario that was unlikely to happen. Although unreliable, they were used to scare people into complying with the lockdown.

A piece of data that we need to be careful about is the results of the community surveillance testing. 300 people have been fairly randomly tested in a Christchurch supermarket (and in a couple of other towns) to determine if there is significant community spread. The problem with this is that sample surveys are useful for measuring something that is quite common in society (e.g. support for large political parties), but they are not very useful for establishing that something is missing from a population (e.g. community spread of a virus).

There are 300,000 people living in Christchurch. If there are ten people wandering around the city with undetected Covid19, that is only one in 30,000. The probability that one of these ten would be at the Moorhouse Avenue supermarket (let alone be selected for testing) is negligible. So the surveillance testing will almost certainly not find any undetected infections, but that does not mean that none exist. A surveillance survey would need a vastly larger sample to determine that.
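
The arithmetic behind this can be sketched roughly as follows. This is only an illustration under simplifying assumptions that are mine: the 300 people are treated as a simple random sample of the whole city, the test is assumed to be perfectly accurate, and a binomial approximation is used. The function names are hypothetical. Under those assumptions, a sample of 300 has only about a one percent chance of catching even one of the ten cases, and a sample of roughly 90,000 would be needed for a 95 percent chance.

```python
import math


def prob_detect(population, infected, sample_size):
    """Chance that a random sample contains at least one infected person.

    Assumes a simple random sample and a perfect test (binomial approximation).
    """
    prevalence = infected / population
    return 1 - (1 - prevalence) ** sample_size


def sample_needed(population, infected, confidence):
    """Smallest sample giving the stated chance of finding at least one case."""
    prevalence = infected / population
    return math.ceil(math.log(1 - confidence) / math.log(1 - prevalence))


# Figures from the text: 300,000 residents, 10 undetected cases, 300 tested.
p_detect = prob_detect(300_000, 10, 300)
n_needed = sample_needed(300_000, 10, 0.95)

print(f"Chance of finding at least one case: {p_detect:.2%}")
print(f"Sample needed for a 95% chance: {n_needed:,}")
```

A supermarket sample is not truly random, so the real chance of detection could be higher or lower than this, but the order of magnitude makes the point.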

However, I presume that the information from the surveillance surveys will be used by politicians to make people feel safer about any decision to relax the lockdown, just as an exaggerated death toll was used to persuade people to stay at home.
