To leverage data, there first needs to be data. Whenever anyone is asked for their data, their guard goes up, and rightly so! Data collection without explicit consent is a serious breach of privacy and must be prevented at all times. The challenge of gathering the correct data can be broken down into several sub-challenges.
Finding trustworthy data
In situations where tensions run high and lives are at stake, being able to make informed decisions based on data is important. However, having incomplete or incorrect data can also make a difference for the worse. If decisions are made based on incorrect data, those decisions will most likely be just as incorrect. Low-quality data is an unnecessary distraction at a time when resources need to be used as efficiently as possible. Concretely, this means that selecting the data that is needed is a process in and of itself. Although it might seem that all data is useful, this is often not the case.
For instance, data gathered from an area where social distancing measures are in effect cannot be used to build a prediction model for areas without such measures. The resulting model would be too optimistic. If measures were taken or policies were made based on this ‘optimistic’ model, the consequences could be disastrous: hospital stocks would be too low, people would not be accurately warned of the spread, and much more. This is one very obvious example, but in reality there are hundreds of contextual variables that need to be taken into consideration before a model based on correct data can be called ‘applicable’.
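To make the social distancing example concrete, here is a minimal sketch using a textbook SIR (susceptible, infected, recovered) model. It is only an illustration: the function name and all parameter values (population size, transmission rates, recovery rate) are assumptions chosen for the example, not values fitted to COVID-19 data. It compares the peak number of simultaneously infected people when the transmission rate is taken from a region with distancing measures against the peak in a region without them.

```python
def simulate_sir_peak(beta, gamma, population, initial_infected, days):
    """Run a simple discrete-time SIR model and return the peak number
    of people infected at the same time.

    beta  -- transmission rate per day (hypothetical value)
    gamma -- recovery rate per day (hypothetical value)
    """
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    peak = infected
    for _ in range(days):
        new_infections = beta * susceptible * infected / population
        new_recoveries = gamma * infected
        susceptible -= new_infections
        infected += new_infections - new_recoveries
        peak = max(peak, infected)
    return peak


# Assumed, illustrative parameters (not real epidemiological estimates).
POPULATION = 1_000_000
GAMMA = 0.1                  # roughly a 10-day infectious period
BETA_WITH_DISTANCING = 0.15  # rate observed where measures are in effect
BETA_NO_DISTANCING = 0.35    # rate that applies where there are no measures

# Model built on data from the region WITH distancing measures.
optimistic_peak = simulate_sir_peak(
    BETA_WITH_DISTANCING, GAMMA, POPULATION, initial_infected=10, days=365
)
# What the region WITHOUT measures should actually plan for.
realistic_peak = simulate_sir_peak(
    BETA_NO_DISTANCING, GAMMA, POPULATION, initial_infected=10, days=365
)

print(f"Peak predicted from distancing-region data: {optimistic_peak:,.0f}")
print(f"Peak without distancing measures:           {realistic_peak:,.0f}")
```

Under these assumptions the first figure comes out far lower than the second, which is exactly the kind of gap that would leave hospital stocks too low if the ‘optimistic’ model were used for planning.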
Gaining trust of people using the data
Every country takes a different approach to tackling the problems that the pandemic has brought. Russia, for instance, monitors COVID-19 quarantines through facial recognition on the streets. South Korea, determined to nip COVID-19 in the bud, pulled out all the stops, monitoring mobile location, CCTV cameras, facial scans, temperature monitors and credit card payments to make sure inhabitants respected the lockdown. At the top of the list is China, which didn’t have any stops to pull out, because there were none to begin with. Its approach was thorough and aggressive, using all of the above technology and then some. The authorities flew drones over the streets to identify individuals who should have been in lockdown, and they tracked citizens in real time. Finally, any citizen spotted by a drone not wearing a mask would receive a major fine. These extreme measures are the reason citizens distrust data collection. Many believe that opening the door slightly for the government to access their data would be like opening the floodgates to their data for the rest of their lives.
It is important that data ethics and privacy considerations are a central part of any type of intervention. The benefit is twofold. First, when data is approached in a fair, ethical, responsible and transparent manner, the amount of accessible data will be much larger, as the initiative is more likely to have the support of the people. Second, there will be a bigger chance of people trusting the predictions based on the data and behaving accordingly. A double win, but only if it is set up in such a way that people trust the move.
Using the right knowledge and assumptions
Although the technologies to leverage data in a situation such as a pandemic definitely exist, it is often a challenge to find the right fit. If they are applied without appropriate expertise, they can lead to catastrophically wrong predictions. A model needs more than just data scientists and statisticians to make real-world predictions. It needs real-world policymakers, epidemiologists and sociologists, and their real-world knowledge, to be usable.
An example of a situation where that went wrong is the model that the White House relied on. The University of Washington’s Institute for Health Metrics and Evaluation (IHME) had developed a prediction model for the virus. This model has been used in the US by hospital planners, media outlets and the government. It predicted in early April that there would be fewer than 60,000 deaths in the US by August. As of the beginning of June, the US has already surpassed double that number. Another major flaw in the usefulness of the IHME model was the way it reacted to new data. Its projections changed every time new data became available. Adjusting and readjusting a model is nothing new. However, epidemiological models should be an anchor point for policymakers to hold onto and base their policies on. If a model only flashes ever-changing predictions, as was the case with the IHME model, its usability for any long-term policy is limited.
Although leveraging data can help in combatting and monitoring this and future pandemics, it must be done within certain limits pertaining to privacy. Those limits are very important, yet under current privacy and (cyber)security legislation, properly leveraging data will be nearly impossible. Regulatory reforms will be needed to enable such technologies to function properly. The legislation in place is lagging behind the technology. As is often the case, technology develops first and legislation scrambles to keep up. This remains a thorn in the side of data leverage. Much of the current legislation is cast like a trawler net, catching all forms of data and subjecting them to the same rules.
In no way can the importance of regulation on the collection of data be overstated. Take China’s data landscape, for instance, where citizens have very little privacy law protecting them. Although the data technology industry is growing rapidly there, that growth has come at a cost that we shouldn’t be willing to pay. That being said, it is necessary that our national governments assess the differences between kinds of data. As of now, most existing regulation treats all data alike, but there are different kinds of data.
For instance, at the moment most jurisdictions make no distinction between input data, data used to develop data technology models, and output data (predictions based on the input data). All are caught by the trawler net and can therefore be shared only to a limited extent. Again, it cannot be overstated how important privacy laws are; they definitely need to remain in place, or even be stricter in some cases. However, in situations like the one we find ourselves in right now, the temporary sharing of anonymised data could help save lives. Even though this might be something that a lot of people are willing to do, under current regulation it isn’t easily put in motion.
There is one data-oriented solution being developed at the moment that faces all these challenges. In many countries, contact tracing applications are being rolled out. Several European countries, like Switzerland and Germany, have released an application to ease the strain on the healthcare workers who are responsible for contact tracing of COVID-19 patients. There are many digital methods to determine when people were in close proximity to each other. However, only a few can do so whilst properly respecting the privacy of the users. Both the German and Swiss applications use the Bluetooth of mobile phones, meaning they do not track the GPS locations of users. This is done anonymously, so that neither the app nor the other users are able to determine who was where at any point in time. This means that the application respects the privacy of its users.
Other countries are hot on their heels to release similar solutions. This solution does have limitations, mostly due to the Bluetooth technology that it uses. It is still a good example of how data can be leveraged against COVID-19. For more information on how the applications work exactly, read this article.
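To illustrate how Bluetooth-based tracing can work without revealing who was where, here is a minimal sketch of the general idea behind decentralised designs. It is not the actual protocol of the German or Swiss application: the class, the function names, the 15-minute rotation interval and the key and identifier sizes are all assumptions made for this example. Each phone derives short-lived random-looking identifiers from a secret daily key, broadcasts them, and remembers the identifiers it hears. Only when someone tests positive is their daily key published, and every other phone checks for a match locally.

```python
import hashlib
import hmac
import secrets

INTERVALS_PER_DAY = 96  # assumption: a new identifier every 15 minutes


def ephemeral_ids(daily_key):
    """Derive the day's rotating identifiers from a secret daily key.

    Without the key, the identifiers look random and cannot be linked
    to each other or to a person.
    """
    return [
        hmac.new(daily_key, interval.to_bytes(2, "big"), hashlib.sha256).digest()[:16]
        for interval in range(INTERVALS_PER_DAY)
    ]


class Phone:
    """Hypothetical model of one phone running a tracing app."""

    def __init__(self):
        self.daily_key = secrets.token_bytes(32)  # never leaves the phone unless reported
        self.broadcast_ids = ephemeral_ids(self.daily_key)
        self.heard_ids = set()  # identifiers received over Bluetooth, stored locally

    def hear(self, other, interval):
        """Simulate receiving another phone's broadcast during one interval."""
        self.heard_ids.add(other.broadcast_ids[interval])

    def was_exposed(self, published_keys):
        """Check locally whether any published key explains a heard identifier."""
        return any(
            self.heard_ids.intersection(ephemeral_ids(key))
            for key in published_keys
        )


alice, bob, carol = Phone(), Phone(), Phone()
bob.hear(alice, interval=10)       # Bob was near Alice; Carol was not

published_keys = [alice.daily_key]  # Alice tests positive and shares her key

bob_exposed = bob.was_exposed(published_keys)
carol_exposed = carol.was_exposed(published_keys)
print("Bob exposed:", bob_exposed)
print("Carol exposed:", carol_exposed)
```

In this sketch no location is ever recorded, and the matching happens on each user's own phone, so no central party learns who met whom. A real application would also have to deal with signal strength, exposure duration and key rotation across days, which are left out here.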
Companies or governments that are able to walk the thin line between respecting privacy on the one hand and collecting valuable data on the other will have the greatest chance of succeeding in using data to combat this pandemic. The technologies, methods and ethical exchange of data needed to realise this should therefore be developed sooner rather than later. This would prevent the data scramble that we’ve seen during these times.
In conclusion, like any solution, leveraging data has its pros and cons. It can help solve complex problems such as a global pandemic. This must be done with caution to prevent flagrant privacy breaches. Also, the use of incorrect data or tools could lead to wrong conclusions, with major consequences. One thing is certain, though: data technology will be able to support and aid the fight against COVID-19. In the future it could help prevent and contain pandemics as well.
(There’s always a) But, before we get ahead of ourselves, it is important that we realise that data leverage alone will definitely not solve the COVID-19 pandemic. It isn’t in any way a silver bullet or a complete solution. It therefore shouldn’t be sold as such. What it can do is help support the social (and non-social) systems that are now buckling under the strain of the pandemic. To do so, we need to recognise the opportunity, acknowledge the challenges and work together to overcome them, and lastly, really give data technology a chance to show its worth.