Since the beginning of the pandemic, the Italian Statistical Society (SIS) has repeatedly offered its expertise to help decision-makers and scientists manage and study the situation. The decision-makers never listened. Yet hundreds of scientific works have shown how essential such skills are for understanding and predicting events related to the pandemic.
In light of this, the SIS invites all of civil society to sign the following open letter. To sign, go here
Fight against COVID-19: high-quality data and adequate skills to analyze them are needed
The emergency caused by the COVID-19 pandemic has highlighted the fundamental importance of reliable data and of strong skills in analyzing them. Both are needed to understand the pandemic, predict its evolution, prepare health-policy and economic-policy tools to face it, and evaluate the effects of the choices made.
It is increasingly evident that it is vital to offer competent support for a data collection inspired by quality criteria. We need to integrate available information using statistical criteria that protect this quality. And it is even more evident that, alongside the collection of high-quality data, there is a need to reclaim space for the scientific skills necessary to analyze them.
Why accessible data
To a large extent, the data needed to build adequate information are already collected by government agencies and bodies, but they are not made available to the scientific community. Confidentiality issues, and other undisclosed considerations, turn raw data into inaccessible information.
Currently, the available data are collected for the declared purpose of surveillance. However, if their quality, their comparability between geographical areas, and their basic definitions are not guaranteed, any analysis of these data will be limited to monitoring the status quo, producing projections rather than predictions. To study the epidemic's progress in detail, the information must be as detailed as possible, so that the individual pathways of contagion and clinical evolution can be followed.
At the aggregate level, the figures updated daily by the Civil Protection are available to all. We recognize and greatly appreciate the enormous work of data collection and dissemination carried out by this agency. However, at this point in the evolution of the pandemic, what the Civil Protection makes available is no longer sufficient to make transparent either the government's decision-making mechanism or the scientific understanding of how the pandemic is evolving.
In particular, these data do not make it possible to carry out some crucial activities:
• Reproduce the quantitative bases of institutional decisions. This became fully evident with the recent division of the country into three zones. How the indicators are defined and constructed, and the criteria that determine the final decisions, must be transparent. The disaggregated data that feed these indicators must be made available. Only in this way can the scientific community evaluate the methodologies used.
• Assess ex post, quantitatively and rigorously, the effects of decisions. An example of fundamental importance is the choice of whether or not to close schools. Many researchers are trying to evaluate the "school" effect rigorously; however, the numerous studies on the subject have not yet reached shared conclusions, and all of them are based on analyses of aggregate data.
• Understand aspects of the phenomenon that are still obscure. The Italian scientific world is rich in skills that could usefully investigate essential elements of the phenomenon using disaggregated data, in collaboration with the institutions and agencies involved in managing the epidemiological crisis.
Why adequate skills
Statistical skills are currently in high demand and very difficult to find around the world. They have become increasingly scarce as demand keeps growing, a trend reinforced by the current COVID-19 emergency. For example, Pfizer, a pharmaceutical company at the forefront of vaccine development and distribution, will only share its data with research groups in which a biostatistician conducts the analyses. In Italy, the data currently collected in the wake of the emergency are affected by many problems and by high variability. Even more than other biomedical data, they therefore require specific skills to deal correctly with confounding, imbalance, and high variability. None of these aspects can be managed correctly without advanced statistical skills.
Timely, effective, methodologically reliable, and shared answers are obtained when the right skills are involved in collecting and validating the data and in analyzing them. The scientific process requires numerous steps, each of which calls for specific skills to build the information tools correctly.
• Definition of the problem. First of all, it is essential to define what needs to be observed to answer the questions of containing, monitoring, and forecasting the epidemic and its impact in the social and economic sphere. Diversified skills are needed in this process. Highly multidisciplinary teams, within which scientists from different areas can interact, are necessary to address all aspects of the problem. In this phase, on the one hand, the primary data required for the analyses must be defined and, on the other, harmonization protocols between the different data sources must be built and implemented.
• Management of databases. Specific computer and statistical skills are required to build and manage data archives that receive massive flows of information. The data must not only be stored but, above all, validated quickly, both to give timely answers and to ensure public access.
• Information analysis. This phase requires the ability to define and develop models capable of grasping the underlying characteristics of the phenomenon of interest, highlighting potential causal relationships, defining specific estimation procedures for unknown quantities and indicators, and building predictions that take into account the uncertainty that accompanies each estimate.
• Sharing of information. Different analysis models need to be compared, for example, in terms of predictive ability, interpretability, and robustness. To this end, it is desirable to establish periodic meetings, at least twice a week, between the researchers who develop the models and the institutions that could use them, openly and transparently, to share the best solutions.
• Dissemination of information. We support access to the data by the entire scientific community. Accepting this request would allow greater transparency on the part of political decision-makers and would enable civil society to obtain reliable and verifiable information. However, accessibility must be accompanied by an incisive and growing promotion of quantitative culture in all areas, starting with media professionals and political decision-makers.
It should be noted that this document asks for access to detailed data, and that such access is not new to the national information system. In fact, on economic matters, information is available in great detail. This allows interested parties to analyze any type of issue (for example, using data produced by ISTAT, the Bank of Italy, and the Chambers of Commerce).
It should be strongly emphasized that the right skills are fundamental for analyzing a phenomenon as complex as the COVID-19 pandemic. The enormous variability observed at global, national, and regional levels must be incorporated into the assessments that lead to political and economic decisions. Knowing how to distinguish between association and causation among the observations and variables included in the analysis models is fundamental to avoid decisions based on random variation or spurious effects.
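The difference between association and causation, and the limits of aggregate data, can be illustrated with a small sketch. The counts below are entirely synthetic and hypothetical; they do not come from any real surveillance source, and the names `counts`, `stratified_rates`, and `aggregate_rates` are introduced only for this illustration. The sketch assumes a containment measure that is applied mostly in high-risk areas: within each type of area the measure goes with a lower case rate, yet the aggregated figures suggest the opposite (an instance of Simpson's paradox).

```python
# Synthetic, purely illustrative counts: (cases, population)
# for each risk stratum and each policy. Not real data.
counts = {
    "high_risk": {"measure": (200, 1000), "no_measure": (30, 100)},
    "low_risk": {"measure": (5, 100), "no_measure": (100, 1000)},
}


def stratified_rates(data):
    """Case rate for each policy, computed separately within each stratum."""
    return {
        stratum: {
            policy: cases / population
            for policy, (cases, population) in by_policy.items()
        }
        for stratum, by_policy in data.items()
    }


def aggregate_rates(data):
    """Case rate for each policy after pooling all strata together."""
    totals = {}
    for by_policy in data.values():
        for policy, (cases, population) in by_policy.items():
            pooled_cases, pooled_population = totals.get(policy, (0, 0))
            totals[policy] = (
                pooled_cases + cases,
                pooled_population + population,
            )
    return {
        policy: cases / population
        for policy, (cases, population) in totals.items()
    }


if __name__ == "__main__":
    for stratum, rates in stratified_rates(counts).items():
        print(stratum, {p: round(r, 3) for p, r in rates.items()})
    print("aggregate", {p: round(r, 3) for p, r in aggregate_rates(counts).items()})
```

With these made-up numbers, an analyst who only sees the pooled totals would conclude that the measure is associated with more cases, while the disaggregated view shows a lower rate under the measure in both strata. Neither view proves a causal effect on its own, but only the disaggregated data allow the confounding role of baseline risk to be examined.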