Saturday, 9 May 2020

Statistics in the time of pandemics


Pandemic statistics have become an important input into economic and social policy-making, but suffer from weaknesses that need urgently to be addressed.

From the 1990s international bodies, in particular the IMF, the World Bank and the United Nations Statistical Commission, have put substantial efforts into providing high-quality statistics on macroprudential risks to assist policy-making for the economy. If only because economic policy-making now also requires analysis of the progression and likely impact of Covid-19, and of any future pandemic, it is critical that similar enhancements are made to pandemic statistics.

In general, one would expect statistics on a pandemic to be produced on a global basis, given the global nature of the pandemic and the crucial role of interconnectedness in determining its spread. The relevant specialist UN agency in this regard is the World Health Organization (WHO), which does publish statistics, but they are derived from national sources and it is not clear that there is any standardization. Moreover, the WHO has recently been drawn into political attempts to scapegoat it for the pandemic, which may cause any immediate efforts at a statistical initiative to run into further difficulty.

Given all that, the relevant European agencies, collaborating upwards with the global institutions and downwards with the national agencies, should take the lead in devising a set of high-quality statistics for this area. If a European standard emerges, it would likely be adopted in other parts of the world too, especially if developing countries can be helped with technical assistance in implementation. In any case, the issues that arise in a standard-setting exercise are of themselves of use more widely.

The European Centre for Disease Prevention and Control (ECDC) is the European body tasked with monitoring, preventing and controlling diseases in its constituent countries. During the Covid-19 pandemic it has been publishing an impressive range of data daily. However, the ECDC is an EU agency, and thus produces figures just for “EU/EEA and United Kingdom”. This aligns with the coverage of some other European agencies, such as the European Systemic Risk Board (ESRB), but is less inclusive than the coverage of some global observers, who also include the microstates and countries outside the EU/EEA such as Switzerland and the western Balkans. Also, even for the countries covered, the ECDC’s numbers are not always the same as those from other sources.[1]

At present a range of private and university bodies are producing daily data on the impact of the pandemic. These are derived largely by aggregating national sources and, in the US, state sources. Some US states and some countries themselves provide the data in user-friendly, accessible form. At the present time, when there is heightened public interest in the data, presentation is key, and indeed the presentations that have found large audiences are aesthetically refined, as well as having clear messages that are easy to understand. One sees some parallel in the provision of economic statistics by private sector companies such as Haver Analytics, which take basic official data and add value through better timeliness, direct access to the contributory sources, and presentation.

At one level the data that are produced are amazing: daily data from every territory in the world. Johns Hopkins University and Reuters produce the global statistics from which other providers generally draw. Worldometer, for instance, a private company owned by Dadax, is geographically totally inclusive, covering every European microstate, each Caribbean island, and even politically challenging states such as Taiwan (the one exception being North Korea, unless that country is excluded only because it officially has no cases). This is truly exemplary, and particularly important in this case because of the importance of identifying interconnectedness. To complement these health statistics, Deloitte for example seeks to measure the economic impact of the pandemic by presenting indicators of economic activity, such as traffic density compared to some base period. Their statistics, like those from a number of US states, are beautifully presented.

A first quibble though is the focus on static cumulatives. The numbers are generally presented in the way that centrally-planned economies invariably presented them, and that those working in the centrally-planned economies found very hard to change when their economies changed their operating paradigms after the fall of communism. In such presentations the headline figure is nearly always the cumulative total, just as cumulative output was headlined in centrally-planned economies as they headed towards plan fulfillment. Even the main ratios are static cumulatives, such as the number of cases per million inhabitants. In the long run these will be useful, if one wishes to compare the relative success of countries’ strategies. In the short term, however, they may be dangerous for exactly that reason, and thus may be particularly vulnerable to official attempts at distortion. For proper analysis one needs time series, and first and second derivatives of the total—i.e. has a country “crested”, and how rapidly is it coming down from the crest? “Mapping” the trends, as some statistics providers are doing, can give an immediate picture as to what is happening.
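
As a rough illustration of the point about derivatives, the sketch below turns a cumulative series into daily new cases (the first difference) and the day-to-day change in new cases (the second difference), and locates the crest. The figures and the function names are invented for the example and are not drawn from any provider's data or method.

```python
def differences(cumulative):
    """Return (daily_new, daily_change) for a list of cumulative totals.

    daily_new is the first difference of the cumulative series;
    daily_change is the first difference of daily_new.
    """
    daily_new = [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]
    daily_change = [later - earlier for earlier, later in zip(daily_new, daily_new[1:])]
    return daily_new, daily_change


def crest_index(daily_new):
    """Position of the highest daily figure (the first one if there are ties)."""
    return max(range(len(daily_new)), key=lambda i: daily_new[i])


# Hypothetical cumulative case counts on seven consecutive days.
cumulative = [100, 150, 230, 340, 430, 490, 520]

daily_new, daily_change = differences(cumulative)
print("new cases per day:", daily_new)
print("change in new cases:", daily_change)
print("crest at position:", crest_index(daily_new))
```

A run of negative values at the end of the second series is what "coming down from the crest" looks like in the numbers, although with noisy daily data a few negative days are not in themselves conclusive.
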
A second quibble is that daily data invariably contain noise, which is hard to filter out. Some observers, such as the Financial Times, use averages of daily data, which is helpful. Particularly where collection resources are limited, however, there would be a case for reducing the frequency of the data, perhaps to weekly, while making sure that the resources thus saved are put to use to enhance the quality of the weekly data.
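
The two options mentioned here, averaging daily data and moving to weekly figures, can be sketched as follows. The seven-day window, the sample figures and the function names are assumptions made for the illustration; no claim is made that any particular publisher computes its averages in this way.

```python
def moving_average(values, window=7):
    """Trailing moving average: one value for each full window of days."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[end - window + 1 : end + 1]) / window
        for end in range(window - 1, len(values))
    ]


def weekly_totals(values, days_per_week=7):
    """Sum each complete block of seven days; an incomplete final block is dropped."""
    complete = len(values) - len(values) % days_per_week
    return [sum(values[start : start + days_per_week])
            for start in range(0, complete, days_per_week)]


# Hypothetical daily new cases over two weeks, with a dip in reporting on days 3-4
# and 10-11 of the kind often seen at weekends.
daily = [10, 12, 3, 2, 15, 14, 14, 11, 13, 4, 3, 16, 15, 15]

smoothed = moving_average(daily)
weekly = weekly_totals(daily)
print("7-day averages:", [round(v, 1) for v in smoothed])
print("weekly totals:", weekly)
```

The average keeps the daily frequency while damping the reporting dips; the weekly totals remove them altogether, at the cost of fewer observations.
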

A third observation is that the data seem to be too good to be true. The data are in principle exhaustive for each country, i.e. they report all cases and deaths within the country (or in some cases just in the hospitals in the country). Conceivably this might be possible in advanced countries, but in places where resources are poor and infrastructure almost non-existent full coverage seems very unlikely. The figures for Africa show much lower rates for infection than in Europe, and for given infection rates lower mortality rates. These phenomena could prompt urgent analysis as to why people in Africa are less subject to the disease; but if the figures reflect reporting (or testing) deficiencies it must be a priority to make these clear.

At present there is no standardization. Comparisons are “apples and oranges”, as for instance some countries include just hospitals, while others go wider; some cover only confirmed cases, while others include those suspected. Nor can one easily adjust from one series to the other, since for those countries producing only hospital figures the wider figures are often not even collected. The figures may also not be consistent over time (not a problem in a centrally planned framework but important here): sudden jumps are explained in terms of expanding the reporting base, but there seem to be no back-revisions to get a consistent time series. Much is made of the technical models, for instance those run at Imperial College London and in Washington State, and of how their predictions of mortality rates have influenced governments. The modelling techniques may be the best available (little information on their specifications seems to be available, so they cannot be assessed by outsiders), but insofar as the data inputted are not of high quality the results too will need to be handled with much caution. Over-promising on the accuracy of their forecasts, or over-simplistic dissemination of their results, can lead to exaggerated discrediting, particularly as the models are providing “inconvenient truths”: this seems to be the case to some extent for the Washington model, where predictions now are significantly lower than earlier.

Producing high-quality statistics for Europe as a whole could forestall many of these difficulties. Two key elements would seem necessary: first, to define common standards; and then, to assess and improve the quality of the statistics. One can then also work on the complex issue of establishing an operational indicator.

There is a precedent worth copying for the development of standards. On the macroeconomic statistics side, countries commit themselves to meet certain data requirements, the IMF’s so-called Special Data Dissemination Standard. All European countries now comply; most of those countries that cannot yet commit to this have nevertheless committed to the IMF’s General Data Dissemination System. In addition, the IMF developed a Data Quality Assessment Framework, and has assessed countries against this framework. The basic framework is shown in the table below. It covers a range of factors necessary for good statistics, including proper governance and resources in the collecting agency. Beyond the direct impact of improving statistics, receiving a public assessment of the quality of their statistics within this framework (with its heavy emphasis on independence, resource adequacy, professionalism, reliability and accuracy, and accessibility) may deter political leaders from seeking to distort the figures for their own ends, and may encourage them to make evidence-based decisions rather than ones based primarily on politics.

The proposal therefore is that the ECDC should collaborate with other institutions to determine the statistics that need to be collected, and then to harmonize the definitions; if there are multiple possible definitions (e.g. hospital deaths and total deaths) both can be collected and clearly labelled. As regards health statistics, the work will have to involve collectors at a local level, and indeed at the level of the collecting unit, such as the hospital or the care home, and even the doctor and the inputting technician, so that the national figures meet the required standard and, if necessary, can be adjusted to fit into the harmonized European framework.

Incorporating regular outside assessments of the quality of the statistics will also be important. The IMF’s Data Quality Assessment Framework is a good starting point, and no doubt can be modified to incorporate the specificities of medical statistics. Thereafter it will be important to arrange the outside assessment. Experience in other fields, such as that of the Financial Action Task Force, suggests peer review. In the case of the ECDC, initial reviews could be carried out by national agencies, both from inside the EU and outside, from countries with recognized high-quality statistics. The purpose is not to score points by stressing deficiencies, but to work collaboratively to improve statistical quality. In time such reviews might also be carried out by global agencies, such as the WHO, or sister regional agencies, for instance from East Asia.

Certain elements of the quality framework are particularly to the point at a time when politicians are weaponizing medical statistics. In that regard, one of the best-practice requirements is that politicians are not given access to the information before it is released to the public (although sometimes a limited confidential pre-release is permitted so that the government can prepare a response).

It will also be important to seek a summary key indicator, much as the money supply and inflation rates have at times been amongst the main indicators for the operation of monetary policy. Of the various possibilities, the Robert Koch Institute in Germany uses R0 as its key indicator. This aims to measure the number of people to whom a person infected with the virus will pass it on. The Robert Koch Institute estimates that, unchecked, the number for the coronavirus is between 2.4 and 3.3; the need is to have it consistently below unity for policies to be relaxed. The number is calculated by dividing the number of new infections by a weighted number of those already infectious. However, the use of this indicator is controversial, as there are a number of problems with it at the moment. Neither the numerator nor the denominator is accurate, so the ratio itself will not be accurate. In its pure form it assumes that nobody is vaccinated, and calculations are very time- and place-specific, ignoring multiple possible extraneous factors. As a lagging indicator it may be less problematic, but it only measures correlation, not causality, so may not be a useful tool for analysis. Nevertheless, R0 may be the best indicator at present available, if its results are appropriately qualified, and (as with the initial macroprudential indicators 20 years ago) it could be a good starting point for further work.
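
To make the arithmetic of such a ratio concrete, the sketch below divides the latest day's new infections by a weighted sum of the new infections on the preceding days, the weights standing in for how infectious people are on each day after being infected. This is an illustration of the general idea only: the equal weights over four days, the sample figures and the function name are assumptions for the example, and it should not be read as the Robert Koch Institute's published procedure.

```python
def reproduction_ratio(new_cases, weights):
    """Latest day's new cases divided by a weighted sum of earlier days' new cases.

    weights[0] applies to yesterday, weights[1] to the day before, and so on.
    The weights are an assumed infectiousness profile and should sum to 1.
    """
    if len(new_cases) < len(weights) + 1:
        raise ValueError("need at least len(weights) + 1 days of data")
    today = new_cases[-1]
    earlier = new_cases[-2::-1]  # yesterday first, then further back in time
    infectious = sum(w * cases for w, cases in zip(weights, earlier))
    if infectious == 0:
        raise ValueError("no earlier infections to compare against")
    return today / infectious


# Hypothetical daily new infections; the last entry is the most recent day.
new_cases = [100, 100, 100, 100, 80]
# Assumed profile: infectiousness spread equally over the four preceding days.
weights = [0.25, 0.25, 0.25, 0.25]

ratio = reproduction_ratio(new_cases, weights)
print("estimated ratio:", ratio)
```

A value below one, as in this made-up case, corresponds to new infections running below the level of those that gave rise to them. The sketch also shows why errors matter: any under-reporting in either the numerator or the weighted denominator feeds straight into the ratio.
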

Use of the ECDC rather than a specialist statistical agency, in particular Eurostat, is in line with the practice of specialist agencies collecting specialist statistics in other areas, such as education and finance. It also makes sense because Eurostat will anyway be facing its own pandemic-related challenges. For instance, it will be struggling to conduct the in-person interviews needed for survey data. Also, trade statistics may be distorted: for instance, medical supplies such as ventilators may not be acquired through the usual procurement processes, and it will therefore be necessary to trace the current practices to ensure that balance of payments figures are still relatively accurate.[2]

A comprehensive statistical exercise would be an important priority for the ECDC, in conjunction with other agencies, to improve the quality of decision-making regarding the handling of the pandemic and future pandemics, and to lay the groundwork for extension to statistics for other infectious diseases and beyond. It will be important that the EU urgently provides funding for this work as a priority.

Data Quality Assessment Framework: Generic Framework (July 2003 Framework)

Layout: each quality dimension is followed by its elements, and each element by its indicators.

0. Prerequisites of quality

  0.1 Legal and institutional environment: The environment is supportive of statistics.
    0.1.1 The responsibility for collecting, processing, and disseminating the statistics is clearly specified.
    0.1.2 Data sharing and coordination among data-producing agencies are adequate.
    0.1.3 Individual reporters' data are to be kept confidential and used for statistical purposes only.
    0.1.4 Statistical reporting is ensured through legal mandate and/or measures to encourage response.

  0.2 Resources: Resources are commensurate with needs of statistical programs.
    0.2.1 Staff, facilities, computing resources, and financing are commensurate with statistical programs.
    0.2.2 Measures to ensure efficient use of resources are implemented.

  0.3 Relevance: Statistics cover relevant information on the subject field.
    0.3.1 The relevance and practical utility of existing statistics in meeting users' needs are monitored.

  0.4 Other quality management: Quality is a cornerstone of statistical work.
    0.4.1 Processes are in place to focus on quality.
    0.4.2 Processes are in place to monitor the quality of the statistical program.
    0.4.3 Processes are in place to deal with quality considerations in planning the statistical program.

1. Assurances of integrity
The principle of objectivity in the collection, processing, and dissemination of statistics is firmly adhered to.

  1.1 Professionalism: Statistical policies and practices are guided by professional principles.
    1.1.1 Statistics are produced on an impartial basis.
    1.1.2 Choices of sources and statistical techniques as well as decisions about dissemination are informed solely by statistical considerations.
    1.1.3 The appropriate statistical entity is entitled to comment on erroneous interpretation and misuse of statistics.

  1.2 Transparency: Statistical policies and practices are transparent.
    1.2.1 The terms and conditions under which statistics are collected, processed, and disseminated are available to the public.
    1.2.2 Internal governmental access to statistics prior to their release is publicly identified.
    1.2.3 Products of statistical agencies/units are clearly identified as such.
    1.2.4 Advance notice is given of major changes in methodology, source data, and statistical techniques.

  1.3 Ethical standards: Policies and practices are guided by ethical standards.
    1.3.1 Guidelines for staff behavior are in place and are well known to the staff.

2. Methodological soundness
The methodological basis for the statistics follows internationally accepted standards, guidelines, or good practices.

  2.1 Concepts and definitions: Concepts and definitions used are in accord with internationally accepted statistical frameworks.
    2.1.1 The overall structure in terms of concepts and definitions follows internationally accepted standards, guidelines, or good practices.

  2.2 Scope: The scope is in accord with internationally accepted standards, guidelines, or good practices.
    2.2.1 The scope is broadly consistent with internationally accepted standards, guidelines, or good practices.

  2.3 Classification/sectorization: Classification and sectorization systems are in accord with internationally accepted standards, guidelines, or good practices.
    2.3.1 Classification/sectorization systems used are broadly consistent with internationally accepted standards, guidelines, or good practices.

  2.4 Basis for recording: Flows and stocks are valued and recorded according to internationally accepted standards, guidelines, or good practices.
    2.4.1 Market prices are used to value flows and stocks.
    2.4.2 Recording is done on an accrual basis.
    2.4.3 Grossing/netting procedures are broadly consistent with internationally accepted standards, guidelines, or good practices.

3. Accuracy and reliability
Source data and statistical techniques are sound and statistical outputs sufficiently portray reality.

  3.1 Source data: Source data available provide an adequate basis to compile statistics.
    3.1.1 Source data are obtained from comprehensive data collection programs that take into account country-specific conditions.
    3.1.2 Source data reasonably approximate the definitions, scope, classifications, valuation, and time of recording required.
    3.1.3 Source data are timely.

  3.2 Assessment of source data: Source data are regularly assessed.
    3.2.1 Source data (including censuses, sample surveys and administrative records) are routinely assessed, e.g., for coverage, sample error, response error, and non-sampling error; the results of the assessments are monitored and made available to guide statistical processes.

  3.3 Statistical techniques: Statistical techniques employed conform to sound statistical procedures.
    3.3.1 Data compilation employs sound statistical techniques to deal with data sources.
    3.3.2 Other statistical procedures (e.g., data adjustments and transformations, and statistical analysis) employ sound statistical techniques.

  3.4 Assessment and validation of intermediate data and statistical outputs: Intermediate results and statistical outputs are regularly assessed and validated.
    3.4.1 Intermediate results are validated against other information where applicable.
    3.4.2 Statistical discrepancies in intermediate data are assessed and investigated.
    3.4.3 Statistical discrepancies and other potential indicators of problems in statistical outputs are investigated.

  3.5 Revision studies: Revisions, as a gauge of reliability, are tracked and mined for the information they may provide.
    3.5.1 Studies and analyses of revisions are carried out routinely and used internally to inform statistical processes (see also 4.3.3).

4. Serviceability
Statistics, with adequate periodicity and timeliness, are consistent and follow a predictable revisions policy.

  4.1 Periodicity and timeliness: Periodicity and timeliness follow internationally accepted dissemination standards.
    4.1.1 Periodicity follows dissemination standards.
    4.1.2 Timeliness follows dissemination standards.

  4.2 Consistency: Statistics are consistent within the dataset, over time, and with major datasets.
    4.2.1 Statistics are consistent within the dataset.
    4.2.2 Statistics are consistent or reconcilable over a reasonable period of time.
    4.2.3 Statistics are consistent or reconcilable with those obtained through other data sources and/or statistical frameworks.

  4.3 Revision policy and practice: Data revisions follow a regular and publicized procedure.
    4.3.1 Revisions follow a regular and transparent schedule.
    4.3.2 Preliminary and/or revised data are clearly identified.
    4.3.3 Studies and analyses of revisions are made public (see also 3.5.1).

5. Accessibility
Data and metadata are easily available and assistance to users is adequate.

  5.1 Data accessibility: Statistics are presented in a clear and understandable manner, forms of dissemination are adequate, and statistics are made available on an impartial basis.
    5.1.1 Statistics are presented in a way that facilitates proper interpretation and meaningful comparisons (layout and clarity of text, tables, and charts).
    5.1.2 Dissemination media and format are adequate.
    5.1.3 Statistics are released on a pre-announced schedule.
    5.1.4 Statistics are made available to all users at the same time.
    5.1.5 Statistics not routinely disseminated are made available upon request.

  5.2 Metadata accessibility: Up-to-date and pertinent metadata are made available.
    5.2.1 Documentation on concepts, scope, classifications, basis of recording, data sources, and statistical techniques is available, and differences from internationally accepted standards, guidelines, or good practices are annotated.
    5.2.2 Levels of detail are adapted to the needs of the intended audience.

  5.3 Assistance to users: Prompt and knowledgeable support service is available.
    5.3.1 Contact points for each subject field are publicized.
    5.3.2 Catalogues of publications, documents, and other services, including information on any charges, are widely available.

Source: IMF

#Covid19; #WHO; #Brexit; #EUTransition; #CharlesEnoch; #Pandemic; #Immigration; #HomesFitForHeroes; #NoDeal; #HighestDeathRate


Charles Enoch

ESC Fellow, European Political Economy Project, European Studies Centre, St Antony’s College, University of Oxford


[1] For instance, in late April 2020, the total number of recorded cases in France was much lower in the ECDC tally than in those of the private global agencies.
[2] The IMF issued guidance in March 2020 to statistics agencies on how to maintain their statistics during the Covid-19 pandemic.
