
Address

ICC 650
Box 571014

37th & O St, N.W.
Washington, D.C. 20057

Contact

Phone: (202) 687.6400

Email: provost@georgetown.edu

 

Data and COVID-19

Many of us check a set of numbers each day (or at least every other day): the number of confirmed cases, the number of deaths, and the number of viral tests administered. Of great interest is how those numbers move over time in the locale we care about.

As the weeks go by, however, we are learning that any empirical data must be challenged, probed, and queried before one understands them fully. All statistical information is useful for some purposes and, simultaneously, useless or harmful for others. Unfortunately, web pages and media coverage often don’t document the measurement processes that generate the numbers they report.

We’ve learned that the number of tests administered has multiple possible uses. From a public health perspective, testing commonly follows a hierarchy: first 1) those who are symptomatic, then 2) at-risk health workers, then 3) other essential workers, and then 4) those in close contact with positive cases (often identified through contact tracing). Only at the end of this process are asymptomatic persons routinely tested. The number of persons tested, or the percentage of the population tested, is a useful indicator of how thoroughly the population is being measured, and so it tracks the evolution of public health activities.

If, however, one uses the count of “confirmed” cases from such testing as an indicator of the prevalence of the virus, then it really matters who was selected for testing. The first group, those with symptoms, generates a much higher percentage of positive outcomes, and the rates decline over the successive groups above.
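A small arithmetic sketch may make this concrete. Every number below is invented for illustration; the per-group positivity rates and the two testing mixes are assumptions, not figures from any jurisdiction. The only point is that the overall share of positive tests moves with who gets tested, even when each group's own rate stays fixed.

```python
from fractions import Fraction

# Hypothetical share of positive results within each tested group.
# These rates are assumptions chosen only to decline down the hierarchy.
POSITIVITY = {
    "symptomatic": Fraction(30, 100),
    "health_workers": Fraction(10, 100),
    "essential_workers": Fraction(5, 100),
    "close_contacts": Fraction(4, 100),
    "asymptomatic": Fraction(1, 100),
}


def pooled_positivity(tests_by_group):
    """Return the overall share of positive tests for a given testing mix."""
    total_tests = sum(tests_by_group.values())
    positives = sum(POSITIVITY[group] * n for group, n in tests_by_group.items())
    return positives / total_tests


# Hypothetical mix early on: almost all tests go to symptomatic people.
early_mix = {"symptomatic": 900, "health_workers": 100}

# Hypothetical mix later: the same number of tests, spread across all groups.
later_mix = {
    "symptomatic": 300,
    "health_workers": 100,
    "essential_workers": 200,
    "close_contacts": 200,
    "asymptomatic": 200,
}

print(float(pooled_positivity(early_mix)))  # 0.28
print(float(pooled_positivity(later_mix)))  # 0.12
```

Under these assumed rates, the share of positive tests falls from 28% to 12% purely because the testing mix changed, which is why neither figure can be read as the prevalence in the full population.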

Some areas have tried to improve on targeted testing by offering tests to a convenient gathering of people (e.g., people entering or leaving a grocery store). How well these tests describe the full population depends on whether the people captured differ from everyone else in their likelihood of testing positive (e.g., those who are sheltering in place are likely to have a lower risk of infection than those who frequent a grocery store).

Even greater sins of data aggregation are being witnessed. For example, one jurisdiction just revealed that it had been combining the results of viral/diagnostic testing with those of antibody testing, an obvious mixing of apples and oranges.

What’s the gold standard for the estimation of prevalence? One would like to test everyone in the population, but that’s prohibitively expensive and logistically complex. Alternatively, some locales (e.g., Iceland, Miami-Dade County) have mounted probability sampling of persons within the target geographical area to yield a representative sample of tests. This careful sampling is crucial for estimating the total number of persons in the population who are positive. Careful sampling of a large population, followed by skilled administration of tests, can yield accurate estimates of the prevalence of the virus in that population. Doing this repeatedly would allow decision-makers to answer the question of whether things are getting better or worse.
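As a rough sketch of what a probability sample buys, the snippet below turns a sample result into a prevalence estimate, an approximate 95% interval, and a projected number of positives in the population. It assumes a simple random sample, a perfectly accurate test, and full participation, and it uses the textbook normal approximation; real surveys typically need weighting and adjustments that are not shown here. The function name and the example counts are hypothetical.

```python
import math


def estimate_prevalence(positives, sample_size, population_size, z=1.96):
    """Estimate prevalence from a simple random sample.

    Uses the normal (Wald) approximation for the interval, which is only
    reasonable when the sample is large and the positives are not too few.
    """
    p_hat = positives / sample_size
    std_error = math.sqrt(p_hat * (1 - p_hat) / sample_size)
    low = max(0.0, p_hat - z * std_error)
    high = min(1.0, p_hat + z * std_error)
    return {
        "prevalence": p_hat,
        "interval": (low, high),
        # Scale the rate up to the population to estimate total positives.
        "estimated_positives": p_hat * population_size,
        "positives_interval": (low * population_size, high * population_size),
    }


# Hypothetical survey: 60 positives among 3,000 randomly sampled residents
# of an area with 1,000,000 people. These counts are invented.
result = estimate_prevalence(positives=60, sample_size=3000, population_size=1_000_000)
print(f"prevalence: {result['prevalence']:.1%}")
print(f"interval: {result['interval'][0]:.2%} to {result['interval'][1]:.2%}")
print(f"estimated positives: {result['estimated_positives']:,.0f}")
```

Repeating the same calculation on fresh samples drawn the same way is what would make week-to-week comparisons meaningful, since the selection process stays fixed while the virus does not.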

In the absence of this, the counts of positives in many areas are built up gradually from an unknown mix of high- and low-risk persons being tested. When that mix changes over time, as testing becomes more widespread, no one really knows how to interpret cross-time comparisons in counts of confirmed cases. We can do better.
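To see why cross-time comparisons become hard to read, consider a toy series in which nothing about the underlying risk changes from week to week; only the amount and mix of testing does. The two risk groups, their rates, and the weekly test counts are all invented assumptions.

```python
# Hypothetical positives per 1,000 tests in each group, held constant
# across all weeks so that the underlying situation never changes.
POSITIVES_PER_1000_TESTS = {"high_risk": 250, "low_risk": 20}

# Hypothetical weekly testing volumes: testing expands to low-risk people.
weekly_tests = [
    {"high_risk": 1000, "low_risk": 0},
    {"high_risk": 1000, "low_risk": 2000},
    {"high_risk": 1000, "low_risk": 6000},
]


def summarize(tests):
    """Return (confirmed cases, share of tests that were positive)."""
    confirmed = sum(
        POSITIVES_PER_1000_TESTS[group] * n // 1000 for group, n in tests.items()
    )
    total = sum(tests.values())
    return confirmed, confirmed / total


for week, tests in enumerate(weekly_tests, start=1):
    confirmed, share = summarize(tests)
    print(f"week {week}: {confirmed} confirmed, {share:.1%} of tests positive")
```

In this toy series the confirmed count rises from 250 to 290 to 370 while the share of positive tests falls, and neither trend reflects any change in how widespread the virus is.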

One thought on “Data and COVID-19”

  1. It’s sure helpful to have a data guru in the Provost role, especially at a time like this! This was the clearest explanation I’ve seen of testing during the pandemic. I especially appreciate the points about data being skewed based on what group one is testing (e.g., symptomatic people). Interesting to hear what cities are doing to get a better cross-section.
    Thanks for sharing!

