Forgive, dear reader, a commentary on issues of intellectual interest to me alone, perhaps, but…
Much of what we know or are told about modern societies rests upon analyses of people's self-reports. Journalists talk to passers-by and ask questions about their welfare and interpretations of events. Customers are asked by sales representatives how they evaluate the service. Comments on articles express readers' opinions. Storytelling describes the state of a single person, often illustrating experiences common to others.
For decades in the late 19th and early 20th centuries, decision-makers used these collections of self-reports to build evidence to guide decisions. Around the time of the Great Depression, however, the weaknesses of such haphazard methods of measuring public beliefs and behavior became obvious. For example, in the 1930s the US did not have credible estimates of the number of unemployed. How could ameliorating interventions be designed without good measures of the extent of the problem?
The scientific sample survey was an invention of the mid-1900s. It poses structured, uniform questions to a representative sample of a target population and produces accurate estimates for the full population from a subset selected by giving each member a known probability of entering the sample. Some consider the sample survey the most important invention of the social sciences in the 20th century.
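For readers who like to see the idea in motion, here is a minimal sketch of how known selection probabilities turn a sample into a population estimate. It uses the standard Horvitz-Thompson estimator (each sampled value is weighted by the inverse of its inclusion probability), which is one common way to do this and not necessarily the method of any particular statistical agency. The population, the 20% unemployment rate, the sample size, and every name in the code are hypothetical, chosen only for illustration.

```python
import random


def horvitz_thompson_total(sampled_values, inclusion_probs):
    """Estimate a population total from a probability sample.

    Each sampled value is divided by the known probability that its
    unit entered the sample, so a unit selected with probability 0.05
    "stands in" for 20 units of the population.
    """
    if len(sampled_values) != len(inclusion_probs):
        raise ValueError("need one inclusion probability per sampled value")
    return sum(y / p for y, p in zip(sampled_values, inclusion_probs))


# Hypothetical population of 10,000 people: 1 = unemployed, 0 = not.
rng = random.Random(0)
population = [1 if rng.random() < 0.2 else 0 for _ in range(10_000)]
true_total = sum(population)

# Simple random sample without replacement: every person has the same
# known inclusion probability, n / N.
n = 500
pi = n / len(population)
sample = rng.sample(population, n)

estimate = horvitz_thompson_total(sample, [pi] * n)
print(f"true unemployed: {true_total}, estimated from {n} people: {estimate:.0f}")
```

The estimate will not match the true count exactly in any single sample; the theoretical guarantee is that it is right on average over repeated samples, and that guarantee assumes every selected person is actually measured.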
This post identifies four prominent eras of scientific surveys. The first (1930-1960) was a period of wonderful invention. A 1934 article provided the theoretical logic that probability samples, when measured in consistent, structured ways, could generate unbiased estimates for large finite populations like those existing in nation-states. Such theory permitted the development of national statistical systems throughout the world. Many countries developed ongoing measurement of employment, income, housing, education, GDP, and other societal attributes, providing monitoring of the well-being of the country.
The second era of surveys (1960-1990) saw the survey method spread throughout societies, providing the evidence base for key decisions in government and the private sector and as a tool of research discovery in academia.
One of the weaknesses in the 1934 theory is that it had to assume complete measurement of the sample drawn, in order to achieve the accuracy extolled. Chosen sample persons had to be willing to interact with a complete stranger, an employee of a remote institution, and reveal to that stranger intimate details of their life or their enterprise’s economic activities.
In the third era of surveys, roughly 1990-2010, the social fabric that permitted high participation rates appeared to fray. This first appeared in private sector surveys, which made relatively little effort to contact all sample persons and persuade them to provide information. But it spread to other sectors and, last of all, to the central government sector. As government agencies increased their efforts to encourage participation, survey costs inflated greatly. Cost inflation in government statistical agencies has prompted the dropping of some surveys. In that sense, some countries know less about their societal welfare than in prior years.
The outline of the next, a fourth, era of sample surveys is emerging. Instead of a world in which surveys become extinct, it is a world that will transform surveys. We live in societies in which digital data arise from almost all processes involving human activity. The fourth era will involve a blending of survey data with digital records of transactions in e-commerce, digital records of program participation in welfare support systems, educational records, employment records, social media data, and other digital information.
This fourth era will bring the benefit of repairing the growing gaps in coverage by surveys. However, while surveys are governed by the informed consent of the responding persons, many of these digital resources spur privacy concerns about potentially harmful uses. Further, while survey measurements are designed to achieve prespecified informational goals, the “harvested” digital data were designed for other purposes. Hence, there are formidable analytic challenges in deriving credible estimates of important societal attributes (e.g., unemployment, income distributions) using multiple data sources that were not designed to be blended.
This fourth era will perforce require unprecedented rates of innovation. It will be a ride.
Address
ICC 650
Box 571014
37th & O St, N.W.
Washington, D.C. 20057
Contact
Phone: (202) 687.6400
Email: provost@georgetown.edu
Interesting post!
Considering the challenges of blending survey data with diverse digital records in the fourth era of sample surveys, what innovative approaches or methods do you foresee being crucial in deriving accurate and credible estimates from these disparate sources?
Fascinating history of surveys. It’s a hopeful discussion about integrating other available data with the surveys. One key glitch: how do we account for people’s belief in “alternative facts”? Any thoughts on that in all our data gathering?