One of the most important challenges of the coming few years concerns how societies will produce and absorb information about themselves. In democracies, the very notion of an informed citizenry requires that there be freely available, trusted information on matters such as the basic performance of the economy, the distribution of jobs, the distribution of wealth across persons, access to educational programs, the success of these programs, the adequacy of housing, the distribution of access to health care services, the quality of those services, the food supply chain, and the dietary habits of the population. Potential voters can use such information to judge whether or not the current government is performing well.
The Markle and the Gates Foundations have just finished a report scanning the terrain of issues facing the future of producing information for the common good. The central question addressed was: can the apparent rise in US inequalities in income and wealth, health, housing, and education be ameliorated if their causes were illuminated using data that exist but are not being used now? Would such data reveal which programs addressing these issues were successful and which were not?
The new report takes a utilitarian stance, assessing how data resources can be improved to be more informative about traditionally disadvantaged groups and how access to the data might be enhanced to unlock insights into ways of reducing the inequalities that plague the society. It notes that policy based on reliable evidence has a greater chance of success than policy made on purely ideological grounds.
In a republic like ours, this involves new ways of sharing data across sectors and across federal, state, and local governments. For example, many programs seeking to improve the lives of lower income groups and racial minorities are administered at state and local levels, but partially or totally funded by central government agencies. Understanding when and for whom programs are most successful requires combining data across these levels of government.
However, laws, regulations, and practices often prevent this. Further, government staff members who are stewards of the data often lack the financial and human resources to be active analysts of the data. Hence, far more data resources exist that could help improve people's lives than are currently being analyzed to provide such benefits.
It is noteworthy that the data are generally collected to administer a program, and often do not document key features that would be useful for program evaluation. However, despite the weaknesses of various unanalyzed sets of data, there are examples of combining them to find better ways of benefiting disadvantaged groups. Hence, the notion of “data equity” has arisen. This entails the goal of assuring that all groups in the society enjoy equal benefits from data describing their well-being. With such assurances, these data sources can contribute more powerfully to the common good.
An important feature of the report is its attention to the large, mostly unexploited data sets held by the private sector. These include data from credit card processing companies, large e-commerce platforms, and other sources of data on consumer behaviors. Such data generally fail to cover large swaths of the population, especially those with constrained incomes or wealth. However, when blended with other data resources, they could yield real common good outcomes.
The report argues for more open data. It devotes less content to the traditional concerns of unwarranted intrusions into the private lives of persons described in the data. What it does cover is important, however. Our society must learn how to balance concerns about the privacy of individuals against the societal benefits that can accrue only by using data describing the full population. Fortunately, new developments in computer science, statistics, and data science offer ways of extracting common good information from individual data while respecting the privacy of individuals.
The last two issues – a) the use of private sector data for common good purposes and b) discovering the right societal balance between privacy protections and data use for social good – are two of the most pressing issues in building a society that uses data for the benefit of all in the society.