
Address

ICC 650
Box 571014

37th & O St, N.W.
Washington, D.C. 20057

Contact

Phone: (202) 687.6400

Email: provost@georgetown.edu

 

Finding the Relevant in a Sea of Disorganized Observations

I recently saw a documentary on developments in science and mathematics in the 7th and 8th centuries, many of them centered in the Middle East. It was a time when basic observation was a prevalent way to make progress. While the world is always filled with stimuli observed by human senses, developments in mathematics permitted the extraction of principles that identified which observations had merit for deepening understanding of the natural world. Scientific developments allowed us to organize the subset of observations that offered long-lasting insight.

In more modern language, the world is always filled with an unlimited amount of data; we are bombarded by them daily. Most of the data have no lasting value for understanding a given problem. Our ancestors looked at the sky; they saw birds flying, different types of clouds, stars at night, the sun, the moon, lightning, and rain. How these relate to the passing of seasons, however, must not have been obvious millennia ago. Indeed, most of the observations made by looking at the sky are not relevant to that restricted question.

This has relevance to us today, I believe. Although humans have always been privy to continuous sensory observations, understanding comes when we place these observations into subsets, extend them beyond our immediate surroundings, and refine them. Often, only at that point can we make advances in our understanding.

Today, Internet- and sensor-based observations are beginning to represent a parallel world, digitally recording much human activity. Like those earlier observations, these “big data” tend not to be organized or stratified into domains of interest; they’re all bunched together. (People tweet about almost every possible topic; CCTV captures digital images of whatever is happening at the focal point.) For any particular question, there’s a lot of noise: data irrelevant to that question. In a way, these systems are merely recording what is occurring, constantly, in the same way that our early ancestors’ senses were constantly observing.

The new feature of our world is that these Internet and sensor observations are not human sensory traces or memories but are digital. They are not fleeting; they can be stored for diverse analyses. They are not dependent on whether humans are present to make the observations; they are produced wherever we place the measurement device. They are not dependent on our volition to make the observations; they are continuous. They are not subject to human variability; they consistently conduct the same observation.

Storable, digital, ubiquitous, continuous, and consistent observations of our world create a fair bit of data, to say the least. But with high-throughput computing, we have tools to extract information relevant to a particular question from the vast amounts of data. Their power to search various digital forms of data and transform them into a standard form offers new promise. Instead of just gobs of data, we might be able to radically increase the utility of the information drawn from the data.

Of course, these data do not reflect the inner conversations that we have with ourselves. They don’t reflect what we choose not to express in the Internet world or choose not to act on. (In that regard, they are weaker than social science tools in which the researcher asks people questions focused on the topic of the research.) Currently, they are far from covering large swaths of human activity; they miss a lot of action.

Importantly, these data are disproportionately controlled by institutions that are driven not by the common good but by narrower motives. There are serious issues of access and privacy rights.

On the other hand, technological advances will eventually allow us to build a fully measured world with interlocking data ecosystems.

Right now, we’re missing an opportunity concerning these systems. We’re not engaged in a broad discussion about how these diverse data systems should best be used. In recent months, we have seen how they can be abused. But a central question of our age is whether we use these systems to build a better society.
