Since 1986 - Covering the Fastest Computers in the World and the People Who Run Them

October 13, 2011

Big Data for Big Brother

Nicole Hemsoth

Billions of tweets, Facebook updates, location-enabled applications and web searches are producing an unprecedented amount of data “byproduct” that an increasing number of businesses are mining in search of new insights, trends and sentiments. While predictive and real-time analytics hold enormous value for business, governments, as one might imagine, also see an opportunity to understand citizens far better than ever.

According to a report this week by John Markoff in the New York Times, this past summer, an obscure government intelligence agency solicited ideas from the academic community about how it might be able to automatically “scan the Internet in 21 Latin American countries.”

This three-year experiment, which is slated to begin in April, would devise an automated data collection system that looks for patterns of “communication, consumption and movement of populations.” Rather vague, yes?  

This “data eye in the sky” will use publicly available data to take the digital pulse of an entire region. In the agency's view, this includes everything from IP traffic and web searches to more “easily” available sources like blogs and social media streams.

This type of research has been in the news quite a bit over the last year. Stories have emerged about everything from mining Twitter for brand sentiments to using supercomputing resources to predict the future. What is different here is that this is no longer a branding-driven or academic initiative; it is a project backed with public funds on behalf of a small agency that is refusing to comment on the scope of the analytics endeavor.

The group behind the effort, the Intelligence Advanced Research Projects Activity, is part of the Office of the Director of National Intelligence in the United States. As the NYT report claimed, the agency’s research would “not be limited to political and economic events, but would also explore the ability to predict pandemics and other types of widespread contagion, something that has been pursued independently by civilian researchers and by companies like Google.”

As the article’s author noted, there are potential privacy and more general logistics concerns involved. Markoff writes that “the ease of acquiring and manipulating huge data sets charting Internet behavior causes many researchers to warn that the data mining technologies may be quickly outrunning the ability of scientists to think through questions of privacy and ethics.”

Full story at New York Times
