November 22, 2012
Twitter is a veritable gold mine for those looking to conduct large-scale sentiment analysis. As a result, the rush to provide customers with that analysis is on. SGI, via its UV 2 “Big Brain” supercomputer, is showing off its growing Twitter proficiency.
SGI is using the UV 2 to demonstrate how such a system can decipher the Twitterverse by producing “heat maps” of various large events, including the days leading up to Hurricane Sandy’s landfall and the day of the presidential election. SGI compiled these heat maps with the help of the University of Illinois’s Kalev H. Leetaru and Dr. Shaowen Wang of the CIGI lab at the University of Illinois at Urbana-Champaign, by taking one out of every ten tweets and determining, in the case of the hurricane, whether the tweet was positive or negative. In the case of the election, they distinguished between pro-Obama and pro-Romney tweets.
The resulting maps are visually striking, especially when paired with epic music, as SGI does in its time-lapse YouTube videos. More importantly, they highlight the supercomputer’s ability to gauge sentiment on an issue that much of the population of a 300 million-person nation is tweeting about. Even when that issue is binary (good/bad, Obama/Romney), the process is not straightforward and requires a fair amount of computing smarts. With regard to the hurricane, SGI had to factor in a) whether or not the tweet was indeed about Sandy, b) whether the tweet was positive or negative, and c) the location of the person tweeting.
Parts a and c are no small feats, especially considering the system had to process 50 million tweets a day (which itself represents only about 10 percent of all tweets). But the real challenge lies in part b, a task mostly foreign to computers. It requires a much deeper level of semantic analysis, something IBM’s Watson machine has become notably skilled at doing.
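To make the three steps concrete, here is a minimal Python sketch of such a pipeline: sample one tweet in ten, keep the ones about the storm, score each as positive or negative, and average the scores per map cell. It is not SGI's method. The keyword lists, the function names and the one-degree grid are all assumptions chosen for illustration, and simple word counting is a crude stand-in for the semantic analysis that part b really demands.

```python
import math
from collections import defaultdict
from itertools import islice

# Hypothetical keyword lists, for illustration only. Real sentiment
# analysis needs far more than word matching.
TOPIC_TERMS = {"sandy", "hurricane", "storm"}
POSITIVE = {"safe", "thankful", "relief", "ok"}
NEGATIVE = {"scared", "flooded", "damage", "terrible"}


def one_in_ten(stream):
    """Keep every tenth item of the stream, starting with the first."""
    return islice(stream, 0, None, 10)


def tokens(text):
    """Lowercase the words and strip common punctuation from their edges."""
    return [w.strip(".,!?#@\"'").lower() for w in text.split()]


def is_on_topic(text):
    """Step a: does the tweet mention the storm at all?"""
    return any(w in TOPIC_TERMS for w in tokens(text))


def sentiment(text):
    """Step b: positive word count minus negative word count."""
    words = tokens(text)
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def grid_cell(lat, lon, size=1.0):
    """Step c: map a coordinate to a square grid cell of `size` degrees."""
    return (math.floor(lat / size), math.floor(lon / size))


def heat_map(tweets, size=1.0):
    """Average sentiment per grid cell for on-topic (text, lat, lon) tweets."""
    cells = defaultdict(lambda: [0, 0])  # cell -> [score total, tweet count]
    for text, lat, lon in tweets:
        if not is_on_topic(text):
            continue
        cell = grid_cell(lat, lon, size)
        cells[cell][0] += sentiment(text)
        cells[cell][1] += 1
    return {cell: total / count for cell, (total, count) in cells.items()}
```

Calling `heat_map(one_in_ten(stream))` on an iterable of `(text, latitude, longitude)` tuples returns a dictionary of average scores keyed by grid cell, which a plotting tool could then color as a map. The sketch also assumes every tweet arrives with coordinates, which sidesteps much of the difficulty of part c.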
Thanks to the UV 2’s coherent shared memory architecture, the system is well suited to these types of data-intensive problems. And because it can operate in a highly parallel manner, it has the ability to shuffle this data around at top speed. For example, according to SGI, a UV system can “ingest the entire Library of Congress print collection in less than three seconds.”
Presumably, the company’s eventual goal is to translate this success to the business world, where companies can get a sense of, say, how a marketing campaign is going, or how to interpret customer feedback posted on the Web. Beyond that, the company hopes its UV 2 becomes a fixture in the scientific research arena, where problems like genomic sequencing and analysis of climate simulation results provide similar types of big data challenges.