SGI Supercomputer Takes Twitter’s Pulse
Twitter is a veritable gold mine for anyone looking to perform large-scale sentiment analysis. As a result, the rush to provide customers with that analysis is on. SGI is using its UV 2 “Big Brain” supercomputer to show off its growing Twitter proficiency.
SGI is using the UV 2 to demonstrate how such a system can decipher the Twitterverse by producing “heat maps” of various large events, including the days leading up to Hurricane Sandy’s landfall and the day of the presidential election. SGI compiled these heat maps with the help of Kalev H. Leetaru of the University of Illinois and Dr. Shaowen Wang of the CIGI lab at the University of Illinois at Urbana-Champaign, by sampling one out of every ten tweets and determining, in the case of the hurricane, whether each tweet was positive or negative. In the case of the election, they distinguished between pro-Obama and pro-Romney tweets.
The resulting maps look striking, especially when paired with epic music, as SGI does in its time-lapse YouTube videos. They highlight the supercomputer’s ability to gauge sentiment on an issue that a large share of a nation of 300 million people is tweeting about. Even when that issue is binary (good/bad, Obama/Romney), the process is not straightforward and requires a fair amount of computing smarts. With regard to the hurricane, SGI had to factor in a) whether or not the tweet was indeed about Sandy, b) whether the tweet was positive or negative, and c) the location of the person tweeting.
Parts a and c are no small feats, especially considering the system had to process 50 million tweets a day (which represents only about 10 percent of all tweets). But the real challenge lies in part b, a task mostly foreign to computers. It requires a much deeper level of semantic analysis, something IBM’s Watson machine has become notably skilled at.
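To make the three steps concrete, here is a minimal Python sketch of a topic filter (a), a sentiment score (b), and a location bucket (c) feeding a heat map. It is not SGI's method, which the article does not describe in detail. The keyword sets, the word-counting score, the one-degree grid, and every function name are assumptions chosen for illustration, and a simple word list is a far cruder stand-in for the semantic analysis that part b really calls for.

```python
from collections import defaultdict

# Hypothetical keyword and word lists, chosen only for illustration.
TOPIC_KEYWORDS = {"sandy", "hurricane", "storm"}
POSITIVE_WORDS = {"safe", "relieved", "thankful", "ok"}
NEGATIVE_WORDS = {"scared", "flooded", "damage", "terrible"}


def tokenize(text):
    """Lowercase the text and split it into words, trimming punctuation."""
    return [w.strip(".,!?#@\"'") for w in text.lower().split()]


def is_on_topic(words):
    """Step (a): keep only tweets that mention a topic keyword."""
    return any(w in TOPIC_KEYWORDS for w in words)


def sentiment_score(words):
    """Step (b): positive word count minus negative word count."""
    pos = sum(w in POSITIVE_WORDS for w in words)
    neg = sum(w in NEGATIVE_WORDS for w in words)
    return pos - neg


def grid_cell(lat, lon, cell_degrees=1.0):
    """Step (c): snap a coordinate to a coarse grid cell for the map."""
    return (int(lat // cell_degrees), int(lon // cell_degrees))


def build_heat_map(tweets):
    """Average the sentiment score of on-topic tweets in each grid cell."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [score sum, tweet count]
    for text, lat, lon in tweets:
        words = tokenize(text)
        if not is_on_topic(words):
            continue
        cell = totals[grid_cell(lat, lon)]
        cell[0] += sentiment_score(words)
        cell[1] += 1
    return {cell: s / n for cell, (s, n) in totals.items()}


# Made-up sample tweets as (text, latitude, longitude) tuples.
sample = [
    ("Hurricane Sandy flooded our street, terrible damage", 40.7, -74.0),
    ("Power is back and we are safe after Sandy!", 40.2, -74.8),
    ("Great pizza tonight", 40.7, -74.0),
    ("So thankful the storm missed us", 38.9, -77.0),
]
heat_map = build_heat_map(sample)
```

In this toy run the off-topic tweet is dropped, and each remaining grid cell carries an average score that a plotting step could turn into a color. Sarcasm, negation, and slang would all defeat a word list like this one, which is why part b is the hard part.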
Thanks to the UV 2’s coherent shared memory architecture, the system is well suited to these types of data-intensive problems. And because it can operate in a highly parallel manner, it has the ability to shuffle this data around at top speed. For example, according to SGI, a UV system can “ingest the entire Library of Congress print collection in less than three seconds.”
Presumably, the company’s eventual goal is to translate this success to the business world, where companies can get a sense of, say, how a marketing campaign is going, or how to interpret customer feedback posted on the Web. Beyond that, the company hopes its UV 2 becomes a fixture in the scientific research arena, where problems like genomic sequencing and analysis of climate simulation results provide similar types of big data challenges.