By now most people have heard of Zika, the mosquito-borne disease that can cause fever and birth defects and that threatens to spread to the United States from Latin America. Earlier this week more than 50 data scientists, engineers, and University of Texas students gathered for the “Austin Zika Hackathon” at the offices of big data specialist Cloudera in downtown Austin.
Hackathon participants investigated ways to pool different sets of data, such as outbreak reports, stagnant water sources, empty swimming pools and ponds that are potential mosquito breeding grounds, and even Facebook and Twitter feeds. The Texas Advanced Computing Center (TACC) plans to store all the data in Wrangler, its new data-intensive supercomputer.
“We’re trying to collect these disparate pieces of data, and there’s not a good way for people to ask questions about that data. That’s the big problem,” said Ari Kahn, human translational genomics coordinator at TACC, which is providing infrastructure and consulting to support this project.
“What we can do in a one-day hackathon is to focus on one data problem, for example, if there were an outbreak – where would we first send support and kits to local communities and direct awareness programs on prevention by removing stagnant water or using repellents that are effective against Aedes, [the mosquito that’s the disease vector],” said Eddie Garcia, Cloudera chief security architect.
The hackathon’s intent is to raise Zika awareness and to start building a platform that could be used not just for Zika virus data analysis but also for other, similar disease agents. “Someone can basically take what we did here today and apply it to some other unknown outbreak or some other analysis,” said Garcia. Wrangler runs an optimized version of Cloudera’s enterprise Apache Hadoop platform.
As of mid-May 2016, Mexico had reported around 272 cases of Zika, and the problem has grown so large that President Obama has requested $1.9 billion to halt the spread of Zika. A full account of the hackathon by Jorge Salazar is posted on the TACC website: https://www.tacc.utexas.edu/-/zika-hackathon-fights-disease-with-big-data