Experimental scientific HPC applications are continually being moved to the cloud, as covered here in several capacities over the last couple of weeks. As part of that coverage, CloudSigma co-founder and CEO Robert Jenkins penned an article for HPC in the Cloud in which he discussed the emergence of cloud technologies to supplement the research capabilities of big scientific initiatives like CERN and ESA (the European Space Agency).
We followed up with him to hear more about where what he called the ‘Science Cloud’ is headed over the next year or two and what his company is doing to support that process.
“Institutions that have public data that they want to disseminate, public cloud can be really interesting. Rather than having it on a server locally in their institution, putting that data out into an environment where it’s surrounded by a lot of connectivity, particularly in the public cloud, is making it accessible to the community.”
For Jenkins, a lot of the scientific innovation in the public cloud comes down to the accessibility of massive datasets. He compared public data to music, referencing a time now largely past when people owned and carried around their music.
“People used to own music, they owned CDs. Now people use Spotify, they use Google Music. They don’t actually carry it around with them.” While it is possible to download songs from those services and keep them on a hard drive, a consistently strong internet connection makes doing so largely unnecessary.
Of course, there is a considerable difference in data volume and bandwidth between a few minutes of music and a massive scientific dataset. But given the rate at which scientific public cloud usage has grown, the hope is that within a couple of years, accessing and analyzing those datasets from a public environment will become as simple as streaming music is today.
“Probably the biggest new thing to come out of the public cloud is this ability to make datasets accessible. What I hope to see, and what we’re working on, over the next year to two years is these public datasets being ubiquitously available.”
However, Jenkins mentioned that the scope does not have to be limited to scientific purposes. Datasets generated by supercolliders or by agencies that track global weather patterns can be useful to more disciplines than the ones they were created for. A financial analyst, Jenkins argued, could for example use weather pattern data to determine whether any correlation exists between warm weather and market movement.
Another benefit to be had from large scientific institutions moving large datasets to the public cloud is simply an inexpensive expansion of computing power.
“We are seeing that public cloud at the right place can benefit them in terms of having additional capacity. They’re not at the point where they’re necessarily looking to downgrade their in-house datacenters. But on the other hand they face their own challenges in terms of getting enough power supply in managing datacenters.”
Operating in a cloud would shift those management concerns to an outside datacenter operator. If that provider is trustworthy and shows it can keep its facilities operational on a consistent basis, the scientific institutions stand to save a significant amount of funding. That operational trust is already there, with ESA moving datasets to the public cloud with the assistance of CloudSigma, according to Jenkins.
“The technical aspect, which is what we’re doing with CERN and ESA, we’re deploying their workloads in the cloud,” he said. The European Space Agency is specifically moving some datasets into server spaces called ‘supersites,’ locations where connectivity is at a maximum. That allows those datasets to be accessed from anywhere, so long as the internet connection is strong.
The development of those supersites, which ESA is already using, is a step toward making massive datasets behave like streamed music: a resource that can, in theory, be accessed and analyzed from any terminal.
The goal is to get institutions from outside the sciences involved over the next year or two. Jenkins, as noted above, specifically mentioned financial institutions. However, the sciences are where the greatest growth is happening right now, and for good reason. Beyond potentially addressing issues of data access and computational capacity, moving HPC applications to the public cloud fosters scientific collaboration, since it represents a commitment to sharing data resources.
After all, global science initiatives led by agencies like CERN and ESA require the efforts of researchers and scientists across many countries. They need some seamless method of accessing the critical information, and that need is what drives the building of the Science Cloud.