February 28, 2012

Cloudscape IV Spurs Discussion

Nicole Hemsoth

Defining HPC in the cloud: Juelich Supercomputing Centre's Morris Riedel highlights key topics raised by the Cloudscape IV event.

Cloudscape IV, held under the direction of the SIENA Consortium, took place Feb. 23-24 at the European Commission in Brussels. The event aims to drive advances in interoperability and cloud computing standards as captured by the SIENA Roadmap, which seeks to “define scenarios, identify trends, investigate the innovation and impact sparked by cloud and grid computing, and deliver insight into how standards and the policy framework is defining and shaping current and future development and deployment in Europe and globally.”

SIENA, the Standards and Interoperability for eInfrastructure Implementation Initiative (2010-2012), is an EU-funded group whose objectives are complementary to the role of the US National Institute of Standards and Technology (NIST) in providing guidance on cloud standards and technology.

Over at GridCast, Morris Riedel of the Juelich Supercomputing Centre, who serves as EMI Strategic Director and EUDAT Data Replication Task Force Leader, has written about some of the interesting topics that came to light at the Cloudscape IV event. Two of these discussions center on HPC in the cloud and data management in the cloud.

With regard to the former, Riedel raises three important points:

There is an inflation in terms – what the term HPC means in context of statements like ‘HPC in clouds’ must be more carefully defined. Can we consider 32 cores used with parallel computing techniques really as HPC today? Where is the boundary of speaking about HPC that was traditionally more towards using large-scale systems being at the TOP500 list (e.g. towards 300.000 cores and more emerging when we look towards Exascale)?

Scientific ‘ready-to-run applications’ exist (e.g. blast, namd, etc.) and are useful in the cloud (e.g. scaling up to ~1024 cores or more generally a few hundreds of cores). However, many HPC applications are developed during the active use of HPC systems meaning that the code evolves over time (e.g. linking new libraries, testing new mathematical models, etc.) with numerous compiling steps and tunings in between the overall process. Applications that are re-submitted with the exact code (like blastp for example) are not the majority of those HPC applications scaling up to thousands of cores on HPC systems. The cloud approach is not completely suited for this ‘scientific application development process’ but has benefits for small-scale HPC applications that are mature.

Key ‘application enabling support’ is an important ingredient of getting large-scale HPC applications efficiently running. This means that the knowledge of scaling up with scientific codes based on numerical laws and mathematical models is typically provided by unique support staff being present at HPC centers (e.g. Simulation Labs in Juelich). It seems to be unclear who provides this unique knowledge about partly hardware-level tunings (e.g. network topologies, shapes, etc.) when public cloud systems are used. In many cases this is experience obtained by experts over years working closely with the systems with scientific applications. Is there a helpdesk at Amazon where you can ask for a scaling workshop or asking where in the HPC codes are bottlenecks in MPI communication (e.g. performance analysis teams)?

On the subject of data management in the cloud, Riedel points out that trust issues have yet to be completely worked out. He mentions using law and certifications as well as using a mix of private and public clouds to navigate security hurdles. Riedel also comments on the need for more comprehensive data transfer standards and a better understanding of data complexity.
