Recent tests at Clemson University achieved a 25 percent improvement in Apache Hadoop TeraSort run times by replacing the Hadoop Distributed File System (HDFS) with an OrangeFS configuration running on dedicated servers. Key components included an extension of Hadoop's “FileSystem” class and a Java Native Interface (JNI) shim to the OrangeFS client. Hadoop itself was not modified, and existing MapReduce jobs can use OrangeFS without changes.
Today Intel announced new offerings aimed at putting Lustre in front of enterprise users: usability features for Lustre, and a complete rip-and-replace for the native Hadoop file system designed to appeal to the HPC-oriented Hadoop set. We talked with Brent Gorda, former CEO and founder of Whamcloud, which Intel acquired just under a year ago, about how….
Pulling together massive amounts of data has opened profound opportunities for a wide array of analytics projects, but it has also created complications for those who want to gain actionable intelligence from it. While the “big data” movement is still unfolding, a number of companies have emerged to help simplify access and use, especially of unstructured information. HPC stalwart Platform Computing has entered the race to refine the handling of vast datasets, along with the management behind such operations, to stake its claim in this emerging space.