Enabling Research with MATLAB on the TeraGrid

By Nicole Hemsoth

October 18, 2010

Rajesh Bhaskaran at Cornell’s Space Systems Design Studio CUSat Satellite Project is leading a multi-year effort to create and deploy an autonomous in-orbit inspection satellite system using a MATLAB-based simulation.

Meanwhile, Ricky Harjanto at UC San Diego’s Cartilage Tissue Engineering Lab is using MATLAB to examine how the shape of mouse femurs changes during postnatal development, applying statistical shape modeling techniques to characterize variation at different stages of growth.

At the same time, Harshal Mahajan at the University of Pittsburgh’s NSF Quality of Life Technology Center is modeling power wheelchair driving to identify techniques that improve mobility for the many thousands who rely on safe, effective wheelchairs. Mahajan’s code uses MATLAB’s System Identification Toolbox to build models from the wealth of driving data collected.

Beyond using MATLAB as their primary tool, these and other researchers have something else in common: they are all using Cornell University’s MATLAB on the TeraGrid experimental computing resource, which delivers fast results to their desktops in an operating environment they are already comfortable with.

High-Level Programming for the Non-Programmer

MATLAB is ubiquitous in scientific and large-scale computing, with estimates of its user base closing in on one million researchers who use the tool for a wide variety of technical computing applications. Beyond technical applications, it is also being deployed to manipulate data gathered from a range of scientific instruments, including satellites, telescopes and sensors.

There are clear incentives to deliver easily accessible software and computational resources to a large number of scientific users. Doing so has been a goal of many universities and national labs from the grid era to the present. It has also been an aim of the National Science Foundation, one of a handful of funding sources for these types of projects, so it is not difficult to see why the agency took an interest when Cornell stated it could deliver MATLAB and high-performance computing to more researchers.

As Robert Buhrman, Senior Vice Provost for Research at Cornell, stated, “MATLAB on the TeraGrid will help enable a broader class of researchers who are well-versed in MATLAB to reduce the time to solution in a scalable manner without having to become parallel programming experts.” It is this reduced time to results and mitigation of programming challenges that make this an attractive option, and one with direct results, judging from Cornell’s long list of research projects both pending and underway on the MATLAB on the TeraGrid resource.

Part of the appeal for researchers is a gentler computational learning curve. Access to the 512-core resource does not require understanding of any particular operating system, MPI library, or batch scheduler. By using the Parallel Computing Toolbox and the MATLAB Distributed Computing Server to reach the resource from their desktops and the TeraGrid science gateways, TeraGrid users gain access to high-performance equipment without some of the programming hassles they used to encounter on a regular basis. In other words, it lets researchers focus on their research problems rather than forcing them to become, by necessity, experts in parallel programming.

The Partnership to Bring MATLAB to TeraGrid

Cornell University, in partnership with Purdue University, received an NSF grant to deploy MATLAB on the TeraGrid as what is currently deemed an experimental resource. Because MATLAB is such an important tool for complex data analysis for many TeraGrid users, offering it as a parallel resource could further expand access to high-performance computing for researchers.

The goal of the partnership between the universities and the NSF is to provide “seamless parallel MATLAB computational services running on Windows HPC Server 2008 to remote desktop and Science Gateway users with complex analytic and fast simulation requirements.”

In a recent interview, David Lifka, director at the Cornell Center for Advanced Computing, noted that the NSF funding was in part to provide staff at Cornell who would develop software allowing MATLAB clients on any platform (Windows, Linux, Mac) to connect seamlessly to the experimental resource at Cornell and run jobs in parallel. Users would then get their results back on their desktop via the Web interface without needing to learn a new batch system or a new programming model. As Lifka explained, “Basically, once the users know MATLAB, they can use parallel MATLAB directly from their host client.”

The NSF also set aside funding for staff at Purdue University, who were tasked with enabling the same sort of connectivity via their science gateway. Purdue has a software framework for building scientific gateways called HUBzero, which has been rising in popularity as more disciplines create domain-specific gateways of their own to share and augment research projects.

On the hardware and software level, Cornell’s cluster is not a “tricked out” resource by any means. The Dell PowerEdge HPC cluster is not a gigantic system; there are no special interconnects and it is not running any specialized, customized software. One look at the specs reveals that everything it runs is off the shelf, including the Microsoft Windows HPC scheduler and the standard version of the MathWorks software.

Lifka stated that the only customized part is the software interface that users install on their MATLAB client, which handles the secure communication with the cluster to submit jobs.

The resource itself is modest, although the team hopes that it will eventually grow after proven success with the MATLAB on TeraGrid project. Current wait times are still an issue; this is not the instant-run access that some HPC-as-a-service providers from the “outside world” can deliver. The team publishes the current wait times, which generally run between three and four days, give or take.

Opening Doors to Discovery

MATLAB is in such wide use across disciplines because it allows researchers to focus on their immediate discipline-specific questions without needing to become advanced programmers. It is generally perceived as being far more compact for scientific and mathematical uses than Fortran or C, and for this reason it has become the most comfortable environment for many in academia, engineering and beyond. By delivering it to a larger number of users, Cornell, Purdue and TeraGrid are helping to advance scientific discovery and to ease access for many researchers.

“One of the beauties of MATLAB is that it’s such a broad tool that can be used across disciplines and that was the key thing we felt was important — and why we wanted to do this project with the NSF,” said Lifka. “The MathWorks’ MATLAB is used across business, academia and in national labs because it works and because it doesn’t require a steep learning curve. If you know your science and you know your MATLAB, you can get a lot done very quickly.”

Encouraging Broader Impact

Delivering parallel MATLAB as a resource for a broader class of researchers was part of what made the deal attractive to the National Science Foundation (NSF) as it examined the benefits of funding such a partnership. David Lifka, director at the Cornell Center for Advanced Computing, stated, “What we wanted to do and what the NSF wants to encourage is broader impact — bringing new users into the fold who need large-scale computing without the learning curve. We want to get them scaling their science up and hopefully, along the way, they ask some questions so we can continue to improve.”

The funding came from a Strategic Technologies and Cyberinfrastructure grant, in line with the NSF’s stated aim of bringing new resources to bear to encourage greater access to high-performance computing. The idea behind the project is to offer this as an experimental resource so that it can later be determined whether it belongs in the TeraGrid resource provider collection. As Lifka noted, “We’re hopeful that someday we will be part of this collection, but today we’re not.”

Additional support for the project came from Dell, Microsoft and The MathWorks, the maker of MATLAB. According to Lifka, this backing reflected the interest these stakeholders had in watching how utility computing could be made available and how the experimental resource might enable seamless access from the Web to the desktop.
