Python Snakes Its Way Into HPC

By Nicole Hemsoth

November 17, 2010

Interpreted programming languages usually don’t find too many friends in high performance computing. Yet Python, one of the most popular general-purpose interpreted languages, has garnered a small community of enthusiastic followers. True believers got the opportunity to hear about the language in the HPC realm in a tutorial session on Monday and a BoF session on Wednesday. Argonne National Lab’s William Scullin, who participated in both events, talked with HPCwire about the status of Python in this space and what developers might look forward to.

HPCwire: Python is not a language normally associated with high performance or scientific computing. What does it offer this user community that is not already provided by traditional languages like C and Fortran, or by other high-productivity interpreted languages like MATLAB?

William Scullin: In a way, Python’s growing adoption in the high performance and scientific computing space is a homecoming. Guido van Rossum originally began Python as a way of providing an administrative scripting language for the Amoeba distributed operating system. Then as now, it combines a simple syntax that is easy to learn and maintain with access to the same powerful libraries and function calls you would find in any C or Fortran implementation. While there has always been an emphasis on reducing the time it takes to perform a computation, Python has truly shone in improving scientific computing by taking the work out of programming and reducing the time to solution.

Often, projects fail when they try to be all things to all people. MATLAB, Mathematica, SPSS, and Maple are all very useful tools, in part because they are focused on meeting the needs of a well-funded community with very specific goals. Python, arising from a very diverse community that ranges from astrophysicists to game programmers to web designers to entry-level computer science students, has been very successful because of that diversity. The standard library has become amazingly extensive without becoming inconsistent.

Likewise, the amount of software that has come out of the community is amazing. Most of it is open source, and the vast majority follows the same coding guidelines as the core modules. This makes it possible, in less than an hour, to develop an interface to an embedded microcontroller that turns off the desk lamp when your simulation finally ends and automatically pushes the results to a web server (or, alternatively, turns on a coffee pot and resubmits your job when the simulation fails), all in one language.
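
As a rough illustration of the resubmit-on-failure idea above, the following sketch reruns a job until it exits cleanly. The names `run_command`, `run_with_resubmit`, and `make_flaky_job` are hypothetical and do not come from any library; the flaky job is a stand-in for a real simulation, and the sketch assumes that an exit status of 0 means success.

```python
import subprocess


def run_command(command):
    """Run an external command (a list of strings) and return its exit status."""
    return subprocess.run(command).returncode


def run_with_resubmit(job, max_attempts=3):
    """Call job() until it returns 0 or max_attempts is reached.

    Returns a (succeeded, attempts_used) tuple.
    """
    for attempt in range(1, max_attempts + 1):
        if job() == 0:
            return True, attempt
    return False, max_attempts


def make_flaky_job(failures_before_success):
    """Hypothetical stand-in for a simulation: fails a fixed number of times, then succeeds."""
    remaining = [failures_before_success]

    def job():
        if remaining[0] > 0:
            remaining[0] -= 1
            return 1  # nonzero exit status: the run failed
        return 0      # zero exit status: the run finished

    return job


# The stand-in job fails twice, so the third attempt is the one that succeeds.
succeeded, attempts = run_with_resubmit(make_flaky_job(2))
print(succeeded, attempts)
```

In practice the job would be something like `lambda: run_command(launch_command)`, where `launch_command` is whatever list of arguments starts the simulation on a given system.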

HPCwire: Obviously, performance is a driving issue in HPC. How is the issue of execution performance being addressed?

Scullin: Performance is a matter of perspective. A favorite maxim in the Python community is that the greatest performance improvement comes from going from the non-working state to the working state. A second maxim, from Knuth, is that premature optimization is the root of all evil. While a Python application may not execute as fast as one written in C, C++, or Fortran, its ease of use and gentle learning curve sharply improve overall time to solution. It’s a question of developer time versus compute time.

Sidestepping the issue, it’s ridiculously easy to extend Python with modules written in C, C++, and Fortran. It’s common in our community to use compiled high performance numerical kernels, then use Python to handle areas like I/O, workflow management, computational analysis, and steering. When areas become performance bottlenecks, they tend to be rewritten in C.
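
The effect of moving a bottleneck out of the interpreter can be sketched without writing any C. The example below is only an analogy, and it assumes the CPython reference interpreter, where the built-in `sum` and `operator.mul` are implemented in C. The second function pushes the accumulation loop into that compiled code, much as a hand-written extension module would. The function names are hypothetical, and the timings will vary from machine to machine.

```python
import operator
import timeit


def dot_interpreted(xs, ys):
    """Dot product with the multiply-add loop executed by the interpreter."""
    total = 0.0
    for x, y in zip(xs, ys):
        total += x * y
    return total


def dot_builtin(xs, ys):
    """Same dot product, with the loop running inside built-ins (C code in CPython)."""
    return sum(map(operator.mul, xs, ys))


# Integer-valued floats keep every partial sum exact, so both versions agree exactly.
xs = [float(i) for i in range(1000)]
ys = list(xs)

assert dot_interpreted(xs, ys) == dot_builtin(xs, ys)

# Time each version; absolute numbers depend on the machine and Python version.
for func in (dot_interpreted, dot_builtin):
    seconds = timeit.timeit(lambda: func(xs, ys), number=200)
    print(f"{func.__name__}: {seconds:.4f} s for 200 calls")
```

On CPython the built-in version typically runs faster, though by how much depends on the machine. A real extension module, or a library such as NumPy, would move far more of the work into compiled code than this small sketch does.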

Conversely, I’ve seen C and Fortran projects where code complexity has gotten in the way of maintenance and functionality, leading to thousands of lines of compiled code being replaced with fewer than a hundred lines of Python. In many ways, Python is coming to fulfill the roles that frameworks like Cactus and Samurai sought to fill at the start of the decade: letting scientists worry about their problems while letting the language and interpreter do the heavy lifting.

HPCwire: Do you think a compiled implementation of Python would be a step in the right direction?

Scullin: There will always be a place for the interpreted reference implementation, especially in development, but if a Python compiler comes along that provides better performance without compromising the language, I can’t see it finding much resistance.

That said, there are currently projects such as Unladen Swallow, PyPy, Stackless Python, Jython, and IronPython that provide alternatives to the CPython interpreter. Unladen Swallow, backed in part by Google, and PyPy both seek to close the performance gap with compiled languages. Unladen Swallow is particularly exciting because its back end is built on the Low Level Virtual Machine (LLVM), which is the basis for multiple compilers including Clang, currently the default compiler under Apple’s OS X. This makes a Python compiler more a matter of when than if.

HPCwire: Can you describe some of the more important Python initiatives — language extensions, libraries, tools, etc. — that are aimed at the HPC domain?

Scullin: I cannot speak highly enough of NumPy, which is almost the Swiss army knife of Python for scientific and high performance computing. It’s been under active development for years now with each release providing better performance, automatic integration of popular high performance libraries like BLAS and LAPACK, more features, and greater portability. NumPy is further extended by SciPy, which provides additional tools and lab kits addressing almost every science domain.

Likewise, I think very highly of mpi4py, PyMPI, PyCUDA and its sister PyOpenCL, petsc4py, and PyTrilinos. All of these keep improving the options we have to accelerate our code using the very same tools and interfaces that are available through traditional compiled languages with none of the complexity.

HPCwire: Are there vendors out there with commercially-supported solutions?

Scullin: Indeed, and more importantly, most of them are active contributors to and supporters of the Python community. I can no longer count the number of consulting firms that provide Python solutions. It’s also been very encouraging seeing vendors add Python support to their products. Two companies well known in the HPC space, Rogue Wave and ParaTools, have both been very responsive.

Rogue Wave has provided access to its IMSL mathematical libraries via PyIMSL. Furthermore, the company has brought a number of people into the Python community via PyIMSL Studio, which it markets officially as a prototyping tool. I’ve encountered PyIMSL Studio users so happy with their prototype Python applications that they ran with the Python code as production code. I should also mention that while the TotalView debugger is not officially a Python tool, it’s seen a lot of use by Python HPC users, and it will be interesting to see where it goes since Rogue Wave’s acquisition of Acumem.

ParaTools, a major contributor to the TAU Performance System and a leading consultant in the area of parallel and high performance codes, has done a very good job of adding Python support to TAU. I have used their tools with C, Fortran, and Python, and I can say without hesitation that their support has been helpful and responsive regardless of language.

While not directly in the HPC market, Enthought deserves special mention. They host an array of Python projects with engineering and science applications. They provide a commercial packaging of the Python interpreter with commonly used libraries and utilities, along with technical support, as the Enthought Python Distribution. Most of all, they are active developers of NumPy and SciPy. Without their support and involvement, I am not sure that NumPy would have come together as nicely as it has.

I’ll also be interested to see what the future holds for MBA Sciences’ SPM.Python, a relatively new toolkit for bringing parallelism into serial Python programs. I’ll be keeping a close watch on PiCloud, a firm that provides an amazingly easy-to-use cloud computing platform that makes running Python codes on a compute cloud ridiculously easy. PiCloud users have their computations offloaded without making any serious code changes, being involved in any aspect of setting up a cloud infrastructure, or doing any server management. They’ve seriously made it as simple as coding and running.

Finally, though it hasn’t been making a lot of noise lately, NVIDIA has been putting effort behind Copperhead, which, while not a complete Python implementation, allows for the rapid development of CUDA kernels in Python-like code.

HPCwire: Do you think most uses of Python in HPC will eventually involve either integration with C or Fortran or source code translation to those languages?

Scullin: I believe that HPC users will continue to choose the best possible tool to address a need in a given situation. Python is flexible enough that there will be continued integration with C, Fortran, and other languages. At the same time, interpreter performance is being rapidly addressed, and the issues that come with translating code into C and Fortran make that sort of project less attractive to active Python developers. What will be interesting to watch is how codes written in a mix of C, C++, Fortran, Python, and other languages perform and evolve as the LLVM platform continues to mature.

HPCwire: Can you point to any successful case studies or projects where Python has been employed in this arena?

Scullin: At Argonne, we are involved in the development of GPAW, a density-functional theory Python code based on the projector-augmented wave method. Originating out of an international collaboration, it is a mix of C and Python, with the vast majority of the code being Python. It has been run at scale successfully and routinely on our Blue Gene platform. While porting any application to platforms like the Cray XT series or the Blue Gene is an interesting exercise in computer science, it’s far more remarkable that the performance has been on par with what I’ve seen in C or C++ codes. Moreover, it is being used to produce reliable data that goes into publications.

The other community that a lot of people think of when looking for successful Python applications in the HPC space is bioinformatics. While I’ve not been involved with many bioinformatics codes, the last four or five years have seen a rising number of chemists and biologists appearing on Python-related mailing lists and at conferences discussing how they have been using Python to power their science. While Perl still holds sway in the field, Python is quickly becoming almost as popular.

HPCwire: For those HPC developers interested in learning more about what’s available in the Python ecosystem, can you point to some resources they could tap into?

Scullin: Depending on their particular interests, one of the best places to start is by visiting www.scipy.org. From there, you can find links to numerous mailing lists, information about conferences, code recipes, documentation, and much more. In the Chicago and Bay Areas there are very active Python users groups with sizable memberships with an interest in HPC and scientific computing. Finally, given Python’s ease of use, one of the best things you can do is to spend an afternoon with the interpreter, simply playing with code and seeing what the language can do for you without any effort. The joy of doing powerful things with simple code is one of the most admirable traits of the language.
