Etnus LLC, makers of the TotalView debugger, have apparently decided to worship at the altar of name recognition. Last week, they relaunched the company under a new name, TotalView Technologies, using the moniker of their flagship debugger offering. The new company also announced TotalView Multi-Core Debugging Framework, a new workbench model that revolves around their core debugging technology. The framework is designed to integrate a set of debugger and analysis tools that will help developers deal with parallel programming applications targeted for multi-core architectures.
“For years our customers have recognized us as the TotalView company,” explains Rich Collier, CEO of TotalView Technologies. “So we just went with the flow. It also allows us to capitalize on the success of our flagship product and it ties in with our other new product that we're going to be coming out with underneath the TotalView umbrella.”
The TotalView debugger was originally developed at BBN (Bolt Beranek and Newman) for their Butterfly computer, a massively parallel system built in the 1980s. In its day, the Butterfly was a state-of-the-art machine, and was probably as challenging to program as IBM Cell-based systems are today. The Butterfly is no more, but the BBN ToolWorks group took the TotalView technology and spun off a new company, Etnus, in 1999. Over the next eight years, TotalView went on to become the most widely known commercial parallel debugger for high performance machines.
In the early part of the decade, the TotalView source code debugger was used mainly in government labs and supercomputer centers, where most of the HPC action was. But starting in 2001, Etnus began to tap into the growing number of commercial customers adopting parallel programming technologies.
“Today TotalView is used not just for high-end applications, but also for distributed, cluster, or any heavily multi-threaded codes,” says Collier.
It's being used to debug everything from nuclear weapons simulation programs to financial analysis codes to animation rendering workloads. With regard to animation, Collier says that their technology has become the de facto standard for all animation and digital content studios. Recently, Blue Sky Studios used TotalView during the production of the animated film “Ice Age.” Disney and Pixar are also using the debugger for their computer-generated animation projects.
TotalView is well known for supporting a wide variety of hardware and operating systems found on HPC platforms. The breadth of coverage includes Blue Gene/L (PowerPC) on SuSE Linux, Itanium 2 on HP-UX, SPARC on Solaris, and Intel x86 on Mac OS X, to name a few. Collier says they're also considering support for Microsoft's Windows Compute Cluster Server (CCS) platform. According to him, they've received a few requests from existing customers and have also talked to partners about it, but haven't made a commitment yet. No doubt they'll hop on board if and when CCS gains enough traction in the HPC market to make it worthwhile.
The new framework announced last week encompasses a set of five debugging and analysis components that can be incorporated into a workbench structure. The components include:
- TotalView Source Debugger — the core technology for source code debugging (just updated)
- MemoryScape Memory Debugger — for finding memory leaks or data corruption in dynamic (heap) memory
- Performance Analysis Tool Suite — for performance tuning
- Data-Centric Debugging Tool Suite — for detecting data errors in a multi-threaded environment
- Active Web Tool Debugging Suite — allows developers to debug in distributed Java and AJAX environments
The three tool suites won't be available until the second half of 2007. The other two components are available now and amount to a repackaging of existing offerings. Until now, TotalView bundled source code and memory debugging together. With memory debugging broken out into the MemoryScape product, it can now be purchased and licensed separately.
Although TotalView has few parallel debugger competitors (Allinea's Distributed Debugging Tool being one), a lot of users are still using GDB, or variations of it, to debug their HPC applications. In general, GDB does only serial debugging, so it does not offer the sophistication of an MPI- or OpenMP-aware parallel debugger. Some users prefer to go without debuggers altogether, inserting low-level printf calls into their code to output data or other application status information during test runs. Less sophisticated methods like printf can work for smaller codes, but once the application scales up to hundreds or thousands of processors, a higher-level view of the software is usually required.
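To make the scaling problem concrete, here is a minimal sketch of printf-style tracing. It is written in Python with threads standing in for parallel processes; it is not MPI code and has nothing to do with how TotalView works internally. The names `traced_worker` and `run` are hypothetical and chosen only for illustration. Each worker logs one line per loop iteration, which is readable for two workers and quickly becomes unmanageable for hundreds.

```python
import threading


def traced_worker(rank, data, log, lock):
    """Sum one worker's slice, recording a printf-style trace line per step."""
    partial = 0
    for value in data:
        partial += value
        # The equivalent of a printf call dropped into the loop body.
        with lock:
            log.append(f"[rank {rank}] value={value} partial={partial}")
    return partial


def run(num_workers, items_per_worker):
    """Start num_workers threads and return every trace line they produced."""
    log = []
    lock = threading.Lock()
    threads = []
    for rank in range(num_workers):
        start = rank * items_per_worker
        data = range(start, start + items_per_worker)
        t = threading.Thread(target=traced_worker, args=(rank, data, log, lock))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    # Two workers: a handful of lines that can be read by eye.
    small = run(num_workers=2, items_per_worker=3)
    print("\n".join(small))

    # Hundreds of workers: the same tracing yields thousands of lines,
    # and lines from different workers may be interleaved in any order.
    large = run(num_workers=200, items_per_worker=20)
    print(f"{len(large)} trace lines from 200 workers")
```

The volume of output grows with both the worker count and the number of traced statements, and the ordering of lines across workers is not guaranteed, which is roughly why a tool that can group and inspect processes collectively becomes attractive at scale.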
“What we're doing with this framework is offering a comprehensive, integrated set of software development tools that are designed to work together to provide multi-core debugging. Ultimately this will improve the programmer's productivity and keep the programmer sane.”
Selling productivity and sanity is a good strategy when you're talking about parallel debugging. Memory leaks, data corruption and logic errors all become much harder to find and fix when you start using hundreds or thousands of threads. TotalView Technologies plans to follow parallel architectures into the marketplace and provide the debugging and analysis tools that are bound to be needed by a new generation of applications.