The Leading Source for Global News and Information Covering the Ecosystem of High Productivity Computing
April 11, 2008
In a few months, Bill Johnston of Lawrence Berkeley National Laboratory will step down as head of ESnet, the Department of Energy's international network that provides high bandwidth networking to tens of thousands of researchers around the world. In a career that began in the 1970s and has included seminal work in networking, distributed computing, the Grid, and even crossing paths with Al Gore, Johnston has had a hand in the development of many of the high performance computing and networking resources that are today taken for granted. And as he tells it, it all began with the brain of Berkeley Lab scientist Tom Budinger.
Berkeley Lab is now recruiting a new head of its ESnet Department (see the posting at http://jobs.lbl.gov/LBNLCareers/details.asp?jid=21495&p=1). Although he plans to officially retire from LBNL by June 1, Johnston is already planning how he'll spend his time -- doing pretty much what he does now: working, reading for both professional and personal interest, and traveling, but adjusting the ratio somewhat. Berkeley Lab's Jon Bashor managed to get an hour with Johnston to talk about his career, his accomplishments and his future plans.
Question: You've announced your plan to retire this year after 35 years at Department of Energy labs. How did you get started in your career?
Bill Johnston: When I was a graduate student at San Francisco State University, one of my professors spent her summers working on math libraries for the Star 100, which was CDC's supercomputer successor to the CDC 7600. Through this connection, I started taking graduate classes at the Department of Applied Sciences at Lawrence Livermore National Laboratory, then went to work full time in the atmospheric sciences division. There I worked on LIARQ, an air quality model that is still used by the San Francisco Bay Area Air Quality Management District (BAAQMD). Although the code was developed at Livermore, the BAAQMD couldn't run it there. So, I would bring it to LBNL to run on the Lab's CDC 7600 computer.
I began spending more and more time at the Berkeley Lab, and developed data visualization techniques that added a graphical interpretation interface to the code, so that its users had dozens of different ways of looking at the data. I went on to turn this work into a data visualization package and made it available to other users of the 7600, which was the main LBNL machine at the time. Through this work I met Harvard Holmes, then head of the graphics group. I also knew the head of the systems group and was offered jobs in each group. Something Harvard said led me to join the graphics group, which was a good decision because five years later the systems group had tanked because there was no new funding to replace the 7600 when it was retired.
Over the years, I took over the graphics group, and was also getting more involved in visualization of science data. As a result, we were often focused on large data sets. These data sets were often stored at remote sites, and accessing them led me into networking. In fact, as a result of some of this work, we set up the first remote, high performance network-based visualization demonstration at the Supercomputing conference in 1991. Working with the Pittsburgh Supercomputing Center (PSC), we combined the Cray Y-MP at PSC with a Thinking Machines CM2 in order to do the rendering -- the conversion of the data into a graphical representation -- fast enough for interactive manipulation. We -- mostly David Robertson -- split the code up to run part on the massively parallel CM2 and do the vector processing part on the Cray. The idea was to have the graphics workstation at SC91 in Albuquerque getting data from the supercomputers at PSC. Because high performance TCP/IP implementations weren't available, we partnered with Van Jacobson of LBNL and Dave Borman from Cray to provide a high-speed, wide area version of TCP for a Sun workstation at SC91 and for the Cray at PSC. I remember Van working on the Sun for 48 hours in order to get the two TCP stacks to work together. NSF ran a connection from its 45 Mb/s network backbone (which effectively was the Internet at the time) into the conference for the first time.
The demo was a volume visualization of Tom Budinger's brain, with the data from some of Tom's high-resolution MRI work. This was real-time visualization -- you could take it, grab it, rotate it. It all started with Tom's brain. [Note: Budinger is a physician and physicist who helped develop MRI and previously headed LBNL's Center for Functional Imaging.]
For me, Brian Tierney and David Robertson from LBNL, this was our introduction to high performance wide area networking. We got more and more involved with networking and graphics, and even worked with ESnet on several projects.
One of our projects was with the DARPA MAGIC gigabit network testbed, which included LBNL, SRI, University of Kansas, the Minnesota Supercomputer Center, the USGS EROS Data Center, and Sprint. We worked with Sprint to build the country's first 2.5 gigabit ATM network (ATM is a technology that is not used much anymore), linking Minneapolis with sites in Sioux Falls and in Overland Park and Lawrence, Kansas. Together with Brian Tierney and Jason Lee (both students of mine), we developed the Distributed-Parallel Storage System (DPSS) to drive an SRI-developed visualization application over the network with high-speed parallel data streams. This experiment made it clear that to get end-to-end high performance you had to address every component in the distributed system -- the applications, the operating system and network software, and the network devices -- all at the same time. This led directly to my interest in Grids. Interestingly, the ideas behind our work in DPSS also fed into the development of GridFTP, which is one of the most enduring Grid applications, and heavily used by the LHC [Large Hadron Collider] community to move the massive data of the CMS and ATLAS detectors around the world for analysis by the collaborating physics community.
Question: Can you elaborate more on your work with Grids?