Hundreds of scientists from the DZero collaboration at the Department of Energy's Fermi National Accelerator Laboratory are using the technology of the future to process particle physics data today. Using Grid computing, facilities in six countries around the globe have begun to provide computing power equivalent to 3,000 one-gigahertz Pentium III processors to crunch more experimental data than ever before. In six months, the computers will churn through 250 terabytes of data, enough to fill a stack of CDs as high as the Eiffel Tower.
“We're using the Grid to process three years' worth of data — 1 billion particle collisions — in six months,” said Fermilab guest scientist Daniel Wicke, on leave from the University of Wuppertal, Germany, who heads the reprocessing effort. “DZero has a long history of using computing resources from outside Fermilab, including a project in 2003 to send a much smaller amount of data off-site for reprocessing. We knew that this much bigger effort, remotely processing ten times more collisions than before using five times the number of computers, would be possible.”
As new data is recorded with the DZero detector at the Tevatron, the world's highest-energy particle accelerator, located in Batavia, Ill., it is processed into a form usable by physicists. The cluster of 1,000 computer processors dedicated to DZero computing at Fermilab is kept busy processing the newly acquired data.
“The DZero computer farm can process about four million events per day,” said Mike Diesburg, who manages the farm. “At Fermilab, we process data in real time, so even with no new data coming in it would take three years to reprocess three years' worth of data. To do it in six months we need to look for computing resources all over the world.”
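The scale of the task can be estimated from the figures quoted above. The following Python sketch is a rough illustration, not the collaboration's own calculation: it takes the 1 billion collisions, the three years of data taking and the six-month target from this release, and assumes an average year of 365.25 days. The variable and function names are hypothetical.

```python
# Figures quoted in the release.
TOTAL_EVENTS = 1_000_000_000   # "1 billion particle collisions"
RECORDING_YEARS = 3            # "three years' worth of data"
REPROCESSING_MONTHS = 6        # target duration of the reprocessing

# Assumption for this sketch: average calendar year length in days.
DAYS_PER_YEAR = 365.25


def events_per_day(total_events: float, days: float) -> float:
    """Average number of events handled per day over a period."""
    return total_events / days


recording_days = RECORDING_YEARS * DAYS_PER_YEAR
reprocessing_days = REPROCESSING_MONTHS / 12 * DAYS_PER_YEAR

# Average rate at which the events were originally recorded.
average_recording_rate = events_per_day(TOTAL_EVENTS, recording_days)

# Rate needed to get through the same events in six months.
required_rate = events_per_day(TOTAL_EVENTS, reprocessing_days)

# How much faster than real time the reprocessing must run.
speedup = required_rate / average_recording_rate

print(f"Average recording rate: {average_recording_rate:,.0f} events/day")
print(f"Required reprocessing rate: {required_rate:,.0f} events/day")
print(f"Speed-up over real time: {speedup:.1f}x")
```

Under these assumptions the reprocessing has to run about six times faster than the data was recorded, on top of the real-time processing that keeps the Fermilab farm busy.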
Stored data must be reprocessed once physicists and computer scientists have made significant advances in how it is processed. Researchers are constantly trying to optimize the software to process each collision event faster, and physicists' understanding of the complex DZero detector is also steadily improving.
“Our scientists are always thinking up new ideas: better ways to calibrate detectors or track particles,” said DZero spokesperson Jerry Blazey. “We wait until many of those ideas have been incorporated into the software and then do a reprocessing. The reprocessed data will improve the full physics program, including detection of top quarks and other elementary particles, and searches for the Higgs boson and new phenomena like supersymmetry.”
As each collision event is processed, the software pulls additional information from large databases, requiring several complex auxiliary systems to work well together at all times. The entire setup then has to be adapted to run at remote computing sites with many different environments and configurations. Researchers at Fermilab and the participating institutions have been working for almost a year to ensure that the current reprocessing runs smoothly.
“The reprocessing effort pushes the limits of our software and infrastructure so that we can get the most physics out of the data collected by the DZero detector,” said Dugan O'Neil of Simon Fraser University, a participant in the WestGrid collaboration. “The Grid allows DZero to make better use of remote human resources as well as computing power. Participating in the reprocessing is an important technical contribution for our group, and it also gives us the experience needed to figure out how to efficiently analyze data remotely.”
Canada's WestGrid, the University of Texas at Arlington, CCIN2P3 in Lyon, France, and FZU in the Czech Republic are the first collaborating sites remotely reprocessing DZero data. Computing centers and Grid projects at the University of Oklahoma, GridKa in Germany, and GridPP and PPARC in the United Kingdom will soon follow. Fermilab scientists hope to eventually add collaborating sites in Brazil, India, Korea and China.
Institutions that have not traditionally collaborated on the DZero experiment also contribute to the reprocessing. The University of Wisconsin is currently contributing computing power, and Fermilab resources primarily dedicated to the CMS experiment at the Large Hadron Collider (LHC) at CERN will soon begin reprocessing. Ultimately, researchers hope to use resources distributed over several international Grids, including the Open Science Grid and LHC Computing Grid.
The DZero experiment is a collaboration of about 650 scientists from over 80 institutions in the United States and 19 foreign countries. For a list of collaborating institutions, please visit the DZero Web site.
Fermilab is operated by Universities Research Association Inc., a consortium of 90 research universities, for the United States Department of Energy's Office of Science.