September 13, 2011
Health care analytics is an emerging application area that promises to help cut costs and improve patient outcomes. Reaching that goal, though, requires sophisticated software that can mimic some of the intelligence of real live physicians. At Lund University and Skåne University in Sweden, researchers are attempting to do just that by building a model of heart-transplant recipients and donors to improve survival times.
The so-called "survival model" is designed to discover the optimal matches between recipients and donors for heart transplants. It takes into account such factors as age, blood type (both donor and recipient), weight, gender, and the time during a transplant when there is no blood flow to the heart. Just analyzing those six variables leads to about 30,000 distinct combinations to track. When you want to match tens of thousands of recipients and donors across that spread of combinations, you need a rather sophisticated software model and some serious computing horsepower.
To build the application, the Lund researchers used MATLAB and a set of related MathWorks libraries, namely the Neural Network Toolbox, the Parallel Computing Toolbox, and the MATLAB Distributed Computing Server. With that, they built their predictive artificial neural network (ANN) models, in this case, a simulation that predicts survival rates for heart transplant patients based on the suitability of the donor match. The ANN models are "trained" using donor and recipient data encapsulated in two databases: the International Society for Heart and Lung Transplantation (ISHLT) registry and the Nordic Thoracic Transplantation Database (NTTD).
The key software technology for the ANN application is MathWorks' Neural Network Toolbox. The package contains tools for designing and simulating neural networks, which can be used for artificial intelligence-type applications such as pattern recognition, quantum chemistry, speech recognition, game-playing and process control. These types of applications don't lend themselves easily to the type of formal analysis done in traditional computing.
For the ANN models, training involves correlating donor and recipient data, such that the risk factors are weighted accurately. If done correctly, the simulations can become adept at associating these factors with the heart transplant survival rates. In this case, the results from the simulations were used to pick out the best and worst donors for any particular recipient.
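The selection step described above, picking the best and worst donors for a given recipient, can be sketched in a few lines of Python. This is only an illustration, not the Lund team's code: `rank_donors`, `toy_predict_survival`, the dictionary fields, and the penalty numbers are all hypothetical stand-ins, and the toy scoring function has no medical meaning. In the real application, the trained ANN would take the place of the scoring function.

```python
def rank_donors(recipient, donors, predict_survival):
    """Score every donor for one recipient and sort from best to worst.

    predict_survival is any callable (recipient, donor) -> number, where a
    higher number means a longer predicted survival time.
    """
    scored = [(predict_survival(recipient, donor), donor) for donor in donors]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


def toy_predict_survival(recipient, donor):
    """Placeholder scorer (an assumption for illustration, NOT a medical model)."""
    score = 10.0
    score -= 0.1 * abs(recipient["age"] - donor["age"])
    if recipient["blood_type"] != donor["blood_type"]:
        score -= 2.0
    return score


recipient = {"age": 50, "blood_type": "A"}
donors = [
    {"id": "d1", "age": 30, "blood_type": "A"},
    {"id": "d2", "age": 48, "blood_type": "A"},
    {"id": "d3", "age": 52, "blood_type": "O"},
]

ranking = rank_donors(recipient, donors, toy_predict_survival)
best_donor = ranking[0][1]
worst_donor = ranking[-1][1]
```

Passing the predictor in as a parameter keeps the ranking logic independent of whichever model produces the scores.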
The ultimate goal is to determine the mean survival times after transplantation for waiting recipients, so that doctors can make the best possible decision with regard to matches. In the research study, they analyzed about 10,000 patients who had already received transplants in order to verify the accuracy of the algorithms.
What they found was that the ANN models could raise the five-year survival rate by 5 to 10 percent compared to the traditional selection criteria used by practicing physicians. Perhaps more importantly, using a randomized trial based on preliminary results, approximately 20 percent more patients would be considered for transplantation under these models, says Dr. Johan Nilsson, Associate Professor in the Division of Cardiothoracic Surgery at Lund University.
Because of the combinatorial load of the recipient-donor variables, the models are very compute-intensive. On a relatively small cluster, the MATLAB-derived ANN simulation took about five days. That was significantly better than the open source software packages (R and Python) they started out with. Under that environment, runs took about three to four weeks and were beset with crashes and inaccurate results.
To run the simulation, the researchers used a nine-node Apple Xserve cluster (which includes a head node and a filesharing node), along with 16 TB of disk, all lashed together with a vanilla GigE network. Memory size on the nodes ranged from 24 to 48 GB. According to Nilsson, with the latest MATLAB configuration, they use 64 CPUs to run the ANN simulation.
Nilsson, who is a physician, programmed the application himself, noting that the MATLAB environment was easy to set up and use, adding there was no need for deep knowledge of parallel computing. The biggest roadblock he encountered was the need to customize an error function (the MATLAB Neural Network Toolbox does not have any cross-entropy error routine). There were also some problems encountered in setting up the Xserve cluster, but once they replaced Apple's Xgrid protocol with the MATLAB Distributed Computing Server, many of those problems disappeared.
The Apple Xserve cluster is not exactly state of the art for high performance computing these days. Presumably with a late-model HPC setup, they could cut the five-day turnaround time for the simulation even more, which would speed up the research even further.
In the short term, the Lund and Skåne team intends to continue to optimize the software and explore other solutions like regression tree and logistic regression algorithms, as well as support vector machines. In parallel, they want to start transitioning the technology into a clinical setting.
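For readers unfamiliar with the term, a cross-entropy error function of the kind Nilsson had to write himself can be sketched as follows. The article does not describe his actual routine, so this is the textbook binary form written in plain Python rather than MATLAB; the function names and the clipping constant `eps` are assumptions made for the example.

```python
import math


def binary_cross_entropy(targets, predictions, eps=1e-12):
    """Mean binary cross-entropy between 0/1 targets and predicted probabilities."""
    if len(targets) != len(predictions):
        raise ValueError("targets and predictions must have the same length")
    total = 0.0
    for t, p in zip(targets, predictions):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never sees 0
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(targets)


def binary_cross_entropy_grad(targets, predictions, eps=1e-12):
    """Derivative of the mean loss with respect to each predicted probability."""
    n = len(targets)
    grads = []
    for t, p in zip(targets, predictions):
        p = min(max(p, eps), 1.0 - eps)
        grads.append((p - t) / (p * (1.0 - p) * n))
    return grads


# A confident correct prediction is penalized far less than a confident wrong one.
loss_good = binary_cross_entropy([1, 0], [0.9, 0.1])
loss_bad = binary_cross_entropy([1, 0], [0.1, 0.9])
```

A custom error routine for a neural network typically needs both pieces: the loss value itself and its derivative, which the training algorithm uses to adjust the weights.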
According to Nilsson, once they've fully cooked the models, they can do away with the high performance computing environment. "In a future clinical setting," he says, "the application could be used on any desktop computer, and the matching process will take only seconds to a couple of minutes."