During a special address at ISC today, Ian Buck, vice president and general manager of Accelerated Computing at Nvidia, shared promising news for the future of neuroscience: Researchers at King's College London have curated the largest database of synthetic brain images in the world using Nvidia's Cambridge-1 supercomputer and artificial intelligence.
The database contains 100,000 images of brains and is being made freely available to healthcare researchers to advance cognitive disease research. The images were curated and donated by Jorge Cardoso, a senior lecturer and researcher at King’s College, and a founding member of MONAI, an open-source AI framework for healthcare born from a collaboration with Nvidia.
“In the past, many researchers didn’t want to work in healthcare because they couldn’t get good data, but now they can,” said Cardoso.
Synthetic data, or data generated by computer simulations as an alternative to real-world data, is quickly becoming ubiquitous in AI modeling because real-world data is difficult to obtain for some use cases. Medical images are especially tough to curate: patient privacy is a concern when using actual scans, and the patient demographics of a particular hospital are not necessarily representative of the broader population.
Cardoso's realistic 3D brain images, whether of male or female, young or old subjects, can be made to order depending on research needs. Though the images are simulated, the researchers assert they look and behave just like actual brain scans, thanks to highly trained algorithms.
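The "made to order" idea is, in essence, conditional generation: a model draws a sample whose statistics depend on requested attributes such as sex and age. The toy sketch below illustrates the pattern with a single intensity value standing in for an image; the cohort statistics are invented for the example and do not reflect the team's actual models.

```python
import random

# Toy stand-in for a conditional generative model: the "image" is a
# single intensity value whose distribution depends on the requested
# attributes. The means and spreads below are invented for illustration.
COHORT_STATS = {
    ("male", "young"):   (100.0, 5.0),
    ("male", "old"):     (90.0, 6.0),
    ("female", "young"): (98.0, 5.0),
    ("female", "old"):   (88.0, 6.0),
}

def sample_to_order(sex: str, age_group: str, n: int, seed: int = 0) -> list[float]:
    """Draw n synthetic samples conditioned on the requested cohort."""
    mean, std = COHORT_STATS[(sex, age_group)]
    rng = random.Random(seed)
    return [rng.gauss(mean, std) for _ in range(n)]

# Request a cohort "to order": 1,000 synthetic samples for older females.
cohort = sample_to_order("female", "old", n=1000)
```

A real image model conditions a deep generative network on these attributes rather than looking up a table, but the interface idea is the same: attributes in, matching synthetic data out.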
This research began as a project to identify anomalous structures in brain images that could indicate disease. Cardoso's team trained the AI models by first showing them real-world images of healthy brains, then following up with images of unhealthy ones. They also taught the models the differences between older and younger brains, a step that involved generating synthetic images. Using larger and larger datasets enhanced image fidelity, and the models were optimized until they were just as good at predicting outcomes as real images, according to Nvidia.
“We realized the models had learned the distribution of brain types, so we didn’t need the dataset anymore, it was part of the model,” Cardoso said.
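The amortization Cardoso describes, where a trained model internalizes the data distribution so the original dataset is no longer needed, can be shown with a deliberately simple 1D example: fit the distribution's parameters, discard the data, and keep sampling. This is a conceptual sketch with invented numbers, not the team's actual method (their models are deep generative networks, not a single Gaussian).

```python
import random
import statistics

# Toy "training set" standing in for real scans (invented numbers).
rng = random.Random(42)
real_data = [rng.gauss(70.0, 8.0) for _ in range(5000)]

# "Training": estimate the distribution's parameters from the data.
mu = statistics.fmean(real_data)
sigma = statistics.stdev(real_data)

# The dataset can now be discarded; the parameters ARE the model.
del real_data

def generate(n: int, seed: int = 1) -> list[float]:
    """Sample fresh synthetic points from the learned distribution."""
    sampler = random.Random(seed)
    return [sampler.gauss(mu, sigma) for _ in range(n)]

# Generate more synthetic data than we ever had real data.
synthetic = generate(10000)
```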
The engine that generates the synthetic brain data is Nvidia's Cambridge-1 supercomputer, which brings substantial computing power to bear on each image's 16 million 3D pixels (voxels). Cambridge-1 is built on 80 DGX A100 systems with 640 Nvidia A100 Tensor Core GPUs, BlueField-2 DPUs, and Nvidia HDR InfiniBand networking.
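The 16 million voxel figure corresponds to a cube roughly 256 voxels on a side (256^3 = 16,777,216). The quick arithmetic below, which assumes a 256^3 volume stored as 32-bit floats (an assumption; the article does not state the actual resolution or data layout), shows why each scan is a nontrivial load even before any model touches it.

```python
side = 256                    # assumed voxels per edge of a cubic volume
voxels = side ** 3            # 256^3 = 16,777,216, the "16 million 3D pixels"
bytes_per_voxel = 4           # assuming 32-bit floats (not stated in the article)
mebibytes = voxels * bytes_per_voxel / 2**20

print(f"{voxels:,} voxels, {mebibytes:.0f} MiB per volume")
# -> 16,777,216 voxels, 64 MiB per volume
```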
MONAI is an open-source deep learning framework built on PyTorch and Ignite. It provides domain-specific data loaders, metrics, GPU-accelerated transforms, and an optimized workflow engine. The work was also supported by other Nvidia software, including the CUDA Deep Neural Network library (cuDNN) and the Nvidia Omniverse simulation platform. Thanks to the acceleration this software stack affords, hundreds of AI models were trained in weeks rather than months, and model accuracy was greatly improved through hyperparameter tuning, as reported by Nvidia.
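MONAI's domain-specific transforms are typically chained into a pipeline with a Compose-style object. The sketch below mimics that pattern in plain Python so it runs without MONAI installed; the two toy transforms (intensity rescaling and clipping) operate on a flat list standing in for a volume, whereas MONAI's real transforms work on tensors with metadata and a much richer API.

```python
from typing import Callable

class Compose:
    """Minimal stand-in for a MONAI-style transform chain: apply each
    transform in order, feeding one transform's output to the next."""
    def __init__(self, transforms: list[Callable]):
        self.transforms = transforms

    def __call__(self, data):
        for transform in self.transforms:
            data = transform(data)
        return data

# Invented toy transforms; a flat list stands in for a 3D volume.
def scale_to_unit(volume: list[float]) -> list[float]:
    """Rescale intensities so the minimum maps to 0 and maximum to 1."""
    lo, hi = min(volume), max(volume)
    return [(v - lo) / (hi - lo) for v in volume]

def clip(volume: list[float], lo: float = 0.1, hi: float = 0.9) -> list[float]:
    """Clamp every intensity into [lo, hi]."""
    return [min(max(v, lo), hi) for v in volume]

pipeline = Compose([scale_to_unit, clip])
out = pipeline([12.0, 55.0, 200.0, 80.0])
```

The design choice being illustrated is composability: preprocessing is declared once as a list of steps and then applied uniformly to every image, which is what lets GPU-accelerated transforms slot into the same pipeline as plain ones.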
A national repository, Health Data Research UK, will host the 100,000 images, and Cardoso hopes to share his AI models for future use as well. His team is also investigating how these models can be applied and optimized for other parts of the human anatomy and other types of medical imagery.