Intel and the University of Pennsylvania today announced a collaboration involving 29 international medical centers to train models to recognize brain tumors. The project is part of the Informatics Technology for Cancer Research (ITCR) program of the National Cancer Institute (NCI) and will use ‘federated learning architecture’ to mine relevant data while maintaining privacy and security. NIH is funding the effort with a three-year $1.2 million grant.
“It is widely accepted by our scientific community that machine learning training requires ample and diverse data that no single institution can hold. This year, the federation will begin developing algorithms that identify brain tumors from a greatly expanded version of the International Brain Tumor Segmentation (BraTS) challenge dataset,” said Spyridon Bakas of the Center for Biomedical Image Computing and Analytics (CBICA), UPenn, and principal investigator for the initiative. “This federation will allow medical researchers access to vastly greater amounts of healthcare data while protecting the security of that data.”
According to Intel’s announcement, made online, Penn Medicine and 29 healthcare and research institutions from the United States, Canada, the United Kingdom, Germany, the Netherlands, Switzerland and India will use federated learning, a distributed machine learning approach that enables organizations to collaborate on deep learning projects without sharing patient data. The collaboration’s approach was outlined in a paper roughly a year ago (Multi-institutional Deep Learning Modeling Without Sharing Patient Data: A Feasibility Study on Brain Tumor Segmentation) with Bakas as an author. Here’s the abstract:
“Deep learning models for semantic segmentation of images require large amounts of data. In the medical imaging domain, acquiring sufficient data is a significant challenge. Labeling medical image data requires expert knowledge. Collaboration between institutions could address this challenge, but sharing medical data to a centralized location faces various legal, privacy, technical, and data-ownership challenges, especially among international institutions.
“In this study, we introduce the first use of federated learning for multi-institutional collaboration, enabling deep learning modeling without sharing patient data. Our quantitative results demonstrate that the performance of federated semantic segmentation models (Dice = 0.852) on multimodal brain scans is similar to that of models trained by sharing data (Dice = 0.862). We compare federated learning with two alternative collaborative learning methods and find that they fail to match the performance of federated learning.”
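For readers unfamiliar with the Dice score cited in the abstract, it measures the overlap between a predicted segmentation and the expert-labeled one, ranging from 0 (no overlap) to 1 (perfect agreement). Below is a minimal sketch for flat binary masks. The function name and the empty-mask convention are assumptions made for illustration; the paper's own evaluation, which covers multimodal brain scans, may be computed differently.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice = 2 * |pred AND truth| / (|pred| + |truth|) for binary masks.

    Illustrative helper only; masks are flat sequences of 0/1 values.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")

    # Count voxels marked as tumor in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total tumor voxels across the two masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


# Example: 2 overlapping voxels, 3 marked in each mask.
score = dice_coefficient([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
```

On that scale, the gap between the reported federated (0.852) and data-sharing (0.862) results is 0.010.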
Federated learning was first introduced by Google in 2017. Here’s a description from Google’s blog introducing the work:
“Standard machine learning approaches require centralizing the training data on one machine or in a datacenter. And Google has built one of the most secure and robust cloud infrastructures for processing this data to make our services better. Now for models trained from user interaction with mobile devices, we’re introducing an additional approach: Federated Learning.
“Federated Learning enables mobile phones to collaboratively learn a shared prediction model while keeping all the training data on device, decoupling the ability to do machine learning from the need to store the data in the cloud. This goes beyond the use of local models that make predictions on mobile devices (like the Mobile Vision API and On-Device Smart Reply) by bringing model training to the device as well.
“It works like this: your device downloads the current model, improves it by learning from data on your phone, and then summarizes the changes as a small focused update. Only this update to the model is sent to the cloud, using encrypted communication, where it is immediately averaged with other user updates to improve the shared model. All the training data remains on your device, and no individual updates are stored in the cloud.”
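The download, local-training and averaging loop described in the quote can be sketched in a few lines of Python. This is a toy illustration, not Google's or the Intel/UPenn federation's implementation: it uses a tiny linear model, plain gradient descent, and an average weighted by each participant's dataset size, and it omits encryption and every other production concern. All function names and data here are hypothetical.

```python
from typing import List, Tuple

Weights = List[float]
Dataset = List[Tuple[List[float], float]]  # (features, target) pairs


def local_update(global_w: Weights, data: Dataset,
                 lr: float = 0.1, epochs: int = 5) -> Weights:
    """Start from the shared model and train on one site's private data."""
    w = list(global_w)
    for _ in range(epochs):
        for x, y in data:
            pred = sum(wi * xi for wi, xi in zip(w, x))
            err = pred - y
            # Gradient step for squared error on a linear model.
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Combine site models, weighting each by its number of examples."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def run_rounds(clients: List[Dataset], dim: int, rounds: int = 30) -> Weights:
    """Each round: sites train locally, then only model weights are pooled."""
    global_w = [0.0] * dim
    for _ in range(rounds):
        updates = [local_update(global_w, data) for data in clients]
        global_w = federated_average(updates, [len(d) for d in clients])
    return global_w


# Two hypothetical sites whose data follow y = 2*x1 - 1*x2.
site_a: Dataset = [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0)]
site_b: Dataset = [([1.0, 1.0], 1.0)]
shared_model = run_rounds([site_a, site_b], dim=2)
```

The raw examples in `site_a` and `site_b` never leave `local_update`; only the resulting weights reach the averaging step, which is the property that makes the approach attractive for patient data.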
Intel reports the new effort will leverage Intel hardware and software “to produce a new state-of-the-art AI model that is trained on the largest brain tumor dataset to date…” The subset of collaborating institutions expected to participate “in initiating the first phase of this federation includes the Hospital of the University of Pennsylvania, Washington University in St. Louis, the University of Pittsburgh Medical Center, Vanderbilt University, Queen’s University, Technical University of Munich, University of Bern, King’s College London and Tata Memorial Hospital.”
According to the American Brain Tumor Association (ABTA), nearly 80,000 people will be diagnosed with a brain tumor this year, more than 4,600 of them children. To train and build a model that detects brain tumors, and so could aid in early detection and better outcomes, researchers need access to large amounts of relevant medical data. However, it is essential that the data remain private and protected.