The ability to share and analyze data while protecting patient privacy is giving medical researchers a new tool: a technique known as "federated learning," which trains models on diverse data sets without pooling them in one place.
To that end, researchers at GPU leader Nvidia, working with a team at King's College London, came up with a way to use the company's "privacy-preserving" federated learning approach to train machine learning algorithms, specifically a neural network for a cancer research task known as brain tumor segmentation.
Nvidia’s is among a growing list of frameworks aimed at balancing patient privacy with the ability to share and analyze large data sets needed to train models.
Among the challenges faced by medical researchers is the inability to collect and share clinical data in centralized repositories like data lakes. That makes it harder to train models such as the deep convolutional networks used in Nvidia's approach.
“Federated learning sidesteps this difficulty by bringing code to the patient data owners and only sharing intermediate model training updates among them,” the researchers noted in a research paper presented this week at a medical technology conference in China.
The researchers used Nvidia’s V100 Tensor Core GPUs for model training and inference. The goal is deployment of a secured federated learning platform that would enable “data-driven precision medicine at large scale,” the company said in a blog post.
While aggregating updates yielded accurate models, the researchers found that a shared model could indirectly “leak” local training examples, thereby violating privacy restrictions on patient data. Hence, the researchers tested the feasibility of applying “differential-privacy” techniques within a federated learning framework.
Federated learning was applied to the problem of brain tumor segmentation, a technique whereby clinicians pinpoint tumors on MRI scans to inform diagnosis and treatment. The researchers used a dataset called BraTS to demonstrate their federated approach.
“The experimental results show that there is a trade-off between model performance and privacy protection costs,” the researchers noted.
“Federated learning allows collaborative and decentralized training of neural networks without sharing the patient data,” the researchers conclude in their paper. “Each node trains its own local model and, periodically, submits it to a parameter server. The server accumulates and aggregates the individual contributions to yield a global model, which is then shared with all nodes.”
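The train-locally, submit, aggregate, redistribute loop the researchers describe can be sketched in a few lines. This is a hypothetical illustration using a toy least-squares "model" as the local learner, not the paper's segmentation network; the function names and training setup are assumptions for demonstration.

```python
import numpy as np

def local_update(global_weights, local_data, lr=0.1):
    """One round of local training on a node's private data
    (here, a single gradient step on a least-squares objective)."""
    X, y = local_data
    grad = X.T @ (X @ global_weights - y) / len(y)
    return global_weights - lr * grad

def aggregate(client_weights):
    """Parameter server: accumulate and average the individual
    contributions to yield the new global model."""
    return np.mean(client_weights, axis=0)

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
# Each "node" (e.g., a hospital) holds its own private data;
# only model weights ever leave a node, never the raw examples.
nodes = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    y = X @ true_w + rng.normal(scale=0.01, size=50)
    nodes.append((X, y))

global_w = np.zeros(2)
for t in range(200):
    # Each node trains its own local model and submits it.
    updates = [local_update(global_w, data) for data in nodes]
    # The server aggregates and shares the global model back.
    global_w = aggregate(updates)
```

After enough rounds the global model fits the combined data even though no node ever saw another's examples, which is the core property the quote describes.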
While federated learning has proven promising as a way to balance privacy and access to training data, the researchers note their approach remains vulnerable to misuse, including the reconstruction of training data by a technique known as model inversion.
One countermeasure is injecting noise into the model training process, thereby distorting the updates seen by attackers while limiting the amount of detailed patient information shared among model nodes.
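A minimal sketch of that countermeasure, in the spirit of differentially private training: clip each node's update so any single contribution is bounded, then add Gaussian noise before it is shared. The function name, clip norm, and noise scale here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def privatize_update(update, clip_norm=1.0, noise_scale=0.5, rng=None):
    """Clip an update's norm, then inject Gaussian noise calibrated
    to the clipping bound, distorting what an attacker can recover
    from any single shared update."""
    rng = rng or np.random.default_rng()
    norm = np.linalg.norm(update)
    # Bound the influence of this node's data on the shared update.
    clipped = update * min(1.0, clip_norm / max(norm, 1e-12))
    # Noise proportional to the clip bound masks individual examples.
    return clipped + rng.normal(scale=noise_scale * clip_norm,
                                size=update.shape)

rng = np.random.default_rng(1)
raw_update = np.array([0.8, -0.3, 0.5])
shared_update = privatize_update(raw_update, rng=rng)
```

Larger noise scales give stronger privacy but degrade the aggregated model, which is the performance-versus-privacy trade-off the researchers report.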
Once secured, the researchers assert, their federated learning approach can help achieve “comparable” brain tumor segmentation performance without sharing patient data.