Privacy-Preserving Machine Learning
Distributed learning (sometimes known as federated learning) allows a group of independent data owners to collaboratively learn a model over their data sets without exposing their private data.
Our approach combines differential privacy with secure multi-party computation to both protect the data during training and produce a model that provides privacy against inference attacks.
We explore two popular methods of differential privacy, output perturbation and gradient perturbation, and advance the state-of-the-art for both methods in the distributed learning setting. In our output perturbation method, the parties combine local models within a secure computation and then add the required differential privacy noise before revealing the model. In our gradient perturbation method, the data owners collaboratively train a global model via an iterative learning algorithm. At each iteration, the parties aggregate their local gradients within a secure computation, adding sufficient noise to ensure privacy before the gradient updates are revealed. For both methods, we show that the noise can be reduced in the multi-party setting by adding the noise inside the secure computation after aggregation, asymptotically improving upon the best previous results. Experiments on real-world data sets demonstrate that our methods provide substantial utility gains for typical privacy requirements.
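To make the core idea concrete, here is a minimal NumPy sketch (no actual secure computation is performed; the helper names, the sensitivity bound, and the privacy parameters are placeholder assumptions, not the paper's implementation). It contrasts a baseline in which each of m parties perturbs its own model before sharing with adding a single noise draw after aggregation: since changing one record affects only one party's model, the sensitivity of the averaged model shrinks by a factor of m, and so does the required noise scale.

```python
import numpy as np

rng = np.random.default_rng(0)

def local_model(X, y, lam=0.01):
    # Ridge regression as a stand-in for each party's local ERM solution.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * len(y) * np.eye(d), X.T @ y)

def laplace_noise(scale, d):
    # Laplace mechanism noise calibrated to an L1 sensitivity bound.
    return rng.laplace(0.0, scale, size=d)

m, n, d = 10, 1000, 5            # parties, records per party, dimension
eps = 1.0                        # privacy budget (placeholder)
sensitivity = 1.0                # placeholder L1 sensitivity of one local model

true_w = rng.normal(size=d)
parties = []
for _ in range(m):
    X = rng.normal(size=(n, d))
    y = X @ true_w + rng.normal(scale=0.1, size=n)
    parties.append((X, y))

# Baseline: each party adds its own noise before its model is revealed,
# so every shared model carries a full-scale noise draw.
baseline = np.mean(
    [local_model(X, y) + laplace_noise(sensitivity / eps, d)
     for X, y in parties],
    axis=0)

# Noise after aggregation (what the secure computation would do): average
# first, then add one draw whose scale is reduced by the factor m.
aggregate = np.mean([local_model(X, y) for X, y in parties], axis=0)
improved = aggregate + laplace_noise(sensitivity / (m * eps), d)

print("baseline error:", np.linalg.norm(baseline - true_w))
print("improved error:", np.linalg.norm(improved - true_w))
```

The gradient perturbation method applies the same aggregation trick once per iteration: local gradients are summed inside the secure computation and a single noise draw protects the aggregate before the update is revealed.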
Code
https://github.com/bargavj/distributedMachineLearning
Papers
Bargav Jayaraman, Lingxiao Wang, David Evans, and Quanquan Gu. Distributed Learning without Distress: Privacy-Preserving Empirical Risk Minimization. In 32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montreal, Canada, December 2018. (PDF, 12 pages) (NeurIPS Paper Page)
Lu Tian, Bargav Jayaraman, Quanquan Gu, and David Evans. Aggregating Private Sparse Learning Models Using Multi-Party Computation. In Private Multi-Party Machine Learning (NIPS 2016 Workshop), Barcelona, Spain, 9 December 2016. (PDF, 6 pages)