DP-FEDAW: FEDERATED LEARNING WITH DIFFERENTIAL PRIVACY IN NON-IID DATA

Qingjie Tan; Bin Wang; Hongfeng Yu; Shuhui Wu; Yaguan Qian; Yuanhong Tao

doi:10.29121/ijetmr.v10.i5.2023.1328

Authors

Tan Qingjie School of Science, Zhejiang University of Science and Technology, Hangzhou 310023, Zhejiang, China
Wang Bin Network and Information Security Laboratory of Hangzhou Hikvision Digital Technology Co., Ltd., Hangzhou 310051, Zhejiang, China
Yu Hongfeng School of Science, Zhejiang University of Science and Technology, Hangzhou 310023, Zhejiang, China
Shuhui Wu School of Science, Zhejiang University of Science and Technology, 318 Liuhe Road, Hangzhou, Zhejiang 31002, P. R. CHINA
Qian Yaguan School of Science, Zhejiang University of Science and Technology, Hangzhou 310023, Zhejiang, China
Tao Yuanhong School of Science, Zhejiang University of Science and Technology, Hangzhou 310023, Zhejiang, China

Keywords:

Federated Learning, Non-IID Data, Differential Privacy, Convergence

Abstract

Federated learning lets multiple users collaboratively train a machine learning model while their data never leaves their devices. Under heterogeneous data, however, the global model converges slowly and model parameters may even leak. To address these issues, this paper proposes a federated weighted average with differential privacy (DP-FedAW) algorithm, which targets both the security and the convergence of federated learning on non-independent and identically distributed (Non-IID) data. First, DP-FedAW quantifies the degree of Non-IID of each user's dataset and adjusts each user's aggregation weight accordingly, alleviating the convergence problems that differences among Non-IID datasets cause during training. Second, a privacy-preserving federated weighted average algorithm is designed so that the model parameters satisfy differential privacy. In theory, the algorithm protects privacy during training while accelerating model convergence. Experiments show that it converges faster than the federated averaging algorithm. In addition, as the privacy budget increases, the model's accuracy gradually approaches that of the noise-free model while model security is still ensured. This study provides a reference for securing model parameters and improving the convergence rate of federated learning on Non-IID data.
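
The abstract does not give the formulas behind DP-FedAW, so the following is only a minimal sketch of the general idea it describes, not the authors' algorithm. It assumes, purely for illustration, that the degree of Non-IID is the total variation distance between a client's label distribution and a reference distribution, that aggregation weights are proportional to one minus that degree, that weights are fixed and public when sensitivity is computed, and that privacy comes from the classical Gaussian mechanism. All function names and parameters are hypothetical.

```python
import math
import random


def label_distribution(labels, num_classes):
    """Empirical class frequencies of one client's labels."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    return [c / len(labels) for c in counts]


def non_iid_degree(client_dist, reference_dist):
    """Assumed skew measure: total variation distance in [0, 1]."""
    return 0.5 * sum(abs(p - q) for p, q in zip(client_dist, reference_dist))


def aggregation_weights(degrees):
    """Hypothetical rule: weight proportional to (1 - degree), normalized."""
    raw = [max(1.0 - d, 1e-12) for d in degrees]
    total = sum(raw)
    return [r / total for r in raw]


def clip(update, clip_norm):
    """Scale an update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]


def gaussian_sigma(epsilon, delta, sensitivity):
    """Classical Gaussian-mechanism calibration, stated for 0 < epsilon < 1."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def dp_weighted_average(updates, weights, clip_norm, epsilon, delta, rng):
    """Weighted average of clipped updates plus Gaussian noise."""
    clipped = [clip(u, clip_norm) for u in updates]
    dim = len(clipped[0])
    aggregate = [sum(w * u[i] for w, u in zip(weights, clipped))
                 for i in range(dim)]
    # Assumption: weights are fixed and public, so replacing one client's
    # clipped update moves the aggregate by at most 2 * max(w) * clip_norm.
    sensitivity = 2.0 * max(weights) * clip_norm
    sigma = gaussian_sigma(epsilon, delta, sensitivity)
    return [a + rng.gauss(0.0, sigma) for a in aggregate]


# Toy usage: two clients, two classes, two-dimensional model updates.
rng = random.Random(0)
reference = [0.5, 0.5]
dists = [label_distribution([0, 1, 0, 1], 2),
         label_distribution([0, 0, 0, 1], 2)]
degrees = [non_iid_degree(d, reference) for d in dists]
weights = aggregation_weights(degrees)
noisy = dp_weighted_average([[3.0, 4.0], [0.3, 0.4]], weights,
                            clip_norm=1.0, epsilon=0.5, delta=1e-5, rng=rng)
```

In this sketch the client whose labels match the reference receives the larger weight, and the noise scale shrinks as epsilon grows, which is consistent with the abstract's remark that accuracy approaches the noise-free case as the privacy budget increases.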

