DP-FEDAW: FEDERATED LEARNING WITH DIFFERENTIAL PRIVACY IN NON-IID DATA
DOI:
https://doi.org/10.29121/ijetmr.v10.i5.2023.1328
Keywords:
Federated Learning, Non-IID Data, Differential Privacy, Convergence
Abstract
Federated learning lets many users collaboratively train a machine learning model while their data never leaves their own devices. Under heterogeneous data, however, the global model converges slowly, and the exchanged model parameters can leak information. To address both problems, this paper proposes a federated weighted average algorithm with differential privacy (DP-FedAW), which targets the security and convergence of federated learning on non-independent and identically distributed (Non-IID) data. First, DP-FedAW quantifies the degree of Non-IID in each user's dataset and adjusts that user's aggregation weight accordingly, which alleviates the convergence problems caused by differences between Non-IID datasets during training. Second, a privacy-preserving federated weighted average algorithm is designed so that the model parameters satisfy differential privacy. Theoretical analysis indicates that the algorithm provides privacy protection during training while accelerating model convergence. Experiments show that the algorithm converges faster than the federated averaging algorithm. In addition, as the privacy budget increases, the model's accuracy gradually approaches that of the noise-free model while the privacy guarantee is maintained. This study provides a reference for securing model parameters and improving the convergence rate of federated learning on Non-IID data.
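The two steps described in the abstract can be illustrated with a minimal sketch. The abstract does not state how DP-FedAW measures the degree of Non-IID, how it maps that degree to a weight, or how its noise is calibrated, so everything below is an assumption made for illustration: the degree is taken to be the total variation distance between a client's label histogram and the global one, the weight is proportional to one minus that degree, and privacy is approximated by norm clipping followed by Gaussian noise with a free parameter `noise_std`. All function names are hypothetical.

```python
import math
import random


def label_histogram(labels, num_classes):
    """Return the normalized label frequencies of one client's dataset."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    return [c / len(labels) for c in counts]


def non_iid_degree(client_hist, global_hist):
    """Assumed measure: total variation distance, a value in [0, 1]."""
    return 0.5 * sum(abs(p - q) for p, q in zip(client_hist, global_hist))


def aggregation_weights(degrees):
    """Assumed rule: clients with more skewed data get smaller weights."""
    raw = [1.0 - d for d in degrees]
    total = sum(raw)
    if total <= 0.0:
        # Every client is maximally skewed; fall back to uniform weights.
        return [1.0 / len(degrees)] * len(degrees)
    return [r / total for r in raw]


def clip(update, clip_norm):
    """Scale an update so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, clip_norm / norm) if norm > 0.0 else 1.0
    return [x * scale for x in update]


def dp_weighted_average(updates, weights, clip_norm, noise_std, rng):
    """Weighted average of clipped updates plus Gaussian noise.

    noise_std is a free parameter here; calibrating it to a specific
    privacy budget is not shown in this sketch.
    """
    dim = len(updates[0])
    aggregate = [0.0] * dim
    for update, weight in zip(updates, weights):
        clipped = clip(update, clip_norm)
        for i in range(dim):
            aggregate[i] += weight * clipped[i]
    return [a + rng.gauss(0.0, noise_std) for a in aggregate]


if __name__ == "__main__":
    rng = random.Random(0)
    global_hist = [0.5, 0.5]
    client_labels = [[0, 0, 1, 1], [0, 0, 0, 1]]
    degrees = [
        non_iid_degree(label_histogram(labels, 2), global_hist)
        for labels in client_labels
    ]
    weights = aggregation_weights(degrees)
    updates = [[0.2, -0.1], [0.4, 0.3]]
    print(dp_weighted_average(updates, weights, clip_norm=1.0,
                              noise_std=0.01, rng=rng))
```

In this sketch a smaller `noise_std` stands in for a larger privacy budget, so the aggregate moves toward the plain weighted average, which mirrors the trend the abstract reports for accuracy.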
License
Copyright (c) 2023 Shuhui Wu, Tan Qingjie, Wang Bin, Yu Hongfeng, Qian Yaguan, Tao Yuanhong
This work is licensed under a Creative Commons Attribution 4.0 International License.