DESIGN AND OPTIMIZATION OF NONLINEAR ACTIVATION FUNCTIONS FOR ENHANCED NEURAL NETWORK PERFORMANCE
DOI: https://doi.org/10.29121/shodhkosh.v5.i1.2024.3291

Keywords: Activation Function, ReLU, Softmax, Tanh, Sigmoid, MNIST, CIFAR-10

Abstract [English]
This paper presents T_Saf, a hybrid activation function designed to improve neural network training. T_Saf combines the benefits of Softplus and Tanh, yielding better gradient stability and convergence across a range of tasks. In experiments on the MNIST and CIFAR-10 datasets, T_Saf outperforms conventional activation functions such as ReLU, Tanh, and Leaky ReLU in accuracy, convergence stability, and training robustness. The comparative analysis highlights T_Saf’s adaptability, particularly in settings prone to vanishing or exploding gradients, making it a promising candidate for deep neural networks. These results suggest that T_Saf is a strong choice of activation function in difficult training regimes, contributing to the efficiency and reliability of neural network models.
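The abstract describes T_Saf only as a Softplus–Tanh hybrid and does not reproduce its closed form, so the following PyTorch sketch is an illustrative assumption rather than the authors' published definition: the composition tanh(softplus(x)) and the class name TSaf are hypothetical stand-ins for one plausible way of combining the two parent functions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TSaf(nn.Module):
    """Sketch of a Softplus-Tanh hybrid activation (illustrative only).

    ASSUMPTION: the abstract does not give T_Saf's closed form;
    tanh(softplus(x)) is one plausible composition of the two parents.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # softplus(x) = log(1 + exp(x)) is smooth and strictly positive,
        # so there is no ReLU-style dead zone; applying tanh on top
        # bounds the output to (0, 1) (softplus is positive), which
        # tempers large pre-activations and their gradients.
        return torch.tanh(F.softplus(x))


if __name__ == "__main__":
    act = TSaf()
    x = torch.linspace(-5.0, 5.0, 11, requires_grad=True)
    y = act(x)
    y.sum().backward()  # gradients stay finite and bounded
    print(y.detach(), x.grad, sep="\n")
```

Whatever the exact published form, the appeal of this family is easy to see in the sketch: Softplus contributes a smooth, nonzero gradient everywhere, while Tanh bounds the response, which is consistent with the abstract's claims about resistance to vanishing and exploding gradients.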
License
Copyright (c) 2024 Archana Tomar, Harish Patidar

This work is licensed under a Creative Commons Attribution 4.0 International License.
Under the CC-BY license, authors retain copyright while allowing anyone to download, reuse, reprint, modify, distribute, and/or copy their contribution, provided the work is properly attributed to its author. No further permission from the author or journal board is required.
This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.