Research Article

Hyper Parameter Analysis in Recognition of Handwritten Digits Using Convolutional Neural Network

Year: 2023, Volume: 9, Issue: 4, Pages: 268-277, Published: 31.12.2023

Abstract

Handwritten digit recognition has recently gained importance and attracted the attention of many researchers because it is used in many machine learning, deep learning, and computer vision applications. Hyperparameter optimization involves determining a set of values that increases accuracy in both classification and prediction. It also aims to improve feature-selection performance by tuning the parameters selected by the algorithms more precisely. In this study, a convolutional neural network was used to recognize handwritten digits from the MNIST dataset. Many open-source hyperparameter libraries are available to deep learning developers for determining hyperparameters. In the developed model, hyperparameter optimization techniques were applied using the Optuna, HyperOpt, and Scikit-optimize libraries. The optimization times of these libraries and the resulting change in the success rate of handwritten digit recognition were analyzed. The model trained with randomly assigned parameters achieved accuracies of 78.45%, 97.13%, 75.62%, 76.95%, 97.46%, and 97.27%, while the model trained with optimized hyperparameters achieved 99.26% accuracy.
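To make the idea of a hyperparameter search concrete, the following is a minimal sketch of plain random search in standard-library Python. It is not the authors' method: the libraries named in the abstract provide more sophisticated search strategies, and the search space, the parameter names (`learning_rate`, `dropout`, `filters`), and the `toy_score` function are all hypothetical stand-ins, since the abstract does not list the tuned hyperparameters. In a real experiment, `toy_score` would be replaced by training the CNN on MNIST and returning its validation accuracy.

```python
import math
import random

# Hypothetical search space; the abstract does not list the tuned hyperparameters.
SEARCH_SPACE = {
    "learning_rate": (1e-4, 1e-1),   # sampled log-uniformly
    "dropout": (0.0, 0.5),           # sampled uniformly
    "filters": [16, 32, 64],         # categorical choice
}


def sample_config(rng):
    """Draw one random configuration from SEARCH_SPACE."""
    lr_lo, lr_hi = SEARCH_SPACE["learning_rate"]
    d_lo, d_hi = SEARCH_SPACE["dropout"]
    return {
        "learning_rate": 10 ** rng.uniform(math.log10(lr_lo), math.log10(lr_hi)),
        "dropout": rng.uniform(d_lo, d_hi),
        "filters": rng.choice(SEARCH_SPACE["filters"]),
    }


def toy_score(config):
    """Stand-in for validation accuracy (NOT a trained CNN).

    By construction it peaks at learning_rate=1e-3, dropout=0.25, filters=64.
    """
    lr_penalty = (math.log10(config["learning_rate"]) + 3.0) ** 2
    dropout_penalty = (config["dropout"] - 0.25) ** 2
    filter_bonus = {16: 0.0, 32: 0.01, 64: 0.02}[config["filters"]]
    return 0.97 - 0.05 * lr_penalty - 0.2 * dropout_penalty + filter_bonus


def random_search(objective, n_trials, seed=0):
    """Evaluate n_trials random configurations and return the best one."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_trials):
        config = sample_config(rng)
        history.append((objective(config), config))
    best_score, best_config = max(history, key=lambda item: item[0])
    return best_score, best_config, history


best_score, best_config, history = random_search(toy_score, n_trials=50)
print(f"best score: {best_score:.4f}")
print(f"best config: {best_config}")
```

The learning rate is sampled on a logarithmic scale because useful values typically span several orders of magnitude. The `history` list keeps every trial, so the spread between poor and good configurations can be inspected in the same spirit as the comparison between randomly assigned and optimized parameters described above.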



Details

Primary Language: Turkish
Subjects: Computer Software
Journal Section: Research Articles

Authors

Tuncay Yiğit (ORCID: 0000-0001-7397-7224)

Şerafettin Atmaca (ORCID: 0000-0003-2407-1113)

Remzi Gürfidan (ORCID: 0000-0002-4899-2219)

Recep Çolak (ORCID: 0000-0002-7119-6202)

Publication Date: December 31, 2023
Submission Date: November 28, 2023
Acceptance Date: December 19, 2023
Published in Issue: Year 2023, Volume: 9, Issue: 4

Cite

IEEE: T. Yiğit, Ş. Atmaca, R. Gürfidan, and R. Çolak, “Evrişimli Sinir Ağı Kullanarak El yazısı Rakamların Tanımasında Hiper Parametre Analizi,” GJES, vol. 9, no. 4, pp. 268–277, 2023.

Gazi Journal of Engineering Sciences (GJES) publishes open access articles under a Creative Commons Attribution 4.0 International License (CC BY).