Experimental Study of Hyperparameter Tuning a Deep Learning Convolutional Recurrent Network for Text Classification
Authors: Bharatendra Rai
Abstract:
Word sequences in text data have long-term dependencies, and deep learning models trained on them are known to suffer from the vanishing gradient problem. Although recurrent networks such as long short-term memory networks help overcome this problem, achieving high text classification performance remains challenging. Convolutional recurrent networks, which combine the advantages of long short-term memory networks and convolutional neural networks, can help improve text classification performance. However, arriving at suitable hyperparameter values for convolutional recurrent networks is still difficult because fitting each model requires significant computing resources. This paper illustrates the advantages of using convolutional recurrent networks for text classification with the help of statistically planned computer experiments for hyperparameter tuning.
Keywords: Convolutional recurrent networks, hyperparameter tuning, long short-term memory networks, Tukey honest significant differences
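As a rough illustration of what a statistically planned experiment for hyperparameter tuning can look like, the sketch below enumerates a replicated full-factorial design in Python. The factor names, their levels, and the replicate count are hypothetical placeholders; the abstract does not state which hyperparameters or levels were studied, nor that a full-factorial layout was used. Each planned run would then be fitted, and the resulting accuracies compared across levels, for example with Tukey's honest significant differences; that step is not shown here.

```python
from itertools import product

# Hypothetical factors and levels (assumptions for illustration only;
# the abstract does not list the hyperparameters actually studied).
FACTORS = {
    "conv_filters": [32, 64],
    "kernel_size": [3, 5],
    "lstm_units": [32, 64],
}
REPLICATES = 2  # assumed number of repeated fits per setting


def full_factorial(factors, replicates=1):
    """Return one dict per planned run: every level combination, repeated."""
    names = list(factors)
    runs = []
    for levels in product(*(factors[name] for name in names)):
        for rep in range(1, replicates + 1):
            run = dict(zip(names, levels))
            run["replicate"] = rep
            runs.append(run)
    return runs


design = full_factorial(FACTORS, REPLICATES)
print(len(design))  # 2 * 2 * 2 settings x 2 replicates = 16 runs
```

Replicating each setting is what makes a later comparison of level means meaningful, since it gives an estimate of run-to-run variation from random initialization.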