Object Recognition Approach Based on Generalized Hough Transform and Color Distribution Serving in Generating Arabic Sentences

Authors: Nada Farhani, Naim Terbeh, Mounir Zrigui

Abstract:

Recognizing the objects contained in images remains a challenging research problem because objects vary in shape, position, contrast, and other properties. This paper addresses object recognition. The classical Hough Transform (HT) is a tool for detecting straight line segments in images. It has been generalized into the Generalized Hough Transform (GHT) to detect arbitrary shapes. With the GHT, the shapes sought need not be defined analytically; they can instead be described by a particular silhouette. To improve precision, we propose combining the GHT results with a similarity measure computed between the histograms and the spatiograms of the images. The main purpose of our work is to use the concepts obtained from recognition to generate Arabic sentences that summarize the content of the image.
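The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a translation-only GHT whose R-table is indexed by a quantized edge orientation, and it uses the Bhattacharyya coefficient as the histogram similarity; the abstract does not state which measure the paper uses. The function names (`build_r_table`, `ght_votes`, `bhattacharyya`) and the toy data are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def build_r_table(template_edges, reference):
    """Map each orientation bin to the offsets from edge point to reference point."""
    r_table = defaultdict(list)
    rx, ry = reference
    for x, y, orientation_bin in template_edges:
        r_table[orientation_bin].append((rx - x, ry - y))
    return r_table


def ght_votes(image_edges, r_table):
    """Accumulate votes for candidate reference positions (translation only)."""
    accumulator = Counter()
    for x, y, orientation_bin in image_edges:
        for dx, dy in r_table.get(orientation_bin, ()):
            accumulator[(x + dx, y + dy)] += 1
    return accumulator


def bhattacharyya(hist_a, hist_b):
    """Similarity in [0, 1] between two non-empty histograms with identical bins."""
    total_a, total_b = sum(hist_a), sum(hist_b)
    return sum(sqrt((a / total_a) * (b / total_b)) for a, b in zip(hist_a, hist_b))


# Toy silhouette: four edge points, each with a made-up orientation bin (assumption).
template = [(0, 0, 0), (2, 0, 1), (2, 2, 2), (0, 2, 3)]
r_table = build_r_table(template, reference=(1, 1))

# The same silhouette shifted by (5, 3), plus one clutter edge point.
scene = [(x + 5, y + 3, b) for x, y, b in template] + [(9, 9, 0)]
votes = ght_votes(scene, r_table)
best_position, best_count = votes.most_common(1)[0]
```

In this sketch the accumulator peak gives the most likely position of the silhouette's reference point, and the histogram score could then be used to confirm or reject that candidate. How the paper actually weights the two results is not given in the abstract, so no combination rule is shown.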

Keywords: Shape recognition, generalized Hough transform, histogram, spatiogram, learning.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.3299705

