Recognition of Grocery Products in Images Captured by Cellular Phones
Authors: Farshideh Einsele, Hassan Foroosh
Abstract:
In this paper, we present a robust algorithm for recognizing text extracted from grocery product images captured by mobile phone cameras. Recognizing such text is challenging because text in grocery product images varies in size, orientation, style, and illumination, and can suffer from perspective distortion. Pre-processing is performed to make the characters scale and rotation invariant. Because text degradations cannot be adequately modeled by well-known geometric transformations such as translation, rotation, affine transformation, and shearing, we use all black pixels of each character as the feature vector. Classification is performed with a minimum distance classifier using the maximum likelihood criterion, which delivers a promising Character Recognition Rate (CRR) of 89%. Using lower-level linguistic knowledge about product words during recognition yields a considerably higher Word Recognition Rate (WRR) of 99%.
Keywords: Camera-based OCR, Feature extraction, Document and image processing.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1337988
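The classification and word-correction steps summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes characters are already binarized and normalized to a fixed grid, uses Hamming distance between flattened black-pixel vectors as the distance measure, and stands in for the linguistic knowledge with a small product-word lexicon. Under an assumed noise model in which each pixel flips independently with probability below 0.5, choosing the template at minimum Hamming distance coincides with the maximum likelihood choice. The names TEMPLATES, classify_char, and correct_word, the 3x3 glyphs, and the lexicon are all hypothetical.

```python
from typing import Dict, List, Sequence

Glyph = Sequence[int]  # flattened binary character image, 1 = black pixel

# Hypothetical 3x3 templates, written row by row.
TEMPLATES: Dict[str, Glyph] = {
    "I": [0, 1, 0,
          0, 1, 0,
          0, 1, 0],
    "L": [1, 0, 0,
          1, 0, 0,
          1, 1, 1],
    "T": [1, 1, 1,
          0, 1, 0,
          0, 1, 0],
}


def hamming(a: Glyph, b: Glyph) -> int:
    """Count the pixel positions where two equal-sized glyphs differ."""
    return sum(x != y for x, y in zip(a, b))


def classify_char(glyph: Glyph, templates: Dict[str, Glyph]) -> str:
    """Return the label of the template closest to the glyph."""
    return min(templates, key=lambda label: hamming(glyph, templates[label]))


def correct_word(raw: str, lexicon: List[str]) -> str:
    """Snap a recognized word to the same-length lexicon entry with the
    fewest character mismatches; keep the raw word if none has that length."""
    candidates = [w for w in lexicon if len(w) == len(raw)]
    if not candidates:
        return raw
    return min(candidates, key=lambda w: sum(c != r for c, r in zip(w, raw)))


if __name__ == "__main__":
    noisy_t = [1, 1, 1, 0, 1, 0, 0, 0, 0]  # "T" with one stem pixel missing
    print(classify_char(noisy_t, TEMPLATES))
    print(correct_word("M1LK", ["MILK", "SALT", "RICE"]))
```

A lexicon step of this kind is one plausible reason a word-level rate can exceed the character-level rate: a word with a single misread character can still be mapped to the correct product word.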