{"title":"Deep Learning Application for Object Image Recognition and Robot Automatic Grasping","authors":"Shiuh-Jer Huang, Chen-Zon Yan, C. K. Huang, Chun-Chien Ting","volume":164,"journal":"International Journal of Mechanical and Mechatronics Engineering","pagesStart":359,"pagesEnd":367,"ISSN":"1307-6892","URL":"https:\/\/publications.waset.org\/pdf\/10011384","abstract":"<p>Because vision systems are in heavy demand for autonomous operation in industrial environments, image recognition has become an important research topic. Here, a deep learning algorithm is employed in a vision system to recognize industrial objects, and the system is integrated with a 7A6 Series Manipulator for automatic object gripping. A PC and a Graphics Processing Unit (GPU) are used to construct the 3D vision recognition system. A depth camera (Intel RealSense SR300) captures the images used for object recognition and coordinate derivation. The YOLOv2 scheme is adopted in a convolutional neural network (CNN) structure for object classification and center point prediction. Additionally, an image processing strategy is used to find the object contour and calculate the object orientation angle. The specified object location and orientation information are then sent to the robot controller. Finally, a six-axis manipulator can grasp the specified object in a random environment based on the user command and the extracted image information. The experimental results show that YOLOv2 detects the object location and category with confidence near 0.9 and a 3D position error of less than 0.4 mm. 
It is useful for future intelligent robotic applications in Industry 4.0 environments.<\/p>\r\n","references":"[1]\tDeen Cockburn, Jean-Philippe Roberge, Thuy-Hong-Loan Le, Alexis Maslyczyk and Vincent Duchaine, \u201cGrasp stability assessment through unsupervised feature learning of tactile images,\u201d IEEE International Conference on Robotics and Automation (ICRA), May 29 ~ June 3, pp. 2238-2244, Singapore, 2017.\r\n[2]\tJaehyun Yoo and Karl H. Johansson, \u201cSemi-supervised learning for mobile robot localization using wireless signal strengths,\u201d International Conference on Indoor Positioning and Indoor Navigation (IPIN), Sapporo, Japan, Sept 18-21, 2017.\r\n[3]\tS. K. Lenka and A. G. Mohapatra, \u201cGradient Descent with momentum based neural network pattern classification for the prediction of soil moisture content in precision agriculture,\u201d IEEE International Symposium on Nanoelectronic and Information Systems, Indore, India, October 21-23, 2015, pp. 63-66.\r\n[4]\tD. Soudry, D. Di Castro, A. Gal, A. Kolodny and S. Kvatinsky, \u201cMemristor-based multilayer neural network with online gradient descent training,\u201d IEEE Transactions on Neural Networks and Learning Systems, 26 (10), pp. 2408-2421, 2015.\r\n[5]\tAndy Zeng, Kuan-Ting Yu, Shuran Song, Daniel Suo, Ed Walker, Alberto Rodriguez and Jianxiong Xiao, \u201cMulti-view self-supervised deep learning for 6D pose estimation in the Amazon Picking Challenge,\u201d IEEE International Conference on Robotics and Automation (ICRA), May 29 ~ June 3, Singapore, pp. 1386-1393, 2017.\r\n[6]\tG. E. Pazienza, P. Giangrossi, S. Tortella, M. Balsi and X. Vilasis-Cardona, \u201cTracking for a CNN guided robot,\u201d Proceedings of the 2005 European Conference on Circuit Theory and Design, 2005.\r\n[7]\tE. Martinson and V. 
Yalla, \u201cReal-time human detection for robots using CNN with a feature-based layered pre-filter,\u201d 25th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), New York, USA, pp. 1120-1125, 2016.\r\n[8]\tX. Peng, B. Sun, K. Ali and K. Saenko, \u201cLearning deep object detectors from 3D models,\u201d IEEE International Conference on Computer Vision (ICCV), Santiago, pp. 1278-1286, 2015.\r\n[9]\tS. Ren, K. He, R. Girshick and J. Sun, \u201cFaster R-CNN: Towards real-time object detection with region proposal networks,\u201d IEEE Transactions on Pattern Analysis and Machine Intelligence, 39 (6), pp. 1137-1149, 2017.\r\n[10]\tE. Shelhamer, J. Long and T. Darrell, \u201cFully convolutional networks for semantic segmentation,\u201d IEEE Transactions on Pattern Analysis and Machine Intelligence, 39 (4), pp. 640-651, 2017.\r\n[11]\tJ. Redmon, S. Divvala, R. Girshick and A. Farhadi, \u201cYou only look once: Unified, real-time object detection,\u201d IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 779-788, 2016.\r\n[12]\tIntel\u00ae RealSense\u2122 Technology, \u201cIntel\u00ae RealSense\u2122 SDK\u201d, Revised Jun 2016.\r\n[13]\tNing Qian, \u201cOn the momentum term in gradient descent learning algorithms,\u201d Neural Networks, 12(1), pp. 145\u2013151, 1999.\r\n[14]\tSergey Ioffe and Christian Szegedy, \u201cBatch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift,\u201d arXiv preprint arXiv:1502.03167v3, 2015.\r\n[15]\tJ. Redmon and A. Farhadi, \u201cYOLO9000: Better, Faster, Stronger,\u201d IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, pp. 6517-6525, 2017.\r\n[16]\tLi, Zhizhong, and Derek Hoiem, \u201cLearning without forgetting,\u201d IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(12), pp. 2935-2947, 2017.\r\n[17]\tS. Suzuki and K. 
Abe, \u201cTopological structural analysis of digitized binary images by border following,\u201d Computer Vision, Graphics, and Image Processing, 30, pp. 32-46, 1985.\r\n[18]\tJ. Redmon, \u201cDarknet: Open source neural networks in C,\u201d http:\/\/pjreddie.com\/darknet\/, 2013\u20132016.\r\n[19]\tMark Everingham, Luc Van Gool, Christopher K. Williams, John Winn and Andrew Zisserman, \u201cThe Pascal visual object classes (VOC) Challenge,\u201d Int. J. Comput. Vision 88(2), pp. 303-308, 2010.","publisher":"World Academy of Science, Engineering and Technology","index":"Open Science Index 164, 2020"}