Recognition of Gene Names from Gene Pathway Figures Using Siamese Network
Authors: Muhammad Azam, Micheal Olaolu Arowolo, Fei He, Mihail Popescu, Dong Xu
Abstract:
The number of biological papers is growing quickly, and with it the number of biological pathway figures they contain. Each pathway figure carries extensive biological information, such as gene names and the relationships among genes. Annotating pathway figures manually, however, takes a great deal of time and effort. Advanced image understanding models could speed up curation, but their accuracy still needs to improve. To improve gene name recognition from pathway figures, we applied a Siamese network that maps image segments to a library of images containing known genes, much as many photo applications recognize people in photos. We used a triplet loss function and built a triplet spatial pyramid pooling network (TSPP-Net) by combining a triplet convolutional neural network with spatial pyramid pooling. We compared VGG19 and VGG16 as the Siamese network model. VGG16 performed better, with an accuracy of 93%, which is much higher than the results of Optical Character Recognition (OCR).
Keywords: Biological pathway, image understanding, gene name recognition, object detection, Siamese network, Visual Geometry Group.
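The matching idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-dimensional embeddings, the gene names in the toy library, the margin of 1.0, the use of squared Euclidean distance, and the function names `squared_distance`, `triplet_loss`, and `match_gene` are all assumptions made for illustration. In the paper, embeddings would come from the VGG-based TSPP-Net rather than being written by hand.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss: max(d(anchor, positive) - d(anchor, negative) + margin, 0).

    The loss is zero once the negative is farther from the anchor than the
    positive by at least `margin` (the margin value here is an assumption).
    """
    gap = squared_distance(anchor, positive) - squared_distance(anchor, negative)
    return max(gap + margin, 0.0)


def match_gene(query, library):
    """Return the library gene name whose embedding is closest to the query."""
    return min(library, key=lambda name: squared_distance(query, library[name]))


# Hypothetical embeddings for a tiny gene-name image library.
library = {
    "TP53": [0.9, 0.1, 0.0],
    "EGFR": [0.0, 0.8, 0.3],
    "MYC": [0.1, 0.2, 0.9],
}

# Hypothetical embedding of an image segment cropped from a pathway figure.
query = [0.8, 0.2, 0.1]

best = match_gene(query, library)
print("Closest library entry:", best)
print("Loss with the correct positive:",
      triplet_loss(query, library["TP53"], library["EGFR"]))
print("Loss with positive and negative swapped:",
      triplet_loss(query, library["EGFR"], library["TP53"]))
```

Training with a triplet loss pushes embeddings of the same gene name together and embeddings of different gene names apart, so that recognition at inference time reduces to a nearest-neighbor lookup like `match_gene`.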