Methods for Distinction of Cattle Using Supervised Learning
Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl
Abstract:
Machine learning covers a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can contain identifying patterns that are used to classify samples into groups. The result of the analysis is a pattern that can be used to identify new data without needing the input data from which the pattern was created. Important requirements in this process are careful data preparation, validation of the model used, and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of view of genetic diversity. Where pedigree information is missing, other methods can be used to trace an animal's origin. The genetic diversity recorded in genetic data holds useful information for identifying animals originating from individual countries. We conclude that applying data mining to molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and for identifying an individual.
Keywords: Genetic data, Pinzgau cattle, supervised learning.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1093044
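The workflow summarised in the abstract (learn a pattern from labelled genetic data, validate the model, then assign new individuals to a group) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the genotypes are synthetic, the two population labels and their allele frequencies are invented, and a nearest-centroid classifier with k-fold cross-validation stands in for whatever method and software the authors actually used.

```python
import random
from collections import Counter


def make_synthetic_genotypes(n_per_group, n_markers, seed=0):
    """Generate hypothetical allele counts (0, 1 or 2) for two made-up populations."""
    rng = random.Random(seed)
    # Assumed allele frequencies per marker; these are NOT taken from the paper.
    freqs = {
        "population_A": [0.2] * n_markers,
        "population_B": [0.8] * n_markers,
    }
    samples = []
    for label, marker_freqs in freqs.items():
        for _ in range(n_per_group):
            # Two independent allele draws per marker give a count of 0, 1 or 2.
            genotype = [sum(rng.random() < f for _ in range(2)) for f in marker_freqs]
            samples.append((genotype, label))
    rng.shuffle(samples)
    return samples


def train_centroids(training):
    """Learn one mean genotype vector (centroid) per labelled group."""
    sums, counts = {}, Counter()
    for genotype, label in training:
        acc = sums.setdefault(label, [0.0] * len(genotype))
        for i, value in enumerate(genotype):
            acc[i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, genotype):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def dist(centroid):
        return sum((g - c) ** 2 for g, c in zip(genotype, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))


def cross_validate(samples, k=5):
    """Estimate classification accuracy with k-fold cross-validation."""
    correct = 0
    for fold in range(k):
        test = samples[fold::k]  # every k-th sample forms the held-out fold
        train = [s for i, s in enumerate(samples) if i % k != fold]
        model = train_centroids(train)
        correct += sum(predict(model, g) == y for g, y in test)
    return correct / len(samples)


samples = make_synthetic_genotypes(n_per_group=30, n_markers=10)
accuracy = cross_validate(samples, k=5)
print(f"cross-validated accuracy: {accuracy:.2f}")
```

Once trained, the centroids alone are enough to classify a new individual, which mirrors the abstract's point that the learned pattern can be applied without the original input data. Holding out each fold in turn keeps the accuracy estimate from being inflated by testing on the training samples.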