Modified Naïve Bayes Based Prediction Modeling for Crop Yield Prediction
Authors: Kefaya Qaddoum
Abstract:
Most greenhouse growers need to predict yield quantities accurately in order to meet market requirements. The purpose of this paper is to build a simple but often satisfactory supervised classification method. The original naive Bayes classifier has a serious weakness: it retains redundant predictors. In this paper, a regularization technique is used to obtain a computationally efficient classifier based on naive Bayes. The suggested construction uses an L1 penalty and is capable of clearing out redundant predictors; a modification of the LARS algorithm is devised to solve the resulting optimization problem, making the method applicable to a wide range of data. In the experimental section, a study is conducted to examine the effect of redundant and irrelevant predictors, and the method is tested on the WSG tomato-yield data set, where there are many more predictors than observations and the goal is to predict weekly yield. Finally, the modified approach is compared with several naive Bayes variants and with other classification algorithms (SVM and kNN), and is shown to perform fairly well.
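To make the approach concrete, here is a minimal Python sketch of the general idea described above, assuming scikit-learn. It is not the authors' implementation: plain Lasso solved with the LARS algorithm (LassoLars) stands in for the paper's modified LARS, and synthetic data stands in for the WSG tomato data set (more predictors than observations).

    # Minimal sketch (assumption: scikit-learn; not the paper's exact method).
    # Step 1: an L1 penalty, solved by LARS, zeroes out redundant predictors.
    # Step 2: a naive Bayes classifier is fit on the surviving predictors.
    import numpy as np
    from sklearn.linear_model import LassoLars
    from sklearn.naive_bayes import GaussianNB

    rng = np.random.default_rng(0)
    X = rng.normal(size=(30, 100))          # 30 weeks, 100 candidate predictors
    y = (X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=30) > 0).astype(int)

    selector = LassoLars(alpha=0.01).fit(X, y)   # L1 path computed by LARS
    selected = np.flatnonzero(selector.coef_)    # predictors with nonzero weight
    if selected.size == 0:                       # fallback if the penalty removes all
        selected = np.arange(X.shape[1])

    clf = GaussianNB().fit(X[:, selected], y)    # naive Bayes on survivors only
    print(f"kept {selected.size} of {X.shape[1]} predictors")
    print("training accuracy:", clf.score(X[:, selected], y))

In practice, the penalty strength alpha would be tuned by cross-validation on held-out weeks rather than fixed by hand.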
Keywords: Tomato yield prediction, naive Bayes, redundancy
Digital Object Identifier (DOI): https://doi.org/10.5281/zenodo.1326820