A Software Framework for Predicting Oil-Palm Yield from Climate Data

Authors: Mohd. Noor Md. Sap, A. Majid Awan

Abstract:

Intelligent systems based on machine learning techniques such as classification and clustering are gaining widespread popularity in real-world applications. This paper presents work on developing a software system for predicting crop yield, for example oil-palm yield, from climate and plantation data. At the core of our system is a method for unsupervised partitioning of data that finds spatio-temporal patterns in climate data using kernel methods, which are well suited to complex data. This work is inspired by the notion that a non-linear transformation of the data into a high-dimensional feature space increases the likelihood that the patterns are linearly separable in the transformed space, which in turn simplifies exploration of the associated structure in the data. Kernel methods perform this non-linear mapping implicitly by replacing inner products with an appropriate positive definite function. In this paper we present a robust weighted kernel k-means algorithm that incorporates spatial constraints for clustering the data. The proposed algorithm can effectively handle noise, outliers and auto-correlation in spatial data, enabling effective and efficient exploration of patterns and structures in the data, and can thus be used for predicting oil-palm yield by analyzing the various factors that affect it.
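
The kernel substitution described in the abstract can be illustrated with a minimal sketch of plain weighted kernel k-means. This is not the authors' algorithm: the spatial-constraint term and the robustness mechanisms are omitted because the abstract does not specify them. The function names, the Gaussian (RBF) kernel, the value of `sigma`, the uniform weights and the toy points are all assumptions made for illustration. The sketch relies only on the standard identity that the squared distance between a mapped point and a weighted cluster mean in feature space can be written entirely in terms of kernel values.

```python
import math


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel, one common positive definite function (assumed here)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * sigma ** 2))


def linear_kernel(x, y):
    """Ordinary inner product; with it the method reduces to standard k-means."""
    return sum(a * b for a, b in zip(x, y))


def gram_matrix(points, kernel):
    """Kernel (Gram) matrix K[i][j] = kernel(points[i], points[j])."""
    n = len(points)
    return [[kernel(points[i], points[j]) for j in range(n)] for i in range(n)]


def feature_space_sq_dist(K, w, i, members):
    """Squared feature-space distance from point i to the weighted mean of `members`.

    ||phi(x_i) - m||^2 = K_ii - 2 * sum_j w_j K_ij / s
                         + sum_{j,l} w_j w_l K_jl / s^2,  with s = sum_j w_j.
    Only kernel values are needed; phi itself is never computed.
    """
    s = sum(w[j] for j in members)
    cross = sum(w[j] * K[i][j] for j in members) / s
    within = sum(w[j] * w[l] * K[j][l] for j in members for l in members) / (s * s)
    return K[i][i] - 2.0 * cross + within


def weighted_kernel_kmeans(K, w, init_labels, k, max_iter=50):
    """Reassign each point to its nearest cluster mean in feature space until stable.

    Note: no spatial penalty is applied here (assumption: omitted for brevity).
    """
    n = len(K)
    labels = list(init_labels)
    for _ in range(max_iter):
        clusters = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        new_labels = []
        for i in range(n):
            # An empty cluster is treated as infinitely far away.
            dists = [feature_space_sq_dist(K, w, i, m) if m else float("inf")
                     for m in clusters]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy data (hypothetical): two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
weights = [1.0] * len(points)          # uniform weights, assumed
K = gram_matrix(points, lambda x, y: rbf_kernel(x, y, sigma=2.0))
labels = weighted_kernel_kmeans(K, weights, [0, 1, 0, 1, 0, 1], k=2)
print(labels)
```

With `linear_kernel` in place of `rbf_kernel`, `feature_space_sq_dist` returns the ordinary squared Euclidean distance to the weighted centroid, which is a convenient sanity check on the identity. In a spatial setting, the weights and an additional neighbourhood term would be the natural places to encode the constraints the abstract mentions.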

Keywords: Pattern analysis, clustering, kernel methods, spatial data, crop yield

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1057879

