Summarizing Data Sets for Data Mining by Using Statistical Methods in Coastal Engineering
Authors: Yunus Doğan, Ahmet Durap
Abstract:
Coastal regions are among the most heavily used places, both within the natural balance and by the growing population. In coastal engineering, the most valuable data describe wave behavior. The volume of these data becomes very large because observations run for hours, days, and months. This study applies statistical methods, namely wave spectrum analysis methods and standard statistical methods, to such data. The goal is to discover profiles of different coastal areas with these methods and thereby to derive an instance-based data set from the big data that can be analyzed with data mining algorithms. In the experimental studies, six sample data sets on wave behavior, obtained from 20 minutes of observations in Mersin Bay, Turkey, were converted to an instance-based form, and different clustering techniques from data mining were then used to discover similar coastal places. The study also discusses how this summarization approach could be used in other fields that collect big data, such as medicine.
Keywords: Clustering algorithms, coastal engineering, data mining, data summarization, statistical methods.
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1130613
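The summarization idea in the abstract, collapsing a long wave record into one instance (one feature row) that a clustering algorithm can consume, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the abstract does not list which statistics were used, so the feature set here (mean, standard deviation, the spectral significant wave height estimate 4*sqrt(m0), and the mean zero-upcrossing period) is an assumption, as are the function name `summarize_record` and the synthetic sine record, which stands in for real measurements.

```python
import math
import statistics


def summarize_record(elevation, dt):
    """Collapse one surface-elevation record into a single feature row.

    elevation: surface elevation samples in metres (hypothetical input)
    dt: sampling interval in seconds
    """
    mean = statistics.fmean(elevation)
    centered = [x - mean for x in elevation]

    # Variance of the elevation equals the zeroth spectral moment m0.
    m0 = sum(c * c for c in centered) / len(centered)
    std = math.sqrt(m0)

    # Indices where the centered signal crosses zero going upward.
    upcrossings = [
        i for i in range(1, len(centered))
        if centered[i - 1] < 0 <= centered[i]
    ]
    if len(upcrossings) >= 2:
        # Mean time between successive zero up-crossings.
        tz = (upcrossings[-1] - upcrossings[0]) * dt / (len(upcrossings) - 1)
    else:
        tz = float("nan")

    return {"mean": mean, "std": std, "hm0": 4.0 * std, "tz": tz}


# Synthetic 20-minute record: 1 m amplitude, 8 s period, sampled at 2 Hz.
# Illustrative data only, not measurements from Mersin Bay.
dt = 0.5
record = [math.sin(2 * math.pi * (i * dt) / 8.0 + 0.3) for i in range(2400)]
features = summarize_record(record, dt)
```

Each record then contributes one row such as `features`, and rows from different coastal sites could be passed to any clustering routine. Scaling the features before clustering would likely be needed, since wave heights and periods are in different units.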