**Commenced** in January 2007

**Frequency:** Monthly

**Edition:** International

**Paper Count:** 31100

##### Entropy Based Spatial Design: A Genetic Algorithm Approach (Case Study)

**Authors:**
Abbas Siefi,
Mohammad Javad Karimifar

**Abstract:**

We study the spatial design of experiments, where the goal is to select the most informative subset of prespecified size from a set of correlated random variables. The problem arises in many applied domains, such as meteorology, environmental statistics, and statistical geology, in which observations can be collected at different locations and possibly at different times. When the design region and the set of interest are discrete, the covariance matrix completely describes any objective function, and the goal is to choose a feasible design that minimizes the resulting uncertainty. The problem is recast as that of maximizing the determinant of the covariance matrix of the chosen subset, which is NP-hard. When such designs are used in computer experiments, the design space is in many cases very large and the exact optimal solution cannot be computed. Heuristic optimization methods can discover efficient experimental designs in situations where traditional designs cannot be applied, exchange methods are ineffective, and an exact solution is not possible. We developed a genetic algorithm (GA) to take advantage of its exploratory power, and we demonstrate its successful application in a large design space. We consider a real case of experimental design in which the design space is very large and solve it with the proposed GA.
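As an illustration of the recast problem, the sketch below searches for a size-k subset S of sites that maximizes log det of the covariance submatrix C[S, S]. It is a minimal, hypothetical GA and not the authors' implementation: the subset encoding, truncation selection, union-based crossover, swap mutation, all parameter values, and the exponential covariance used as test data are assumptions introduced here for illustration only.

```python
import math
import random


def log_det(matrix):
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][m] * chol[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:  # not positive definite: treat as worst score
                    return float("-inf")
                chol[i][j] = math.sqrt(s)
                total += 2.0 * math.log(chol[i][j])
            else:
                chol[i][j] = s / chol[j][j]
    return total


def entropy_score(cov, subset):
    """Objective: log det of the covariance submatrix of the chosen sites."""
    idx = sorted(subset)
    return log_det([[cov[i][j] for j in idx] for i in idx])


def crossover(a, b, k, rng):
    """Child keeps sites shared by both parents, fills the rest from their union."""
    common = a & b
    pool = sorted((a | b) - common)
    rng.shuffle(pool)
    return frozenset(common | set(pool[: k - len(common)]))


def mutate(subset, n, rate, rng):
    """With probability `rate`, swap one selected site for an unselected one."""
    if rng.random() >= rate or len(subset) == n:
        return subset
    out_site = rng.choice(sorted(subset))
    in_site = rng.choice([i for i in range(n) if i not in subset])
    return frozenset((subset - {out_site}) | {in_site})


def ga_max_entropy(cov, k, pop_size=30, generations=60, mutation_rate=0.3, seed=0):
    """Hypothetical GA: truncation selection, elitism, subset-preserving operators."""
    rng = random.Random(seed)
    n = len(cov)

    def score(s):
        return entropy_score(cov, s)

    population = [frozenset(rng.sample(range(n), k)) for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=score, reverse=True)
        parents = ranked[: pop_size // 2]  # keep the better half as parents
        children = [ranked[0]]             # elitism: best design survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, k, rng), n, mutation_rate, rng))
        population = children
    best = max(population, key=score)
    return sorted(best), score(best)


def exp_cov(points, scale=3.0):
    """Assumed test covariance: exponential decay with distance on a line."""
    return [[math.exp(-abs(p - q) / scale) for q in points] for p in points]


if __name__ == "__main__":
    cov = exp_cov(list(range(12)))
    sites, value = ga_max_entropy(cov, k=4)
    print("selected sites:", sites, "log det:", value)
```

Because both operators always return subsets of exactly k distinct sites, every candidate is a feasible design and no repair step is needed. Working with the log-determinant rather than the determinant leaves the maximizer unchanged and avoids underflow when k is large.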

**Keywords:**
Genetic Algorithm,
Spatial design of experiments,
maximum entropy sampling,
computer experiments

**Digital Object Identifier (DOI):**
doi.org/10.5281/zenodo.1331197
