Computational Aspects of Regression Analysis of Interval Data
Authors: Michal Cerny
We consider linear regression models where both input data (the values of independent variables) and output data (the observations of the dependent variable) are interval-censored. We introduce a possibilistic generalization of the least squares estimator, the so-called OLS-set for the interval model. This set captures the impact on the OLS estimator of the information lost through interval censoring and provides a tool for quantifying this effect. We study complexity-theoretic properties of the OLS-set. We also deal with restricted versions of the general interval linear regression model, in particular the crisp input – interval output model. We give an argument that natural descriptions of the OLS-set in the crisp input – interval output model cannot be computed in polynomial time. We then derive easily computable approximations of the OLS-set which can be used instead of the exact description. We illustrate the approach with an example.
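The crisp input – interval output case can be illustrated with a minimal sketch. It assumes that the design matrix X has full column rank, so that the OLS-set is the image of the box [y_lo, y_hi] under the linear map y -> (X^T X)^{-1} X^T y. The code computes only the componentwise interval enclosure of that image, which is one simple outer approximation and not necessarily the approximation derived in the paper. The function names and the sample data are illustrative assumptions.

```python
# Sketch: componentwise enclosure of the OLS-set for crisp inputs X and
# interval outputs [y_lo, y_hi]. All names and data here are hypothetical.

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def solve(A, B):
    """Solve A Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    # Augmented matrix [A | B] in floating point.
    M = [[float(v) for v in A[i]] + [float(v) for v in B[i]] for i in range(n)]
    for k in range(n):
        piv = max(range(k, n), key=lambda i: abs(M[i][k]))
        if abs(M[piv][k]) < 1e-12:
            raise ValueError("X must have full column rank")
        M[k], M[piv] = M[piv], M[k]
        pivot = M[k][k]
        M[k] = [v / pivot for v in M[k]]
        for i in range(n):
            if i != k:
                factor = M[i][k]
                M[i] = [vi - factor * vk for vi, vk in zip(M[i], M[k])]
    return [row[n:] for row in M]

def ols_set_box(X, y_lo, y_hi):
    """Return one (lower, upper) pair per regression coefficient.

    Every OLS estimate obtained from some y with y_lo <= y <= y_hi lies
    inside the returned box; the box itself is generally larger than the
    OLS-set.
    """
    Xt = transpose(X)
    Q = solve(matmul(Xt, X), Xt)  # Q = (X^T X)^{-1} X^T
    y_c = [(lo + hi) / 2 for lo, hi in zip(y_lo, y_hi)]  # interval centers
    y_r = [(hi - lo) / 2 for lo, hi in zip(y_lo, y_hi)]  # interval radii
    box = []
    for q in Q:
        center = sum(qi * c for qi, c in zip(q, y_c))
        radius = sum(abs(qi) * r for qi, r in zip(q, y_r))
        box.append((center - radius, center + radius))
    return box

# Illustrative data: intercept plus one regressor, three observations.
X = [[1, 0], [1, 1], [1, 2]]
y_lo = [0.9, 1.8, 3.1]
y_hi = [1.1, 2.2, 3.3]
box = ols_set_box(X, y_lo, y_hi)
print(box)
```

Each pair in `box` bounds one coefficient over all outputs consistent with the intervals. The enclosure is cheap to compute, but it discards the dependencies between coefficients that an exact description of the OLS-set would retain.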
Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1062994