Multidimensional Visualization Tools for Analysis of Expression Data

Authors: Urska Cvek, Marjan Trutschl, Randolph Stone II, Zanobia Syed, John L. Clifford, Anita L. Sabichi

Abstract:

Expression data analysis relies mostly on statistical approaches, which are indispensable for the study of biological systems. However, the large volumes of multidimensional data produced by high-throughput technologies are not fully served by biostatistical techniques alone, and are usually complemented with visual, knowledge discovery, and other computational tools. In many biological systems we can only speculate about the processes that cause the observed changes, and it is during visual exploratory analysis of the data that a hypothesis is formed. We aim to demonstrate the usability of multidimensional visualization tools and to promote their use in the life sciences. We survey several multidimensional visualization tools used in data exploration, such as parallel coordinates and radviz, and extend them by combining them with the self-organizing map algorithm. Our examples use a time-course data set of transitional cell carcinoma of the bladder. Analyzing data with these tools has the potential to uncover additional relationships and non-trivial structures.
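
To make the radviz idea mentioned above concrete, the following is a minimal sketch of the standard radviz projection, not the authors' implementation. It assumes one anchor per dimension spaced evenly on the unit circle and min-max normalization of each column; the function name `radviz_project` and the toy input values are hypothetical and are not taken from the bladder data set.

```python
import math


def radviz_project(rows):
    """Map each n-dimensional row to a 2D point inside the unit circle.

    Assumption: each dimension is min-max normalised to [0, 1] and then
    acts as a spring weight pulling the point toward that dimension's anchor.
    """
    n_dims = len(rows[0])

    # One anchor per dimension, evenly spaced on the unit circle.
    anchors = [
        (math.cos(2 * math.pi * j / n_dims), math.sin(2 * math.pi * j / n_dims))
        for j in range(n_dims)
    ]

    # Column-wise minima and maxima for min-max normalisation.
    lows = [min(row[j] for row in rows) for j in range(n_dims)]
    highs = [max(row[j] for row in rows) for j in range(n_dims)]

    points = []
    for row in rows:
        weights = [
            (row[j] - lows[j]) / (highs[j] - lows[j]) if highs[j] > lows[j] else 0.0
            for j in range(n_dims)
        ]
        total = sum(weights)
        if total == 0:
            # A row with all-zero weights has no pull; place it at the centre.
            points.append((0.0, 0.0))
            continue
        # Weighted average of the anchor positions.
        x = sum(w * a[0] for w, a in zip(weights, anchors)) / total
        y = sum(w * a[1] for w, a in zip(weights, anchors)) / total
        points.append((x, y))
    return points


# Hypothetical toy profiles: four records measured on four dimensions.
toy_rows = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
]
toy_points = radviz_project(toy_rows)
```

A record dominated by a single dimension lands at that dimension's anchor, while a record with equal values in all dimensions lands at the centre, which is why radviz tends to separate profiles by which dimensions are relatively high.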

Keywords: microarrays, visualization, parallel coordinates, radviz, self-organizing maps.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1074635

