Commenced in January 2007
Frequency: Monthly
Edition: International
Predictive Clustering Hybrid Regression (pCHR) Approach and Its Application to Sucrose-Based Biohydrogen Production

Authors: Nikhil, Ari Visa, Chiu-Yue Lin, Jaakko A. Puhakka, Olli Yli-Harja, Chin-Chao Chen


A predictive clustering hybrid regression (pCHR) approach was developed and evaluated using a dataset from an H2-producing, sucrose-based bioreactor operated for 15 months. The aim was to model and predict the H2-production rate from the available information on the envirome and metabolome of the bioprocess. Self-organizing maps (SOM) and a Sammon map were used to visualize the dataset and to identify the main metabolic patterns and clusters in the bioprocess data. Three metabolic clusters were detected: acetate coupled with other metabolites, butyrate only, and transition phases. The pCHR model combines the principles of k-means clustering, kNN classification, and regression. It performed well in both modeling and predicting the H2-production rate, with mean square error values of 0.0014 and 0.0032, respectively.
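
The abstract names three building blocks (k-means clustering, kNN classification, and regression) but does not say how they are wired together. The sketch below shows one plausible arrangement, offered as an assumption and not as the authors' implementation: cluster the training inputs with k-means, fit a separate linear regression in each cluster, then route a new sample to a cluster with kNN and apply that cluster's regression. All function names, the farthest-point initialization, the linear form of the regression, and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np

def kmeans(X, k, n_iter=100):
    """Lloyd's k-means. The deterministic farthest-point start is an
    illustrative assumption, chosen only to make the sketch repeatable."""
    centroids = [X[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centroids], axis=0)
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(n_iter):
        # Assign every sample to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        updated = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return centroids, labels

def fit_cluster_regressions(X, y, labels):
    """Fit one least-squares linear model (with intercept) per cluster."""
    models = {}
    for j in np.unique(labels):
        mask = labels == j
        A = np.column_stack([X[mask], np.ones(mask.sum())])
        coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        models[int(j)] = coef
    return models

def knn_cluster(X_train, labels, x, n_neighbors=3):
    """Majority vote over the cluster labels of the nearest training samples."""
    nearest = np.argsort(np.linalg.norm(X_train - x, axis=1))[:n_neighbors]
    return int(np.bincount(labels[nearest]).argmax())

def pchr_predict(X_train, labels, models, x, n_neighbors=3):
    """Route x to a cluster with kNN, then apply that cluster's regression."""
    j = knn_cluster(X_train, labels, x, n_neighbors)
    return float(np.append(x, 1.0) @ models[j])

# Synthetic two-regime data (hypothetical, not the bioreactor dataset).
rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, size=(40, 2))
B = rng.normal(20.0, 1.0, size=(40, 2))
X = np.vstack([A, B])
y = np.concatenate([2.0 * A[:, 0] + 1.0, -B[:, 0] + 5.0])

centroids, labels = kmeans(X, k=2)
models = fit_cluster_regressions(X, y, labels)
print(pchr_predict(X, labels, models, np.array([0.5, 0.2])))
print(pchr_predict(X, labels, models, np.array([20.5, 19.8])))
```

Because the synthetic targets are noise-free and the two regimes are well separated, each prediction should reproduce the line that generated its regime, up to floating-point error. On real process data the number of clusters, the number of neighbors, and the form of the per-cluster regression would all need to be chosen and validated.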

Keywords: Biohydrogen, Bioprocess Modeling, clustering hybrid regression



