Comparison of Machine Learning Techniques for Single Imputation on Audiograms

Authors: Sarah Beaver, Renee Bryce

Abstract:

Audiograms detect hearing impairment, but missing values limit their usefulness. This work explores single imputation to improve accuracy, implementing Linear Regression, Lasso, Linear Support Vector Regression, Bayesian Ridge, K Nearest Neighbors (KNN), and Random Forest machine learning techniques to impute audiogram thresholds at frequencies ranging from 125 Hz to 8000 Hz. The data comprise over 4000 audiograms from 800 unique patients who had, or were candidates for, cochlear implants. Accuracy is compared across two Nested Cross-Validation k values, and models trained on combined left- and right-ear audiograms are compared with models trained on single-ear audiograms. For the best Random Forest models, Root Mean Square Error (RMSE) ranges from 4.74 to 6.37 and R2 ranges from .91 to .96; for the best KNN models, RMSE ranges from 5.00 to 7.72 and R2 ranges from .89 to .95. Overall, the best imputation models achieve R2 between .89 and .96 and RMSE below 8 dB. We also show that classification predictive models trained on our imputations outperform those trained on constant imputations by two percentage points.

Keywords: Machine Learning, audiograms, data imputations, single imputations.
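The general setup described in the abstract, imputing one audiogram frequency from the others with Random Forest and estimating error via nested cross-validation, can be sketched with scikit-learn. This is a hypothetical illustration, not the authors' code: the data below are synthetic (the study used real cochlear implant candidate audiograms), and the hyperparameter grid and fold counts are assumptions.

```python
# Hypothetical sketch of single imputation on audiograms with Random Forest
# and nested cross-validation. Synthetic data; not the authors' pipeline.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score

rng = np.random.default_rng(0)

# Synthetic audiograms: thresholds (dB HL) at 125-8000 Hz with a sloping
# loss plus noise, so neighboring frequencies are correlated.
freqs = [125, 250, 500, 1000, 2000, 4000, 8000]
n = 300
base = rng.uniform(20, 80, size=(n, 1))
slope = rng.uniform(0, 5, size=(n, 1))
X_full = base + slope * np.arange(len(freqs)) + rng.normal(0, 3, (n, len(freqs)))

# Treat 1000 Hz as the missing frequency to impute from the others.
target_idx = freqs.index(1000)
y = X_full[:, target_idx]
X = np.delete(X_full, target_idx, axis=1)

# Nested CV: the inner loop tunes hyperparameters, the outer loop
# estimates generalization RMSE of the tuned imputation model.
inner = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid={"n_estimators": [50, 100]},
    cv=KFold(n_splits=3, shuffle=True, random_state=0),
    scoring="neg_root_mean_squared_error",
)
outer_scores = cross_val_score(
    inner, X, y,
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="neg_root_mean_squared_error",
)
rmse = -outer_scores.mean()
print(f"nested-CV RMSE: {rmse:.2f} dB")
```

The same pattern applies to the other regressors compared in the paper (Lasso, Bayesian Ridge, KNN, etc.) by swapping the estimator inside `GridSearchCV`.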

