Machine Learning Framework: Competitive Intelligence and Key Drivers Identification of Market Share Trends among Healthcare Facilities

A. Appe; B. Poluparthi; L. Kasivajjula; U. Mv; S. Bagadi; P. Modi; A. Singh; H. Gunupudi; S. Troiano; J. Paul; J. Stovall; J. Yamamoto

Abstract:

The need for data-driven decisions in healthcare strategy formulation is rapidly increasing. A reliable framework that helps identify the factors impacting the market share of a healthcare provider facility or hospital (hereafter termed a facility) is of key importance. This pilot study aims to develop a data-driven machine learning regression framework that aids strategists in formulating key decisions to improve a facility's market share, which in turn helps improve the quality of healthcare services. The US (United States) healthcare business is chosen for the study, and data spanning 60 key facilities in Washington State and about 3 years of history are considered. In the current analysis, market share is defined as the ratio of a facility's encounters to the total encounters among the group of potential competitor facilities. The study proposes a two-pronged approach: competitor identification to evaluate market share, and regression to predict it. A model-agnostic technique, SHAP (SHapley Additive exPlanations), is used to quantify the relative importance of the features impacting market share. Typical techniques in the literature for quantifying the degree of competitiveness among facilities use an empirical method to calculate a competitive factor that indicates the severity of competition. The proposed method identifies a pool of competitors, develops Directed Acyclic Graphs (DAGs) and feature-level word vectors, and evaluates the key connected components at the facility level. This technique is robust because it is data-driven, which minimizes the bias introduced by empirical techniques. The DAGs factor in partial correlations at various segregations and key demographics of facilities, along with a placeholder for various business rules (e.g., quantifying patient exchanges, provider references, and sister facilities). This step identifies multiple groups of competitors among facilities.
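The market share definition and the grouping of facilities into competitor pools can be illustrated with a minimal sketch. It assumes that competitor links are already known and treats them as undirected, so each pool is a (weakly) connected component. The facility names, links, and encounter counts are hypothetical, and the helper functions `competitor_groups` and `market_share` are illustrative names rather than the authors' implementation, which derives the links from partial correlations and business rules.

```python
from collections import defaultdict


def competitor_groups(edges):
    """Return connected components of the competitor graph, edges treated as undirected."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        # Depth-first traversal collects every facility reachable from `start`.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        groups.append(sorted(component))
    return groups


def market_share(encounters, group):
    """Share of each facility = its encounters / total encounters in its competitor group."""
    total = sum(encounters[f] for f in group)
    return {f: encounters[f] / total for f in group}


# Hypothetical competitor links and encounter counts (not data from the study).
edges = [("A", "B"), ("B", "C"), ("D", "E")]
encounters = {"A": 500, "B": 300, "C": 200, "D": 150, "E": 50}

groups = competitor_groups(edges)
shares = {f: s for g in groups for f, s in market_share(encounters, g).items()}
```

With these toy inputs, facilities A, B, and C form one pool and D and E another, and each facility's share is computed only against its own pool.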
Using the identified competitors, a Random Forest regression model was developed and fine-tuned to predict market share. To identify the key drivers of market share at an overall level, the permutation feature importance of the attributes was calculated. For relative quantification of features at the facility level, SHAP, a model-agnostic explainer, was incorporated. This helped identify and rank the attributes that impact market share at each facility. The approach combines two popular and efficient modeling practices, namely machine learning with graphs and tree-based regression techniques, to reduce bias. Together, these results helped drive strategic business decisions.
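Permutation feature importance, as used for the overall key drivers, can be sketched in a few lines. The idea is to shuffle one feature column at a time and measure how much the prediction error grows. To keep the example dependency-free, a fixed hypothetical linear predictor stands in for the fitted Random Forest. The feature names (`beds`, `rating`, `noise`), the coefficients, and the synthetic data are all assumptions made for illustration, not values from the study.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean increase in MSE when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving every other feature untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


def predict(row):
    """Hypothetical stand-in for a fitted regressor; ignores the noise feature."""
    beds, rating, noise = row
    return 0.002 * beds + 0.05 * rating


# Synthetic facilities: [beds, rating, noise]; the target is generated by `predict`.
data_rng = random.Random(1)
X = [[data_rng.uniform(50, 500), data_rng.uniform(1, 5), data_rng.uniform(0, 1)]
     for _ in range(200)]
y = [predict(row) for row in X]

importances = permutation_importance(predict, X, y)
```

In this toy setup, the unused `noise` feature receives zero importance, while `beds` ranks above `rating` because it drives more of the variation in the target. With a real fitted model, the same procedure would normally be run on held-out data.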

Keywords: Competition, DAGs, hospital, healthcare, machine learning, market share, random forest, SHAP.
