Search results for: healthcare data
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25935

25035 Comprehensive Study of Data Science

Authors: Asifa Amara, Prachi Singh, Kanishka, Debargho Pathak, Akshat Kumar, Jayakumar Eravelly

Abstract:

Today's generation depends heavily on technology that uses data as its fuel. This study surveys innovations and developments in data science and explains how to use the available data efficiently; it is intended to help readers understand the core concepts of the field. The concept of artificial intelligence was introduced by Alan Turing, whose central idea was to create an artificial system that can operate independently of human-written programs and can function by analyzing data to understand users' requirements. Data science encompasses business understanding, data analysis, ethical concerns, programming languages, the various fields and sources of data, and the skills required to work with them. The usage of data science has evolved over the years. This review article concentrates on one part of data science, namely machine learning, which builds on data science for its work: machines learn from experience, which helps them perform tasks more efficiently. The article includes a comparative illustration of human understanding versus machine understanding, along with the advantages, applications, and real-world examples of machine learning. Data science has been an important game changer in human life. Since its advent, it has led to a better understanding of people and of individual needs, and it has improved business strategies, the services businesses provide, forecasting, and the ability to attain sustainable development. This study also aims at a better understanding of data science that will help us create a better world.

Keywords: data science, machine learning, data analytics, artificial intelligence

Procedia PDF Downloads 82
25034 Online Dietary Management System

Authors: Kyle Yatich Terik, Collins Oduor

Abstract:

The current healthcare system has made healthcare more accessible and efficient through information technology, for example by implementing computer algorithms that generate menus based on a diagnosis. Many such systems have been created over the years, but their main objective is to help healthy individuals calculate their calorie intake and to assist them by providing food selections based on a pre-specified calorie target. Such applications have proven useful in some respects, yet they are not suitable for monitoring, planning, and managing the dietary needs of hospital patients, especially those in critical condition. The system described here addresses a number of objectives. The main objective is to design, develop, and implement an efficient, user-friendly, and interactive dietary management system. The specific design and development objectives include a monitoring feature that presents users' progress using graphs, system-generated reports for users, dietitians, and system administrators, a feature that allows users to calculate their BMI (Body Mass Index), and a food template feature that guides the user toward a balanced diet plan. To inform the design, research was carried out in Nairobi County, Kenya, with online questionnaires as the preferred research design approach. From the 44 respondents, the major challenges of the manual dietary system could be identified, including the lack of easily accessible calorie information for food products and the expense of physically visiting a dietitian to create a tailored diet plan. In conclusion, the system has the potential to improve quality of life by providing a standard for healthy living and by giving individuals readily available knowledge through food templates that guide them in creating their own balanced diet plans.
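
As an illustration of the BMI feature mentioned in the abstract, the following is a minimal Python sketch; the function names and the WHO-style category cut-offs are assumptions for illustration and are not taken from the paper.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight (kg) divided by height (m) squared."""
    return weight_kg / (height_m ** 2)

def bmi_category(value: float) -> str:
    """Map a BMI value to a broad category (standard WHO-style cut-offs)."""
    if value < 18.5:
        return "underweight"
    if value < 25.0:
        return "normal"
    if value < 30.0:
        return "overweight"
    return "obese"

# Example: a 70 kg user who is 1.75 m tall
value = bmi(70, 1.75)
print(round(value, 1), bmi_category(value))  # 22.9 normal
```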

Keywords: DMS, dietitian, patient, administrator

Procedia PDF Downloads 161
25033 Application of Artificial Neural Network Technique for Diagnosing Asthma

Authors: Azadeh Bashiri

Abstract:

Introduction: The lack of proper diagnosis and inadequate treatment of asthma leads to physical and financial complications. This study aimed to use data mining techniques to create a neural-network-based intelligent system for diagnosing asthma. Methods: The study population consisted of patients who had visited one of the lung clinics in Tehran. Data were analyzed with SPSS, and Pearson's chi-square test was used as the basis for ranking the variables. The neural network was trained with the backpropagation learning technique. Results: Based on the SPSS analysis, the 13 most effective factors were selected. The data were then combined in various forms to build different models for training and testing the networks, and in all configurations the network correctly predicted 100% of the cases. Conclusion: Using data mining methods before designing the system structure, in order to reduce the data dimension and select the optimal inputs, leads to a more accurate system. Considering data mining approaches is therefore necessary given the nature of medical data.
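
As a rough illustration of the kind of model described above (a backpropagation-trained network over 13 selected factors), the following is a minimal sketch using scikit-learn; the placeholder data, feature set, and network size are assumptions, not the authors' configuration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Assumed inputs: X holds the 13 selected clinical factors, y the asthma label (0/1).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 13))           # placeholder for the clinic data
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # placeholder diagnosis labels

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# A small feed-forward network trained with backpropagation, as in the abstract.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```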

Keywords: asthma, data mining, Artificial Neural Network, intelligent system

Procedia PDF Downloads 273
25032 Interpreting Privacy Harms from a Non-Economic Perspective

Authors: Christopher Muhawe, Masooda Bashir

Abstract:

With the growth of information and communication technology (ICT), the virtual world has become the new normal. At the same time, private and public entities are collecting massive amounts of data at an unprecedented scale. Unfortunately, this increase in data collection has gone hand in hand with an increase in data misuse and data breaches. Regrettably, the majority of data breach and data misuse claims have been unsuccessful in United States courts because plaintiffs could not prove direct injury to physical or economic interests. The requirement to express data privacy harms in economic or physical terms ignores the fact that not all data harms are physical or economic in nature. The challenge is compounded by the fact that the harms and risks of a data breach do not materialize immediately. This research uses a descriptive and normative approach to show that not all data harms can be expressed in economic or physical terms. Framing privacy harms purely as economic or physical harm ignores the fact that data insecurity may produce harms that undermine the functions of privacy in our lives: the promotion of liberty, selfhood, autonomy, and human social relations, and the furtherance of a free society. No economic value can be placed on these functions of privacy. The proposed approach therefore addresses data harms from a psychological and social perspective.

Keywords: data breach and misuse, economic harms, privacy harms, psychological harms

Procedia PDF Downloads 195
25031 The Importance and Feasibility of Hospital Interventions for Patient Aggression and Violence Against Physicians in China: A Delphi Study

Authors: Yuhan Wu, CTB (Kees) Ahaus, Martina Buljac-Samardzic

Abstract:

Patient aggression and violence is a complex occupational hazard for physicians working in hospitals, and it can have multiple severe negative effects on physicians and hospitals. Although a range of interventions is applied in the healthcare sector in various countries, China lacks a comprehensive set of interventions at the hospital level in this area. Given the cultural differences, this study therefore investigates whether international interventions are important and feasible in the Chinese cultural context by conducting a Delphi study. Based on a literature search, a list of 47 hospital interventions to prevent and manage patient aggression and violence was constructed, covering 8 categories: hospital environment design, access and entrance, staffing and work practice, training and education, leadership and culture, support, during/after-the-event actions, and hospital policy. The list of interventions will be refined and extended during a three-round Delphi study. The panel consists of 17 Chinese experts, including physicians who have experienced patient aggression and violence, hospital management team members, scientists working in this research area, and policymakers in the healthcare sector. In each round, experts receive the candidate interventions with the instruction to rate, for each intervention, both its importance and its feasibility for preventing and managing patient violence and aggression in Chinese hospitals. Interventions are included or excluded based on their importance scores: an intervention is included after a round if more than 80% of the experts judge it important or very important, and excluded if more than 50% judge it not or only moderately important. The three-round Delphi study will yield a list of included interventions and assess which of the 8 categories of interventions are considered important. This study is expected to bring new ideas and inspiration to Chinese hospitals in the prevention and management of patient aggression and violence.
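
The per-round inclusion rule quoted above can be expressed directly in code; the sketch below is a minimal illustration that assumes ratings are collected on a 5-point importance scale (1 = not important … 5 = very important), which is an assumption about the instrument rather than something stated in the abstract.

```python
from typing import List

def delphi_decision(ratings: List[int]) -> str:
    """Apply the round-level rule described in the abstract:
    include if >80% rate the intervention 4 (important) or 5 (very important),
    exclude if >50% rate it 1-3 (not / moderately important), otherwise revisit."""
    n = len(ratings)
    share_important = sum(r >= 4 for r in ratings) / n
    share_not_important = sum(r <= 3 for r in ratings) / n
    if share_important > 0.80:
        return "include"
    if share_not_important > 0.50:
        return "exclude"
    return "carry to next round"

# Example: ratings from a 17-expert panel for one intervention
print(delphi_decision([5, 4, 4, 5, 5, 4, 4, 5, 4, 4, 5, 4, 5, 4, 4, 5, 3]))  # include
```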

Keywords: patient aggression and violence, hospital interventions, feasibility, importance

Procedia PDF Downloads 96
25030 Machine Learning Analysis of Student Success in Introductory Calculus Based Physics I Course

Authors: Chandra Prayaga, Aaron Wade, Lakshmi Prayaga, Gopi Shankar Mallu

Abstract:

This paper presents the use of machine learning algorithms to predict student success in an introductory physics course. A dataset of 140 rows, covering the performance of two batches of students, was used. The lack of sufficient data to train robust machine learning models was compensated for by generating synthetic data similar to the real data. CTGAN and CTGAN with a Gaussian Copula (Gaussian) were used to generate synthetic data, with the real data as input. To check the similarity between the real data and each synthetic dataset, pair plots were made. The synthetic data were used to train machine learning models with the PyCaret package. For the CTGAN data, the AdaBoost Classifier (ADA) was found to be the best-fitting model, whereas the CTGAN with Gaussian Copula yielded Logistic Regression (LR) as the best model. Both models were then tested for accuracy on the real data. ROC-AUC analysis was performed for all ten classes of the target variable (grades A, A-, B+, B, B-, C+, C, C-, D, F). The ADA model with CTGAN data showed a mean AUC score of 0.4377, while the LR model with the Gaussian data showed a mean AUC score of 0.6149. ROC-AUC plots were obtained for each grade value separately. The LR model with Gaussian data showed consistently better AUC scores than the ADA model with CTGAN data, except for two grade values, C- and A-.
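
A minimal sketch of the evaluation pattern described above (train on synthetic data, test on real data, per-class ROC-AUC) is given below with scikit-learn; the synthetic data is assumed to have been generated beforehand (e.g. with CTGAN), and the feature and grade columns are placeholders rather than the authors' variables.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import label_binarize

def evaluate_on_real(synthetic_df: pd.DataFrame, real_df: pd.DataFrame, target="grade"):
    """Train on synthetic records, score on the real records, return per-class and mean AUC."""
    X_syn, y_syn = synthetic_df.drop(columns=[target]), synthetic_df[target]
    X_real, y_real = real_df.drop(columns=[target]), real_df[target]

    model = AdaBoostClassifier(random_state=0).fit(X_syn, y_syn)
    proba = model.predict_proba(X_real)
    classes = list(model.classes_)

    # One-vs-rest ROC-AUC for each grade class, as in the abstract
    y_bin = label_binarize(y_real, classes=classes)
    per_class = {c: roc_auc_score(y_bin[:, i], proba[:, i])
                 for i, c in enumerate(classes)}
    return per_class, float(np.mean(list(per_class.values())))
```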

Keywords: machine learning, student success, physics course, grades, synthetic data, CTGAN, gaussian copula CTGAN

Procedia PDF Downloads 44
25029 Leveraging on Application of Customer Relationship Management Strategy as Business Driving Force: A Case Study of Major Industries

Authors: Odunayo S. Faluse, Roger Telfer

Abstract:

Customer relationship management (CRM) is a business strategy centred on the idea that the customer is the driving force of any business, i.e., that the customer occupies a central position in any business. This belief, coupled with the advances in information technology of the past twenty years, has undergone a change. In any form of business today, customers can be seen as the modern dictators to whom the industry constantly adjusts its operations, owing to the increased availability of information, intense market competition, and the ever-growing negotiating power of customers in the process of buying and selling. The most vital role of any organization is to satisfy or meet customers' needs and demands, which ultimately determines the customer's long-term value to the industry. This paper therefore analyses and describes the application of operational CRM strategies in some of the major industries. Both established and emerging companies now value the quality of customer service and client loyalty; they also recognize the customers who are not very sensitive to changes in price, and they realize that attracting new customers is more demanding and expensive than retaining existing ones. Research shows that several factors have recently contributed to the sudden rise in the adoption of CRM strategies in the marketplace, such as a shift of attention in some organizations towards retaining existing customers rather than attracting new ones, the gathering of customer data through internal database systems and the acquisition of external syndicated data, and the exponential increase in technological intelligence. Despite these developments in business operations, academic CRM research remains nascent; this paper therefore gives a detailed critical analysis of recent advances in the use of CRM and identifies key research opportunities for the future development of CRM as a determinant of successful business optimization.

Keywords: agriculture, banking, business strategies, CRM, education, healthcare

Procedia PDF Downloads 223
25028 Prediction of Alzheimer's Disease Based on Blood Biomarkers and Machine Learning Algorithms

Authors: Man-Yun Liu, Emily Chia-Yu Su

Abstract:

Alzheimer's disease (AD) is a public health crisis of the 21st century. AD is a degenerative brain disease and the most common cause of dementia, a costly burden on the healthcare system. Unfortunately, the cause of AD is poorly understood; furthermore, current treatments can only alleviate symptoms rather than cure the disease or stop its progress. There are currently several ways to diagnose AD: medical imaging can be used to distinguish between AD, other dementias, and early-onset AD, and cerebrospinal fluid (CSF) can be analyzed. Compared with other diagnostic tools, a blood (plasma) test has advantages as an approach to population-based disease screening because it is simpler, less invasive, and more cost effective. In our study, we used the blood biomarker dataset of the Alzheimer's Disease Neuroimaging Initiative (ADNI), funded by the National Institutes of Health (NIH), for data analysis and prediction model development. We used independent analyses of the datasets to identify plasma protein biomarkers that predict early-onset AD. First, to compare basic demographic statistics between the cohorts, we used SAS Enterprise Guide for data preprocessing and statistical analysis. Second, we used logistic regression, neural networks, and decision trees in SAS Enterprise Miner to validate the biomarkers. The data generated from ADNI contained 146 blood biomarkers from 566 participants, including cognitively normal (healthy) individuals, individuals with mild cognitive impairment (MCI), and patients with Alzheimer's disease (AD). The samples were separated into two comparisons, healthy versus MCI and healthy versus AD, which we used to compare the biomarkers important for AD and MCI. In preprocessing, we used a t-test to filter 41/47 features for the two comparisons (healthy vs. AD and healthy vs. MCI) before applying the machine learning algorithms. We then built models with four machine learning methods; the best AUC for the two comparisons was 0.991 and 0.709, respectively. We want to stress that a simple, less invasive, routine blood (plasma) test may also enable early diagnosis of AD. In our view, these results provide evidence that blood-based biomarkers could serve as an alternative diagnostic tool before further examination with CSF analysis and medical imaging. A comprehensive study of the differences in blood-based biomarkers between AD patients and healthy subjects is warranted. Early detection of AD progression will give physicians the opportunity for early intervention and treatment.
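
The biomarker-filtering step described above (a t-test between groups followed by supervised classification) can be sketched as follows; the p-value threshold, the data layout, and the use of scikit-learn's logistic regression in place of SAS Enterprise Miner are assumptions for illustration only.

```python
import pandas as pd
from scipy.stats import ttest_ind
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def ttest_filter(df: pd.DataFrame, label_col: str, alpha: float = 0.05) -> list:
    """Keep biomarkers whose means differ between the two groups (e.g. healthy vs. AD)."""
    groups = df[label_col].unique()
    a, b = df[df[label_col] == groups[0]], df[df[label_col] == groups[1]]
    keep = []
    for col in df.columns.drop(label_col):
        _, p = ttest_ind(a[col], b[col], equal_var=False)
        if p < alpha:
            keep.append(col)
    return keep

# Assumed data frame: one row per participant, biomarker columns plus a 'diagnosis' label.
# selected = ttest_filter(adni_df, "diagnosis")
# auc = cross_val_score(LogisticRegression(max_iter=1000),
#                       adni_df[selected], adni_df["diagnosis"],
#                       cv=5, scoring="roc_auc").mean()
```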

Keywords: Alzheimer's disease, blood-based biomarkers, diagnostics, early detection, machine learning

Procedia PDF Downloads 322
25027 Data Access, AI Intensity, and Scale Advantages

Authors: Chuping Lo

Abstract:

This paper presents a simple model demonstrating that, ceteris paribus, countries with lower barriers to accessing global data tend to earn higher incomes than other countries. Therefore, large countries, which inherently have greater data resources, tend to have higher incomes than smaller countries, and the former may be more hesitant than the latter to liberalize cross-border data flows in order to maintain this advantage. Furthermore, countries with higher artificial intelligence (AI) intensity in their production technologies tend to benefit more from economies of scale in data aggregation, leading to higher income and more trade, as they are better able to utilize global data.

Keywords: digital intensity, digital divide, international trade, scale of economics

Procedia PDF Downloads 68
25026 Secured Transmission and Reserving Space in Images Before Encryption to Embed Data

Authors: G. R. Navaneesh, E. Nagarajan, C. H. Rajam Raju

Abstract:

Nowadays, multimedia data are often used to store secure information. All previous methods allocate space in the image for data embedding after encryption. In this paper, we propose a novel method that reserves room in the image, surrounded by a boundary, before encryption using a traditional RDH algorithm, which makes it easy for the data hider to reversibly embed data in the encrypted image. The proposed method can achieve real-time performance; that is, data extraction and image recovery are free of any error. A secure transmission process is also discussed, which improves efficiency roughly tenfold compared to the other processes discussed.
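
Since the keywords mention least-significant-bit embedding, a minimal sketch of LSB data hiding with NumPy is shown below; it illustrates the general idea only and is not the reserving-room-before-encryption scheme proposed in the paper.

```python
import numpy as np

def embed_lsb(image: np.ndarray, payload: bytes) -> np.ndarray:
    """Hide payload bits in the least significant bit of each pixel (grayscale uint8)."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = image.flatten().copy()
    if bits.size > flat.size:
        raise ValueError("payload too large for this image")
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits  # clear the LSB, then set it
    return flat.reshape(image.shape)

def extract_lsb(image: np.ndarray, n_bytes: int) -> bytes:
    """Read back n_bytes hidden by embed_lsb."""
    bits = image.flatten()[: n_bytes * 8] & 1
    return np.packbits(bits).tobytes()

cover = np.random.default_rng(0).integers(0, 256, size=(64, 64), dtype=np.uint8)
stego = embed_lsb(cover, b"secret")
print(extract_lsb(stego, 6))  # b'secret'
```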

Keywords: secure communication, reserving room before encryption, least significant bits, image encryption, reversible data hiding

Procedia PDF Downloads 412
25025 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. Current research, however, looks at using DNA as a biometric identity verification modality, with the goal of improving the speed of identification. We aim to use gene data that was originally collected for autism detection and to determine whether, and how accurately, this data can be used for identification applications. Our main goal is to find out whether our data preprocessing technique yields data that is useful as a biometric identification tool. We experiment with the nearest neighbor classifier to identify subjects. Results show that the optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and a standard deviation of 1. The classification rate remains close to optimal at higher noise levels, up to a standard deviation of 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).
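
A minimal sketch of the experiment described above (k-NN identification with Gaussian noise added to the test set) is given below using scikit-learn; the data shape and noise levels are placeholders, not the paper's genetic dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
# Placeholder "gene" features: 50 subjects, 10 samples each, 30 features per sample.
X = np.repeat(rng.normal(size=(50, 30)), 10, axis=0) + rng.normal(scale=0.2, size=(500, 30))
y = np.repeat(np.arange(50), 10)

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
knn = KNeighborsClassifier(n_neighbors=1).fit(X_train, y_train)

# Corrupt the test set with zero-mean Gaussian noise of increasing standard deviation.
for std in (0.0, 1.0, 2.0, 3.0):
    noisy = X_test + rng.normal(scale=std, size=X_test.shape)
    print(f"noise std {std}: identification rate {knn.score(noisy, y_test):.2f}")
```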

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 258
25024 A Review on Intelligent Systems for Geoscience

Authors: R. Palson Kennedy, P. Kiran Sai

Abstract:

This article introduces machine learning (ML) researchers to the hurdles that geoscience problems present, as well as to the opportunities for improvement in both ML and the geosciences, and it presents a review from the data life cycle perspective to meet that need. Numerous facets of the geosciences present unique difficulties for the study of intelligent systems: geoscience data are notoriously difficult to analyze, since they are frequently unpredictable, intermittent, sparse, multi-resolution, and multi-scale. The first half of the article addresses data science's essential concepts and theoretical underpinnings, while the second section covers key themes and shared experiences from current publications focused on each stage of the data life cycle. Finally, themes such as open science, smart data, and team science are considered.

Keywords: data science, intelligent system, machine learning, big data, data life cycle, recent development, geoscience

Procedia PDF Downloads 135
25023 Covid-19 Frontliners Survey: Assessing Complications and Quality of Life in Health Care Workers in District Swat, Khyber Pakhtunkhwa, Pakistan

Authors: Mohsin Shahab, Shagufta Rehmat, Faisal F. Khan

Abstract:

Background: The global COVID-19 pandemic has generated health problems worldwide, and health care workers are the front-line warriors against it. The aim of this study was to determine the prevalence of COVID-19 (7th May 2021 to 3rd August 2021) among Health Care Workers (HCWs), to assess the complications associated with it, and to evaluate its effects on their quality of life. Material and Method: The study was conducted in the healthcare facilities serving as pandemic hospitals in district Swat. A total of 140 healthcare workers employed in COVID-19 care, including the department of Pulmonology, the Intensive Care Unit (ICU), and the COVID-19 wards, were enrolled. Participants were tested for COVID-19 using the RT-PCR test, and a Case Report Form (CRF) covering conditions during and after COVID-19 was completed to assess the complications and quality of life of the health care workers. Results: Of the 140 health care workers studied, 40% were doctors, 22% nursing staff, 17% paramedic staff, 9% cleaning staff, 6% lab technologists, and 2% each operation theater staff, administration staff, and pharmacists. The respondents were also investigated for pre-existing illness prior to SARS-CoV-2 infection; hypertension was the most prevalent, followed by chronic heart disease and neurological disorders. Fever was the most common symptom, recorded in 76.42% of the participants, while 55.71% had a dry cough, 55% had a sore throat, and 43.56% reported chest pain. The reinfection rate was 10%, with chest pain recorded in 85.71% of reinfected participants. Post-disease complication analysis showed that 47.14% of the participants received a new diagnosis after recovering from COVID-19; pulmonological diseases were the most commonly recorded new diagnoses, followed by gastrointestinal and psychological problems. Conclusions: The results of the study illustrate how COVID-19 has affected the overall health and quality of life of HCWs in district Swat of Khyber Pakhtunkhwa, Pakistan.

Keywords: SARS-CoV-2, COVID-19, HCW's, symptoms, questionnaire, post COVID-19

Procedia PDF Downloads 275
25022 Evaluating the Effectiveness of Plantar Sensory Insoles and Remote Patient Monitoring for Early Intervention in Diabetic Foot Ulcer Prevention in Patients with Peripheral Neuropathy

Authors: Brock Liden, Eric Janowitz

Abstract:

Introduction: Diabetic peripheral neuropathy (DPN) affects 70% of individuals with diabetes [1]. DPN causes a loss of protective sensation, which can lead to tissue damage and diabetic foot ulcer (DFU) formation [2]. These ulcers can result in infections and lower-extremity amputations of toes, the entire foot, or the lower leg. Even after a DFU has healed, recurrence is common, with 49% of DFU patients developing another ulcer within a year and 68% within 5 years [3]. This case series examines the use of sensory insoles, the newly available plantar data they provide (pressure, temperature, step count, adherence), and remote patient monitoring in patients at risk of DFU. Methods: Participants were provided with custom-made sensory insoles that monitor plantar pressure, temperature, step count, and daily use, and that give real-time cues for pressure offloading as the participants go about their daily activities. The sensory insoles were used to track subject compliance, ulceration, and response to feedback from real-time alerts. Patients were remotely monitored by a qualified healthcare professional, who contacted them when areas of concern were seen, coached them on reducing risk factors, and provided overall support to improve foot health. Results: Of the 40 participants provided with the sensory insole system, 4 presented with a DFU. Based on flags generated from the available plantar data, patients were contacted by the remote monitor to address potential concerns. A standard clinical escalation protocol detailed when and how concerns should be escalated to the provider by the remote monitor. Upon escalation to the provider, patients were brought into the clinic as needed, allowing issues to be addressed before more serious complications might arise. Conclusion: This case series explores the use of innovative sensory technology to collect plantar data (pressure, temperature, step count, and adherence) for DFU detection and early intervention. The results suggest the importance of sensory technology and remote patient monitoring in providing proactive, preventative care for patients at risk of DFU. This robust plantar data, combined with remote patient monitoring, allows patients to be seen in the clinic when concerns arise, giving providers the opportunity to intervene early and prevent more serious complications, such as wounds, from occurring.

Keywords: diabetic foot ulcer, DFU prevention, digital therapeutics, remote patient monitoring

Procedia PDF Downloads 77
25021 Advances in Medication Reconciliation Tools

Authors: Zixuan Liu, Xin Zhang, Kexin He

Abstract:

In the context of the widespread prevalence of multiple diseases, medication safety has become an issue of great concern for patient safety. Medication reconciliation plays a vital role in preventing potential medication risks. In medical practice, however, medication reconciliation faces various challenges, and there is such a wide variety of medication reconciliation tools that selecting the appropriate one can be difficult. This article introduces and analyzes the currently available medication reconciliation tools, providing a reference for healthcare professionals in choosing and applying the appropriate tools.

Keywords: patient safety, medication reconciliation, tools, review

Procedia PDF Downloads 80
25020 A Multicenter Assessment on Psychological Well-Being Status among Medical Residents in the United Arab Emirates

Authors: Mahera Abdulrahman

Abstract:

Objective: The country's recent healthcare transformation from traditional to modern prompted the need to address career choices, accreditation perception, and satisfaction among medical residents. However, a concerted nationwide study to understand and address burnout in the medical residency programs had not been conducted in the UAE and the region. Methods: A nationwide, multicenter, cross-sectional study was designed to evaluate professional burnout and depression among medical residents in order to address this gap. Results: Our results indicate that 75.5% (216/286) of UAE medical residents had moderate to high emotional exhaustion, 84% (249/298) had high depersonalization, and 74% (216/291) had a low sense of personal accomplishment. In aggregate, 70% (212/302) of medical residents were considered to be experiencing at least one symptom of burnout, based on a high emotional exhaustion score or a high depersonalization score. Depression, ranging from 6-22% depending on the specialty, was also striking, given that Arab culture places a high emphasis on family bonding. Interestingly, 83% (40/48) of medical residents who had high scores for depression also reported burnout. Conclusion: Our data indicate that burnout and depression among medical residents are epidemic. There is an immediate need to address burnout through effective interventions at both the individual and institutional levels. It is imperative to reconfigure the approach to medical training for the well-being of the next generation of physicians in the Arab world.

Keywords: mental health, Gulf, Arab, residency training, burnout, depression

Procedia PDF Downloads 294
25019 The Long-Term Effects of a Prevention Program on the Number of Critical Incidents and Sick Leave Days: A Decade Perspective

Authors: Valerie Isaak

Abstract:

Background: This study explores the effectiveness of refresher training sessions for an intervention program in reducing employees' risk of injury due to patient violence in a forensic psychiatric hospital. Methods: The original safety intervention program, consisting of a 3-day workshop, was conducted in the maximum-security ward of a psychiatric hospital in Israel. Ever since the original intervention, annual refreshers have been conducted, each highlighting one of the safety elements covered in the original intervention. The study examines the effect of the intervention program, together with the refreshers, over a period of 10 years in four wards. Results: Analysis of the data demonstrates that, beyond the initial reduction following the original intervention, the refreshers seem to have an additional positive long-term effect, reducing both the number of violent incidents and the number of actual employee injuries in the forensic psychiatric hospital. Conclusions: We conclude that such an intervention program followed by refresher training promotes employees' wellbeing. A healthy work environment is part of management's commitment to improving employee wellbeing at the workplace.

Keywords: wellbeing, violence at work, intervention program refreshers, public sector mental healthcare

Procedia PDF Downloads 137
25018 A Mixed Methods Study: Evaluation of Experiential Learning Techniques throughout a Nursing Curriculum to Promote Empathy

Authors: Joan Esper Kuhnly, Jess Holden, Lynn Shelley, Nicole Kuhnly

Abstract:

Empathy serves as a foundational nursing principle inherent in the nurse's ability to form the relationships from which to care for patients. Evidence supports including empathy in nursing and healthcare education, but there is limited data on which methods are effective for doing so. A growing body of evidence supports experiential and interactive learning methods as effective ways for students to gain insight and perspective from a personalized experience. The purpose of this project is to evaluate learning activities designed to promote the attainment of empathic behaviors across five levels of the nursing curriculum. Quantitative analysis will be conducted on data from pre- and post-learning activities using the Toronto Empathy Questionnaire. The main hypothesis, that simulation learning activities will increase empathy, will be examined using a repeated measures Analysis of Variance (ANOVA) on pre- and post-test Toronto Empathy Questionnaire scores for three simulation activities (stroke, poverty, dementia). Pearson product-moment correlations will be conducted to examine the relationships between continuous demographic variables, such as age, credits earned, and years practicing, and the dependent variable of interest, the post-test Toronto Empathy score. Krippendorff's method of content analysis will be conducted to identify the quantitative incidence of empathic responses. The researchers will use Colaizzi's descriptive phenomenological method to describe the students' simulation experience and understand its impact on caring and empathy behaviors, employing bracketing to maintain objectivity. The results will be presented in answer to multiple research questions, and the discussion will relate the results to educational pedagogy in the nursing curriculum as it concerns the attainment of empathic behaviors.
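
The planned repeated-measures ANOVA on pre/post Toronto Empathy Questionnaire scores could be run as in the sketch below; the data layout, column names, and use of statsmodels are illustrative assumptions, not the authors' analysis code.

```python
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Assumed long-format data: one row per student per measurement, with columns for
# student id, simulation activity, time point (pre/post), and TEQ score.
df = pd.DataFrame({
    "student":  [1] * 6 + [2] * 6 + [3] * 6,
    "activity": ["stroke", "stroke", "poverty", "poverty", "dementia", "dementia"] * 3,
    "time":     ["pre", "post"] * 9,
    "teq":      [42, 47, 40, 46, 43, 49, 38, 41, 39, 44, 40, 45, 41, 46, 37, 43, 42, 48],
})

# Two within-subject factors: simulation activity and pre/post time point.
result = AnovaRM(data=df, depvar="teq", subject="student",
                 within=["activity", "time"]).fit()
print(result)
```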

Keywords: curriculum, empathy, nursing, simulation

Procedia PDF Downloads 111
25017 Data Quality as a Pillar of Data-Driven Organizations: Exploring the Benefits of Data Mesh

Authors: Marc Bachelet, Abhijit Kumar Chatterjee, José Manuel Avila

Abstract:

Data quality is a key component of any data-driven organization. Without data quality, organizations cannot effectively make data-driven decisions, which often leads to poor business performance. It is therefore important for an organization to ensure that the data it uses is of high quality. This is where the concept of data mesh comes in. Data mesh is a decentralized organizational and architectural approach to data management that can help organizations improve the quality of their data. The concept of data mesh was first introduced in 2020. Its purpose is to decentralize data ownership, making it easier for domain experts to manage the data. This can help organizations improve data quality by reducing the reliance on centralized data teams and allowing domain experts to take charge of their data. This paper discusses how a set of elements, including data mesh, serve as tools capable of increasing data quality. One of the key benefits of data mesh is improved metadata management. In a traditional data architecture, metadata management is typically centralized, which can lead to data silos and poor data quality. With data mesh, metadata is managed in a decentralized manner, ensuring accurate and up-to-date metadata and thereby improving data quality. Another benefit of data mesh is the clarification of roles and responsibilities. In a traditional data architecture, data teams are responsible for managing all aspects of data, which can lead to confusion and ambiguity about responsibilities. With data mesh, domain experts are responsible for managing their own data, which provides clarity about roles and responsibilities and improves data quality. Additionally, data mesh can contribute to a new form of organization that is more agile and adaptable. By decentralizing data ownership, organizations can respond more quickly to changes in their business environment, which in turn can improve overall performance by enabling better business insights through better reports and visualization tools. Monitoring and analytics are also important aspects of data quality. With data mesh, monitoring and analytics are decentralized, allowing domain experts to monitor and analyze their own data. This helps identify and address data quality problems quickly, leading to improved data quality. Data culture is another major aspect of data quality. With data mesh, domain experts are encouraged to take ownership of their data, which can help create a data-driven culture within the organization, leading to improved data quality and better business outcomes. Finally, the paper explores the contribution of AI in the coming years. AI can help enhance data quality by automating many data-related tasks, such as data cleaning and data validation. By integrating AI into data mesh, organizations can further enhance the quality of their data. The concepts mentioned above are illustrated by experience feedback from AEKIDEN, an international data-driven consultancy that has successfully implemented a data mesh approach. By sharing its experience, AEKIDEN can help other organizations understand the benefits and challenges of implementing data mesh and improving data quality.
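
As an illustration of decentralized, domain-owned metadata of the kind discussed above, the sketch below models a "data product" record together with its owner and declared quality expectations; the field names and checks are assumptions for illustration, not a standard data mesh schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class DataProduct:
    """A domain-owned data product: the domain team, not a central data team,
    declares ownership, metadata, and the quality expectations it commits to."""
    name: str
    domain: str
    owner_email: str
    schema: Dict[str, str]                                 # column -> type
    quality_expectations: Dict[str, float] = field(default_factory=dict)
    tags: List[str] = field(default_factory=list)

    def check_quality(self, measured: Dict[str, float]) -> Dict[str, bool]:
        """Compare measured quality metrics (e.g. completeness) to the declared thresholds."""
        return {metric: measured.get(metric, 0.0) >= threshold
                for metric, threshold in self.quality_expectations.items()}

claims = DataProduct(
    name="insurance_claims",
    domain="claims",
    owner_email="claims-data@example.org",
    schema={"claim_id": "string", "amount": "float", "filed_at": "timestamp"},
    quality_expectations={"completeness": 0.99},
)
print(claims.check_quality({"completeness": 0.995}))
```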

Keywords: data culture, data-driven organization, data mesh, data quality for business success

Procedia PDF Downloads 135
25016 Big Data Analysis with RHadoop

Authors: Ji Eun Shin, Byung Ho Jung, Dong Hoon Lim

Abstract:

It is almost impossible to store or analyze big data, which is increasing exponentially, with traditional technologies. Hadoop is a technology that makes this possible, and the R programming language is by far the most popular statistical tool for big data analysis based on distributed processing with Hadoop. With RHadoop, which integrates the R and Hadoop environments, we implemented parallel multiple regression analysis with different sizes of actual data. Experimental results showed that our RHadoop system became much faster as the number of data nodes increased. We also compared the performance of RHadoop with the lm function and the biglm package available with bigmemory. The results showed that RHadoop was faster than the other packages owing to parallel processing, with the number of map tasks increasing as the size of the data grows.

Keywords: big data, Hadoop, parallel regression analysis, R, RHadoop

Procedia PDF Downloads 437
25015 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. The method generates a mutually exclusive task by mapping one feature of the data to multiple labels, so that the generated task is inconsistent with the data distribution of the initial dataset. Because generating mutually exclusive tasks for all the data would produce a large amount of invalid data and, in the worst case, lead to exponential growth in computation, this paper also proposes a key-data extraction method that extracts only part of the data to generate the mutually exclusive tasks. The experiments show that the method of generating mutually exclusive tasks effectively alleviates memorization overfitting in the meta-learning MAML algorithm.
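
A common way to make tasks mutually exclusive in the meta-learning literature is to assign labels task by task (e.g. through a random label permutation), so that an input's label cannot be inferred without the task's support set; the sketch below illustrates that general idea with NumPy and is not the authors' augmentation-based construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_mutex_task(X: np.ndarray, y: np.ndarray, n_classes: int):
    """Return a task whose labels are remapped with a task-specific random permutation,
    so that the same input maps to different labels in different tasks."""
    perm = rng.permutation(n_classes)   # task-specific label assignment
    y_task = perm[y]                    # remap every label through the permutation
    return X, y_task

# Toy data: 12 samples, 4 features, 3 classes.
X = rng.normal(size=(12, 4))
y = rng.integers(0, 3, size=12)

task_a = make_mutex_task(X, y, n_classes=3)
task_b = make_mutex_task(X, y, n_classes=3)
# The same inputs now carry different labels across tasks, which prevents a meta-learner
# from memorizing a fixed input-to-label mapping.
print(task_a[1][:5], task_b[1][:5])
```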

Keywords: data augmentation, mutex task generation, meta-learning, text classification

Procedia PDF Downloads 94
25014 Efficient Positioning of Data Aggregation Point for Wireless Sensor Network

Authors: Sifat Rahman Ahona, Rifat Tasnim, Naima Hassan

Abstract:

Data aggregation is a helpful technique for reducing the data communication overhead in a wireless sensor network. One of the important tasks in data aggregation is the positioning of the aggregator points. A lot of work has been done on data aggregation, but the efficient positioning of the aggregator points has not received as much attention. In this paper, the authors focus on the positioning, or placement, of the aggregation points in a wireless sensor network and propose an algorithm to select the aggregator positions for a scenario where aggregator nodes are more powerful than sensor nodes.
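
The abstract does not spell out the proposed placement algorithm; as a point of comparison, the sketch below shows one common baseline, choosing the candidate aggregator closest to the centroid of the sensors it serves (a k-means-style placement), which is an illustrative substitute rather than the authors' method.

```python
import numpy as np

rng = np.random.default_rng(1)
sensors = rng.uniform(0, 100, size=(50, 2))    # 50 ordinary sensor nodes (x, y)
candidates = rng.uniform(0, 100, size=(5, 2))  # 5 powerful nodes eligible as aggregators

def pick_aggregator(sensors: np.ndarray, candidates: np.ndarray) -> int:
    """Pick the candidate node that minimizes the total distance to all sensors,
    i.e. the candidate nearest to where a centroid-based placement would sit."""
    total_dist = np.linalg.norm(
        sensors[None, :, :] - candidates[:, None, :], axis=2
    ).sum(axis=1)
    return int(np.argmin(total_dist))

best = pick_aggregator(sensors, candidates)
print("chosen aggregator:", best, "at", candidates[best])
```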

Keywords: aggregation point, data communication, data aggregation, wireless sensor network

Procedia PDF Downloads 158
25013 Spatial Econometric Approaches for Count Data: An Overview and New Directions

Authors: Paula Simões, Isabel Natário

Abstract:

This paper reviews a number of theoretical aspects of implementing an explicit spatial perspective in econometrics for modelling non-continuous data in general, and count data in particular. It provides an overview of the several spatial econometric approaches available for modelling data collected with reference to location in space, from classical spatial econometrics to recent developments in spatial econometrics for count data in a Bayesian hierarchical setting. Considerable attention is paid to the inferential framework necessary for structurally consistent spatial econometric count models incorporating spatial lag autocorrelation, to the corresponding estimation and testing procedures under different assumptions, and to the constraints and implications embedded in the various specifications in the literature. This review combines insights from the classical spatial econometrics literature as well as from hierarchical modelling and analysis of spatial data, in order to look for possible new directions in the processing of count data in a spatial hierarchical Bayesian econometric context.
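
As a small illustration of the kind of model the review discusses, the sketch below fits a Poisson regression that includes a spatially lagged covariate built from a row-standardized weight matrix; this is a simplified frequentist stand-in for exposition, not one of the Bayesian hierarchical specifications the paper surveys.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 100
coords = rng.uniform(0, 10, size=(n, 2))

# Row-standardized spatial weights: neighbours are units within distance 2.
dist = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
W = ((dist > 0) & (dist < 2)).astype(float)
row_sums = W.sum(axis=1, keepdims=True)
W = W / np.where(row_sums == 0, 1, row_sums)

x = rng.normal(size=n)
counts = rng.poisson(np.exp(0.3 + 0.5 * x))   # synthetic count outcome

# Include the spatial lag of the covariate (W x) as an extra regressor.
X = sm.add_constant(np.column_stack([x, W @ x]))
model = sm.GLM(counts, X, family=sm.families.Poisson()).fit()
print(model.summary())
```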

Keywords: spatial data analysis, spatial econometrics, Bayesian hierarchical models, count data

Procedia PDF Downloads 594
25012 A NoSQL Based Approach for Real-Time Managing of Robotics's Data

Authors: Gueidi Afef, Gharsellaoui Hamza, Ben Ahmed Samir

Abstract:

This paper deals with the continual growth of data, which has led to the emergence of new data management solutions: NoSQL databases. These databases have spread across several areas, such as personalization, profile management, real-time big data, content management, catalogs, customer views, mobile applications, the Internet of Things, digital communication, and fraud detection. The number of these database management systems keeps increasing. They store data very well, and with the trend of big data, new storage demands require new structures and methods for managing enterprise data. The new intelligent machines in the e-learning sector thrive on more data, so smart machines can learn more and faster. Robotics is the use case on which we focus our tests. We implement NoSQL for robotics to wrestle all the data robots acquire into usable form, because with ordinary approaches to robotics data we face severe limits in managing and finding the exact information in real time. Our proposed approach was demonstrated through experimental studies and a running example used as a use case.
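
As a minimal illustration of storing and querying robot sensor readings in a document store of the kind discussed above, the sketch below uses MongoDB via pymongo; the connection string, database name, and document layout are assumptions for illustration only.

```python
from datetime import datetime, timezone
from pymongo import MongoClient, DESCENDING

# Assumed local MongoDB instance; adjust the URI for a real deployment.
client = MongoClient("mongodb://localhost:27017")
readings = client["robotics"]["sensor_readings"]

# Schema-less insert: each robot can report whatever sensors it carries.
readings.insert_one({
    "robot_id": "arm-01",
    "timestamp": datetime.now(timezone.utc),
    "sensors": {"joint_temp_c": 41.2, "torque_nm": 3.7, "battery_pct": 88},
})

# Real-time style query: the latest reading for one robot.
latest = readings.find_one({"robot_id": "arm-01"}, sort=[("timestamp", DESCENDING)])
print(latest)
```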

Keywords: NoSQL databases, database management systems, robotics, big data

Procedia PDF Downloads 355
25011 Fuzzy Optimization Multi-Objective Clustering Ensemble Model for Multi-Source Data Analysis

Authors: C. B. Le, V. N. Pham

Abstract:

In modern data analysis, multi-source data appears more and more often in real applications, and multi-source data clustering has emerged as an important issue in the data mining and machine learning community. Different data sources provide different information about the data, so linking multi-source data is essential to improve clustering performance. However, in practice multi-source data is often heterogeneous, uncertain, and large, which is considered a major challenge of multi-source data. Ensemble learning is a versatile machine learning model in which learning techniques can work in parallel on big data, and clustering ensembles have been shown to outperform any standard clustering algorithm in terms of accuracy and robustness. However, most traditional clustering ensemble approaches are based on a single-objective function and single-source data. This paper proposes a new clustering ensemble method for multi-source data analysis, the fuzzy optimized multi-objective clustering ensemble method (FOMOCE). First, a clustering ensemble mathematical model is introduced, based on the structure of the multi-objective clustering function, multi-source data, and dark knowledge. Then, rules for extracting dark knowledge from the input data, the clustering algorithms, and the base clusterings are designed and applied. Finally, a clustering ensemble algorithm is proposed for multi-source data analysis. Experiments were performed on standard sample data sets, and the results demonstrate the superior performance of the FOMOCE method compared to existing clustering ensemble methods and multi-source clustering methods.
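
For readers unfamiliar with clustering ensembles, the sketch below shows the basic co-association construction (combining several base clusterings into a consensus partition); it is a generic illustration with scikit-learn and SciPy, not the fuzzy multi-objective FOMOCE method proposed in the paper.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

X, _ = make_blobs(n_samples=150, centers=3, random_state=0)

# Base clusterings: k-means runs with different seeds and numbers of clusters.
base_labels = [KMeans(n_clusters=k, n_init=10, random_state=s).fit_predict(X)
               for s, k in [(0, 3), (1, 4), (2, 5)]]

# Co-association matrix: how often each pair of points lands in the same cluster.
n = X.shape[0]
co = np.zeros((n, n))
for labels in base_labels:
    co += (labels[:, None] == labels[None, :]).astype(float)
co /= len(base_labels)

# Consensus partition: hierarchical clustering of the co-association "distance".
dist = 1.0 - co
np.fill_diagonal(dist, 0.0)
consensus = fcluster(linkage(squareform(dist), method="average"), t=3, criterion="maxclust")
print(np.bincount(consensus))
```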

Keywords: clustering ensemble, multi-source, multi-objective, fuzzy clustering

Procedia PDF Downloads 189
25010 Application of Deep Neural Networks to Assess Corporate Credit Rating

Authors: Parisa Golbayani, Dan Wang, Ionuț Florescu

Abstract:

In this work, we apply machine learning techniques to financial statement reports in order to assess a company's credit rating. Specifically, the work analyzes the performance of four neural network architectures (MLP, CNN, CNN2D, LSTM) in predicting corporate credit ratings as issued by Standard and Poor's. The paper focuses on companies from the energy, financial, and healthcare sectors in the US. The goal of this analysis is to improve the application of machine learning algorithms to credit assessment. To accomplish this, the study investigates three questions. First, we investigate whether the algorithms perform better when using a selected subset of important features or whether better performance is obtained by allowing the algorithms to select features themselves. Second, we address the temporal aspect inherent in financial data and study whether it is important for the results obtained by a machine learning algorithm. Third, we aim to determine whether one of the four neural network architectures consistently outperforms the others and, if so, under which conditions. The work frames the problem as several case studies to answer these questions and analyzes the results using ANOVA and multiple comparison testing procedures.
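
A minimal sketch of one of the four architectures mentioned (an MLP over financial-statement features) is given below with Keras; the input width, number of rating classes, layer sizes, and placeholder data are illustrative assumptions rather than the configuration studied in the paper.

```python
import numpy as np
import tensorflow as tf

n_features, n_ratings = 20, 8          # e.g. financial ratios -> S&P-style rating buckets
rng = np.random.default_rng(0)
X = rng.normal(size=(500, n_features)).astype("float32")  # placeholder statement data
y = rng.integers(0, n_ratings, size=500)                   # placeholder rating labels

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_features,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(n_ratings, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(X, y, epochs=5, batch_size=32, validation_split=0.2, verbose=0)
print(model.evaluate(X, y, verbose=0))
```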

Keywords: convolutional neural network, long short term memory, multilayer perceptron, credit rating

Procedia PDF Downloads 235
25009 Modeling Activity Pattern Using XGBoost for Mining Smart Card Data

Authors: Eui-Jin Kim, Hasik Lee, Su-Jin Park, Dong-Kyu Kim

Abstract:

Smart-card data are expected to provide information on activity patterns as an alternative to conventional person-trip surveys. The focus of this study is to propose a method that uses person-trip surveys to supplement smart-card data, which do not contain the purpose of each trip. We selected only the features available from smart-card data, such as spatio-temporal information on the trip and geographic information system (GIS) data near the stations, to train on the survey data. XGBoost, a state-of-the-art tree-based ensemble classifier, was used to train on data from multiple sources. This classifier uses a more regularized model formalization to control over-fitting and shows very fast execution with good performance. The validation results showed that the proposed method efficiently estimated the trip purpose. GIS data around the station and the duration of stay at the destination were significant features in modeling trip purpose.
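
A minimal sketch of training an XGBoost trip-purpose classifier on survey records and applying it to smart-card trips is shown below; the feature names, label set, and placeholder data are illustrative assumptions, not the study's variables.

```python
import numpy as np
import pandas as pd
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
# Placeholder survey data: features observable in smart-card data plus the surveyed purpose.
survey = pd.DataFrame({
    "boarding_hour":   rng.integers(5, 24, 1000),
    "stay_minutes":    rng.integers(10, 600, 1000),  # duration of stay at destination
    "land_use_office": rng.random(1000),             # GIS share of office land use near station
    "land_use_retail": rng.random(1000),
    "purpose":         rng.integers(0, 4, 1000),     # 0=home, 1=work, 2=school, 3=other
})

X = survey.drop(columns=["purpose"])
y = survey["purpose"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1,
                    objective="multi:softprob", eval_metric="mlogloss")
clf.fit(X_tr, y_tr)
print("validation accuracy:", accuracy_score(y_te, clf.predict(X_te)))
# The fitted classifier can then assign a purpose to each smart-card trip record
# that shares the same feature columns.
```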

Keywords: activity pattern, data fusion, smart-card, XGboost

Procedia PDF Downloads 246
25008 Hepatitis B, Hepatitis C and HIV Infections and Associated Risk Factors among Substance Abusers in Mekelle Substance Users Treatment and Rehabilitation Centers, Tigrai, Northern Ethiopia

Authors: Tadele Araya, Tsehaye Asmelash, Girmatsion Fiseha

Abstract:

Background: Hepatitis B virus (HBV), hepatitis C virus (HCV), and human immunodeficiency virus (HIV) constitute serious healthcare problems worldwide. These blood-borne pathogens are commonly associated with infections among substance users and injection drug users (IDUs). The objective of this study was to determine the prevalence of HBV, HCV, and HIV infections among substance users in Mekelle substance users treatment and rehabilitation centers. Methods: A cross-sectional study design was used from December 2020 to September 2021. A total of 600 substance users were included. Data on the socio-demographic characteristics, clinical history, and sexual behaviors of the substance users were collected using a structured questionnaire. For laboratory analysis, 5-10 ml of venous blood was taken from each substance user and analyzed by enzyme-linked immunosorbent assay (ELISA) at the Mekelle University Department of Medical Microbiology and Immunology research laboratory. The data were analyzed using SPSS and Epi-data. The association of variables with HBV, HCV, and HIV infections was determined using multivariate analysis, and a p-value < 0.05 was considered statistically significant. Results: The overall prevalence rates of HBV, HCV, and HIV infections were 10%, 6.6%, and 7.5%, respectively. The mean age of the study participants was 28.12 ± 6.9 years. A higher prevalence of HBV infection was seen in participants who injected drugs and in those infected with HIV. HCV was comparatively higher in those with a previous history of unsafe surgical procedures than in their counterparts. Homeless participants were more exposed to HCV and HIV infections than their counterparts. The prevalence of HBV/HIV co-infection was 3.5%. Unprotected sexual practices [p=0.03], injection drug use [p=0.03], having an HBV-infected person in the family [p=0.02], and HIV infection [p=0.025] were statistically associated with HBV infection. HCV was significantly associated with substance use and a previous history of unsafe surgical procedures [p=0.03 and p=0.04, respectively]. HIV was significantly associated with unprotected sexual practices and homelessness [p=0.045 and p=0.05, respectively]. Conclusion: HBV was the most prevalent of the viral infections, and the prevalence of HBV/HIV co-infection was high. The presence of an HBV-infected person in the family, unprotected sexual practices, and the sharing of needles for drug injection were the risk factors associated with HBV, HIV, and HCV. Continuous health education and screening for these viral infections, coupled with medical and psychological treatment, are needed for the prevention and control of the infections.

Keywords: hepatitis b virus, hepatitis c virus, HIV, substance users

Procedia PDF Downloads 85
25007 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutually exclusive task by mapping one feature of the data to multiple labels, so that the generated task is inconsistent with the data distribution of the initial dataset. Because generating mutually exclusive tasks for all the data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key-data extraction method that extracts only part of the data to generate the mutually exclusive tasks. The experiments show that the method of generating mutually exclusive tasks effectively alleviates memorization overfitting in the MAML algorithm.

Keywords: mutex task generation, data augmentation, meta-learning, text classification

Procedia PDF Downloads 143
25006 Clique and Clan Analysis of Patient-Sharing Physician Collaborations

Authors: Shahadat Uddin, Md Ekramul Hossain, Arif Khan

Abstract:

The collaboration among physicians during episodes of care for a hospitalised patient makes a significant contribution to effective health outcomes. This research aims at improving these outcomes by analysing the attributes of the patient-sharing physician collaboration network (PCN) using hospital data. To accomplish this goal, we present a research framework that explores the impact of several types of attributes (such as clique and clan) of the PCN on hospitalisation cost and hospital length of stay. We use an electronic health insurance claim dataset to construct and explore PCNs. Each PCN is categorised as 'low' or 'high' in terms of hospitalisation cost and length of stay. The results from the proposed model show that the clique and clan attributes of PCNs affect hospitalisation cost and length of stay, and that they differentiate 'low' from 'high' PCNs on both measures. The findings and insights from this research can potentially help healthcare stakeholders to better formulate policy in order to improve quality of care while reducing cost.
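
Building a patient-sharing physician network and extracting its cliques can be sketched as below with NetworkX; the claim records and the patient-sharing rule are illustrative assumptions, and clan detection (which NetworkX does not provide directly) is omitted.

```python
from itertools import combinations
from collections import Counter
import networkx as nx

# Assumed claim records: (patient_id, physician_id) pairs extracted from insurance claims.
claims = [
    ("p1", "drA"), ("p1", "drB"), ("p1", "drC"),
    ("p2", "drA"), ("p2", "drB"),
    ("p3", "drB"), ("p3", "drC"), ("p3", "drD"),
]

# Two physicians are connected if they share at least one patient.
shared = Counter()
patients = {}
for patient, doctor in claims:
    patients.setdefault(patient, set()).add(doctor)
for doctors in patients.values():
    for a, b in combinations(sorted(doctors), 2):
        shared[(a, b)] += 1

G = nx.Graph()
G.add_weighted_edges_from((a, b, w) for (a, b), w in shared.items())

# Cliques: fully connected groups of physicians who all share patients with one another.
cliques = [c for c in nx.find_cliques(G) if len(c) >= 3]
print(cliques)
```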

Keywords: clique, clan, electronic health records, physician collaboration

Procedia PDF Downloads 140