Search results for: microarray data mining
24737 Assessing the Impacts of Riparian Land Use on Gully Development and Sediment Load: A Case Study of Nzhelele River Valley, Limpopo Province, South Africa
Authors: B. Mavhuru, N. S. Nethengwe
Abstract:
Human-induced land degradation has triggered several environmental problems, especially in underdeveloped rural areas. The main aim of this study is to analyze the contribution of different land uses to gully development and sediment load in the Nzhelele River Valley in the Limpopo Province. Data were collected using different methods such as observation, field data techniques and experiments. Satellite digital images, topographic maps, aerial photographs and the sediment load static model also assisted in determining how land use affects gully development and sediment load. For data analysis, the researcher used the following methods: Analysis of Variance (ANOVA), descriptive statistics, Pearson correlation coefficient and statistical correlation methods. The results illustrate that intensive land use activities create negative changes, especially in areas that are highly fragile and vulnerable. A distinct impact on land use change was observed within the settlement area (9.6 %) over a period of 5 years. A high correlation between soil organic matter and soil moisture (R=0.96) was observed. Furthermore, a significant variation (p ≤ 0.6) between the soil organic matter and soil moisture was also observed. A very significant variation (p ≤ 0.003) was observed in bulk density, and extremely significant variations (p ≤ 0.0001) were observed in organic matter and soil particle size. Sand mining and agricultural activities have contributed significantly to the amount of sediment load in the Nzhelele River. A highly significant amount of total suspended sediment (55.3 %) and bed load (53.8 %) was observed within the agricultural area. The connection between gully development and the various land use activities determines the amount of sediment load. 
These results are consistent with previous research and suggest that land use activities are likely to exacerbate the development of gullies and sediment load in the Nzhelele River Valley.
Keywords: drainage basin, geomorphological processes, gully development, land degradation, riparian land use, sediment load
Procedia PDF Downloads 305
24736 Generative AI: A Comparison of Conditional Tabular Generative Adversarial Networks and Conditional Tabular Generative Adversarial Networks with Gaussian Copula in Generating Synthetic Data with Synthetic Data Vault
Authors: Lakshmi Prayaga, Chandra Prayaga, Aaron Wade, Gopi Shankar Mallu, Harsha Satya Pola
Abstract:
Synthetic data generated by Generative Adversarial Networks and Autoencoders is becoming more common as a way to combat the problem of insufficient data for research purposes. However, generating synthetic data is a tedious task requiring an extensive mathematical and programming background. Open-source platforms such as the Synthetic Data Vault (SDV) and Mostly AI offer user-friendly tools that are accessible to non-technical professionals, who can use them to generate synthetic data to augment existing data for further analysis. The SDV also provides additions to the generic GAN, such as the Gaussian copula. We present the results from two synthetic data sets (CTGAN data and CTGAN with Gaussian Copula) generated by the SDV and report the findings. The results indicate that the ROC curves and AUC values for the data generated by adding the Gaussian copula layer are much higher than those for the data generated by the CTGAN alone.
Keywords: synthetic data generation, generative adversarial networks, conditional tabular GAN, Gaussian copula
Procedia PDF Downloads 79
24735 A 0-1 Goal Programming Approach to Optimize the Layout of Hospital Units: A Case Study in an Emergency Department in Seoul
Authors: Farhood Rismanchian, Seong Hyeon Park, Young Hoon Lee
Abstract:
This paper proposes a method to optimize the layout of an emergency department (ED) based on real executions of care processes by considering several planning objectives simultaneously. Recently, demand for healthcare services has increased dramatically. As the demand for healthcare services increases, so does the need for new healthcare buildings as well as the need for redesigning and renovating existing ones. The importance of implementing a standard set of engineering facilities planning and design techniques has already been proved in both the manufacturing and service industries, with many significant functional efficiencies. However, the high complexity of care processes remains a major challenge to applying these methods in healthcare environments. Process mining techniques are applied in this study to tackle the problem of complexity and to enhance care process analysis. Process-related information such as clinical pathways is extracted from the information system of an ED. A 0-1 goal programming approach is then proposed to find a single layout that simultaneously satisfies several goals. The proposed model is solved by the optimization software CPLEX 12. The solution reached using the proposed method yields a 42.2% improvement in the walking distance of normal patients and a 47.6% improvement in the walking distance of critical patients at minimum cost of relocation. It has been observed that many patients must walk unnecessarily long distances during their visit to the emergency department because of an inefficient design. A carefully designed layout can significantly decrease patient walking distance and related complications.
Keywords: healthcare operation management, goal programming, facility layout problem, process mining, clinical processes
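The idea of a 0-1 goal programming layout model can be illustrated with a toy sketch. Everything below is hypothetical and is not the authors' model or data: the unit names, distances, visit counts, relocation costs, goal targets and weights are made up for illustration, and the binary assignment variables are enumerated by brute force instead of being passed to a solver such as CPLEX. Each layout is scored by the weighted sum of how far it overshoots each goal.

```python
from itertools import permutations

# Hypothetical toy instance: three units to place in three candidate locations.
UNITS = ["triage", "xray", "resus"]
DIST = [10, 25, 40]  # walking distance (m) from the entrance to each location
# Visits per patient class to each unit (assumed values).
VISITS = {
    "normal": {"triage": 1, "xray": 1, "resus": 0},
    "critical": {"triage": 0, "xray": 1, "resus": 1},
}
CURRENT = {"triage": 1, "xray": 2, "resus": 0}    # present location of each unit
MOVE_COST = {"triage": 3, "xray": 8, "resus": 5}  # cost of relocating a unit
GOALS = {"normal": 40, "critical": 40, "cost": 8}        # target levels
WEIGHTS = {"normal": 1.0, "critical": 2.0, "cost": 0.5}  # goal priorities


def evaluate(layout):
    """Return (weighted overshoot, achieved levels) for one unit -> location map."""
    achieved = {
        cls: sum(visits[u] * DIST[layout[u]] for u in UNITS)
        for cls, visits in VISITS.items()
    }
    achieved["cost"] = sum(MOVE_COST[u] for u in UNITS if layout[u] != CURRENT[u])
    # Only overshooting a target is penalised (positive deviation variables).
    deviation = {g: max(0, achieved[g] - GOALS[g]) for g in GOALS}
    return sum(WEIGHTS[g] * deviation[g] for g in GOALS), achieved


def solve():
    """Enumerate every one-unit-per-location assignment and keep the best."""
    best = None
    for perm in permutations(range(len(DIST))):
        layout = dict(zip(UNITS, perm))
        score, achieved = evaluate(layout)
        if best is None or score < best[0]:
            best = (score, layout, achieved)
    return best


if __name__ == "__main__":
    score, layout, achieved = solve()
    print(score, layout, achieved)
```

Enumeration is only practical for a handful of units; a realistic department would need an integer programming solver, as in the study.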
Procedia PDF Downloads 292
24734 Performance Analysis of Search Medical Imaging Service on Cloud Storage Using Decision Trees
Authors: González A. Julio, Ramírez L. Leonardo, Puerta A. Gabriel
Abstract:
Telemedicine services use a large amount of data, most of which are diagnostic images in Digital Imaging and Communications in Medicine (DICOM) and Health Level Seven (HL7) formats. Metadata is generated from each image to support its identification. This study presents the use of decision trees to optimize information search processes for diagnostic images hosted on a cloud server. To analyze server performance, the following quality of service (QoS) metrics are evaluated: delay, bandwidth, jitter, latency and throughput, in five test scenarios with a total of 26 experiments during the loading and downloading of DICOM images hosted by the telemedicine group server of the Universidad Militar Nueva Granada, Bogotá, Colombia. By applying decision trees as a data mining technique and comparing them with sequential search, it was possible to evaluate the search times of diagnostic images on the server. The results show that by using the metadata in decision trees, search times are substantially improved, computational resources are optimized and the request management of the telemedicine image service is improved. Based on the experiments carried out, search efficiency increased by 45% relative to sequential search, given that false positives are avoided in the management and acquisition of this information when a diagnostic image is downloaded. It is concluded that, for diagnostic image services in telemedicine, the decision tree technique guarantees accessibility and robustness in the acquisition and manipulation of medical images, improving diagnoses and medical procedures for patients.
Keywords: cloud storage, decision trees, diagnostic image, search, telemedicine
Procedia PDF Downloads 201
24733 Testing and Validation Stochastic Models in Epidemiology
Authors: Snigdha Sahai, Devaki Chikkavenkatappa Yellappa
Abstract:
This study outlines approaches for testing and validating stochastic models used in epidemiology, focusing on the integration and functional testing of simulation code. It details methods for combining simple functions into comprehensive simulations, distinguishing between deterministic and stochastic components, and applying tests to ensure robustness. Techniques include isolating stochastic elements, utilizing large sample sizes for validation, and handling special cases. Practical examples are provided using R code to demonstrate integration testing, handling of incorrect inputs, and special cases. The study emphasizes the importance of both functional and defensive programming to enhance code reliability and user-friendliness.
Keywords: computational epidemiology, epidemiology, public health, infectious disease modeling, statistical analysis, health data analysis, disease transmission dynamics, predictive modeling in health, population health modeling, quantitative public health, random sampling simulations, randomized numerical analysis, simulation-based analysis, variance-based simulations, algorithmic disease simulation, computational public health strategies, epidemiological surveillance, disease pattern analysis, epidemic risk assessment, population-based health strategies, preventive healthcare models, infection dynamics in populations, contagion spread prediction models, survival analysis techniques, epidemiological data mining, host-pathogen interaction models, risk assessment algorithms for disease spread, decision-support systems in epidemiology, macro-level health impact simulations, socioeconomic determinants in disease spread, data-driven decision making in public health, quantitative impact assessment of health policies, biostatistical methods in population health, probability-driven health outcome predictions
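The testing techniques listed in this abstract (isolating the stochastic element, checking special cases, rejecting incorrect inputs, and validating against a large sample) can be sketched as follows. The study itself uses R; this Python version is only an illustration, and the chain-binomial infection step, the function names and all parameter values are assumptions, not taken from the paper.

```python
import random


def new_infections(susceptible, infected, beta, rng):
    """One chain-binomial step; the random source `rng` is passed in explicitly
    so that tests can control it (isolating the stochastic element)."""
    # Defensive programming: reject inputs that make no sense.
    if susceptible < 0 or infected < 0:
        raise ValueError("counts must be non-negative")
    if not 0.0 <= beta <= 1.0:
        raise ValueError("beta must lie in [0, 1]")
    # Deterministic part: per-susceptible infection probability.
    p = 1.0 - (1.0 - beta) ** infected
    # Stochastic part: one Bernoulli draw per susceptible individual.
    return sum(1 for _ in range(susceptible) if rng.random() < p)


def mean_new_infections(susceptible, infected, beta, runs, seed):
    """Average over many runs, used for large-sample validation."""
    rng = random.Random(seed)
    total = sum(new_infections(susceptible, infected, beta, rng) for _ in range(runs))
    return total / runs


if __name__ == "__main__":
    # Special cases: nobody infectious, and certain transmission.
    assert new_infections(50, 0, 0.3, random.Random(1)) == 0
    assert new_infections(50, 3, 1.0, random.Random(1)) == 50
    # Large sample: the mean should be close to susceptible * p.
    expected = 100 * (1 - 0.99 ** 5)
    print(mean_new_infections(100, 5, 0.01, 2000, seed=7), expected)
```

Passing the generator as an argument keeps the deterministic formula testable on its own and makes seeded runs reproducible.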
Procedia PDF Downloads 3
24732 A Privacy Protection Scheme Supporting Fuzzy Search for NDN Routing Cache Data Name
Authors: Feng Tao, Ma Jing, Guo Xian, Wang Jing
Abstract:
Named Data Networking (NDN) replaces the IP addresses of traditional networks with data names and adopts a dynamic cache mechanism. In the existing mechanism, however, only one-to-one search can be achieved because every data item has a unique name corresponding to it. There is a certain mapping relationship between data content and data name, so if the data name is intercepted by an adversary, the privacy of the data content and of the user’s interest can hardly be guaranteed. To solve this problem, this paper proposes a one-to-many fuzzy search scheme based on order-preserving encryption that reduces the query overhead by optimizing the caching strategy. In this scheme, we use hash values to keep the user’s query safe from each node during the search process, which also protects the privacy of the requested data content.
Keywords: NDN, order-preserving encryption, fuzzy search, privacy
Procedia PDF Downloads 483
24731 Generation of Knowledge with Self-Learning Methods for Ophthalmic Data
Authors: Klaus Peter Scherer, Daniel Knöll, Constantin Rieder
Abstract:
Problem and Purpose: Intelligent systems can support human decision making, especially when complex surgical eye interventions must be performed. Normally, such a decision support system contains a knowledge-based module, which provides the actual assistance through explanation and logical reasoning processes. Interview-based acquisition and generation of the complex knowledge is a critical step, because there are different correlations between the complex parameters. In this project, (semi-)automated self-learning methods are therefore researched and developed to enhance the quality of such a decision support system. Methods: For ophthalmic data sets of real patients in a hospital, advanced data mining procedures appear to be very helpful. In particular, subgroup analysis methods are developed, extended and used to find the correlations and conditional dependencies in the structured patient data. After finding causal dependencies, a ranking must be performed to generate rule-based representations. For this, anonymized patient data are transformed into a special machine-readable format. The imported data are used as input for conditional probability algorithms that calculate the parameter distributions with respect to a given goal parameter. Results: Advanced knowledge discovery methods and applications were used to produce operation-related and patient-related correlations. New knowledge was generated by finding causal relations between the operational equipment, the medical instances and the patient-specific history through a dependency ranking process. After transformation into association rules, logic-based representations were available for the clinical experts to evaluate the new knowledge. The structured data sets comprise about 80 parameters as characteristic features per patient. 
For different extended patient groups (100, 300, 500), both a single target value and multi-target values were set for the subgroup analysis. The newly generated hypotheses could thus be interpreted with regard to their dependence on, or independence of, the number of patients. Conclusions: The aim and advantage of such a semi-automated self-learning process is the extension of the knowledge base by finding new parameter correlations. The discovered knowledge is transformed into association rules and serves as a rule-based representation of the knowledge in the knowledge base. Moreover, more than one goal parameter of interest can be considered by the semi-automated learning process. With ranking procedures, the strongest premises and conjunctively associated conditions can be found to infer the goal parameter of interest. In this way, the knowledge hidden in structured tables or lists can be extracted as a rule-based representation. This provides real assistance in communicating with the clinical experts.
Keywords: expert system, knowledge-based support, ophthalmic decision support, self-learning methods
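A minimal sketch of ranking subgroups by conditional probability with respect to a goal parameter is given below. It is an assumption-laden illustration, not the authors' procedure: the attribute names and records are invented, only single-attribute subgroups are considered, and the ranking measure (confidence divided by the base rate, often called lift) is one common choice among many. The measure captures association only and says nothing about causation.

```python
from collections import defaultdict


def rank_subgroups(records, target, target_value, min_size=2):
    """Rank attribute=value subgroups by P(target = target_value | subgroup)
    relative to the base rate over all records."""
    base_rate = sum(r[target] == target_value for r in records) / len(records)
    counts = defaultdict(lambda: [0, 0])  # (attribute, value) -> [size, hits]
    for record in records:
        for attribute, value in record.items():
            if attribute == target:
                continue
            entry = counts[(attribute, value)]
            entry[0] += 1
            entry[1] += record[target] == target_value
    rules = []
    for (attribute, value), (size, hits) in counts.items():
        if size < min_size:
            continue  # skip subgroups too small to be meaningful
        confidence = hits / size
        rules.append({
            "if": (attribute, value),
            "then": (target, target_value),
            "size": size,
            "confidence": confidence,
            "lift": confidence / base_rate if base_rate else 0.0,
        })
    # Strongest premises first; larger subgroups break ties.
    rules.sort(key=lambda rule: (-rule["lift"], -rule["size"]))
    return rules


if __name__ == "__main__":
    # Invented records purely for illustration.
    records = [
        {"lens": "A", "diabetes": "yes", "complication": "yes"},
        {"lens": "A", "diabetes": "no", "complication": "no"},
        {"lens": "B", "diabetes": "yes", "complication": "yes"},
        {"lens": "B", "diabetes": "no", "complication": "no"},
        {"lens": "A", "diabetes": "yes", "complication": "yes"},
        {"lens": "B", "diabetes": "no", "complication": "no"},
    ]
    for rule in rank_subgroups(records, "complication", "yes"):
        print(rule)
```

Each returned entry reads as a candidate association rule of the form "if attribute = value then goal = value", which an expert can then accept or reject.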
Procedia PDF Downloads 252
24730 Healthcare Big Data Analytics Using Hadoop
Authors: Chellammal Surianarayanan
Abstract:
The healthcare industry is generating large amounts of data driven by various needs such as record keeping, physicians’ prescriptions, medical imaging, sensor data, Electronic Patient Records (EPR), laboratory, pharmacy, etc. Healthcare data is so big and complex that it cannot be managed by conventional hardware and software. The complexity of healthcare big data arises from the large volume of data, the velocity with which the data is accumulated, and its variety, spanning structured, semi-structured and unstructured data. Despite this complexity, if the trends and patterns that exist within the big data are uncovered and analyzed, higher quality healthcare can be provided at lower cost. Hadoop is an open source software framework for distributed processing of large data sets across clusters of commodity hardware using a simple programming model. The core components of Hadoop include the Hadoop Distributed File System, which offers a way to store large amounts of data across multiple machines, and MapReduce, which offers a way to process large data sets with a parallel, distributed algorithm on a cluster. The Hadoop ecosystem also includes various other tools such as Hive (a SQL-like query language), Pig (a higher level query language for MapReduce), HBase (a columnar data store), etc. This paper analyzes how healthcare big data can be processed and analyzed using the Hadoop ecosystem.
Keywords: big data analytics, Hadoop, healthcare data, towards quality healthcare
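The MapReduce programming model mentioned in this abstract can be sketched in a few lines of plain Python. This is a single-process illustration of the map, shuffle and reduce phases only; it does not use Hadoop or any Hadoop API, and the record fields and function names are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_record(record):
    """Map phase: emit one (key, 1) pair per record."""
    yield (record["type"], 1)


def shuffle(pairs):
    """Shuffle phase: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(key, values):
    """Reduce phase: combine the values collected for one key."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    return dict(reduce_counts(k, vs) for k, vs in shuffle(mapped).items())


if __name__ == "__main__":
    # Made-up records standing in for heterogeneous healthcare data.
    records = [
        {"patient": 1, "type": "imaging"},
        {"patient": 2, "type": "lab"},
        {"patient": 1, "type": "imaging"},
    ]
    print(run_job(records))
```

On a real cluster the map and reduce functions run in parallel on different machines and the framework performs the shuffle; the structure of the computation is the same.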
Procedia PDF Downloads 413
24729 Aire-Dependent Transcripts have Shortened 3’UTRs and Show Greater Stability by Evading Microrna-Mediated Repression
Authors: Clotilde Guyon, Nada Jmari, Yen-Chin Li, Jean Denoyel, Noriyuki Fujikado, Christophe Blanchet, David Root, Matthieu Giraud
Abstract:
Aire induces ectopic expression of a large repertoire of tissue-specific antigen (TSA) genes in thymic medullary epithelial cells (MECs), driving immunological self-tolerance in maturing T cells. Although important mechanisms of Aire-induced transcription have recently been disclosed through the identification and study of Aire’s partners, the fine transcriptional functions that a number of them underpin and confer on Aire are still unknown. Alternative cleavage and polyadenylation (APA) is an essential mRNA processing step regulated by the termination complex, which consists of 85 proteins, 10 of which have been related to Aire. We evaluated APA in MECs in vivo by microarray analysis with mRNA-spanning probes and RNA deep sequencing. We uncovered the preference of Aire-dependent transcripts for short-3’UTR isoforms and for proximal poly(A) site selection, marked by the increased binding of the cleavage factor Cstf-64. RNA interference of the 10 Aire-related proteins revealed that Clp1, a member of the core termination complex, exerts a profound effect on short 3’UTR isoform preference. Clp1 is also significantly upregulated in the MECs compared to 25 mouse tissues, in which we found that TSA expression is associated with longer 3’UTR isoforms. Aire-dependent transcripts escape a global 3’UTR lengthening that is associated with MEC differentiation and that potentiates the repressive effect of microRNAs, which are globally upregulated in mature MECs. Consistent with these findings, RNA deep sequencing of actinomycin D-treated MECs revealed the increased stability of short 3’UTR Aire-induced transcripts, resulting in TSA transcript accumulation and contributing to their enrichment in the MECs.
Keywords: Aire, central tolerance, miRNAs, transcription termination
Procedia PDF Downloads 381
24728 Data Disorders in Healthcare Organizations: Symptoms, Diagnoses, and Treatments
Authors: Zakieh Piri, Shahla Damanabi, Peyman Rezaii Hachesoo
Abstract:
Introduction: Healthcare organizations, like other organizations, suffer from a number of disorders such as Business Sponsor Disorder, Business Acceptance Disorder, Cultural/Political Disorder, Data Disorder, etc. As quality in healthcare mostly depends on the quality of data, we aimed to identify data disorders and their symptoms in two teaching hospitals. Methods: Using a self-constructed questionnaire, we asked 20 questions related to the quality and usability of patient data stored in patient records. The research population consisted of 150 managers, physicians, nurses and medical record staff who were working at the time of the study. We also asked for their views about the symptoms of and treatments for any data disorders they mentioned in the questionnaire. We analyzed the answers using qualitative methods. Results: After classifying the answers, we found six main data disorders: incomplete data, missed data, late data, blurred data, manipulated data and illegible data. The majority of participants believed they had important roles in the treatment of data disorders, while others attributed the disorders to health system problems. Discussion: As clinicians have important roles in producing data, they can easily identify symptoms and disorders of patient data. Health information managers can also play important roles in the early detection of data disorders through proactive monitoring and periodic check-ups of data.
Keywords: data disorders, quality, healthcare, treatment
Procedia PDF Downloads 431
24727 Big Data and Analytics in Higher Education: An Assessment of Its Status, Relevance and Future in the Republic of the Philippines
Authors: Byron Joseph A. Hallar, Annjeannette Alain D. Galang, Maria Visitacion N. Gumabay
Abstract:
One of the unique challenges posed by the twenty-first century to Philippine higher education is the utilization of Big Data. The higher education system in the Philippines is generating burgeoning amounts of data that can be used to produce the information and knowledge needed for accurate data-driven decision making. This study examines the status, relevance and future of Big Data and Analytics in Philippine higher education. The insights gained from the study may be relevant to other developing nations similarly situated to the Philippines.
Keywords: big data, data analytics, higher education, republic of the philippines, assessment
Procedia PDF Downloads 347
24726 Data Management and Analytics for Intelligent Grid
Authors: G. Julius P. Roy, Prateek Saxena, Sanjeev Singh
Abstract:
Two decades ago, power distribution utilities would collect data from their customers no more often than once a month. The advent of SmartGrid and AMI has since increased the sampling frequency, leading to a 1000- to 10000-fold increase in data quantity. This increase is notable and has led to the term Big Data being applied in utilities. The power distribution industry is one of the largest handlers of huge and complex data, both for keeping history and for turning the data into meaningful information. The majority of utilities around the globe are adopting SmartGrid technologies at scale and are primarily focusing on the strategic interdependence and synergies of the big data coming from new information sources such as AMI and intelligent SCADA. There is therefore a rising need for new models of data management and a renewed focus on analytics to dissect data into descriptive, predictive and prescriptive subsets. The goal of this paper is to bring load disaggregation into the smart energy toolkit for commercial usage.
Keywords: data management, analytics, energy data analytics, smart grid, smart utilities
Procedia PDF Downloads 778
24725 Privacy Preserving Data Publishing Based on Sensitivity in Context of Big Data Using Hive
Authors: P. Srinivasa Rao, K. Venkatesh Sharma, G. Sadhya Devi, V. Nagesh
Abstract:
Privacy Preserving Data Publication is a main concern at present because the amount of data being published through the internet is increasing day by day. Because of its size, this huge amount of data is called Big Data. This project deals with privacy preservation in the context of Big Data using a data warehousing solution called Hive. We implemented Nearest Similarity Based Clustering (NSB) with bottom-up generalization to achieve (v,l)-anonymity. (v,l)-Anonymity deals with sensitivity vulnerabilities and ensures individual privacy. We also calculate the sensitivity levels by a simple comparison method using the index values, classifying the different levels of sensitivity. The experiments were carried out in the Hive environment to verify the efficiency of the algorithms with Big Data. This framework also supports the execution of existing algorithms without any changes. The model in the paper outperforms existing models.
Keywords: sensitivity, sensitive level, clustering, Privacy Preserving Data Publication (PPDP), bottom-up generalization, Big Data
Procedia PDF Downloads 293
24724 Career Guidance System Using Machine Learning
Authors: Mane Darbinyan, Lusine Hayrapetyan, Elen Matevosyan
Abstract:
Artificial Intelligence in Education (AIED) has been created to help students get ready for the workforce, and over the past 25 years, it has grown significantly, offering a variety of technologies to support academic, institutional, and administrative services. However, this is still challenging, especially considering the labor market's rapid change. While choosing a career, people face various obstacles because they do not take into consideration their own preferences, which might lead to many other problems like shifting jobs, work stress, occupational infirmity, reduced productivity, and manual error. Besides preferences, people should properly evaluate their technical and non-technical skills, as well as their personalities. Professional counseling has become a difficult undertaking for counselors due to the wide range of career choices brought on by changing technological trends. It is necessary to close this gap by utilizing technology that makes sophisticated predictions about a person's career goals based on their personality. Hence, there is a need to create an automated model that would help in decision-making based on user inputs. Improving career guidance can be achieved by embedding machine learning into the career consulting ecosystem. There are various systems of career guidance that work based on the same logic, such as the classification of applicants, matching applications with appropriate departments or jobs, making predictions, and providing suitable recommendations. Methods such as KNN, neural networks, K-means clustering, decision trees (D-Tree), and many other advanced algorithms are applied to the collected data to help predict suitable careers. Besides helping users with their career choice, these systems provide numerous opportunities which are very useful while making this hard decision. 
They help candidates to recognize where they specifically lack sufficient skills so that they can improve those skills. They are also capable of offering an e-learning platform, taking into account the user's lack of knowledge. Furthermore, users can be provided with details on a particular job, such as the abilities required to excel in that industry.
Keywords: career guidance system, machine learning, career prediction, predictive decision, data mining, technical and non-technical skills
Procedia PDF Downloads 80
24723 Career Guidance System Using Machine Learning
Authors: Mane Darbinyan, Lusine Hayrapetyan, Elen Matevosyan
Abstract:
Artificial Intelligence in Education (AIED) has been created to help students get ready for the workforce, and over the past 25 years, it has grown significantly, offering a variety of technologies to support academic, institutional, and administrative services. However, this is still challenging, especially considering the labor market's rapid change. While choosing a career, people face various obstacles because they do not take into consideration their own preferences, which might lead to many other problems like shifting jobs, work stress, occupational infirmity, reduced productivity, and manual error. Besides preferences, people should properly evaluate their technical and non-technical skills, as well as their personalities. Professional counseling has become a difficult undertaking for counselors due to the wide range of career choices brought on by changing technological trends. It is necessary to close this gap by utilizing technology that makes sophisticated predictions about a person's career goals based on their personality. Hence, there is a need to create an automated model that would help in decision-making based on user inputs. Improving career guidance can be achieved by embedding machine learning into the career consulting ecosystem. There are various systems of career guidance that work based on the same logic, such as the classification of applicants, matching applications with appropriate departments or jobs, making predictions, and providing suitable recommendations. Methods such as KNN, neural networks, K-means clustering, decision trees (D-Tree), and many other advanced algorithms are applied to the collected data to help predict suitable careers. Besides helping users with their career choice, these systems provide numerous opportunities which are very useful while making this hard decision. 
They help candidates to recognize where they specifically lack sufficient skills so that they can improve those skills. They are also capable of offering an e-learning platform, taking into account the user's lack of knowledge. Furthermore, users can be provided with details on a particular job, such as the abilities required to excel in that industry.
Keywords: career guidance system, machine learning, career prediction, predictive decision, data mining, technical and non-technical skills
Procedia PDF Downloads 69
24722 Democracy Bytes: Interrogating the Exploitation of Data Democracy by Radical Terrorist Organizations
Authors: Nirmala Gopal, Sheetal Bhoola, Audecious Mugwagwa
Abstract:
This paper discusses the continued infringement and exploitation of data by non-state actors for destructive purposes, with an emphasis on radical terrorist organizations. It discusses how terrorist organizations access and use data to further their nefarious agendas. It further examines how cybersecurity, designed as a tool to curb data exploitation, is ineffective in addressing global citizens' concerns about how their data can be kept safe and used only for the purpose for which it was acquired. The study interrogates several policies and data protection instruments, such as the Data Protection Act, Cyber Security Policies, Protection of Personal Information (PPI) and General Data Protection Regulations (GDPR), to understand data use and storage in democratic states. The study outcomes indicate that international cybersecurity and cybercrime legislation, policies, and conventions have not curbed violations of data access and use by radical terrorist groups. The study recommends ways to enhance cybersecurity and reduce cyber risks using democratic principles.
Keywords: cybersecurity, data exploitation, terrorist organizations, data democracy
Procedia PDF Downloads 202
24721 Effects of Sulphide Mining on AISI 304 Stainless Steel
Authors: Aguasanta Miguel Sarmiento, José Miguel Dávila, María Luisa de la Torre
Abstract:
Acid mine drainage (AMD) is an acidic leachate with high levels of metals and sulphates in solution, which seriously affects the durability and strength of metallic materials used in the construction of structural and mechanical components. This paper presents the results of the evolution over time of the reduction in tensile strength and of defects in AISI 304 stainless steel in contact with acid mine drainage. For this purpose, a total of 30 bars with a diameter of 8 mm and a length of 14 cm were placed transversely in the course of a stream contaminated by AMD from the sulphide mines of the Iberian Pyritic Belt (SW Spain). This stream has an average pH of 2.6, a potential of 660 mV, and average concentrations of 12 g/L of sulphates, 1.2 g/L of Fe, 191 mg/L of Zn, etc. After every two months of exposure, 6 stainless steel bars were extracted from the acid stream. They were subjected to surface roughness analysis using a Mitutoyo Surftest SJ-210 surface roughness tester. The analysis was carried out at three different points on 5 specimens from each series. The average reading of each parameter was calculated in order to ensure the accuracy of the measurements and the surface coverage. The arithmetic mean roughness value (Ra), mean roughness depth (Rz), and root mean square roughness (Rq) were measured. Five specimens from each series were subjected to static tensile tests using a universal testing machine (Servosis ME 403, 200 kN). The specimens were clamped at their ends with two grips for cylindrical sections, and the tensile force was applied at a constant rate of 0.5 kN/s, according to the requirements of standard UNE-EN ISO 6892-1: 2020. To determine the modulus of elasticity, limits close to 15% and 55% of the maximum load were used, depending on the course of each test. Field Emission Scanning Electron Microscopy (FESEM) was used to observe corrosion products and defects generated by exposure to AMD. 
Energy dispersive X-ray spectrometry (EDS) was used to analyse the chemical composition of the corrosion products formed. For this purpose, small pieces were cut from the resulting specimens, cleaned, and embedded in epoxy resin. The results show that after only 5 months of exposure of AISI 304 stainless steel to the mining environment, the surface roughness increases significantly, with average depths almost 6 times greater than the initial value. Cracks are observed on the surface of the material, and they increase in size with the time of exposure. A large number of grains with a composition of more than 57% Pb and 16% Sn can be observed inside these cracks. Tensile tests show a reduction in the resistance of this material after only two months of exposure. The results show the serious problems that would result from using this material for mechanical components in a sulphide mining environment, not only because of the significant reduction in the lifetime of such components, but also because of the implications for human safety.
Keywords: acid mine drainage, corrosion, mechanical properties, stainless steel
Procedia PDF Downloads 14
24720 Early Gastric Cancer Prediction from Diet and Epidemiological Data Using Machine Learning in Mizoram Population
Authors: Brindha Senthil Kumar, Payel Chakraborty, Senthil Kumar Nachimuthu, Arindam Maitra, Prem Nath
Abstract:
Gastric cancer is predominantly caused by demographic and diet factors as compared to other cancer types. The aim of the study is to predict Early Gastric Cancer (EGC) from diet and lifestyle factors using supervised machine learning algorithms. For this study, 160 healthy individuals and 80 cases, followed for 3 years (2016-2019) at Civil Hospital, Aizawl, Mizoram, were selected. A dataset containing 11 features that are core risk factors for gastric cancer was extracted. Supervised machine learning algorithms (Logistic Regression, Naive Bayes, Support Vector Machine (SVM), Multilayer perceptron, and Random Forest) were used to analyze the dataset using Python Jupyter Notebook Version 3. The classification results were evaluated using the following metrics: minimum_false_positives, brier_score, accuracy, precision, recall, F1_score, and Receiver Operating Characteristics (ROC) curve. In terms of accuracy (in percent) and brier_score, respectively, the results were: Naive Bayes - 88, 0.11; Random Forest - 83, 0.16; SVM - 77, 0.22; Logistic Regression - 75, 0.25; and Multilayer perceptron - 72, 0.27. The Naive Bayes algorithm outperforms the others, with a very low false positive rate and brier_score and good accuracy. Its classification results in predicting EGC were very satisfactory using only diet and lifestyle factors, which will be very helpful for physicians in educating patients and the public, so that mortality from gastric cancer can be reduced or avoided with this knowledge mining work.
Keywords: Early Gastric cancer, Machine Learning, Diet, Lifestyle Characteristics
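The two headline metrics in this abstract, accuracy and brier_score, can be illustrated with a minimal sketch. It assumes binary labels (1 = case, 0 = healthy) and a 0.5 decision threshold; the function names and the toy data are hypothetical and are not taken from the study.

```python
def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome.

    Lower is better; 0 means perfectly confident, correct predictions.
    """
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical labels and predicted probabilities (illustration only).
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.1]

acc = accuracy(y_true, y_prob)
brier = brier_score(y_true, y_prob)
```

A classifier can have good accuracy but a poor Brier score if its probabilities are badly calibrated, which is presumably why the abstract reports both.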
Procedia PDF Downloads 160
24719 Access to Health Data in Medical Records in Indonesia in Terms of Personal Data Protection Principles: The Limitation and Its Implication
Authors: Anny Retnowati, Elisabeth Sundari
Abstract:
This research aims to elaborate the meaning of personal data protection principles on patient access to health data in medical records in Indonesia and its implications. The method is normative legal research examining health law in Indonesia regarding the patient's right to access their health data in medical records. The data will be analysed qualitatively using the interpretation method to elaborate on the limitation of the meaning of personal data protection principles on patients' access to their data in medical records. The results show that patients only have the right to obtain copies of their health data in medical records. There is no right to inspect the records directly at any time. Indonesian health law limits the principle of patients' right to broad access to their health data in medical records. This restriction has implications for the reduction of personal data protection as part of human rights. This research contributes by showing that a limitation of personal data protection may infringe human rights.
Keywords: access, health data, medical records, personal data, protection
Procedia PDF Downloads 91
24718 Conceptualizing the Knowledge to Manage and Utilize Data Assets in the Context of Digitization: Case Studies of Multinational Industrial Enterprises
Authors: Martin Böhmer, Agatha Dabrowski, Boris Otto
Abstract:
The trend of digitization significantly changes the role of data for enterprises. Data turn from an enabler into an intangible organizational asset that requires management and qualifies as a tradeable good. The idea of a networked economy has gained momentum in the data domain as collaborative approaches for data management emerge. Traditional organizational knowledge consequently needs to be extended by comprehensive knowledge about data. The knowledge about data is vital for organizations to ensure that data quality requirements are met and that data can be effectively utilized and sovereignly governed. As this specific knowledge has so far received little attention from academics, the aim of the research presented in this paper is to conceptualize it by proposing a “data knowledge model”. Relevant model entities have been identified based on a design science research (DSR) approach that iteratively integrates insights from various industry case studies and literature research.
Keywords: data management, digitization, industry 4.0, knowledge engineering, metamodel
Procedia PDF Downloads 355
24717 Stainless Steel Degradation by Sulphide Mining
Authors: Aguasanta M. Sarmiento, Jose Miguel Davila, Juan Carlos Fortes, Maria Luisa de la Torre
Abstract:
Acid mine drainage (AMD) is an acidic leachate with high levels of metals and sulphates in solution, which seriously affects the durability and strength of metallic materials used in the construction of structural and mechanical components. This paper presents the evolution over time of the reduction in tensile strength and of the defects in AISI 304 stainless steel in contact with acid mine drainage. For this purpose, a total of 30 bars with a diameter of 8 mm and a length of 14 cm were placed transversely in the course of a stream contaminated by AMD from the sulphide mines of the Iberian Pyritic Belt (SW Spain). This stream has an average pH of 2.6, a potential of 660 mV, and average concentrations of 12 g/L of sulphates, 1.2 g/L of Fe, 191 mg/L of Zn, etc. Every two months of exposure, 6 stainless steel bars were extracted from the acid stream. They were subjected to surface roughness analysis using a Mitutoyo Surftest SJ-210 surface roughness tester. The analysis was carried out at three different points on 5 specimens from each series, and the readings of each parameter were averaged to ensure the accuracy of the measurements and the surface coverage. The arithmetic mean roughness value (Ra), mean roughness depth (Rz), and root mean square roughness (Rq) were measured. Five specimens from each series were statically tensile tested using universal testing equipment (Servosis ME 403, 200 kN). The specimens were clamped at their ends with two grips for cylindrical sections, and the tensile force was applied at a constant rate of 0.5 kN/s, according to the requirements of standard UNE-EN ISO 6892-1:2020. To determine the modulus of elasticity, limits close to 15% and 55% of the maximum load were used, depending on the course of each test. Field Emission Scanning Electron Microscopy (FESEM) was used to observe corrosion products and defects generated by exposure to AMD.
Energy dispersive X-ray spectrometry (EDS) was used to analyze the chemical composition of the corrosion products formed. For this purpose, small pieces were cut from the resulting specimens, cleaned, and embedded in epoxy resin. The results show that after only 5 months of exposure of AISI 304 stainless steel to the mining environment, the surface roughness increases significantly, with average depths almost 6 times greater than the initial one. Cracks are observed on the surface of the material, and they increase in size with the time of exposure. A large number of grains with a composition of more than 57% Pb and 16% Sn can be observed inside these cracks. Tensile tests show a reduction in the strength of this material after only two months of exposure. The results show the serious problems that would result from using this material for mechanical components in a sulphide mining environment, not only because of the significant reduction in the lifetime of such components but also because of the implications for human safety.
Keywords: Acid mine drainage, Corrosion, Mechanical properties, Stainless steel
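The three roughness parameters named in this abstract (Ra, Rq, Rz) can be sketched from a sampled height profile. This is a simplified illustration, not the tester's internal algorithm: it assumes an already filtered profile given as a list of heights, measures deviations from the profile mean, and computes Rz as the average peak-to-valley height over equal-length segments. The segment count and the sample profile are hypothetical.

```python
import math


def roughness(profile, n_segments=5):
    """Return (Ra, Rq, Rz) for a list of profile heights (e.g. in micrometres)."""
    mean_line = sum(profile) / len(profile)
    deviations = [z - mean_line for z in profile]

    # Ra: arithmetic mean of absolute deviations from the mean line.
    ra = sum(abs(d) for d in deviations) / len(deviations)
    # Rq: root mean square of the deviations.
    rq = math.sqrt(sum(d * d for d in deviations) / len(deviations))

    # Rz: mean peak-to-valley height over equal-length sampling segments
    # (trailing points that do not fill a segment are ignored).
    seg_len = len(profile) // n_segments
    ranges = []
    for i in range(n_segments):
        segment = profile[i * seg_len:(i + 1) * seg_len]
        ranges.append(max(segment) - min(segment))
    rz = sum(ranges) / n_segments
    return ra, rq, rz


# Hypothetical six-point profile, split into two segments of three points.
ra, rq, rz = roughness([0.0, 2.0, -2.0, 1.0, -1.0, 0.0], n_segments=2)
```

Because Rq squares the deviations, it weights deep pits and cracks more heavily than Ra, so the two values diverge as localized corrosion damage grows.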
Procedia PDF Downloads 5
24716 A Method for the Extraction of the Character's Tendency from Korean Novels
Authors: Min-Ha Hong, Kee-Won Kim, Seung-Hoon Kim
Abstract:
The character in story-based content, such as novels and movies, is one of the core elements for understanding the story. In particular, the character’s tendency is an important factor in analyzing story-based content, because it has a significant influence on the storyline. If readers know the tendency of the characters before reading a novel, it will help them understand the structure of conflict, the episodes, and the relationships between characters in the novel. It may therefore help readers select a novel that they want to read. In this paper, we propose a method of extracting the tendency of the characters from a novel written in Korean. In advance, we build a dictionary with pairs of emotion words in Korean and English, since the emotion words in the novel’s sentences express the characters' feelings. We rate the degree of polarity (positive or negative) of the words in our emotion word dictionary based on SenticNet. Then we extract characters and emotion words from the sentences in a novel. Since the polarity of a word grows stronger or weaker due to sentence features such as quotations and modifiers, our proposed method considers them when calculating the polarity of characters. The extracted character polarity can be used in a book search service or a book recommendation service.
Keywords: character tendency, data mining, emotion word, Korean novel
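The scoring step described in this abstract can be pictured with a rough sketch: each emotion word has a dictionary polarity, and sentence features such as modifiers and quotations strengthen or weaken it. Everything concrete here is an assumption for illustration: the dictionary entries, the modifier weights, the quotation weight, and the averaging rule are hypothetical and do not come from the paper or from SenticNet.

```python
# Hypothetical polarity dictionary in [-1, 1] (the paper derives its values from SenticNet).
EMOTION_POLARITY = {"happy": 0.8, "angry": -0.7}
# Hypothetical weights for modifiers and for words spoken inside a quotation.
MODIFIER_WEIGHT = {"very": 1.5, "slightly": 0.5}
QUOTE_WEIGHT = 1.2


def character_polarity(mentions):
    """Average adjusted polarity over a character's emotion-word mentions.

    mentions: list of (emotion_word, modifier_or_None, in_quotation) tuples.
    """
    scores = []
    for word, modifier, quoted in mentions:
        score = EMOTION_POLARITY.get(word)
        if score is None:
            continue  # word is not in the emotion dictionary
        score *= MODIFIER_WEIGHT.get(modifier, 1.0)
        if quoted:
            score *= QUOTE_WEIGHT
        # Clamp so a single strongly modified word cannot leave [-1, 1].
        scores.append(max(-1.0, min(1.0, score)))
    return sum(scores) / len(scores) if scores else 0.0


polarity = character_polarity([("happy", "very", False), ("angry", None, True)])
```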
Procedia PDF Downloads 334
24715 On the Combination of Patient-Generated Data with Data from a Secure Clinical Network Environment: A Practical Example
Authors: Jeroen S. de Bruin, Karin Schindler, Christian Schuh
Abstract:
With more and more mobile health applications appearing due to the popularity of smartphones, the possibility arises that the data they generate can be used to improve the medical diagnostic process, as well as the overall quality of healthcare, while at the same time lowering costs. However, as of yet there have been no reports of a successful combination of patient-generated data from smartphones with data from clinical routine. In this paper, we describe how these two types of data can be combined in a secure way without modification to hospital information systems, and how they can together be used in a medical expert system for automatic nutritional classification and triage.
Keywords: mobile health, data integration, expert systems, disease-related malnutrition
Procedia PDF Downloads 476
24714 Methotrexate Associated Skin Cancer: A Signal Review of Pharmacovigilance Center
Authors: Abdulaziz Alakeel, Abdulrahman Alomair, Mohammed Fouda
Abstract:
Introduction: Methotrexate (MTX) is an antimetabolite used to treat multiple conditions, including neoplastic diseases, severe psoriasis, and rheumatoid arthritis. Skin cancer is the uncontrolled growth of abnormal cells in the epidermis, the outermost skin layer, caused by unrepaired DNA damage that triggers mutations. These mutations lead the skin cells to multiply rapidly and form malignant tumors. The aim of this review is to evaluate the risk of skin cancer associated with the use of methotrexate and to suggest regulatory recommendations if required. Methodology: The Signal Detection team at the Saudi Food and Drug Authority (SFDA) performed a safety review using the National Pharmacovigilance Center (NPC) database and the World Health Organization (WHO) VigiBase, alongside literature screening, to retrieve information for assessing the causality between skin cancer and methotrexate. The search was conducted in July 2020. Results: Four published articles support the association. A recent randomized controlled trial published in 2020 revealed a statistically significant increase in skin cancer among MTX users. Another study reported that methotrexate increases the risk of non-melanoma skin cancer when used in combination with immunosuppressant and biologic agents. In addition, the incidence of melanoma among methotrexate users was 3-fold higher than in the general population in a cohort study of rheumatoid arthritis patients. The last article, a cohort study estimating the risk of cutaneous malignant melanoma (CMM), showed a statistically significant increase in CMM risk in MTX-exposed patients. The WHO database (VigiBase) was searched for individual case safety reports (ICSRs) reported for 'Skin cancer' and 'Methotrexate' use, which yielded 121 ICSRs. The initial review revealed that 106 cases were insufficiently documented for proper medical assessment.
However, the remaining fifteen cases were extensively evaluated by applying the WHO criteria of causality assessment. As a result, 30 percent of the cases showed that MTX could possibly cause skin cancer; five cases showed an unlikely association, and five cases were unassessable due to lack of information. The Saudi NPC database was searched for any reported cases for the combined terms methotrexate/skin cancer; however, no local cases have been reported to date. In the data mining step, the observed and expected reporting rates for a drug/adverse drug reaction pair are compared using the information component (IC), a tool developed by the WHO Uppsala Monitoring Centre to measure the reporting ratio. A positive IC reflects a higher statistical association, while a negative value indicates a lower statistical association, with zero as the null value. Results showed that the combination of 'Methotrexate' and 'Skin cancer' was observed more often than expected when compared to other medications in the WHO database (IC value of 1.2). Conclusion: The weighted cumulative evidence identified from global cases, data mining, and published literature is sufficient to support a causal association between the risk of skin cancer and methotrexate. Therefore, health care professionals should be aware of this possible risk and may consider monitoring any signs or symptoms of skin cancer in patients treated with methotrexate.
Keywords: methotrexate, skin cancer, signal detection, pharmacovigilance
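The information component mentioned in this abstract compares observed and expected report counts on a log2 scale. The sketch below uses the commonly cited shrinkage form, IC = log2((O + 0.5) / (E + 0.5)), with E estimated from the marginal report counts; whether the review used exactly this form is an assumption, and all counts in the example are hypothetical rather than VigiBase figures.

```python
import math


def information_component(n_pair, n_drug, n_reaction, n_total):
    """IC for a drug/reaction pair from report counts.

    n_pair:     reports mentioning both the drug and the reaction (observed, O)
    n_drug:     reports mentioning the drug
    n_reaction: reports mentioning the reaction
    n_total:    all reports in the database
    """
    # Expected count if drug and reaction were reported independently.
    expected = n_drug * n_reaction / n_total
    # The +0.5 terms shrink the estimate toward zero when counts are small.
    return math.log2((n_pair + 0.5) / (expected + 0.5))


# Hypothetical counts: 15 observed pair reports against 5 expected.
ic = information_component(n_pair=15, n_drug=1000, n_reaction=500, n_total=100_000)
# When observed equals expected, the IC is at its null value of zero.
ic_null = information_component(n_pair=5, n_drug=1000, n_reaction=500, n_total=100_000)
```

On this scale, the reported IC of 1.2 corresponds to the pair being reported a little more than twice as often as expected (2 ** 1.2 is about 2.3), ignoring the shrinkage terms.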
Procedia PDF Downloads 113
24713 The Prospects of Leveraging (Big) Data for Accelerating a Just Sustainable Transition around Different Contexts
Authors: Sombol Mokhles
Abstract:
This paper tries to show the prospects of utilising (big) data for enabling a just transition of diverse cities. Our key purpose is to offer a framework of applications and implications of utilising (big) data in comparing sustainability transitions across different cities. Relying on cosmopolitan comparison, this paper explains the potential applications of (big) data as well as its limitations. The paper calls for adopting a data-driven and just perspective that includes different cities around the world. Having a just and inclusive approach front and centre ensures a just transition with synergistic effects that leaves nobody behind.
Keywords: big data, just sustainable transition, cosmopolitan city comparison, cities
Procedia PDF Downloads 98
24712 Destination Management Organization in the Digital Era: A Data Framework to Leverage Collective Intelligence
Authors: Alfredo Fortunato, Carmelofrancesco Origlia, Sara Laurita, Rossella Nicoletti
Abstract:
In the post-pandemic recovery phase of tourism, the role of a Destination Management Organization (DMO) as a coordinated management system of all the elements that make up a destination (attractions, access, marketing, human resources, brand, pricing, etc.) is also becoming relevant for local territories. The objective of a DMO is to maximize the visitor's perception of value and quality while ensuring the competitiveness and sustainability of the destination, as well as the long-term preservation of its natural and cultural assets, and to catalyze benefits for the local economy and residents. In carrying out the multiple functions to which it is called, the DMO can leverage a collective intelligence that comes from the ability to pool information, explicit and tacit knowledge, and relationships of the various stakeholders: policymakers, public managers and officials, entrepreneurs in the tourism supply chain, researchers, data journalists, schools, associations and committees, citizens, etc. The DMO potentially has at its disposal large volumes of data and many of them at low cost, that need to be properly processed to produce value. Based on these assumptions, the paper presents a conceptual framework for building an information system to support the DMO in the intelligent management of a tourist destination tested in an area of southern Italy. 
The approach adopted is data-informed and consists of four phases: (1) formulation of the knowledge problem (analysis of policy documents and industry reports; focus groups and co-design with stakeholders; definition of information needs and key questions); (2) research and metadata cataloguing of relevant sources (reconnaissance of official sources, administrative archives and internal DMO sources); (3) gap analysis and identification of unconventional information sources (evaluation of traditional sources with respect to their consistency with information needs, the freshness of information and the granularity of data; enrichment of the information base by identifying and studying web sources such as Wikipedia, Google Trends, Booking.com, Tripadvisor, websites of accommodation facilities and online newspapers); (4) definition of the set of indicators and construction of the information base (specific definition of indicators and procedures for data acquisition, transformation, and analysis). The framework derived consists of 6 thematic areas (accommodation supply, cultural heritage, flows, value, sustainability, and enabling factors), each of which is divided into three domains. Each domain gathers a specific information need, represented by a scheme of questions to be answered through the analysis of available indicators. The framework is characterized by a high degree of flexibility in the European context, given that it can be customized for each destination by adapting the part related to internal sources.
Application to the case study led to the creation of a decision support system that allows: • integration of data from heterogeneous sources, including through automated web crawling procedures for the ingestion of social and web information; • reading and interpretation of data and metadata through guided navigation paths in the style of digital storytelling; • implementation of complex analysis capabilities through the use of data mining algorithms, such as for the prediction of tourist flows.
Keywords: collective intelligence, data framework, destination management, smart tourism
Procedia PDF Downloads 121
24711 A Plan of Smart Management for Groundwater Resources
Authors: Jennifer Chen, Pei Y. Hsu, Yu W. Chen
Abstract:
Groundwater resources play a vital role in regional water supply because they satisfy over 1/3 of total demand. Because over-pumping might cause environmental impacts such as land subsidence, sustainable management of groundwater resources is required. In this study, a blueprint of smart management for groundwater resources is proposed and planned. The framework of the smart management can be divided into two major parts, hardware and software. First, an internet of groundwater (IoG), inspired by the Internet of Things (IoT), is proposed to observe the migration of groundwater usage and the associated response, groundwater levels. Second, algorithms based on data mining and signal analysis are proposed to achieve the goal of highly efficient groundwater management. The entire blueprint is a 4-year plan, and this year is the first year. We have finished the installation of 50 flow meters and 17 observation wells. An underground hydrological model is proposed to determine the drawdown caused by the measured pumping. In addition, an alternative to the flow meter is proposed to decrease the installation cost of the IoG: an accelerometer with 3G remote transmission is proposed to detect when groundwater pumping switches on and off.
Keywords: groundwater management, internet of groundwater, underground hydrological model, alternative of flow meter
Procedia PDF Downloads 375
24710 Strategic Workplace Security: The Role of Malware and the Threat of Internal Vulnerability
Authors: Modesta E. Ezema, Christopher C. Ezema, Christian C. Ugwu, Udoka F. Eze, Florence M. Babalola
Abstract:
Some employees knowingly or unknowingly contribute to loss of data and also expose data to threats in the process of getting their jobs done. Many organizations today face the challenge of securing their data as cyber criminals constantly devise new ways of attacking the organization’s secret data. This paper lists the latest strategies that must be put in place in order to protect these important data from attack in a collaborative workplace. It also introduces Advanced Persistent Threats (APTs) and how they work. An empirical study was conducted to collect data from employees in data centers on how data could be protected from malicious code and cyber criminals, and their responses are considered in detail to help check the activities of malicious code and cyber criminals in our workplaces.
Keywords: data, employee, malware, work place
Procedia PDF Downloads 382
24709 Application of Groundwater Level Data Mining in Aquifer Identification
Authors: Liang Cheng Chang, Wei Ju Huang, You Cheng Chen
Abstract:
Investigation and research are key to the conjunctive use of surface and groundwater resources. The hydrogeological structure is an important basis for groundwater analysis and simulation. Traditionally, the hydrogeological structure is determined manually based on geological drill logs, the structure of wells, groundwater levels, and so on. In Taiwan, a groundwater observation network has been built, and a large amount of groundwater-level observation data is available. The groundwater level is the state variable of the groundwater system, which reflects the system response combining hydrogeological structure, groundwater injection, and extraction. This study applies analytical tools to the observation database to develop a methodology for the identification of confined and unconfined aquifers. These tools include frequency analysis, cross-correlation analysis between rainfall and groundwater level, groundwater regression curve analysis, and a decision tree. The developed methodology is then applied to groundwater layer identification in two groundwater systems: the Zhuoshui River alluvial fan and the Pingtung Plain. The frequency analysis applies the Fourier transform to time-series groundwater-level observations and analyzes the daily-frequency amplitude of the groundwater level caused by artificial groundwater extraction. The cross-correlation analysis between rainfall and groundwater level is used to obtain the groundwater replenishment time between infiltration and the peak groundwater level during wet seasons. The groundwater regression curve, the average rate of groundwater regression, is used to analyze the internal flux in the groundwater system and the flux caused by artificial behaviors. The decision tree uses the information obtained from the above-mentioned analytical tools to optimize the estimate of the hydrogeological structure.
The developed method reaches a training accuracy of 92.31% and a verification accuracy of 93.75% on the Zhuoshui River alluvial fan, and a training accuracy of 95.55% and a verification accuracy of 100% on the Pingtung Plain. This high accuracy indicates that the developed methodology is an effective tool for identifying hydrogeological structures.
Keywords: aquifer identification, decision tree, groundwater, Fourier transform
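The cross-correlation step described in this abstract, which estimates the delay between rainfall and the groundwater-level response, can be sketched as a search for the lag that maximizes the Pearson correlation between the two series. This is a minimal illustration under stated assumptions (equally spaced daily series, non-negative lags only); the function names and the toy data are hypothetical and are not the authors' implementation.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def best_lag(rain, level, max_lag):
    """Return (lag, r) where level[t + lag] correlates best with rain[t]."""
    best = (0, float("-inf"))
    for lag in range(max_lag + 1):
        # Align rainfall at time t with the groundwater level lag steps later.
        r = pearson(rain[:len(rain) - lag], level[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Hypothetical daily series: the level mirrors rainfall two days later.
rain = [0, 5, 0, 0, 0, 3, 0, 0, 0, 0]
level = [0, 0, 0, 5, 0, 0, 0, 3, 0, 0]
lag, r = best_lag(rain, level, max_lag=4)
```

A short lag with high correlation would suggest a fast response to infiltration, while a long or weak response would point the other way; how the study maps these features to aquifer type is decided by its decision tree, not by this sketch.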
Procedia PDF Downloads 155
24708 Acceptance of Big Data Technologies and Its Influence towards Employee’s Perception on Job Performance
Authors: Jia Yi Yap, Angela S. H. Lee
Abstract:
With the use of big data technologies, organizations can obtain the results they are interested in. Big data technologies load all the data that is useful for an organization and provide a better way of analysing it. The purpose of this research is to gather the opinions of employees at firms in Malaysia on the use of big data technologies in their organizations, in order to show how it may affect employees' perception of job performance. To identify whether accepting big data technologies in the organization affects employees' perception, a questionnaire will be distributed to employees of different small and medium-sized enterprises (SMEs) listed in Malaysia. The proposed conceptual model will be tested with other variables in order to examine the relationships between variables.
Keywords: big data technologies, employee, job performance, questionnaire
Procedia PDF Downloads 296