Search results for: data mining challenges
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29188

29068 Evaluation of Classification Algorithms for Diagnosis of Asthma in Iranian Patients

Authors: Taha SamadSoltani, Peyman Rezaei Hachesu, Marjan GhaziSaeedi, Maryam Zolnoori

Abstract:

Introduction: Data mining is defined as a process of finding patterns and relationships in the data of a database in order to build predictive models. Applications of data mining have extended to many sectors, including healthcare services. Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. It applies various techniques and algorithms, which differ in accuracy and precision. The purpose of this study was to apply knowledge discovery and data mining techniques to the diagnosis of asthma based on patient symptoms and history. Method: Data mining involves several steps and decisions to be made by the user. It starts with developing an understanding of the scope and the application of previous knowledge in the area and identifying the knowledge discovery (KD) process from the point of view of the stakeholders, and it finishes with acting on the discovered knowledge by applying it, integrating it with other systems, and documenting and reporting it. In this study, a stepwise methodology was followed to achieve a logical outcome. Results: The sensitivity, specificity and accuracy of the KNN, SVM, Naïve Bayes, NN, classification tree and CN2 algorithms and of related similar studies were evaluated, and ROC curves were plotted to show the performance of the system. Conclusion: The results show that asthma can be diagnosed accurately (approximately ninety percent) based on demographic and clinical data. The study also showed that methods based on pattern discovery and data mining have a higher sensitivity than expert and knowledge-based systems. On the other hand, medical guidelines and evidence-based medicine should be the basis of diagnostic methods; it is therefore recommended that machine learning algorithms be used in combination with knowledge-based algorithms.
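
As an illustration of the three measures named in the Results, the following is a minimal sketch of how sensitivity, specificity and accuracy can be computed from binary predictions. The function name `diagnostic_metrics` and the toy labels are assumptions made for this example; they are not the study's data or code.

```python
def diagnostic_metrics(y_true, y_pred, positive=1):
    """Compute sensitivity, specificity and accuracy for binary labels.

    Hypothetical helper for illustration only; not taken from the study.
    """
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # diseased and flagged as diseased
            else:
                fp += 1  # healthy but flagged as diseased
        else:
            if truth == positive:
                fn += 1  # diseased but missed
            else:
                tn += 1  # healthy and correctly cleared
    total = tp + fp + tn + fn
    return {
        # Share of true positives among all actual positives.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of true negatives among all actual negatives.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else float("nan"),
    }


# Toy labels (invented): 1 = asthma, 0 = no asthma.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters in diagnostic settings because a classifier can reach high accuracy on imbalanced data while still missing many true cases.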

Keywords: asthma, data mining, classification, machine learning

Procedia PDF Downloads 447
29067 A Concept of Data Mining with XML Document

Authors: Akshay Agrawal, Anand K. Srivastava

Abstract:

The increasing amount of XML datasets available to casual users increases the need to investigate techniques for extracting knowledge from these data. Data mining is widely applied in the database research area to extract frequent correlations of values from both structured and semi-structured datasets. The increasing availability of heterogeneous XML sources has raised a number of issues concerning how to represent and manage these semi-structured data. In recent years, owing to the importance of managing these resources and extracting knowledge from them, many methods have been proposed to represent and cluster them in different ways.

Keywords: XML, similarity measure, clustering, cluster quality, semantic clustering

Procedia PDF Downloads 379
29066 Data Mining Approach: Classification Model Evaluation

Authors: Lubabatu Sada Sodangi

Abstract:

The rapid growth in the exchange and accessibility of information via the internet leads many organisations to acquire data on their own operations. The aim of data mining is to analyse the different behaviours of a dataset using observation. However, the subset of the dataset being analysed may not display all the behaviours and relationships of the entire data and, therefore, may not represent other parts that exist in the dataset. A range of techniques is used in data mining to uncover hidden or unknown information in datasets. In this paper, the performance of two algorithms, Chi-Square Automatic Interaction Detection (CHAID) and multilayer perceptron (MLP), is compared using the Adult dataset to find the percentage of adults that earn > 50K and those that earn <= 50K per year. The two algorithms were studied and compared using IBM SPSS Statistics software. The result for CHAID shows that the most important predictors are relationship and education. The algorithm shows that those who are married (husband), hold a Bachelor, Masters, Doctorate or Prof-school qualification, and are older than 41 and younger than 57 earn > 50K. The multilayer perceptron, in turn, identifies marital status and capital gain as the most important predictors of income. It also shows that individuals whose capital gain is less than 6,849 and who are single, separated or widowed earn <= 50K, whereas individuals whose capital gain is > 6,849, who work > 35 hrs/wk, and who are older than 27 earn > 50K. Comparing the two algorithms, both are observed to be reliable, but CHAID is the more reliable of the two and clearly shows that relationship and education contribute to the prediction, as displayed in the data visualisation.

Keywords: data mining, CHAID, multi-layer perceptron, SPSS, Adult dataset

Procedia PDF Downloads 378
29065 Association of Social Data as a Tool to Support Government Decision Making

Authors: Diego Rodrigues, Marcelo Lisboa, Elismar Batista, Marcos Dias

Abstract:

Based on data on child labor, this work raises questions about how to understand and locate the factors that make up child labor rates, and which properties are important for analyzing these cases. Data mining techniques were used to discover valid patterns in Brazilian social databases, evaluating child labor data from the State of Tocantins (located in the north of Brazil, with a territory of 277,000 km² and 139 counties). This work aims to detect the factors that determine the practice of child labor and their relationships with financial, educational, regional and social indicators, generating information that is not explicit in the government database and thus enabling better monitoring and updating of policies for this purpose.

Keywords: social data, government decision making, association of social data, data mining

Procedia PDF Downloads 369
29064 Strategies to Enhance Compliance of Health and Safety Standards at the Selected Mining Industries in Limpopo Province, South Africa: Occupational Health Nurse’s Perspective

Authors: Livhuwani Muthelo

Abstract:

The health and safety of the miners in the South African mining industry are guided by regulations and standards that are expected to promote a healthy work environment and prevent fatalities. It is of utmost importance for the miners to comply with these regulations and standards to protect themselves from potential occupational health and safety risks, accidents, and fatalities. The purpose of this study was to develop and validate strategies to enhance compliance with the health and safety standards within the mining industries of Limpopo province in South Africa. A mixed-method exploratory sequential research design was adopted. The population consisted of 5350 miners. Purposive sampling was used to select the participants in the qualitative strand and stratified random sampling in the quantitative strand. Semi-structured interviews were conducted among the occupational health nurse practitioners and the health and safety team. Thematic analysis was used to generate an understanding of the interviews. In the quantitative strand, a survey was conducted using a self-administered questionnaire. Data were analysed using SPSS version 26.0. Descriptive statistics, including frequencies, means, and standard deviations, were used in the analysis of the data. Cronbach's alpha test was used to measure internal consistency. The integrated results revealed that there are diverse experiences related to health and safety standards compliance among the mineworkers. The main findings were challenges related to leadership compliance, challenges related to the cost of maintaining safety, and challenges related to miners' behavior; the impact of non-compliance on the overall health of the miners and the conflict between production and safety were also described. Health and safety compliance is not mere compliance with regulations and standards but a culture that requires the miners and the organization to take responsibility for their behavior and actions towards health and safety, thus taking responsibility for their own well-being and that of other miners.

Keywords: perceptions, compliance, health and safety, legislation, standards, miners

Procedia PDF Downloads 104
29063 Optimal Classifying and Extracting Fuzzy Relationship from Query Using Text Mining Techniques

Authors: Faisal Alshuwaier, Ali Areshey

Abstract:

Text mining techniques are generally applied to classify text and to find fuzzy relations and structures in data sets. This research provides a range of text mining capabilities. One common application is text classification and event extraction, which encompasses deducing specific knowledge concerning incidents referred to in texts. The main contribution of this paper is the clarification of a concept graph generation mechanism, which is based on text classification and optimal fuzzy relationship extraction. Furthermore, the work presented in this paper explains the application of fuzzy relationship extraction and the branch and bound method to simplify texts.

Keywords: extraction, max-prod, fuzzy relations, text mining, memberships, classification

Procedia PDF Downloads 582
29062 Student Performance and Confidence Analysis on Education Virtual Environments through Different Assessment Strategies

Authors: Rubén Manrique, Delio Balcázar, José Parrado, Sebastián Rodríguez

Abstract:

Hand in hand with the evolution of technology, education systems have moved to virtual environments to provide increased coverage and facilitate access to education. However, measuring student performance in virtual environments presents significant challenges in ensuring that students are acquiring the expected skills. In this study, the confidence and performance of engineering students in virtual environments are analyzed through different evaluation strategies. The effect of the assessment strategy on student confidence is identified using educational data mining techniques. Four assessment strategies were used: first, a conventional multiple choice test; second, a multiple choice test with feedback; third, a multiple choice test with a second chance; and fourth, a multiple choice test with feedback and a second chance. Our results show that applying testing with online feedback strategies can positively influence student confidence.

Keywords: assessment strategies, educational data mining, student performance, student confidence

Procedia PDF Downloads 354
29061 Recognizing Customer Preferences Using Review Documents: A Hybrid Text and Data Mining Approach

Authors: Oshin Anand, Atanu Rakshit

Abstract:

The vast growth in e-commerce ventures makes this area a prominent research stream. Besides several quantified parameters, the textual content of reviews is a storehouse of information that can educate companies and help them earn profit. This study is an attempt in this direction. The article attempts to categorize data based on a computed metric that quantifies the influencing capacity of reviews, yielding two categories: high and low influential reviews. Further, each of these documents is studied to derive several product feature categories. Each of these categories, along with the computed metric, is converted to linguistic identifiers and used in an association mining model. The article makes a novel attempt to combine feature attraction with a quantified metric to categorize review text and finally provide frequent patterns that depict customer preferences. Frequent mentions in a highly influential score depict customer likes or preferred features in the product, whereas prominent patterns in low influencing reviews highlight what is not important for customers. This is achieved using a hybrid approach of text mining for feature and term extraction, sentiment analysis, a multicriteria decision-making technique and an association mining model.

Keywords: association mining, customer preference, frequent pattern, online reviews, text mining

Procedia PDF Downloads 388
29060 Using Data Mining Technique for Scholarship Disbursement

Authors: J. K. Alhassan, S. A. Lawal

Abstract:

This work is on decision tree-based classification for the disbursement of scholarships. A tree-based data mining classification technique is used in order to determine the generic rule to be used to disburse the scholarship. Based on the rules defined from the tree, the system is able to determine the class (status) to which an applicant belongs: Granted or Not Granted. Applicants who fall into the Granted class have successfully acquired the scholarship, while those in the Not Granted class are unsuccessful in the scheme. An algorithm that can be used to classify the applicants based on the rules from the tree-based classification was also developed. Tree-based classification is adopted because of its efficiency, effectiveness, and easy-to-comprehend features. The system was tested with data from the National Information Technology Development Agency (NITDA) Abuja, a parastatal of the Federal Ministry of Communication Technology that is mandated to develop and regulate information technology in Nigeria. The system was found to work according to the specification. It is therefore recommended for all scholarship disbursement organizations.

Keywords: classification, data mining, decision tree, scholarship

Procedia PDF Downloads 375
29059 Visual Text Analytics Technologies for Real-Time Big Data: Chronological Evolution and Issues

Authors: Siti Azrina B. A. Aziz, Siti Hafizah A. Hamid

Abstract:

New approaches to analyzing and visualizing data streams in real time are important for prompt decision making by the decision maker. Financial market trading and surveillance, large-scale emergency response and crowd control are some example scenarios that require real-time analytics and data visualization. This situation has led to the development of techniques and tools that support humans in analyzing the source data. With the emergence of Big Data and social media, new techniques and tools are required in order to process the streaming data. Today, a range of tools that implement some of these functionalities is available. In this paper, we present a chronological evaluation of the evolution of technologies supporting real-time analytics and visualization of data streams. Based on research papers published from 2002 to 2014, we gathered the general information, main techniques, challenges and open issues. The techniques for streaming text visualization are identified based on the Text Visualization Browser in chronological order. This paper aims to review the evolution of streaming text visualization techniques and tools, as well as to discuss the problems and challenges for each of the identified tools.

Keywords: information visualization, visual analytics, text mining, visual text analytics tools, big data visualization

Procedia PDF Downloads 399
29058 Assessing Supply Chain Performance through Data Mining Techniques: A Case of Automotive Industry

Authors: Emin Gundogar, Burak Erkayman, Nusret Sazak

Abstract:

Providing effective management performance through the whole supply chain is a critical issue and hard to achieve. The proper evaluation of integrated data may yield accurate information. Analysing the supply chain data through OLAP (On-Line Analytical Processing) technologies may provide a multi-angle view of the work and consolidation. In this study, association rules and classification techniques are applied to measure the supply chain performance metrics of an automotive manufacturer in Turkey. Main criteria and important rules are determined. A comparison of the results of the algorithms is presented.

Keywords: supply chain performance, performance measurement, data mining, automotive

Procedia PDF Downloads 513
29057 Text Mining of Twitter Data Using a Latent Dirichlet Allocation Topic Model and Sentiment Analysis

Authors: Sidi Yang, Haiyi Zhang

Abstract:

Twitter is a microblogging platform where millions of users share their attitudes, views, and opinions daily. Using a probabilistic Latent Dirichlet Allocation (LDA) topic model to discern the most popular topics in the Twitter data is an effective way to analyze a large set of tweets and find a set of topics in a computationally efficient manner. Sentiment analysis provides an effective method to show the emotions and sentiments found in each tweet and an efficient way to summarize the results in a manner that is clearly understood. The primary goal of this paper is to explore text mining and to extract and analyze useful information from unstructured text using two approaches, LDA topic modelling and sentiment analysis, by examining Twitter plain text data in English. These two methods allow people to mine data more effectively and efficiently. The LDA topic model and sentiment analysis can also be applied to provide insights in business and scientific fields.

Keywords: text mining, Twitter, topic model, sentiment analysis

Procedia PDF Downloads 179
29056 On an Approach for Rule Generation in Association Rule Mining

Authors: B. Chandra

Abstract:

In Association Rule Mining, much attention has been paid to developing algorithms for large (frequent/closed/maximal) itemsets, but very little attention has been paid to improving the performance of rule generation algorithms. Rule generation is an important part of Association Rule Mining. In this paper, a novel approach named NARG (Association Rule using Antecedent Support) is proposed for rule generation; it uses a memory-resident data structure named FCET (Frequent Closed Enumeration Tree) to find frequent/closed itemsets. In addition, the computational speed of NARG is enhanced by giving importance to the rules that have lower antecedent support. A comparative performance evaluation of NARG against the fast association rule mining algorithm for rule generation was carried out on synthetic datasets and real-life datasets (taken from the UCI Machine Learning Repository). Performance analysis shows that NARG is computationally faster than the existing algorithms for rule generation.
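
To make the rule generation step concrete, the sketch below shows the generic, textbook way of deriving rules from frequent itemsets by confidence, where confidence(A -> B) = support(A ∪ B) / support(A). It is not the NARG algorithm and does not use an FCET; the function `generate_rules` and the toy support counts are assumptions introduced only for illustration.

```python
from itertools import combinations


def generate_rules(support, min_confidence):
    """Derive association rules from frequent itemsets.

    `support` maps each frequent itemset (a frozenset) to its support count
    and is assumed to contain every non-empty subset of each itemset.
    Generic confidence-based generation; NOT the NARG method.
    """
    rules = []
    for itemset, count in support.items():
        if len(itemset) < 2:
            continue  # a rule needs a non-empty antecedent and consequent
        for size in range(1, len(itemset)):
            for items in combinations(sorted(itemset), size):
                antecedent = frozenset(items)
                # Lower antecedent support yields higher confidence
                # for the same itemset.
                confidence = count / support[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Toy support counts (invented for this example).
support = {
    frozenset({"bread"}): 4,
    frozenset({"milk"}): 3,
    frozenset({"bread", "milk"}): 3,
}
rules = generate_rules(support, min_confidence=0.8)
for antecedent, consequent, confidence in rules:
    print(set(antecedent), "->", set(consequent), round(confidence, 2))
```

Because confidence is inversely proportional to antecedent support for a fixed itemset, antecedents with lower support are the ones most likely to pass a confidence threshold, which is consistent with the abstract's emphasis on such rules.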

Keywords: knowledge discovery, association rule mining, antecedent support, rule generation

Procedia PDF Downloads 324
29055 What the Future Holds for Social Media Data Analysis

Authors: P. Wlodarczak, J. Soar, M. Ally

Abstract:

The dramatic rise in the use of Social Media (SM) platforms such as Facebook and Twitter provides access to an unprecedented amount of user data. Users may post reviews on products and services they bought, write about their interests, share ideas or give their opinions and views on political issues. There is growing interest among organisations in the analysis of SM data for detecting new trends, obtaining user opinions on their products and services or finding out about their online reputations. A recent research trend in SM analysis is making predictions based on sentiment analysis of SM. Indicators from historic SM data are often represented as time series and correlated with a variety of real-world phenomena such as the outcome of elections, the development of financial indicators, box office revenue and disease outbreaks. This paper examines the current state of research in the area of SM mining and predictive analysis and gives an overview of the analysis methods using opinion mining and machine learning techniques.

Keywords: social media, text mining, knowledge discovery, predictive analysis, machine learning

Procedia PDF Downloads 423
29054 Design of a Small and Medium Enterprise Growth Prediction Model Based on Web Mining

Authors: Yiea Funk Te, Daniel Mueller, Irena Pletikosa Cvijikj

Abstract:

Small and medium enterprises (SMEs) play an important role in the economy of many countries. When the overall world economy is considered, SMEs represent 95% of all businesses in the world, accounting for 66% of the total employment. Existing studies show that the current business environment is characterized as highly turbulent and strongly influenced by modern information and communication technologies, thus forcing SMEs to experience more severe challenges in maintaining their existence and expanding their business. To support SMEs in improving their competitiveness, researchers have recently turned their focus to applying data mining techniques to build risk and growth prediction models. However, data used to assess risk and growth indicators is primarily obtained via questionnaires, which is very laborious and time-consuming, or is provided by financial institutes and is thus highly sensitive to privacy issues. Recently, web mining (WM) has emerged as a new approach towards obtaining valuable insights in the business world. WM enables automatic and large-scale collection and analysis of potentially valuable data from various online platforms, including companies’ websites. While WM methods have been frequently studied to anticipate growth of sales volume for e-commerce platforms, their application for assessment of SME risk and growth indicators is still scarce. Considering that a vast proportion of SMEs own a website, WM bears great potential in revealing valuable information hidden in SME websites, which can further be used to understand SME risk and growth indicators, as well as to enhance current SME risk and growth prediction models. This study aims at developing an automated system to collect business-relevant data from the Web and predict future growth trends of SMEs by means of WM and data mining techniques. The envisioned system should serve as an 'early recognition system' for future growth opportunities.
In an initial step, we examine how structured and semi-structured Web data in governmental or SME websites can be used to explain the success of SMEs. WM methods are applied to extract Web data in the form of additional input features for the growth prediction model. The data on SMEs provided by a large Swiss insurance company is used as ground truth data (i.e. growth-labeled data) to train the growth prediction model. Different machine learning classification algorithms such as the Support Vector Machine, Random Forest and Artificial Neural Network are applied and compared, with the goal of optimizing the prediction performance. The results are compared to those from previous studies in order to assess the contribution of growth indicators retrieved from the Web to increasing the predictive power of the model.

Keywords: data mining, SME growth, success factors, web mining

Procedia PDF Downloads 266
29053 Data Mining of Students' Performance Using Artificial Neural Network: Turkish Students as a Case Study

Authors: Samuel Nii Tackie, Oyebade K. Oyedotun, Ebenezer O. Olaniyi, Adnan Khashman

Abstract:

Artificial neural networks have been used in different fields of artificial intelligence, and more specifically in machine learning. Although other machine learning options are feasible in most situations, the ease with which neural networks lend themselves to different problems, including pattern recognition, image compression, classification, computer vision and regression, has earned them a remarkable place in the machine learning field. This research exploits neural networks as a data mining tool for predicting the number of times a student repeats a course, considering some attributes relating to the course itself, the teacher, and the particular student. Neural networks were used in this work to map the relationship between some attributes related to students’ course assessment and the number of times a student will possibly repeat a course before passing. It is hoped that the possibility of predicting students’ performance from such complex relationships can help facilitate the fine-tuning of academic systems and policies implemented in learning environments. To validate the power of neural networks in data mining, a Turkish students’ performance database was used; feedforward and radial basis function networks were trained for this task, and the performances obtained from these networks were evaluated in terms of achieved recognition rates and training time.

Keywords: artificial neural network, data mining, classification, students’ evaluation

Procedia PDF Downloads 613
29052 Developing Structured Sizing Systems for Manufacturing Ready-Made Garments of Indian Females Using Decision Tree-Based Data Mining

Authors: Hina Kausher, Sangita Srivastava

Abstract:

In India, there is a lack of a standard, systematic sizing approach for producing ready-made garments. Garment manufacturing companies use their own size tables, created by modifying international sizing charts for ready-made garments. The purpose of this study is to tabulate anthropometric data that cover the variety of figure proportions in both height and girth. 3,000 data records were collected through an anthropometric survey of females between the ages of 16 and 80 years from some states of India in order to produce a sizing system suitable for clothing manufacture and retailing. These data are used for the statistical analysis of body measurements and the formulation of sizing systems and body measurement tables. Factor analysis is used to filter the control body dimensions from a large number of variables. Decision tree-based data mining is used to cluster the data. The standard and structured sizing system can facilitate pattern grading and garment production. Moreover, it can exceed buying ratios and upgrade size allocations to retail segments.

Keywords: anthropometric data, data mining, decision tree, garments manufacturing, sizing systems, ready-made garments

Procedia PDF Downloads 133
29051 Mining in Nigeria and Development Effort of Metallurgical Technologies at National Metallurgical Development Center Jos, Plateau State-Nigeria

Authors: Linus O. Asuquo

Abstract:

This paper addresses mining in Nigeria and the effort to develop metallurgical technologies at the National Metallurgical Development Centre, Jos. The paper looks at the history of mining in Nigeria, the impact of mining on social and industrial development, and the contribution of the mining sector to Nigeria’s Gross Domestic Product (GDP). The paper states that Nigeria’s mining sector contributes only 0.5% to the nation’s GDP, unlike in Botswana, where the mining sector contributes 38% to the nation’s GDP. The Nigeria Bureau of Statistics has it on record that Nigeria has about 44 solid minerals waiting to be exploited. The paper highlights the abundant potential for investment that exists in the mining sector. It also describes the extensive efforts made at the National Metallurgical Development Center (NMDC) to develop metallurgical technologies in various areas of the metals sector, such as mineral processing, foundry development, nonferrous metals extraction, materials testing, lime calcination, ANO (trade name for a powder lubricant) wire drawing lubricant, refractories and many others. The paper concludes that there is a need to develop the mining sector in Nigeria and to give sustainable support to the efforts currently being made at NMDC to develop metallurgical technologies capable of transforming the metals sector in Nigeria, which will lead to industrialization. Finally, the paper makes some recommendations that cover the topic.

Keywords: mining, minerals, technologies, value addition

Procedia PDF Downloads 102
29050 Small-Scale Mining Policies in Ghana: Miners' Knowledge, Attitudes and Practices

Authors: Franklin Nantui Mabe, Robert Osei

Abstract:

The activities and operations of artisanal small-scale mining (ASM) have recently attracted the attention of policymakers, researchers, and the general public in Ghana. This stems from the negative impacts of ASM operations on the environment and the livelihoods of local inhabitants, as well as the disregard for existing ASM policies. This study, therefore, investigates whether or not artisanal small-scale miners have enough knowledge of the mining policies and their implementation. The study adopted the Knowledge, Attitudes, and Practices (KAP) framework to design the research and to collect and analyze primary data. The policy provision of which miners are most aware is the one that mandates the government to reserve demarcated ASM areas for Ghanaians, whilst the provision of which they are least aware is the one that urges the government to promote co-operative saving among ASM. The awareness index is lower than the attitude index towards the policy provisions. In terms of practices, miners continued to use bad practices, with the associated negative impacts on the environment and rural livelihoods. It is therefore important for the government, through the Mineral Commission and the district, municipal and metropolitan assemblies, to intensify education on the ASM policies. This could be done with the help of ASM associations. The current system, in which a cluster of districts shares a single Mineral Commission Office, should be restructured to make sure that each mining district has an office.

Keywords: mining policies, KAP, awareness, artisanal small-scale mining

Procedia PDF Downloads 185
29049 Application of Data Mining Techniques for Tourism Knowledge Discovery

Authors: Teklu Urgessa, Wookjae Maeng, Joong Seek Lee

Abstract:

Five implementations of three data mining classification techniques were tested for extracting important insights from tourism data. The aim was to find the best-performing algorithm among those compared for tourism knowledge discovery. The knowledge discovery process from data was used as the process model. The 10-fold cross-validation method was used for testing. Various data preprocessing activities were performed to obtain the final dataset for model building. Classification models of the selected algorithms were built under different scenarios on the preprocessed dataset. The best-performing algorithm on the tourism dataset was Random Forest (76%) before applying information gain based attribute selection and J48 (C4.5) (75%) after selection of the attributes most relevant to the class (target) attribute. In terms of model building time, attribute selection improves the efficiency of all algorithms. Artificial Neural Network (multilayer perceptron) showed the highest improvement (90%). The rules extracted from the decision tree model are presented; they show intricate, non-trivial knowledge that would otherwise not be discovered by simple statistical analysis, although the classification algorithms achieved only mediocre accuracy.
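
The 10-fold cross-validation procedure mentioned above can be sketched as follows. This is a minimal, library-free illustration of how the folds are formed; the function `k_fold_indices`, the sample count of 25 and the fixed seed are assumptions for the example and do not reflect the study's actual tooling or data.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Hypothetical helper for illustration; each sample lands in exactly one
    test fold, and the remaining k - 1 folds form the training set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


splits = list(k_fold_indices(25, k=10))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(fold_number, len(train), len(test))
```

In practice a model would be trained on each `train` set and scored on the matching `test` set, and the ten scores would be averaged to give the reported accuracy.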

Keywords: classification algorithms, data mining, knowledge discovery, tourism

Procedia PDF Downloads 295
29048 Designing an Enterprise Architecture for Mining Company by Using Togaf Framework

Authors: Rika Yuliana, Budi Rahardjo

Abstract:

The role of ICT in organizations will continue to grow in line with business growth. In reality, however, there is a gap between ICT initiatives and the development (needs) of the company's business, caused by inadequate ICT strategic alignment. Therefore, this study was conducted with the aim of creating an enterprise architecture model, particularly for mining companies, using the TOGAF framework. The results from the design development phase of the mining enterprise architecture meta model represent the domains of business, applications, data, and technology. The results of the design as a whole were analyzed from four perspectives: contextual, conceptual, logical and physical. Finally, a quality assessment of the mining enterprise architecture was conducted to assess its conformance to design standards and architectural principles.

Keywords: design and development the information technology architecture, enterprise architecture, enterprise architecture design result, TOGAF architecture development method (ADM)

Procedia PDF Downloads 445
29047 Emergence of Information Centric Networking and Web Content Mining: A Future Efficient Internet Architecture

Authors: Sajjad Akbar, Rabia Bashir

Abstract:

With the growth in the number of users, Internet usage has evolved. Due to its key design principle, the Internet has expanded enormously in size. This tremendous growth has brought new applications (mobile video and cloud computing) as well as new user requirements, e.g., content distribution, mobility, ubiquity, security, and trust. Users are more interested in content than in the peer nodes they communicate with. The current Internet architecture is host-centric, which is not suitable for these types of applications. With the growing use of multiple interactive applications, the host-centric approach is considered less efficient because it depends on physical location. For this reason, Information Centric Networking (ICN) is considered a potential future Internet architecture. ICN introduces uniquely named data as a core Internet principle. It uses a receiver-oriented rather than a sender-oriented approach and introduces a name-based information system at the network layer. Although ICN is considered a future Internet architecture, it has drawn considerable criticism, mainly concerning how it will manage the most relevant content. Web Content Mining (WCM) approaches can help with appropriate data management in ICN. To address this issue, this paper (i) discusses multiple ICN approaches, (ii) analyzes different Web Content Mining approaches, and (iii) creates a new Internet architecture by merging ICN and WCM to solve the data management issues of ICN. From ICN, Content-Centric Networking (CCN) is selected for the new architecture, while from Web Content Mining the agent-based approach is selected to find the most appropriate data.

Keywords: agent based web content mining, content centric networking, information centric networking

Procedia PDF Downloads 475
29046 Data Mining in Healthcare for Predictive Analytics

Authors: Ruzanna Muradyan

Abstract:

Medical data mining is a crucial field in contemporary healthcare that offers cutting-edge tactics with enormous potential to transform patient care. This abstract examines how sophisticated data mining techniques could transform the healthcare industry, with a special focus on how they might improve patient outcomes. Healthcare data repositories have dynamically evolved, producing a rich tapestry of different, multi-dimensional information that includes genetic profiles, lifestyle markers, electronic health records, and more. By utilizing data mining techniques inside this vast library, a variety of prospects for precision medicine, predictive analytics, and insight production become visible. Predictive modeling for illness prediction, risk stratification, and therapy efficacy evaluations are important points of focus. Healthcare providers may use this abundance of data to tailor treatment plans, identify high-risk patient populations, and forecast disease trajectories by applying machine learning algorithms and predictive analytics. Better patient outcomes, more efficient use of resources, and early treatments are made possible by this proactive strategy. Furthermore, data mining techniques act as catalysts to reveal complex relationships between apparently unrelated data pieces, providing enhanced insights into the cause of disease, genetic susceptibilities, and environmental factors. Healthcare practitioners can get practical insights that guide disease prevention, customized patient counseling, and focused therapies by analyzing these associations. The abstract explores the problems and ethical issues that come with using data mining techniques in the healthcare industry. In order to properly use these approaches, it is essential to find a balance between data privacy, security issues, and the interpretability of complex models. 
Finally, this abstract demonstrates the revolutionary power of modern data mining methodologies in transforming the healthcare sector. Healthcare practitioners and researchers can uncover unique insights, enhance clinical decision-making, and ultimately elevate patient care to unprecedented levels of precision and efficacy by employing cutting-edge methodologies.

Keywords: data mining, healthcare, patient care, predictive analytics, precision medicine, electronic health records, machine learning, predictive modeling, disease prognosis, risk stratification, treatment efficacy, genetic profiles, precision health

Procedia PDF Downloads 62
29045 Main Cause of Children's Deaths in Indigenous Wayuu Community from Department of La Guajira: A Research Developed through Data Mining Use

Authors: Isaura Esther Solano Núñez, David Suarez

Abstract:

The main purpose of this research is to discover what causes death in children of the Wayuu community, and to analyze those results in depth in order to take corrective measures to control infant mortality. We consider it important to determine the reasons for early death in this population, since its members are the most vulnerable to high-risk environmental conditions. In this way, the government, through the competent authorities, may develop prevention policies and appropriate measures to avoid an increase in these deaths. The methodology used in this investigation is data mining, which consists of gathering and examining large amounts of data to produce new and valuable information. Through this technique it has been possible to determine that the child population is dying mostly from malnutrition. In short, the technique has been very useful for this study: it has allowed us to transform large amounts of information into a conclusive and important statement, which makes it easier to take appropriate steps to resolve a particular situation.

Keywords: malnutrition, data mining, analytical, descriptive, population, Wayuu, indigenous

Procedia PDF Downloads 159
29044 Building an Integrated Relational Database from Swiss Nutrition National Survey and Swiss Health Datasets for Data Mining Purposes

Authors: Ilona Mewes, Helena Jenzer, Farshideh Einsele

Abstract:

Objective: The objective of the study was to integrate two large databases, from the Swiss nutrition national survey (menuCH) and the Swiss health national survey 2012, for data mining purposes. Each database contains demographic base data. An integrated Swiss database is built in order to later discover critical food consumption patterns linked with lifestyle diseases known to be strongly tied to food consumption. Design: The Swiss nutrition national survey (menuCH), with approx. 2000 respondents from two different surveys, one by phone and the other by questionnaire, and the Swiss health national survey 2012, with 21500 respondents, were pre-processed, cleaned, and finally integrated into a single relational database. Results: The result of this study is an integrated relational database built from the Swiss nutritional and health databases.
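
As a rough illustration of how two surveys that share demographic base data might be combined in one relational schema, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, and sample values are all hypothetical and are not taken from menuCH or the Swiss health survey. The sketch also assumes that the surveys cover different respondents, so the link is made through demographic strata rather than individual IDs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE respondent (              -- shared demographic base data
    respondent_id INTEGER PRIMARY KEY,
    source        TEXT NOT NULL,       -- 'nutrition' or 'health' (hypothetical tags)
    age_group     TEXT NOT NULL,
    sex           TEXT NOT NULL
);
CREATE TABLE food_consumption (
    respondent_id INTEGER REFERENCES respondent(respondent_id),
    food_group    TEXT,
    grams_per_day REAL
);
CREATE TABLE health_diagnosis (
    respondent_id INTEGER REFERENCES respondent(respondent_id),
    diagnosis     TEXT
);
""")

# Hypothetical sample rows standing in for the cleaned survey data.
conn.executemany("INSERT INTO respondent VALUES (?, ?, ?, ?)", [
    (1, "nutrition", "35-49", "f"),
    (2, "nutrition", "35-49", "f"),
    (3, "health", "35-49", "f"),
    (4, "health", "35-49", "f"),
    (5, "health", "50-64", "m"),
])
conn.executemany("INSERT INTO food_consumption VALUES (?, ?, ?)", [
    (1, "red meat", 60.0),
    (2, "red meat", 100.0),
])
conn.executemany("INSERT INTO health_diagnosis VALUES (?, ?)", [
    (3, "hypertension"),
    (4, "hypertension"),
    (5, "diabetes"),
])

# Compare average intake with diagnosis counts per demographic stratum.
rows = conn.execute("""
SELECT n.age_group, n.sex, n.avg_grams, d.cases
FROM (SELECT r.age_group, r.sex, AVG(f.grams_per_day) AS avg_grams
      FROM respondent r JOIN food_consumption f ON f.respondent_id = r.respondent_id
      GROUP BY r.age_group, r.sex) AS n
JOIN (SELECT r.age_group, r.sex, COUNT(*) AS cases
      FROM respondent r JOIN health_diagnosis h ON h.respondent_id = r.respondent_id
      GROUP BY r.age_group, r.sex) AS d
  ON d.age_group = n.age_group AND d.sex = n.sex
""").fetchall()
```

Strata that appear in only one survey drop out of the inner join, which is one reason harmonizing the demographic categories during pre-processing matters.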

Keywords: health informatics, data mining, nutritional and health databases, nutritional and chronical databases

Procedia PDF Downloads 112
29043 Data Integrity: Challenges in Health Information Systems in South Africa

Authors: T. Thulare, M. Herselman, A. Botha

Abstract:

Poor system use, including inappropriate design of health information systems, causes difficulties in communication with patients and increases the time healthcare professionals spend recording the necessary health information for medical records. System features such as pop-up reminders, complex menus, and poor user interfaces can make electronic medical records far more time consuming than paper cards and can affect decision-making processes. Although errors associated with health information and their real and likely effects on quality of care and patient safety have been documented for many years, more research is needed to measure the occurrence of these errors and to determine their causes so that solutions can be implemented. The purpose of this paper is therefore to identify data integrity challenges in hospital information systems through a scoping review and, based on the results, to provide recommendations on how to manage them. Of 297 publications initially identified in the field, only 34 were found to be suitable. The results indicated that human factors and computerized systems are the most common sources of data integrity challenges, and that factors such as policy, environment, health workforce, and lack of awareness contribute to these challenges. If measures are taken, however, the data integrity challenges can be managed.

Keywords: data integrity, data integrity challenges, hospital information systems, South Africa

Procedia PDF Downloads 181
29042 Development of New Technology Evaluation Model by Using Patent Information and Customers' Review Data

Authors: Kisik Song, Kyuwoong Kim, Sungjoo Lee

Abstract:

Many global firms and corporations derive new technologies and opportunities by identifying vacant technology through patent analysis. However, previous studies failed to focus on technologies that promised continuous growth in industrial fields, and most studies that derive new technology opportunities do not test their practical effectiveness. Because previous studies depended on expert judgment, evaluating new technologies based on patent analysis was costly and time-consuming. This research therefore suggests a quantitative and systematic approach to technology evaluation indicators that uses patent data and data from customer communities. The first step involves collecting the two types of data. The data are then used to construct evaluation indicators, and these indicators are applied to the evaluation of new technologies. This type of data mining provides a new method of technology evaluation and a better predictor of how new technologies are adopted.

Keywords: data mining, evaluating new technology, technology opportunity, patent analysis

Procedia PDF Downloads 377
29041 Study and Analysis of the Factors Affecting Road Safety Using Decision Tree Algorithms

Authors: Naina Mahajan, Bikram Pal Kaur

Abstract:

The purpose of traffic accident analysis is to find the possible causes of an accident. Road accidents cannot be totally prevented, but suitable traffic engineering and management can reduce the accident rate to a certain extent. This paper discusses the classification techniques C4.5 and ID3 using the WEKA data mining tool. These techniques are applied to the NH (National Highway) dataset. The C4.5 and ID3 techniques give the best results, with high accuracy, low computation time, and a low error rate.

Keywords: C4.5, ID3, NH(National highway), WEKA data mining tool

Procedia PDF Downloads 338
29040 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change

Authors: Ermias A. Tegegn, Million Meshesha

Abstract:

Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death in most African countries. Ethiopia is one of the seriously affected countries in sub-Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become a chronic but manageable disease. The study focused on a data mining technique to predict the future living status of HIV/AIDS patients at the time of drug regimen change, when patients develop toxicity to their current ART drug combination. The data were taken from the University of Gondar Hospital ART program database. A hybrid methodology was followed to explore the application of data mining to the ART program dataset. Data cleaning, handling of missing values, and data transformation were used to preprocess the data. WEKA 3.7.9 data mining tools, classification algorithms, and domain expertise were used to address the research problem. By using four different classification algorithms (i.e., J48 classifier, PART rule induction, Naïve Bayes, and neural network) and adjusting their parameters, thirty-two models were built on the pre-processed University of Gondar ART program dataset. The performance of the models was evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model for predicting the status of HIV patients with drug regimen substitution is a pruned J48 decision tree, with a classification accuracy of 98.01%. The study identifies attributes of interest, such as Ever taking Cotrim, Ever taking TbRx, CD4 count, Age, Weight, and Gender, for predicting the status of drug regimen substitution. The outcome of this study can be used as an assistive tool to help clinicians make more appropriate drug regimen substitutions. Future research directions are proposed toward an applicable system in the area of the study.
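
For readers unfamiliar with the evaluation metrics named in this abstract, the following minimal sketch computes accuracy, precision, recall, and F-measure for one class from paired actual and predicted labels. The labels `alive` and `dead` and the sample data are hypothetical and do not reproduce the study's results.

```python
def classification_metrics(actual, predicted, positive):
    """Accuracy, precision, recall and F-measure with respect to one positive class."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)  # true positives
    fp = sum(a != positive and p == positive for a, p in pairs)  # false positives
    fn = sum(a == positive and p != positive for a, p in pairs)  # false negatives
    correct = sum(a == p for a, p in pairs)

    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical labels for six patients.
actual = ["alive", "alive", "alive", "dead", "dead", "alive"]
predicted = ["alive", "alive", "dead", "dead", "alive", "alive"]
metrics = classification_metrics(actual, predicted, positive="alive")
```

Reporting precision and recall alongside accuracy matters when one outcome is much rarer than the other, since accuracy alone can look high for a model that rarely predicts the minority class.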

Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model

Procedia PDF Downloads 142
29039 Merit Order of Indonesian Coal Mining Sources to Meet the Domestic Power Plants Demand

Authors: Victor Siahaan

Abstract:

Coal remains the most important energy source for electricity generation in some countries, where it takes the largest share of the energy mix; Indonesia is one example. Its low generation cost and abundant resources keep it the first choice for base-load power. Given this significance for electricity production, it is necessary to know the volume of coal needed to ensure that all coal power plants (CPP) in a country can operate properly. To secure this volume, this study identifies the coal mining sources in Indonesia, classifies the typical coal characteristics of each source, and determines the port of loading. Using these data, coal mining sources are then selected to feed particular CPPs based on the compatibility of the coal characteristics and the lowest transport cost.
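
The selection rule described in this abstract (compatible coal first, then lowest transport cost) can be sketched as follows. This is only an illustration under stated assumptions: compatibility is reduced to a calorific-value range, and the source names, calorific values, and costs are hypothetical rather than taken from the study.

```python
from dataclasses import dataclass


@dataclass
class CoalSource:
    name: str
    calorific_value: int    # kcal/kg, hypothetical quality measure
    transport_cost: float   # cost per tonne delivered to the plant, hypothetical


def merit_order(sources, min_cv, max_cv):
    """Sources whose coal fits the plant's accepted range, cheapest transport first."""
    compatible = [s for s in sources if min_cv <= s.calorific_value <= max_cv]
    return sorted(compatible, key=lambda s: s.transport_cost)


# Hypothetical sources for one plant that accepts 4000-4600 kcal/kg coal.
sources = [
    CoalSource("Mine A", 4200, 9.5),
    CoalSource("Mine B", 5200, 6.0),   # cheapest, but outside the accepted range
    CoalSource("Mine C", 4400, 7.25),
]
ranked = merit_order(sources, min_cv=4000, max_cv=4600)
```

A plant would draw from the first entry of the ranked list and move down it when that source cannot cover the required volume.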

Keywords: merit order, Indonesian coal mine, electricity, power plant

Procedia PDF Downloads 153