Search results for: classification tree.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1426

Search results for: classification tree.

676 Performance Analysis of Traffic Classification with Machine Learning

Authors: Htay Htay Yi, Zin May Aye

Abstract:

Network security is role of the ICT environment because malicious users are continually growing that realm of education, business, and then related with ICT. The network security contravention is typically described and examined centrally based on a security event management system. The firewalls, Intrusion Detection System (IDS), and Intrusion Prevention System are becoming essential to monitor or prevent of potential violations, incidents attack, and imminent threats. In this system, the firewall rules are set only for where the system policies are needed. Dataset deployed in this system are derived from the testbed environment. The traffic as in DoS and PortScan traffics are applied in the testbed with firewall and IDS implementation. The network traffics are classified as normal or attacks in the existing testbed environment based on six machine learning classification methods applied in the system. It is required to be tested to get datasets and applied for DoS and PortScan. The dataset is based on CICIDS2017 and some features have been added. This system tested 26 features from the applied dataset. The system is to reduce false positive rates and to improve accuracy in the implemented testbed design. The system also proves good performance by selecting important features and comparing existing a dataset by machine learning classifiers.

Keywords: False negative rate, intrusion detection system, machine learning methods, performance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1070
675 Mining Network Data for Intrusion Detection through Naïve Bayesian with Clustering

Authors: Dewan Md. Farid, Nouria Harbi, Suman Ahmmed, Md. Zahidur Rahman, Chowdhury Mofizur Rahman

Abstract:

Network security attacks are the violation of information security policy that received much attention to the computational intelligence society in the last decades. Data mining has become a very useful technique for detecting network intrusions by extracting useful knowledge from large number of network data or logs. Naïve Bayesian classifier is one of the most popular data mining algorithm for classification, which provides an optimal way to predict the class of an unknown example. It has been tested that one set of probability derived from data is not good enough to have good classification rate. In this paper, we proposed a new learning algorithm for mining network logs to detect network intrusions through naïve Bayesian classifier, which first clusters the network logs into several groups based on similarity of logs, and then calculates the prior and conditional probabilities for each group of logs. For classifying a new log, the algorithm checks in which cluster the log belongs and then use that cluster-s probability set to classify the new log. We tested the performance of our proposed algorithm by employing KDD99 benchmark network intrusion detection dataset, and the experimental results proved that it improves detection rates as well as reduces false positives for different types of network intrusions.

Keywords: Clustering, detection rate, false positive, naïveBayesian classifier, network intrusion detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5536
674 Clustering Categorical Data Using Hierarchies (CLUCDUH)

Authors: Gökhan Silahtaroğlu

Abstract:

Clustering large populations is an important problem when the data contain noise and different shapes. A good clustering algorithm or approach should be efficient enough to detect clusters sensitively. Besides space complexity, time complexity also gains importance as the size grows. Using hierarchies we developed a new algorithm to split attributes according to the values they have and choosing the dimension for splitting so as to divide the database roughly into equal parts as much as possible. At each node we calculate some certain descriptive statistical features of the data which reside and by pruning we generate the natural clusters with a complexity of O(n).

Keywords: Clustering, tree, split, pruning, entropy, gini.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1556
673 Evaluating Machine Learning Techniques for Activity Classification in Smart Home Environments

Authors: Talal Alshammari, Nasser Alshammari, Mohamed Sedky, Chris Howard

Abstract:

With the widespread adoption of the Internet-connected devices, and with the prevalence of the Internet of Things (IoT) applications, there is an increased interest in machine learning techniques that can provide useful and interesting services in the smart home domain. The areas that machine learning techniques can help advance are varied and ever-evolving. Classifying smart home inhabitants’ Activities of Daily Living (ADLs), is one prominent example. The ability of machine learning technique to find meaningful spatio-temporal relations of high-dimensional data is an important requirement as well. This paper presents a comparative evaluation of state-of-the-art machine learning techniques to classify ADLs in the smart home domain. Forty-two synthetic datasets and two real-world datasets with multiple inhabitants are used to evaluate and compare the performance of the identified machine learning techniques. Our results show significant performance differences between the evaluated techniques. Such as AdaBoost, Cortical Learning Algorithm (CLA), Decision Trees, Hidden Markov Model (HMM), Multi-layer Perceptron (MLP), Structured Perceptron and Support Vector Machines (SVM). Overall, neural network based techniques have shown superiority over the other tested techniques.

Keywords: Activities of daily living, classification, internet of things, machine learning, smart home.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1770
672 Performance Optimization of Data Mining Application Using Radial Basis Function Classifier

Authors: M. Govindarajan, R. M.Chandrasekaran

Abstract:

Text data mining is a process of exploratory data analysis. Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. This paper describes proposed radial basis function Classifier that performs comparative crossvalidation for existing radial basis function Classifier. The feasibility and the benefits of the proposed approach are demonstrated by means of data mining problem: direct Marketing. Direct marketing has become an important application field of data mining. Comparative Cross-validation involves estimation of accuracy by either stratified k-fold cross-validation or equivalent repeated random subsampling. While the proposed method may have high bias; its performance (accuracy estimation in our case) may be poor due to high variance. Thus the accuracy with proposed radial basis function Classifier was less than with the existing radial basis function Classifier. However there is smaller the improvement in runtime and larger improvement in precision and recall. In the proposed method Classification accuracy and prediction accuracy are determined where the prediction accuracy is comparatively high.

Keywords: Text Data Mining, Comparative Cross-validation, Radial Basis Function, runtime, accuracy.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1554
671 Integration of Image and Patient Data, Software and International Coding Systems for Use in a Mammography Research Project

Authors: V. Balanica, W. I. D. Rae, M. Caramihai, S. Acho, C. P. Herbst

Abstract:

Mammographic images and data analysis to facilitate modelling or computer aided diagnostic (CAD) software development should best be done using a common database that can handle various mammographic image file formats and relate these to other patient information. This would optimize the use of the data as both primary reporting and enhanced information extraction of research data could be performed from the single dataset. One desired improvement is the integration of DICOM file header information into the database, as an efficient and reliable source of supplementary patient information intrinsically available in the images. The purpose of this paper was to design a suitable database to link and integrate different types of image files and gather common information that can be further used for research purposes. An interface was developed for accessing, adding, updating, modifying and extracting data from the common database, enhancing the future possible application of the data in CAD processing. Technically, future developments envisaged include the creation of an advanced search function to selects image files based on descriptor combinations. Results can be further used for specific CAD processing and other research. Design of a user friendly configuration utility for importing of the required fields from the DICOM files must be done.

Keywords: Database Integration, Mammogram Classification, Tumour Classification, Computer Aided Diagnosis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1945
670 Tool for Fast Detection of Java Code Snippets

Authors: Tomáš Bublík, Miroslav Virius

Abstract:

This paper presents general results on the Java source code snippet detection problem. We propose the tool which uses graph and subgraph isomorphism detection. A number of solutions for all of these tasks have been proposed in the literature. However, although that all these solutions are really fast, they compare just the constant static trees. Our solution offers to enter an input sample dynamically with the Scripthon language while preserving an acceptable speed. We used several optimizations to achieve very low number of comparisons during the matching algorithm.

Keywords: AST, Java, tree matching, Scripthon, source code recognition

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1958
669 An Energy Efficient Algorithm for Distributed Mutual Exclusion in Mobile Ad-hoc Networks

Authors: Sayani Sil, Sukanta Das

Abstract:

This paper reports a distributed mutual exclusion algorithm for mobile Ad-hoc networks. The network is clustered hierarchically. The proposed algorithm considers the clustered network as a logical tree and develops a token passing scheme to get the mutual exclusion. The performance analysis and simulation results show that its message requirement is optimal, and thus the algorithm is energy efficient.

Keywords: Critical section, Distributed mutual exclusion, MobileAd-hoc network, Token-based algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1751
668 Novel Hybrid Method for Gene Selection and Cancer Prediction

Authors: Liping Jing, Michael K. Ng, Tieyong Zeng

Abstract:

Microarray data profiles gene expression on a whole genome scale, therefore, it provides a good way to study associations between gene expression and occurrence or progression of cancer. More and more researchers realized that microarray data is helpful to predict cancer sample. However, the high dimension of gene expressions is much larger than the sample size, which makes this task very difficult. Therefore, how to identify the significant genes causing cancer becomes emergency and also a hot and hard research topic. Many feature selection algorithms have been proposed in the past focusing on improving cancer predictive accuracy at the expense of ignoring the correlations between the features. In this work, a novel framework (named by SGS) is presented for stable gene selection and efficient cancer prediction . The proposed framework first performs clustering algorithm to find the gene groups where genes in each group have higher correlation coefficient, and then selects the significant genes in each group with Bayesian Lasso and important gene groups with group Lasso, and finally builds prediction model based on the shrinkage gene space with efficient classification algorithm (such as, SVM, 1NN, Regression and etc.). Experiment results on real world data show that the proposed framework often outperforms the existing feature selection and prediction methods, say SAM, IG and Lasso-type prediction model.

Keywords: Gene Selection, Cancer Prediction, Lasso, Clustering, Classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2044
667 N-Sun Decomposition of Complete, Complete Bipartite and Some Harary Graphs

Authors: R. Anitha, R. S. Lekshmi

Abstract:

Graph decompositions are vital in the study of combinatorial design theory. A decomposition of a graph G is a partition of its edge set. An n-sun graph is a cycle Cn with an edge terminating in a vertex of degree one attached to each vertex. In this paper, we define n-sun decomposition of some even order graphs with a perfect matching. We have proved that the complete graph K2n, complete bipartite graph K2n, 2n and the Harary graph H4, 2n have n-sun decompositions. A labeling scheme is used to construct the n-suns.

Keywords: Decomposition, Hamilton cycle, n-sun graph, perfect matching, spanning tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2396
666 Myanmar Character Recognition Using Eight Direction Chain Code Frequency Features

Authors: Kyi Pyar Zaw, Zin Mar Kyu

Abstract:

Character recognition is the process of converting a text image file into editable and searchable text file. Feature Extraction is the heart of any character recognition system. The character recognition rate may be low or high depending on the extracted features. In the proposed paper, 25 features for one character are used in character recognition. Basically, there are three steps of character recognition such as character segmentation, feature extraction and classification. In segmentation step, horizontal cropping method is used for line segmentation and vertical cropping method is used for character segmentation. In the Feature extraction step, features are extracted in two ways. The first way is that the 8 features are extracted from the entire input character using eight direction chain code frequency extraction. The second way is that the input character is divided into 16 blocks. For each block, although 8 feature values are obtained through eight-direction chain code frequency extraction method, we define the sum of these 8 feature values as a feature for one block. Therefore, 16 features are extracted from that 16 blocks in the second way. We use the number of holes feature to cluster the similar characters. We can recognize the almost Myanmar common characters with various font sizes by using these features. All these 25 features are used in both training part and testing part. In the classification step, the characters are classified by matching the all features of input character with already trained features of characters.

Keywords: Chain code frequency, character recognition, feature extraction, features matching, segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 753
665 Efficient Dimensionality Reduction of Directional Overcurrent Relays Optimal Coordination Problem

Authors: Fouad Salha , X. Guillaud

Abstract:

Directional over current relays (DOCR) are commonly used in power system protection as a primary protection in distribution and sub-transmission electrical systems and as a secondary protection in transmission systems. Coordination of protective relays is necessary to obtain selective tripping. In this paper, an approach for efficiency reduction of DOCRs nonlinear optimum coordination (OC) is proposed. This was achieved by modifying the objective function and relaxing several constraints depending on the four constraints classification, non-valid, redundant, pre-obtained and valid constraints. According to this classification, the far end fault effect on the objective function and constraints, and in consequently on relay operating time, was studied. The study was carried out, firstly by taking into account the near-end and far-end faults in DOCRs coordination problem formulation; and then faults very close to the primary relays (nearend faults). The optimal coordination (OC) was achieved by simultaneously optimizing all variables (TDS and Ip) in nonlinear environment by using of Genetic algorithm nonlinear programming techniques. The results application of the above two approaches on 6-bus and 26-bus system verify that the far-end faults consideration on OC problem formulation don-t lose the optimality.

Keywords: Backup/Primary relay, Coordination time interval (CTI), directional over current relays, Genetic algorithm, time dial setting (TDS), pickup current setting (Ip), nonlinear programming.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1584
664 Preliminary Evaluation of Decommissioning Wastes for the First Commercial Nuclear Power Reactor in South Korea

Authors: Kyomin Lee, Joohee Kim, Sangho Kang

Abstract:

The commercial nuclear power reactor in South Korea, Kori Unit 1, which was a 587 MWe pressurized water reactor that started operation since 1978, was permanently shut down in June 2017 without an additional operating license extension. The Kori 1 Unit is scheduled to become the nuclear power unit to enter the decommissioning phase. In this study, the preliminary evaluation of the decommissioning wastes for the Kori Unit 1 was performed based on the following series of process: firstly, the plant inventory is investigated based on various documents (i.e., equipment/ component list, construction records, general arrangement drawings). Secondly, the radiological conditions of systems, structures and components (SSCs) are established to estimate the amount of radioactive waste by waste classification. Third, the waste management strategies for Kori Unit 1 including waste packaging are established. Forth, selection of the proper decontamination and dismantling (D&D) technologies is made considering the various factors. Finally, the amount of decommissioning waste by classification for Kori 1 is estimated using the DeCAT program, which was developed by KEPCO-E&C for a decommissioning cost estimation. The preliminary evaluation results have shown that the expected amounts of decommissioning wastes were less than about 2% and 8% of the total wastes generated (i.e., sum of clean wastes and radwastes) before/after waste processing, respectively, and it was found that the majority of contaminated material was carbon or alloy steel and stainless steel. In addition, within the range of availability of information, the results of the evaluation were compared with the results from the various decommissioning experiences data or international/national decommissioning study. The comparison results have shown that the radioactive waste amount from Kori Unit 1 decommissioning were much less than those from the plants decommissioned in U.S. and were comparable to those from the plants in Europe. This result comes from the difference of disposal cost and clearance criteria (i.e., free release level) between U.S. and non-U.S. The preliminary evaluation performed using the methodology established in this study will be useful as a important information in establishing the decommissioning planning for the decommissioning schedule and waste management strategy establishment including the transportation, packaging, handling, and disposal of radioactive wastes.

Keywords: Characterization, classification, decommissioning, decontamination and dismantling, Kori 1, radioactive waste.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1481
663 Approximately Similarity Measurement of Web Sites Using Genetic Algorithms and Binary Trees

Authors: Doru Anastasiu Popescu, Dan Rădulescu

Abstract:

In this paper, we determine the similarity of two HTML web applications. We are going to use a genetic algorithm in order to determine the most significant web pages of each application (we are not going to use every web page of a site). Using these significant web pages, we will find the similarity value between the two applications. The algorithm is going to be efficient because we are going to use a reduced number of web pages for comparisons but it will return an approximate value of the similarity. The binary trees are used to keep the tags from the significant pages. The algorithm was implemented in Java language.

Keywords: Tag, HTML, web page, genetic algorithm, similarity value, binary tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1309
662 Hacking's 'Between Goffman and Foucault': A Theoretical Frame for Criminology

Authors: Tomás Speziale

Abstract:

This paper aims to analyse how Ian Hacking states the theoretical basis of his research on the classification of people. Although all his early philosophical education had been based in Foucault, it is also true that Erving Goffman’s perspective provided him with epistemological and methodological tools for understanding face-to-face relationships. Hence, all his works must be thought of as social science texts that combine the research on how the individuals are constituted ‘top-down’ (as in Foucault), with the inquiry into how people renegotiate ‘bottom-up’ the classifications about them. Thus, Hacking´s proposal constitutes a middle ground between the French Philosopher and the American Sociologist. Placing himself between both authors allows Hacking to build a frame that is expected to adjust to Social Sciences’ main particularity: the fact that they study interactive kinds. These are kinds of people, which imply that those who are classified can change in certain ways that prompt the need for changing previous classifications themselves. It is all about the interaction between the labelling of people and the people who are classified. Consequently, understanding the way in which Hacking uses Foucault’s and Goffman’s theories is essential to fully comprehend the social dynamic between individuals and concepts, what Bert Hansen had called dialectical realism. His theoretical proposal, therefore, is not only valuable because it combines diverse perspectives, but also because it constitutes an utterly original and relevant framework for Sociological theory and particularly for Criminology.

Keywords: Classification of people, Foucault`s archaeology, Goffman`s interpersonal sociology, interactive kinds.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2049
661 A Psychophysiological Evaluation of an Effective Recognition Technique Using Interactive Dynamic Virtual Environments

Authors: Mohammadhossein Moghimi, Robert Stone, Pia Rotshtein

Abstract:

Recording psychological and physiological correlates of human performance within virtual environments and interpreting their impacts on human engagement, ‘immersion’ and related emotional or ‘effective’ states is both academically and technologically challenging. By exposing participants to an effective, real-time (game-like) virtual environment, designed and evaluated in an earlier study, a psychophysiological database containing the EEG, GSR and Heart Rate of 30 male and female gamers, exposed to 10 games, was constructed. Some 174 features were subsequently identified and extracted from a number of windows, with 28 different timing lengths (e.g. 2, 3, 5, etc. seconds). After reducing the number of features to 30, using a feature selection technique, K-Nearest Neighbour (KNN) and Support Vector Machine (SVM) methods were subsequently employed for the classification process. The classifiers categorised the psychophysiological database into four effective clusters (defined based on a 3-dimensional space – valence, arousal and dominance) and eight emotion labels (relaxed, content, happy, excited, angry, afraid, sad, and bored). The KNN and SVM classifiers achieved average cross-validation accuracies of 97.01% (±1.3%) and 92.84% (±3.67%), respectively. However, no significant differences were found in the classification process based on effective clusters or emotion labels.

Keywords: Virtual Reality, effective computing, effective VR, emotion-based effective physiological database.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 994
660 Movie Genre Preference Prediction Using Machine Learning for Customer-Based Information

Authors: Haifeng Wang, Haili Zhang

Abstract:

Most movie recommendation systems have been developed for customers to find items of interest. This work introduces a predictive model usable by small and medium-sized enterprises (SMEs) who are in need of a data-based and analytical approach to stock proper movies for local audiences and retain more customers. We used classification models to extract features from thousands of customers’ demographic, behavioral and social information to predict their movie genre preference. In the implementation, a Gaussian kernel support vector machine (SVM) classification model and a logistic regression model were established to extract features from sample data and their test error-in-sample were compared. Comparison of error-out-sample was also made under different Vapnik–Chervonenkis (VC) dimensions in the machine learning algorithm to find and prevent overfitting. Gaussian kernel SVM prediction model can correctly predict movie genre preferences in 85% of positive cases. The accuracy of the algorithm increased to 93% with a smaller VC dimension and less overfitting. These findings advance our understanding of how to use machine learning approach to predict customers’ preferences with a small data set and design prediction tools for these enterprises.

Keywords: Computational social science, movie preference, machine learning, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1649
659 Technologic Information about Photovoltaic Applied in Urban Residences

Authors: Stephanie Fabris Russo, Daiane Costa Guimarães, Jonas Pedro Fabris, Maria Emilia Camargo, Suzana Leitão Russo, José Augusto Andrade Filho

Abstract:

Among renewable energy sources, solar energy is the one that has stood out. Solar radiation can be used as a thermal energy source and can also be converted into electricity by means of effects on certain materials, such as thermoelectric and photovoltaic panels. These panels are often used to generate energy in homes, buildings, arenas, etc., and have low pollution emissions. Thus, a technological prospecting was performed to find patents related to the use of photovoltaic plates in urban residences. The patent search was based on ESPACENET, associating the keywords photovoltaic and home, where we found 136 patent documents in the period of 1994-2015 in the fields title and abstract. Note that the years 2009, 2010, 2011, 2012, 2013 and 2014 had the highest number of applicants, with respectively, 11, 13, 23, 29, 15 and 21. Regarding the country that deposited about this technology, it is clear that China leads with 67 patent deposits, followed by Japan with 38 patents applications. It is important to note that most depositors, 50% are companies, 44% are individual inventors and only 6% are universities. On the International Patent classification (IPC) codes, we noted that the most present classification in results was H02J3/38, which represents provisions in parallel to feed a single network by two or more generators, converters or transformers. Among all categories, there is the H session, which means Electricity, with 70% of the patents.

Keywords: Prospecting, technology forecasting, photovoltaic, urban residences.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1142
658 Identifying Quality Islamic Content in Community Question Answering Sites

Authors: Rabia Bibi, Muhammad Shahzad Faisal, Khalid Iqbal, Atif Inayat

Abstract:

Internet is growing rapidly and new community-based content is added by people every second. With this fast-growing community-based content, if a user requires answers of particular questions, then reviews are required from experts or community. However, it is difficult to get quality answers. The Muslim community all over the world is seeking help to get their questions and issues discussed to get answers. Online web portals of religious schools and community-based question answering sites are two big platforms to solve the issues of users. In the case of religious schools, there are experts and qualified religious scholars (mufti) who can give the expert opinion. However, the quality of community-based content cannot be guaranteed as it may not be an answer that satisfies the question of a user. Users on CQA sites may include spammers or individual criticizing the questioner instead of providing useful answers. In this paper, we research strategies to naturally distinguish the right content. As an experiment, we concentrate on Yahoo! Answers, and Quora, popular online QA sites, where questions are asked, answered, edited, and organized by a large community of users. We present the classification of data to categorize both relevant and irrelevant answers. Specifically, we demonstrate that the proposed framework can isolate quality answers from the rest with an exactness near that of people.

Keywords: Community-based question and answering, evaluation and prediction of quality answer, answer classification, Islamic content, answer ranking.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 80
657 Customer Need Type Classification Model using Data Mining Techniques for Recommender Systems

Authors: Kyoung-jae Kim

Abstract:

Recommender systems are usually regarded as an important marketing tool in the e-commerce. They use important information about users to facilitate accurate recommendation. The information includes user context such as location, time and interest for personalization of mobile users. We can easily collect information about location and time because mobile devices communicate with the base station of the service provider. However, information about user interest can-t be easily collected because user interest can not be captured automatically without user-s approval process. User interest usually represented as a need. In this study, we classify needs into two types according to prior research. This study investigates the usefulness of data mining techniques for classifying user need type for recommendation systems. We employ several data mining techniques including artificial neural networks, decision trees, case-based reasoning, and multivariate discriminant analysis. Experimental results show that CHAID algorithm outperforms other models for classifying user need type. This study performs McNemar test to examine the statistical significance of the differences of classification results. The results of McNemar test also show that CHAID performs better than the other models with statistical significance.

Keywords: Customer need type, Data mining techniques, Recommender system, Personalization, Mobile user.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2146
656 Fetal and Infant Mortality in Botucatu City, São Paulo State, Brazil: Evaluation of Maternal - Infant Health Care

Authors: Noda L. M., Salvador I. C, C. M. L. G. Parada, Fonseca C. R. B.

Abstract:

In Brazil, neonatal mortality rate is considered incompatible with the country development conditions, and has been a Public Health concern. Reduction in infant mortality rates has also been part of the Millennium Development Goals, a commitment made by countries, members of the Organization of United Nations (OUN), including Brazil. Fetal mortality rate is considered a highly sensitive indicator of health care quality. Suitable actions, such as good quality and access to health services may contribute positively towards reduction in these fetal and neonatal rates. With appropriate antenatal follow-up and health care during gestation and delivery, some death causes could be reduced or even prevented by means of early diagnosis and intervention, as well as changes in risk factors and interventions. Objectives: To study the quality of maternal and infant health care based on fetal and neonatal mortality, as well as the possible actions to prevent those deaths in Botucatu (Brazil). Methods: Classification of prevention according to the International Classification of Diseases and the modified Wigglesworth´s classification. In order to evaluate adequacy, indicators of quality of antenatal and delivery care were established by the authors. Results: Considering fetal deaths, 56.7% of them occurred before delivery, which reveals possible shortcomings in antenatal care, and 38.2% of them were a result of intra- labor changes, which could be prevented or reduced by adequate obstetric management. These findings were different from those in the group of early neonatal deaths which were also studied. Adequacy of health services showed that antenatal and childbirth care was appropriate for 24% and 33.3% of pregnant women, respectively, which corroborates the results of prevention. These results revealed that shortcomings in obstetric and antenatal care could be the causes of deaths in the study. Early and late neonatal deaths have similar characteristics: 76% could be prevented or reduced mainly by adequate newborn care (52.9%) and adequate health care for gestational women (11.7%). When adequacy of care was evaluated, childbirth and newborn care was adequate in 25.8% and antenatal care was adequate in 16.1%. In conclusion, direct relationship was found between adequacy and quality of care rendered to pregnant women and newborns, and fetal and infant mortality. Moreover, our findings highlight that deaths could be prevented by an adequate obstetric and neonatal management.

Keywords: Fetal Mortality, Infant Mortality, Maternal-Child Health Services, Program Evaluation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 5069
655 A Methodology for Definition of Road Networks in Rural Areas of Nepal

Authors: J. K. Shrestha, A. Benta, R. B. Lopes, N. Lopes

Abstract:

This work provides a practical method for the development of rural road networks in rural areas of developing countries. The proposed methodology enables to determine obligatory points in the rural road network maximizing the number of settlements that have access to basic services within a given maximum distance. The proposed methodology is simple and practical, hence, highly applicable to real-world scenarios, as demonstrated in the definition of the road network for the rural areas of Nepal.

Keywords: Minimum spanning tree, nodal points, rural road network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2880
654 A Frame Work for the Development of a Suitable Method to Find Shoot Length at Maturity of Mustard Plant Using Soft Computing Model

Authors: Satyendra Nath Mandal, J. Pal Choudhury, Dilip De, S. R. Bhadra Chaudhuri

Abstract:

The production of a plant can be measured in terms of seeds. The generation of seeds plays a critical role in our social and daily life. The fruit production which generates seeds, depends on the various parameters of the plant, such as shoot length, leaf number, root length, root number, etc When the plant is growing, some leaves may be lost and some new leaves may appear. It is very difficult to use the number of leaves of the tree to calculate the growth of the plant.. It is also cumbersome to measure the number of roots and length of growth of root in several time instances continuously after certain initial period of time, because roots grow deeper and deeper under ground in course of time. On the contrary, the shoot length of the tree grows in course of time which can be measured in different time instances. So the growth of the plant can be measured using the data of shoot length which are measured at different time instances after plantation. The environmental parameters like temperature, rain fall, humidity and pollution are also play some role in production of yield. The soil, crop and distance management are taken care to produce maximum amount of yields of plant. The data of the growth of shoot length of some mustard plant at the initial stage (7,14,21 & 28 days after plantation) is available from the statistical survey by a group of scientists under the supervision of Prof. Dilip De. In this paper, initial shoot length of Ken( one type of mustard plant) has been used as an initial data. The statistical models, the methods of fuzzy logic and neural network have been tested on this mustard plant and based on error analysis (calculation of average error) that model with minimum error has been selected and can be used for the assessment of shoot length at maturity. Finally, all these methods have been tested with other type of mustard plants and the particular soft computing model with the minimum error of all types has been selected for calculating the predicted data of growth of shoot length. The shoot length at the stage of maturity of all types of mustard plants has been calculated using the statistical method on the predicted data of shoot length.

Keywords: Fuzzy time series, neural network, forecasting error, average error.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1591
653 Diagnosis of the Heart Rhythm Disorders by Using Hybrid Classifiers

Authors: Sule Yucelbas, Gulay Tezel, Cuneyt Yucelbas, Seral Ozsen

Abstract:

In this study, it was tried to identify some heart rhythm disorders by electrocardiography (ECG) data that is taken from MIT-BIH arrhythmia database by subtracting the required features, presenting to artificial neural networks (ANN), artificial immune systems (AIS), artificial neural network based on artificial immune system (AIS-ANN) and particle swarm optimization based artificial neural network (PSO-NN) classifier systems. The main purpose of this study is to evaluate the performance of hybrid AIS-ANN and PSO-ANN classifiers with regard to the ANN and AIS. For this purpose, the normal sinus rhythm (NSR), atrial premature contraction (APC), sinus arrhythmia (SA), ventricular trigeminy (VTI), ventricular tachycardia (VTK) and atrial fibrillation (AF) data for each of the RR intervals were found. Then these data in the form of pairs (NSR-APC, NSR-SA, NSR-VTI, NSR-VTK and NSR-AF) is created by combining discrete wavelet transform which is applied to each of these two groups of data and two different data sets with 9 and 27 features were obtained from each of them after data reduction. Afterwards, the data randomly was firstly mixed within themselves, and then 4-fold cross validation method was applied to create the training and testing data. The training and testing accuracy rates and training time are compared with each other.

As a result, performances of the hybrid classification systems, AIS-ANN and PSO-ANN were seen to be close to the performance of the ANN system. Also, the results of the hybrid systems were much better than AIS, too. However, ANN had much shorter period of training time than other systems. In terms of training times, ANN was followed by PSO-ANN, AIS-ANN and AIS systems respectively. Also, the features that extracted from the data affected the classification results significantly.

Keywords: AIS, ANN, ECG, hybrid classifiers, PSO.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1916
652 Human Digital Twin for Personal Conversation Automation Using Supervised Machine Learning Approaches

Authors: Aya Salama

Abstract:

Digital Twin has emerged as a compelling research area, capturing the attention of scholars over the past decade. It finds applications across diverse fields, including smart manufacturing and healthcare, offering significant time and cost savings. Notably, it often intersects with other cutting-edge technologies such as Data Mining, Artificial Intelligence, and Machine Learning. However, the concept of a Human Digital Twin (HDT) is still in its infancy and requires further demonstration of its practicality. HDT takes the notion of Digital Twin a step further by extending it to living entities, notably humans, who are vastly different from inanimate physical objects. The primary objective of this research was to create an HDT capable of automating real-time human responses by simulating human behavior. To achieve this, the study delved into various areas, including clustering, supervised classification, topic extraction, and sentiment analysis. The paper successfully demonstrated the feasibility of HDT for generating personalized responses in social messaging applications. Notably, the proposed approach achieved an overall accuracy of 63%, a highly promising result that could pave the way for further exploration of the HDT concept. The methodology employed Random Forest for clustering the question database and matching new questions, while K-nearest neighbor was utilized for sentiment analysis.

Keywords: Human Digital twin, sentiment analysis, topic extraction, supervised machine learning, unsupervised machine learning, classification and clustering.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 188
651 Path Planning of a Robot Manipulator using Retrieval RRT Strategy

Authors: K. Oh, J. P. Hwang, E. Kim, H. Lee

Abstract:

This paper presents an algorithm which extends the rapidly-exploring random tree (RRT) framework to deal with change of the task environments. This algorithm called the Retrieval RRT Strategy (RRS) combines a support vector machine (SVM) and RRT and plans the robot motion in the presence of the change of the surrounding environment. This algorithm consists of two levels. At the first level, the SVM is built and selects a proper path from the bank of RRTs for a given environment. At the second level, a real path is planned by the RRT planners for the given environment. The suggested method is applied to the control of KUKA™,, a commercial 6 DOF robot manipulator, and its feasibility and efficiency are demonstrated via the cosimulatation of MatLab™, and RecurDyn™,.

Keywords: Path planning, RRT, 6 DOF manipulator, SVM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2531
650 AI-Based Techniques for Online Social Media Network Sentiment Analysis: A Methodical Review

Authors: A. M. John-Otumu, M. M. Rahman, O. C. Nwokonkwo, M. C. Onuoha

Abstract:

Online social media networks have long served as a primary arena for group conversations, gossip, text-based information sharing and distribution. The use of natural language processing techniques for text classification and unbiased decision making has not been far-fetched. Proper classification of these textual information in a given context has also been very difficult. As a result, a systematic review was conducted from previous literature on sentiment classification and AI-based techniques. The study was done in order to gain a better understanding of the process of designing and developing a robust and more accurate sentiment classifier that could correctly classify social media textual information of a given context between hate speech and inverted compliments with a high level of accuracy using the knowledge gain from the evaluation of different artificial intelligence techniques reviewed. The study evaluated over 250 articles from digital sources like ACM digital library, Google Scholar, and IEEE Xplore; and whittled down the number of research to 52 articles. Findings revealed that deep learning approaches such as Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), Bidirectional Encoder Representations from Transformer (BERT), and Long Short-Term Memory (LSTM) outperformed various machine learning techniques in terms of performance accuracy. A large dataset is also required to develop a robust sentiment classifier. Results also revealed that data can be obtained from places like Twitter, movie reviews, Kaggle, Stanford Sentiment Treebank (SST), and SemEval Task4 based on the required domain. The hybrid deep learning techniques like CNN+LSTM, CNN+ Gated Recurrent Unit (GRU), CNN+BERT outperformed single deep learning techniques and machine learning techniques. Python programming language outperformed Java programming language in terms of development simplicity and AI-based library functionalities. Finally, the study recommended the findings obtained for building robust sentiment classifier in the future.

Keywords: Artificial Intelligence, Natural Language Processing, Sentiment Analysis, Social Network, Text.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 594
649 Management of Meskit (Prosopis juliflora) Tree in Oman: The Case of Using Meskit (Prosopis juliflora) Pods for Feeding Omani Sheep

Authors: S. Al-Khalasi, O. Mahgoub, H. Yaakub

Abstract:

This study evaluated the use of raw or processed Prosopis juliflora (Meskit) pods as a major ingredient in a formulated ration to provide an alternative non-conventional concentrate for livestock feeding in Oman. Dry Meskit pods were reduced to lengths of 0.5- 1.0 cm to ensure thorough mixing into three diets. Meskit pods were subjected to two types of treatments; roasting and soaking. They were roasted at 150оC for 30 minutes using a locally-made roasting device (40 kg barrel container rotated by electric motor and heated by flame gas cooker). Chopped pods were soaked in tap water for 24 hours and dried for 2 days under the sun with frequent turning. The Meskit-pod-based diets (MPBD) were formulated and pelleted from 500 g/kg ground Meskit pods, 240 g/kg wheat bran, 200 g/kg barley grain, 50 g/kg local dried sardines and 10 g/kg of salt. Twenty four 10 months-old intact Omani male lambs with average body weight of 27.3 kg (± 0.5 kg) were used in a feeding trial for 84 days. They were divided (on body weight basis) and allocated to four diet combination groups. These were: Rhodes grass hay (RGH) plus a general ruminant concentrate (GRC); RGH plus raw Meskit pods (RMP) based concentrate; RGH plus roasted Meskit pods (ROMP) based concentrate; RGH plus soaked Meskit pods (SMP) based concentrate Daily feed intakes and bi-weekly body weights were recorded. MPBD had higher contents of crude protein (CP), acid detergent fibre (ADF) and neutral detergent fibre (NDF) than the GRC. Animals fed various types of MPBD did not show signs of ill health. There was a significant effect of feeding ROMP on the performance of Omani sheep compared to RMP and SMP. The ROMP fed animals had similar performance to those fed the GRC in terms of feed intake, body weight gain and feed conversion ratio (FCR).This study indicated that roasted Meskit pods based diet may be used instead of the commercial concentrate for feeding Omani sheep without adverse effects on performance. It offers a cheap alternative source of protein and energy for feeding Omani sheep. Also, it might help in solving the spread impact of Meskit trees, maintain the ecosystem and helping in preserving the local tree species.

Keywords: Growth, Meskit, Omani sheep, Prosopis juliflora.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2776
648 Cheiloscopy and Dactylography in Relation to ABO Blood Groups: Egyptian vs. Malay Populations

Authors: Manal Hassan Abdel Aziz, Fatma Mohamed Magdy Badr El Dine, Nourhan Mohamed Mohamed Saeed

Abstract:

Establishing association between lip print patterns and those of fingerprints as well as blood groups is of fundamental importance in the forensic identification domain. The first aim of the current study was to determine the prevalent types of ABO blood groups, lip prints and fingerprints patterns in both studied populations. Secondly, to analyze any relation found between the different print patterns and the blood groups, which would be valuable in identification purposes. The present study was conducted on 60 healthy volunteers, (30 males and 30 females) from each of the studied population. Lip prints and fingerprints were obtained and classified according to Tsuchihashi's classification and Michael Kuchen’s classification, respectively. The results show that the ulnar loop was the most frequent among both populations. Blood group A was the most frequent among Egyptians, while blood groups O and B were the predominant among Malaysians. Significant relations were observed between lip print patterns and fingerprint (in the second quadrant for Egyptian males and the first one for Malaysian). For Malaysian females, a statistically significant association was proved in the fourth quadrant. Regarding the blood groups, 89.5% of ulnar loops were significantly related to blood group A among Egyptian males. The results proved an association between the fingerprint pattern and the lip prints, as well as between the ABO blood group and the pattern of fingerprints. However, further researches with larger sample sizes need to be directed to approve the current results.

Keywords: ABO, cheiloscopy, dactylography, Egyptians, Malaysians.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 884
647 An Automatic Bayesian Classification System for File Format Selection

Authors: Roman Graf, Sergiu Gordea, Heather M. Ryan

Abstract:

This paper presents an approach for the classification of an unstructured format description for identification of file formats. The main contribution of this work is the employment of data mining techniques to support file format selection with just the unstructured text description that comprises the most important format features for a particular organisation. Subsequently, the file format indentification method employs file format classifier and associated configurations to support digital preservation experts with an estimation of required file format. Our goal is to make use of a format specification knowledge base aggregated from a different Web sources in order to select file format for a particular institution. Using the naive Bayes method, the decision support system recommends to an expert, the file format for his institution. The proposed methods facilitate the selection of file format and the quality of a digital preservation process. The presented approach is meant to facilitate decision making for the preservation of digital content in libraries and archives using domain expert knowledge and specifications of file formats. To facilitate decision-making, the aggregated information about the file formats is presented as a file format vocabulary that comprises most common terms that are characteristic for all researched formats. The goal is to suggest a particular file format based on this vocabulary for analysis by an expert. The sample file format calculation and the calculation results including probabilities are presented in the evaluation section.

Keywords: Data mining, digital libraries, digital preservation, file format.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1660