Search results for: random features
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5521

Search results for: random features

5341 Remaining Useful Life Estimation of Bearings Based on Nonlinear Dimensional Reduction Combined with Timing Signals

Authors: Zhongmin Wang, Wudong Fan, Hengshan Zhang, Yimin Zhou

Abstract:

In data-driven prognostic methods, the prediction accuracy of the estimation for remaining useful life of bearings mainly depends on the performance of health indicators, which are usually fused some statistical features extracted from vibrating signals. However, the existing health indicators have the following two drawbacks: (1) The differnet ranges of the statistical features have the different contributions to construct the health indicators, the expert knowledge is required to extract the features. (2) When convolutional neural networks are utilized to tackle time-frequency features of signals, the time-series of signals are not considered. To overcome these drawbacks, in this study, the method combining convolutional neural network with gated recurrent unit is proposed to extract the time-frequency image features. The extracted features are utilized to construct health indicator and predict remaining useful life of bearings. First, original signals are converted into time-frequency images by using continuous wavelet transform so as to form the original feature sets. Second, with convolutional and pooling layers of convolutional neural networks, the most sensitive features of time-frequency images are selected from the original feature sets. Finally, these selected features are fed into the gated recurrent unit to construct the health indicator. The results state that the proposed method shows the enhance performance than the related studies which have used the same bearing dataset provided by PRONOSTIA.

Keywords: continuous wavelet transform, convolution neural net-work, gated recurrent unit, health indicators, remaining useful life

Procedia PDF Downloads 92
5340 Intrusion Detection in Cloud Computing Using Machine Learning

Authors: Faiza Babur Khan, Sohail Asghar

Abstract:

With an emergence of distributed environment, cloud computing is proving to be the most stimulating computing paradigm shift in computer technology, resulting in spectacular expansion in IT industry. Many companies have augmented their technical infrastructure by adopting cloud resource sharing architecture. Cloud computing has opened doors to unlimited opportunities from application to platform availability, expandable storage and provision of computing environment. However, from a security viewpoint, an added risk level is introduced from clouds, weakening the protection mechanisms, and hardening the availability of privacy, data security and on demand service. Issues of trust, confidentiality, and integrity are elevated due to multitenant resource sharing architecture of cloud. Trust or reliability of cloud refers to its capability of providing the needed services precisely and unfailingly. Confidentiality is the ability of the architecture to ensure authorization of the relevant party to access its private data. It also guarantees integrity to protect the data from being fabricated by an unauthorized user. So in order to assure provision of secured cloud, a roadmap or model is obligatory to analyze a security problem, design mitigation strategies, and evaluate solutions. The aim of the paper is twofold; first to enlighten the factors which make cloud security critical along with alleviation strategies and secondly to propose an intrusion detection model that identifies the attackers in a preventive way using machine learning Random Forest classifier with an accuracy of 99.8%. This model uses less number of features. A comparison with other classifiers is also presented.

Keywords: cloud security, threats, machine learning, random forest, classification

Procedia PDF Downloads 285
5339 Analyzing the Influence of Hydrometeorlogical Extremes, Geological Setting, and Social Demographic on Public Health

Authors: Irfan Ahmad Afip

Abstract:

This main research objective is to accurately identify the possibility for a Leptospirosis outbreak severity of a certain area based on its input features into a multivariate regression model. The research question is the possibility of an outbreak in a specific area being influenced by this feature, such as social demographics and hydrometeorological extremes. If the occurrence of an outbreak is being subjected to these features, then the epidemic severity for an area will be different depending on its environmental setting because the features will influence the possibility and severity of an outbreak. Specifically, this research objective was three-fold, namely: (a) to identify the relevant multivariate features and visualize the patterns data, (b) to develop a multivariate regression model based from the selected features and determine the possibility for Leptospirosis outbreak in an area, and (c) to compare the predictive ability of multivariate regression model and machine learning algorithms. Several secondary data features were collected locations in the state of Negeri Sembilan, Malaysia, based on the possibility it would be relevant to determine the outbreak severity in the area. The relevant features then will become an input in a multivariate regression model; a linear regression model is a simple and quick solution for creating prognostic capabilities. A multivariate regression model has proven more precise prognostic capabilities than univariate models. The expected outcome from this research is to establish a correlation between the features of social demographic and hydrometeorological with Leptospirosis bacteria; it will also become a contributor for understanding the underlying relationship between the pathogen and the ecosystem. The relationship established can be beneficial for the health department or urban planner to inspect and prepare for future outcomes in event detection and system health monitoring.

Keywords: geographical information system, hydrometeorological, leptospirosis, multivariate regression

Procedia PDF Downloads 72
5338 Customer Churn Prediction by Using Four Machine Learning Algorithms Integrating Features Selection and Normalization in the Telecom Sector

Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh

Abstract:

A crucial component of maintaining a customer-oriented business as in the telecom industry is understanding the reasons and factors that lead to customer churn. Competition between telecom companies has greatly increased in recent years. It has become more important to understand customers’ needs in this strong market of telecom industries, especially for those who are looking to turn over their service providers. So, predictive churn is now a mandatory requirement for retaining those customers. Machine learning can be utilized to accomplish this. Churn Prediction has become a very important topic in terms of machine learning classification in the telecommunications industry. Understanding the factors of customer churn and how they behave is very important to building an effective churn prediction model. This paper aims to predict churn and identify factors of customers’ churn based on their past service usage history. Aiming at this objective, the study makes use of feature selection, normalization, and feature engineering. Then, this study compared the performance of four different machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Evaluation of the performance was conducted by using the F1 score and ROC-AUC. Comparing the results of this study with existing models has proven to produce better results. The results showed the Gradients Boosting with feature selection technique outperformed in this study by achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.

Keywords: machine learning, gradient boosting, logistic regression, churn, random forest, decision tree, ROC, AUC, F1-score

Procedia PDF Downloads 94
5337 Radar Fault Diagnosis Strategy Based on Deep Learning

Authors: Bin Feng, Zhulin Zong

Abstract:

Radar systems are critical in the modern military, aviation, and maritime operations, and their proper functioning is essential for the success of these operations. However, due to the complexity and sensitivity of radar systems, they are susceptible to various faults that can significantly affect their performance. Traditional radar fault diagnosis strategies rely on expert knowledge and rule-based approaches, which are often limited in effectiveness and require a lot of time and resources. Deep learning has recently emerged as a promising approach for fault diagnosis due to its ability to learn features and patterns from large amounts of data automatically. In this paper, we propose a radar fault diagnosis strategy based on deep learning that can accurately identify and classify faults in radar systems. Our approach uses convolutional neural networks (CNN) to extract features from radar signals and fault classify the features. The proposed strategy is trained and validated on a dataset of measured radar signals with various types of faults. The results show that it achieves high accuracy in fault diagnosis. To further evaluate the effectiveness of the proposed strategy, we compare it with traditional rule-based approaches and other machine learning-based methods, including decision trees, support vector machines (SVMs), and random forests. The results demonstrate that our deep learning-based approach outperforms the traditional approaches in terms of accuracy and efficiency. Finally, we discuss the potential applications and limitations of the proposed strategy, as well as future research directions. Our study highlights the importance and potential of deep learning for radar fault diagnosis. It suggests that it can be a valuable tool for improving the performance and reliability of radar systems. In summary, this paper presents a radar fault diagnosis strategy based on deep learning that achieves high accuracy and efficiency in identifying and classifying faults in radar systems. The proposed strategy has significant potential for practical applications and can pave the way for further research.

Keywords: radar system, fault diagnosis, deep learning, radar fault

Procedia PDF Downloads 39
5336 Study of Quantum Lasers of Random Trimer Barrier AlxGa1-xAs Superlattices

Authors: Bentata Samir, Bendahma Fatima

Abstract:

We have numerically studied the random trimer barrier AlxGa1-xAs superlattices (RTBSL). Such systems consist of two different structures randomly distributed along the growth direction, with the additional constraint that the barriers of one kind appear in triply. An explicit formula is given for evaluating the transmission coefficient of superlattices (SL's) in intentional correlated disorder. We have specially investigated the effect of aluminum concentration on the laser wavelength. We discuss the impact of the aluminum concentration associated with the structure profile on the laser wavelengths.

Keywords: superlattices, transfer matrix method, transmission coefficient, quantum laser

Procedia PDF Downloads 439
5335 Multi-Objective Random Drift Particle Swarm Optimization Algorithm Based on RDPSO and Crowding Distance Sorting

Authors: Yiqiong Yuan, Jun Sun, Dongmei Zhou, Jianan Sun

Abstract:

In this paper, we presented a Multi-Objective Random Drift Particle Swarm Optimization algorithm (MORDPSO-CD) based on RDPSO and crowding distance sorting to improve the convergence and distribution with less computation cost. MORDPSO-CD makes the most of RDPSO to approach the true Pareto optimal solutions fast. We adopt the crowding distance sorting technique to update and maintain the archived optimal solutions. Introducing the crowding distance technique into MORDPSO can make the leader particles find the true Pareto solution ultimately. The simulation results reveal that the proposed algorithm has better convergence and distribution

Keywords: multi-objective optimization, random drift particle swarm optimization, crowding distance sorting, pareto optimal solution

Procedia PDF Downloads 212
5334 Implementation of a Multimodal Biometrics Recognition System with Combined Palm Print and Iris Features

Authors: Rabab M. Ramadan, Elaraby A. Elgallad

Abstract:

With extensive application, the performance of unimodal biometrics systems has to face a diversity of problems such as signal and background noise, distortion, and environment differences. Therefore, multimodal biometric systems are proposed to solve the above stated problems. This paper introduces a bimodal biometric recognition system based on the extracted features of the human palm print and iris. Palm print biometric is fairly a new evolving technology that is used to identify people by their palm features. The iris is a strong competitor together with face and fingerprints for presence in multimodal recognition systems. In this research, we introduced an algorithm to the combination of the palm and iris-extracted features using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Since the feature sets are non-homogeneous as features of different biometric modalities are used, these features will be concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature. The proposed algorithm will be applied to the Institute of Technology of Delhi (IITD) database and its performance will be compared with various iris recognition algorithms found in the literature.

Keywords: iris recognition, particle swarm optimization, feature extraction, feature selection, palm print, the Scale Invariant Feature Transform (SIFT)

Procedia PDF Downloads 184
5333 Node Optimization in Wireless Sensor Network: An Energy Approach

Authors: Y. B. Kirankumar, J. D. Mallapur

Abstract:

Wireless Sensor Network (WSN) is an emerging technology, which has great invention for various low cost applications both for mass public as well as for defence. The wireless sensor communication technology allows random participation of sensor nodes with particular applications to take part in the network, which results in most of the uncovered simulation area, where fewer nodes are located at far distances. The drawback of such network would be that the additional energy is spent by the nodes located in a pattern of dense location, using more number of nodes for a smaller distance of communication adversely in a region with less number of nodes and additional energy is again spent by the source node in order to transmit a packet to neighbours, thereby transmitting the packet to reach the destination. The proposed work is intended to develop Energy Efficient Node Placement Algorithm (EENPA) in order to place the sensor node efficiently in simulated area, where all the nodes are equally located on a radial path to cover maximum area at equidistance. The total energy consumed by each node compared to random placement of nodes is less by having equal burden on fewer nodes of far location, having distributed the nodes in whole of the simulation area. Calculating the network lifetime also proves to be efficient as compared to random placement of nodes, hence increasing the network lifetime, too. Simulation is been carried out in a qualnet simulator, results are obtained on par with random placement of nodes with EENP algorithm.

Keywords: energy, WSN, wireless sensor network, energy approach

Procedia PDF Downloads 276
5332 Modeling Of The Random Impingement Erosion Due To The Impact Of The Solid Particles

Authors: Siamack A. Shirazi, Farzin Darihaki

Abstract:

Solid particles could be found in many multiphase flows, including transport pipelines and pipe fittings. Such particles interact with the pipe material and cause erosion which threats the integrity of the system. Therefore, predicting the erosion rate is an important factor in the design and the monitor of such systems. Mechanistic models can provide reliable predictions for many conditions while demanding only relatively low computational cost. Mechanistic models utilize a representative particle trajectory to predict the impact characteristics of the majority of the particle impacts that cause maximum erosion rate in the domain. The erosion caused by particle impacts is not only due to the direct impacts but also random impingements. In the present study, an alternative model has been introduced to describe the erosion due to random impingement of particles. The present model provides a realistic trend for erosion with changes in the particle size and particle Stokes number. The present model is examined against the experimental data and CFD simulation results and indicates better agreement with the data incomparison to the available models in the literature.

Keywords: erosion, mechanistic modeling, particles, multiphase flow, gas-liquid-solid

Procedia PDF Downloads 135
5331 A Development of a Simulation Tool for Production Planning with Capacity-Booking at Specialty Store Retailer of Private Label Apparel Firms

Authors: Erika Yamaguchi, Sirawadee Arunyanrt, Shunichi Ohmori, Kazuho Yoshimoto

Abstract:

In this paper, we suggest a simulation tool to make a decision of monthly production planning for maximizing a profit of Specialty store retailer of Private label Apparel (SPA) firms. Most of SPA firms are fabless and make outsourcing deals for productions with factories of their subcontractors. Every month, SPA firms make a booking for production lines and manpower in the factories. The booking is conducted a few months in advance based on a demand prediction and a monthly production planning at that time. However, the demand prediction is updated month by month, and the monthly production planning would change to meet the latest demand prediction. Then, SPA firms have to change the capacities initially booked within a certain range to suit to the monthly production planning. The booking system is called “capacity-booking”. These days, though it is an issue for SPA firms to make precise monthly production planning, many firms are still conducting the production planning by empirical rules. In addition, it is also a challenge for SPA firms to match their products and factories with considering their demand predictabilities and regulation abilities. In this paper, we suggest a model for considering these two issues. An objective is to maximize a total profit of certain periods, which is sales minus costs of production, inventory, and capacity-booking penalty. To make a better monthly production planning at SPA firms, these points should be considered: demand predictabilities by random trends, previous and next month’s production planning of the target month, and regulation abilities of the capacity-booking. To decide matching products and factories for outsourcing, it is important to consider seasonality, volume, and predictability of each product, production possibility, size, and regulation ability of each factory. SPA firms have to consider these constructions and decide orders with several factories per one product. We modeled these issues as a linear programming. To validate the model, an example of several computational experiments with a SPA firm is presented. We suppose four typical product groups: basic, seasonal (Spring / Summer), seasonal (Fall / Winter), and spot product. As a result of the experiments, a monthly production planning was provided. In the planning, demand predictabilities from random trend are reduced by producing products which are different product types. Moreover, priorities to produce are given to high-margin products. In conclusion, we developed a simulation tool to make a decision of monthly production planning which is useful when the production planning is set every month. We considered the features of capacity-booking, and matching of products and factories which have different features and conditions.

Keywords: capacity-booking, SPA, monthly production planning, linear programming

Procedia PDF Downloads 481
5330 Rural Development through Women Participation in Livestock Care and Management in District Faisalabad

Authors: Arfan Riasat, M. Iqbal Zafar, Gulfam Riasat

Abstract:

Pakistani women actively participate in livestock management activities, along with their normal domestic chores. The study was designed to measure the position and contribution of rural women, their constraints in livestock management activities and mainly how the rural women contribute for development in the district Faisalabad. It was envisioned that women participation in livestock activities have rarely been investigated. A multistage random sampling technique was used to collect the data from Tehsil Summandry of the district selected at random. Two union councils were taken by using simple random sampling technique. Four Chak (village) from each union council were selected at random and fifteen woman were further selected randomly from each selected chak. The results show that a vast majority of women were illiterate, having annual family income of one to two lac. They are living in joint family system. Their main occupation is agriculture and they spend long hours in whole livestock related activities to support their families. A large proportion of the respondents reported that they had to face problems and constraints in livestock activities in the context of decision making, medication, awareness, training along with social and economic issues. Analysis indicated that education level of women, income of household, age were significantly associated with level of participation. Women participation in livestock activities increased production and they were involved in income generating activities for better economic conditions of their families.

Keywords: women, participation, livestock, management, rural development

Procedia PDF Downloads 370
5329 Designing Inventory System with Constrained by Reducing Ordering Cost, Lead Time and Lost Sale Rate and Considering Random Disturbance in Ordering Quantity

Authors: Arezoo Heidary, Abolfazl Mirzazadeh, Aref Gholami-Qadikolaei

Abstract:

In the business environment it is very common that a lot received may not be equal to quantity ordered. in this work, a random disturbance in a received quantity is considered. It is assumed a maximum allowable limit for storage space and inventory investment.The impact of lead time and ordering cost reductions once they act dependently is also investigated. Further, considering a mixture of back order and lost sales for allowable shortage system, the effect of investment on reducing lost sale rate is analyzed. For the proposed control system, a Lagrangian method is applied in order to solve the problem and an algorithmic procedure is utilized to achieve optimal solution with the global minimum expected cost. Finally, proves on concavity and convexity of the model in the decision variables are shown.

Keywords: stochastic inventory system, lead time, ordering cost, lost sale rate, inventory constraints, random disturbance

Procedia PDF Downloads 377
5328 The Syntactic Features of Islamic Legal Texts and Their Implications for Translation

Authors: Rafat Y. Alwazna

Abstract:

Certain religious texts are deemed part of legal texts that are characterised by high sensitivity and sacredness. Amongst such religious texts are Islamic legal texts that are replete with Islamic legal terms that designate particular legal concepts peculiar to Islamic legal system and legal culture. However, from the syntactic perspective, Islamic legal texts prove lengthy, condensed and convoluted, with little use of punctuation system, but with an extensive use of subordinations and co-ordinations, which separate the main verb from the subject, and which, of course, carry a heavy load of legal detail. The present paper seeks to examine the syntactic features of Islamic legal texts through analysing a short text of Islamic jurisprudence in an attempt at exploring the syntactic features that characterise this type of legal text. A translation of this text into legal English is then exercised to find the translation implications that have emerged as a result of the English translation. Based on these implications, the paper compares and contrasts the syntactic features of Islamic legal texts to those of legal English texts. Finally, the present paper argues that there are a number of syntactic features of Islamic legal texts, such as nominalisation, passivisation, little use of punctuation system, the use of the Arabic cohesive device, etc., which are also possessed by English legal texts except for the last feature and with some variations. The paper also claims that when rendering an Islamic legal text into legal English, certain implications emerge, such as the necessity of a sentence break, the omission of the cohesive device concerned and the increase in the use of nominalisation, passivisation, passive participles, and so on.

Keywords: English legal texts, Islamic legal texts, nominalisation, participles, passivisation, syntactic features, translation implications

Procedia PDF Downloads 172
5327 Development of Sleep Quality Index Using Heart Rate

Authors: Dongjoo Kim, Chang-Sik Son, Won-Seok Kang

Abstract:

Adequate sleep affects various parts of one’s overall physical and mental life. As one of the methods in determining the appropriate amount of sleep, this research presents a heart rate based sleep quality index. In order to evaluate sleep quality using the heart rate, sleep data from 280 subjects taken over one month are used. Their sleep data are categorized by a three-part heart rate range. After categorizing, some features are extracted, and the statistical significances are verified for these features. The results show that some features of this sleep quality index model have statistical significance. Thus, this heart rate based sleep quality index may be a useful discriminator of sleep.

Keywords: sleep, sleep quality, heart rate, statistical analysis

Procedia PDF Downloads 300
5326 The Use of Random Set Method in Reliability Analysis of Deep Excavations

Authors: Arefeh Arabaninezhad, Ali Fakher

Abstract:

Since the deterministic analysis methods fail to take system uncertainties into account, probabilistic and non-probabilistic methods are suggested. Geotechnical analyses are used to determine the stress and deformation caused by construction; accordingly, many input variables which depend on ground behavior are required for geotechnical analyses. The Random Set approach is an applicable reliability analysis method when comprehensive sources of information are not available. Using Random Set method, with relatively small number of simulations compared to fully probabilistic methods, smooth extremes on system responses are obtained. Therefore random set approach has been proposed for reliability analysis in geotechnical problems. In the present study, the application of random set method in reliability analysis of deep excavations is investigated through three deep excavation projects which were monitored during the excavating process. A finite element code is utilized for numerical modeling. Two expected ranges, from different sources of information, are established for each input variable, and a specific probability assignment is defined for each range. To determine the most influential input variables and subsequently reducing the number of required finite element calculations, sensitivity analysis is carried out. Input data for finite element model are obtained by combining the upper and lower bounds of the input variables. The relevant probability share of each finite element calculation is determined considering the probability assigned to input variables present in these combinations. Horizontal displacement of the top point of excavation is considered as the main response of the system. The result of reliability analysis for each intended deep excavation is presented by constructing the Belief and Plausibility distribution function (i.e. lower and upper bounds) of system response obtained from deterministic finite element calculations. To evaluate the quality of input variables as well as applied reliability analysis method, the range of displacements extracted from models has been compared to the in situ measurements and good agreement is observed. The comparison also showed that Random Set Finite Element Method applies to estimate the horizontal displacement of the top point of deep excavation. Finally, the probability of failure or unsatisfactory performance of the system is evaluated by comparing the threshold displacement with reliability analysis results.

Keywords: deep excavation, random set finite element method, reliability analysis, uncertainty

Procedia PDF Downloads 235
5325 A Comprehensive Analysis of the Phylogenetic Signal in Ramp Sequences in 211 Vertebrates

Authors: Lauren M. McKinnon, Justin B. Miller, Michael F. Whiting, John S. K. Kauwe, Perry G. Ridge

Abstract:

Background: Ramp sequences increase translational speed and accuracy when rare, slowly-translated codons are found at the beginnings of genes. Here, the results of the first analysis of ramp sequences in a phylogenetic construct are presented. Methods: Ramp sequences were compared from 211 vertebrates (110 Mammalian and 101 non-mammalian). The presence and absence of ramp sequences were analyzed as a binary character in a parsimony and maximum likelihood framework. Additionally, ramp sequences were mapped to the Open Tree of Life taxonomy to determine the number of parallelisms and reversals that occurred, and these results were compared to what would be expected due to random chance. Lastly, aligned nucleotides in ramp sequences were compared to the rest of the sequence in order to examine possible differences in phylogenetic signal between these regions of the gene. Results: Parsimony and maximum likelihood analyses of the presence/absence of ramp sequences recovered phylogenies that are highly congruent with established phylogenies. Additionally, the retention index of ramp sequences is significantly higher than would be expected due to random chance (p-value = 0). A chi-square analysis of completely orthologous ramp sequences resulted in a p-value of approximately zero as compared to random chance. Discussion: Ramp sequences recover comparable phylogenies as other phylogenomic methods. Although not all ramp sequences appear to have a phylogenetic signal, more ramp sequences track speciation than expected by random chance. Therefore, ramp sequences may be used in conjunction with other phylogenomic approaches.

Keywords: codon usage bias, phylogenetics, phylogenomics, ramp sequence

Procedia PDF Downloads 118
5324 Geo-Additive Modeling of Family Size in Nigeria

Authors: Oluwayemisi O. Alaba, John O. Olaomi

Abstract:

The 2013 Nigerian Demographic Health Survey (NDHS) data was used to investigate the determinants of family size in Nigeria using the geo-additive model. The fixed effect of categorical covariates were modelled using the diffuse prior, P-spline with second-order random walk for the nonlinear effect of continuous variable, spatial effects followed Markov random field priors while the exchangeable normal priors were used for the random effects of the community and household. The Negative Binomial distribution was used to handle overdispersion of the dependent variable. Inference was fully Bayesian approach. Results showed a declining effect of secondary and higher education of mother, Yoruba tribe, Christianity, family planning, mother giving birth by caesarean section and having a partner who has secondary education on family size. Big family size is positively associated with age at first birth, number of daughters in a household, being gainfully employed, married and living with partner, community and household effects.

Keywords: Bayesian analysis, family size, geo-additive model, negative binomial

Procedia PDF Downloads 499
5323 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.

Keywords: classifier ensemble, breast cancer survivability, data mining, SEER

Procedia PDF Downloads 285
5322 Test-Retest Agreement, Random Measurement Error and Practice Effect of the Continuous Performance Test-Identical Pairs for Patients with Schizophrenia

Authors: Kuan-Wei Chen, Chien-Wei Chen, Tai-Ling Chang, Nan-Cheng Chen, Ching-Lin Hsieh, Gong-Hong Lin

Abstract:

Background and Purposes: Deficits in sustained attention are common in patients with schizophrenia. Such impairment can limit patients to effectively execute daily activities and affect the efficacy of rehabilitation. The aims of this study were to examine the test-retest agreement, random measurement error, and practice effect of the Continuous Performance Test-Identical Pairs (CPT-IP) (a commonly used sustained attention test) in patients with schizophrenia. The results can provide empirical evidence for clinicians and researchers to apply a sustained attention test with sound psychometric properties in schizophrenia patients. Methods: We recruited patients with chronic schizophrenia to be assessed twice with 1 week interval using CPT-IP. The intra-class correlation coefficient (ICC) was used to examine the test-retest agreement. The percentage of minimal detectable change (MDC%) was used to examine the random measurement error. Moreover, the standardized response mean (SRM) was used to examine the practice effect. Results: A total of 56 patients participated in this study. Our results showed that the ICC was 0.82, MDC% was 47.4%, and SRMs were 0.36 for the CPT-IP. Conclusion: Our results indicate that CPT-IP has acceptable test-retests agreement, substantial random measurement error, and small practice effect in patients with schizophrenia. Therefore, to avoid overestimating patients’ changes in sustained attention, we suggest that clinicians interpret the change scores of CPT-IP conservatively in their routine repeated assessments.

Keywords: schizophrenia, sustained attention, CPT-IP, reliability

Procedia PDF Downloads 259
5321 Early Gastric Cancer Prediction from Diet and Epidemiological Data Using Machine Learning in Mizoram Population

Authors: Brindha Senthil Kumar, Payel Chakraborty, Senthil Kumar Nachimuthu, Arindam Maitra, Prem Nath

Abstract:

Gastric cancer is predominantly caused by demographic and diet factors as compared to other cancer types. The aim of the study is to predict Early Gastric Cancer (ECG) from diet and lifestyle factors using supervised machine learning algorithms. For this study, 160 healthy individual and 80 cases were selected who had been followed for 3 years (2016-2019), at Civil Hospital, Aizawl, Mizoram. A dataset containing 11 features that are core risk factors for the gastric cancer were extracted. Supervised machine algorithms: Logistic Regression, Naive Bayes, Support Vector Machine (SVM), Multilayer perceptron, and Random Forest were used to analyze the dataset using Python Jupyter Notebook Version 3. The obtained classified results had been evaluated using metrics parameters: minimum_false_positives, brier_score, accuracy, precision, recall, F1_score, and Receiver Operating Characteristics (ROC) curve. Data analysis results showed Naive Bayes - 88, 0.11; Random Forest - 83, 0.16; SVM - 77, 0.22; Logistic Regression - 75, 0.25 and Multilayer perceptron - 72, 0.27 with respect to accuracy and brier_score in percent. Naive Bayes algorithm out performs with very low false positive rates as well as brier_score and good accuracy. Naive Bayes algorithm classification results in predicting ECG showed very satisfactory results using only diet cum lifestyle factors which will be very helpful for the physicians to educate the patients and public, thereby mortality of gastric cancer can be reduced/avoided with this knowledge mining work.

Keywords: Early Gastric cancer, Machine Learning, Diet, Lifestyle Characteristics

Procedia PDF Downloads 106
5320 Deterministic Random Number Generator Algorithm for Cryptosystem Keys

Authors: Adi A. Maaita, Hamza A. A. Al Sewadi

Abstract:

One of the crucial parameters of digital cryptographic systems is the selection of the keys used and their distribution. The randomness of the keys has a strong impact on the system’s security strength being difficult to be predicted, guessed, reproduced or discovered by a cryptanalyst. Therefore, adequate key randomness generation is still sought for the benefit of stronger cryptosystems. This paper suggests an algorithm designed to generate and test pseudo random number sequences intended for cryptographic applications. This algorithm is based on mathematically manipulating a publically agreed upon information between sender and receiver over a public channel. This information is used as a seed for performing some mathematical functions in order to generate a sequence of pseudorandom numbers that will be used for encryption/decryption purposes. This manipulation involves permutations and substitutions that fulfills Shannon’s principle of “confusion and diffusion”. ASCII code characters wereutilized in the generation process instead of using bit strings initially, which adds more flexibility in testing different seed values. Finally, the obtained results would indicate sound difficulty of guessing keys by attackers.

Keywords: cryptosystems, information security agreement, key distribution, random numbers

Procedia PDF Downloads 228
5319 Enhanced Thai Character Recognition with Histogram Projection Feature Extraction

Authors: Benjawan Rangsikamol, Chutimet Srinilta

Abstract:

This research paper deals with extraction of Thai character features using the proposed histogram projection so as to improve the recognition performance. The process starts with transformation of image files into binary files before thinning. After character thinning, the skeletons are entered into the proposed extraction using histogram projection (horizontal and vertical) to extract unique features which are inputs of the subsequent recognition step. The recognition rate with the proposed extraction technique is as high as 97 percent since the technique works very well with the idiosyncrasies of Thai characters.

Keywords: character recognition, histogram projection, multilayer perceptron, Thai character features extraction

Procedia PDF Downloads 423
5318 A Review of Serious Games Characteristics: Common and Specific Aspects

Authors: B. Ben Amara, H. Mhiri Sellami

Abstract:

Serious games adoption is increasing in multiple fields, including health, education, and business. In the same way, many research studied serious games (SGs) for various purposes such as classification, positive impacts, or learning outcomes. Although most of these research examine SG characteristics (SGCs) for conducting their studies, to author’s best knowledge, there is no consensus about features neither in number not in the description. In this paper, we conduct a literature review to collect essential game attributes regardless of the application areas and the study objectives. Firstly, we aimed to define Common SGCs (CSGCs) that characterize the game aspect, by gathering features having the same meanings. Secondly, we tried to identify specific features related to the application area or to the study purpose as a serious aspect. The findings suggest that any type of SG can be defined by a number of CSGCs depicting the gaming side, such as adaptability and rules. In addition, we outlined a number of specific SGCs describing the 'serious' aspect, including specific needs of the domain and indented outcomes. In conclusion, our review showed that it is possible to bridge the research gap due to the lack of consensus by using CSGCs. Moreover, these features facilitate the design and development of successful serious games in any domain and provide a foundation for further research in this area.

Keywords: serious game characteristics, serious games common aspects, serious games features, serious games outcomes

Procedia PDF Downloads 94
5317 Evaluation of Technology Tools for Mathematics Instruction by Novice Elementary Teachers

Authors: Christopher J. Johnston

Abstract:

This paper presents the finding of a research study in which novice (first and second year) elementary teachers (grades Kindergarten – six) evaluated various mathematics Virtual Manipulatives, websites, and Applets (tools) for use in mathematics instruction. Participants identified the criteria they used for evaluating these types of resources and provided recommendations for or against five pre-selected tools. During the study, participants participated in three data collection activities: (1) A brief Likert-scale survey which gathered information about their attitudes toward technology use; (2) An identification of criteria for evaluating technology tools; and (3) A review of five pre-selected technology tools in light of their self-identified criteria. Data were analyzed qualitatively using four theoretical categories (codes): Software Features (41%), Mathematics (26%), Learning (22%), and Motivation (11%). These four theoretical categories were then grouped into two broad categories: Content and Instruction (Mathematics and Learning), and Surface Features (Software Features and Motivation). These combined, broad categories suggest novice teachers place roughly the same weight on pedagogical features as they do technological features. Implications for mathematics teacher educators are discussed, and suggestions for future research are provided.

Keywords: mathematics education, novice teachers, technology, virtual manipulatives

Procedia PDF Downloads 94
5316 Using Predictive Analytics to Identify First-Year Engineering Students at Risk of Failing

Authors: Beng Yew Low, Cher Liang Cha, Cheng Yong Teoh

Abstract:

Due to a lack of continual assessment or grade related data, identifying first-year engineering students in a polytechnic education at risk of failing is challenging. Our experience over the years tells us that there is no strong correlation between having good entry grades in Mathematics and the Sciences and excelling in hardcore engineering subjects. Hence, identifying students at risk of failure cannot be on the basis of entry grades in Mathematics and the Sciences alone. These factors compound the difficulty of early identification and intervention. This paper describes the development of a predictive analytics model in the early detection of students at risk of failing and evaluates its effectiveness. Data from continual assessments conducted in term one, supplemented by data of student psychological profiles such as interests and study habits, were used. Three classification techniques, namely Logistic Regression, K Nearest Neighbour, and Random Forest, were used in our predictive model. Based on our findings, Random Forest was determined to be the strongest predictor with an Area Under the Curve (AUC) value of 0.994. Correspondingly, the Accuracy, Precision, Recall, and F-Score were also highest among these three classifiers. Using this Random Forest Classification technique, students at risk of failure could be identified at the end of term one. They could then be assigned to a Learning Support Programme at the beginning of term two. This paper gathers the results of our findings. It also proposes further improvements that can be made to the model.

Keywords: continual assessment, predictive analytics, random forest, student psychological profile

Procedia PDF Downloads 88
5315 Solving Weighted Number of Operation Plus Processing Time Due-Date Assignment, Weighted Scheduling and Process Planning Integration Problem Using Genetic and Simulated Annealing Search Methods

Authors: Halil Ibrahim Demir, Caner Erden, Mumtaz Ipek, Ozer Uygun

Abstract:

Traditionally, the three important manufacturing functions, which are process planning, scheduling and due-date assignment, are performed separately and sequentially. For couple of decades, hundreds of studies are done on integrated process planning and scheduling problems and numerous researches are performed on scheduling with due date assignment problem, but unfortunately the integration of these three important functions are not adequately addressed. Here, the integration of these three important functions is studied by using genetic, random-genetic hybrid, simulated annealing, random-simulated annealing hybrid and random search techniques. As well, the importance of the integration of these three functions and the power of meta-heuristics and of hybrid heuristics are studied.

Keywords: process planning, weighted scheduling, weighted due-date assignment, genetic search, simulated annealing, hybrid meta-heuristics

Procedia PDF Downloads 442
5314 A Chinese Nested Named Entity Recognition Model Based on Lexical Features

Authors: Shuo Liu, Dan Liu

Abstract:

In the field of named entity recognition, most of the research has been conducted around simple entities. However, for nested named entities, which still contain entities within entities, it has been difficult to identify them accurately due to their boundary ambiguity. In this paper, a hierarchical recognition model is constructed based on the grammatical structure and semantic features of Chinese text for boundary calculation based on lexical features. The analysis is carried out at different levels in terms of granularity, semantics, and lexicality, respectively, avoiding repetitive work to reduce computational effort and using the semantic features of words to calculate the boundaries of entities to improve the accuracy of the recognition work. The results of the experiments carried out on web-based microblogging data show that the model achieves an accuracy of 86.33% and an F1 value of 89.27% in recognizing nested named entities, making up for the shortcomings of some previous recognition models and improving the efficiency of recognition of nested named entities.

Keywords: coarse-grained, nested named entity, Chinese natural language processing, word embedding, T-SNE dimensionality reduction algorithm

Procedia PDF Downloads 91
5313 Genetic Algorithms for Feature Generation in the Context of Audio Classification

Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes

Abstract:

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a useful representation to machine learning tasks. In automatic audio classification tasks, this is interesting since the audio, usually complex information, needs to be transformed into a computationally convenient input to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have being used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.

Keywords: feature generation, feature learning, genetic algorithm, music information retrieval

Procedia PDF Downloads 391
5312 Detection and Classification of Myocardial Infarction Using New Extracted Features from Standard 12-Lead ECG Signals

Authors: Naser Safdarian, Nader Jafarnia Dabanloo

Abstract:

In this paper we used four features i.e. Q-wave integral, QRS complex integral, T-wave integral and total integral as extracted feature from normal and patient ECG signals to detection and localization of myocardial infarction (MI) in left ventricle of heart. In our research we focused on detection and localization of MI in standard ECG. We use the Q-wave integral and T-wave integral because this feature is important impression in detection of MI. We used some pattern recognition method such as Artificial Neural Network (ANN) to detect and localize the MI. Because these methods have good accuracy for classification of normal and abnormal signals. We used one type of Radial Basis Function (RBF) that called Probabilistic Neural Network (PNN) because of its nonlinearity property, and used other classifier such as k-Nearest Neighbors (KNN), Multilayer Perceptron (MLP) and Naive Bayes Classification. We used PhysioNet database as our training and test data. We reached over 80% for accuracy in test data for localization and over 95% for detection of MI. Main advantages of our method are simplicity and its good accuracy. Also we can improve accuracy of classification by adding more features in this method. A simple method based on using only four features which extracted from standard ECG is presented which has good accuracy in MI localization.

Keywords: ECG signal processing, myocardial infarction, features extraction, pattern recognition

Procedia PDF Downloads 418