Search results for: predictive analysis algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28763

28403 Comparison between Transient Elastography (FibroScan) and Liver Biopsy for Diagnosis of Hepatic Fibrosis in Chronic Hepatitis C Genotype 4

Authors: Gamal Shiha, Seham Seif, Shahera Etreby, Khaled Zalata, Waleed Samir

Abstract:

Background: Transient elastography (TE; FibroScan®) is a non-invasive technique to assess liver fibrosis. Aim: To compare TE and liver biopsy in hepatitis C virus (HCV) patients with genotype 4 and to evaluate the effect of steatosis and schistosomiasis on FibroScan. Methods: The fibrosis stage (METAVIR score) and TE were assessed in 519 patients. The diagnostic performance of FibroScan was assessed by calculating the area under the receiver operating characteristic curve (AUROC). Results: The cut-off value for ≥ F2 was 8.55 kPa, for ≥ F3 was 10.2 kPa, and for cirrhosis (F4) was 16.3 kPa. The positive and negative predictive values were 70.1% and 81.7% for the diagnosis of ≥ F2, 62.6% and 96.22% for ≥ F3, and 27.7% and 100% for F4. No significant association was found between schistosomiasis or degree of steatosis and FibroScan measurements. Conclusion: FibroScan can accurately predict liver fibrosis.
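
As a rough illustration of the diagnostic-performance calculation described above, the following sketch computes AUROC, PPV, and NPV in Python. The data are synthetic, not the study's 519-patient cohort; only the 8.55 kPa cut-off for ≥ F2 is taken from the abstract.

```python
# Illustrative sketch (not the study's code): evaluating a continuous marker
# such as liver stiffness (kPa) against a binary fibrosis stage (>= F2),
# using synthetic data in place of the 519-patient cohort.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# hypothetical stiffness values: higher on average when fibrosis >= F2
y = rng.integers(0, 2, 500)                      # 1 = METAVIR >= F2
kpa = np.where(y == 1, rng.normal(11, 3, 500), rng.normal(6, 2, 500))

auroc = roc_auc_score(y, kpa)                    # area under the ROC curve

cutoff = 8.55                                    # cut-off reported for >= F2
pred = (kpa >= cutoff).astype(int)
tp = np.sum((pred == 1) & (y == 1))
fp = np.sum((pred == 1) & (y == 0))
tn = np.sum((pred == 0) & (y == 0))
fn = np.sum((pred == 0) & (y == 1))
ppv = tp / (tp + fp)                             # positive predictive value
npv = tn / (tn + fn)                             # negative predictive value
print(f"AUROC={auroc:.3f}  PPV={ppv:.1%}  NPV={npv:.1%}")
```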

Keywords: chronic hepatitis C, FibroScan, liver biopsy, liver fibrosis

Procedia PDF Downloads 379
28402 State Estimation Based on Unscented Kalman Filter for Burgers’ Equation

Authors: Takashi Shimizu, Tomoaki Hashimoto

Abstract:

Controlling the flow of fluids is a challenging problem that arises in many fields. Burgers' equation is a fundamental equation for several flow phenomena such as traffic, shock waves, and turbulence. An optimal feedback control method, so-called model predictive control, has been proposed for Burgers' equation. However, model predictive control is inapplicable to systems in which not all state variables are exactly known. From a practical point of view, it is unusual for all the state variables of a system to be exactly known, because state variables are measured through output sensors and only a limited subset of them is typically available. In fact, the flow velocities of fluid systems usually cannot be measured over the whole spatial domain. Hence, any practical feedback controller for fluid systems must incorporate some type of state estimator. To apply model predictive control to fluid systems described by Burgers' equation, a state estimation method for Burgers' equation with limited measurable state variables must be established. For this purpose, we apply the unscented Kalman filter to estimate the state variables of fluid systems described by Burgers' equation. The objective of this study is to establish a state estimation method based on the unscented Kalman filter for Burgers' equation. The effectiveness of the proposed method is verified by numerical simulations.
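
A minimal sketch of the approach, assuming a finite-difference discretization of viscous Burgers' equation with periodic boundaries and the filterpy implementation of the unscented Kalman filter; the grid size, noise levels, and sensor locations below are illustrative, not the authors' settings. Only three of the twenty grid values are measured, and the filter reconstructs the full field.

```python
# Minimal sketch (assumed setup, not the authors' code): estimating the full
# velocity field of a discretized Burgers' equation from a few point sensors
# with an unscented Kalman filter (filterpy).
import numpy as np
from filterpy.kalman import UnscentedKalmanFilter, MerweScaledSigmaPoints

N, dx, dt, nu = 20, 0.05, 0.001, 0.05   # grid size, spacing, step, viscosity
obs_idx = [4, 9, 14]                    # only three sensors are available

def fx(u, dt):
    """One explicit finite-difference step of viscous Burgers' equation."""
    up, um = np.roll(u, -1), np.roll(u, 1)          # periodic neighbours
    return u - dt * u * (up - um) / (2 * dx) + dt * nu * (up - 2 * u + um) / dx**2

def hx(u):
    return u[obs_idx]                   # sensors measure a subset of the state

points = MerweScaledSigmaPoints(N, alpha=1e-3, beta=2.0, kappa=0.0)
ukf = UnscentedKalmanFilter(dim_x=N, dim_z=len(obs_idx), dt=dt,
                            fx=fx, hx=hx, points=points)
ukf.x = np.zeros(N)                     # filter starts from a wrong state
ukf.P *= 0.5
ukf.Q = 1e-6 * np.eye(N)
ukf.R = 1e-4 * np.eye(len(obs_idx))

x = np.sin(2 * np.pi * np.arange(N) * dx)           # true initial field
rng = np.random.default_rng(1)
for _ in range(500):
    x = fx(x, dt)                                   # true system evolves
    z = hx(x) + rng.normal(0, 0.01, len(obs_idx))   # noisy sensor readings
    ukf.predict()
    ukf.update(z)
print("RMS estimation error:", np.sqrt(np.mean((ukf.x - x) ** 2)))
```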

Keywords: observer systems, unscented Kalman filter, nonlinear systems, Burgers' equation

Procedia PDF Downloads 124
28401 Using Genetic Algorithms and Rough Set Based Fuzzy K-Modes to Improve Centroid Model Clustering Performance on Categorical Data

Authors: Rishabh Srivastav, Divyam Sharma

Abstract:

We propose an algorithm to cluster categorical data named 'Genetic algorithm initialized rough set based fuzzy K-modes for categorical data'. We propose an amalgamation of the simple K-modes algorithm, the rough and fuzzy set based K-modes, and the genetic algorithm to form a new algorithm which, we hypothesise, will provide better centroid-model clustering results than existing standard algorithms. In the proposed algorithm, the initialization and updating of modes are done by means of genetic algorithms, while the membership values are calculated using rough sets and fuzzy logic.
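
For orientation, a plain K-modes iteration on categorical data looks as follows. This sketch omits the paper's genetic-algorithm initialization and rough-fuzzy membership computation, replacing them with random initialization and hard assignments.

```python
# Minimal plain K-modes sketch for categorical data; the GA-based
# initialization and rough-fuzzy membership steps of the proposed
# algorithm are omitted here for brevity.
import numpy as np

def hamming(a, b):
    return np.sum(a != b, axis=-1)      # count of mismatched attributes

def k_modes(X, k, n_iter=20, seed=0):
    rng = np.random.default_rng(seed)
    modes = X[rng.choice(len(X), k, replace=False)]  # random init (GA in the paper)
    for _ in range(n_iter):
        # assign each record to the nearest mode by Hamming dissimilarity
        labels = np.argmin([hamming(X, m) for m in modes], axis=0)
        for j in range(k):              # update each mode attribute-wise
            members = X[labels == j]
            if len(members):
                modes[j] = [np.bincount(col).argmax() for col in members.T]
    return labels, modes

# toy categorical dataset, attributes encoded as small integers
X = np.array([[0, 1, 2], [0, 1, 1], [2, 0, 0], [2, 0, 1], [0, 1, 2]])
labels, modes = k_modes(X, k=2)
print(labels)
```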

Keywords: categorical data, fuzzy logic, genetic algorithm, K-modes clustering, rough sets

Procedia PDF Downloads 218
28400 Enhance the Power of Sentiment Analysis

Authors: Yu Zhang, Pedro Desouza

Abstract:

Since big data has become substantially more accessible and manageable due to the development of powerful tools for dealing with unstructured data, people are eager to mine information from social media resources that could not be handled in the past. Sentiment analysis, as a novel branch of text mining, has in the last decade become increasingly important in marketing analysis, customer risk prediction, and other fields. Scientists and researchers have undertaken significant work in creating and improving their sentiment models. In this paper, we present a concept of selecting appropriate classifiers based on the features and qualities of data sources by comparing the performances of five classifiers on three popular social media data sources: Twitter, Amazon Customer Reviews, and Movie Reviews. We introduce a couple of innovative models that outperform traditional sentiment classifiers for these data sources and provide insights on how to further improve the predictive power of sentiment analysis. The modelling and testing work was done using R and Greenplum in-database analytic tools.
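
A baseline version of such a classifier comparison can be sketched as follows, using a tiny invented review set; the paper's five classifiers and its R/Greenplum tooling are not reproduced here.

```python
# Illustrative comparison of off-the-shelf sentiment classifiers on a tiny
# hypothetical review set (synthetic labels; 1 = positive, 0 = negative).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

texts = ["great product, loved it", "terrible, waste of money",
         "works as expected", "awful quality, do not buy",
         "absolutely fantastic", "broke after one day",
         "very happy with this", "disappointing and slow"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

for name, clf in [("NaiveBayes", MultinomialNB()),
                  ("LogReg", LogisticRegression(max_iter=1000)),
                  ("LinearSVM", LinearSVC())]:
    pipe = make_pipeline(TfidfVectorizer(), clf)   # bag-of-words features
    scores = cross_val_score(pipe, texts, labels, cv=4)
    print(f"{name}: mean accuracy {scores.mean():.2f}")
```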

Keywords: sentiment analysis, social media, Twitter, Amazon, data mining, machine learning, text mining

Procedia PDF Downloads 328
28399 Supervised Learning for Cyber Threat Intelligence

Authors: Jihen Bennaceur, Wissem Zouaghi, Ali Mabrouk

Abstract:

The major aim of cyber threat intelligence (CTI) is to provide sophisticated knowledge about cybersecurity threats to ensure internal and external safeguards against modern cyberattacks. Inaccurate, incomplete, outdated, and low-value threat intelligence is the main problem. Therefore, data analysis based on AI algorithms is one of the emergent solutions to overcome threat information-sharing issues. In this paper, we propose a supervised machine learning-based algorithm to improve threat information sharing by providing a sophisticated classification of cyber threats and data. Extensive simulations investigate the overall accuracy, precision, recall, F1-score, and support to validate the designed algorithm and to compare it with several supervised machine learning algorithms.
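
The evaluation step maps directly onto standard tooling; the sketch below uses synthetic data standing in for the CTI dataset, with a random forest as one representative supervised model rather than necessarily the authors' choice.

```python
# Sketch of the evaluation step: per-class precision, recall, F1-score, and
# support for a multi-class threat classifier (synthetic stand-in data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report

X, y = make_classification(n_samples=600, n_features=20, n_informative=8,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
y_hat = clf.predict(X_te)
print("accuracy:", accuracy_score(y_te, y_hat))
print(classification_report(y_te, y_hat))  # precision / recall / F1 / support
```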

Keywords: threat information sharing, supervised learning, data classification, performance evaluation

Procedia PDF Downloads 120
28398 The Predictive Role of Attachment and Adjustment in the Decision-Making Process in Infertility

Authors: A. Luli, A. Santona

Abstract:

It is rare for individuals who are involved in a relationship to think about the possibility of having procreation problems in the present or in the future. However, infertility is a condition that affects millions of people all around the world. Often, infertile individuals have to deal with psychological, relational, and social problems. In these cases, they have to review their choices and, if necessary, take new ones into consideration. Different studies have examined the decisions that infertile individuals have to go through when dealing with infertility and its treatment, but none of them has focused on the decision-making style used by infertile individuals to solve their problem or on the factors that influence it. The aim of this paper is to define the style of decision-making used by infertile persons to find a solution to the 'problem' and the potential predictive role of attachment and dyadic adjustment. The total sample is composed of 251 participants, divided into two groups: the experimental group of 114 participants, 62 males and 52 females, aged between 25 and 59 years, and the control group of 137 participants, 65 males and 72 females, aged between 22 and 49 years. The battery of instruments used comprises the General Decision Making Style (GDMS), the Experiences in Close Relationships Questionnaire Revised (ECR-R), the Dyadic Adjustment Scale (DAS), and the Symptom Checklist-90-R (SCL-90-R). The results showed a prevalence of the rational decision-making style for both males and females, with no statistically significant difference between the experimental and control groups. The analyses also showed a statistically significant relationship between decision-making styles and adult attachment styles for both males and females; in this case, only for males was there a statistically significant difference between the experimental and the control group. Another statistically significant relationship was found between decision-making styles and the adjustment scales for both males and females; here too, the difference between the two groups was found to be significant only for males. These results enrich the literature on decision-making styles in infertile individuals, showing also the predictive role of attachment styles and adjustment, and in this way confirming the few results in the literature.

Keywords: adjustment, attachment, decision-making style, infertility

Procedia PDF Downloads 307
28397 A Posterior Predictive Model-Based Control Chart for Monitoring Healthcare

Authors: Yi-Fan Lin, Peter P. Howley, Frank A. Tuyl

Abstract:

Quality measurement and reporting systems are used in healthcare internationally. In Australia, the Australian Council on Healthcare Standards records and reports hundreds of clinical indicators (CIs) nationally across the healthcare system. These CIs are measures of performance in the clinical setting, and are used as a screening tool to help assess whether a standard of care is being met. Existing analysis and reporting of these CIs incorporate Bayesian methods to address sampling variation; however, such assessments are retrospective in nature, reporting upon the previous six or twelve months of data. The use of Bayesian methods within statistical process control for monitoring systems is an important pursuit to support more timely decision-making. Our research has developed and assessed a new graphical monitoring tool, similar to a control chart, based on the beta-binomial posterior predictive (BBPP) distribution to facilitate the real-time assessment of health care organizational performance via CIs. The BBPP charts have been compared with the traditional Bernoulli CUSUM (BC) chart by simulation. The more traditional “central” and “highest posterior density” (HPD) interval approaches were each considered to define the limits, and the multiple charts were compared via in-control and out-of-control average run lengths (ARLs), assuming that the parameter representing the underlying CI rate (proportion of cases with an event of interest) required estimation. Preliminary results have identified that the BBPP chart with HPD-based control limits provides better out-of-control run length performance than the central interval-based and BC charts. Further, the BC chart’s performance may be improved by using Bayesian parameter estimation of the underlying CI rate.
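
The BBPP distribution itself is straightforward to work with numerically; a minimal sketch, assuming a Beta(1, 1) prior and illustrative counts, derives HPD-style limits on the number of events in a future reporting period.

```python
# Minimal BBPP sketch, assuming a Beta(1, 1) prior and invented counts;
# the chart signals when a period's observed event count leaves the limits.
import numpy as np
from scipy.stats import betabinom

a0, b0 = 1, 1                 # Beta prior on the underlying CI rate
s, f = 12, 188                # events / non-events observed so far
m = 100                       # cases expected in the next reporting period

dist = betabinom(m, a0 + s, b0 + f)   # posterior predictive of future events

# HPD-style region of a discrete pmf: keep the most probable counts
# until the target coverage is reached
k = np.arange(m + 1)
pmf = dist.pmf(k)
order = np.argsort(pmf)[::-1]
n_keep = np.searchsorted(np.cumsum(pmf[order]), 0.99) + 1
region = k[order[:n_keep]]
print(f"control limits on future event count: [{region.min()}, {region.max()}]")
```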

Keywords: average run length (ARL), bernoulli cusum (BC) chart, beta binomial posterior predictive (BBPP) distribution, clinical indicator (CI), healthcare organization (HCO), highest posterior density (HPD) interval

Procedia PDF Downloads 182
28396 Adolescent-Parent Relationship as the Most Important Factor in Preventing Mood Disorders in Adolescents: An Application of Artificial Intelligence to Social Studies

Authors: Elżbieta Turska

Abstract:

Introduction: One of the most difficult times in a person's life is adolescence. The experiences in this period may shape the future life of this person to a large extent. This is the reason why many young people experience sadness, dejection, hopelessness, a sense of worthlessness, as well as losing interest in various activities and social relationships, all of which are often classified as mood disorders. As many as 15-40% of adolescents experience depressed moods, and for most of them these resolve and are not carried into adulthood. However, 5-6% of those affected by mood disorders develop the depressive syndrome, and as many as 1-3% develop full-blown clinical depression. Materials: A large questionnaire was given to 2508 students aged 13-16 years, and one of its parts was the Burns checklist, i.e. the standard test for identifying depressed mood. The questionnaire asked about many aspects of the student's life and included a total of 53 questions, most of which had subquestions. It is important to note that the data suffered from many problems, the most important of which were missing data and collinearity. Aim: In order to identify the correlates of mood disorders, we built predictive models which were then trained and validated. Our aim was not to predict which students suffer from mood disorders but rather to explore the factors influencing mood disorders. Methods: The problems with the data described above practically excluded the use of classical statistical methods. For this reason, we attempted the following artificial intelligence (AI) methods: classification trees with surrogate variables, random forests, and xgboost. All analyses were carried out with the mlr package for the R programming language. Results: The predictive model built by the classification trees algorithm outperformed the other algorithms by a large margin. As a result, we were able to rank the variables (questions and subquestions from the questionnaire) from most to least influential as far as protection against mood disorder is concerned. Thirteen of the twenty most important variables reflect relationships with parents. This seems to be a significant result both from the cognitive point of view and from the practical point of view, i.e. as far as interventions to correct mood disorders are concerned.
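
The study itself used the mlr package in R; a rough Python analogue of the ranking step might look like the sketch below, with synthetic data and gradient-boosted trees in place of classification trees with surrogate splits, since they likewise tolerate missing answers.

```python
# Hypothetical Python analogue (not the study's R/mlr code): rank
# questionnaire variables by permutation importance, with missing answers
# handled natively by the model. All data here are synthetic.
import numpy as np
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
X = rng.normal(size=(2508, 53))                 # 53 questionnaire variables
X[rng.random(X.shape) < 0.15] = np.nan          # ~15% missing answers
y = (rng.random(2508) < 0.25).astype(int)       # 1 = depressed mood (Burns)

clf = HistGradientBoostingClassifier(random_state=0).fit(X, y)
imp = permutation_importance(clf, X, y, n_repeats=5, random_state=0)
ranking = np.argsort(imp.importances_mean)[::-1]
print("most influential variables:", ranking[:10])
```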

Keywords: mood disorders, adolescents, family, artificial intelligence

Procedia PDF Downloads 80
28395 Privacy Concerns and Law Enforcement Data Collection to Tackle Domestic and Sexual Violence

Authors: Francesca Radice

Abstract:

In Australia, domestic and sexual violence causes, on average, one female death per week through intimate violence behaviours. 83% of couples meet online, and intercepting domestic and sexual violence at this level would be beneficial. Violent or coercive behaviour has been observed to be apparent from initial conversations on dating apps like Tinder. Criminal offences arising from dating apps include child pornography, stalking, and coercive control, and women have been murdered after finding partners through Tinder. Police databases and predictive policing are novel approaches taken to prevent crime before harm is done. This research will investigate how police databases can be used in a privacy-preserving way to characterise users in terms of their potential for violent crime. Using the COPS database of NSW Police, we will explore how a past criminal record can be interpreted to yield a category of potential danger for each dating app user. It is then up to each subscriber to judge what degree of potential danger they are prepared to accept. Sentiment analysis is an area where research into natural language processing has made great progress over the last decade. This research will investigate how sentiment analysis can be used to interpret interchanges between dating app users to detect manipulative or coercive sentiments, which can be used to alert law enforcement if they continue for a defined number of communications. One potential problem of this approach is the prejudice a categorisation can cause. Another drawback is the possibility of misinterpreting communications and involving law enforcement without reason. The approach will be thoroughly tested with cross-checks by human readers who verify both the level of danger predicted by the interpretation of the criminal record and the sentiment detected from personal messages. Even if only a few violent crimes can be prevented, the approach will have a tangible value for real people.

Keywords: sentiment analysis, data mining, predictive policing, virtual manipulation

Procedia PDF Downloads 58
28394 The Study of Power as a Pertinent Motive among Tribal College Students of Assam

Authors: K. P. Gogoi

Abstract:

The current study investigates the motivational pattern, viz. power motivation, among the tribal college students of Assam. The sample consisted of 240 college students (120 tribal and 120 non-tribal) ranging from 18-24 years, with 60 males and 60 females among both tribals and non-tribals. Attempts were made to include all the prominent tribes of Assam. The Thematic Apperception Test (TAT), the Power Motive Scale, and a semi-structured interview schedule were used to gather information about family types, parental deprivation, parental relations, and social and political belongingness. Mean, standard deviation, and t-test were the statistical measures adopted in this 2x2 factorial design study. In addition, discriminant analysis was worked out to strengthen the predictive validity of the obtained data. TAT scores reveal a significant difference between tribals and non-tribals on power motivation; however, results on gender difference indicate similar scores in both cultures. Cross-validation of the TAT results was done using the Power Motive Scale by T. S. Dapola, which confirms the TAT results on need for power. Power motivation has been studied in three directions, i.e. coercion, inducement, and restraint. An interesting finding is that tribals scored significantly higher on coercion, whereas non-tribals scored significantly higher on inducement or seduction; on restraint, no difference exists between the two cultures. Discriminant analysis was worked out between the variables n-power, coercion, inducement, and restraint. Results indicated that inducement or seduction (.502) is the dependent measure with the most discriminating power between these two cultures.

Keywords: power motivation, tribal, social, political, predictive validity, cross validation, coercion, inducement, restraint

Procedia PDF Downloads 465
28393 Role of Imaging in Predicting the Receptor Positivity Status in Lung Adenocarcinoma: A Chapter in Radiogenomics

Authors: Sonal Sethi, Mukesh Yadav, Abhimanyu Gupta

Abstract:

The upcoming field of radiogenomics has the potential to upgrade the role of imaging in lung cancer management through noninvasive characterization of tumor histology and genetic microenvironment. Receptor positivity, such as epidermal growth factor receptor (EGFR) and anaplastic lymphoma kinase (ALK) genotyping, is critical for treatment in lung adenocarcinoma. As conventional identification of receptor positivity is an invasive procedure, we analyzed features on non-invasive computed tomography (CT) that predict receptor positivity in lung adenocarcinoma. We retrospectively carried out a comprehensive study of 77 proven lung adenocarcinoma patients with CT images, EGFR and ALK receptor genotyping, and clinical information. In total, 22/77 patients were receptor-positive (15 had only an EGFR mutation, 6 had an ALK mutation, and 1 had both EGFR and ALK mutations). Various morphological characteristics and the metastatic distribution on CT were analyzed along with the clinical information. Univariate and multivariable logistic regression analyses were used. On multivariable logistic regression analysis, we found that spiculated margin, lymphangitic spread, air bronchogram, pleural effusion, and distant metastasis had significant predictive value for receptor mutation status. On univariate analysis, air bronchogram and pleural effusion had significant individual predictive value. Conclusions: Receptor-positive lung cancer has characteristic imaging features compared with receptor-negative lung adenocarcinoma. Since CT is routinely used in lung cancer diagnosis, receptor positivity can be predicted by a noninvasive technique, allowing a more aggressive algorithm for the evaluation of distant metastases as well as for treatment.
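
The regression step corresponds to standard logistic modelling. The sketch below uses synthetic stand-ins for three of the CT features (the 77-patient data are not public, and the coefficients are invented) to show how odds ratios and p-values per feature are obtained.

```python
# Sketch of the univariate/multivariable logistic-regression step with
# synthetic stand-ins for the CT features; all numbers are illustrative.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 77
df = pd.DataFrame({
    "spiculated_margin": rng.integers(0, 2, n),
    "air_bronchogram":   rng.integers(0, 2, n),
    "pleural_effusion":  rng.integers(0, 2, n),
})
# invented relationship between features and mutation status
logit = -1.5 + 1.0*df.spiculated_margin + 0.8*df.air_bronchogram
df["mutation"] = (rng.random(n) < 1/(1 + np.exp(-logit))).astype(int)

X = sm.add_constant(df[["spiculated_margin", "air_bronchogram",
                        "pleural_effusion"]])
fit = sm.Logit(df["mutation"], X).fit(disp=0)   # multivariable model
print(np.exp(fit.params))                        # odds ratios per CT feature
print(fit.pvalues)
```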

Keywords: lung cancer, multidisciplinary cancer care, oncologic imaging, radiobiology

Procedia PDF Downloads 97
28392 Psychosocial Development: The Study of Adaptation and Development and Post-Retirement Satisfaction in Ageing Australians

Authors: Sahar El-Achkar, Mizan Ahmad

Abstract:

Poor adaptation to developmental milestones over the lifespan can significantly impact emotional experiences and Satisfaction with Life (SWL) post-retirement. Thus, it is important to understand how adaptive behaviour over the life course can predict emotional experiences. Broadly, emotional experiences are either Positive Affect (PA) or Negative Affect (NA). This study sought to explore the impact of successful adaptation to developmental milestones throughout one's life on emotional experiences and satisfaction with life following retirement. A cross-sectional self-report survey was completed by 132 Australian retirees between the ages of 55 and 70 years. Three hierarchical regression models were fitted, controlling for age and gender, to predict PA, NA, and SWL. The full model predicting PA was statistically significant overall, F(8, 121) = 17.97, p < .001, accounting for 57% of the variability in PA; Industry/Inferiority was significantly predictive of PA. The full model predicting NA was statistically significant overall, F(8, 121) = 12.00, p < .001, accounting for 51% of the variability in NA; age and Trust/Mistrust were significantly predictive of NA. The full model predicting SWL was statistically significant overall, F(8, 121) = 11.05, p < .001, accounting for 45% of the variability in SWL; Trust/Mistrust and Ego Integrity/Despair were significantly predictive of SWL. A sense of industry post-retirement is important in generating PA. These results highlight that individuals presenting with adaptation and identity issues are likely to present with adjustment challenges and unpleasant emotional experiences post-retirement. This supports the importance of identifying and understanding the benefits of successful adaptation and development throughout the lifespan and its significance for the self-concept. Most importantly, the quality of life of many may be improved, and the future risk of continued poor emotional experiences and low SWL post-retirement may be mitigated. Specifically, the clinical implication of these findings is that they support the promotion of successful adaptation over the life course and healthy ageing.

Keywords: adaptation, development, negative affect, positive affect, retirement, satisfaction with life

Procedia PDF Downloads 51
28391 The Effect of Feature Selection on Pattern Classification

Authors: Chih-Fong Tsai, Ya-Han Hu

Abstract:

The aim of feature selection (or dimensionality reduction) is to filter out unrepresentative features (or variables), making the classifier perform better than one without feature selection. Although there are many well-known feature selection algorithms, and different classifiers based on different selection results may perform differently, very few studies have examined the effect of performing different feature selection algorithms on the classification performances of different classifiers over different types of datasets. In this paper, two widely used algorithms, the genetic algorithm (GA) and information gain (IG), are used to perform feature selection. In addition, three well-known classifiers are constructed: the CART decision tree (DT), the multi-layer perceptron (MLP) neural network, and the support vector machine (SVM). Based on 14 different types of datasets, the experimental results show that in most cases IG is a better feature selection algorithm than GA. In addition, the combinations of IG with DT and IG with SVM perform best and second best for small and large scale datasets.
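
The IG arm of such a comparison can be sketched as follows, using mutual information as the information-gain criterion and a decision tree as one of the three classifiers; the data are synthetic, and the GA arm and the 14 benchmark datasets are not reproduced.

```python
# Sketch of the information-gain (IG) side of the comparison: score a tree
# classifier with and without top-k feature selection on synthetic data.
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=500, n_features=40, n_informative=6,
                           random_state=0)

dt_all = DecisionTreeClassifier(random_state=0)
dt_ig = make_pipeline(SelectKBest(mutual_info_classif, k=10),
                      DecisionTreeClassifier(random_state=0))
print("DT, all 40 features :", cross_val_score(dt_all, X, y, cv=5).mean())
print("DT, top 10 by IG    :", cross_val_score(dt_ig, X, y, cv=5).mean())
```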

Keywords: data mining, feature selection, pattern classification, dimensionality reduction

Procedia PDF Downloads 641
28390 Transfer Function Model-Based Predictive Control for Nuclear Core Power Control in PUSPATI TRIGA Reactor

Authors: Mohd Sabri Minhat, Nurul Adilla Mohd Subha

Abstract:

The 1 MWth PUSPATI TRIGA Reactor (RTP) at the Malaysian Nuclear Agency has been operating for more than 35 years. The existing core power control uses a conventional controller known as the Feedback Control Algorithm (FCA). It is technically challenging to keep the core power output stable and operating within acceptable error bands to meet the safety demands of the RTP. The current power tracking performance could be considered unsatisfactory, and there is significant room for improvement. Hence, a new core power control design is very important for improving the tracking and regulation of reactor power by controlling the movement of control rods in a manner that suits the demands of highly sensitive nuclear reactor power control. In this paper, the proposed Model Predictive Control (MPC) law is applied to control the core power. The model for core power control was based on mathematical models of the reactor core, MPC, and a control rod selection algorithm. The mathematical models of the reactor core were based on the point kinetics model, thermal hydraulic models, and reactivity models. The proposed MPC was presented as a transfer function model of the reactor core according to perturbation theory. The transfer function model-based predictive control (TFMPC) was developed to design the core power control with predictions based on a T-filter, towards real-time implementation of MPC on hardware. This paper introduces the sensitivity functions for the TFMPC feedback loop to reduce the impact on the input actuation signal and demonstrates the behaviour of TFMPC in terms of disturbance and noise rejection. Both the tracking and regulating performance of the conventional controller and of TFMPC were compared using MATLAB and analysed. In conclusion, the proposed TFMPC has satisfactory tracking and regulating performance for controlling nuclear reactor core power with high reliability and safety.
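
For intuition only, a drastically simplified, unconstrained receding-horizon MPC on an assumed discrete state-space plant is sketched below; the paper's reactor model, T-filter, and control rod selection logic are far more elaborate.

```python
# Highly simplified unconstrained MPC sketch (illustrative plant matrices,
# not the reactor model): predict over a horizon, solve a least-squares
# problem for the input sequence, apply only the first move.
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.9]])   # assumed discrete plant
B = np.array([[0.0], [0.1]])
C = np.array([[1.0, 0.0]])

Np, lam = 20, 0.01                        # prediction horizon, input weight
# prediction matrices: y = F x0 + Phi u over the horizon
F = np.vstack([C @ np.linalg.matrix_power(A, i + 1) for i in range(Np)])
Phi = np.zeros((Np, Np))
for i in range(Np):
    for j in range(i + 1):
        Phi[i, j] = (C @ np.linalg.matrix_power(A, i - j) @ B).item()

x = np.zeros((2, 1))
ref = np.ones((Np, 1))                    # step demand in output power
for t in range(50):
    # minimise ||ref - F x - Phi u||^2 + lam ||u||^2 (receding horizon)
    u_seq = np.linalg.solve(Phi.T @ Phi + lam * np.eye(Np),
                            Phi.T @ (ref - F @ x))
    u = u_seq[0, 0]                       # apply only the first move
    x = A @ x + B * u
print("output after 50 steps:", (C @ x).item())
```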

Keywords: core power control, model predictive control, PUSPATI TRIGA reactor, TFMPC

Procedia PDF Downloads 217
28389 Efficient Credit Card Fraud Detection Based on Multiple ML Algorithms

Authors: Neha Ahirwar

Abstract:

In the contemporary digital era, the rise of credit card fraud poses a significant threat to both financial institutions and consumers. As fraudulent activities become more sophisticated, there is an escalating demand for robust and effective fraud detection mechanisms. Advanced machine learning algorithms have become crucial tools in addressing this challenge. This paper conducts a thorough examination of the design and evaluation of a credit card fraud detection system, utilizing four prominent machine learning algorithms: random forest, logistic regression, decision tree, and XGBoost. The surge in digital transactions has opened avenues for fraudsters to exploit vulnerabilities within payment systems. Consequently, there is an urgent need for proactive and adaptable fraud detection systems. This study addresses this imperative by exploring the efficacy of machine learning algorithms in identifying fraudulent credit card transactions. The selection of random forest, logistic regression, decision tree, and XGBoost for scrutiny in this study is based on their documented effectiveness in diverse domains, particularly in credit card fraud detection. These algorithms are renowned for their capability to model intricate patterns and provide accurate predictions. Each algorithm is implemented and evaluated for its performance in a controlled environment, utilizing a diverse dataset comprising both genuine and fraudulent credit card transactions.
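
One practical detail such systems must handle is the heavy class imbalance of fraud data. The hedged sketch below (synthetic data, illustrative weighting) compares two of the named algorithms under imbalance using AUC-PR, which is more informative than accuracy in this setting.

```python
# Illustrative sketch of imbalance-aware training for fraud detection
# (synthetic data; assumes the xgboost package is installed).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5000, n_features=20, weights=[0.98],
                           random_state=0)        # ~2% "fraud" class
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

ratio = (y_tr == 0).sum() / (y_tr == 1).sum()     # negative/positive ratio
models = {
    "LogReg (balanced)": LogisticRegression(class_weight="balanced",
                                            max_iter=1000),
    "XGBoost (weighted)": XGBClassifier(scale_pos_weight=ratio,
                                        eval_metric="logloss"),
}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    p = m.predict_proba(X_te)[:, 1]
    print(f"{name}: AUC-PR = {average_precision_score(y_te, p):.3f}")
```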

Keywords: efficient credit card fraud detection, random forest, logistic regression, XGBoost, decision tree

Procedia PDF Downloads 32
28388 Unveiling Comorbidities in Irritable Bowel Syndrome: A UK BioBank Study utilizing Supervised Machine Learning

Authors: Uswah Ahmad Khan, Muhammad Moazam Fraz, Humayoon Shafique Satti, Qasim Aziz

Abstract:

Approximately 10-14% of the global population experiences a functional disorder known as irritable bowel syndrome (IBS). The disorder is defined by persistent abdominal pain and an irregular bowel pattern. IBS significantly impairs work productivity and disrupts patients' daily lives and activities. Although IBS is widespread, there is still an incomplete understanding of its underlying pathophysiology. This study aims to help characterize the phenotype of IBS patients by differentiating the comorbidities found in IBS patients from those in non-IBS patients using machine learning algorithms. We extracted samples coding for IBS from the UK Biobank cohort and randomly selected patients without a code for IBS to create a total sample size of 18,000. We selected the codes for comorbidities of these cases from 2 years before and after their IBS diagnosis and compared them to the comorbidities in the non-IBS cohort. Machine learning models, including decision trees, gradient boosting, support vector machines (SVM), AdaBoost, logistic regression, and XGBoost, were employed to assess their accuracy in predicting IBS. The most accurate model was then chosen to identify the features associated with IBS; in our case, we used XGBoost feature importance as the feature selection method. We applied different models to the top 10% of features, which numbered 50. The gradient boosting, logistic regression, and XGBoost algorithms yielded a diagnosis of IBS with optimal accuracies of 71.08%, 71.427%, and 71.53%, respectively. The comorbidities most closely associated with IBS included gut diseases (haemorrhoids, diverticular diseases), atopic conditions (asthma), and psychiatric comorbidities (depressive episodes or disorder, anxiety). This finding emphasizes the need for a comprehensive approach when evaluating the phenotype of IBS, suggesting the possibility of identifying new subsets of IBS rather than relying solely on the conventional classification based on stool type. Additionally, our study demonstrates the potential of machine learning algorithms in predicting the development of IBS based on comorbidities, which may enhance diagnosis and facilitate better management of modifiable risk factors for IBS. Further research is necessary to confirm our findings and establish cause and effect. Alternative feature selection methods and even larger and more diverse datasets may lead to more accurate classification models. Despite these limitations, our findings highlight the effectiveness of logistic regression and XGBoost in predicting IBS diagnosis.
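
A compressed sketch of the described pipeline, with a synthetic binary comorbidity matrix standing in for the UK Biobank extract: rank codes by XGBoost feature importance, keep the top 10%, and refit a simpler model on that subset.

```python
# Sketch of the selection pipeline (synthetic comorbidity matrix, toy signal;
# assumes the xgboost package is installed).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
n, p = 2000, 500                       # patients x comorbidity codes
X = (rng.random((n, p)) < 0.05).astype(int)
y = (X[:, :5].sum(axis=1) + rng.random(n) > 1).astype(int)   # toy signal

xgb = XGBClassifier(eval_metric="logloss").fit(X, y)
top = np.argsort(xgb.feature_importances_)[::-1][: p // 10]  # top 10%

lr = LogisticRegression(max_iter=1000)
print("LogReg on top-10% features:",
      cross_val_score(lr, X[:, top], y, cv=5).mean())
```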

Keywords: comorbidities, disease association, irritable bowel syndrome (IBS), predictive analytics

Procedia PDF Downloads 90
28387 An Experimental Study for Assessing Email Classification Attributes Using Feature Selection Methods

Authors: Issa Qabaja, Fadi Thabtah

Abstract:

Email phishing classification is one of the vital problems in the online security research domain that has attracted several scholars due to its impact on users' payments performed daily online. One aspect of reaching good performance with detection algorithms in the email phishing problem is identifying the minimal set of features that significantly impact the phishing detection rate. This paper investigates three known feature selection methods, named Information Gain (IG), Chi-square, and Correlation Feature Set (CFS), on the email phishing problem to separate highly influential features from less influential ones in phishing detection. We measure the degree of influence by applying four data mining algorithms to a large set of features. We compare the accuracy of these algorithms on the complete feature set before feature selection has been applied and after feature selection has been applied. After conducting the experiments, the results show that 12 common significant features were chosen among the considered features by the feature selection methods. Further, the average detection accuracy derived by the data mining algorithms on the reduced 12-feature set was only slightly affected when compared with the one derived from the 47-feature set.

Keywords: data mining, email classification, phishing, online security

Procedia PDF Downloads 407
28386 Data Mining Model for Predicting the Status of HIV Patients during Drug Regimen Change

Authors: Ermias A. Tegegn, Million Meshesha

Abstract:

Human Immunodeficiency Virus and Acquired Immunodeficiency Syndrome (HIV/AIDS) is a major cause of death in most African countries. Ethiopia is one of the seriously affected countries in sub-Saharan Africa. Previously in Ethiopia, having HIV/AIDS was almost equivalent to a death sentence. With the introduction of Antiretroviral Therapy (ART), HIV/AIDS has become a chronic but manageable disease. This study focused on a data mining technique to predict the future living status of HIV/AIDS patients at the time of drug regimen change, when patients develop toxicity to the ART drug combination they are currently taking. The data were taken from the University of Gondar Hospital ART program database. A hybrid methodology was followed to explore the application of data mining to the ART program dataset. Data cleaning, handling of missing values, and data transformation were used for preprocessing the data. WEKA 3.7.9 data mining tools, classification algorithms, and expertise were utilized as means to address the research problem. By using four different classification algorithms (i.e., the J48 classifier, PART rule induction, Naïve Bayes, and a neural network) and by adjusting their parameters, thirty-two models were built on the preprocessed University of Gondar ART program dataset. The performances of the models were evaluated using the standard metrics of accuracy, precision, recall, and F-measure. The most effective model for predicting the status of HIV patients with drug regimen substitution was a pruned J48 decision tree with a classification accuracy of 98.01%. This study extracts interesting attributes such as ever taking Cotrim, ever taking TbRx, CD4 count, age, weight, and gender so as to predict the status of drug regimen substitution. The outcome of this study can be used as an assistant tool for clinicians to help them make more appropriate drug regimen substitutions. Future research directions are forwarded to come up with an applicable system in the area of the study.

Keywords: HIV drug regimen, data mining, hybrid methodology, predictive model

Procedia PDF Downloads 115
28385 An Optimal Steganalysis Based Approach for Embedding Information in Image Cover Media with Security

Authors: Ahlem Fatnassi, Hamza Gharsellaoui, Sadok Bouamama

Abstract:

This paper deals with the fields of steganography and steganalysis. Steganography involves hiding information in a cover media to obtain the stego media, in such a way that the cover media is perceived not to have any embedded message by its unintended recipients. Steganalysis is the mechanism of detecting the presence of hidden information in the stego media, and it can lead to the prevention of disastrous security incidents. In this paper, we provide a critical review of the steganalysis algorithms available for analyzing the characteristics of an image stego media against the corresponding cover media and for understanding the process of embedding the information and its detection. We anticipate that this paper can also give a clear picture of the current trends in steganography, so that appropriate steganalysis algorithms can be developed and improved.

Keywords: optimization, heuristics and metaheuristics algorithms, embedded systems, low-power consumption, steganalysis heuristic approach

Procedia PDF Downloads 270
28384 Food Supply Chain Optimization: Achieving Cost Effectiveness Using Predictive Analytics

Authors: Jayant Kumar, Aarcha Jayachandran Sasikala, Barry Adrian Shepherd

Abstract:

The Public Distribution System (PDS) is a flagship welfare programme of the Government of India with both historical and political significance. Targeted at the lower sections of society, it is one of the largest supply chain networks in the world. There have been several studies by academics and the Planning Commission about the effectiveness of the system. Our study focuses on applying predictive analytics to aid the central body in tracking breaches of the service level agreement between the two echelons of the food supply chain. Each breach at a shop leads to a potential additional inventory carrying cost. Thus, through this study, we aim to show that, aided with such analytics, the network can be made more cost effective. The methods we illustrate in this study are applicable to other commercial supply chains as well.

Keywords: PDS, analytics, cost effectiveness, Karnataka, inventory cost, service level. JEL classification: C53

Procedia PDF Downloads 505
28383 Molecular Topology and TLC Retention Behaviour of s-Triazines: QSRR Study

Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević

Abstract:

Quantitative structure-retention relationship (QSRR) analysis was used to predict the chromatographic behavior of s-triazine derivatives using theoretical descriptors computed from the chemical structure. The fundamental basis of the reported investigation is to relate molecular topological descriptors to the chromatographic behavior of s-triazine derivatives obtained by reversed-phase (RP) thin layer chromatography (TLC) on silica gel impregnated with paraffin oil, with applied ethanol-water mobile phases (φ = 0.5-0.8, v/v). The retention parameter (RM0) of the 14 investigated s-triazine derivatives was used as the dependent variable, while simple connectivity indices of different orders were used as independent variables. The best QSRR model for predicting the RM0 value was obtained with the simple third-order connectivity index (3χ) in a second-degree polynomial equation. Numerical values of the correlation coefficient (r = 0.915), Fisher's value (F = 28.34), and root mean square error (RMSE = 0.36) indicate that the model is statistically significant. In order to test the predictive power of the QSRR model, the leave-one-out cross-validation technique was applied. The parameters of the internal cross-validation analysis (r2CV = 0.79, r2adj = 0.81, PRESS = 1.89) reflect the high predictive ability of the generated model and confirm that it can be used to predict the RM0 value. A multivariate classification technique, hierarchical cluster analysis (HCA), was applied in order to group molecules according to their molecular connectivity indices. HCA is a descriptive statistical method and one of the techniques most frequently used for classification, an important area of data processing. The HCA performed on the simple molecular connectivity indices obtained from the 2D structures of the investigated s-triazine compounds resulted in two main clusters in which the compounds were grouped according to the number of atoms in the molecule. This is in agreement with the fact that these descriptors were calculated on the basis of the number of atoms in the molecules of the investigated s-triazine derivatives.
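
The polynomial fit and its leave-one-out validation are easy to reproduce in outline; the fourteen (3χ, RM0) pairs below are invented, not the paper's measurements.

```python
# Sketch of a second-degree polynomial QSRR fit with leave-one-out
# cross-validation (invented data for fourteen compounds).
import numpy as np

chi3 = np.array([1.2, 1.5, 1.8, 2.0, 2.3, 2.6, 2.8, 3.1,
                 3.3, 3.6, 3.9, 4.1, 4.4, 4.7])            # 3-chi values
rm0 = (0.5 + 0.9*chi3 - 0.08*chi3**2
       + np.random.default_rng(0).normal(0, 0.1, 14))       # toy RM0 values

coef = np.polyfit(chi3, rm0, 2)                # RM0 = a*chi^2 + b*chi + c
pred = np.polyval(coef, chi3)
r = np.corrcoef(rm0, pred)[0, 1]
rmse = np.sqrt(np.mean((rm0 - pred) ** 2))

press = 0.0                                    # leave-one-out cross-validation
for i in range(len(chi3)):
    mask = np.arange(len(chi3)) != i
    c = np.polyfit(chi3[mask], rm0[mask], 2)
    press += (rm0[i] - np.polyval(c, chi3[i])) ** 2
r2cv = 1 - press / np.sum((rm0 - rm0.mean()) ** 2)
print(f"r={r:.3f}  RMSE={rmse:.3f}  PRESS={press:.3f}  r2_CV={r2cv:.3f}")
```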

Keywords: s-triazines, QSRR, chemometrics, chromatography, molecular descriptors

Procedia PDF Downloads 369
28382 The Impact of Temporal Impairment on Quality of Experience (QoE) in Video Streaming: A No Reference (NR) Subjective and Objective Study

Authors: Muhammad Arslan Usman, Muhammad Rehan Usman, Soo Young Shin

Abstract:

Live video streaming is one of the most widely used services among end users, yet it is a big challenge for network operators in terms of quality. The only way to provide excellent Quality of Experience (QoE) to end users is continuous monitoring of the live video stream. For this purpose, there are several objective algorithms available that monitor the quality of the video in a live stream. Subjective tests play a very important role in fine-tuning the results of objective algorithms; as human perception is considered to be the most reliable source for assessing the quality of a video stream, subjective tests are conducted in order to develop more reliable objective algorithms. Temporal impairments in a live video stream can have a negative impact on end users. In this paper, we conducted subjective evaluation tests on a set of video sequences containing the temporal impairment known as frame freezing. Frame freezing is considered a transmission error as well as a hardware error, which can result in the loss of video frames on the reception side of a transmission system. In our subjective tests, we evaluated videos that contain a single freezing event as well as videos that contain multiple freezing events. We recorded our subjective test results for all the videos in order to give a comparison of the available No Reference (NR) objective algorithms. Finally, we show the performance of the no-reference algorithms used for objective evaluation of the videos and suggest the algorithm that works best. The outcome of this study shows the importance of QoE and its effect on human perception. The results of the subjective evaluation can serve the purpose of validating objective algorithms.

Keywords: objective evaluation, subjective evaluation, quality of experience (QoE), video quality assessment (VQA)

Procedia PDF Downloads 582
28381 Neuroevolution Based on Adaptive Ensembles of Biologically Inspired Optimization Algorithms Applied for Modeling a Chemical Engineering Process

Authors: Sabina-Adriana Floria, Marius Gavrilescu, Florin Leon, Silvia Curteanu, Costel Anton

Abstract:

Neuroevolution is a subfield of artificial intelligence used to solve various problems in different application areas. Specifically, neuroevolution is a technique that applies biologically inspired methods to generate neural network architectures and optimize their parameters automatically. In this paper, we use different biologically inspired optimization algorithms in an ensemble strategy with the aim of training multilayer perceptron neural networks, resulting in regression models used to simulate the industrial chemical process of obtaining bricks from silicone-based materials. Installations in the raw ceramics industry, i.e., bricks, are characterized by significant energy consumption and large quantities of emissions. In addition, the initial conditions that were taken into account during the design and commissioning of the installation can change over time, which leads to the need to add new mixes to adjust the operating conditions for the desired purpose, e.g., material properties and energy saving. The present approach follows the study by simulation of a process of obtaining bricks from silicone-based materials, i.e., the modeling and optimization of the process. Optimization aims to determine the working conditions that minimize the emissions represented by nitrogen monoxide. We first use a search procedure to find the best values for the parameters of various biologically inspired optimization algorithms. Then, we propose an adaptive ensemble strategy that uses only a subset of the best algorithms identified in the search stage. The adaptive ensemble strategy combines the results of selected algorithms and automatically assigns more processing capacity to the more efficient algorithms. Their efficiency may also vary at different stages of the optimization process. In a given ensemble iteration, the most efficient algorithms aim to maintain good convergence, while the less efficient algorithms can improve population diversity. The proposed adaptive ensemble strategy outperforms the individual optimizers and the non-adaptive ensemble strategy in convergence speed, and the obtained results provide lower error values.

Keywords: optimization, biologically inspired algorithm, neuroevolution, ensembles, bricks, emission minimization

Procedia PDF Downloads 76
28380 Development of Digital Twin Concept to Detect Abnormal Changes in Structural Behaviour

Authors: Shady Adib, Vladimir Vinogradov, Peter Gosling

Abstract:

Digital Twin (DT) technology is a new technology that appeared in the early 21st century. The DT is defined as the digital representation of living and non-living physical assets. By connecting the physical and virtual assets, data are transmitted smoothly, allowing the virtual asset to fully represent the physical asset. Although many studies have been conducted on the DT concept, there is still limited information about the ability of DT models to monitor and detect unexpected changes in structural behaviour in real time. This is due to the large computational effort required for the analysis and the excessively large amount of data transferred from sensors. This paper aims to develop the DT concept to detect abnormal changes in structural behaviour in real time using advanced modelling techniques, deep learning algorithms, and data acquisition systems, taking model uncertainties into consideration. Finite element (FE) models were first developed offline to be used with a reduced basis (RB) model order reduction technique for the construction of a low-dimensional space to speed up the analysis during the online stage. The RB model was validated against experimental test results for the establishment of a DT model of a two-dimensional truss. The established DT model and deep learning algorithms were used to identify the location of damage once it appeared during the online stage. Finally, the RB model was used again to identify the damage severity. It was found that using the RB model, constructed offline, speeds up the FE analysis during the online stage. The constructed RB model showed higher accuracy for predicting the damage severity, while deep learning algorithms were found to be useful for estimating the location of damage with small severity.

Keywords: data acquisition system, deep learning, digital twin, model uncertainties, reduced basis, reduced order model

Procedia PDF Downloads 76
28379 Geophysical Methods and Machine Learning Algorithms for Stuck Pipe Prediction and Avoidance

Authors: Ammar Alali, Mahmoud Abughaban

Abstract:

Cost reduction and drilling optimization are the goals of many drilling operators. Historically, stuck pipe incidents were a major segment of non-productive time (NPT) associated costs. Traditionally, stuck pipe problems are treated as part of operations and solved post-sticking. However, the real key to savings and success is in predicting stuck pipe incidents and avoiding the conditions leading to their occurrence. Previous attempts at stuck-pipe prediction have neglected the local geology of the problem. The proposed predictive tool utilizes geophysical data processing techniques and Machine Learning (ML) algorithms to predict drilling events in real time from surface drilling data with minimum computational power. The method combines two types of analysis: (1) real-time prediction, and (2) cause analysis. Real-time prediction aggregates the input data, including historical drilling surface data, geological formation tops, and petrophysical data, from wells within the same field. The input data are then flattened per geological formation and stacked per stuck-pipe incident. The algorithm uses two physical methods (stacking and flattening) to filter any noise in the signature and create a robust pre-determined pilot that adheres to the local geology. Once the drilling operation starts, the Wellsite Information Transfer Standard Markup Language (WITSML) live surface data are fed into a matrix and aggregated at a frequency similar to that of the pre-determined signature. Then, in real time, the matrix is correlated with the pre-determined stuck-pipe signature for the field. The correlation used is a machine learning Correlation-based Feature Selection (CFS) algorithm, which selects relevant features from the class and identifies redundant features. The correlation output is interpreted as a probability curve for stuck pipe incidents in real time. Once this probability passes a fixed threshold defined by the user, the other component, cause analysis, alerts the user of the expected incident based on the pre-determined signatures, and a set of recommendations is provided to reduce the associated risk. The validation process involved feeding historical drilling data from an onshore oil field as a live stream, mimicking actual drilling conditions. Pre-determined signatures had been created beforehand for three problematic geological formations in this field. Three wells were processed as case studies, and the stuck-pipe incidents were predicted successfully, with an accuracy of 76%. This accuracy of detection could have resulted in around a 50% reduction in NPT, equivalent to a 9% cost saving in comparison with offset wells. The prediction of the stuck pipe problem requires a method to capture geological, geophysical, and drilling data, and to recognize the indicators of this issue at the field and geological formation level. This paper illustrates the efficiency and robustness of the proposed cross-disciplinary approach in its ability to produce such signatures and predict this NPT event.

Keywords: drilling optimization, hazard prediction, machine learning, stuck pipe

Procedia PDF Downloads 193
28378 A Comparative Asessment of Some Algorithms for Modeling and Forecasting Horizontal Displacement of Ialy Dam, Vietnam

Authors: Kien-Trinh Thi Bui, Cuong Manh Nguyen

Abstract:

In order to simulate and reproduce the operational characteristics of a dam visually, it is necessary to capture the displacement at different measurement points and analyze the observed movement data promptly to forecast dam safety. The accuracy of forecasts is further improved by applying machine learning methods to the data analysis process. In this study, the horizontal displacement monitoring data of the Ialy hydroelectric dam (Vietnam) were applied to machine learning algorithms, namely Gaussian processes, multi-layer perceptron (MLP) neural networks, and the M5-Rules algorithm, for modelling and forecasting the horizontal displacement of the dam. The database used in this research was built by collecting time series of data from 2006 to 2021 and divided into two parts: a training dataset and a validation dataset. The final results show that all three algorithms have high performance for both training and model validation, but the MLP is the best model. Their usability is further investigated by comparison with benchmark models created by multi-linear regression. The results show that the performance obtained from the GP model, the MLP model, and the M5-Rules model is much better; therefore, these three models should be used to analyze and predict the horizontal displacement of the dam.
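
The Gaussian-process arm of such a comparison can be sketched as follows; the displacement series is simulated, and the Ialy monitoring data, the MLP, and the M5-Rules model are not reproduced here.

```python
# Sketch of Gaussian-process regression on a simulated displacement series:
# fit on the first 100 epochs, forecast the remaining 20.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
t = np.linspace(0, 15, 120)[:, None]            # years of monitoring
d = 2.0*np.sin(0.4*t.ravel()) + 0.3*t.ravel() + rng.normal(0, 0.2, 120)

kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=0.1)
gp = GaussianProcessRegressor(kernel=kernel).fit(t[:100], d[:100])

pred, std = gp.predict(t[100:], return_std=True)   # forecast with uncertainty
rmse = np.sqrt(np.mean((pred - d[100:]) ** 2))
print(f"forecast RMSE = {rmse:.3f} mm")
```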

Keywords: Gaussian processes, horizontal displacement, hydropower dam, Ialy dam, M5-Rules, multi-layer perceptron neural networks

Procedia PDF Downloads 177
28377 A General Framework for Knowledge Discovery from Echocardiographic and Natural Images

Authors: S. Nandagopalan, N. Pradeep

Abstract:

The aim of this paper is to propose a general framework for storing, analyzing, and extracting knowledge from two-dimensional echocardiographic images, color Doppler images, non-medical images, and general data sets. A number of high-performance data mining algorithms have been used to carry out this task. Our framework encompasses four layers, namely physical storage, object identification, knowledge discovery, and user level. Techniques such as the active contour model to identify the cardiac chambers, pixel classification to segment the color Doppler echo image, a universal model for image retrieval, Bayesian methods for classification, and parallel algorithms for image segmentation were employed. Using the feature vector database that has been efficiently constructed, one can perform various data mining tasks like clustering and classification with efficient algorithms, along with image mining given a query image. All these facilities are included in the framework, which is supported by a state-of-the-art user interface (UI). The algorithms were tested with actual patient data and the Corel image database, and the results show that their performance is better than previously reported results.

Keywords: active contour, Bayesian, echocardiographic image, feature vector

Procedia PDF Downloads 417
28376 Troubleshooting Petroleum Equipment Based on Wireless Sensors Based on Bayesian Algorithm

Authors: Vahid Bayrami Rad

Abstract:

In this research, common methods and techniques are investigated with a focus on intelligent fault-finding and monitoring systems in the oil industry. Remote and intelligent control methods are considered a necessity for implementing various operations in the oil industry, and benefiting from the knowledge extracted, with the help of data mining algorithms, from the countless data generated is an unavoidable way to speed up the operational processes for monitoring and troubleshooting in today's big oil companies. Therefore, by comparing data mining algorithms and examining their efficiency, structure, and response under different conditions, the proposed (Bayesian) algorithm, using data clustering and analysis together with data evaluation based on a colored Petri net, provides an applicable and dynamic model in terms of reliability and response time. By using this method, it is possible to achieve a dynamic and consistent model of the remote control system, prevent the occurrence of leakage in oil pipelines and refineries, and reduce costs and human and financial errors. The statistical data obtained from the evaluation process show an increase in reliability, availability, and speed for the proposed method compared to other previous methods.

Keywords: wireless sensors, petroleum equipment troubleshooting, Bayesian algorithm, colored Petri net, RapidMiner, data mining, reliability

Procedia PDF Downloads 37
28375 Evaluation of Dual Polarization Rainfall Estimation Algorithm Applicability in Korea: A Case Study on Biseulsan Radar

Authors: Chulsang Yoo, Gildo Kim

Abstract:

Dual polarization radar provides comprehensive information about rainfall by measuring multiple parameters. In Korea, the JPOLE and CSU-HIDRO algorithms are generally used for rainfall estimation. This study evaluated the local applicability of the JPOLE and CSU-HIDRO algorithms in Korea using observed rainfall data collected in August 2014 from the Biseulsan dual polarization radar and the KMA AWS. A total of 11,372 pairs of radar-ground rain rate data were classified, according to the thresholds of the synthetic algorithms, into suitable and unsuitable data. Evaluation criteria were then derived by comparing radar rain rates and ground rain rates for the entire, suitable, and unsuitable data, respectively. The results are as follows: (1) The radar rain rate equations including KDP were found to perform better in rainfall estimation than the other equations for both the JPOLE and CSU-HIDRO algorithms, and the thresholds involving specific differential phase were found to be adequately applied for both algorithms. (2) The radar rain rate equations based on horizontal reflectivity and differential reflectivity were found to perform poorly compared to the others, and the result was not improved even when only the suitable data were applied. Acknowledgments: This work was supported by the Basic Science Research Program through the National Research Foundation of Korea, funded by the Ministry of Education (NRF-2013R1A1A2011012).
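
The synthetic-algorithm logic, choosing a power-law estimator by thresholds on the polarimetric variables, can be sketched as below; the coefficients are typical textbook values, not the exact JPOLE or CSU-HIDRO constants.

```python
# Illustrative synthetic-algorithm logic: pick a power-law rain-rate
# estimator based on thresholds on the polarimetric variables. Coefficients
# and thresholds below are generic textbook values, not the exact
# JPOLE/CSU-HIDRO constants.
def rain_rate(zh_dbz, zdr_db, kdp_deg_km):
    zh = 10 ** (zh_dbz / 10.0)                 # reflectivity, linear units
    if kdp_deg_km > 0.3:                       # KDP reliable in heavier rain
        return 40.5 * kdp_deg_km ** 0.85       # R(KDP)
    if zdr_db > 0.5:
        return 0.0067 * zh ** 0.93 * 10 ** (-0.343 * zdr_db)  # R(Zh, Zdr)
    return (zh / 300.0) ** (1 / 1.4)           # fallback R(Z), Z = 300 R^1.4

print(rain_rate(45.0, 1.2, 1.1))               # mm/h for one radar gate
```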

Keywords: CSU-HIDRO algorithm, dual polarization radar, JPOLE algorithm, radar rainfall estimation algorithm

Procedia PDF Downloads 189
28374 Prediction Factor of Recurrence Supraventricular Tachycardia After Adenosine Treatment in the Emergency Department

Authors: Welawat Tienpratarn, Chaiyaporn Yuksen, Rungrawin Promkul, Chetsadakon Jenpanitpong, Pajit Bunta, Suthap Jaiboon

Abstract:

Supraventricular tachycardia (SVT) is an abnormally fast atrial tachycardia characterized by narrow (≤ 120 ms) and constant QRS complexes. Adenosine is the drug of choice; the first dose is 6 mg, which can be repeated with second and third doses of 12 mg, with greater than 90% success. Previous work found that patients observed for 4 hours after return to normal sinus rhythm had no recurrence within 24 hours. The objective of this study was to investigate the factors that influence the recurrence of SVT after adenosine treatment in the emergency department (ED). This retrospective, exploratory prognostic study was conducted at the ED of the Faculty of Medicine, Ramathibodi Hospital, a university-affiliated super tertiary care hospital in Bangkok, Thailand, over a ten-year period between 2010 and 2020. The inclusion criteria were age > 15 years, visiting the ED with SVT, and treatment with adenosine; recurrence of SVT in the ED was recorded for these patients. A multivariable logistic regression model was used to develop the predictive model and prediction score for recurrent PSVT. 264 patients met the study criteria. Of those, 24 patients (10%) had recurrent PSVT. Five independent factors were predictive of recurrent PSVT, including age > 65 years, heart rate (after adenosine) > 100 per minute, structural heart disease, and the dose of adenosine. The clinical risk score developed to predict recurrent PSVT had an accuracy of 74.41%. A score of > 6 carried a likelihood ratio for recurrent PSVT of 5.71; thus, a clinical predictive score of > 6 was associated with recurrent PSVT in the ED.

Keywords: supraventricular tachycardia, recurrence, emergency department, adenosine

Procedia PDF Downloads 92