Search results for: Vehicle classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1669

Search results for: Vehicle classification

109 Formant Tracking Linear Prediction Model using HMMs for Noisy Speech Processing

Authors: Zaineb Ben Messaoud, Dorra Gargouri, Saida Zribi, Ahmed Ben Hamida

Abstract:

This paper presents a formant-tracking linear prediction (FTLP) model for speech processing in noise. The main focus of this work is the detection of formant trajectory based on Hidden Markov Models (HMM), for improved formant estimation in noise. The approach proposed in this paper provides a systematic framework for modelling and utilization of a time- sequence of peaks which satisfies continuity constraints on parameter; the within peaks are modelled by the LP parameters. The formant tracking LP model estimation is composed of three stages: (1) a pre-cleaning multi-band spectral subtraction stage to reduce the effect of residue noise on formants (2) estimation stage where an initial estimate of the LP model of speech for each frame is obtained (3) a formant classification using probability models of formants and Viterbi-decoders. The evaluation results for the estimation of the formant tracking LP model tested in Gaussian white noise background, demonstrate that the proposed combination of the initial noise reduction stage with formant tracking and LPC variable order analysis, results in a significant reduction in errors and distortions. The performance was evaluated with noisy natual vowels extracted from international french and English vocabulary speech signals at SNR value of 10dB. In each case, the estimated formants are compared to reference formants.

Keywords: Formants Estimation, HMM, Multi Band Spectral Subtraction, Variable order LPC coding, White Gauusien Noise.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1933
108 A Communication Signal Recognition Algorithm Based on Holder Coefficient Characteristics

Authors: Hui Zhang, Ye Tian, Fang Ye, Ziming Guo

Abstract:

Communication signal modulation recognition technology is one of the key technologies in the field of modern information warfare. At present, communication signal automatic modulation recognition methods are mainly divided into two major categories. One is the maximum likelihood hypothesis testing method based on decision theory, the other is a statistical pattern recognition method based on feature extraction. Now, the most commonly used is a statistical pattern recognition method, which includes feature extraction and classifier design. With the increasingly complex electromagnetic environment of communications, how to effectively extract the features of various signals at low signal-to-noise ratio (SNR) is a hot topic for scholars in various countries. To solve this problem, this paper proposes a feature extraction algorithm for the communication signal based on the improved Holder cloud feature. And the extreme learning machine (ELM) is used which aims at the problem of the real-time in the modern warfare to classify the extracted features. The algorithm extracts the digital features of the improved cloud model without deterministic information in a low SNR environment, and uses the improved cloud model to obtain more stable Holder cloud features and the performance of the algorithm is improved. This algorithm addresses the problem that a simple feature extraction algorithm based on Holder coefficient feature is difficult to recognize at low SNR, and it also has a better recognition accuracy. The results of simulations show that the approach in this paper still has a good classification result at low SNR, even when the SNR is -15dB, the recognition accuracy still reaches 76%.

Keywords: Communication signal, feature extraction, holder coefficient, improved cloud model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 651
107 Exploring Influence Range of Tainan City Using Electronic Toll Collection Big Data

Authors: Chen Chou, Feng-Tyan Lin

Abstract:

Big Data has been attracted a lot of attentions in many fields for analyzing research issues based on a large number of maternal data. Electronic Toll Collection (ETC) is one of Intelligent Transportation System (ITS) applications in Taiwan, used to record starting point, end point, distance and travel time of vehicle on the national freeway. This study, taking advantage of ETC big data, combined with urban planning theory, attempts to explore various phenomena of inter-city transportation activities. ETC, one of government's open data, is numerous, complete and quick-update. One may recall that living area has been delimited with location, population, area and subjective consciousness. However, these factors cannot appropriately reflect what people’s movement path is in daily life. In this study, the concept of "Living Area" is replaced by "Influence Range" to show dynamic and variation with time and purposes of activities. This study uses data mining with Python and Excel, and visualizes the number of trips with GIS to explore influence range of Tainan city and the purpose of trips, and discuss living area delimited in current. It dialogues between the concepts of "Central Place Theory" and "Living Area", presents the new point of view, integrates the application of big data, urban planning and transportation. The finding will be valuable for resource allocation and land apportionment of spatial planning.

Keywords: Big Data, ITS, influence range, living area, central place theory, visualization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 940
106 Forest Risk and Vulnerability Assessment: A Case Study from East Bokaro Coal Mining Area in India

Authors: Sujata Upgupta, Prasoon Kumar Singh

Abstract:

The expansion of large scale coal mining into forest areas is a potential hazard for the local biodiversity and wildlife. The objective of this study is to provide a picture of the threat that coal mining poses to the forests of the East Bokaro landscape. The vulnerable forest areas at risk have been assessed and the priority areas for conservation have been presented. The forested areas at risk in the current scenario have been assessed and compared with the past conditions using classification and buffer based overlay approach. Forest vulnerability has been assessed using an analytical framework based on systematic indicators and composite vulnerability index values. The results indicate that more than 4 km2 of forests have been lost from 1973 to 2016. Large patches of forests have been diverted for coal mining projects. Forests in the northern part of the coal field within 1-3 km radius around the coal mines are at immediate risk. The original contiguous forests have been converted into fragmented and degraded forest patches. Most of the collieries are located within or very close to the forests thus threatening the biodiversity and hydrology of the surrounding regions. Based on the vulnerability values estimated, it was concluded that more than 90% of the forested grids in East Bokaro are highly vulnerable to mining. The forests in the sub-districts of Bermo and Chandrapura have been identified as the most vulnerable to coal mining activities. This case study would add to the capacity of the forest managers and mine managers to address the risk and vulnerability of forests at a small landscape level in order to achieve sustainable development.

Keywords: Coal mining, forest, indicators, vulnerability.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1127
105 The Effects of TiO2 Nanoparticles on Tumor Cell Colonies: Fractal Dimension and Morphological Properties

Authors: T. Sungkaworn, W. Triampo, P. Nalakarn, D. Triampo, I. M. Tang, Y. Lenbury, P. Picha

Abstract:

Semiconductor nanomaterials like TiO2 nanoparticles (TiO2-NPs) approximately less than 100 nm in diameter have become a new generation of advanced materials due to their novel and interesting optical, dielectric, and photo-catalytic properties. With the increasing use of NPs in commerce, to date few studies have investigated the toxicological and environmental effects of NPs. Motivated by the importance of TiO2-NPs that may contribute to the cancer research field especially from the treatment prospective together with the fractal analysis technique, we have investigated the effect of TiO2-NPs on colony morphology in the dark condition using fractal dimension as a key morphological characterization parameter. The aim of this work is mainly to investigate the cytotoxic effects of TiO2-NPs in the dark on the growth of human cervical carcinoma (HeLa) cell colonies from morphological aspect. The in vitro studies were carried out together with the image processing technique and fractal analysis. It was found that, these colonies were abnormal in shape and size. Moreover, the size of the control colonies appeared to be larger than those of the treated group. The mean Df +/- SEM of the colonies in untreated cultures was 1.085±0.019, N= 25, while that of the cultures treated with TiO2-NPs was 1.287±0.045. It was found that the circularity of the control group (0.401±0.071) is higher than that of the treated group (0.103±0.042). The same tendency was found in the diameter parameters which are 1161.30±219.56 μm and 852.28±206.50 μm for the control and treated group respectively. Possible explanation of the results was discussed, though more works need to be done in terms of the for mechanism aspects. Finally, our results indicate that fractal dimension can serve as a useful feature, by itself or in conjunction with other shape features, in the classification of cancer colonies.

Keywords: Tumor growth, Cell colonies, TiO2, Nanoparticles, Fractal, Morphology, Aggregation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1963
104 Comparative Evaluation of Accuracy of Selected Machine Learning Classification Techniques for Diagnosis of Cancer: A Data Mining Approach

Authors: Rajvir Kaur, Jeewani Anupama Ginige

Abstract:

With recent trends in Big Data and advancements in Information and Communication Technologies, the healthcare industry is at the stage of its transition from clinician oriented to technology oriented. Many people around the world die of cancer because the diagnosis of disease was not done at an early stage. Nowadays, the computational methods in the form of Machine Learning (ML) are used to develop automated decision support systems that can diagnose cancer with high confidence in a timely manner. This paper aims to carry out the comparative evaluation of a selected set of ML classifiers on two existing datasets: breast cancer and cervical cancer. The ML classifiers compared in this study are Decision Tree (DT), Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Logistic Regression, Ensemble (Bagged Tree) and Artificial Neural Networks (ANN). The evaluation is carried out based on standard evaluation metrics Precision (P), Recall (R), F1-score and Accuracy. The experimental results based on the evaluation metrics show that ANN showed the highest-level accuracy (99.4%) when tested with breast cancer dataset. On the other hand, when these ML classifiers are tested with the cervical cancer dataset, Ensemble (Bagged Tree) technique gave better accuracy (93.1%) in comparison to other classifiers.

Keywords: Artificial neural networks, breast cancer, cancer dataset, classifiers, cervical cancer, F-score, logistic regression, machine learning, precision, recall, support vector machine.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1506
103 New Echocardiographic Morphofunctional Diastolic Index (MFDI) in Differentiation of Normal Left Ventricular Filling from Pseudonormal and Restrictive

Authors: N. Nelasov, D. Safonov, M. Babaev, E. Mirzojan, O. Eroshenko, M. Morgunov, A. Erofeeva

Abstract:

We have shown previously that reflected high intensity motion signals (RIMS) can be used for detection of left ventricular (LV) diastolic dysfunction (DD). It is also well known, that left atrial (LA) dimension can be used as a marker of DD. In this study we decided to analyze the diagnostic role of new echocardiographic morphofunctional diastolic index (MFDI) in differentiation of normal filling of LV from pseudonormal and restrictive. MFDI includes LA dimension and velocity of early diastolic component ea of RIMS (MFDI = LA/ea).  

343 healthy subjects and patients with various cardiac pathology underwent dopplerechocardiographic exam. According to the criteria of "Don" classification scheme 155 subjects had signs of normal LV filling (N) and 55 - of pseudonormal and restrictive filling (PN + R). LA dimension was performed in standard manner. RIMS were registered by conventional pulsed wave Doppler from apical 4-chamber view, when the sample volume was positioned between the tips of mitral leaflets. The velocity of early diastolic component of RIMS was measured. After calculation of MFDI mean values of this index in two groups (N and PN + R) were compared. The cutoff value of MFDI for differentiation of patients with N and PN + R was determined.

Mean value of MFDI in subjects with normal filling was 1.38+0.33 and in patients with pseudonormal and restrictive filling 2.43+0.43; p<0.0001. The cutoff value of MFDI > 2.0 separated subjects with normal LV filling from subjects with pseudonormal and restrictive filling with sensitivity 89.1% and specificity 97.4%.

Keywords: Dopplerechocardiography, diastolic dysfunction, left atrium, reflected high intensity motion signals.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1561
102 Role of Inflammatory Markers in Arthritic Rats Treated with Ethanolic Bark Extract of Albizia procera

Authors: M. Sangeetha, D. Chamundeeswari, C. Saravanababu, C. Rose, V. Gopal

Abstract:

Rheumatoid arthritis (RA) is a chronic, progressive, systemic inflammatory disorder affecting the synovial joints and typically producing symmetrical arthritis that leads to joint destruction, which is responsible for the deformity and disability. Despite improvements in the treatment of RA over the past decade, there still is a need for new therapeutic agents that are efficacious, less expensive, and free of severe adverse reactions. The present study aimed to investigate role of inflammatory markers in arthritic rats treated with ethanolic bark extract of Albizia procera. The protective effect of ethanolic bark extract of Albizia procera against complete Freund’s adjuvant (CFA) induced arthritis in rats. Arthritis was induced by an intradermal injection of 0.1 ml FCA in the foot pad of left hind limb of rats. ETBE (100 and 200 mg/kg b.wt./p.o) and the reference drug diclofenac (25 mg/kg b.wt./p.o) were administered to arthritic rats. Paw volume was measured for all the animals before inducing arthritis and thereafter once in seven days by using plethysmometer for 42 days. Gene expression of inflammatory markers such as IL-1β and IL-10 were investigated in paw tissues. Up regulation of IL-1β and Down regulation IL-10 were observed in CFA injected rats when compared to normal rats. ETBE attenuated these alterations dose dependently when compared to the vehicle treated rats. These results provide insights into the mechanism of anti-arthritic activity, and unravel potential therapeutic use of Albizia procera in arthritis.

Keywords: CFA-Complete Freund’s adjuvant, ETBE, Ethanolic Bark Extract, IL- Interleukins, RA-Rheumatoid Arthritis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1389
101 Development and Optimization of Colon Targeted Drug Delivery System of Ayurvedic Churna Formulation Using Eudragit L100 and Ethyl Cellulose as Coating Material

Authors: Anil Bhandari, Imran Khan Pathan, Peeyush K. Sharma, Rakesh K. Patel, Suresh Purohit

Abstract:

The purpose of this study was to prepare time and pH dependent release tablets of Ayurvedic Churna formulation and evaluate their advantages as colon targeted drug delivery system. The Vidangadi Churna was selected for this study which contains Embelin and Gallic acid. Embelin is used in Helminthiasis as therapeutic agent. Embelin is insoluble in water and unstable in gastric environment so it was formulated in time and pH dependent tablets coated with combination of two polymers Eudragit L100 and ethyl cellulose. The 150mg of core tablet of dried extract and lactose were prepared by wet granulation method. The compression coating was used in the polymer concentration of 150mg for both the layer as upper and lower coating tablet was investigated. The results showed that no release was found in 0.1 N HCl and pH 6.8 phosphate buffers for initial 5 hours and about 98.97% of the drug was released in pH 7.4 phosphate buffer in total 17 Hours. The in vitro release profiles of drug from the formulation could be best expressed first order kinetics as highest linearity (r2= 0.9943). The results of the present study have demonstrated that the time and pH dependent tablets system is a promising vehicle for preventing rapid hydrolysis in gastric environment and improving oral bioavailability of Embelin and Gallic acid for treatment of Helminthiasis.

Keywords: Embelin, Gallic acid, Vidangadi Churna, Colon targeted drug delivery.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2333
100 A Specification-Based Approach for Retrieval of Reusable Business Component for Software Reuse

Authors: Meng Fanchao, Zhan Dechen, Xu Xiaofei

Abstract:

Software reuse can be considered as the most realistic and promising way to improve software engineering productivity and quality. Automated assistance for software reuse involves the representation, classification, retrieval and adaptation of components. The representation and retrieval of components are important to software reuse in Component-Based on Software Development (CBSD). However, current industrial component models mainly focus on the implement techniques and ignore the semantic information about component, so it is difficult to retrieve the components that satisfy user-s requirements. This paper presents a method of business component retrieval based on specification matching to solve the software reuse of enterprise information system. First, a business component model oriented reuse is proposed. In our model, the business data type is represented as sign data type based on XML, which can express the variable business data type that can describe the variety of business operations. Based on this model, we propose specification match relationships in two levels: business operation level and business component level. In business operation level, we use input business data types, output business data types and the taxonomy of business operations evaluate the similarity between business operations. In the business component level, we propose five specification matches between business components. To retrieval reusable business components, we propose the measure of similarity degrees to calculate the similarities between business components. Finally, a business component retrieval command like SQL is proposed to help user to retrieve approximate business components from component repository.

Keywords: Business component, business operation, business data type, specification matching.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1374
99 Geosynthetic Reinforced Unpaved Road: Literature Study and Design Example

Authors: D. Jayalakshmi, S. Bhosale

Abstract:

This paper, in its first part, presents the state-of-the-art literature of design approaches for geosynthetic reinforced unpaved roads. The literature starting since 1970 and the critical appraisal of flexible pavement design by Giroud and Han (2004) and Jonathan Fannin (2006) is presented. The design example is illustrated for Indian conditions. The example emphasizes the results computed by Giroud and Han's (2004) design method with the Indian road congress guidelines by IRC SP 72 -2015. The input data considered are related to the subgrade soil condition of Maharashtra State in India. The unified soil classification of the subgrade soil is inorganic clay with high plasticity (CH), which is expansive with a California bearing ratio (CBR) of 2% to 3%. The example exhibits the unreinforced case and geotextile as reinforcement by varying the rut depth from 25 mm to 100 mm. The present result reveals the base thickness for the unreinforced case from the IRC design catalogs is in good agreement with Giroud and Han (2004) approach for a range of 75 mm to 100 mm rut depth. Since Giroud and Han (2004) method is applicable for both reinforced and unreinforced cases, for the same data with appropriate Nc factor, for the same rut depth, the base thickness for the reinforced case has arrived for the Indian condition. From this trial, for the CBR of 2%, the base thickness reduction due to geotextile inclusion is 35%. For the CBR range of 2% to 5% with different stiffness in geosynthetics, the reduction in base course thickness will be evaluated, and the validation will be executed by the full-scale accelerated pavement testing set up at the College of Engineering Pune (COE), India.

Keywords: Base thickness, design approach, equation, full scale accelerated pavement set up, Indian condition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 589
98 Invasion of Pectinatella magnifica in Freshwater Resources of the Czech Republic

Authors: J. Pazourek, K. Šmejkal, P. Kollár, J. Rajchard, J. Šinko, Z. Balounová, E. Vlková, H. Salmonová

Abstract:

Pectinatella magnifica (Leidy, 1851) is an invasive freshwater animal that lives in colonies. A colony of Pectinatella magnifica (a gelatinous blob) can be up to several feet in diameter large and under favorable conditions it exhibits an extreme growth rate. Recently European countries around rivers of Elbe, Oder, Danube, Rhine and Vltava have confirmed invasion of Pectinatella magnifica, including freshwater reservoirs in South Bohemia (Czech Republic). Our project (Czech Science Foundation, GAČR P503/12/0337) is focused onto biology and chemistry of Pectinatella magnifica. We monitor the organism occurrence in selected South Bohemia ponds and sandpits during the last years, collecting information about physical properties of surrounding water, and sampling the colonies for various analyses (classification, maps of secondary metabolites, toxicity tests). Because the gelatinous matrix is during the colony lifetime also a host for algae, bacteria and cyanobacteria (co-habitants), in this contribution, we also applied a high performance liquid chromatography (HPLC) method for determination of potentially present cyanobacterial toxins (microcystin-LR, microcystin-RR, nodularin). Results from the last 3-year monitoring show that these toxins are under limit of detection (LOD), so that they do not represent a danger yet. The final goal of our study is to assess toxicity risks related to fresh water resources invaded by Pectinatella magnifica, and to understand the process of invasion, which can enable to control it.

Keywords: Cyanobacteria, freshwater resources, Pectinatella magnifica invasion, toxicity monitoring.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1834
97 An Optimal Unsupervised Satellite image Segmentation Approach Based on Pearson System and k-Means Clustering Algorithm Initialization

Authors: Ahmed Rekik, Mourad Zribi, Ahmed Ben Hamida, Mohamed Benjelloun

Abstract:

This paper presents an optimal and unsupervised satellite image segmentation approach based on Pearson system and k-Means Clustering Algorithm Initialization. Such method could be considered as original by the fact that it utilised K-Means clustering algorithm for an optimal initialisation of image class number on one hand and it exploited Pearson system for an optimal statistical distributions- affectation of each considered class on the other hand. Satellite image exploitation requires the use of different approaches, especially those founded on the unsupervised statistical segmentation principle. Such approaches necessitate definition of several parameters like image class number, class variables- estimation and generalised mixture distributions. Use of statistical images- attributes assured convincing and promoting results under the condition of having an optimal initialisation step with appropriated statistical distributions- affectation. Pearson system associated with a k-means clustering algorithm and Stochastic Expectation-Maximization 'SEM' algorithm could be adapted to such problem. For each image-s class, Pearson system attributes one distribution type according to different parameters and especially the Skewness 'β1' and the kurtosis 'β2'. The different adapted algorithms, K-Means clustering algorithm, SEM algorithm and Pearson system algorithm, are then applied to satellite image segmentation problem. Efficiency of those combined algorithms was firstly validated with the Mean Quadratic Error 'MQE' evaluation, and secondly with visual inspection along several comparisons of these unsupervised images- segmentation.

Keywords: Unsupervised classification, Pearson system, Satellite image, Segmentation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1995
96 Quantification of E-Waste: A Case Study in Federal University of Espírito Santo, Brazil

Authors: Andressa S. T. Gomes, Luiza A. Souza, Luciana H. Yamane, Renato R. Siman

Abstract:

The segregation of waste of electrical and electronic equipment (WEEE) in the generating source, its characterization (quali-quantitative) and identification of origin, besides being integral parts of classification reports, are crucial steps to the success of its integrated management. The aim of this paper was to count WEEE generation at the Federal University of Espírito Santo (UFES), Brazil, as well as to define sources, temporary storage sites, main transportations routes and destinations, the most generated WEEE and its recycling potential. Quantification of WEEE generated at the University in the years between 2010 and 2015 was performed using data analysis provided by UFES’s sector of assets management. EEE and WEEE flow in the campuses information were obtained through questionnaires applied to the University workers. It was recorded 6028 WEEEs units of data processing equipment disposed by the university between 2010 and 2015. Among these waste, the most generated were CRT screens, desktops, keyboards and printers. Furthermore, it was observed that these WEEEs are temporarily stored in inappropriate places at the University campuses. In general, these WEEE units are donated to NGOs of the city, or sold through auctions (2010 and 2013). As for recycling potential, from the primary processing and further sale of printed circuit boards (PCB) from the computers, the amount collected could reach U$ 27,839.23. The results highlight the importance of a WEEE management policy at the University.

Keywords: Solid waste, waste of electric and electronic equipment, waste management, institutional generation of solid waste.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1508
95 Comparative Usability Study of the Websites of Top Universities in Three Continents: A Case Study of the University of Cape Town, Oxford University, and Harvard University

Authors: Stephen Akuma, Racheal Aluma, Abraham Undu

Abstract:

Academic websites play an important role in promoting education for all. They allow universities to provide users with digital academic services to save time and resources. A university website is not only a cost-effective and timely way to communicate with a variety of stakeholders, such as students, faculty, and visitors, but it is also a vehicle for the university to shape its image. The quality of a website is a major factor that universities consider in cyberspace. Potential students can easily apply to universities where the website provides useful and clear information. This has made the usability of websites an important area in meeting the needs and expectations of website users. In this paper, a comparative usability study of the University of Cape Town, Oxford University, and Harvard University academic websites (http://www.uct.ac.za/, https://www.ox.ac.uk/, and https://www.harvard.edu/) was carried out. The proactive user feedback technique was adopted for the comparative usability assessment of the aforementioned universities. The method was used by the researchers to collect and log records from the participants in real time. The result shows that the average dwell time on the websites of Harvard University, Oxford University, and Cape Town University in seconds for the three tasks are 51.58, 33.28, and 54.82 respectively. The System Usability Scale (SUS) scores for Harvard, Oxford, and the University of Cape Town are 49.81, 69.43, and 54.14 respectively. The result of the Analysis of Variance on the dwell time data shows a significant difference (p = .009) on the three websites. Our findings show that Oxford University has the most suitable website in terms of usability factors and other metrics than the other websites investigated. Practical implications are highlighted, and recommendations for improved website usability are suggested.

Keywords: Usability factors, user feedback, university websites, University of Cape Town, Harvard University, Oxford University.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 104
94 Incorporating Lexical-Semantic Knowledge into Convolutional Neural Network Framework for Pediatric Disease Diagnosis

Authors: Xiaocong Liu, Huazhen Wang, Ting He, Xiaozheng Li, Weihan Zhang, Jian Chen

Abstract:

The utilization of electronic medical record (EMR) data to establish the disease diagnosis model has become an important research content of biomedical informatics. Deep learning can automatically extract features from the massive data, which brings about breakthroughs in the study of EMR data. The challenge is that deep learning lacks semantic knowledge, which leads to impracticability in medical science. This research proposes a method of incorporating lexical-semantic knowledge from abundant entities into a convolutional neural network (CNN) framework for pediatric disease diagnosis. Firstly, medical terms are vectorized into Lexical Semantic Vectors (LSV), which are concatenated with the embedded word vectors of word2vec to enrich the feature representation. Secondly, the semantic distribution of medical terms serves as Semantic Decision Guide (SDG) for the optimization of deep learning models. The study evaluates the performance of LSV-SDG-CNN model on four kinds of Chinese EMR datasets. Additionally, CNN, LSV-CNN, and SDG-CNN are designed as baseline models for comparison. The experimental results show that LSV-SDG-CNN model outperforms baseline models on four kinds of Chinese EMR datasets. The best configuration of the model yielded an F1 score of 86.20%. The results clearly demonstrate that CNN has been effectively guided and optimized by lexical-semantic knowledge, and LSV-SDG-CNN model improves the disease classification accuracy with a clear margin.

Keywords: lexical semantics, feature representation, semantic decision, convolutional neural network, electronic medical record

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 536
93 Cirrhosis Mortality Prediction as Classification Using Frequent Subgraph Mining

Authors: Abdolghani Ebrahimi, Diego Klabjan, Chenxi Ge, Daniela Ladner, Parker Stride

Abstract:

In this work, we use machine learning and data analysis techniques to predict the one-year mortality of cirrhotic patients. Data from 2,322 patients with liver cirrhosis are collected at a single medical center. Different machine learning models are applied to predict one-year mortality. A comprehensive feature space including demographic information, comorbidity, clinical procedure and laboratory tests is being analyzed. A temporal pattern mining technic called Frequent Subgraph Mining (FSM) is being used. Model for End-stage liver disease (MELD) prediction of mortality is used as a comparator. All of our models statistically significantly outperform the MELD-score model and show an average 10% improvement of the area under the curve (AUC). The FSM technic itself does not improve the model significantly, but FSM, together with a machine learning technique called an ensemble, further improves the model performance. With the abundance of data available in healthcare through electronic health records (EHR), existing predictive models can be refined to identify and treat patients at risk for higher mortality. However, due to the sparsity of the temporal information needed by FSM, the FSM model does not yield significant improvements. Our work applies modern machine learning algorithms and data analysis methods on predicting one-year mortality of cirrhotic patients and builds a model that predicts one-year mortality significantly more accurate than the MELD score. We have also tested the potential of FSM and provided a new perspective of the importance of clinical features.

Keywords: machine learning, liver cirrhosis, subgraph mining, supervised learning

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 390
92 A Novel Neighborhood Defined Feature Selection on Phase Congruency Images for Recognition of Faces with Extreme Variations

Authors: Satyanadh Gundimada, Vijayan K Asari

Abstract:

A novel feature selection strategy to improve the recognition accuracy on the faces that are affected due to nonuniform illumination, partial occlusions and varying expressions is proposed in this paper. This technique is applicable especially in scenarios where the possibility of obtaining a reliable intra-class probability distribution is minimal due to fewer numbers of training samples. Phase congruency features in an image are defined as the points where the Fourier components of that image are maximally inphase. These features are invariant to brightness and contrast of the image under consideration. This property allows to achieve the goal of lighting invariant face recognition. Phase congruency maps of the training samples are generated and a novel modular feature selection strategy is implemented. Smaller sub regions from a predefined neighborhood within the phase congruency images of the training samples are merged to obtain a large set of features. These features are arranged in the order of increasing distance between the sub regions involved in merging. The assumption behind the proposed implementation of the region merging and arrangement strategy is that, local dependencies among the pixels are more important than global dependencies. The obtained feature sets are then arranged in the decreasing order of discriminating capability using a criterion function, which is the ratio of the between class variance to the within class variance of the sample set, in the PCA domain. The results indicate high improvement in the classification performance compared to baseline algorithms.

Keywords: Discriminant analysis, intra-class probability distribution, principal component analysis, phase congruency.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1818
91 PoPCoRN: A Power-Aware Periodic Surveillance Scheme in Convex Region using Wireless Mobile Sensor Networks

Authors: A. K. Prajapati

Abstract:

In this paper, the periodic surveillance scheme has been proposed for any convex region using mobile wireless sensor nodes. A sensor network typically consists of fixed number of sensor nodes which report the measurements of sensed data such as temperature, pressure, humidity, etc., of its immediate proximity (the area within its sensing range). For the purpose of sensing an area of interest, there are adequate number of fixed sensor nodes required to cover the entire region of interest. It implies that the number of fixed sensor nodes required to cover a given area will depend on the sensing range of the sensor as well as deployment strategies employed. It is assumed that the sensors to be mobile within the region of surveillance, can be mounted on moving bodies like robots or vehicle. Therefore, in our scheme, the surveillance time period determines the number of sensor nodes required to be deployed in the region of interest. The proposed scheme comprises of three algorithms namely: Hexagonalization, Clustering, and Scheduling, The first algorithm partitions the coverage area into fixed sized hexagons that approximate the sensing range (cell) of individual sensor node. The clustering algorithm groups the cells into clusters, each of which will be covered by a single sensor node. The later determines a schedule for each sensor to serve its respective cluster. Each sensor node traverses all the cells belonging to the cluster assigned to it by oscillating between the first and the last cell for the duration of its life time. Simulation results show that our scheme provides full coverage within a given period of time using few sensors with minimum movement, less power consumption, and relatively less infrastructure cost.

Keywords: Sensor Network, Graph Theory, MSN, Communication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1430
90 Land Suitability Prediction Modelling for Agricultural Crops Using Machine Learning Approach: A Case Study of Khuzestan Province, Iran

Authors: Saba Gachpaz, Hamid Reza Heidari

Abstract:

The sharp increase in population growth leads to more pressure on agricultural areas to satisfy the food supply. This necessitates increased resource consumption and underscores the importance of addressing sustainable agriculture development along with other environmental considerations. Land-use management is a crucial factor in obtaining optimum productivity. Machine learning is a widely used technique in the agricultural sector, from yield prediction to customer behavior. This method focuses on learning and provides patterns and correlations from our data set. In this study, nine physical control factors, namely, soil classification, electrical conductivity, normalized difference water index (NDWI), groundwater level, elevation, annual precipitation, pH of water, annual mean temperature, and slope in the alluvial plain in Khuzestan (an agricultural hotspot in Iran) are used to decide the best agricultural land use for both rainfed and irrigated agriculture for 10 different crops. For this purpose, each variable was imported into Arc GIS, and a raster layer was obtained. In the next level, by using training samples, all layers were imported into the python environment. A random forest model was applied, and the weight of each variable was specified. In the final step, results were visualized using a digital elevation model, and the importance of all factors for each one of the crops was obtained. Our results show that despite 62% of the study area being allocated to agricultural purposes, only 42.9% of these areas can be defined as a suitable class for cultivation purposes.

Keywords: Land suitability, machine learning, random forest, sustainable agriculture.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 192
89 LIDAR Obstacle Warning and Avoidance System for Unmanned Aircraft

Authors: Roberto Sabatini, Alessandro Gardi, Mark A. Richardson

Abstract:

The availability of powerful eye-safe laser sources and the recent advancements in electro-optical and mechanical beam-steering components have allowed laser-based Light Detection and Ranging (LIDAR) to become a promising technology for obstacle warning and avoidance in a variety of manned and unmanned aircraft applications. LIDAR outstanding angular resolution and accuracy characteristics are coupled to its good detection performance in a wide range of incidence angles and weather conditions, providing an ideal obstacle avoidance solution, which is especially attractive in low-level flying platforms such as helicopters and small-to-medium size Unmanned Aircraft (UA). The Laser Obstacle Avoidance Marconi (LOAM) system is one of such systems, which was jointly developed and tested by SELEX-ES and the Italian Air Force Research and Flight Test Centre. The system was originally conceived for military rotorcraft platforms and, in this paper, we briefly review the previous work and discuss in more details some of the key development activities required for integration of LOAM on UA platforms. The main hardware and software design features of this LOAM variant are presented, including a brief description of the system interfaces and sensor characteristics, together with the system performance models and data processing algorithms for obstacle detection, classification and avoidance. In particular, the paper focuses on the algorithm proposed for optimal avoidance trajectory generation in UA applications.

Keywords: LIDAR, Low-Level Flight, Nap-of-the-Earth Flight, Near Infra-Red, Obstacle Avoidance, Obstacle Detection, Obstacle Warning System, Sense and Avoid, Trajectory Optimisation, Unmanned Aircraft.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 6996
88 Advanced Stochastic Models for Partially Developed Speckle

Authors: Jihad S. Daba (Jean-Pierre Dubois), Philip Jreije

Abstract:

Speckled images arise when coherent microwave, optical, and acoustic imaging techniques are used to image an object, surface or scene. Examples of coherent imaging systems include synthetic aperture radar, laser imaging systems, imaging sonar systems, and medical ultrasound systems. Speckle noise is a form of object or target induced noise that results when the surface of the object is Rayleigh rough compared to the wavelength of the illuminating radiation. Detection and estimation in images corrupted by speckle noise is complicated by the nature of the noise and is not as straightforward as detection and estimation in additive noise. In this work, we derive stochastic models for speckle noise, with an emphasis on speckle as it arises in medical ultrasound images. The motivation for this work is the problem of segmentation and tissue classification using ultrasound imaging. Modeling of speckle in this context involves partially developed speckle model where an underlying Poisson point process modulates a Gram-Charlier series of Laguerre weighted exponential functions, resulting in a doubly stochastic filtered Poisson point process. The statistical distribution of partially developed speckle is derived in a closed canonical form. It is observed that as the mean number of scatterers in a resolution cell is increased, the probability density function approaches an exponential distribution. This is consistent with fully developed speckle noise as demonstrated by the Central Limit theorem.

Keywords: Doubly stochastic filtered process, Poisson point process, segmentation, speckle, ultrasound

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1709
87 Analysis of the Operational Performance of Three Unconventional Arterial Intersection Designs: Median U-Turn, Superstreet and Single Quadrant

Authors: Hana Naghawi, Khair Jadaan, Rabab Al-Louzi, Taqwa Hadidi

Abstract:

This paper is aimed to evaluate and compare the operational performance of three Unconventional Arterial Intersection Designs (UAIDs) including Median U-Turn, Superstreet, and Single Quadrant Intersection using real traffic data. For this purpose, the heavily congested signalized intersection of Wadi Saqra in Amman was selected. The effect of implementing each of the proposed UAIDs was not only evaluated on the isolated Wadi Saqra signalized intersection, but also on the arterial road including both surrounding intersections. The operational performance of the isolated intersection was based on the level of service (LOS) expressed in terms of control delay and volume to capacity ratio. On the other hand, the measures used to evaluate the operational performance on the arterial road included traffic progression, stopped delay per vehicle, number of stops and the travel speed. The analysis was performed using SYNCHRO 8 microscopic software. The simulation results showed that all three selected UAIDs outperformed the conventional intersection design in terms of control delay but only the Single Quadrant Intersection design improved the main intersection LOS from F to B. Also, the results indicated that the Single Quadrant Intersection design resulted in an increase in average travel speed by 52%, and a decrease in the average stopped delay by 34% on the selected corridor when compared to the corridor with conventional intersection design. On basis of these results, it can be concluded that the Median U-Turn and the Superstreet do not perform the best under heavy traffic volumes.

Keywords: Median U-turn, single quadrant, superstreet, unconventional arterial intersection design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 820
86 Application of Fuzzy Logic Approach for an Aircraft Model with and without Winglet

Authors: Altab Hossain, Ataur Rahman, Jakir Hossen, A.K.M. P. Iqbal, SK. Hasan

Abstract:

The measurement of aerodynamic forces and moments acting on an aircraft model is important for the development of wind tunnel measurement technology to predict the performance of the full scale vehicle. The potentials of an aircraft model with and without winglet and aerodynamic characteristics with NACA wing No. 65-3- 218 have been studied using subsonic wind tunnel of 1 m × 1 m rectangular test section and 2.5 m long of Aerodynamics Laboratory Faculty of Engineering (University Putra Malaysia). Focusing on analyzing the aerodynamic characteristics of the aircraft model, two main issues are studied in this paper. First, a six component wind tunnel external balance is used for measuring lift, drag and pitching moment. Secondly, Tests are conducted on the aircraft model with and without winglet of two configurations at Reynolds numbers 1.7×105, 2.1×105, and 2.5×105 for different angle of attacks. Fuzzy logic approach is found as efficient for the representation, manipulation and utilization of aerodynamic characteristics. Therefore, the primary purpose of this work was to investigate the relationship between lift and drag coefficients, with free-stream velocities and angle of attacks, and to illustrate how fuzzy logic might play an important role in study of lift aerodynamic characteristics of an aircraft model with the addition of certain winglet configurations. Results of the developed fuzzy logic were compared with the experimental results. For lift coefficient analysis, the mean of actual and predicted values were 0.62 and 0.60 respectively. The coreelation between actual and predicted values (from FLS model) of lift coefficient in different angle of attack was found as 0.99. The mean relative error of actual and predicted valus was found as 5.18% for the velocity of 26.36 m/s which was found to be less than the acceptable limits (10%). The goodness of fit of prediction value was 0.95 which was close to 1.0.

Keywords: Wind tunnel; Winglet; Lift coefficient; Fuzzy logic.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1856
85 Influence of Compactive Efforts on Cement- Bagasse Ash Treatment of Expansive Black Cotton Soil

Authors: Moses, G, Osinubi, K. J.

Abstract:

A laboratory study on the influence of compactive effort on expansive black cotton specimens treated with up to 8% ordinary Portland cement (OPC) admixed with up to 8% bagasse ash (BA) by dry weight of soil and compacted using the energies of the standard Proctor (SP), West African Standard (WAS) or “intermediate” and modified Proctor (MP) were undertaken. The expansive black cotton soil was classified as A-7-6 (16) or CL using the American Association of Highway and Transportation Officials (AASHTO) and Unified Soil Classification System (USCS), respectively. The 7day unconfined compressive strength (UCS) values of the natural soil for SP, WAS and MP compactive efforts are 286, 401 and 515kN/m2 respectively, while peak values of 1019, 1328 and 1420kN/m2 recorded at 8% OPC/ 6% BA, 8% OPC/ 2% BA and 6% OPC/ 4% BA treatments, respectively were less than the UCS value of 1710kN/m2 conventionally used as criterion for adequate cement stabilization. The soaked California bearing ratio (CBR) values of the OPC/BA stabilized soil increased with higher energy level from 2, 4 and 10% for the natural soil to Peak values of 55, 18 and 8% were recorded at 8% OPC/4% BA 8% OPC/2% BA and 8% OPC/4% BA, treatments when SP, WAS and MP compactive effort were used, respectively. The durability of specimens was determined by immersion in water. Soils treatment at 8% OPC/ 4% BA blend gave a value of 50% resistance to loss in strength value which is acceptable because of the harsh test condition of 7 days soaking period specimens were subjected instead of the 4 days soaking period that specified a minimum resistance to loss in strength of 80%. Finally An optimal blend of is 8% OPC/ 4% BA is recommended for treatment of expansive black cotton soil for use as a sub-base material.

Keywords: Bagasse ash, California bearing ratio, Compaction, Durability, Ordinary Portland cement, Unconfined compressive strength.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3524
84 Application of Rapidly Exploring Random Tree Star-Smart and G2 Quintic Pythagorean Hodograph Curves to the UAV Path Planning Problem

Authors: Luiz G. Véras, Felipe L. Medeiros, Lamartine F. Guimarães

Abstract:

This work approaches the automatic planning of paths for Unmanned Aerial Vehicles (UAVs) through the application of the Rapidly Exploring Random Tree Star-Smart (RRT*-Smart) algorithm. RRT*-Smart is a sampling process of positions of a navigation environment through a tree-type graph. The algorithm consists of randomly expanding a tree from an initial position (root node) until one of its branches reaches the final position of the path to be planned. The algorithm ensures the planning of the shortest path, considering the number of iterations tending to infinity. When a new node is inserted into the tree, each neighbor node of the new node is connected to it, if and only if the extension of the path between the root node and that neighbor node, with this new connection, is less than the current extension of the path between those two nodes. RRT*-smart uses an intelligent sampling strategy to plan less extensive routes by spending a smaller number of iterations. This strategy is based on the creation of samples/nodes near to the convex vertices of the navigation environment obstacles. The planned paths are smoothed through the application of the method called quintic pythagorean hodograph curves. The smoothing process converts a route into a dynamically-viable one based on the kinematic constraints of the vehicle. This smoothing method models the hodograph components of a curve with polynomials that obey the Pythagorean Theorem. Its advantage is that the obtained structure allows computation of the curve length in an exact way, without the need for quadratural techniques for the resolution of integrals.

Keywords: Path planning, path smoothing, Pythagorean hodograph curve, RRT*-Smart.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 859
83 Travel Time Evaluation of an Innovative U-Turn Facility on Urban Arterial Roadways

Authors: Ali Pirdavani, Tom Brijs, Tom Bellemans, Geert Wets, Koen Vanhoof

Abstract:

Signalized intersections on high-volume arterials are often congested during peak hours, causing a decrease in through movement efficiency on the arterial. Much of the vehicle delay incurred at conventional intersections is caused by high left-turn demand. Unconventional intersection designs attempt to reduce intersection delay and travel time by rerouting left-turns away from the main intersection and replacing it with right-turn followed by Uturn. The proposed new type of U-turn intersection is geometrically designed with a raised island which provides a protected U-turn movement. In this study several scenarios based on different distances between U-turn and main intersection, traffic volume of major/minor approaches and percentage of left-turn volumes were simulated by use of AIMSUN, a type of traffic microsimulation software. Subsequently some models are proposed in order to compute travel time of each movement. Eventually by correlating these equations to some in-field collected data of some implemented U-turn facilities, the reliability of the proposed models are approved. With these models it would be possible to calculate travel time of each movement under any kind of geometric and traffic condition. By comparing travel time of a conventional signalized intersection with U-turn intersection travel time, it would be possible to decide on converting signalized intersections into this new kind of U-turn facility or not. However comparison of travel time is not part of the scope of this research. In this paper only travel time of this innovative U-turn facility would be predicted. According to some before and after study about the traffic performance of some executed U-turn facilities, it is found that commonly, this new type of U-turn facility produces lower travel time. Thus, evaluation of using this type of unconventional intersection should be seriously considered.

Keywords: Innovative U-turn facility, Microsimulation, Traveltime, Unconventional intersection design.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1312
82 Obesity and Bone Mineral Density in Patients with Large Joint Osteoarthritis

Authors: Vladyslav Povoroznyuk, Anna Musiienko, Nataliia Zaverukha, Roksolana Povoroznyuk

Abstract:

Along with the global aging of population, the number of people with somatic diseases is increasing, including such interrelated pathologies as obesity, osteoarthritis (OA) and osteoporosis (OP). The objective of the study is to examine the connection between body mass index (BMI), OA and bone mineral density (BMD) of lumbar spine, femoral neck and trabecular bone score (TBS) in postmenopausal women with OA. We have observed 359 postmenopausal women (50-89 years old) and divided them into four groups by age: 50-59 yrs, 60-69 yrs, 70-79 yrs and over 80 years old. In addition, according to the American College of Rheumatology (ACR) Clinical classification criteria for knee and hip OA, we divided them into 2 groups: group I – 117 females with symptomatic OA (including 89 patients with knee OA, 28 patients with hip OA) and group II –242 women with a normal functional activity of large joints. Analysis of data was performed taking into account their BMI, classified by World Health Organization (WHO). Diagnosis of obesity was established when BMI was above 30 kg/m2. In woman with obesity, a symptomatic OA was detected in 44 postmenopausal women (41.1%), a normal functional activity of large joints - in 63 women (58.9%). However, in women with normal BMI – 73 women, who account for 29.0% of cases, a symptomatic OA was detected. According to a chi-squared (χ2) test, a significantly higher level of BMI was detected in postmenopausal women with OA (χ2 = 5.05, p = 0.02). Women with a symptomatic OA had a significantly higher BMD of lumbar spine compared with women who had a normal functional activity of large joints. No significant differences of BMD of femoral necks or TBS were detected in either the group with OA or with a normal functional activity of large joints.

Keywords: Bone mineral density, BMD, body mass index, BMI, obesity, overweight, postmenopausal women, osteoarthritis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 617
81 Customer Churn Prediction Using Four Machine Learning Algorithms Integrating Feature Selection and Normalization in the Telecom Sector

Authors: Alanoud Moraya Aldalan, Abdulaziz Almaleh

Abstract:

A crucial part of maintaining a customer-oriented business in the telecommunications industry is understanding the reasons and factors that lead to customer churn. Competition between telecom companies has greatly increased in recent years, which has made it more important to understand customers’ needs in this strong market. For those who are looking to turn over their service providers, understanding their needs is especially important. Predictive churn is now a mandatory requirement for retaining customers in the telecommunications industry. Machine learning can be used to accomplish this. Churn Prediction has become a very important topic in terms of machine learning classification in the telecommunications industry. Understanding the factors of customer churn and how they behave is very important to building an effective churn prediction model. This paper aims to predict churn and identify factors of customers’ churn based on their past service usage history. Aiming at this objective, the study makes use of feature selection, normalization, and feature engineering. Then, this study compared the performance of four different machine learning algorithms on the Orange dataset: Logistic Regression, Random Forest, Decision Tree, and Gradient Boosting. Evaluation of the performance was conducted by using the F1 score and ROC-AUC. Comparing the results of this study with existing models has proven to produce better results. The results showed the Gradients Boosting with feature selection technique outperformed in this study by achieving a 99% F1-score and 99% AUC, and all other experiments achieved good results as well.

Keywords: Machine Learning, Gradient Boosting, Logistic Regression, Churn, Random Forest, Decision Tree, ROC, AUC, F1-score.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 344
80 Development of an Ensemble Classification Model Based on Hybrid Filter-Wrapper Feature Selection for Email Phishing Detection

Authors: R. B. Ibrahim, M. S. Argungu, I. M. Mungadi

Abstract:

It is obvious in this present time, internet has become an indispensable part of human life since its inception. The Internet has provided diverse opportunities to make life so easy for human beings, through the adoption of various channels. Among these channels are email, internet banking, video conferencing, and the like. Email is one of the easiest means of communication hugely accepted among individuals and organizations globally. But over decades the security integrity of this platform has been challenged with malicious activities like Phishing. Email phishing is designed by phishers to fool the recipient into handing over sensitive personal information such as passwords, credit card numbers, account credentials, social security numbers, etc. This activity has caused a lot of financial damage to email users globally which has resulted in bankruptcy, sudden death of victims, and other health-related sicknesses. Although many methods have been proposed to detect email phishing, in this research, the results of multiple machine-learning methods for predicting email phishing have been compared with the use of filter-wrapper feature selection. It is worth noting that all three models performed substantially but one outperformed the other. The dataset used for these models is obtained from Kaggle online data repository, while three classifiers: decision tree, Naïve Bayes, and Logistic regression are ensemble (Bagging) respectively. Results from the study show that the Decision Tree (CART) bagging ensemble recorded the highest accuracy of 98.13% using PEF (Phishing Essential Features). This result further demonstrates the dependability of the proposed model.

Keywords: Ensemble, hybrid, filter-wrapper, phishing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 125