Search results for: k-means clustering based feature weighting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 29289

Search results for: k-means clustering based feature weighting

28539 Design and Implementation a Platform for Adaptive Online Learning Based on Fuzzy Logic

Authors: Budoor Al Abid

Abstract:

Educational systems are increasingly provided as open online services, providing guidance and support for individual learners. To adapt the learning systems, a proper evaluation must be made. This paper builds the evaluation model Fuzzy C Means Adaptive System (FCMAS) based on data mining techniques to assess the difficulty of the questions. The following steps are implemented; first using a dataset from an online international learning system called (slepemapy.cz) the dataset contains over 1300000 records with 9 features for students, questions and answers information with feedback evaluation. Next, a normalization process as preprocessing step was applied. Then FCM clustering algorithms are used to adaptive the difficulty of the questions. The result is three cluster labeled data depending on the higher Wight (easy, Intermediate, difficult). The FCM algorithm gives a label to all the questions one by one. Then Random Forest (RF) Classifier model is constructed on the clustered dataset uses 70% of the dataset for training and 30% for testing; the result of the model is a 99.9% accuracy rate. This approach improves the Adaptive E-learning system because it depends on the student behavior and gives accurate results in the evaluation process more than the evaluation system that depends on feedback only.

Keywords: machine learning, adaptive, fuzzy logic, data mining

Procedia PDF Downloads 196
28538 Performance Prediction Methodology of Slow Aging Assets

Authors: M. Ben Slimene, M.-S. Ouali

Abstract:

Asset management of urban infrastructures faces a multitude of challenges that need to be overcome to obtain a reliable measurement of performances. Predicting the performance of slowly aging systems is one of those challenges, which helps the asset manager to investigate specific failure modes and to undertake the appropriate maintenance and rehabilitation interventions to avoid catastrophic failures as well as to optimize the maintenance costs. This article presents a methodology for modeling the deterioration of slowly degrading assets based on an operating history. It consists of extracting degradation profiles by grouping together assets that exhibit similar degradation sequences using an unsupervised classification technique derived from artificial intelligence. The obtained clusters are used to build the performance prediction models. This methodology is applied to a sample of a stormwater drainage culvert dataset.

Keywords: artificial Intelligence, clustering, culvert, regression model, slow degradation

Procedia PDF Downloads 112
28537 Kou Jump Diffusion Model: An Application to the SP 500; Nasdaq 100 and Russell 2000 Index Options

Authors: Wajih Abbassi, Zouhaier Ben Khelifa

Abstract:

The present research points towards the empirical validation of three options valuation models, the ad-hoc Black-Scholes model as proposed by Berkowitz (2001), the constant elasticity of variance model of Cox and Ross (1976) and the Kou jump-diffusion model (2002). Our empirical analysis has been conducted on a sample of 26,974 options written on three indexes, the S&P 500, Nasdaq 100 and the Russell 2000 that were negotiated during the year 2007 just before the sub-prime crisis. We start by presenting the theoretical foundations of the models of interest. Then we use the technique of trust-region-reflective algorithm to estimate the structural parameters of these models from cross-section of option prices. The empirical analysis shows the superiority of the Kou jump-diffusion model. This superiority arises from the ability of this model to portray the behavior of market participants and to be closest to the true distribution that characterizes the evolution of these indices. Indeed the double-exponential distribution covers three interesting properties that are: the leptokurtic feature, the memory less property and the psychological aspect of market participants. Numerous empirical studies have shown that markets tend to have both overreaction and under reaction over good and bad news respectively. Despite of these advantages there are not many empirical studies based on this model partly because probability distribution and option valuation formula are rather complicated. This paper is the first to have used the technique of nonlinear curve-fitting through the trust-region-reflective algorithm and cross-section options to estimate the structural parameters of the Kou jump-diffusion model.

Keywords: jump-diffusion process, Kou model, Leptokurtic feature, trust-region-reflective algorithm, US index options

Procedia PDF Downloads 429
28536 Fake News Detection Based on Fusion of Domain Knowledge and Expert Knowledge

Authors: Yulan Wu

Abstract:

The spread of fake news on social media has posed significant societal harm to the public and the nation, with its threats spanning various domains, including politics, economics, health, and more. News on social media often covers multiple domains, and existing models studied by researchers and relevant organizations often perform well on datasets from a single domain. However, when these methods are applied to social platforms with news spanning multiple domains, their performance significantly deteriorates. Existing research has attempted to enhance the detection performance of multi-domain datasets by adding single-domain labels to the data. However, these methods overlook the fact that a news article typically belongs to multiple domains, leading to the loss of domain knowledge information contained within the news text. To address this issue, research has found that news records in different domains often use different vocabularies to describe their content. In this paper, we propose a fake news detection framework that combines domain knowledge and expert knowledge. Firstly, it utilizes an unsupervised domain discovery module to generate a low-dimensional vector for each news article, representing domain embeddings, which can retain multi-domain knowledge of the news content. Then, a feature extraction module uses the domain embeddings discovered through unsupervised domain knowledge to guide multiple experts in extracting news knowledge for the total feature representation. Finally, a classifier is used to determine whether the news is fake or not. Experiments show that this approach can improve multi-domain fake news detection performance while reducing the cost of manually labeling domain labels.

Keywords: fake news, deep learning, natural language processing, multiple domains

Procedia PDF Downloads 73
28535 Statistical Wavelet Features, PCA, and SVM-Based Approach for EEG Signals Classification

Authors: R. K. Chaurasiya, N. D. Londhe, S. Ghosh

Abstract:

The study of the electrical signals produced by neural activities of human brain is called Electroencephalography. In this paper, we propose an automatic and efficient EEG signal classification approach. The proposed approach is used to classify the EEG signal into two classes: epileptic seizure or not. In the proposed approach, we start with extracting the features by applying Discrete Wavelet Transform (DWT) in order to decompose the EEG signals into sub-bands. These features, extracted from details and approximation coefficients of DWT sub-bands, are used as input to Principal Component Analysis (PCA). The classification is based on reducing the feature dimension using PCA and deriving the support-vectors using Support Vector Machine (SVM). The experimental are performed on real and standard dataset. A very high level of classification accuracy is obtained in the result of classification.

Keywords: discrete wavelet transform, electroencephalogram, pattern recognition, principal component analysis, support vector machine

Procedia PDF Downloads 639
28534 A Trend Based Forecasting Framework of the ATA Method and Its Performance on the M3-Competition Data

Authors: H. Taylan Selamlar, I. Yavuz, G. Yapar

Abstract:

It is difficult to make predictions especially about the future and making accurate predictions is not always easy. However, better predictions remain the foundation of all science therefore the development of accurate, robust and reliable forecasting methods is very important. Numerous number of forecasting methods have been proposed and studied in the literature. There are still two dominant major forecasting methods: Box-Jenkins ARIMA and Exponential Smoothing (ES), and still new methods are derived or inspired from them. After more than 50 years of widespread use, exponential smoothing is still one of the most practically relevant forecasting methods available due to their simplicity, robustness and accuracy as automatic forecasting procedures especially in the famous M-Competitions. Despite its success and widespread use in many areas, ES models have some shortcomings that negatively affect the accuracy of forecasts. Therefore, a new forecasting method in this study will be proposed to cope with these shortcomings and it will be called ATA method. This new method is obtained from traditional ES models by modifying the smoothing parameters therefore both methods have similar structural forms and ATA can be easily adapted to all of the individual ES models however ATA has many advantages due to its innovative new weighting scheme. In this paper, the focus is on modeling the trend component and handling seasonality patterns by utilizing classical decomposition. Therefore, ATA method is expanded to higher order ES methods for additive, multiplicative, additive damped and multiplicative damped trend components. The proposed models are called ATA trended models and their predictive performances are compared to their counter ES models on the M3 competition data set since it is still the most recent and comprehensive time-series data collection available. It is shown that the models outperform their counters on almost all settings and when a model selection is carried out amongst these trended models ATA outperforms all of the competitors in the M3- competition for both short term and long term forecasting horizons when the models’ forecasting accuracies are compared based on popular error metrics.

Keywords: accuracy, exponential smoothing, forecasting, initial value

Procedia PDF Downloads 177
28533 Analysis of Ozone Episodes in the Forest and Vegetation Areas with Using HYSPLIT Model: A Case Study of the North-West Side of Biga Peninsula, Turkey

Authors: Deniz Sari, Selahattin İncecik, Nesimi Ozkurt

Abstract:

Surface ozone, which named as one of the most critical pollutants in the 21th century, threats to human health, forest and vegetation. Specifically, in rural areas surface ozone cause significant influences on agricultural productions and trees. In this study, in order to understand to the surface ozone levels in rural areas we focus on the north-western side of Biga Peninsula which covers by the mountainous and forested area. Ozone concentrations were measured for the first time with passive sampling at 10 sites and two online monitoring stations in this rural area from 2013 and 2015. Using with the daytime hourly O3 measurements during light hours (08:00–20:00) exceeding the threshold of 40 ppb over the 3 months (May, June and July) for agricultural crops, and over the six months (April to September) for forest trees AOT40 (Accumulated hourly O3 concentrations Over a Threshold of 40 ppb) cumulative index was calculated. AOT40 is defined by EU Directive 2008/50/EC to evaluate whether ozone pollution is a risk for vegetation, and is calculated by using hourly ozone concentrations from monitoring systems. In the present study, we performed the trajectory analysis by The Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model to follow the long-range transport sources contributing to the high ozone levels in the region. The ozone episodes observed between 2013 and 2015 were analysed using the HYSPLIT model developed by the NOAA-ARL. In addition, the cluster analysis is used to identify homogeneous groups of air mass transport patterns can be conducted through air trajectory clustering by grouping similar trajectories in terms of air mass movement. Backward trajectories produced for 3 years by HYSPLIT model were assigned to different clusters according to their moving speed and direction using a k-means clustering algorithm. According to cluster analysis results, northerly flows to study area cause to high ozone levels in the region. The results present that the ozone values in the study area are above the critical levels for forest and vegetation based on EU Directive 2008/50/EC.

Keywords: AOT40, Biga Peninsula, HYSPLIT, surface ozone

Procedia PDF Downloads 255
28532 Urbanization Effects on the Food-Water-Energy Nexus within Ecosystem Services: A Case Study of the Beijing-Tianjin-Hebei Urban Agglomeration in China

Authors: Ke Yang, QiHan, Bauke de Veirs

Abstract:

This study addresses the need for coordinated management of natural resources in urban agglomeration. Using ecosystem services theory, The study explore the relationship between land use in the Beijing-Tianjin-Hebei (B-T-H) region and the Food-Water-Energy (F-W-E) nexus from 2000 to 2030. We assess ecosystem services using the InVEST: Habitat Quality (HQ), Water Yield (WY), Carbon Sequestration (CS), Soil Retention (SDR), and Food Production (FP). The study find an annual expansion of construction land alongside a significant decline in cultivated land. Additionally, HQ, CS, and per capita FP decline annually until 2020 and are expected to persist through 2030. In contrast, WY and SDR grow annually but may decline by 2030. Spearman coefficient analysis reveals synergies between HQ and CS, SDR and CS, and SDR and HQ, with trade-offs between CS and WY and HQ and WY. Utilizing the K-means clustering analysis method, we introduce county-based spatial planning for the F-W-E system, offering valuable insights and recommendations for sustainable resource management.

Keywords: food-water-energy (F-W-E), ecosystem services, trade-offs and synergies, ecosystem service bundle, county-based

Procedia PDF Downloads 62
28531 An Infinite Mixture Model for Modelling Stutter Ratio in Forensic Data Analysis

Authors: M. A. C. S. Sampath Fernando, James M. Curran, Renate Meyer

Abstract:

Forensic DNA analysis has received much attention over the last three decades, due to its incredible usefulness in human identification. The statistical interpretation of DNA evidence is recognised as one of the most mature fields in forensic science. Peak heights in an Electropherogram (EPG) are approximately proportional to the amount of template DNA in the original sample being tested. A stutter is a minor peak in an EPG, which is not masking as an allele of a potential contributor, and considered as an artefact that is presumed to be arisen due to miscopying or slippage during the PCR. Stutter peaks are mostly analysed in terms of stutter ratio that is calculated relative to the corresponding parent allele height. Analysis of mixture profiles has always been problematic in evidence interpretation, especially with the presence of PCR artefacts like stutters. Unlike binary and semi-continuous models; continuous models assign a probability (as a continuous weight) for each possible genotype combination, and significantly enhances the use of continuous peak height information resulting in more efficient reliable interpretations. Therefore, the presence of a sound methodology to distinguish between stutters and real alleles is essential for the accuracy of the interpretation. Sensibly, any such method has to be able to focus on modelling stutter peaks. Bayesian nonparametric methods provide increased flexibility in applied statistical modelling. Mixture models are frequently employed as fundamental data analysis tools in clustering and classification of data and assume unidentified heterogeneous sources for data. In model-based clustering, each unknown source is reflected by a cluster, and the clusters are modelled using parametric models. Specifying the number of components in finite mixture models, however, is practically difficult even though the calculations are relatively simple. Infinite mixture models, in contrast, do not require the user to specify the number of components. Instead, a Dirichlet process, which is an infinite-dimensional generalization of the Dirichlet distribution, is used to deal with the problem of a number of components. Chinese restaurant process (CRP), Stick-breaking process and Pólya urn scheme are frequently used as Dirichlet priors in Bayesian mixture models. In this study, we illustrate an infinite mixture of simple linear regression models for modelling stutter ratio and introduce some modifications to overcome weaknesses associated with CRP.

Keywords: Chinese restaurant process, Dirichlet prior, infinite mixture model, PCR stutter

Procedia PDF Downloads 330
28530 The Convolution Recurrent Network of Using Residual LSTM to Process the Output of the Downsampling for Monaural Speech Enhancement

Authors: Shibo Wei, Ting Jiang

Abstract:

Convolutional-recurrent neural networks (CRN) have achieved much success recently in the speech enhancement field. The common processing method is to use the convolution layer to compress the feature space by multiple upsampling and then model the compressed features with the LSTM layer. At last, the enhanced speech is obtained by deconvolution operation to integrate the global information of the speech sequence. However, the feature space compression process may cause the loss of information, so we propose to model the upsampling result of each step with the residual LSTM layer, then join it with the output of the deconvolution layer and input them to the next deconvolution layer, by this way, we want to integrate the global information of speech sequence better. The experimental results show the network model (RES-CRN) we introduce can achieve better performance than LSTM without residual and overlaying LSTM simply in the original CRN in terms of scale-invariant signal-to-distortion ratio (SI-SNR), speech quality (PESQ), and intelligibility (STOI).

Keywords: convolutional-recurrent neural networks, speech enhancement, residual LSTM, SI-SNR

Procedia PDF Downloads 201
28529 Improved Performance in Content-Based Image Retrieval Using Machine Learning Approach

Authors: B. Ramesh Naik, T. Venugopal

Abstract:

This paper presents a novel approach which improves the high-level semantics of images based on machine learning approach. The contemporary approaches for image retrieval and object recognition includes Fourier transforms, Wavelets, SIFT and HoG. Though these descriptors helpful in a wide range of applications, they exploit zero order statistics, and this lacks high descriptiveness of image features. These descriptors usually take benefit of primitive visual features such as shape, color, texture and spatial locations to describe images. These features do not adequate to describe high-level semantics of the images. This leads to a gap in semantic content caused to unacceptable performance in image retrieval system. A novel method has been proposed referred as discriminative learning which is derived from machine learning approach that efficiently discriminates image features. The analysis and results of proposed approach were validated thoroughly on WANG and Caltech-101 Databases. The results proved that this approach is very competitive in content-based image retrieval.

Keywords: CBIR, discriminative learning, region weight learning, scale invariant feature transforms

Procedia PDF Downloads 181
28528 A Segmentation Method for Grayscale Images Based on the Firefly Algorithm and the Gaussian Mixture Model

Authors: Donatella Giuliani

Abstract:

In this research, we propose an unsupervised grayscale image segmentation method based on a combination of the Firefly Algorithm and the Gaussian Mixture Model. Firstly, the Firefly Algorithm has been applied in a histogram-based research of cluster means. The Firefly Algorithm is a stochastic global optimization technique, centered on the flashing characteristics of fireflies. In this context it has been performed to determine the number of clusters and the related cluster means in a histogram-based segmentation approach. Successively these means are used in the initialization step for the parameter estimation of a Gaussian Mixture Model. The parametric probability density function of a Gaussian Mixture Model is represented as a weighted sum of Gaussian component densities, whose parameters are evaluated applying the iterative Expectation-Maximization technique. The coefficients of the linear super-position of Gaussians can be thought as prior probabilities of each component. Applying the Bayes rule, the posterior probabilities of the grayscale intensities have been evaluated, therefore their maxima are used to assign each pixel to the clusters, according to their gray-level values. The proposed approach appears fairly solid and reliable when applied even to complex grayscale images. The validation has been performed by using different standard measures, more precisely: the Root Mean Square Error (RMSE), the Structural Content (SC), the Normalized Correlation Coefficient (NK) and the Davies-Bouldin (DB) index. The achieved results have strongly confirmed the robustness of this gray scale segmentation method based on a metaheuristic algorithm. Another noteworthy advantage of this methodology is due to the use of maxima of responsibilities for the pixel assignment that implies a consistent reduction of the computational costs.

Keywords: clustering images, firefly algorithm, Gaussian mixture model, meta heuristic algorithm, image segmentation

Procedia PDF Downloads 217
28527 Automated Feature Extraction and Object-Based Detection from High-Resolution Aerial Photos Based on Machine Learning and Artificial Intelligence

Authors: Mohammed Al Sulaimani, Hamad Al Manhi

Abstract:

With the development of Remote Sensing technology, the resolution of optical Remote Sensing images has greatly improved, and images have become largely available. Numerous detectors have been developed for detecting different types of objects. In the past few years, Remote Sensing has benefited a lot from deep learning, particularly Deep Convolution Neural Networks (CNNs). Deep learning holds great promise to fulfill the challenging needs of Remote Sensing and solving various problems within different fields and applications. The use of Unmanned Aerial Systems in acquiring Aerial Photos has become highly used and preferred by most organizations to support their activities because of their high resolution and accuracy, which make the identification and detection of very small features much easier than Satellite Images. And this has opened an extreme era of Deep Learning in different applications not only in feature extraction and prediction but also in analysis. This work addresses the capacity of Machine Learning and Deep Learning in detecting and extracting Oil Leaks from Flowlines (Onshore) using High-Resolution Aerial Photos which have been acquired by UAS fixed with RGB Sensor to support early detection of these leaks and prevent the company from the leak’s losses and the most important thing environmental damage. Here, there are two different approaches and different methods of DL have been demonstrated. The first approach focuses on detecting the Oil Leaks from the RAW Aerial Photos (not processed) using a Deep Learning called Single Shoot Detector (SSD). The model draws bounding boxes around the leaks, and the results were extremely good. The second approach focuses on detecting the Oil Leaks from the Ortho-mosaiced Images (Georeferenced Images) by developing three Deep Learning Models using (MaskRCNN, U-Net and PSP-Net Classifier). Then, post-processing is performed to combine the results of these three Deep Learning Models to achieve a better detection result and improved accuracy. Although there is a relatively small amount of datasets available for training purposes, the Trained DL Models have shown good results in extracting the extent of the Oil Leaks and obtaining excellent and accurate detection.

Keywords: GIS, remote sensing, oil leak detection, machine learning, aerial photos, unmanned aerial systems

Procedia PDF Downloads 34
28526 Complex Network Approach to International Trade of Fossil Fuel

Authors: Semanur Soyyigit Kaya, Ercan Eren

Abstract:

Energy has a prominent role for development of nations. Countries which have energy resources also have strategic power in the international trade of energy since it is essential for all stages of production in the economy. Thus, it is important for countries to analyze the weakness and strength of the system. On the other side, it is commonly believed that international trade has complex network properties. Complex network is a tool for the analysis of complex systems with heterogeneous agents and interaction between them. A complex network consists of nodes and the interactions between these nodes. Total properties which emerge as a result of these interactions are distinct from the sum of small parts (more or less) in complex systems. Thus, standard approaches to international trade are superficial to analyze these systems. Network analysis provides a new approach to analyze international trade as a network. In this network countries constitute nodes and trade relations (export or import) constitute edges. It becomes possible to analyze international trade network in terms of high degree indicators which are specific to complex systems such as connectivity, clustering, assortativity/disassortativity, centrality, etc. In this analysis, international trade of crude oil and coal which are types of fossil fuel has been analyzed from 2005 to 2014 via network analysis. First, it has been analyzed in terms of some topological parameters such as density, transitivity, clustering etc. Afterwards, fitness to Pareto distribution has been analyzed. Finally, weighted HITS algorithm has been applied to the data as a centrality measure to determine the real prominence of countries in these trade networks. Weighted HITS algorithm is a strong tool to analyze the network by ranking countries with regards to prominence of their trade partners. We have calculated both an export centrality and an import centrality by applying w-HITS algorithm to data.

Keywords: complex network approach, fossil fuel, international trade, network theory

Procedia PDF Downloads 336
28525 Genetic Diversity of Sorghum bicolor (L.) Moench Genotypes as Revealed by Microsatellite Markers

Authors: Maletsema Alina Mofokeng, Hussein Shimelis, Mark Laing, Pangirayi Tongoona

Abstract:

Sorghum is one of the most important cereal crops grown for food, feed and bioenergy. Knowledge of genetic diversity is important for conservation of genetic resources and improvement of crop plants through breeding. The objective of this study was to assess the level of genetic diversity among sorghum genotypes using microsatellite markers. A total of 103 accessions of sorghum genotypes obtained from the Department of Agriculture, Forestry and Fisheries, the African Centre for Crop Improvement and Agricultural Research Council-Grain Crops Institute collections in South Africa were estimated using 30 microsatellite markers. For all the loci analysed, 306 polymorphic alleles were detected with a mean value of 6.4 per locus. The polymorphic information content had an average value of 0.50 with heterozygosity mean value of 0.55 suggesting an important genetic diversity within the sorghum genotypes used. The unweighted pair group method with arithmetic mean clustering based on Euclidian coefficients revealed two major distinct groups without allocating genotypes based on the source of collection or origin. The genotypes 4154.1.1.1, 2055.1.1.1, 4441.1.1.1, 4442.1.1.1, 4722.1.1.1, and 4606.1.1.1 were the most diverse. The sorghum genotypes with high genetic diversity could serve as important sources of novel alleles for breeding and strategic genetic conservation.

Keywords: Genetic Diversity, Genotypes, Microsatellites, Sorghum

Procedia PDF Downloads 376
28524 Modelling of Heating and Evaporation of Biodiesel Fuel Droplets

Authors: Mansour Al Qubeissi, Sergei S. Sazhin, Cyril Crua, Morgan R. Heikal

Abstract:

This paper presents the application of the Discrete Component Model for heating and evaporation to multi-component biodiesel fuel droplets in direct injection internal combustion engines. This model takes into account the effects of temperature gradient, recirculation and species diffusion inside droplets. A distinctive feature of the model used in the analysis is that it is based on the analytical solutions to the temperature and species diffusion equations inside the droplets. Nineteen types of biodiesel fuels are considered. It is shown that a simplistic model, based on the approximation of biodiesel fuel by a single component or ignoring the diffusion of components of biodiesel fuel, leads to noticeable errors in predicted droplet evaporation time and time evolution of droplet surface temperature and radius.

Keywords: heat/mass transfer, biodiesel, multi-component fuel, droplet

Procedia PDF Downloads 568
28523 A Literature Review on the Role of Local Potential for Creative Industries

Authors: Maya Irjayanti

Abstract:

Local creativity utilization has been a strategic investment to be expanded as a creative industry due to its significant contribution to the national gross domestic product. Many developed and developing countries look toward creative industries as an agenda for the economic growth. This study aims to identify the role of local potential for creative industries from various empirical studies. The method performed in this study will involve a peer-reviewed journal articles and conference papers review addressing local potential and creative industries. The literature review analysis will include several steps: material collection, descriptive analysis, category selection, and material evaluation. Finally, the outcome expected provides a creative industries clustering based on the local potential of various nations. In addition, the finding of this study will be used as future research reference to explore a particular area with well-known aspects of local potential for creative industry products.

Keywords: business, creativity, local potential, local wisdom

Procedia PDF Downloads 386
28522 An Adaptive Dimensionality Reduction Approach for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah, Basel Solaiman

Abstract:

With the development of HyperSpectral Imagery (HSI) technology, the spectral resolution of HSI became denser, which resulted in large number of spectral bands, high correlation between neighboring, and high data redundancy. However, the semantic interpretation is a challenging task for HSI analysis due to the high dimensionality and the high correlation of the different spectral bands. In fact, this work presents a dimensionality reduction approach that allows to overcome the different issues improving the semantic interpretation of HSI. Therefore, in order to preserve the spatial information, the Tensor Locality Preserving Projection (TLPP) has been applied to transform the original HSI. In the second step, knowledge has been extracted based on the adjacency graph to describe the different pixels. Based on the transformation matrix using TLPP, a weighted matrix has been constructed to rank the different spectral bands based on their contribution score. Thus, the relevant bands have been adaptively selected based on the weighted matrix. The performance of the presented approach has been validated by implementing several experiments, and the obtained results demonstrate the efficiency of this approach compared to various existing dimensionality reduction techniques. Also, according to the experimental results, we can conclude that this approach can adaptively select the relevant spectral improving the semantic interpretation of HSI.

Keywords: band selection, dimensionality reduction, feature extraction, hyperspectral imagery, semantic interpretation

Procedia PDF Downloads 354
28521 DISGAN: Efficient Generative Adversarial Network-Based Method for Cyber-Intrusion Detection

Authors: Hongyu Chen, Li Jiang

Abstract:

Ubiquitous anomalies endanger the security of our system con- stantly. They may bring irreversible damages to the system and cause leakage of privacy. Thus, it is of vital importance to promptly detect these anomalies. Traditional supervised methods such as Decision Trees and Support Vector Machine (SVM) are used to classify normality and abnormality. However, in some case, the abnormal status are largely rarer than normal status, which leads to decision bias of these methods. Generative adversarial network (GAN) has been proposed to handle the case. With its strong generative ability, it only needs to learn the distribution of normal status, and identify the abnormal status through the gap between it and the learned distribution. Nevertheless, existing GAN-based models are not suitable to process data with discrete values, leading to immense degradation of detection performance. To cope with the discrete features, in this paper, we propose an efficient GAN-based model with specifically-designed loss function. Experiment results show that our model outperforms state-of-the-art models on discrete dataset and remarkably reduce the overhead.

Keywords: GAN, discrete feature, Wasserstein distance, multiple intermediate layers

Procedia PDF Downloads 129
28520 Remaining Useful Life Estimation of Bearings Based on Nonlinear Dimensional Reduction Combined with Timing Signals

Authors: Zhongmin Wang, Wudong Fan, Hengshan Zhang, Yimin Zhou

Abstract:

In data-driven prognostic methods, the prediction accuracy of the estimation for remaining useful life of bearings mainly depends on the performance of health indicators, which are usually fused some statistical features extracted from vibrating signals. However, the existing health indicators have the following two drawbacks: (1) The differnet ranges of the statistical features have the different contributions to construct the health indicators, the expert knowledge is required to extract the features. (2) When convolutional neural networks are utilized to tackle time-frequency features of signals, the time-series of signals are not considered. To overcome these drawbacks, in this study, the method combining convolutional neural network with gated recurrent unit is proposed to extract the time-frequency image features. The extracted features are utilized to construct health indicator and predict remaining useful life of bearings. First, original signals are converted into time-frequency images by using continuous wavelet transform so as to form the original feature sets. Second, with convolutional and pooling layers of convolutional neural networks, the most sensitive features of time-frequency images are selected from the original feature sets. Finally, these selected features are fed into the gated recurrent unit to construct the health indicator. The results state that the proposed method shows the enhance performance than the related studies which have used the same bearing dataset provided by PRONOSTIA.

Keywords: continuous wavelet transform, convolution neural net-work, gated recurrent unit, health indicators, remaining useful life

Procedia PDF Downloads 133
28519 A Survey on Types of Noises and De-Noising Techniques

Authors: Amandeep Kaur

Abstract:

Digital Image processing is a fundamental tool to perform various operations on the digital images for pattern recognition, noise removal and feature extraction. In this paper noise removal technique has been described for various types of noises. This paper comprises discussion about various noises available in the image due to different environmental, accidental factors. In this paper, various de-noising approaches have been discussed that utilize different wavelets and filters for de-noising. By analyzing various papers on image de-noising we extract that wavelet based de-noise approaches are much effective as compared to others.

Keywords: de-noising techniques, edges, image, image processing

Procedia PDF Downloads 336
28518 Task Evoked Pupillary Response for Surgical Task Difficulty Prediction via Multitask Learning

Authors: Beilei Xu, Wencheng Wu, Lei Lin, Rachel Melnyk, Ahmed Ghazi

Abstract:

In operating rooms, excessive cognitive stress can impede the performance of a surgeon, while low engagement can lead to unavoidable mistakes due to complacency. As a consequence, there is a strong desire in the surgical community to be able to monitor and quantify the cognitive stress of a surgeon while performing surgical procedures. Quantitative cognitiveload-based feedback can also provide valuable insights during surgical training to optimize training efficiency and effectiveness. Various physiological measures have been evaluated for quantifying cognitive stress for different mental challenges. In this paper, we present a study using the cognitive stress measured by the task evoked pupillary response extracted from the time series eye-tracking measurements to predict task difficulties in a virtual reality based robotic surgery training environment. In particular, we proposed a differential-task-difficulty scale, utilized a comprehensive feature extraction approach, and implemented a multitask learning framework and compared the regression accuracy between the conventional single-task-based and three multitask approaches across subjects.

Keywords: surgical metric, task evoked pupillary response, multitask learning, TSFresh

Procedia PDF Downloads 146
28517 Liver Lesion Extraction with Fuzzy Thresholding in Contrast Enhanced Ultrasound Images

Authors: Abder-Rahman Ali, Adélaïde Albouy-Kissi, Manuel Grand-Brochier, Viviane Ladan-Marcus, Christine Hoeffl, Claude Marcus, Antoine Vacavant, Jean-Yves Boire

Abstract:

In this paper, we present a new segmentation approach for focal liver lesions in contrast enhanced ultrasound imaging. This approach, based on a two-cluster Fuzzy C-Means methodology, considers type-II fuzzy sets to handle uncertainty due to the image modality (presence of speckle noise, low contrast, etc.), and to calculate the optimum inter-cluster threshold. Fine boundaries are detected by a local recursive merging of ambiguous pixels. The method has been tested on a representative database. Compared to both Otsu and type-I Fuzzy C-Means techniques, the proposed method significantly reduces the segmentation errors.

Keywords: defuzzification, fuzzy clustering, image segmentation, type-II fuzzy sets

Procedia PDF Downloads 485
28516 Learning Grammars for Detection of Disaster-Related Micro Events

Authors: Josef Steinberger, Vanni Zavarella, Hristo Tanev

Abstract:

Natural disasters cause tens of thousands of victims and massive material damages. We refer to all those events caused by natural disasters, such as damage on people, infrastructure, vehicles, services and resource supply, as micro events. This paper addresses the problem of micro - event detection in online media sources. We present a natural language grammar learning algorithm and apply it to online news. The algorithm in question is based on distributional clustering and detection of word collocations. We also explore the extraction of micro-events from social media and describe a Twitter mining robot, who uses combinations of keywords to detect tweets which talk about effects of disasters.

Keywords: online news, natural language processing, machine learning, event extraction, crisis computing, disaster effects, Twitter

Procedia PDF Downloads 478
28515 Reed: An Approach Towards Quickly Bootstrapping Multilingual Acoustic Models

Authors: Bipasha Sen, Aditya Agarwal

Abstract:

Multilingual automatic speech recognition (ASR) system is a single entity capable of transcribing multiple languages sharing a common phone space. Performance of such a system is highly dependent on the compatibility of the languages. State of the art speech recognition systems are built using sequential architectures based on recurrent neural networks (RNN) limiting the computational parallelization in training. This poses a significant challenge in terms of time taken to bootstrap and validate the compatibility of multiple languages for building a robust multilingual system. Complex architectural choices based on self-attention networks are made to improve the parallelization thereby reducing the training time. In this work, we propose Reed, a simple system based on 1D convolutions which uses very short context to improve the training time. To improve the performance of our system, we use raw time-domain speech signals directly as input. This enables the convolutional layers to learn feature representations rather than relying on handcrafted features such as MFCC. We report improvement on training and inference times by atleast a factor of 4x and 7.4x respectively with comparable WERs against standard RNN based baseline systems on SpeechOcean's multilingual low resource dataset.

Keywords: convolutional neural networks, language compatibility, low resource languages, multilingual automatic speech recognition

Procedia PDF Downloads 123
28514 Multimodal Deep Learning for Human Activity Recognition

Authors: Ons Slimene, Aroua Taamallah, Maha Khemaja

Abstract:

In recent years, human activity recognition (HAR) has been a key area of research due to its diverse applications. It has garnered increasing attention in the field of computer vision. HAR plays an important role in people’s daily lives as it has the ability to learn advanced knowledge about human activities from data. In HAR, activities are usually represented by exploiting different types of sensors, such as embedded sensors or visual sensors. However, these sensors have limitations, such as local obstacles, image-related obstacles, sensor unreliability, and consumer concerns. Recently, several deep learning-based approaches have been proposed for HAR and these approaches are classified into two categories based on the type of data used: vision-based approaches and sensor-based approaches. This research paper highlights the importance of multimodal data fusion from skeleton data obtained from videos and data generated by embedded sensors using deep neural networks for achieving HAR. We propose a deep multimodal fusion network based on a twostream architecture. These two streams use the Convolutional Neural Network combined with the Bidirectional LSTM (CNN BILSTM) to process skeleton data and data generated by embedded sensors and the fusion at the feature level is considered. The proposed model was evaluated on a public OPPORTUNITY++ dataset and produced a accuracy of 96.77%.

Keywords: human activity recognition, action recognition, sensors, vision, human-centric sensing, deep learning, context-awareness

Procedia PDF Downloads 101
28513 Emotion Recognition with Occlusions Based on Facial Expression Reconstruction and Weber Local Descriptor

Authors: Jadisha Cornejo, Helio Pedrini

Abstract:

Recognition of emotions based on facial expressions has received increasing attention from the scientific community over the last years. Several fields of applications can benefit from facial emotion recognition, such as behavior prediction, interpersonal relations, human-computer interactions, recommendation systems. In this work, we develop and analyze an emotion recognition framework based on facial expressions robust to occlusions through the Weber Local Descriptor (WLD). Initially, the occluded facial expressions are reconstructed following an extension approach of Robust Principal Component Analysis (RPCA). Then, WLD features are extracted from the facial expression representation, as well as Local Binary Patterns (LBP) and Histogram of Oriented Gradients (HOG). The feature vector space is reduced using Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA). Finally, K-Nearest Neighbor (K-NN) and Support Vector Machine (SVM) classifiers are used to recognize the expressions. Experimental results on three public datasets demonstrated that the WLD representation achieved competitive accuracy rates for occluded and non-occluded facial expressions compared to other approaches available in the literature.

Keywords: emotion recognition, facial expression, occlusion, fiducial landmarks

Procedia PDF Downloads 182
28512 Information Retrieval for Kafficho Language

Authors: Mareye Zeleke Mekonen

Abstract:

The Kafficho language has distinct issues in information retrieval because of its restricted resources and dearth of standardized methods. In this endeavor, with the cooperation and support of linguists and native speakers, we investigate the creation of information retrieval systems specifically designed for the Kafficho language. The Kafficho information retrieval system allows Kafficho speakers to access information easily in an efficient and effective way. Our objective is to conduct an information retrieval experiment using 220 Kafficho text files, including fifteen sample questions. Tokenization, normalization, stop word removal, stemming, and other data pre-processing chores, together with additional tasks like term weighting, were prerequisites for the vector space model to represent each page and a particular query. The three well-known measurement metrics we used for our word were Precision, Recall, and and F-measure, with values of 87%, 28%, and 35%, respectively. This demonstrates how well the Kaffiho information retrieval system performed well while utilizing the vector space paradigm.

Keywords: Kafficho, information retrieval, stemming, vector space

Procedia PDF Downloads 57
28511 Network Word Discovery Framework Based on Sentence Semantic Vector Similarity

Authors: Ganfeng Yu, Yuefeng Ma, Shanliang Yang

Abstract:

The word discovery is a key problem in text information retrieval technology. Methods in new word discovery tend to be closely related to words because they generally obtain new word results by analyzing words. With the popularity of social networks, individual netizens and online self-media have generated various network texts for the convenience of online life, including network words that are far from standard Chinese expression. How detect network words is one of the important goals in the field of text information retrieval today. In this paper, we integrate the word embedding model and clustering methods to propose a network word discovery framework based on sentence semantic similarity (S³-NWD) to detect network words effectively from the corpus. This framework constructs sentence semantic vectors through a distributed representation model, uses the similarity of sentence semantic vectors to determine the semantic relationship between sentences, and finally realizes network word discovery by the meaning of semantic replacement between sentences. The experiment verifies that the framework not only completes the rapid discovery of network words but also realizes the standard word meaning of the discovery of network words, which reflects the effectiveness of our work.

Keywords: text information retrieval, natural language processing, new word discovery, information extraction

Procedia PDF Downloads 95
28510 Computing Customer Lifetime Value in E-Commerce Websites with Regard to Returned Orders and Payment Method

Authors: Morteza Giti

Abstract:

As online shopping is becoming increasingly popular, computing customer lifetime value for better knowing the customers is also gaining more importance. Two distinct factors that can affect the value of a customer in the context of online shopping is the number of returned orders and payment method. Returned orders are those which have been shipped but not collected by the customer and are returned to the store. Payment method refers to the way that customers choose to pay for the price of the order which are usually two: Pre-pay and Cash-on-delivery. In this paper, a novel model called RFMSP is presented to calculated the customer lifetime value, taking these two parameters into account. The RFMSP model is based on the common RFM model while adding two extra parameter. The S represents the order status and the P indicates the payment method. As a case study for this model, the purchase history of customers in an online shop is used to compute the customer lifetime value over a period of twenty months.

Keywords: RFMSP model, AHP, customer lifetime value, k-means clustering, e-commerce

Procedia PDF Downloads 321