Search results for: k-means clustering based feature weighting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 28563

27813 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the meta-learning MAML algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.

Keywords: data augmentation, mutex task generation, meta-learning, text classification.

Procedia PDF Downloads 74
27812 Relay Node Selection Algorithm for Cooperative Communications in Wireless Networks

Authors: Sunmyeng Kim

Abstract:

IEEE 802.11a/b/g standards support multiple transmission rates. Even though the use of multiple transmission rates increases WLAN capacity, this feature leads to the performance anomaly problem. Cooperative communication was introduced to relieve this problem: data packets are delivered to the destination much faster through a relay node at a high rate than through direct transmission to the destination at a low rate. In the legacy cooperative protocols, a source node chooses a relay node based only on the transmission rate. Therefore, they are not well suited to multi-flow environments, since they do not consider the effect of other flows. To alleviate this effect, we propose a new relay node selection algorithm based on the transmission rate and the channel contention level. Performance evaluation is conducted using simulation and shows that the proposed protocol significantly outperforms the previous protocol in terms of throughput and delay.
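The selection rule described above can be sketched as follows. The abstract does not say how rate and contention are combined, so the `contention` value and the way it stretches airtime are assumptions made for illustration, as are the function names.

```python
def tx_time(bits, rate_mbps):
    """Payload airtime in microseconds for `bits` sent at `rate_mbps`."""
    return bits / rate_mbps


def pick_relay(bits, direct_rate, relays):
    """Return the name of the cheapest relay, or None if direct wins.

    `relays` maps a relay name to (rate_src_to_relay, rate_relay_to_dst,
    contention). `contention` in [0, 1) is a hypothetical measure of how
    busy the channel is around that relay.
    """
    best_name, best_cost = None, tx_time(bits, direct_rate)
    for name, (rate_1, rate_2, contention) in relays.items():
        two_hop = tx_time(bits, rate_1) + tx_time(bits, rate_2)
        # Assumed penalty: airtime grows with the share of time the
        # channel is unavailable to this relay.
        cost = two_hop / (1.0 - contention)
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name


relays = {"A": (54, 54, 0.8), "B": (24, 24, 0.1)}
print(pick_relay(12000, 6, relays))
```

With these illustrative numbers, a rate-only rule would choose relay A, whereas the contention-aware cost prefers relay B.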

Keywords: cooperative communications, MAC protocol, relay node, WLAN

Procedia PDF Downloads 314
27811 COVID-19 Analysis with Deep Learning Model Using Chest X-Rays Images

Authors: Uma Maheshwari V., Rajanikanth Aluvalu, Kumar Gautam

Abstract:

COVID-19 is a highly contagious viral infection with major worldwide health implications, and the global economy has suffered as a result. The spread of this pandemic disease can be slowed if positive patients are found early. COVID-19 prediction is useful for identifying patients who are at risk. Deep learning and machine learning algorithms for COVID prediction using X-rays have the potential to be extremely useful in addressing the scarcity of doctors and clinicians in remote places. In this paper, a convolutional neural network (CNN) with deep layers is presented for recognizing COVID-19 patients using real-world datasets. We gathered around 6000 X-ray scan images from various sources and split them into two categories: normal and COVID-impacted. Our model examines chest X-ray images to recognize such patients. X-rays are commonly available and affordable, and our findings show that X-ray analysis is effective in COVID diagnosis. The predictions performed well, with an average accuracy of 99% on training images and 88% on X-ray test images.

Keywords: deep CNN, COVID-19 analysis, feature extraction, feature map, accuracy

Procedia PDF Downloads 57
27810 Synthetic Aperture Radar Remote Sensing Classification Using the Bag of Visual Words Model to Land Cover Studies

Authors: Reza Mohammadi, Mahmod R. Sahebi, Mehrnoosh Omati, Milad Vahidi

Abstract:

Classification of high-resolution polarimetric Synthetic Aperture Radar (PolSAR) images plays an important role in land cover and land use management. Recently, classification algorithms based on the Bag of Visual Words (BOVW) model have attracted significant interest among scholars and researchers in and out of the field of remote sensing. In this paper, the BOVW model with pixel-based low-level features has been implemented to classify a subset of a San Francisco Bay PolSAR image acquired by RADARSAT-2 in C-band. We used a segment-based decision-making strategy and compared the result with that of a traditional Support Vector Machine (SVM) classifier. The 90.95% overall classification accuracy shows that the proposed algorithm is comparable with state-of-the-art methods. In addition to increasing the classification accuracy, the proposed method decreased the undesirable speckle effect of SAR images.

Keywords: Bag of Visual Words (BOVW), classification, feature extraction, land cover management, Polarimetric Synthetic Aperture Radar (PolSAR)

Procedia PDF Downloads 189
27809 Predicting Match Outcomes in Team Sport via Machine Learning: Evidence from National Basketball Association

Authors: Jacky Liu

Abstract:

This paper develops a team sports outcome prediction system with potential for wide-ranging applications across various disciplines. Despite significant advancements in predictive analytics, existing studies in sports outcome prediction have considerable limitations, including insufficient feature engineering and underutilization of advanced machine learning techniques. To address these issues, we extend the Sports Cross Industry Standard Process for Data Mining (SRP-CRISP-DM) framework and propose a comprehensive predictive system, using National Basketball Association (NBA) data as an example to test this extended framework. Our approach follows a holistic methodology in feature engineering, employing both time series and non-time series data, as well as conducting exploratory data analysis and feature selection. Furthermore, we contribute to the discourse on target variable choice in team sports outcome prediction, asserting that point spread prediction yields higher profits than game-winner prediction. Using machine learning algorithms, particularly XGBoost, results in a significant improvement in the predictive accuracy of team sports outcomes. Applied to point spread betting strategies, it offers an annual return of approximately 900% on an initial investment of $100. Our findings not only contribute to the academic literature but also have practical implications for sports betting. Our study advances the understanding of team sports outcome prediction, a burgeoning area in complex system prediction, and paves the way for potential profitability and more informed decision making in sports betting markets.

Keywords: machine learning, team sports, game outcome prediction, sports betting, profits simulation

Procedia PDF Downloads 79
27808 Analysis of Ozone Episodes in the Forest and Vegetation Areas with Using HYSPLIT Model: A Case Study of the North-West Side of Biga Peninsula, Turkey

Authors: Deniz Sari, Selahattin İncecik, Nesimi Ozkurt

Abstract:

Surface ozone, named as one of the most critical pollutants of the 21st century, threatens human health, forests, and vegetation. In rural areas in particular, surface ozone significantly affects agricultural production and trees. In this study, in order to understand surface ozone levels in rural areas, we focus on the north-western side of the Biga Peninsula, which is covered by mountainous and forested terrain. Ozone concentrations were measured for the first time with passive sampling at 10 sites and two online monitoring stations in this rural area from 2013 to 2015. The AOT40 (Accumulated hourly O3 concentrations Over a Threshold of 40 ppb) cumulative index was calculated from the daytime hourly O3 measurements during light hours (08:00–20:00) exceeding the threshold of 40 ppb, over three months (May, June and July) for agricultural crops and over six months (April to September) for forest trees. AOT40 is defined by EU Directive 2008/50/EC to evaluate whether ozone pollution is a risk for vegetation, and is calculated using hourly ozone concentrations from monitoring systems. In the present study, we performed trajectory analysis with the Hybrid Single-Particle Lagrangian Integrated Trajectory (HYSPLIT) model to trace the long-range transport sources contributing to the high ozone levels in the region. The ozone episodes observed between 2013 and 2015 were analysed using the HYSPLIT model developed by NOAA-ARL. In addition, cluster analysis was used to identify homogeneous groups of air mass transport patterns by grouping trajectories with similar air mass movement. Backward trajectories produced for 3 years by the HYSPLIT model were assigned to different clusters according to their moving speed and direction using a k-means clustering algorithm. According to the cluster analysis results, northerly flows into the study area cause high ozone levels in the region.
The results show that the ozone values in the study area are above the critical levels for forest and vegetation based on EU Directive 2008/50/EC.
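As a rough illustration of the AOT40 index described in this abstract, the sketch below sums the hourly exceedances above 40 ppb within the 08:00–20:00 window. The input layout and the convention that hour `h` labels the interval from `h`:00 to `h`+1:00 are assumptions; the authors' exact data handling is not given.

```python
def aot40(hourly_ppb, threshold=40.0, start_hour=8, end_hour=20):
    """Accumulate exceedances over `threshold` during daylight hours.

    `hourly_ppb` is an iterable of (hour_of_day, concentration_ppb)
    pairs. Hours are assumed to label the start of each hourly mean,
    so the 08:00-20:00 window covers hours 8 through 19.
    Returns the index in ppb·h.
    """
    total = 0.0
    for hour, ppb in hourly_ppb:
        if start_hour <= hour < end_hour and ppb > threshold:
            total += ppb - threshold
    return total


# Illustrative values only, not measurements from the study.
sample = [(7, 60), (8, 45), (12, 70), (13, 30), (19, 41), (20, 90)]
print(aot40(sample))
```

Values outside the daylight window and values at or below the threshold contribute nothing, so only the 45, 70, and 41 ppb readings count in this example.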

Keywords: AOT40, Biga Peninsula, HYSPLIT, surface ozone

Procedia PDF Downloads 233
27807 A Literature Review on the Role of Local Potential for Creative Industries

Authors: Maya Irjayanti

Abstract:

The utilization of local creativity has become a strategic investment to be expanded as a creative industry, due to its significant contribution to national gross domestic product. Many developed and developing countries look toward creative industries as an agenda for economic growth. This study aims to identify the role of local potential for creative industries from various empirical studies. The method involves a review of peer-reviewed journal articles and conference papers addressing local potential and creative industries. The literature review analysis will include several steps: material collection, descriptive analysis, category selection, and material evaluation. The expected outcome is a clustering of creative industries based on the local potential of various nations. In addition, the findings of this study will be used as a reference for future research exploring particular areas with well-known aspects of local potential for creative industry products.

Keywords: business, creativity, local potential, local wisdom

Procedia PDF Downloads 355
27806 Network Word Discovery Framework Based on Sentence Semantic Vector Similarity

Authors: Ganfeng Yu, Yuefeng Ma, Shanliang Yang

Abstract:

Word discovery is a key problem in text information retrieval technology. Methods for new word discovery tend to be closely tied to words, because they generally obtain new word results by analyzing words. With the popularity of social networks, individual netizens and online self-media have generated various network texts for the convenience of online life, including network words that are far from standard Chinese expression. How to detect network words is one of the important goals in the field of text information retrieval today. In this paper, we integrate a word embedding model and clustering methods to propose a network word discovery framework based on sentence semantic similarity (S³-NWD) to detect network words effectively from a corpus. This framework constructs sentence semantic vectors through a distributed representation model, uses the similarity of sentence semantic vectors to determine the semantic relationship between sentences, and finally realizes network word discovery by means of semantic replacement between sentences. The experiment verifies that the framework not only discovers network words rapidly but also recovers the standard word meaning of the discovered network words, which reflects the effectiveness of our work.

Keywords: text information retrieval, natural language processing, new word discovery, information extraction

Procedia PDF Downloads 75
27805 A Mutually Exclusive Task Generation Method Based on Data Augmentation

Authors: Haojie Wang, Xun Li, Rui Yin

Abstract:

In order to solve the memorization overfitting in the model-agnostic meta-learning (MAML) algorithm, a method of generating mutually exclusive tasks based on data augmentation is proposed. This method generates a mutex task by mapping one feature of the data to multiple labels, so that the generated mutex task is inconsistent with the data distribution in the initial dataset. Because generating mutex tasks for all data would produce a large amount of invalid data and, in the worst case, lead to exponential growth of computation, this paper also proposes a key data extraction method that extracts only part of the data to generate the mutex task. The experiments show that the method of generating mutually exclusive tasks can effectively solve the memorization overfitting in the meta-learning MAML algorithm.

Keywords: mutex task generation, data augmentation, meta-learning, text classification.

Procedia PDF Downloads 118
27804 TMIF: Transformer-Based Multi-Modal Interactive Fusion for Rumor Detection

Authors: Jiandong Lv, Xingang Wang, Cuiling Shao

Abstract:

The rapid development of social media platforms has made them one of the important news sources. While they provide people with convenient real-time communication channels, fake news and rumors also spread rapidly through social media platforms, misleading the public and even causing bad social impact. In view of the slow speed and poor consistency of manual rumor detection, we propose an end-to-end rumor detection model, TMIF, which captures the dependencies between multimodal data based on the interactive attention mechanism, uses a transformer for cross-modal feature sequence mapping, and combines hybrid fusion strategies to obtain decision results. The model is evaluated on two multi-modal rumor detection datasets, and the results demonstrate its superior performance and early detection performance.

Keywords: hybrid fusion, multimodal fusion, rumor detection, social media, transformer

Procedia PDF Downloads 205
27803 Remaining Useful Life Estimation of Bearings Based on Nonlinear Dimensional Reduction Combined with Timing Signals

Authors: Zhongmin Wang, Wudong Fan, Hengshan Zhang, Yimin Zhou

Abstract:

In data-driven prognostic methods, the accuracy of remaining useful life estimation for bearings mainly depends on the performance of health indicators, which are usually fused from statistical features extracted from vibration signals. However, the existing health indicators have two drawbacks: (1) the different ranges of the statistical features contribute differently to the construction of the health indicators, so expert knowledge is required to extract the features; (2) when convolutional neural networks are used to process time-frequency features of signals, the time series of the signals is not considered. To overcome these drawbacks, this study proposes a method combining a convolutional neural network with a gated recurrent unit to extract time-frequency image features. The extracted features are used to construct a health indicator and predict the remaining useful life of bearings. First, original signals are converted into time-frequency images using the continuous wavelet transform so as to form the original feature sets. Second, with the convolutional and pooling layers of convolutional neural networks, the most sensitive features of the time-frequency images are selected from the original feature sets. Finally, these selected features are fed into the gated recurrent unit to construct the health indicator. The results indicate that the proposed method shows enhanced performance compared with related studies that used the same bearing dataset provided by PRONOSTIA.

Keywords: continuous wavelet transform, convolutional neural network, gated recurrent unit, health indicators, remaining useful life

Procedia PDF Downloads 111
27802 An Infinite Mixture Model for Modelling Stutter Ratio in Forensic Data Analysis

Authors: M. A. C. S. Sampath Fernando, James M. Curran, Renate Meyer

Abstract:

Forensic DNA analysis has received much attention over the last three decades, due to its incredible usefulness in human identification. The statistical interpretation of DNA evidence is recognised as one of the most mature fields in forensic science. Peak heights in an electropherogram (EPG) are approximately proportional to the amount of template DNA in the original sample being tested. A stutter is a minor peak in an EPG, which is not masking as an allele of a potential contributor, and is considered an artefact presumed to have arisen due to miscopying or slippage during the PCR. Stutter peaks are mostly analysed in terms of the stutter ratio, which is calculated relative to the corresponding parent allele height. Analysis of mixture profiles has always been problematic in evidence interpretation, especially in the presence of PCR artefacts like stutters. Unlike binary and semi-continuous models, continuous models assign a probability (as a continuous weight) to each possible genotype combination and significantly enhance the use of continuous peak height information, resulting in more efficient and reliable interpretations. Therefore, a sound methodology to distinguish between stutters and real alleles is essential for the accuracy of the interpretation. Sensibly, any such method has to be able to focus on modelling stutter peaks. Bayesian nonparametric methods provide increased flexibility in applied statistical modelling. Mixture models are frequently employed as fundamental data analysis tools in clustering and classification of data and assume unidentified heterogeneous sources for data. In model-based clustering, each unknown source is reflected by a cluster, and the clusters are modelled using parametric models. Specifying the number of components in finite mixture models, however, is practically difficult even though the calculations are relatively simple.
Infinite mixture models, in contrast, do not require the user to specify the number of components. Instead, a Dirichlet process, which is an infinite-dimensional generalization of the Dirichlet distribution, is used to deal with the problem of the number of components. The Chinese restaurant process (CRP), the stick-breaking process, and the Pólya urn scheme are frequently used as Dirichlet priors in Bayesian mixture models. In this study, we illustrate an infinite mixture of simple linear regression models for modelling stutter ratio and introduce some modifications to overcome weaknesses associated with the CRP.
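To make the Chinese restaurant process mentioned above concrete, here is a minimal sketch of the standard CRP seating rule: a new customer joins an existing table with probability proportional to its size, or opens a new table with probability proportional to a concentration parameter `alpha`. This is the textbook process only, not the authors' modified version, and the function name and parameters are illustrative.

```python
import random


def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Simulate CRP seating; returns (assignments, table_sizes).

    `alpha` (> 0) is the concentration parameter. Tables are numbered
    in the order in which they are opened.
    """
    rng = random.Random(seed)
    table_sizes = []
    assignments = []
    for _ in range(n_customers):
        # Existing tables weighted by occupancy, plus one "new table"
        # option weighted by alpha.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)   # open a new table (new component)
        else:
            table_sizes[table] += 1
        assignments.append(table)
    return assignments, table_sizes


assignments, sizes = chinese_restaurant_process(50, alpha=1.0)
print(len(sizes), "clusters for", sum(sizes), "observations")
```

The number of occupied tables plays the role of the number of mixture components, which is why it does not have to be fixed in advance.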

Keywords: Chinese restaurant process, Dirichlet prior, infinite mixture model, PCR stutter

Procedia PDF Downloads 309
27801 O-LEACH: The Problem of Orphan Nodes in the LEACH of Routing Protocol for Wireless Sensor Networks

Authors: Wassim Jerbi, Abderrahmen Guermazi, Hafedh Trabelsi

Abstract:

The optimum use of coverage in wireless sensor networks (WSNs) is very important. The LEACH protocol (Low Energy Adaptive Clustering Hierarchy) is a hierarchical clustering algorithm for wireless sensor networks that allows the formation of distributed clusters. In each cluster, LEACH randomly selects some sensor nodes called cluster heads (CHs). The selection of CHs is made with a probabilistic calculation. It is supposed that each non-CH node joins a cluster and becomes a cluster member. Nevertheless, some CHs can be concentrated in a specific part of the network, so that several sensor nodes cannot reach any CH. To solve this problem, we created the O-LEACH (Orphan nodes) protocol, whose role is to reduce the number of sensor nodes that do not belong to any cluster. A cluster member called the gateway receives messages from neighboring orphan nodes. The gateway informs its CH of the neighboring nodes that do not belong to any group. The gateway, called CH', then attaches the orphaned nodes to the cluster and collects their data. O-LEACH enables a new method of cluster formation, leading to a long network lifetime and minimal energy consumption. Orphan nodes possess enough energy and seek to be covered by the network. The principal novel contribution of the proposed work is the O-LEACH protocol, which provides coverage of the whole network with a minimum number of orphaned nodes and has a very high connectivity rate. As a result, the WSN application receives data from the entire network, including orphan nodes. The proper functioning of the application therefore requires intelligent management of the resources present within each network sensor. The simulation results show that O-LEACH performs better than LEACH in terms of coverage, connectivity rate, energy and scalability.
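The orphan-node idea can be illustrated with a small geometric sketch: nodes outside the radio range of every CH are orphans, and an orphan is attached through the nearest cluster member it can reach, which then acts as its gateway. The single shared `radio_range`, the nearest-neighbor rule, and all names are assumptions for illustration; the abstract does not specify these details.

```python
import math


def find_orphans(nodes, cluster_heads, radio_range):
    """Return (members, orphans).

    `nodes` maps a node id to (x, y). `members` maps each covered
    non-CH node to its nearest CH; `orphans` lists nodes with no CH
    within `radio_range`.
    """
    members, orphans = {}, []
    for node_id, pos in nodes.items():
        if node_id in cluster_heads:
            continue
        reachable = [(math.dist(pos, nodes[ch]), ch) for ch in cluster_heads
                     if math.dist(pos, nodes[ch]) <= radio_range]
        if reachable:
            members[node_id] = min(reachable)[1]
        else:
            orphans.append(node_id)
    return members, orphans


def attach_via_gateways(nodes, members, orphans, radio_range):
    """Map each orphan to the nearest cluster member in range, if any."""
    attached = {}
    for orphan in orphans:
        candidates = [(math.dist(nodes[orphan], nodes[m]), m) for m in members
                      if math.dist(nodes[orphan], nodes[m]) <= radio_range]
        if candidates:
            attached[orphan] = min(candidates)[1]  # this member is the gateway
    return attached


nodes = {"ch": (0, 0), "a": (8, 0), "b": (15, 0), "c": (40, 0)}
members, orphans = find_orphans(nodes, ["ch"], radio_range=10)
print(members, orphans, attach_via_gateways(nodes, members, orphans, 10))
```

In this toy layout, node `b` cannot reach the CH directly but is covered through member `a`, while node `c` remains out of reach of any member.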

Keywords: WSNs; routing; LEACH; O-LEACH; Orphan nodes; sub-cluster; gateway; CH’

Procedia PDF Downloads 350
27800 Liver Lesion Extraction with Fuzzy Thresholding in Contrast Enhanced Ultrasound Images

Authors: Abder-Rahman Ali, Adélaïde Albouy-Kissi, Manuel Grand-Brochier, Viviane Ladan-Marcus, Christine Hoeffl, Claude Marcus, Antoine Vacavant, Jean-Yves Boire

Abstract:

In this paper, we present a new segmentation approach for focal liver lesions in contrast enhanced ultrasound imaging. This approach, based on a two-cluster Fuzzy C-Means methodology, considers type-II fuzzy sets to handle uncertainty due to the image modality (presence of speckle noise, low contrast, etc.), and to calculate the optimum inter-cluster threshold. Fine boundaries are detected by a local recursive merging of ambiguous pixels. The method has been tested on a representative database. Compared to both Otsu and type-I Fuzzy C-Means techniques, the proposed method significantly reduces the segmentation errors.

Keywords: defuzzification, fuzzy clustering, image segmentation, type-II fuzzy sets

Procedia PDF Downloads 462
27799 Learning Grammars for Detection of Disaster-Related Micro Events

Authors: Josef Steinberger, Vanni Zavarella, Hristo Tanev

Abstract:

Natural disasters cause tens of thousands of victims and massive material damage. We refer to all those events caused by natural disasters, such as damage to people, infrastructure, vehicles, services and resource supply, as micro events. This paper addresses the problem of micro-event detection in online media sources. We present a natural language grammar learning algorithm and apply it to online news. The algorithm is based on distributional clustering and detection of word collocations. We also explore the extraction of micro-events from social media and describe a Twitter mining robot, which uses combinations of keywords to detect tweets that talk about the effects of disasters.

Keywords: online news, natural language processing, machine learning, event extraction, crisis computing, disaster effects, Twitter

Procedia PDF Downloads 463
27798 Fight against Money Laundering with Optical Character Recognition

Authors: Saikiran Subbagari, Avinash Malladhi

Abstract:

Anti-Money Laundering (AML) regulations are designed to prevent money laundering and terrorist financing activities worldwide. Financial institutions around the world are legally obligated to identify, assess and mitigate the risks associated with money laundering and to report any suspicious transactions to governing authorities. With increasing volumes of data to analyze, financial institutions seek to automate their AML processes. With the rise of financial crimes, optical character recognition (OCR), in combination with machine learning (ML) algorithms, serves as a crucial tool for automating AML processes by extracting data from documents and identifying suspicious transactions. In this paper, we examine the use of OCR for AML and delve into various OCR techniques employed in AML processes. These techniques encompass template-based, feature-based, and neural network-based approaches, natural language processing (NLP), hidden Markov models (HMMs), conditional random fields (CRFs), binarization, pattern matching and the stroke width transform (SWT). We evaluate each technique, discussing its strengths and constraints. We also emphasize how OCR can improve the accuracy of customer identity verification by comparing the extracted text with the Office of Foreign Assets Control (OFAC) watchlist, and we discuss how OCR helps to overcome language barriers in AML compliance. Finally, we address the implementation challenges that OCR-based AML systems may face and offer recommendations for financial institutions based on data from previous research studies, which illustrate the effectiveness of OCR-based AML.

Keywords: anti-money laundering, compliance, financial crimes, fraud detection, machine learning, optical character recognition

Procedia PDF Downloads 121
27797 Regression Analysis in Estimating Stream-Flow and the Effect of Hierarchical Clustering Analysis: A Case Study in Euphrates-Tigris Basin

Authors: Goksel Ezgi Guzey, Bihrat Onoz

Abstract:

The scarcity of streamflow gauging stations and the increasing effects of global warming make designing water management systems very difficult. This study contributes to the assessment of regional regression models for estimating streamflow. Simulated meteorological data were related to the observed streamflow data from 1971 to 2020 for 33 stream gauging stations of the Euphrates-Tigris Basin. Ordinary least squares regression was used to predict flow for 2020-2100 with the simulated meteorological data. The CORDEX-EURO and CORDEX-MENA domains were used with 0.11 and 0.22 grids, respectively, to estimate climate conditions under certain climate scenarios. Twelve meteorological variables simulated by two regional climate models, RCA4 and RegCM4, were used as independent variables in the ordinary least squares regression, where the observed streamflow was the dependent variable. The variability of streamflow was then calculated with 5-6 meteorological variables and watershed characteristics such as area and height prior to the application. Following the regression analysis of data from 31 stream gauging stations, the stations were subjected to a clustering analysis, which grouped them into two clusters in terms of their hydrometeorological properties. Two streamflow equations were found for the two clusters of stream gauging stations for every domain and every regional climate model, which increased the efficiency of streamflow estimation by 10-15% for all the models. This study underlines the importance of the homogeneity of a region in estimating streamflow, not only in terms of geographical location but also in terms of the meteorological characteristics of that region.

Keywords: hydrology, streamflow estimation, climate change, hydrologic modeling, HBV, hydropower

Procedia PDF Downloads 105
27796 Gear Fault Diagnosis Based on Optimal Morlet Wavelet Filter and Autocorrelation Enhancement

Authors: Mohamed El Morsy, Gabriela Achtenová

Abstract:

Condition monitoring is used to increase machinery availability and machinery performance, whilst reducing consequential damage, increasing machine life, reducing spare parts inventories, and reducing breakdown maintenance. An efficient condition monitoring system provides early warning of faults by predicting them at an early stage. When a localized fault occurs in gears, the vibration signals always exhibit non-stationary behavior. The periodic impulsive feature of the vibration signal appears in the time domain and the corresponding gear mesh frequency (GMF) emerges in the frequency domain. However, one limitation of frequency-domain analysis is its inability to handle non-stationary waveform signals, which are very common when machinery faults occur. Particularly at the early stage of gear failure, the GMF contains very little energy and is often overwhelmed by noise and higher-level macro-structural vibrations. An effective signal processing method would be necessary to remove such corrupting noise and interference. In this paper, a new hybrid method based on optimal Morlet wavelet filter and autocorrelation enhancement is presented. First, to eliminate the frequency associated with interferential vibrations, the vibration signal is filtered with a band-pass filter determined by a Morlet wavelet whose parameters are selected or optimized based on maximum Kurtosis. Then, to further reduce the residual in-band noise and highlight the periodic impulsive feature, an autocorrelation enhancement algorithm is applied to the filtered signal. The test stand is equipped with three dynamometers; the input dynamometer serves as the internal combustion engine, the output dynamometers induce a load on the output joint shaft flanges. The pitting defect is manufactured on the tooth side of a gear of the fifth speed on the secondary shaft. 
The gearbox used for experimental measurements is of the type most commonly used in modern small to mid-sized passenger cars with transversely mounted powertrain and front wheel drive: a five-speed gearbox with final drive gear and front wheel differential. The results obtained from practical experiments prove that the proposed method is very effective for gear fault diagnosis.
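Two statistics carry much of the method described above: kurtosis, used as the criterion for tuning the Morlet wavelet filter, and autocorrelation, used to emphasize periodic impulses. The sketch below computes both on a synthetic impulse train. It omits the wavelet filter itself, uses the non-excess (Pearson) kurtosis definition as an assumption, and is not the authors' implementation.

```python
def kurtosis(x):
    """Fourth standardized moment (equals 3 for a Gaussian signal)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    return m4 / (m2 * m2)


def autocorrelation(x, max_lag):
    """Normalized autocorrelation for lags 0..max_lag (lag 0 equals 1)."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    r0 = sum(v * v for v in centered)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / r0
        for lag in range(max_lag + 1)
    ]


# Synthetic stand-in for a fault signature: one impulse every 10 samples.
signal = [1.0 if i % 10 == 0 else 0.0 for i in range(100)]
acf = autocorrelation(signal, 20)
print(round(kurtosis(signal), 2), round(acf[10], 2), round(acf[5], 2))
```

The impulsive signal has a kurtosis well above 3, and its autocorrelation peaks at the impulse period (lag 10) while staying low between periods, which is the property the enhancement step relies on.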

Keywords: wavelet analysis, pitted gear, autocorrelation, gear fault diagnosis

Procedia PDF Downloads 373
27795 The Convolution Recurrent Network of Using Residual LSTM to Process the Output of the Downsampling for Monaural Speech Enhancement

Authors: Shibo Wei, Ting Jiang

Abstract:

Convolutional-recurrent neural networks (CRN) have achieved much success recently in the speech enhancement field. The common processing method is to use convolution layers to compress the feature space through multiple downsampling steps and then model the compressed features with an LSTM layer. Finally, the enhanced speech is obtained by deconvolution operations that integrate the global information of the speech sequence. However, the feature space compression process may cause a loss of information, so we propose to model the downsampling result of each step with a residual LSTM layer, then join it with the output of the deconvolution layer and feed both into the next deconvolution layer. In this way, we aim to integrate the global information of the speech sequence better. The experimental results show that the network model we introduce (RES-CRN) achieves better performance than an LSTM without residual connections and than simply stacking LSTMs in the original CRN, in terms of scale-invariant signal-to-noise ratio (SI-SNR), speech quality (PESQ), and intelligibility (STOI).
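For readers unfamiliar with the SI-SNR metric named in this abstract, the following is a minimal sketch of its usual definition: the estimate is projected onto the clean target, and the energy of that projection is compared with the energy of the residual. Removing the mean of both signals first is a common convention and is assumed here; the authors' exact evaluation code is not given.

```python
import math


def si_snr(estimate, target):
    """Scale-invariant SNR in dB between two equal-length signals.

    Undefined (division by zero) if the estimate is an exact scaled
    copy of the target, since the residual energy is then zero.
    """
    n = len(target)
    mean_e = sum(estimate) / n
    mean_t = sum(target) / n
    e = [v - mean_e for v in estimate]
    t = [v - mean_t for v in target]
    # Project the estimate onto the target to get the "signal" part.
    scale = sum(a * b for a, b in zip(e, t)) / sum(b * b for b in t)
    s_target = [scale * b for b in t]
    residual = [a - s for a, s in zip(e, s_target)]
    signal_energy = sum(s * s for s in s_target)
    noise_energy = sum(v * v for v in residual)
    return 10 * math.log10(signal_energy / noise_energy)


clean = [1.0, -1.0, 1.0, -1.0]
enhanced = [1.1, -0.9, 1.2, -1.0]
print(round(si_snr(enhanced, clean), 2))
print(round(si_snr([2 * v for v in enhanced], clean), 2))
```

Both calls give the same value, which illustrates the scale invariance: rescaling the enhanced signal does not change the score.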

Keywords: convolutional-recurrent neural networks, speech enhancement, residual LSTM, SI-SNR

Procedia PDF Downloads 177
27794 In-Depth Analysis on Sequence Evolution and Molecular Interaction of Influenza Receptors (Hemagglutinin and Neuraminidase)

Authors: Dong Tran, Thanh Dac Van, Ly Le

Abstract:

Hemagglutinin (HA) and neuraminidase (NA) play an important role in host immune evasion across the influenza virus evolution process. The correlation between HA and NA evolution with respect to epitopic evolution and drug interaction has yet to be investigated. In this study, by combining sequence-to-structure evolution with statistical analysis of epitopic/binding site specificity, we identified potential therapeutic features of HA and NA that show the specific antibody binding site of HA and the specific binding distribution of current inhibitors within the NA active site. Our approach introduces the use of sequence variation and molecular interaction to provide an effective strategy for establishing experimentally based distributed representations of protein-protein/ligand complexes. The most important advantage of our method is that it does not require a complete dataset of complexes but instead infers feature interaction directly from sequence variation and molecular interaction. Using correlated sequence analysis, we additionally identified co-evolved mutations associated with maintaining HA/NA structural and functional variability toward immunity and therapeutic treatment. Our investigation of HA binding specificity revealed that a unique conserved stalk domain interacts with a unique loop domain of universal antibodies (CR9114, CT149, CR8043, CR8020, FI6v3, CR6261, F10). On the other hand, NA inhibitors (Oseltamivir, Zanamivir, Laninamivir) showed specific conserved residue contributions similar to those of the NA substrate (sialic acid), which can be exploited for drug design. Our study provides important insight into the rational design and identification of novel therapeutics targeting universally recognized features of influenza HA/NA.

Keywords: influenza virus, hemagglutinin (HA), neuraminidase (NA), sequence evolution

Procedia PDF Downloads 140
27793 A Method of the Semantic on Image Auto-Annotation

Authors: Lin Huo, Xianwei Liu, Jingxiong Zhou

Abstract:

Recently, because of the semantic gap between the visual features of images and human concepts, the semantics of image auto-annotation has become an important topic. First, low-level visual features are extracted from the image and mapped by a hash method into the corresponding hash codes, which are then transformed into a group of binary strings and stored. Because image auto-annotation by search is a popular method, we use it to design and implement a method of semantic image auto-annotation. Finally, tests on the Corel image set show that this method is effective.
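
The hash-then-search pipeline described above can be sketched roughly as follows. The abstract does not name the hash method, so this sketch swaps in a simple mean-threshold binarization as a stand-in. The function names, feature vectors, and labels are all hypothetical.

```python
def to_hash_bits(features):
    """Binarize a feature vector: '1' where a value exceeds the vector's mean.

    Assumption: a stand-in for the paper's unspecified hash method.
    """
    threshold = sum(features) / len(features)
    return "".join("1" if v > threshold else "0" for v in features)


def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def annotate(query_features, database):
    """Return the labels of the stored image whose hash code is nearest."""
    query = to_hash_bits(query_features)
    best = min(database, key=lambda item: hamming(query, item[0]))
    return best[1]


# Hypothetical stored images: (binary hash string, annotation labels).
database = [
    (to_hash_bits([0.9, 0.8, 0.1, 0.2]), ["sunset", "sea"]),
    (to_hash_bits([0.1, 0.2, 0.9, 0.7]), ["forest", "grass"]),
]
labels = annotate([0.8, 0.7, 0.2, 0.1], database)
```

Here the query inherits the labels of its nearest stored hash code. A real system would likely aggregate labels over several neighbors.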

Keywords: image auto-annotation, color correlograms, Hash code, image retrieval

Procedia PDF Downloads 469
27792 Computing Customer Lifetime Value in E-Commerce Websites with Regard to Returned Orders and Payment Method

Authors: Morteza Giti

Abstract:

As online shopping becomes increasingly popular, computing customer lifetime value to better understand customers is also gaining importance. Two distinct factors that can affect the value of a customer in the context of online shopping are the number of returned orders and the payment method. Returned orders are those that have been shipped but not collected by the customer and are returned to the store. Payment method refers to how customers choose to pay for an order; there are usually two options: pre-pay and cash-on-delivery. In this paper, a novel model called RFMSP is presented to calculate the customer lifetime value, taking these two parameters into account. The RFMSP model is based on the common RFM model and adds two extra parameters: S represents the order status, and P indicates the payment method. As a case study for this model, the purchase history of customers in an online shop is used to compute the customer lifetime value over a period of twenty months.
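
As a minimal sketch of how an RFMSP-style score might be assembled, the following combines min-max-normalized R, F, M, S, and P components in a weighted sum. The weights, the field names, and the way S and P are derived (share of orders not returned, share of prepaid orders) are assumptions for illustration. The abstract does not give the actual formula, although its keywords suggest that AHP was used for weighting.

```python
from dataclasses import dataclass

# Hypothetical weights for illustration only; not taken from the paper.
WEIGHTS = {"R": 0.15, "F": 0.20, "M": 0.30, "S": 0.20, "P": 0.15}


@dataclass
class Customer:
    recency_days: int     # days since the last order (lower is better)
    frequency: int        # number of orders in the period
    monetary: float       # total spend in the period
    returned_orders: int  # orders shipped but not collected
    prepaid_orders: int   # orders paid in advance (the rest are cash-on-delivery)


def min_max(value, lo, hi):
    """Scale value into [0, 1]; return 0.0 when the range is degenerate."""
    return 0.0 if hi == lo else (value - lo) / (hi - lo)


def rfmsp_scores(customers):
    """Return one weighted RFMSP score per customer (higher is better)."""
    raw = []
    for c in customers:
        # Assumed definitions: S = share of orders kept, P = share prepaid.
        s = 1.0 - c.returned_orders / c.frequency if c.frequency else 0.0
        p = c.prepaid_orders / c.frequency if c.frequency else 0.0
        # Recency is negated so that a more recent order scores higher.
        raw.append({"R": -c.recency_days, "F": c.frequency,
                    "M": c.monetary, "S": s, "P": p})
    scores = []
    for row in raw:
        total = 0.0
        for key, weight in WEIGHTS.items():
            column = [r[key] for r in raw]
            total += weight * min_max(row[key], min(column), max(column))
        scores.append(total)
    return scores


customers = [
    Customer(recency_days=5, frequency=10, monetary=900.0,
             returned_orders=0, prepaid_orders=10),
    Customer(recency_days=200, frequency=2, monetary=60.0,
             returned_orders=2, prepaid_orders=0),
]
scores = rfmsp_scores(customers)
```

The resulting scores could then be fed to a clustering step such as k-means, which the keywords mention, to segment customers.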

Keywords: RFMSP model, AHP, customer lifetime value, k-means clustering, e-commerce

Procedia PDF Downloads 300
27791 Human Identification and Detection of Suspicious Incidents Based on Outfit Colors: Image Processing Approach in CCTV Videos

Authors: Thilini M. Yatanwala

Abstract:

Closed-circuit television (CCTV) surveillance systems have been used in public places for decades, and a large variety of data is produced every moment. However, most CCTV data is stored in isolation, without being integrated. As a result, identifying the behavior of suspicious people along with their location has become strenuous. This research was conducted to acquire more accurate, reliable, and timely information from CCTV video records. The implemented system can identify human objects in public places based on outfit colors. Inter-process communication technologies were used to implement the CCTV camera network to track people on the premises. The research was conducted in three stages. In the first stage, human objects were filtered from other movable objects present in public places. In the second stage, people were uniquely identified based on their outfit colors, and in the third stage, an individual was continuously tracked across the CCTV network. A face detection algorithm was implemented using a cascade classifier based on a trained model to detect human objects. A Haar-feature-based two-dimensional convolution operator was introduced to identify features of the human face, such as the eye region, the nose region, and the bridge of the nose, based on the darkness and lightness of the facial area. In the second stage, the outfit colors of human objects were analyzed by dividing the body area into upper-left, upper-right, lower-left, and lower-right regions. The mean color, mode color, and standard deviation of each region were extracted as crucial factors to uniquely identify a human object using a histogram-based approach. Color-based measurements were written to XML files, and separate directories were maintained to store the XML files for each camera according to time stamp.
In the third stage, inter-process communication techniques were used to implement an acknowledgement-based CCTV camera network to continuously track individuals across a network of cameras. Real-time analysis of the XML files generated at each camera can determine the path of an individual and so monitor the full activity sequence. Higher efficiency was achieved by sending and receiving acknowledgments only among adjacent cameras. Suspicious incidents, such as a person staying in a sensitive area for an extended period or a person disappearing from camera coverage, can be detected with this approach. The system was tested on 150 people with an accuracy of 82%. However, the approach was unable to produce the expected results in the presence of a group of people wearing similar outfits. The approach can be applied to any existing camera network without changing the physical arrangement of the CCTV cameras. The study of human identification and suspicious incident detection using outfit color analysis can achieve a higher level of accuracy, and the project will be continued by integrating motion and gait feature analysis techniques to derive more information from CCTV videos.
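
The second-stage color features (mean, mode, and standard deviation per body quadrant) might be computed along the following lines. This is a sketch under simplifying assumptions: each pixel is reduced to a single number, such as a hue or a quantized color index, and the function name and sample grid are hypothetical.

```python
from statistics import mean, mode, pstdev


def quadrant_color_features(region):
    """Split a 2-D grid of color values into four quadrants and summarize each.

    `region` is a list of rows; each value stands for one pixel's color.
    Returns a dict keyed by quadrant name.
    """
    rows, cols = len(region), len(region[0])
    mid_r, mid_c = rows // 2, cols // 2
    bounds = {
        "upper_left": (0, mid_r, 0, mid_c),
        "upper_right": (0, mid_r, mid_c, cols),
        "lower_left": (mid_r, rows, 0, mid_c),
        "lower_right": (mid_r, rows, mid_c, cols),
    }
    features = {}
    for name, (r0, r1, c0, c1) in bounds.items():
        pixels = [region[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        features[name] = {
            "mean": mean(pixels),
            "mode": mode(pixels),
            "std": pstdev(pixels),  # population standard deviation
        }
    return features


# Hypothetical 4x4 body region with one color value per pixel.
body = [
    [10, 10, 200, 200],
    [10, 12, 200, 202],
    [50, 50, 90, 90],
    [50, 54, 90, 94],
]
features = quadrant_color_features(body)
```

Two people could then be compared by the distance between their per-quadrant feature vectors. The abstract's reported limitation, groups wearing similar outfits, follows from such vectors being nearly identical.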

Keywords: CCTV surveillance, human detection and identification, image processing, inter-process communication, security, suspicious detection

Procedia PDF Downloads 165
27790 Low Overhead Dynamic Channel Selection with Cluster-Based Spatial-Temporal Station Reporting in Wireless Networks

Authors: Zeyad Abdelmageid, Xianbin Wang

Abstract:

Choosing the operational channel for a WLAN access point (AP) has traditionally been a static assignment process initiated by the user during deployment of the AP, which fails to cope with the dynamic conditions of the assigned channel at the station side afterward. However, the dramatically growing number of Wi-Fi APs and stations (STAs) operating in the unlicensed band has led to dynamic, distributed, and often severe interference. This highlights the urgent need for the AP to dynamically select the best overall channel of operation for the basic service set (BSS) by considering the distributed and changing channel conditions at all stations. Consequently, dynamic channel selection algorithms that consider feedback from the station side have been developed. Despite the significant performance improvement, existing channel selection algorithms suffer from very high feedback overhead. Feedback latency from the STAs, caused by the high overhead, can mean that the eventually selected channel is no longer optimal because of the dynamic sharing nature of the unlicensed band. This has inspired us to develop our own dynamic channel selection algorithm with reduced overhead through the proposed low-overhead, cluster-based station reporting mechanism. The main idea behind cluster-based station reporting is the observation that STAs that are very close to each other tend to have very similar channel conditions. Instead of requesting each STA to report on every candidate channel, which causes high overhead, the AP divides the STAs into clusters and then assigns each STA in each cluster one channel to report feedback on. With proper design of the cluster-based reporting, the AP does not lose any information about the channel conditions at the station side while reducing feedback overhead. The simulation results show equal performance and, at times, better performance with a fraction of the overhead.
We believe that this algorithm has great potential in designing future dynamic channel selection algorithms with low overhead.
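
The reporting idea can be illustrated with a small sketch. It assumes the clusters have already been computed (the keywords mention DBSCAN) and hands out candidate channels round-robin within each cluster. The round-robin rule, the function name, and the station and channel values are assumptions, not details from the paper.

```python
def assign_report_channels(clusters, channels):
    """Give each station one candidate channel to report on.

    Within a cluster the channels are handed out round-robin, so a cluster
    with at least len(channels) stations covers every candidate channel.
    """
    assignment = {}
    for stations in clusters:
        for i, sta in enumerate(stations):
            assignment[sta] = channels[i % len(channels)]
    return assignment


# Hypothetical clusters of nearby stations, e.g. obtained from DBSCAN.
clusters = [["sta1", "sta2", "sta3"], ["sta4", "sta5"]]
channels = [1, 6, 11]

assignment = assign_report_channels(clusters, channels)

# Overhead comparison: every STA reporting on every channel versus one
# report per STA under the cluster-based scheme.
reports_full = sum(len(c) for c in clusters) * len(channels)
reports_clustered = len(assignment)
```

In this toy case the report count drops from 15 to 5. Note that the second cluster has fewer stations than channels, so one channel goes unreported there; how the paper handles small clusters is not stated in the abstract.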

Keywords: channel assignment, Wi-Fi networks, clustering, DBSCAN, overhead

Procedia PDF Downloads 95
27789 An Improved Face Recognition Algorithm Using Histogram-Based Features in Spatial and Frequency Domains

Authors: Qiu Chen, Koji Kotani, Feifei Lee, Tadahiro Ohmi

Abstract:

In this paper, we propose an improved face recognition algorithm using histogram-based features in the spatial and frequency domains. To add spatial information about the face and improve recognition performance, a region-division (RD) method is utilized. The facial area is first divided into several regions; feature vectors of each facial part are then generated by a Binary Vector Quantization (BVQ) histogram using DCT coefficients in the low-frequency domain, as well as a Local Binary Pattern (LBP) histogram in the spatial domain. Recognition results for the different regions are first obtained separately and then fused by weighted averaging. The publicly available ORL database, which consists of 40 subjects with 10 images per subject containing variations in lighting, pose, and expression, is used to evaluate the proposed algorithm. It is demonstrated that face recognition using the RD method can achieve a much higher recognition rate.
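
For the spatial-domain part, a basic 3x3 LBP histogram and a weighted-average fusion of per-region scores can be sketched as below. This is the textbook LBP operator, not necessarily the exact variant used in the paper. The BVQ/DCT branch is omitted, and the region scores and weights are hypothetical.

```python
def lbp_code(img, r, c):
    """8-bit LBP code of pixel (r, c): one bit per neighbor >= center."""
    center = img[r][c]
    # Neighbors in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


def fuse_scores(region_scores, weights):
    """Weighted average of per-region similarity scores."""
    return sum(s * w for s, w in zip(region_scores, weights)) / sum(weights)


patch = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
code = lbp_code(patch, 1, 1)
hist = lbp_histogram(patch)

# Hypothetical similarity scores for two facial regions and their weights.
fused = fuse_scores([0.9, 0.6], [2.0, 1.0])
```

In a region-division setup, one histogram would be computed per facial region and matched separately before the fusion step.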

Keywords: binary vector quantization (BVQ), DCT coefficients, face recognition, local binary patterns (LBP)

Procedia PDF Downloads 326
27788 GIS Based Spatial Modeling for Selecting New Hospital Sites Using AHP, Entropy-MAUT and CRITIC-MAUT: A Study in Rural West Bengal, India

Authors: Alokananda Ghosh, Shraban Sarkar

Abstract:

The study aims to identify suitable sites for new hospitals with critical obstetric care facilities in Birbhum, one of the vulnerable and underserved districts of Eastern India, considering six main criteria and 14 sub-criteria, using a GIS-based Analytic Hierarchy Process (AHP) and Multi-Attribute Utility Theory (MAUT) approach. The criteria were identified through field surveys and previous literature. After collecting expert decisions, a pairwise comparison matrix was prepared using the Saaty scale to calculate the weights through AHP. In contrast, objective weighting methods, i.e., Entropy and Criteria Importance through Interaction Correlation (CRITIC), were used to perform the MAUT. Finally, suitability maps were prepared by weighted sum analysis. Sensitivity analyses of AHP were performed to explore the effect of dominant criteria. Results from AHP reveal that ‘maternal death in transit’, followed by ‘accessibility and connectivity’ and ‘maternal health care service (MHCS) coverage gap’, were the three most important criteria, with comparatively higher weights. In contrast, ‘accessibility and connectivity’ and ‘maternal death in transit’ were observed to carry more weight in Entropy and CRITIC, respectively. When the predicted suitability classes of these three models are compared with the layer of existing hospitals, all except Entropy-MAUT point towards the areas left underserved by existing facilities. Only 43%-67% of existing hospitals were in the moderate to lower suitability classes. Therefore, the results of the predictive models might provide valuable input for future planning.
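
The AHP weighting step and the final weighted sum can be sketched as follows. The sketch uses the row geometric-mean approximation of the AHP priority vector rather than the principal eigenvector. The three-criterion matrix and the site scores are made-up values, not data from the study.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    geo_means = [math.prod(row) ** (1.0 / len(row)) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def weighted_sum(scores, weights):
    """Suitability of one site: criterion scores combined by their weights."""
    return sum(s * w for s, w in zip(scores, weights))


# Hypothetical pairwise comparisons for three criteria on the Saaty scale:
# entry [i][j] states how much more important criterion i is than j.
pairwise = [
    [1, 2, 4],
    [1 / 2, 1, 2],
    [1 / 4, 1 / 2, 1],
]
weights = ahp_weights(pairwise)

# Hypothetical normalized criterion scores of one candidate site.
site_score = weighted_sum([0.9, 0.5, 0.2], weights)
```

For this perfectly consistent matrix the weights come out as 4/7, 2/7, and 1/7. With real expert judgments, a consistency check would normally precede the use of the weights.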

Keywords: hospital site suitability, analytic hierarchy process, multi-attribute utility theory, entropy, criteria importance through interaction correlation, multi-criteria decision analysis

Procedia PDF Downloads 37
27787 Machine Learning Techniques for COVID-19 Detection: A Comparative Analysis

Authors: Abeer A. Aljohani

Abstract:

The spread of the COVID-19 virus has been one of the most extreme pandemics across the globe. The disease, also referred to as coronavirus, is contagious and continuously mutates into numerous variants. Currently, the B.1.1.529 variant, labeled Omicron, has been detected in South Africa. The vast spread of COVID-19 has affected many lives and has placed exceptional pressure on healthcare systems worldwide. Everyday life and the global economy have also been at stake. This research aims to predict COVID-19 disease in its initial stage to reduce the death count. Machine learning (ML) is nowadays used in almost every area. Numerous COVID-19 cases have placed a huge burden on hospitals as well as health workers. To reduce this burden, this paper predicts COVID-19 disease based on the symptoms and medical history of the patient. This research presents a unique architecture for COVID-19 detection using ML techniques integrated with feature dimensionality reduction. This paper uses a standard UCI dataset for predicting COVID-19 disease. This dataset comprises the symptoms of 5434 patients. This paper also compares several supervised ML techniques with the presented architecture. The architecture also uses a 10-fold cross-validation process for generalization and the principal component analysis (PCA) technique for feature reduction. Standard measures are used to evaluate the proposed architecture, including F1-score, precision, accuracy, recall, receiver operating characteristic (ROC), and area under the curve (AUC). The results show that decision tree, random forest, and neural network models outperform all other state-of-the-art ML techniques. These results can help in effectively identifying COVID-19 infection cases.
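
Two of the evaluation ingredients named above, 10-fold splitting and the precision/recall/F1/accuracy measures, can be sketched without any ML library. The confusion-matrix counts are made up, the folds are contiguous and unshuffled for simplicity, and only the sample count of 5434 comes from the abstract.

```python
def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def classification_metrics(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


folds = list(k_fold_indices(5434, k=10))

# Hypothetical counts for one fold, used only to exercise the formulas.
metrics = classification_metrics(tp=8, fp=2, fn=2, tn=8)
```

In practice the data would be shuffled (or stratified by class) before splitting, and the metrics averaged over the ten folds.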

Keywords: supervised machine learning, COVID-19 prediction, healthcare analytics, random forest, neural network

Procedia PDF Downloads 70
27786 Clinical Feature Analysis and Prediction on Recurrence in Cervical Cancer

Authors: Ravinder Bahl, Jamini Sharma

Abstract:

The paper demonstrates an analysis of cervical cancer based on a probabilistic model. It involves a technique for classification and prediction that recognizes the typical and diagnostically most important test features relating to cervical cancer. The main contributions of the research include predicting the probability of recurrence in no-recurrence (first-time detection) cases. A combination of conventional statistical and machine learning tools is applied for the analysis. An experimental study with real data demonstrates the feasibility and potential of the proposed approach.

Keywords: cervical cancer, recurrence, no recurrence, probabilistic, classification, prediction, machine learning

Procedia PDF Downloads 344
27785 Complex Network Approach to International Trade of Fossil Fuel

Authors: Semanur Soyyigit Kaya, Ercan Eren

Abstract:

Energy plays a prominent role in the development of nations. Countries that have energy resources also have strategic power in the international trade of energy, since energy is essential for all stages of production in the economy. Thus, it is important for countries to analyze the weaknesses and strengths of the system. It is also commonly believed that international trade has complex network properties. A complex network is a tool for the analysis of complex systems with heterogeneous agents and the interactions between them. It consists of nodes and the interactions between these nodes. In complex systems, the aggregate properties that emerge from these interactions differ from the sum of the individual parts. Thus, standard approaches to international trade are too superficial to analyze these systems. Network analysis provides a new approach that treats international trade as a network in which countries constitute nodes and trade relations (export or import) constitute edges. It then becomes possible to analyze the international trade network in terms of high-degree indicators that are specific to complex systems, such as connectivity, clustering, assortativity/disassortativity, and centrality. In this analysis, the international trade of crude oil and coal, two types of fossil fuel, has been analyzed from 2005 to 2014 via network analysis. First, the network has been analyzed in terms of topological parameters such as density, transitivity, and clustering. Afterwards, the fit to a Pareto distribution has been analyzed. Finally, the weighted HITS algorithm has been applied to the data as a centrality measure to determine the real prominence of countries in these trade networks. Weighted HITS is a strong tool for analyzing the network by ranking countries according to the prominence of their trade partners. We have calculated both an export centrality and an import centrality by applying the w-HITS algorithm to the data.
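
A weighted HITS iteration on a directed trade network can be sketched as below. Reading the hub score as export centrality and the authority score as import centrality is an interpretation of the abstract. The sum-to-one normalization, the fixed iteration count, and the three trade flows are illustrative assumptions.

```python
def normalize(scores):
    """Scale scores so they sum to 1 (left unchanged if all are zero)."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()} if total else scores


def weighted_hits(edges, iterations=50):
    """Weighted HITS on a directed graph.

    `edges` maps (exporter, importer) to a trade weight. Returns two dicts:
    hub scores (read here as export centrality) and authority scores
    (read here as import centrality).
    """
    nodes = sorted({n for pair in edges for n in pair})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority: weighted sum of hub scores of countries exporting to it.
        auth = {n: 0.0 for n in nodes}
        for (src, dst), w in edges.items():
            auth[dst] += w * hub[src]
        # Hub: weighted sum of authority scores of its export destinations.
        hub = {n: 0.0 for n in nodes}
        for (src, dst), w in edges.items():
            hub[src] += w * auth[dst]
        auth, hub = normalize(auth), normalize(hub)
    return hub, auth


# Hypothetical trade flows: (exporter, importer) -> volume.
edges = {("A", "B"): 10.0, ("A", "C"): 5.0, ("D", "B"): 1.0}
export_centrality, import_centrality = weighted_hits(edges)
top_exporter = max(export_centrality, key=export_centrality.get)
top_importer = max(import_centrality, key=import_centrality.get)
```

Because each score depends on the scores of a country's partners, a small exporter that supplies a prominent importer can still rank above a larger exporter with marginal partners.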

Keywords: complex network approach, fossil fuel, international trade, network theory

Procedia PDF Downloads 312
27784 On-Line Data-Driven Multivariate Statistical Prediction Approach to Production Monitoring

Authors: Hyun-Woo Cho

Abstract:

Detection of incipient abnormal events in production processes is important for improving the safety and reliability of manufacturing operations and reducing losses caused by failures. The construction of calibration models for predicting faulty conditions is essential in deciding when to perform preventive maintenance. This paper presents a multivariate calibration monitoring approach based on the statistical analysis of process measurement data. The calibration model is used to predict faulty conditions from historical reference data. This approach utilizes variable selection techniques, and the predictive performance of several prediction methods is evaluated using real data. The results show that the calibration model based on a supervised probabilistic model yielded the best performance in this work. By adopting a proper variable selection scheme in calibration models, the prediction performance can be improved by excluding non-informative variables from the model-building steps.

Keywords: calibration model, monitoring, quality improvement, feature selection

Procedia PDF Downloads 340