Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 1955

Search results for: feature matching

1715 The Capacity of Mel Frequency Cepstral Coefficients for Speech Recognition

Authors: Fawaz S. Al-Anzi, Dia AbuZeina

Abstract:

Speech recognition makes an important contribution to promoting new technologies in human-computer interaction. Today, there is a growing need to employ speech technology in daily life and business activities. However, speech recognition is a challenging task that requires several stages before the desired output is obtained. Among the components of automatic speech recognition (ASR) is the feature extraction process, which parameterizes the speech signal to produce the corresponding feature vectors. The feature extraction process aims at approximating the linguistic content conveyed by the input speech signal. In the speech processing field, there are several methods to extract speech features; however, Mel Frequency Cepstral Coefficients (MFCC) is the most popular technique. It has long been observed that MFCC is predominantly used in well-known recognizers such as the Carnegie Mellon University (CMU) Sphinx and the Hidden Markov Model Toolkit (HTK). Hence, this paper focuses on the MFCC method as the standard choice to identify the different speech segments in order to obtain the language phonemes for further training and decoding steps. Owing to the good performance of MFCC, previous studies show that MFCC dominates Arabic ASR research. In this paper, we demonstrate MFCC as well as the intermediate steps that are performed to get these coefficients using the HTK toolkit.
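
The intermediate steps behind MFCC can be sketched in a few lines of NumPy. This is a generic textbook pipeline (pre-emphasis, framing, power spectrum, mel filterbank, log, DCT), not HTK's exact implementation; the 25 ms frame, 10 ms hop, 26 filters, 13 coefficients, and 0.97 pre-emphasis factor are assumed illustrative defaults, and the function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters with centers spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for m in range(1, n_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):          # rising slope
            fbank[m - 1, k] = (k - left) / (center - left)
        for k in range(center, right):         # falling slope
            fbank[m - 1, k] = (right - k) / (right - center)
    return fbank

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    signal = np.asarray(signal, dtype=float)
    # 1. Pre-emphasis boosts high frequencies (0.97 is an assumed value).
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # 2. Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = emphasized[idx] * np.hamming(frame_len)
    # 3. Power spectrum of each (zero-padded) frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # 4. Log of the mel filterbank energies.
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    energies = np.log(power @ fbank.T + 1e-10)
    # 5. Unnormalized DCT-II decorrelates the log energies.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return energies @ basis.T                  # shape: (n_frames, n_ceps)
```

Calling `mfcc(x)` on a one-dimensional array of samples returns one row of cepstral coefficients per frame; a real recognizer would typically append delta features and apply its own normalization.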

Keywords: speech recognition, acoustic features, mel frequency, cepstral coefficients

Procedia PDF Downloads 232

1714 Learning Dynamic Representations of Nodes in Temporally Variant Graphs

Authors: Sandra Mitrovic, Gaurav Singh

Abstract:

In many industries, including telecommunications, churn prediction has been a topic of active research. Much attention has been devoted to devising the most informative features, and this area of research has gained even more focus with the spread of (social) network analytics. Call detail records (CDRs) have been used to construct customer networks and extract potentially useful features. However, to the best of our knowledge, no studies including network features have yet proposed a generic way of representing network information. Instead, ad-hoc and dataset-dependent solutions have been suggested. In this work, we build upon a recently presented method (node2vec) to obtain representations for nodes in the observed network. The proposed approach is generic and applicable to any network and domain. Unlike node2vec, which assumes a static network, we consider a dynamic and time-evolving network. To account for this, we propose an approach that constructs the feature representation of each node by generating its node2vec representations at different timestamps, concatenating them and finally compressing them using an auto-encoder-like method in order to retain reasonably long and informative feature vectors. We test the proposed method on a churn prediction task in the telco domain. To predict churners at timestamp ts+1, we construct training and testing datasets consisting of feature vectors from time intervals [t1, ts-1] and [t2, ts] respectively, and use traditional supervised classification models like SVM and Logistic Regression. The observed results show the effectiveness of the proposed approach compared with ad-hoc feature selection based approaches and static node2vec.
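
The concatenate-then-compress step can be illustrated as follows. This sketch assumes the per-timestamp node embeddings have already been computed elsewhere (node2vec itself is not called here), and it swaps in a PCA projection as a simple linear stand-in for the auto-encoder-like compression; the function name and dimensions are hypothetical.

```python
import numpy as np

def compress_temporal_embeddings(snapshots, out_dim):
    """snapshots: list of (n_nodes, d) arrays, one embedding matrix per timestamp.

    Returns an (n_nodes, out_dim) matrix of compressed dynamic representations.
    """
    # Concatenate each node's embeddings across timestamps: (n_nodes, T * d).
    stacked = np.concatenate(snapshots, axis=1)
    centered = stacked - stacked.mean(axis=0)
    # Linear "encoder": project onto the top principal directions.
    # A trained auto-encoder would replace this step in the described method.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:out_dim].T
```

The compressed rows can then be fed to any standard classifier; rows must refer to the same node in every snapshot for the concatenation to be meaningful.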

Keywords: churn prediction, dynamic networks, node2vec, auto-encoders

Procedia PDF Downloads 288

1713 The Impact of Adopting Cross Breed Dairy Cows on Households’ Income and Food Security in the Case of Dejen Woreda, Amhara Region, Ethiopia

Authors: Misganaw Chere Siferih

Abstract:

This study assessed the impact of crossbreed dairy cows on household income and food security. The study area is Dejen Woreda, East Gojam Zone, in the Amhara region of Ethiopia. A random sampling technique was used to obtain a sample of 80 crossbreed dairy cow owners and 176 indigenous dairy cow owners. The study employed the food consumption score analytical framework to measure the food security status of the household. No statistically significant mean difference is found between crossbreed owners and indigenous owners. Logistic regression was employed to investigate the determinants of crossbreed dairy cow adoption. The result indicates that gender, education, labor number, land size cultivated, dairy cooperative membership, net income and food security status of the household are statistically significant independent variables that explain the binary dependent variable, crossbreed dairy cow adoption. Propensity score matching (PSM) was employed to analyze the impact of crossbreed dairy cow ownership on farmers’ income and food security. The average net income of crossbreed dairy cow owners was found to be significantly higher than that of indigenous dairy cow owners. Estimates of the average treatment effect on the treated (ATT) indicated that owning crossbreed dairy cows raises households’ net income by 42%, 38.5%, 30.8% and 44.5% under the kernel, radius, nearest neighbor and stratification matching algorithms, respectively, compared with indigenous dairy cow owners. However, the ATT estimates suggest that owning a crossbreed dairy cow does not affect food security significantly. Thus, crossbreed dairy cows enable farmers to increase their income but not their food security in the study area. Finally, the study recommends establishing dairy cooperatives and advising farmers to become members of them, paying attention to promoting the impact of crossbreed dairy cows, and promoting nutrition-focused projects.
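
As an illustration of how an ATT estimate is formed, the sketch below implements only the nearest neighbor variant, with replacement. It assumes the propensity scores have already been estimated (for example by a logistic regression like the one described above); the function name and the toy numbers in the usage note are hypothetical and are not the study's data.

```python
import numpy as np

def att_nearest_neighbor(propensity, treated, outcome):
    """Average treatment effect on the treated via 1-nearest-neighbor matching.

    propensity: estimated probability of treatment for every unit
    treated:    1/True for treated units, 0/False for controls
    outcome:    observed outcome (e.g. net income) for every unit
    """
    propensity = np.asarray(propensity, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    outcome = np.asarray(outcome, dtype=float)
    p_t, y_t = propensity[treated], outcome[treated]
    p_c, y_c = propensity[~treated], outcome[~treated]
    # For each treated unit, pick the control with the closest propensity score.
    nearest = np.abs(p_t[:, None] - p_c[None, :]).argmin(axis=1)
    # ATT is the mean outcome gap between treated units and their matches.
    return float(np.mean(y_t - y_c[nearest]))
```

For example, `att_nearest_neighbor([0.8, 0.6, 0.79, 0.61, 0.3], [1, 1, 0, 0, 0], [10, 8, 7, 6, 1])` matches the two treated units to the controls with scores 0.79 and 0.61 and averages the gaps 3 and 2. Kernel, radius, and stratification matching differ only in how the comparison outcome for each treated unit is built.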

Keywords: crossbreed dairy cow, net income, food security, propensity score matching

Procedia PDF Downloads 20

1712 Fuzzy Population-Based Meta-Heuristic Approaches for Attribute Reduction in Rough Set Theory

Authors: Mafarja Majdi, Salwani Abdullah, Najmeh S. Jaddi

Abstract:

One of the global combinatorial optimization problems in machine learning is feature selection. It is concerned with removing irrelevant, noisy, and redundant data while keeping the original meaning of the data. Attribute reduction in rough set theory is an important feature selection method. Since attribute reduction is an NP-hard problem, it is necessary to investigate fast and effective approximate algorithms. In this paper, we propose two feature selection mechanisms based on memetic algorithms (MAs), which combine the genetic algorithm with a fuzzy record-to-record travel algorithm and a fuzzy controlled great deluge algorithm to identify a good balance between local search and genetic search. In order to verify the proposed approaches, numerical experiments are carried out on thirteen datasets. The results show that the MA approaches are efficient in solving attribute reduction problems when compared with other meta-heuristic approaches.
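
Search-based attribute reduction needs a score for each candidate attribute subset. The standard rough set measure is the dependency degree, the fraction of objects whose decision is fully determined by the chosen attributes. The sketch below computes it; the dictionary-based row representation and the function name are assumptions made for illustration, and the abstract does not state which fitness function the memetic algorithms use.

```python
from collections import defaultdict

def dependency_degree(rows, attrs, decision):
    """gamma_B(D) = |POS_B(D)| / |U| for the attribute subset `attrs`.

    rows:     list of dicts, one per object in the universe U
    attrs:    condition attributes B under evaluation
    decision: name of the decision attribute D
    """
    # Group objects into equivalence classes by their values on `attrs`.
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[decision])
    # A class belongs to the positive region if all its decisions agree.
    positive = sum(len(d) for d in blocks.values() if len(set(d)) == 1)
    return positive / len(rows)
```

A subset with the same dependency degree as the full attribute set, and with no removable attribute, is a reduct; a meta-heuristic would search for small subsets that keep this value.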

Keywords: rough set theory, attribute reduction, fuzzy logic, memetic algorithms, record to record algorithm, great deluge algorithm

Procedia PDF Downloads 421

1711 Design of IMC-PID Controller Cascaded Filter for Simplified Decoupling Control System

Authors: Le Linh, Truong Nguyen Luan Vu, Le Hieu Giang

Abstract:

In this work, the IMC-PID controller cascaded filter based on Internal Model Control (IMC) scheme is systematically proposed for the simplified decoupling control system. The simplified decoupling is firstly introduced for multivariable processes by using coefficient matching to obtain a stable, proper, and causal simplified decoupler. Accordingly, transfer functions of decoupled apparent processes can be expressed as a set of n equivalent independent processes and then derived as a ratio of the original open-loop transfer function to the diagonal element of the dynamic relative gain array. The IMC-PID controller in series with filter is then directly employed to enhance the overall performance of the decoupling control system while avoiding difficulties arising from properties inherent to simplified decoupling. Some simulation studies are considered to demonstrate the simplicity and effectiveness of the proposed method. Simulations were conducted by tuning various controllers of the multivariate processes with multiple time delays. The results indicate that the proposed method consistently performs well with fast and well-balanced closed-loop time responses.
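
For orientation, the classical IMC-based PID settings for a single first-order-plus-dead-time process can be written down directly. This is the textbook rule obtained with a first-order Padé approximation of the delay, not the cascaded-filter design proposed in the paper; the function name and the closed-loop time constant `lam` are illustrative assumptions.

```python
def imc_pid_fopdt(gain, tau, theta, lam):
    """Textbook IMC-PID tuning for G(s) = gain * exp(-theta*s) / (tau*s + 1).

    lam is the desired closed-loop time constant (the IMC filter parameter).
    Returns (Kc, tau_I, tau_D) for an ideal PID controller.
    """
    kc = (2.0 * tau + theta) / (gain * (2.0 * lam + theta))
    tau_i = tau + theta / 2.0
    tau_d = tau * theta / (2.0 * tau + theta)
    return kc, tau_i, tau_d
```

A larger `lam` gives a slower, more robust loop. In a decoupled multivariable system, a rule of this kind would be applied to each apparent process separately after it has been reduced to a low-order model.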

Keywords: coefficient matching method, internal model control (IMC) scheme, PID controller cascaded filter, simplified decoupler

Procedia PDF Downloads 417

1710 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution

Authors: Haiyan Wu, Ying Liu, Shaoyun Shi

Abstract:

Authorship attribution is to extract features to identify authors of anonymous documents. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length), content features (e.g., frequent words, n-grams). Modeling these features by regression or some transparent machine learning methods gives a portrait of the authors' writing style. But these methods do not capture the syntactic (e.g., dependency relationship) or semantic (e.g., topics) information. In recent years, some researchers model syntactic trees or latent semantic information by neural networks. However, few works take them together. Besides, predictions by neural networks are difficult to explain, which is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Different from an end-to-end neural model, feature selection and prediction are two steps in our method. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give prediction and understandable representation of writing style. Experiments show that our extracted features can improve the state-of-the-art methods on three benchmark datasets.

Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction

Procedia PDF Downloads 104

1709 Multi-Stage Classification for Lung Lesion Detection on CT Scan Images Applying Medical Image Processing Technique

Authors: Behnaz Sohani, Sahand Shahalinezhad, Amir Rahmani, Aliyu Aliyu

Abstract:

Recently, medical imaging, and specifically medical image processing, has become one of the most dynamically developing areas of medical science. It has led to the emergence of new approaches to the prevention, diagnosis, and treatment of various diseases. In the process of diagnosing lung cancer, medical professionals rely on computed tomography (CT) scans, in which failure to correctly identify masses can lead to incorrect diagnosis or sampling of lung tissue. Identification and demarcation of masses for detecting cancer within lung tissue are critical challenges in diagnosis. In this work, a segmentation system based on image processing techniques has been applied for detection purposes. In particular, the use and validation of a novel lung cancer detection algorithm have been presented through simulation. This has been performed employing CT images based on multilevel thresholding. The proposed technique consists of segmentation, feature extraction, and feature selection and classification. In more detail, the features with useful information are selected after feature extraction. Eventually, the output image of lung cancer is obtained with 96.3% accuracy and 87.25%. The purpose of feature extraction in the proposed approach is to transform the raw data into a more usable form for subsequent statistical processing. Future steps will involve employing the current feature extraction method to achieve more accurate resulting images, including further details available to machine vision systems to recognise objects in lung CT scan images.
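
Multilevel thresholding itself is simple once the thresholds are known: each pixel is assigned to the intensity band it falls in. The sketch below shows only that labeling step; how the thresholds are chosen is not described in the abstract, so the values in the usage note are hypothetical.

```python
import numpy as np

def multilevel_threshold(image, thresholds):
    """Label each pixel with the index of the intensity band it falls in.

    With k thresholds the output contains labels 0..k; a pixel equal to a
    threshold is assigned to the higher band.
    """
    return np.digitize(image, bins=np.sort(thresholds))
```

For instance, `multilevel_threshold(np.array([[10, 90], [150, 240]]), [80, 160])` separates the image into three bands, which later stages could treat as background, tissue, and candidate regions.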

Keywords: lung cancer detection, image segmentation, lung computed tomography (CT) images, medical image processing

Procedia PDF Downloads 53

1708 Segmentation of Arabic Handwritten Numeral Strings Based on Watershed Approach

Authors: Nidal F. Shilbayeh, Remah W. Al-Khatib, Sameer A. Nooh

Abstract:

Arabic offline handwriting recognition is considered one of the most challenging topics. Arabic handwritten numeral strings are used to automate systems that deal with numbers, such as postal codes, bank account numbers and numbers on car plates. Segmentation of connected numerals is the main bottleneck in a handwritten numeral recognition system. Addressing it can in turn increase the speed and efficiency of the recognition system. In this paper, we propose algorithms for automatic segmentation and feature extraction of Arabic handwritten numeral strings based on the Watershed approach. The algorithms have been designed and implemented to achieve the main goal of segmenting and extracting the string of numeral digits written by hand, especially in the courtesy amount of bank checks. The segmentation algorithm partitions the string into multiple regions that can be associated with the properties of one or more criteria. The numeral extraction algorithm extracts the numeral string digits into separate individual digits. Both algorithms for segmentation and feature extraction have been tested successfully and efficiently for all types of numerals.

Keywords: handwritten numerals, segmentation, courtesy amount, feature extraction, numeral recognition

Procedia PDF Downloads 352

1707 Data Clustering in Wireless Sensor Network Implemented on Self-Organization Feature Map (SOFM) Neural Network

Authors: Krishan Kumar, Mohit Mittal, Pramod Kumar

Abstract:

A wireless sensor network is one of the most promising communication networks for monitoring remote environmental areas. In this network, all the sensor nodes communicate with each other via radio signals. The sensor nodes are capable of sensing, data storage and processing. The sensor nodes collect information through neighboring nodes and forward it to a particular node. Data collection and processing are done by data aggregation techniques. For data aggregation, a clustering technique is implemented in the sensor network using a self-organizing feature map (SOFM) neural network. Some of the sensor nodes are selected as cluster head nodes. The information is aggregated at cluster head nodes from non-cluster head nodes and is then transferred to the base station (or sink nodes). The aim of this paper is to manage the huge amount of data with the help of the SOFM neural network. Clustered data, rather than all the information aggregated at cluster head nodes, is selected for transfer to the base station. This reduces the battery consumption caused by managing the huge amount of data, and the network lifetime is enhanced to a greater extent.

Keywords: artificial neural network, data clustering, self organization feature map, wireless sensor network

Procedia PDF Downloads 484

1706 Multi-Atlas Segmentation Based on Dynamic Energy Model: Application to Brain MR Images

Authors: Jie Huo, Jonathan Wu

Abstract:

Segmentation of anatomical structures in medical images is essential for scientific inquiry into the complex relationships between biological structure and clinical diagnosis, treatment and assessment. As a method of incorporating prior knowledge and the anatomical structure similarity between a target image and atlases, multi-atlas segmentation has been successfully applied in segmenting a variety of medical images, including brain, cardiac, and abdominal images. The basic idea of multi-atlas segmentation is to transfer the labels in atlases to the coordinates of the target image by matching the target patch to the atlas patch in the neighborhood. However, this technique is limited by the pairwise registration between the target image and the atlases. In this paper, a novel multi-atlas segmentation approach is proposed by introducing a dynamic energy model. First, the target is mapped to each atlas image by minimizing the dynamic energy function; then the segmentation of the target image is generated by weighted fusion based on the energy. The method is tested on the MICCAI 2012 Multi-Atlas Labeling Challenge dataset, which includes 20 target images and 15 atlas images. The paper also analyzes the influence of different parameters of the dynamic energy model on the segmentation accuracy and measures the Dice coefficient obtained with different feature terms in the energy model. The highest mean Dice coefficient obtained with the proposed method is 0.861, which is competitive with recently published methods.
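
The Dice coefficient used for evaluation measures the overlap between a predicted segmentation and the reference labeling. A minimal sketch for binary masks follows; the function name is an assumption, and returning 1.0 when both masks are empty is a convention chosen here rather than something stated in the abstract.

```python
import numpy as np

def dice_coefficient(seg_a, seg_b):
    """Dice = 2 * |A intersect B| / (|A| + |B|) for two binary masks."""
    a = np.asarray(seg_a, dtype=bool)
    b = np.asarray(seg_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return float(2.0 * np.logical_and(a, b).sum() / total)
```

For a multi-label brain segmentation, the function would be applied once per structure (comparing `labels == k` masks) and the per-structure values averaged to obtain a mean Dice score.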

Keywords: brain MRI segmentation, dynamic energy model, multi-atlas segmentation, energy minimization

Procedia PDF Downloads 308

1705 National Directorate of Employment Training and Agricultural-Small and Medium Enterprises Performance in Nigeria

Authors: Festus M. Epetimehin

Abstract:

This study was conducted to identify the effect of National Directorate of Employment (NDE) training on the profit of Agricultural-Small and Medium Enterprises (SMEs) and to evaluate the factors that influenced farmers' participation in NDE training, as well as the type and frequency of training of farmers and other agro-allied entrepreneurs in Nigeria. Using a multi-stage sampling procedure, a total of 384 respondents were sampled, including 192 beneficiaries and 192 non-beneficiaries in Oyo and Lagos States, respectively. Data were analysed using Binary Logit regression and Propensity Score Matching techniques. According to the binary logit analysis, respondents’ gender, access to extension services, and the location of the respondent’s operation were determinant factors influencing NDE training enrolment. All identified factors are positively related to the probability of respondents’ involvement. Propensity score matching revealed that Agricultural-SMEs that participated in the NDE program boosted their profit by N341,072.18. The positive effect implies that NDE training enhances Agri-SME performance in Nigeria. The study concluded that greater funding should be provided to the NDE for performance-enhancing training of Agri-SMEs.

Keywords: PSM, binary logit model, Agri-SME

Procedia PDF Downloads 67

1704 Image-Based UAV Vertical Distance and Velocity Estimation Algorithm during the Vertical Landing Phase Using Low-Resolution Images

Authors: Seyed-Yaser Nabavi-Chashmi, Davood Asadi, Karim Ahmadi, Eren Demir

Abstract:

The landing phase of a UAV is very critical, as there are many uncertainties in this phase that can easily lead to a hard landing or even a crash. In this paper, the estimation of relative distance and velocity to the ground, one of the most important processes during the landing phase, is studied. The alternative approach of using accurate measurement sensors can be very expensive, as with LIDAR, or limited in operational range, as with ultrasonic sensors. Additionally, absolute positioning systems like GPS or IMU cannot provide distance to the ground independently. The focus of this paper is to determine whether we can measure the relative distance and velocity between the UAV and the ground in the landing phase using just low-resolution images taken by a monocular camera. The Lucas-Kanade feature detection technique is employed to extract the most suitable feature in a series of images taken during the UAV landing. Two different approaches based on Extended Kalman Filters (EKF) have been proposed, and their performance in estimating the relative distance and velocity is compared. The first approach uses the kinematics of the UAV as the process and the calculated optical flow as the measurement. The second approach uses the feature’s projection on the camera plane (pixel position) as the measurement while employing both the kinematics of the UAV and the dynamics of variation of the projected point as the process to estimate both relative distance and relative velocity. To verify the results, a sequence of low-quality images taken by a camera moving on a specifically developed testbed has been used to compare the performance of the proposed algorithms. The case studies show that the low quality of the images results in considerable noise, which reduces the performance of the first approach. On the other hand, using the projected feature position is much less sensitive to the noise and estimates the distance and velocity with relatively high accuracy. This approach can also be used to predict the future projected feature position, which can drastically decrease the computational workload, an important criterion for real-time applications.

Keywords: altitude estimation, drone, image processing, trajectory planning

Procedia PDF Downloads 85

1703 Deciding Graph Non-Hamiltonicity via a Closure Algorithm

Authors: E. R. Swart, S. J. Gismondi, N. R. Swart, C. E. Bell

Abstract:

We present a heuristic algorithm that decides graph non-Hamiltonicity. All graphs are directed, each undirected edge regarded as a pair of counter-directed arcs. Each of the n! Hamilton cycles in a complete graph on n+1 vertices is mapped to an n-permutation matrix P where p(u,i)=1 if and only if the ith arc in a cycle enters vertex u, starting and ending at vertex n+1. We first create exclusion set E by noting all arcs (u, v) not in G, sufficient to code precisely all cycles excluded from G, i.e., cycles not in G use at least one arc not in G. Members are pairs of components of P, {p(u,i),p(v,i+1)}, i=1,...,n-1. A doubly stochastic-like relaxed LP formulation of the Hamilton cycle decision problem is constructed. Each {p(u,i),p(v,i+1)} in E is coded as variable q(u,i,v,i+1)=0, i.e., it shrinks the feasible region. We then implement the Weak Closure Algorithm (WCA) that tests necessary conditions of a matching, together with Boolean closure to decide 0/1 variable assignments. Each {p(u,i),p(v,j)} not in E is tested for membership in E, and if possible, added to E (q(u,i,v,j)=0) to iteratively maximize |E|. If the WCA constructs E to be maximal, the set of all {p(u,i),p(v,j)}, then G is decided non-Hamiltonian. Only non-Hamiltonian G share this maximal property. Ten non-Hamiltonian graphs (10 through 104 vertices) and 2000 randomized 31 vertex non-Hamiltonian graphs are tested and correctly decided non-Hamiltonian. For Hamiltonian G, the complement of E covers a matching, perhaps useful in searching for cycles. We also present an example where the WCA fails.
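
The encoding of cycles and exclusions can be made concrete with a short sketch. It assumes a cycle is given as the order in which vertices 1..n are entered after leaving vertex n+1, and it builds the initial exclusion pairs only for missing arcs between vertices 1..n; the function names are hypothetical, and neither the LP formulation nor the WCA is implemented here.

```python
import numpy as np

def cycle_to_permutation_matrix(order):
    """Map a Hamilton cycle n+1 -> order[0] -> ... -> order[n-1] -> n+1 to P.

    P[u-1, i-1] = 1 exactly when the i-th arc of the cycle enters vertex u.
    """
    n = len(order)
    if sorted(order) != list(range(1, n + 1)):
        raise ValueError("order must be a permutation of 1..n")
    p = np.zeros((n, n), dtype=int)
    for i, u in enumerate(order, start=1):
        p[u - 1, i - 1] = 1
    return p

def initial_exclusion_pairs(n, missing_arcs):
    """Pairs {p(u,i), p(v,i+1)}, i = 1..n-1, for every arc (u, v) not in G."""
    return {((u, i), (v, i + 1)) for (u, v) in missing_arcs for i in range(1, n)}
```

For n = 3, `cycle_to_permutation_matrix([2, 3, 1])` encodes the cycle 4, 2, 3, 1, 4, and `initial_exclusion_pairs(3, [(1, 2)])` lists the two position pairs ruled out because arc (1, 2) is absent from G.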

Keywords: Hamilton cycle decision problem, computational complexity theory, graph theory, theoretical computer science

Procedia PDF Downloads 341

1702 A Wideband CMOS Power Amplifier with 23.3 dB S21, 10.6 dBm Psat and 12.3% PAE for 60 GHz WPAN and 77 GHz Automobile Radar Systems

Authors: Yo-Sheng Lin, Chien-Chin Wang, Yun-Wen Lin, Chien-Yo Lee

Abstract:

A wideband power amplifier (PA) for 60 GHz and 77 GHz direct-conversion transceivers using standard 90 nm CMOS technology is reported. The PA comprises a cascode input stage with a wideband T-type input-matching network and inductive interconnection and load, followed by a common-source (CS) gain stage and a CS output stage. To increase the saturated output power (PSAT) and power-added efficiency (PAE), the output stage adopts a two-way power dividing and combining architecture. Instead of the area-consuming Wilkinson power divider and combiner, miniature low-loss transmission-line inductors are used at the input and output terminals of each of the output stages for wideband input and output impedance matching to 100 ohm. This in turn results in further PSAT and PAE enhancement. The PA consumes 92.2 mW and achieves a maximum power gain (S21) of 23.3 dB at 56 GHz, and S21 of 21.7 dB and 14 dB at 60 GHz and 77 GHz, respectively. In addition, the PA achieves an excellent PSAT of 10.6 dBm and a maximum PAE of 12.3% at 60 GHz. At 77 GHz, the PA achieves an excellent PSAT of 10.4 dBm and a maximum PAE of 6%. These results demonstrate that the proposed wideband PA architecture is very promising for 60 GHz wireless personal area network (WPAN) and 77 GHz automobile radar systems.
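
Power-added efficiency relates the RF power added by the amplifier to the DC power it draws, PAE = (Pout - Pin) / Pdc, with all powers in the same linear unit. The helper below converts dBm to milliwatts first; the function names are hypothetical, and because the abstract does not report the input power at the PAE peak, no attempt is made here to reproduce the reported 12.3%.

```python
def dbm_to_mw(p_dbm):
    """Convert a power level from dBm to milliwatts."""
    return 10.0 ** (p_dbm / 10.0)

def power_added_efficiency(p_out_dbm, p_in_dbm, p_dc_mw):
    """PAE as a fraction: RF power added by the amplifier over DC power."""
    return (dbm_to_mw(p_out_dbm) - dbm_to_mw(p_in_dbm)) / p_dc_mw
```

As a made-up example, an amplifier delivering 10 dBm from a 0 dBm input while drawing 90 mW adds 9 mW of RF power, so its PAE is 0.1, or 10%.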

Keywords: 60 GHz, 77 GHz, PA, WPAN, automotive radar

Procedia PDF Downloads 549

1701 Optimized Real Ground Motion Scaling for Vulnerability Assessment of Building Considering the Spectral Uncertainty and Shape

Authors: Chen Bo, Wen Zengping

Abstract:

Building on the results of previous studies, we focus on real ground motion selection and scaling methods for structural performance-based seismic evaluation using nonlinear dynamic analysis. The input earthquake ground motions should be determined appropriately to make them compatible with the site-specific hazard level considered. Thus, an optimized selection and scaling method is established that uses not only the Monte Carlo simulation method to create stochastic simulation spectra considering the multivariate lognormal distribution of the target spectrum, but also a spectral shape parameter. Its applications in structural fragility analysis are demonstrated through case studies. Compared to the previous scheme with no consideration of the uncertainty of the target spectrum, the method shown here can ensure that the selected records are in good agreement with the median value, standard deviation and spectral correction of the target spectrum, and better reveals the uncertainty feature of the site-specific hazard level. Meanwhile, it can help improve computational efficiency and matching accuracy. Given the important influence of the target spectrum’s uncertainty on structural seismic fragility analysis, this work can provide a reasonable and reliable basis for structural seismic evaluation under a scenario earthquake environment.
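
The Monte Carlo step can be sketched as sampling spectral ordinates from a multivariate lognormal distribution: draw correlated normal values in log space and exponentiate. The function name, argument layout, and the requirement that the caller supplies a valid correlation matrix between periods are assumptions for illustration; the paper's actual target spectrum parameters are not given in the abstract.

```python
import numpy as np

def simulate_spectra(ln_median, ln_std, corr, n_sims, seed=0):
    """Draw response spectra from a multivariate lognormal target.

    ln_median: log of the median spectral acceleration at each period
    ln_std:    standard deviation of ln(Sa) at each period
    corr:      correlation matrix of ln(Sa) between periods
    Returns an (n_sims, n_periods) array of simulated spectral ordinates.
    """
    ln_median = np.asarray(ln_median, dtype=float)
    ln_std = np.asarray(ln_std, dtype=float)
    # Covariance in log space: cov[i, j] = std[i] * std[j] * corr[i, j].
    cov = np.outer(ln_std, ln_std) * np.asarray(corr, dtype=float)
    rng = np.random.default_rng(seed)
    return np.exp(rng.multivariate_normal(ln_median, cov, size=n_sims))
```

Recorded ground motions would then be selected and scaled so that each one matches one simulated spectrum, which lets the selected set reproduce the spread of the target as well as its median.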

Keywords: ground motion selection, scaling method, seismic fragility analysis, spectral shape

Procedia PDF Downloads 265

1700 Roof and Road Network Detection through Object Oriented SVM Approach Using Low Density LiDAR and Optical Imagery in Misamis Oriental, Philippines

Authors: Jigg L. Pelayo, Ricardo G. Villar, Einstine M. Opiso

Abstract:

Advances in aerial laser scanning in the Philippines have opened up entire fields of research in remote sensing and machine vision that aspire to provide accurate and timely information for the government and the public. Rapid mapping of polygonal roads and roof boundaries is one such use, with applications in disaster risk reduction, mitigation and development. The study uses low density LiDAR data and high resolution aerial imagery through an object-oriented approach, applying a machine learning algorithm to minimize the constraints of feature extraction. To separate one class from another in distinct regions of a multi-dimensional feature space, non-trivial computations for fitting distributions were implemented to formulate the learned ideal hyperplane. Customized hybrid features were generated and then used to improve the classifier findings. Supplemental algorithms for filtering and reshaping object features were developed in the rule set to enhance the final product. The methodology shows several advantages in terms of simplicity, applicability, and process transferability. The algorithm was tested in different random locations of Misamis Oriental province in the Philippines, demonstrating robust performance with an overall accuracy greater than 89% and potential for semi-automation. The extracted results will become a vital requirement for decision makers, urban planners and even the commercial sector in various assessment processes.

Keywords: feature extraction, machine learning, OBIA, remote sensing

Procedia PDF Downloads 337

1699 Noncritical Phase-Matched Fourth Harmonic Generation of Converging Beam by Deuterated Potassium Dihydrogen Phosphate Crystal

Authors: Xiangxu Chai, Bin Feng, Ping Li, Deyan Zhu, Liquan Wang, Guanzhong Wang, Yukun Jing

Abstract:

In high power large-aperture laser systems, such as the inertial confinement fusion project, the Nd:glass laser (1053 nm) usually needs to be converted to ultraviolet (UV) light, and fourth harmonic generation (FHG) is one of the most favored candidates for achieving UV light. Deuterated potassium dihydrogen phosphate (DKDP) crystal is an optimal choice for converting the Nd:glass radiation to the fourth harmonic by noncritical phase matching (NCPM). To reduce the damage probability of the focusing lens, it is suggested that the DKDP crystal be placed before the focusing lens. Consequently, a converging beam enters the FHG crystal. In this paper, we simulate the process of FHG in this scheme, and the dependence of FHG efficiency on the lens’ F is derived. In addition, a DKDP crystal with a deuterium gradient is proposed to realize NCPM FHG of the converging beam. At every position, phase matching is achieved by adjusting the deuterium level, and the FHG efficiency increases as a result. The relation of the lens’ F to the deuterium gradient is investigated as well.

Keywords: fourth harmonic generation, laser induced damage, converging beam, DKDP crystal

Procedia PDF Downloads 197

1698 The Modification of Convolutional Neural Network in Fin Whale Identification

Authors: Jiahao Cui

Abstract:

In the past centuries, due to climate change and intense whaling, the global whale population has dramatically declined. Among the various whale species, the fin whale experienced the most drastic drop in number due to its popularity in whaling. Against this background, identifying fin whale calls could be immensely beneficial to the preservation of the species. This paper uses feature extraction to process the input audio signal; then a network based on AlexNet and three networks based on the ResNet model were constructed to classify fin whale calls. A mixture of the DOSITS database and the Watkins database was used during training. The results demonstrate that a modified ResNet network has the best performance considering precision and network complexity.

Keywords: convolutional neural network, ResNet, AlexNet, fin whale preservation, feature extraction

Procedia PDF Downloads 88

1697 Attribute Analysis of Quick Response Code Payment Users Using Discriminant Non-negative Matrix Factorization

Authors: Hironori Karachi, Haruka Yamashita

Abstract:

Recently, quick response (QR) code payment systems have become popular. Many companies are introducing new QR code payment services, and the services are competing with each other to increase their number of users. To increase the number of users, we should grasp how demographic information, usage information, and the value of users differ between services. In this study, we analyze real-world data provided by Nomura Research Institute, including demographic data of users and information on users’ usage of two services: LINE Pay and PayPay. Nonnegative Matrix Factorization (NMF) is widely used to analyze such data and interpret their features; however, the target data contain missing values. EM-algorithm NMF (EMNMF) completes the unknown values so that the features of data given in matrix form can be understood. Moreover, for comparing the results of NMF analysis of two matrices, Discriminant NMF (DNMF) shows the differences in user features between the two matrices. In this study, we combine EMNMF and DNMF and analyze the target data. As the interpretation, we show the differences in user features between LINE Pay and PayPay.
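
The idea of completing missing values inside NMF can be sketched as an EM-style loop: fill the unobserved entries with the current reconstruction, then apply the standard multiplicative updates to the filled matrix. This is a minimal illustration under assumed details (Frobenius-norm objective, random initialization, fixed iteration count), not the authors' exact EMNMF, and the discriminant (DNMF) part is not covered; the function name is hypothetical.

```python
import numpy as np

def em_nmf(x, mask, rank, n_iter=200, seed=0):
    """Factorize x ~ w @ h using only the entries where mask is True.

    x:    (n, m) nonnegative data matrix; values where mask is False are ignored
    mask: (n, m) boolean array, True for observed entries
    """
    rng = np.random.default_rng(seed)
    n, m = x.shape
    w = rng.random((n, rank)) + 0.1
    h = rng.random((rank, m)) + 0.1
    eps = 1e-9  # guards against division by zero
    for _ in range(n_iter):
        # E-step: replace missing entries with the current model estimate.
        filled = np.where(mask, x, w @ h)
        # M-step: multiplicative updates keep w and h nonnegative.
        h *= (w.T @ filled) / (w.T @ w @ h + eps)
        w *= (filled @ h.T) / (w @ h @ h.T + eps)
    return w, h
```

After fitting, `w @ h` provides estimates for the missing cells, while the rows of `h` can be read as usage or demographic patterns and the rows of `w` as each user's weights on those patterns.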

Keywords: data science, non-negative matrix factorization, missing data, quality of services

Procedia PDF Downloads 98

1696 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very large datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) data are among the important large, non-coding genomic datasets representing genome sequences. In this paper, a hybrid method for the classification of miRNA data is proposed. Due to the variety of cancers and the high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features is high relative to the number of samples, and the data are imbalanced. A feature selection method is used to select the features that best distinguish the classes and to eliminate obscure features. Afterward, a Convolutional Neural Network (CNN) classifier is used to classify cancer types, with a Genetic Algorithm employed to optimize the CNN's hyper-parameters. To make classification by the CNN faster, a Graphics Processing Unit (GPU) is recommended for performing the mathematical computations in parallel. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 88

1695 Alexa (Machine Learning) in Artificial Intelligence

Authors: Loulwah Bokhari, Jori Nazer, Hala Sultan

Abstract:

Nowadays, artificial intelligence (AI) is used as a foundation for many activities in modern computing applications at home, in vehicles, and in businesses. Many modern machines are built to carry out a specific activity or purpose. This is where the Amazon Alexa application comes in, as it is used as a virtual assistant. The purpose of this paper is to explore the use of Amazon Alexa among people and how it has improved and made simple daily tasks easier for many people. We gave our participants several questions regarding Amazon Alexa and if they had recently used or heard of it, as well as the different tasks it provides and whether it successfully satisfied their needs. Overall, we found that participants who have recently used Alexa have found it to be helpful in their daily tasks.

Keywords: artificial intelligence, Echo system, machine learning, feature for feature match

Procedia PDF Downloads 90

1694 Closest Possible Neighbor of a Different Class: Explaining a Model Using a Neighbor Migrating Generator

Authors: Hassan Eshkiki, Benjamin Mora

Abstract:

The Neighbor Migrating Generator is a simple and efficient approach to finding the closest potential neighbor(s) with a different label for a given instance, without the need to calibrate any kernel settings. This allows determining and explaining the most important features that influence an AI model. It can be used either to migrate a specific sample to the class decision boundary of the original model within a close neighborhood of that sample or to identify global features that can help localise neighbor classes. The proposed technique works by minimizing a loss function that is divided into two components, which are independently weighted according to three parameters α, β, and ω, with α being self-adjusting. Results show that this approach is superior to past techniques when detecting the smallest changes in the feature space and may also point out issues in models such as over-fitting.
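
To make the idea of a two-component loss more concrete, the sketch below runs a generic counterfactual search against a fixed logistic model: one term pushes the candidate toward the target class, the other keeps it close to the original sample. This is not the Neighbor Migrating Generator itself; the abstract does not give its loss or the roles of α, β, and ω, so the single `trade_off` weight, the function names, and the toy model here are assumptions made purely for illustration.

```python
import math


def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def find_counterfactual(x, weights, bias, target, trade_off=10.0, lr=0.05, steps=500):
    """Gradient descent on trade_off * cross_entropy(target) + squared distance to x."""
    candidate = list(x)
    for _ in range(steps):
        p = sigmoid(sum(w * c for w, c in zip(weights, candidate)) + bias)
        # d(cross-entropy)/d(score) for a logistic model is (p - target);
        # the second term is the gradient of the squared distance to x.
        grad = [
            trade_off * (p - target) * w + 2.0 * (c - x0)
            for w, c, x0 in zip(weights, candidate, x)
        ]
        candidate = [c - lr * g for c, g in zip(candidate, grad)]
    return candidate


# Hypothetical two-feature model and a sample currently assigned to class 0.
weights, bias = [1.0, -1.0], 0.0
sample = [-1.0, 1.0]
neighbor = find_counterfactual(sample, weights, bias, target=1)
changes = [n - s for n, s in zip(neighbor, sample)]  # per-feature shift
```

The per-feature shifts in `changes` indicate which features had to move for the label to change, which is the kind of evidence a feature-importance explanation draws on.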

Keywords: explainable AI, EX AI, feature importance, counterfactual explanations

Procedia PDF Downloads 120

1693 Real-Time Data Stream Partitioning over a Sliding Window in Real-Time Spatial Big Data

Authors: Sana Hamdi, Emna Bouazizi, Sami Faiz

Abstract:

In recent years, real-time spatial applications, like location-aware services and traffic monitoring, have become more and more important. Such applications result in dynamic environments where data as well as queries are continuously moving. As a result, a tremendous amount of real-time spatial data is generated every day. The growth of the data volume seems to outpace the advance of our computing infrastructure. For instance, in real-time spatial Big Data, users expect to receive the results of each query within a short time period, regardless of the load on the system. But with a huge amount of real-time spatial data generated, system performance degrades rapidly, especially in overload situations. To solve this problem, we propose the use of data partitioning as an optimization technique. Traditional horizontal and vertical partitioning can increase the performance of the system and simplify data management, but they remain insufficient for real-time spatial Big Data because they cannot deal with real-time and stream queries efficiently. Thus, in this paper, we propose a novel data partitioning approach for real-time spatial Big Data named VPA-RTSBD (Vertical Partitioning Approach for Real-Time Spatial Big Data). This contribution is an implementation of the Matching algorithm for traditional vertical partitioning. First, we find the optimal attribute sequence using the Matching algorithm. Then, we propose a new cost model for database partitioning that keeps the data amount of each partition more balanced and provides parallel execution guarantees for the most frequent queries. VPA-RTSBD aims to obtain a real-time partitioning scheme and deals with stream data. It improves the performance of query execution by maximizing the degree of parallel execution. This improves QoS (Quality of Service) in real-time spatial Big Data, especially with a huge volume of stream data.
The performance of our contribution is evaluated via simulation experiments. The results show that the proposed algorithm is both efficient and scalable, and that it outperforms comparable algorithms.

Keywords: real-time spatial big data, quality of service, vertical partitioning, horizontal partitioning, matching algorithm, hamming distance, stream query

Procedia PDF Downloads 133

1692 Technology Enriched Classroom for Intercultural Competence Building through Films

Authors: Tamara Matevosyan

Abstract:

In this globalized world, intercultural communication is becoming essential for understanding communication among people, for developing an understanding of cultures, and for appreciating the opportunities and challenges that each culture presents. Moreover, it plays an important role in developing an ideal personification to understand different behaviors in different cultures. Native speakers assimilate sociolinguistic knowledge in natural conditions, whereas this is a great problem for language learners; in this context, feature films reveal cultural peculiarities and involve students in real communication. Nowadays, the key role of language learning is the development of intercultural competence, as communicating with someone from a different cultural background can be exciting and scary, frustrating and enlightening. Intercultural competence is important in the foreign language (FL) classroom, where feature films can serve as essential tools to develop this competence and overcome the intercultural gap that foreign students face. The current proposal attempts to reveal the correlation between a given culture and its language through feature films. To ensure qualified, well-organized, and practical classes on Intercultural Communication for language learners, a number of methods connected with movie watching have been implemented. All the pre-watching, while-watching, and post-watching methods and techniques are aimed at developing students’ communicative competence. The application of such activities as Climax, Role-play, Interactive Language, and Daily Life helps to reveal and overcome mistakes of a cultural and pragmatic character. All the above-mentioned activities are directed at the assimilation of the language vocabulary with special reference to the given culture. The study delves into the essence of culture as one of the core concepts of intercultural communication.
Sometimes culture is not a priority in the process of language learning, which leads to further misunderstandings in real-life communication. The application of various methods and techniques with feature films aims at developing students’ cultural competence and their understanding of the norms and values of individual cultures. Thus, feature film activities will enable learners to enlarge their knowledge of the particular culture and develop a fundamental insight into intercultural communication.

Keywords: climax, intercultural competence, interactive language, role-play

Procedia PDF Downloads 314

1691 Factor Study Affecting Visual Awareness on Dynamic Object Monitoring

Authors: Terry Liang Khin Teo, Sun Woh Lye, Kai Lun Brendon Goh

Abstract:

As applied to dynamic monitoring situations, the prevailing approach to situation awareness (SA) assumes that the relevant areas of interest (AOIs) must be perceived before that information can be processed further to affect decision-making and, thereafter, action. It is not entirely clear whether this is the case. This study investigates the monitoring of dynamic objects by matching eye fixations with the relevant AOIs in boundary-crossing scenarios. Here, a match is defined as a fixation registered on the AOI. While many factors may affect monitoring characteristics, traffic simulations were designed in this study to explore two factors, namely the number of inbound/outbound traffic transfers and the number of entry and/or exit points in a radar monitoring sector. These two factors were graded into five levels of difficulty ranging from low to high traffic flow numbers. Combined permutations of the difficulty levels of these two factors yielded a total of thirty scenarios. The results showed that changes in the traffic flow numbers on transfer resulted in greater variation, with match limits ranging from 29% to 100%, compared with a range of 80% to 100% for the number of sector entry/exit points. The subsequent analysis is able to determine the type and combination of traffic scenarios where imperfect matching is likely to occur.
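
The matching rule stated above (a match is a fixation registered on the AOI) can be sketched as a simple point-in-region check. This is a minimal sketch under assumptions that are not in the abstract: AOIs are modelled as static axis-aligned rectangles, fixations as (x, y) screen coordinates, and the match percentage as the share of AOIs that received at least one fixation. The names `AOI` and `match_rate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AOI:
    """Axis-aligned rectangular area of interest (screen coordinates)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, x, y):
        """True when the point (x, y) lies on or inside the rectangle."""
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def match_rate(fixations, aois):
    """Percentage of AOIs that received at least one fixation."""
    if not aois:
        return 0.0
    matched = sum(1 for aoi in aois if any(aoi.contains(x, y) for x, y in fixations))
    return 100.0 * matched / len(aois)


# Two hypothetical AOIs; only the first one is fixated.
fixations = [(5.0, 5.0), (50.0, 50.0)]
aois = [AOI(0, 0, 10, 10), AOI(20, 20, 30, 30)]
rate = match_rate(fixations, aois)
```

In a real boundary-crossing scenario the AOIs move with the aircraft, so each fixation would have to be tested against the AOI position at the fixation's timestamp rather than against a fixed rectangle.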

Keywords: air traffic simulation, eye-tracking, visual monitoring, focus attention

Procedia PDF Downloads 30

1690 Visualization-Based Feature Extraction for Classification in Real-Time Interaction

Authors: Ágoston Nagy

Abstract:

This paper introduces a method of using unsupervised machine learning to visualize the feature space of a dataset in 2D in order to find the most characteristic segments in the set. After dimension reduction, users can select clusters by manual drawing. Selected clusters are recorded into a data model that is used for later predictions based on real-time data. Predictions are made with supervised learning, using the Gesture Recognition Toolkit. The paper introduces two example applications: a semantic audio organizer for analyzing incoming sounds, and a gesture database organizer where gestural data (recorded by a Leap Motion) is visualized for further manipulation.

Keywords: gesture recognition, machine learning, real-time interaction, visualization

Procedia PDF Downloads 322

1689 The Impact of Informal Care on Health Behavior among Older People with Chronic Diseases: A Study in China Using Propensity Score Matching

Authors: Hong Wu, Naiji Lu

Abstract:

Improvement of health behavior among people with chronic diseases is vital for increasing longevity and enhancing quality of life. This paper investigates the causal effects of informal care on compliance with doctors’ health advice (smoking control, dietetic regulation, weight control, and regular exercise) among older people with chronic diseases in China, which is facing the challenge of aging. We addressed selection bias by using propensity score matching in the estimation process. We used the 2011-2012 national baseline data of the China Health and Retirement Longitudinal Study. Our results showed that informal care can help improve the health behavior of older people. First, informal care improved compliance with smoking control: whether the person smoked, the frequency of smoking, and the time lag between waking up and the first cigarette were all lower for older people with informal care. Second, for dietetic regulation, older people with informal care had more meals every day than older people without informal care. Third, three variables (BMI, whether the person gained weight, and whether the person lost weight) were used to measure the outcome of weight control. There was no significant difference between the group with informal care and the group without for BMI and the probability of losing weight, while older people with informal care had a lower probability of gaining weight than those without. Last, for the advice on regular exercise, informal care increased the probability of walking exercise; however, the differences between groups for moderate and vigorous exercise were not significant. Our results indicate that policy makers who aim to decrease accidents should take informal care for elders into account and provide appropriate policies to meet the demand for informal care. Our birth policy and postponed retirement policy may decrease informal caregiving hours, so adjustments to these policies are important and urgent to address the current aging of the population.
In addition, the government could give more support to developing organizations that provide formal care, such as nursing homes. We infer that formal care is also useful for health behavior improvements.
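
As a rough illustration of how propensity score matching estimates such an effect, the sketch below pairs each treated unit (here, a person receiving informal care) with the control whose propensity score is closest and averages the outcome differences. It is a minimal sketch, not the authors' estimation procedure: the scores are assumed to have been estimated beforehand (for example by logistic regression), and the function name, the caliper value, and the toy records are all hypothetical.

```python
def att_nearest_neighbor(units, caliper=0.05):
    """Average treatment effect on the treated via 1:1 nearest-neighbor matching.

    `units` is a list of (treated, propensity_score, outcome) tuples.
    Matching is done with replacement; pairs farther apart than `caliper`
    on the propensity score are discarded.
    """
    treated = [(p, y) for t, p, y in units if t]
    controls = [(p, y) for t, p, y in units if not t]
    if not controls:
        raise ValueError("at least one control unit is required")
    diffs = []
    for p_t, y_t in treated:
        # Closest control by propensity score.
        p_c, y_c = min(controls, key=lambda c: abs(c[0] - p_t))
        if abs(p_c - p_t) <= caliper:
            diffs.append(y_t - y_c)
    if not diffs:
        raise ValueError("no treated unit has a control within the caliper")
    return sum(diffs) / len(diffs)


# Hypothetical records: (has informal care, propensity score, walks regularly).
records = [
    (True, 0.60, 1.0),
    (True, 0.30, 0.0),
    (False, 0.58, 0.0),
    (False, 0.31, 0.0),
    (False, 0.90, 1.0),
]
effect = att_nearest_neighbor(records)
```

Matching on the score rather than on raw covariates is what lets the comparison control for selection into informal care, provided the score model captures the relevant confounders.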

Keywords: chronic diseases, compliance, CHARLS, health advice, informal care, older people, propensity score matching

Procedia PDF Downloads 376

1688 An Information Matrix Goodness-of-Fit Test of the Conditional Logistic Model for Matched Case-Control Studies

Authors: Li-Ching Chen

Abstract:

The case-control design has been widely applied in clinical and epidemiological studies to investigate the association between risk factors and a given disease. The retrospective design can be easily implemented and is more economical than prospective studies. To adjust effects for confounding factors, methods such as stratification at the design stage may be adopted. When some major confounding factors are difficult to quantify, a matching design provides an opportunity for researchers to control the confounding effects. The matching effects can be parameterized by the intercepts of logistic models, and conditional logistic regression analysis is then adopted. This study presents an information-matrix-based goodness-of-fit statistic to test the validity of the logistic regression model for matched case-control data. The asymptotic null distribution of the proposed test statistic is derived. The test requires neither simulation to evaluate its critical values nor a partition of the covariate space. The asymptotic power of the test statistic is also derived. The performance of the proposed method is assessed through simulation studies. A real data set is also used to illustrate the implementation of the proposed method.
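
The conditional logistic model mentioned above conditions on each matched set, so the set-specific intercepts cancel and only the covariate coefficients remain. The sketch below evaluates the standard conditional log-likelihood for sets with one case and one or more controls. It does not implement the information-matrix test statistic proposed in the paper, and the function name and data layout are assumptions for illustration.

```python
import math


def conditional_log_likelihood(beta, matched_sets):
    """Conditional log-likelihood for matched case-control data.

    Each matched set is (case_covariates, [control_covariates, ...]).
    The stratum intercepts cancel, so only `beta` appears.
    """
    total = 0.0
    for case, controls in matched_sets:
        scores = [sum(b * x for b, x in zip(beta, row)) for row in [case] + controls]
        # log( exp(score_case) / sum_j exp(score_j) ), computed stably.
        top = max(scores)
        log_denominator = top + math.log(sum(math.exp(s - top) for s in scores))
        total += scores[0] - log_denominator
    return total


# Two hypothetical matched sets with a single covariate.
sets = [([1.0], [[0.0]]), ([2.0], [[1.0], [0.0]])]
ll_null = conditional_log_likelihood([0.0], sets)
ll_alt = conditional_log_likelihood([1.0], sets)
```

Maximizing this function over `beta` gives the conditional maximum likelihood estimate; a goodness-of-fit test then asks whether the fitted model is compatible with the data.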

Keywords: conditional logistic model, goodness-of-fit, information matrix, matched case-control studies

Procedia PDF Downloads 264

1687 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes, and stroke characteristic variations. Malayalam is a complex South Indian language spoken by about 35 million people, especially in Kerala and the Lakshadweep islands. In this paper, we consider significant feature extraction for the similar stroke styles of Malayalam. The extracted feature set is also suitable for the recognition of other handwritten South Indian languages such as Tamil, Telugu, and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy of classification and recognition of online Malayalam handwritten characters. SVM classifiers are well suited to real-world applications. The contribution of various features to recognition accuracy is analysed. The performance of different SVM kernels is also studied. A graphical user interface has been developed for reading and displaying the character. Different writing styles are taken for each of the 44 alphabets. Various features are extracted and used for classification after preprocessing of the input data samples. The highest recognition accuracy of 97% is obtained experimentally with the best feature combination and a polynomial kernel in the SVM.

Keywords: SVM, MATLAB, Malayalam, South Indian scripts, online handwritten character recognition

Procedia PDF Downloads 547

1686 Real-Time Classification of Marbles with Decision-Tree Method

Authors: K. S. Parlak, E. Turan

Abstract:

The separation of marbles according to pattern quality is a process carried out according to expert decision. The classification phase is the most critical part in terms of economic value. In this study, a self-learning system is proposed that classifies marbles quickly and with high success. The system performs ten feature extractions on ten marble images taken from the camera. The marbles are classified by the decision tree method using the extracted features. The user forms the training set by training the system at the marble classification stage. The system improves itself with every marble image that is classified. The aim of the proposed system is to minimize the error caused by the person performing the classification and to complete the classification quickly.
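
The core step of the decision tree method is choosing the feature and threshold that best separate the classes. The sketch below shows that single step using Gini impurity. It is a minimal sketch under assumptions: the abstract does not state which split criterion or which features the system uses, so the Gini criterion, the function names, and the two toy features are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(samples, labels):
    """Return (feature_index, threshold) minimizing weighted Gini impurity."""
    best = None
    best_score = float("inf")
    for f in range(len(samples[0])):
        for threshold in sorted({row[f] for row in samples}):
            left = [y for row, y in zip(samples, labels) if row[f] <= threshold]
            right = [y for row, y in zip(samples, labels) if row[f] > threshold]
            if not left or not right:
                continue  # a split must send samples to both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score:
                best, best_score = (f, threshold), score
    return best


# Hypothetical features per marble image: [vein density, mean brightness].
features = [[0.2, 5.0], [0.3, 1.0], [0.8, 4.0], [0.9, 2.0]]
grades = ["B", "B", "A", "A"]
split = best_split(features, grades)
```

A full tree applies `best_split` recursively to the left and right subsets until the nodes are pure or a depth limit is reached; retraining as newly labeled images arrive is one way such a system could keep improving.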

Keywords: decision tree, feature extraction, k-means clustering, marble classification

Procedia PDF Downloads 355