Search results for: word to vector
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1819

1429 Unsupervised Part-of-Speech Tagging for Amharic Using K-Means Clustering

Authors: Zelalem Fantahun

Abstract:

Part-of-speech (POS) tagging is the process of assigning a part-of-speech or other lexical class marker to each word in naturally occurring text. It is a fundamental task in almost all natural language processing. In natural language processing, providing large amounts of manually annotated data is a knowledge acquisition bottleneck. Since Amharic is an under-resourced language, the lack of a tagged corpus is a bottleneck for natural language processing, especially for POS tagging. A promising way to tackle this problem is to provide a system that does not require manually tagged data. In unsupervised learning, the learner is not provided with classifications; unsupervised algorithms seek out similarity between pieces of data in order to determine whether they can be characterized as forming a group. This paper describes the development of an unsupervised part-of-speech tagger for Amharic using K-means clustering, motivated by the large amount of data produced in day-to-day activities. The tagger is developed in four phases. First, the unlabeled data (raw text) is divided into 10 folds and tokenized; at this level, the raw text is split into sentences and then into words. The second phase is feature extraction, which covers word frequency and the syntactic and morphological features of a word. The third phase is clustering: among different clustering algorithms, K-means is selected and implemented in this study to bring groups of similar words together. The fourth phase is mapping, in which each cluster is examined carefully and the most common tag is assigned to the group. This study identifies two features that are capable of distinguishing one part of speech from others, namely morphological features and positional information, and shows that it is possible to use unsupervised learning for Amharic POS tagging. 
To increase the performance of the unsupervised part-of-speech tagger, other features that are not included in this study, such as semantic information, need to be incorporated. Finally, based on the experimental results, the system achieves a maximum accuracy of 81%.
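The clustering and mapping phases described above can be illustrated with a minimal sketch. This is not the author's implementation: the two-dimensional feature vectors, the initial centroids, the tag names, and the function names `kmeans` and `map_clusters_to_tags` are all hypothetical, chosen only to show how a cluster can be given the most common tag among a few hand-checked members.

```python
from collections import Counter


def kmeans(points, initial_centroids, iterations=10):
    """Plain k-means: assign points to the nearest centroid, then update centroids."""
    centroids = [list(c) for c in initial_centroids]  # copy so inputs stay untouched
    labels = []
    for _ in range(iterations):
        labels = [
            min(
                range(len(centroids)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(point, centroids[c])),
            )
            for point in points
        ]
        for c in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


def map_clusters_to_tags(labels, sample_tags):
    """Give each cluster the most common tag among its hand-checked members."""
    votes = {}
    for label, tag in zip(labels, sample_tags):
        if tag is not None:  # None marks a word nobody inspected
            votes.setdefault(label, Counter())[tag] += 1
    return {label: counter.most_common(1)[0][0] for label, counter in votes.items()}


# Hypothetical features per word: [morphological cue, relative position in sentence].
points = [[1.0, 0.9], [0.9, 1.0], [1.0, 0.8], [0.0, 0.1], [0.1, 0.0], [0.0, 0.2]]
labels = kmeans(points, [points[0], points[3]])
cluster_tags = map_clusters_to_tags(labels, ["V", "V", None, "N", None, "N"])
print(labels, cluster_tags)
```

Every word in a cluster then inherits that cluster's tag, which is how untagged words (the `None` entries here) receive a label.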

Keywords: POS tagging, Amharic, unsupervised learning, k-means

Procedia PDF Downloads 416
1428 Specific Emitter Identification Based on Refined Composite Multiscale Dispersion Entropy

Authors: Shaoying Guo, Yanyun Xu, Meng Zhang, Weiqing Huang

Abstract:

Wireless communication networks are developing rapidly, so wireless security is becoming more and more important. Specific emitter identification (SEI) is a vital part of wireless communication security as a technique to identify unique transmitters. In this paper, an SEI method based on multiscale dispersion entropy (MDE) and refined composite multiscale dispersion entropy (RCMDE) is proposed. The MDE and RCMDE algorithms are used to extract features for the identification of five wireless devices, and a cross-validation support vector machine (CV-SVM) is used as the classifier. The experimental results show that the total identification accuracy is 99.3%, even at a low signal-to-noise ratio (SNR) of 5 dB, which shows that MDE and RCMDE can describe the communication signal series well. In addition, compared with other methods, the proposed method is effective and provides better accuracy and stability for SEI.

Keywords: cross-validation support vector machine, refined composite multiscale dispersion entropy, specific emitter identification, transient signal, wireless communication device

Procedia PDF Downloads 112
1427 Tumor Boundary Extraction Using Intensity and Texture-Based on Gradient Vector

Authors: Namita Mittal, Himakshi Shekhawat, Ankit Vidyarthi

Abstract:

In medical research, doctors and radiologists face many complexities in analysing brain tumors in Magnetic Resonance (MR) images. Brain tumor detection is difficult due to the amorphous shape of tumors and the overlapping of similar tissues in nearby regions. Radiologists therefore require a clinically viable solution that helps with automatic segmentation of the tumor inside a brain MR image. Initially, segmentation methods were used to detect tumors by dividing the image into segments, but this causes loss of information. In this paper, a hybrid method is proposed that detects the Region of Interest (ROI) on the basis of the difference in intensity values and texture values between the tumor region and nearby tissues, using the Gradient Vector Flow (GVF) technique in the identification of the ROI. The proposed approach uses both intensity and texture values to identify abnormal sections of brain MR images. Experimental results show that the proposed method outperforms the GVF method without any loss of information.

Keywords: brain tumor, GVF, intensity, MR images, segmentation, texture

Procedia PDF Downloads 409
1426 The Concept of Dharma under Hindu, Buddhist and Sikh Religions: A Comparative Analysis

Authors: Venkateswarlu Kappara

Abstract:

The term ‘Dharma’ is complex and ubiquitous. It has no equivalent word in English. It was initially applied to Aryans. In the Rig Veda, it appears in a number of places with different meanings. The word Dharma comes from the root word ‘dhr’ (Dhri-Dharayatetiiti Dharmaha). Principles of Dharma are all-pervading. The closest synonym for Dharma in English is ‘righteousness.’ In the holy book Mahabharata, it is said that Dharma destroys those who destroy it and protects those who protect it. Also, Dharma might be overshadowed now and then by evil forces, but in the end, Dharma always triumphs. This line embodies the eternal victory of good over evil. In the Mahabharata, Lord Krishna says Dharma upholds both this-worldly and other-worldly affairs. The Rig Veda says, ‘O Indra! Lead us on the path of Rta, on the right path over all evils.’ For Buddhists, Dharma most often means the body of teachings expounded by the Buddha. The Dharma is one of the Three Jewels (Tri Ratnas) of Buddhism in which the followers take refuge. They are: the Buddha, meaning the mind’s perfection or enlightenment; the Dharma, meaning the teachings and the methods of the Buddha; and the Sangha, meaning those awakened people who provide guidance and support to followers. Buddha denies a separate permanent ‘I.’ Buddha accepts suffering (Dukka), change/impermanence (Anicca), and not-self (Annatta). Dharma in the Buddhist scriptures has a variety of meanings, including ‘phenomenon’ and ‘nature’ or ‘characteristic.’ For Sikhs, the word ‘Dharma’ means the ‘path of righteousness.’ The Sikh scriptures attempt to answer the exposition of Dharma. The main Holy Scripture of the Sikh religion is called the Guru Granth Sahib. The faithful people are fully bound to do whatever the Dharma wants them to do. Such is the name of the Immaculate Lord. Only one who has faith comes to know such a state of mind. The righteous judge of Dharma, by the Hukam of God’s Command, sits and administers true justice. 
From Dharma flow wealth and pleasure. The study indicates that in the Sikh religion, Dharma is the path of righteousness; in Buddhism, the mind’s perfection of enlightenment; and in Hinduism, it is non-violence, purity, truth, control of the senses, and not coveting the property of others. The comparative study implies that all of these religions treat Dharma as serving the welfare of mankind. The methodology adopted is theoretical, analytical, and comparative. The present study indicates how far Indian philosophical systems have influenced present circumstances and how far the present system is incompatible with ancient philosophical systems. A tentative generalization would be that the present system, which is mostly influenced by British governance, may not totally reflect the ancient norms. However, the mental make-up continues to be influenced by ancient philosophical systems.

Keywords: Dharma, Dukka (suffering), Rakshati, righteous

Procedia PDF Downloads 145
1425 Comparison of Different Artificial Intelligence-Based Protein Secondary Structure Prediction Methods

Authors: Jamerson Felipe Pereira Lima, Jeane Cecília Bezerra de Melo

Abstract:

The difficulty and cost of obtaining protein tertiary structure information through experimental methods, such as X-ray crystallography or NMR spectroscopy, have spurred the development of computational methods to do so. One approach used in the latter is prediction of the three-dimensional structure from the residue chain; however, this has been proved to be an NP-hard problem, due to the complexity of the process, as explained by the Levinthal paradox. An alternative is the prediction of intermediate structures, such as the secondary structure of the protein. Artificial intelligence methods, such as Bayesian statistics, artificial neural networks (ANN), and support vector machines (SVM), among others, have been used to predict protein secondary structure. Due to their good results, artificial neural networks have been used as a standard method to predict protein secondary structure. Recently published methods that use this technique have, in general, achieved a Q3 accuracy between 75% and 83%, whereas the theoretical accuracy limit for protein prediction is 88%. Alternatively, to achieve better results, support vector machine prediction methods have been developed. The statistical evaluation of methods that use different AI techniques, such as ANNs and SVMs, is not a trivial problem, since different training sets, validation techniques, and other variables can influence the behavior of a prediction method. In this study, we propose a prediction method based on artificial neural networks, which is then compared with a selected SVM method. The chosen SVM protein secondary structure prediction method is the one proposed by Huang in his work Extracting Physico chemical Features to Predict Protein Secondary Structure (2013). 
The developed ANN method uses the same training and testing process that Huang used to validate his method, which comprises the CB513 protein data set and three-fold cross-validation, so that the results of the two methods can be compared directly.
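For readers unfamiliar with the Q3 measure quoted above, the following is a minimal sketch of how it is commonly computed: the percentage of residues whose predicted state matches the observed state, with each residue assigned to one of three states (helix H, strand E, coil C). The function name `q3_accuracy` and the two example strings are illustrative assumptions, not data from the study.

```python
def q3_accuracy(predicted, observed):
    """Percentage of residues whose three-state label (H, E, C) is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


# Made-up ten-residue example: two residues are mispredicted.
observed = "HHHHEEECCC"
predicted = "HHHCEEECCH"
print(q3_accuracy(predicted, observed))  # 80.0
```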

Keywords: artificial neural networks, protein secondary structure, protein structure prediction, support vector machines

Procedia PDF Downloads 587
1424 A Medical Resource Forecasting Model for Emergency Room Patients with Acute Hepatitis

Authors: R. J. Kuo, W. C. Cheng, W. C. Lien, T. J. Yang

Abstract:

Taiwan is a hyperendemic area for the hepatitis B virus (HBV). The estimated total number of HBsAg carriers in the general population who are more than 20 years old is more than 3 million. Therefore, a case record review was conducted from January 2003 to June 2007 for all patients with a diagnosis of acute hepatitis who were admitted to the Emergency Department (ED) of a well-known teaching hospital. The cost of medical resource use is defined as the total medical fee. In this study, principal component analysis (PCA) is first employed to reduce the number of dimensions. Support vector regression (SVR) and an artificial neural network (ANN) are then used to develop the forecasting model. A total of 117 patients met the inclusion criteria, and 61% of the patients involved in this study were hepatitis B related. The computational results show that the proposed PCA-SVR model outperforms the other algorithms compared. In conclusion, the Child-Pugh score and echogram can both be used to predict the cost of medical resources for patients with acute hepatitis in the ED.

Keywords: acute hepatitis, medical resource cost, artificial neural network, support vector regression

Procedia PDF Downloads 405
1423 Multisymplectic Geometry and Noether Symmetries for the Field Theories and the Relativistic Mechanics

Authors: H. Loumi-Fergane, A. Belaidi

Abstract:

The problem of symmetries in field theory has been analyzed using geometric frameworks, such as multisymplectic models, in particular by using the multivector field formalism. In this paper, we expand the vector fields associated with infinitesimal symmetries, which give rise to invariant quantities as Noether currents, for classical field theories and relativistic mechanics using multisymplectic geometry, where the Poincaré-Cartan form has been greatly simplified using the Second Order Partial Differential Equation (SOPDE) for multivector fields satisfying the Euler equations. These symmetries have been classified naturally according to the construction of the fiber bundle used. In this work, unlike other works using the analytical method, our geometric model has allowed us, firstly, to distinguish the angular momenta of the gauge field obtained during different transformations, while these momenta are gathered in a single expression and are obtained during a rotation in Minkowski space. Secondly, no conditions are imposed on the Lagrangian of the mechanics with respect to its dependence on time and on qi; the currents obtained naturally from the transformations are, respectively, the energy and the momentum of the system.

Keywords: conservation laws, field theories, multisymplectic geometry, relativistic mechanics

Procedia PDF Downloads 183
1422 Flywheel Energy Storage Control Using SVPWM for Small Satellites Application

Authors: Noha El-Gohary, Thanaa El-Shater, A. A. Mahfouz, M. M. Sakr

Abstract:

High power conversion efficiency and long lifetime are important goals when designing a power supply subsystem for satellite applications. To fulfill these goals, this paper presents a power supply subsystem for small satellites in which a flywheel energy storage system is used as a secondary power source instead of a chemical battery. In this paper, the model of the flywheel energy storage system is introduced; a DC bus regulation control algorithm for charging and discharging the flywheel, based on the space vector pulse width modulation technique and motor current control, is also introduced. Simulation results show the operation of the flywheel in charging and discharging modes during illuminated and shadowed periods. The advantages of the proposed system are confirmed by the simulation results of the power supply system.

Keywords: small-satellites, flywheel energy storage system, space vector pulse width modulation, power conversion

Procedia PDF Downloads 377
1421 Optimal Feature Extraction Dimension in Finger Vein Recognition Using Kernel Principal Component Analysis

Authors: Amir Hajian, Sepehr Damavandinejadmonfared

Abstract:

In this paper, the issue of dimensionality reduction is investigated in finger vein recognition systems using kernel Principal Component Analysis (KPCA). One aspect of KPCA is finding the most appropriate kernel function for finger vein recognition, as there are several kernel functions that can be used within PCA-based algorithms. In this paper, however, another side of PCA-based algorithms, particularly KPCA, is investigated. The dimension of the feature vector in PCA-based algorithms is important, especially when it comes to real-world applications and usage of such algorithms. A fixed feature vector dimension has to be set to reduce the dimension of the input and output data and extract the features from them. A classifier is then applied to classify the data and make the final decision. We analyze KPCA (polynomial, Gaussian, and Laplacian) in detail in this paper and investigate the optimal feature extraction dimension in finger vein recognition using KPCA.

Keywords: biometrics, finger vein recognition, principal component analysis (PCA), kernel principal component analysis (KPCA)

Procedia PDF Downloads 342
1420 Representative Concentration Pathways Approach on Wolbachia Controlling Dengue Virus in Aedes aegypti

Authors: Ida Bagus Mandhara Brasika, I Dewa Gde Sathya Deva

Abstract:

Wolbachia has recently been developed as a natural enemy of Dengue virus (DENV). It inhibits the replication of DENV in Aedes aegypti. Both DENV and its vector, Aedes aegypti, are sensitive to climatic factors, especially temperature. Climate change has a direct impact on temperature, which in turn changes vector transmission. Temperature is known to affect Wolbachia density, as Wolbachia has an ideal temperature for growth. Scenarios known as Representative Concentration Pathways (RCPs) have been developed by the Intergovernmental Panel on Climate Change (IPCC) to predict the future climate based on greenhouse gas concentrations. These scenarios are applied to mitigate future changes in Aedes aegypti migration and to assess how Wolbachia could control the virus. The prediction will determine the schemes for releasing Wolbachia-injected Aedes aegypti to reduce DENV transmission.

Keywords: Aedes aegypti, climate change, dengue virus, Intergovernmental Panel on Climate Change, representative concentration pathways, Wolbachia

Procedia PDF Downloads 277
1419 Sentiment Analysis of Ensemble-Based Classifiers for E-Mail Data

Authors: Muthukumarasamy Govindarajan

Abstract:

Detection of unwanted, unsolicited mail, called spam, in e-mail is an interesting area of research. It is necessary to evaluate the performance of any new spam classifier using standard data sets. Recently, ensemble-based classifiers have gained popularity in this domain. In this research work, an efficient e-mail filtering approach based on ensemble methods is presented for developing an accurate and sensitive spam classifier. The proposed approach employs Naive Bayes (NB), Support Vector Machine (SVM), and Genetic Algorithm (GA) as base classifiers along with different ensemble methods. The experimental results show that the ensemble classifier achieves greater accuracy than the individual classifiers, and that the hybrid model results are better than those of the combined models for the e-mail dataset. The proposed ensemble-based classifiers turn out to be good in terms of classification accuracy, which is considered to be an important criterion for building a robust spam classifier.

Keywords: accuracy, arcing, bagging, genetic algorithm, Naive Bayes, sentiment mining, support vector machine

Procedia PDF Downloads 115
1418 First Survey of Seasonal Abundance and Daily Activity of Stomoxys calcitrans: In Zaouiet Sousse, the Sahel Area of Tunisia

Authors: Amira Kalifa, Faïek Errouissi

Abstract:

The seasonal changes and the daily activity of Stomoxys calcitrans (Diptera: Muscidae) were examined, using Vavoua traps, on a dairy cattle farm in Zaouiet Sousse, in the Sahel area of Tunisia, from May 2014 to October 2014. Over this period, a total of 4366 hematophagous Diptera were captured, and Stomoxys calcitrans was the most commonly trapped species (96.52%). Analysis of the seasonal activity showed that S. calcitrans is bivoltine, with two peaks: a significant peak recorded in May-June, during the dry season, and a second, quite weak peak at the end of October. This seasonal pattern would depend on climatic factors, particularly the temperature of the manure and that of the air. The activity pattern of Stomoxys calcitrans was diurnal with seasonal variations. The daily rhythm shows a peak between 11:00 and 15:00 in May and between 11:00 and 17:00 in June. These flies are important pests of livestock in Tunisia, where they are known as mechanical vectors of several pathogens and have a considerable economic and health impact on livestock. A better knowledge of their ecology is a prerequisite for more efficient control measures.

Keywords: cattle farm, daily rhythm, Stomoxys calcitrans, seasonal activity

Procedia PDF Downloads 247
1417 Low-Voltage Multiphase Brushless DC Motor for Electric Vehicle Application

Authors: Mengesha Mamo Wogari

Abstract:

In this paper, a low-voltage multiphase brushless DC motor with square-wave air-gap flux distribution for electric vehicle applications is proposed. A ten-phase, 5 kW motor has been designed and simulated by finite element methods, demonstrating the desired high torque capability at low speed and flux-weakening operation at high speed. The motor torque is proportional to the number of phases for a constant phase current and air-gap flux. The concept of vector control and a simple space vector modulation technique are used in MATLAB to control the motor, demonstrating a simple switching pattern for a selected number of phases. The low-voltage DC and inverter output AC are desirable characteristics for avoiding electric shock in the vehicle, whether accidental or during abnormal conditions. The switching devices for the inverter have a low voltage rating and are cost-effective, though their number is equal to twice the number of phases.

Keywords: brushless DC motors, electric vehicle, finite element methods, low-voltage inverter, multiphase

Procedia PDF Downloads 123
1416 ANN Based Simulation of PWM Scheme for Seven Phase Voltage Source Inverter Using MATLAB/Simulink

Authors: Mohammad Arif Khan

Abstract:

This paper analyzes and presents the development of an Artificial Neural Network based controller for space vector modulation (ANN-SVPWM) for a seven-phase voltage source inverter. First, the conventional method of producing sinusoidal output voltage, in which six active space vectors and one zero space vector are used to synthesize the input reference, is elaborated, and then a new PWM scheme called Artificial Neural Network Based PWM is presented. The ANN-based controller has the advantage of very fast implementation and analysis of the algorithms, and it avoids the direct computation of trigonometric and non-linear functions. The ANN controller uses the individual training strategy with fixed-weight and supervised models. A computer simulation program has been developed using MATLAB/Simulink together with the neural network toolbox for training the ANN controller. A comparison of the proposed scheme with the conventional scheme is presented based on various performance indices. Extensive simulation results are provided to validate the findings.

Keywords: space vector PWM, total harmonic distortion, seven-phase, voltage source inverter, multi-phase, artificial neural network

Procedia PDF Downloads 434
1415 Corporate Social Responsibility the New Route to Competitive Advantage: An Applied Study on Telecommunication Sector in Egypt

Authors: Rania Sherif Abd El-Azim

Abstract:

The role of corporate social responsibility (CSR) in business has evolved and led to an era where industry leaders can no longer overlook the importance of being participative corporate citizens. This is not only because of the media’s skeptical attitude toward whether or not companies’ CSR efforts are sincere but also due to key stakeholders’ ability to hold companies to a higher standard than ever before, as companies can gain competitive advantage through CSR. These programs address global challenges, such as climate and poverty, or simply improve employee retention, so it has become increasingly clear that CSR is not just a new trend for companies but a necessary tool that organizations must integrate into their overall business strategies to build a stronger reputation, increase credibility among their key audience, and enhance customers’ willingness to repurchase, willingness to pay a premium price, and positive word of mouth. According to the literature review, the link between CSR and competitive advantage at the firm level has long been an important topic for both CSR researchers and practitioners. Thus, CSR can play an important role in enhancing the firm's competitive advantage, which seems an attractive area to investigate, especially in Egypt. This paper therefore investigates the role of corporate social responsibility in enhancing the firm's competitive advantage.

Keywords: corporate social responsibility, competitive advantage, corporate reputation, customers' willingness to repurchase, willingness to pay premium price, positive word of mouth

Procedia PDF Downloads 289
1414 Socio Economic Impact and Status of the Islamic Perspective of Veil

Authors: Shagufta Jahangir, Nadeemullah, Yaqoob, Raisa Jahangir

Abstract:

The Persian word ‘Purdah’ and the Arabic word ‘Hajab’ are used for the veil. The veil has been used by women to shield themselves from men. In one way or another, the veil has been used continuously by women in ancient as well as modern civilizations. Developed nations have blamed the use of the veil as an obstacle in the process of development. Therefore, modern nations have struggled to get rid of the use of the veil. They argue that it is a sign of slavery for women and an obstacle in the path of development. Modern secular Muslims consider the veil the biggest obstacle to social and economic development. In their view, it makes a woman helpless, like a zanjir on her feet, and has become an obstacle in the process of development for women. It is also considered a tool for segregation between men and women. The so-called Muslims of the modern era are trying to introduce changes in religion by imitating the modern nations of the world. For Muslim women in particular, the use of the veil is a must in Islam. It is a right provided to her by religion, and it provides her strength. In the Holy Quran, the word ‘Hajab’ is used 5 times. Islam is against domination and the forceful practice of the veil; as a part of the teaching of Islam, it is adopted by women as a protection. This article covers: (1) the historical background of the veil, (2) its existence in civilizations, (3) the meaning and interpretation of the veil in the Islamic context, (4) its economic impact on women, and (5) a discussion of its practice in Islamic (eastern) and other (European) circles, with conclusions followed by a concerted bibliography.

Keywords: veil, economic development, civilizations, obstacle, secular Muslims, segregation

Procedia PDF Downloads 306
1413 The Ability of Forecasting the Term Structure of Interest Rates Based on Nelson-Siegel and Svensson Model

Authors: Tea Poklepović, Zdravka Aljinović, Branka Marasović

Abstract:

Due to the importance of the yield curve and its estimation, it is essential to have valid methods for yield curve forecasting in cases where there are few issues of securities and/or weak trade on the secondary market. Therefore, in this paper, after the estimation of weekly yield curves on the Croatian financial market from October 2011 to August 2012 using the Nelson-Siegel and Svensson models, yield curves are forecasted using a vector autoregressive model and neural networks. In general, it can be concluded that both forecasting methods have good prediction abilities, where forecasting of yield curves based on the Nelson-Siegel estimation model gives better results, in the sense of lower mean squared error, than forecasting based on the Svensson model. Also, in this case, neural networks provide slightly better results. Finally, it can be concluded that the most appropriate way of predicting the yield curve is neural networks using the Nelson-Siegel estimation of yield curves.
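As background for the two estimation models named above, here is a minimal sketch of the standard Nelson-Siegel and Svensson yield functions. It shows only the curve shapes, not the authors' estimation or forecasting procedure. The parameter names (`beta0` to `beta3`, `decay`, `decay1`, `decay2`) and the numeric values are illustrative assumptions, not estimates for the Croatian market.

```python
import math


def _loadings(maturity, decay):
    """Slope and curvature loadings shared by both models (maturity must be > 0)."""
    x = maturity / decay
    slope = (1.0 - math.exp(-x)) / x
    curvature = slope - math.exp(-x)
    return slope, curvature


def nelson_siegel(maturity, beta0, beta1, beta2, decay):
    """Nelson-Siegel yield: level + slope term + one hump term."""
    slope, curvature = _loadings(maturity, decay)
    return beta0 + beta1 * slope + beta2 * curvature


def svensson(maturity, beta0, beta1, beta2, beta3, decay1, decay2):
    """Svensson yield: Nelson-Siegel plus a second hump with its own decay."""
    _, second_hump = _loadings(maturity, decay2)
    return nelson_siegel(maturity, beta0, beta1, beta2, decay1) + beta3 * second_hump


# Illustrative parameters only: long-run level 4%, short end near 4% - 2% = 2%.
for years in (0.25, 1, 5, 10, 30):
    print(years, round(nelson_siegel(years, 0.04, -0.02, 0.01, 2.0), 5))
```

In this form `beta0` is the long-maturity level and `beta0 + beta1` is the limit at very short maturities, which makes the fitted parameters easy to interpret. The Svensson model's extra term gives it more flexibility at the cost of two more parameters to estimate.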

Keywords: Nelson-Siegel Model, neural networks, Svensson Model, vector autoregressive model, yield curve

Procedia PDF Downloads 291
1412 Multi-Vehicle Detection Using Histogram of Oriented Gradients Features and Adaptive Sliding Window Technique

Authors: Saumya Srivastava, Rina Maiti

Abstract:

In order to achieve a better performance of vehicle detection in a complex environment, we present an efficient approach for a multi-vehicle detection system using an adaptive sliding window technique. For a given frame, image segmentation is carried out to establish the region of interest. Gradient computation followed by thresholding, denoising, and morphological operations is performed to extract the binary search image. Near-region field and far-region field are defined to generate hypotheses using the adaptive sliding window technique on the resultant binary search image. For each vehicle candidate, features are extracted using a histogram of oriented gradients, and a pre-trained support vector machine is applied for hypothesis verification. Later, the Kalman filter is used for tracking the vanishing point. The experimental results show that the method is robust and effective on various roads and driving scenarios. The algorithm was tested on highways and urban roads in India.
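To make the feature extraction step above more concrete, the following is a simplified sketch of the core of a histogram of oriented gradients: a magnitude-weighted histogram of gradient orientations over one image cell. It is not the authors' pipeline. The 9 unsigned orientation bins are a common default assumed here, and the function name `cell_orientation_histogram` is hypothetical. Full HOG descriptors also interpolate votes between bins and normalize over blocks of cells, which this sketch omits.

```python
import math


def cell_orientation_histogram(cell, bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations (0 to 180 degrees)."""
    height, width = len(cell), len(cell[0])
    histogram = [0.0] * bins
    bin_width = 180.0 / bins
    # Central differences need both neighbours, so border pixels are skipped.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            histogram[int(angle // bin_width) % bins] += magnitude
    return histogram


# Intensity increases left to right, so every gradient points along the x axis.
horizontal_ramp = [[0, 1, 2, 3] for _ in range(4)]
print(cell_orientation_histogram(horizontal_ramp))
```

Concatenating such histograms over all cells of a candidate window gives the feature vector that a pre-trained classifier, such as the support vector machine mentioned in the abstract, can score.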

Keywords: gradient, vehicle detection, histograms of oriented gradients, support vector machine

Procedia PDF Downloads 96
1411 Machine Learning-Driven Prediction of Cardiovascular Diseases: A Supervised Approach

Authors: Thota Sai Prakash, B. Yaswanth, Jhade Bhuvaneswar, Marreddy Divakar Reddy, Shyam Ji Gupta

Abstract:

Across the globe, there are a lot of chronic diseases, and heart disease stands out as one of the most perilous. Sadly, many lives are lost to this condition, even though early intervention could prevent such tragedies. However, identifying heart disease in its initial stages is not easy. To address this challenge, we propose an automated system aimed at predicting the presence of heart disease using advanced techniques. By doing so, we hope to empower individuals with the knowledge needed to take proactive measures against this potentially fatal illness. Our approach towards this problem involves meticulous data preprocessing and the development of predictive models utilizing classification algorithms such as Support Vector Machines (SVM), Decision Tree, and Random Forest. We assess the efficiency of every model based on metrics like accuracy, ensuring that we select the most reliable option. Additionally, we conduct thorough data analysis to reveal the importance of different attributes. Among the models considered, Random Forest emerges as the standout performer with an accuracy rate of 96.04% in our study.

Keywords: support vector machines, decision tree, random forest

Procedia PDF Downloads 18
1410 An Automatic Speech Recognition of Conversational Telephone Speech in Malay Language

Authors: M. Draman, S. Z. Muhamad Yassin, M. S. Alias, Z. Lambak, M. I. Zulkifli, S. N. Padhi, K. N. Baharim, F. Maskuriy, A. I. A. Rahim

Abstract:

The performance of a Malay automatic speech recognition (ASR) system for the call centre environment is presented. The system uses the Kaldi toolkit as the platform for the entire library and the algorithms used in performing the ASR task. The acoustic model implemented in this system uses a deep neural network (DNN) to model the acoustic signal and a standard n-gram model for language modelling. With 80 hours of training data from call centre recordings, the ASR system achieves 72% accuracy, which corresponds to a 28% word error rate (WER). The testing was done using 20 hours of audio data. Despite the use of a DNN, the system shows low accuracy owing to the variety of noises, accents, and dialects that typically occur in the Malaysian call centre environment. This significant variation among speakers is reflected in the large standard deviation of the average word error rate (WERav) (i.e., ~10%). The lowest WER (13.8%) was obtained from a recording sample of a native speaker using the standard Malay dialect (central Malaysia), compared with 49% for the sample with the highest WER, which contains the conversation of a speaker who uses a non-standard Malay dialect.
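The relationship between accuracy and WER quoted above (72% accuracy for a 28% WER) follows the usual convention that word accuracy is one minus WER. A minimal sketch of the standard WER calculation is given below: the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. The function name `word_error_rate` and the example sentences are illustrative assumptions; they are not taken from the call centre data.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # previous[j]: edit distance between the reference words seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


# One deleted word ("my") and one substituted word give 2 errors over 5 words.
wer = word_error_rate("please check my account balance", "please check account balances")
print(wer, 1 - wer)  # 0.4 0.6
```

Note that WER can exceed 100% when the hypothesis contains many insertions, so the "accuracy = 1 - WER" reading is only a convention.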

Keywords: conversational speech recognition, deep neural network, Malay language, speech recognition

Procedia PDF Downloads 298
1409 Music Genre Classification Based on Non-Negative Matrix Factorization Features

Authors: Soyon Kim, Edward Kim

Abstract:

In order to retrieve information from the massive stream of songs in the music industry, music search by title, lyrics, artist, mood, and genre has become more important. Despite the subjectivity and controversy over the definition of music genres across different nations and cultures, automatic genre classification systems that facilitate the process of music categorization have been developed. Manual genre selection by music producers is being provided as statistical data for designing automatic genre classification systems. In this paper, an automatic music genre classification system utilizing non-negative matrix factorization (NMF) is proposed. Short-term characteristics of the music signal can be captured based on the timbre features such as mel-frequency cepstral coefficient (MFCC), decorrelated filter bank (DFB), octave-based spectral contrast (OSC), and octave band sum (OBS). Long-term time-varying characteristics of the music signal can be summarized with (1) the statistical features such as mean, variance, minimum, and maximum of the timbre features and (2) the modulation spectrum features such as spectral flatness measure, spectral crest measure, spectral peak, spectral valley, and spectral contrast of the timbre features. Not only these conventional basic long-term feature vectors, but also NMF based feature vectors are proposed to be used together for genre classification. In the training stage, NMF basis vectors were extracted for each genre class. The NMF features were calculated in the log spectral magnitude domain (NMF-LSM) as well as in the basic feature vector domain (NMF-BFV). For NMF-LSM, an entire full band spectrum was used. However, for NMF-BFV, only low band spectrum was used since high frequency modulation spectrum of the basic feature vectors did not contain important information for genre classification. 
In the test stage, using the set of pre-trained NMF basis vectors, the genre classification system extracted the NMF weighting values of each genre as the NMF feature vectors. A support vector machine (SVM) was used as the classifier. The GTZAN multi-genre music database, which is composed of 10 genres with 100 songs per genre, was used for training and testing. To increase the reliability of the experiments, 10-fold cross-validation was used. For a given input song, the extracted NMF-LSM feature vector was composed of 10 weighting values that corresponded to the classification probabilities for the 10 genres. The NMF-BFV feature vector also had a dimensionality of 10. Combined with the basic long-term features (the statistical features and the modulation spectrum features), the NMF features increased accuracy with only a slight increase in feature dimensionality. The conventional basic features by themselves yielded 84.0% accuracy, whereas the basic features with NMF-LSM and with NMF-BFV yielded 85.1% and 84.2% accuracy, respectively. The basic features required a dimensionality of 460, while NMF-LSM and NMF-BFV each required a dimensionality of 10. Combining the basic features, NMF-LSM, and NMF-BFV with an SVM using a radial basis function (RBF) kernel produced a significantly higher classification accuracy of 88.3% with a feature dimensionality of 480.
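The test-stage step described above, computing non-negative weighting values for an input against fixed, pre-trained basis vectors, can be illustrated with a minimal sketch. The abstract does not state which NMF cost function or update rule the authors used; the multiplicative update for the Euclidean cost shown here is an assumption, and the function name `nmf_weights` and the toy basis are hypothetical.

```python
def nmf_weights(v, basis, iters=500, eps=1e-12):
    """Estimate non-negative weights h so that sum_k h[k] * basis[k] approximates v.

    The basis vectors stay fixed (as in a test stage with pre-trained bases).
    Assumption: Euclidean cost with the multiplicative update
    h <- h * (W^T v) / (W^T W h); the source abstract does not specify this.
    """
    k, n = len(basis), len(v)
    h = [1.0] * k  # positive start keeps every weight non-negative
    for _ in range(iters):
        # Current reconstruction W h, computed once per iteration
        recon = [sum(h[j] * basis[j][i] for j in range(k)) for i in range(n)]
        for j in range(k):
            num = sum(basis[j][i] * v[i] for i in range(n))
            den = sum(basis[j][i] * recon[i] for i in range(n)) + eps
            h[j] *= num / den
    return h


# Hypothetical toy example: two overlapping basis vectors, one input vector
toy_basis = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
toy_input = [1.0, 3.0, 2.0]  # equals 1 * basis[0] + 2 * basis[1]
weights = nmf_weights(toy_input, toy_basis)
```

In a pipeline like the one described, a weight vector of this kind (one value per genre basis) would be appended to the other features before classification.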

Keywords: mel-frequency cepstral coefficient (MFCC), music genre classification, non-negative matrix factorization (NMF), support vector machine (SVM)

Procedia PDF Downloads 272
1408 Visual Thing Recognition with Binary Scale-Invariant Feature Transform and Support Vector Machine Classifiers Using Color Information

Authors: Wei-Jong Yang, Wei-Hau Du, Pau-Choo Chang, Jar-Ferr Yang, Pi-Hsia Hung

Abstract:

The demand for smart visual thing recognition in various devices has increased rapidly in recent years for smart production, living, and learning systems. This paper proposes a visual thing recognition system that combines binary scale-invariant feature transform (SIFT), the bag of words (BoW) model, and support vector machine (SVM) classifiers using color information. Traditional SIFT features and SVM classifiers use only gray information, although color information is an important feature for visual thing recognition. With color-based SIFT features and SVM, we can discard unreliable matching pairs and increase the robustness of matching tasks. The experimental results show that the proposed object recognition system with the color-assisted SIFT SVM classifier achieves a higher recognition rate than the traditional gray SIFT and SVM classification in various situations.

Keywords: color moments, visual thing recognition system, SIFT, color SIFT

Procedia PDF Downloads 440
1407 Fuzzy-Machine Learning Models for the Prediction of Fire Outbreak: A Comparative Analysis

Authors: Uduak Umoh, Imo Eyoh, Emmauel Nyoho

Abstract:

This paper compares fuzzy-machine learning algorithms, namely Support Vector Machine (SVM) and K-Nearest Neighbor (KNN), for predicting cases of fire outbreak. The paper uses a fire outbreak dataset with three features (Temperature, Smoke, and Flame). The data is pre-processed using the Interval Type-2 Fuzzy Logic (IT2FL) algorithm, Min-Max Normalization, and Principal Component Analysis (PCA), which are used to predict feature labels in the dataset, normalize the dataset, and select relevant features, respectively. The output of the pre-processing is a dataset with two principal components (PC1 and PC2). The pre-processed dataset is then used to train the aforementioned machine learning models. K-fold cross-validation (with K=10) is used to evaluate the performance of the models using three metrics: ROC (Receiver Operating Characteristic curve), Specificity, and Sensitivity. The models are also tested with 20% of the dataset. The validation results show that KNN is the better model for fire outbreak detection with an ROC value of 0.99878, followed by SVM with an ROC value of 0.99753.
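Two of the steps named above, Min-Max Normalization and K-Nearest Neighbor classification, can be sketched in a few lines. This is only an illustration under stated assumptions: the readings and labels below are invented, the helper names are hypothetical, and the IT2FL labelling, PCA, and cross-validation steps of the paper are not reproduced.

```python
import math
from collections import Counter


def min_max_fit(rows):
    """Return per-feature minima and maxima of the training rows."""
    cols = list(zip(*rows))
    return [min(c) for c in cols], [max(c) for c in cols]


def min_max_scale(row, lo, hi):
    """Scale one row to [0, 1] per feature; constant features map to 0."""
    return [(x - l) / (h - l) if h > l else 0.0 for x, l, h in zip(row, lo, hi)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training points closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


# Invented (temperature, smoke, flame) readings, for illustration only
readings = [[25, 0.10, 0], [30, 0.20, 0], [28, 0.15, 0],
            [80, 0.90, 1], [95, 0.80, 1], [70, 0.70, 1]]
labels = ["no fire", "no fire", "no fire", "fire", "fire", "fire"]

lo, hi = min_max_fit(readings)
scaled = [min_max_scale(r, lo, hi) for r in readings]
prediction = knn_predict(scaled, labels, min_max_scale([85, 0.85, 1], lo, hi))
```

Scaling the query with the minima and maxima fitted on the training rows keeps the distance computation consistent between training and prediction.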

Keywords: Machine Learning Algorithms, Interval Type-2 Fuzzy Logic, Fire Outbreak, Support Vector Machine, K-Nearest Neighbour, Principal Component Analysis

Procedia PDF Downloads 153
1406 An Event-Related Potential Study of Individual Differences in Word Recognition: The Evidence from Morphological Knowledge of Sino-Korean Prefixes

Authors: Jinwon Kang, Seonghak Jo, Joohee Ahn, Junghye Choi, Sun-Young Lee

Abstract:

Morphological priming has proved its importance by showing that words are segmented into morphemes when visual words are recognized within a very short time. Focusing on Sino-Korean prefixes, this study conducted a visual masked priming experiment with a 57 ms stimulus-onset asynchrony (SOA) to see how individual differences in the amount of morphological knowledge affect morphological priming. The relationship between the prime and target words was classified into morphological (e.g., 미개척 migaecheog [unexplored] – 미해결 mihaegyel [unresolved]), semantic (e.g., 친환경 chinhwangyeong [eco-friendly] – 무공해 mugonghae [no-pollution]), and orthographic (e.g., 미용실 miyongsil [beauty shop] – 미확보 mihwagbo [uncertainty]) conditions. We then compared the priming by configuring unrelated paired stimuli as the control for each condition. In the behavioral data, we observed facilitatory priming in the group with high morphological knowledge only under the morphological condition. In contrast, the group with low morphological knowledge showed priming only under the orthographic condition. In the event-related potential (ERP) data, the group with high morphological knowledge presented the N250 only under the morphological condition. The findings of this study imply that individual differences in morphological knowledge in Korean may have a significant influence on the segmental processing of Korean word recognition.

Keywords: ERP, individual differences, morphological priming, sino-Korean prefixes

Procedia PDF Downloads 186
1405 Comparison of Techniques for Detection and Diagnosis of Eccentricity in the Air-Gap Fault in Induction Motors

Authors: Abrahão S. Fontes, Carlos A. V. Cardoso, Levi P. B. Oliveira

Abstract:

Induction motors are used worldwide in various industries. Several maintenance techniques are applied to increase the operating time and the lifespan of these motors. Among these, predictive maintenance techniques such as Motor Current Signature Analysis (MCSA), Motor Square Current Signature Analysis (MSCSA), Park's Vector Approach (PVA), and Park's Vector Square Modulus (PVSM) are used to detect and diagnose faults in electric motors, which are characterized by patterns in the stator current frequency spectrum. In this article, these techniques are applied and compared on a real motor that has an air-gap eccentricity fault. A theoretical model of a fault-free electric induction motor was used to assist the comparison between the stator current frequency spectrum patterns with and without faults. Metrics were proposed and applied to evaluate the fault detection sensitivity of each technique. The results presented here show that the above techniques are suitable for detecting air-gap eccentricity, and the comparison between them shows the suitability of each one.
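As background for the Park's Vector Approach and its square modulus, the sketch below computes the Park's vector components from three phase currents using the transformation commonly given for this method. It is not taken from the article; the function names and the ideal balanced test currents are assumptions for illustration. For ideal balanced sinusoidal currents of amplitude I, the modulus is constant at (√6/2)·I, so the d-q locus is a circle.

```python
import math


def park_vector(ia, ib, ic):
    """Park's vector components (i_d, i_q) from instantaneous phase currents."""
    i_d = math.sqrt(2.0 / 3.0) * ia - ib / math.sqrt(6.0) - ic / math.sqrt(6.0)
    i_q = (ib - ic) / math.sqrt(2.0)
    return i_d, i_q


def park_square_modulus(ia, ib, ic):
    """Square modulus i_d^2 + i_q^2 of the Park's vector."""
    i_d, i_q = park_vector(ia, ib, ic)
    return i_d ** 2 + i_q ** 2


# Ideal balanced three-phase currents (assumed amplitude 10 A), one period
amplitude = 10.0
moduli = []
for step in range(100):
    theta = 2.0 * math.pi * step / 100
    ia = amplitude * math.cos(theta)
    ib = amplitude * math.cos(theta - 2.0 * math.pi / 3.0)
    ic = amplitude * math.cos(theta + 2.0 * math.pi / 3.0)
    moduli.append(math.sqrt(park_square_modulus(ia, ib, ic)))
```

With measured currents, deviations of this locus from a circle, or extra components in the spectrum of the square modulus, are the kind of pattern such techniques look for.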

Keywords: eccentricity in the air-gap, fault diagnosis, induction motors, predictive maintenance

Procedia PDF Downloads 325
1404 Information-Controlled Laryngeal Feature Variations in Korean Consonants

Authors: Ponghyung Lee

Abstract:

This study investigates variations in Korean consonants that center on the laryngeal features of the sounds concerned, to the exclusion of other features. Our fundamental premise is that the weak contrast among the segments concerned may account for the instability of these consonants. Moreover, we assume that a set of measures of the communicative efficiency of linguistic units significantly influences the triggering of those variations. To this end, we computed the surprisal, entropic contribution, and relative contrastiveness associated with Korean obstruent consonants. We found that the information-theoretic perspective lends considerable support to our approach. That is, the variant realizations, both chronological and stylistic, prove to be profoundly affected by the information-theoretic factors enumerated above. For the biblical proper names, we use the Georgetown University CQP Web-Bible corpora. From 8 texts (4 from the Old Testament and 4 from the New Testament) among the total of 64 texts, we extracted 199 samples. We address the issue of laryngeal feature variations associated with Korean obstruent consonants under the presumption that the variations stem from the weak contrast among the triad manifestations of laryngeal features. The variants emerge from diverse sources in chronological and stylistic senses: Christian biblical texts, ordinary casual speech, the shift of loanword adaptation over time, and ideophones. To discuss them from the perspective of Information Theory, it is necessary to look closely at the data. Among them, the massive changes in the loanword adaptation of proper nouns during the centennial history of Korean Christianity draw our special attention.
We searched 199 types of initially capitalized words among 45,528 word tokens, which account for around 5% of the total 901,701 word tokens (12,786 word types) in the Georgetown University CQP Web-Bible corpora. We focus on the shift of the laryngeal features incorporated into word-initial consonants, which is observable through two distinct versions of the Korean Bible: one came out in the 1960s for the Protestants, and the other was published in the 1990s for the Catholic Church. Of these proper names, we have closely traced the adaptation of plain obstruents, e.g., /b, d, g, s, ʤ/, in the sources. The results show that as many as 41% of the extracted proper names show variations: 37% in terms of aspiration and 4% in terms of tensing. This study set out to shed light on the question: to what extent can we attribute the variations in the laryngeal features associated with Korean obstruent consonants to the communicative aspects of linguistic activities? In this vein, the concerted effects of the triad of surprisal, entropic contribution, and relative contrastiveness can be credited with the ups and downs in the feature specification, although the role of surprisal remains contentious to some extent.
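Two of the information-theoretic measures named above have standard definitions that are easy to illustrate: the surprisal of a unit with probability p is −log2(p) bits, and its contribution to entropy is p·(−log2 p). The sketch below estimates both from relative frequencies. The symbol sequence is invented, the function name is hypothetical, and the study's exact estimation procedure (and its measure of relative contrastiveness) is not given in the abstract, so none of that is reproduced here.

```python
import math
from collections import Counter


def surprisal_table(tokens):
    """Map each symbol to (probability, surprisal in bits, entropy contribution)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    table = {}
    for symbol, n in counts.items():
        p = n / total
        s = -math.log2(p)
        table[symbol] = (p, s, p * s)
    return table


# Invented sequence of word-initial consonant symbols, for illustration only
sample = ["p", "p", "p", "p", "t", "t", "k", "k"]
table = surprisal_table(sample)
entropy = sum(contribution for _, _, contribution in table.values())
```

Frequent symbols carry low surprisal, rare ones high surprisal, and the entropy contributions sum to the entropy of the whole distribution.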

Keywords: entropic contribution, laryngeal feature variation, relative contrastiveness, surprisal

Procedia PDF Downloads 101
1403 Characterization of Climatic Drought in the Saiss Plateau (Morocco) Using Statistical Indices

Authors: Abdeghani Qadem

Abstract:

Climate change is now an undeniable reality with increasing impacts on water systems worldwide, especially leading to severe drought episodes. The Southern Mediterranean region is particularly affected by this drought, which can have devastating consequences on water resources. Morocco, due to its geographical location in North Africa and the Southern Mediterranean, is especially vulnerable to these effects of climate change, particularly drought. In this context, this article focuses on the study of climate variability and drought characteristics in the Saiss Plateau region and its adjacent areas with the Middle Atlas, using specific statistical indices. The study begins by analyzing the annual precipitation variation, with a particular emphasis on data homogenization and gap filling using a regional vector. Then, the analysis delves into drought episodes in the region, using the Standardized Precipitation Index (SPI) over a 12-month period. The central objective is to accurately assess significant drought changes between 1980 and 2015, based on data collected from nine meteorological stations located in the study area.
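The 12-month index mentioned above can be approximated in a few lines. Note the simplification: the Standardized Precipitation Index is commonly computed by fitting a probability distribution (typically gamma) to the accumulated precipitation and transforming it to a standard normal variate, whereas this sketch only standardizes the 12-month totals with a plain z-score. The monthly values are invented and the function names are hypothetical.

```python
from statistics import mean, pstdev


def rolling_totals(monthly, window=12):
    """Precipitation accumulated over each run of `window` consecutive months."""
    return [sum(monthly[i - window + 1:i + 1])
            for i in range(window - 1, len(monthly))]


def standardized_index(totals):
    """Z-score of each total: a simplified stand-in for the gamma-based SPI."""
    mu, sigma = mean(totals), pstdev(totals)
    return [(t - mu) / sigma for t in totals]


# Invented monthly precipitation (mm) for three years, for illustration only
monthly_mm = [float((i * 7) % 13) * 10.0 for i in range(36)]
totals_12 = rolling_totals(monthly_mm)
index_12 = standardized_index(totals_12)
```

Negative values mark 12-month periods drier than the series average and positive values wetter ones; a real analysis would use the distribution-fitting step and a long station record such as the 1980 to 2015 data described above.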

Keywords: climate variability, regional vector, drought, standardized precipitation index, Saiss Plateau, Middle Atlas

Procedia PDF Downloads 43
1402 External Sector and Its Impact on Economic Growth of Pakistan (1990-2010)

Authors: Rizwan Fazal

Abstract:

This study investigates the behavior of the external sector of Pakistan's economy and its impact on economic growth, using quarterly data for the period 1990:01-2010:04. The external sector indices used in this study are financial integration, net foreign assets, and trade integration. The Augmented Dickey-Fuller test confirms that all external sector variables are non-stationary at level but become stationary at first difference. The co-integration test suggests one co-integrating relationship among the variables. The analysis is based on a Vector Auto Regression model followed by a Vector Error Correction Model. The empirical findings show that financial integration plays an important role in increasing economic growth in Pakistan, while trade integration has a negative effect on economic growth in the long run. In the short run, however, the results confirm that the output lag accounts for error correction. The estimated CUSUM and CUSUMQ stability tests indicate that the estimated equation remains stable over the study period.

Keywords: financial integration, trade integration, net foreign assets, gross domestic product

Procedia PDF Downloads 251
1401 Early Recognition and Grading of Cataract Using a Combined Log Gabor/Discrete Wavelet Transform with ANN and SVM

Authors: Hadeer R. M. Tawfik, Rania A. K. Birry, Amani A. Saad

Abstract:

The eyes are considered to be the most sensitive and important organs of the human body. Thus, any eye disorder affects the patient in all aspects of life. Cataract is one of the eye disorders that lead to blindness if not treated correctly and quickly. This paper demonstrates a model for the automatic detection, classification, and grading of cataracts based on image processing techniques and artificial intelligence. The proposed system is developed to ease the cataract diagnosis process for both ophthalmologists and patients. The wavelet transform combined with the 2D Log Gabor wavelet transform was used for feature extraction on a dataset of 120 eye images, followed by a classification process that divided the image set into three classes: normal, early, and advanced stage. The two classifiers used, the support vector machine (SVM) and the artificial neural network (ANN), were compared on the same dataset of 120 eye images. It was concluded that SVM gave better results than ANN: SVM achieved 96.8% accuracy, whereas ANN achieved 92.3% accuracy.

Keywords: cataract, classification, detection, feature extraction, grading, log-gabor, neural networks, support vector machines, wavelet

Procedia PDF Downloads 301
1400 An Experimental Study on the Variability of Nonnative and Native Inference of Word Meanings in Timed and Untimed Conditions

Authors: Swathi M. Vanniarajan

Abstract:

Reading research suggests that online contextual vocabulary comprehension while reading is an interactive and integrative process. Success in it depends on a variety of factors, including the amount and nature of the available linguistic and nonlinguistic cues, the reader's analytical and integrative skills, schema memory (content familiarity), and processing speed, characterized along the continuum from controlled to automatic processing. The experiment reported here was conducted with 30 native speakers as one group and 30 nonnative speakers as another group (all graduate students). It hypothesized that, while working on 24 tasks that required them to comprehend an unfamiliar word in real time without backtracking, the nonnative subjects, owing to differences in the nature of the two groups' reading processes, would be less able than the native subjects to construct the meanings of the unknown words by integrating the multiple but sufficient contextual cues provided in the text. The results indicated that there were significant inter-group as well as intra-group differences in the quality of the definitions given. However, when given additional time, the nonnative speakers could significantly improve the quality of their definitions, while the native speakers in general could not. This suggests that, all things being equal, time is a significant factor for success in nonnative vocabulary and reading comprehension processes and that accuracy precedes automaticity in the development of nonnative reading processes as well.

Keywords: reading, second language processing, vocabulary comprehension

Procedia PDF Downloads 144