Search results for: Association rule mining
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1151

Search results for: Association rule mining

341 Sounds Alike Name Matching for Myanmar Language

Authors: Yuzana, Khin Marlar Tun

Abstract:

Personal name matching system is the core of essential task in national citizen database, text and web mining, information retrieval, online library system, e-commerce and record linkage system. It has necessitated to the all embracing research in the vicinity of name matching. Traditional name matching methods are suitable for English and other Latin based language. Asian languages which have no word boundary such as Myanmar language still requires sounds alike matching system in Unicode based application. Hence we proposed matching algorithm to get analogous sounds alike (phonetic) pattern that is convenient for Myanmar character spelling. According to the nature of Myanmar character, we consider for word boundary fragmentation, collation of character. Thus we use pattern conversion algorithm which fabricates words in pattern with fragmented and collated. We create the Myanmar sounds alike phonetic group to help in the phonetic matching. The experimental results show that fragmentation accuracy in 99.32% and processing time in 1.72 ms.

Keywords: natural language processing, name matching, phonetic matching

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1798
340 Automatic Clustering of Gene Ontology by Genetic Algorithm

Authors: Razib M. Othman, Safaai Deris, Rosli M. Illias, Zalmiyah Zakaria, Saberi M. Mohamad

Abstract:

Nowadays, Gene Ontology has been used widely by many researchers for biological data mining and information retrieval, integration of biological databases, finding genes, and incorporating knowledge in the Gene Ontology for gene clustering. However, the increase in size of the Gene Ontology has caused problems in maintaining and processing them. One way to obtain their accessibility is by clustering them into fragmented groups. Clustering the Gene Ontology is a difficult combinatorial problem and can be modeled as a graph partitioning problem. Additionally, deciding the number k of clusters to use is not easily perceived and is a hard algorithmic problem. Therefore, an approach for solving the automatic clustering of the Gene Ontology is proposed by incorporating cohesion-and-coupling metric into a hybrid algorithm consisting of a genetic algorithm and a split-and-merge algorithm. Experimental results and an example of modularized Gene Ontology in RDF/XML format are given to illustrate the effectiveness of the algorithm.

Keywords: Automatic clustering, cohesion-and-coupling metric, gene ontology; genetic algorithm, split-and-merge algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1956
339 Road Accidents Bigdata Mining and Visualization Using Support Vector Machines

Authors: Usha Lokala, Srinivas Nowduri, Prabhakar K. Sharma

Abstract:

Useful information has been extracted from the road accident data in United Kingdom (UK), using data analytics method, for avoiding possible accidents in rural and urban areas. This analysis make use of several methodologies such as data integration, support vector machines (SVM), correlation machines and multinomial goodness. The entire datasets have been imported from the traffic department of UK with due permission. The information extracted from these huge datasets forms a basis for several predictions, which in turn avoid unnecessary memory lapses. Since data is expected to grow continuously over a period of time, this work primarily proposes a new framework model which can be trained and adapt itself to new data and make accurate predictions. This work also throws some light on use of SVM’s methodology for text classifiers from the obtained traffic data. Finally, it emphasizes the uniqueness and adaptability of SVMs methodology appropriate for this kind of research work.

Keywords: Road accident, machine learning, support vector machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1129
338 Towards a Deeper Understanding of 21st Century Global Terrorism

Authors: Francis Jegede

Abstract:

This paper examines essential issues relating to the rise and nature of violent extremism involving non-state actors and groups in the early 21st century. The global trends in terrorism and violent extremism are examined in relation to Western governments’ counter terror operations. The paper analyses the existing legal framework for fighting violent extremism and terrorism and highlights the inherent limitations of the current International Law of War in dealing with the growing challenges posed by terrorists and violent extremist groups. The paper discusses how terrorist groups use civilians, women and children as tools and weapon of war to fuel their campaign of terror and suggests ways in which the international community could deal with the challenge of fighting terrorist groups without putting civilians, women and children in harm way. The paper emphasises the need to uphold human rights values and respect for the law of war in our response to global terrorism. The paper poses the question as to whether the current legal framework for dealing with terrorist groups is sufficient without contravening the essential provisions and ethos of the International Law of War and Human Rights. While the paper explains how terrorist groups flagrantly disregard the rule of law and disrespect human rights in their campaign of terror, it also notes instances in which the current Western strategy in fighting terrorism may be viewed or considered as conflicting with human rights and international law.

Keywords: Terrorism, law of war, international law, violent extremism.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2293
337 Time Series Regression with Meta-Clusters

Authors: Monika Chuchro

Abstract:

This paper presents a preliminary attempt to apply classification of time series using meta-clusters in order to improve the quality of regression models. In this case, clustering was performed as a method to obtain subgroups of time series data with normal distribution from the inflow into wastewater treatment plant data, composed of several groups differing by mean value. Two simple algorithms, K-mean and EM, were chosen as a clustering method. The Rand index was used to measure the similarity. After simple meta-clustering, a regression model was performed for each subgroups. The final model was a sum of the subgroups models. The quality of the obtained model was compared with the regression model made using the same explanatory variables, but with no clustering of data. Results were compared using determination coefficient (R2), measure of prediction accuracy- mean absolute percentage error (MAPE) and comparison on a linear chart. Preliminary results allow us to foresee the potential of the presented technique.

Keywords: Clustering, Data analysis, Data mining, Predictive models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1951
336 Speed Sensorless Control with a Linearizationby State Feedback of Asynchronous Machine Using a Model Reference Adaptive System

Authors: A. Larabi, M. S. Boucherit

Abstract:

In this paper, we show that the association of the PI regulators for the speed and stator currents with a control strategy using the linearization by state feedback for an induction machine without speed sensor, and with an adaptation of the rotor resistance. The rotor speed is estimated by using the model reference adaptive system approach (MRAS). This method consists of using two models: The first is the reference model and the second is an adjustable one in which two components of the stator flux, obtained from the measurement of the currents and stator voltages are estimated. The estimated rotor speed is then obtained by canceling the difference between stator-flux of the reference model and those of the adjustable one. Satisfactory results of simulation are obtained and discussed in this paper to highlight the proposed approach.

Keywords: Asynchronous actuator, PI Regulator, adaptivemethod with reference model, Vector control.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1116
335 A Study of Growth Factors on Sustainable Manufacturing in Small and Medium-Sized Enterprises: Case Study of Japan Manufacturing

Authors: Tadayuki Kyoutani, Shigeyuki Haruyama, Ken Kaminishi, Zefry Darmawan

Abstract:

Japan’s semiconductor industries have developed greatly in recent years. Many were started from a Small and Medium-sized Enterprises (SMEs) that found at a good circumstance and now become the prosperous industries in the world. Sustainable growth factors that support the creation of spirit value inside the Japanese company were strongly embedded through performance. Those factors were not clearly defined among each company. A series of literature research conducted to explore quantitative text mining about the definition of sustainable growth factors. Sustainable criteria were developed from previous research to verify the definition of the factors. A typical frame work was proposed as a systematical approach to develop sustainable growth factor in a specific company. Result of approach was review in certain period shows that factors influenced in sustainable growth was importance for the company to achieve the goal.

Keywords: SME, manufacture, sustainable, growth factor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 635
334 Customers’ Priority to Implement SSTs Using AHP Analysis

Authors: Mohammad Jafariahangari, Marjan Habibi, Miresmaeil Mirnabibaboli, Mirza Hassan Hosseini

Abstract:

Self-service technologies (SSTs) make an important contribution to the daily life of people nowadays. However, the introduction of SST does not lead to its usage. Thereby, this paper was an attempt on discovery of the most preferred SST in the customers’ point of view. To fulfill this aim, the Analytical Hierarchy Process (AHP) was applied based on Saaty’s questionnaire which was administered to the customers of e-banking services located in Golestan providence, northern Iran. This study used qualitative factors in association with the intention of consumers’ usage of SSTs to rank three SSTs: ATM, mobile banking and internet banking. The results showed that mobile banking get the highest weight in consumers’ point of view. This research can be useful both for managers and service providers and also for customers who intend to use e-banking.

Keywords: Analytical Hierarchy Process, Decision-making, Ebanking, Iran, Self-service technologies.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2211
333 Closed Form Solution to problem of Calcium Diffusion in Cylindrical Shaped Neuron Cell

Authors: Amrita Tripathi, Neeru Adlakha

Abstract:

Calcium [Ca2+] dynamics is studied as a potential form of neuron excitability that can control many irregular processes like metabolism, secretion etc. Ca2+ ion enters presynaptic terminal and increases the synaptic strength and thus triggers the neurotransmitter release. The modeling and analysis of calcium dynamics in neuron cell becomes necessary for deeper understanding of the processes involved. A mathematical model has been developed for cylindrical shaped neuron cell by incorporating physiological parameters like buffer, diffusion coefficient, and association rate. Appropriate initial and boundary conditions have been framed. The closed form solution has been developed in terms of modified Bessel function. A computer program has been developed in MATLAB 7.11 for the whole approach.

Keywords: Laplace Transform, Modified Bessel function, reaction diffusion equation, diffusion coefficient, excess buffer, calcium influx

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1962
332 Methods for Distinction of Cattle Using Supervised Learning

Authors: Radoslav Židek, Veronika Šidlová, Radovan Kasarda, Birgit Fuerst-Waltl

Abstract:

Machine learning represents a set of topics dealing with the creation and evaluation of algorithms that facilitate pattern recognition, classification, and prediction, based on models derived from existing data. The data can present identification patterns which are used to classify into groups. The result of the analysis is the pattern which can be used for identification of data set without the need to obtain input data used for creation of this pattern. An important requirement in this process is careful data preparation validation of model used and its suitable interpretation. For breeders, it is important to know the origin of animals from the point of the genetic diversity. In case of missing pedigree information, other methods can be used for traceability of animal´s origin. Genetic diversity written in genetic data is holding relatively useful information to identify animals originated from individual countries. We can conclude that the application of data mining for molecular genetic data using supervised learning is an appropriate tool for hypothesis testing and identifying an individual.

Keywords: Genetic data, Pinzgau cattle, supervised learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2318
331 The Impact of System and Data Quality on Organizational Success in the Kingdom of Bahrain

Authors: Amal M. Alrayes

Abstract:

Data and system quality play a central role in organizational success, and the quality of any existing information system has a major influence on the effectiveness of overall system performance. Given the importance of system and data quality to an organization, it is relevant to highlight their importance on organizational performance in the Kingdom of Bahrain. This research aims to discover whether system quality and data quality are related, and to study the impact of system and data quality on organizational success. A theoretical model based on previous research is used to show the relationship between data and system quality, and organizational impact. We hypothesize, first, that system quality is positively associated with organizational impact, secondly that system quality is positively associated with data quality, and finally that data quality is positively associated with organizational impact. A questionnaire was conducted among public and private organizations in the Kingdom of Bahrain. The results show that there is a strong association between data and system quality, that affects organizational success.

Keywords: Data quality, performance, system quality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2118
330 Big Five Traits and Loneliness among Turkish Emerging Adults

Authors: Hasan Atak

Abstract:

Emerging adulthood, between the ages of 18 and 25, as a distinct developmental stage extending from adolescence to young adulthood. The proportions composing the five-factor model are neuroticism, extraversion, openness to experience, agreeableness, and conscientiousness. In the literature, there is any study which includes the relationship between emerging adults loneliness and personality traits. Therefore, the relationship between emerging adults loneliness and personality traits have to be investigated. This study examines the association between the Big Five personality traits, and loneliness among Turkish emerging adults. A total of 220 emerging adults completed the NEO Five Factor Inventory (NEO-FFI), and the The UCLA Loneliness Scale (UCLALS). Correlation analysis showed that three Big Five personality dimensions which are Neuroticism (positively), and Extraversion and Aggreableness (negatively) are moderately correlated with emerging adults loneliness. Regression analysis shows that Extraversion, Aggreableness and Neuroticism are the most important predictors of emerging adults loneliness. Results can be discussed in the context of emerging adulthood theory.

Keywords: Personality, Big Five Traits, Loneliness, Turkish Emerging Adults

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2813
329 Evaluating Hurst Parameters and Fractal Dimensions of Surveyed Dataset of Tailings Dam Embankment

Authors: I. Yakubu, Y. Y. Ziggah, C. Yeboah

Abstract:

In the mining environment, tailings dam embankment is among the hazards and risk areas. The tailings dam embankment could fail and result to damages to facilities, human injuries or even fatalities. Periodic monitoring of the dam embankment is needed to help assess the safety of the tailings dam embankment. Artificial intelligence techniques such as fractals can be used to analyse the stability of the monitored dataset from survey measurement techniques. In this paper, the fractal dimension (D) was determined using D = 2-H. The Hurst parameters (H) of each monitored prism were determined by using a time domain of rescaled range programming in MATLAB software. The fractal dimensions of each monitored prism were determined based on the values of H. The results reveal that the values of the determined H were all within the threshold of 0 ≤ H ≤ 1 m. The smaller the H, the bigger the fractal dimension is. Fractal dimension values ranging from 1.359 x 10-4 m to 1.8843 x 10-3 m were obtained from the monitored prisms on the based on the tailing dam embankment dataset used. The ranges of values obtained indicate that the tailings dam embankment is stable.

Keywords: Hurst parameter, fractal dimension, tailings dam embankment, surveyed dataset.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 759
328 Confidence Intervals for the Difference of Two Normal Population Variances

Authors: Suparat Niwitpong

Abstract:

Motivated by the recent work of Herbert, Hayen, Macaskill and Walter [Interval estimation for the difference of two independent variances. Communications in Statistics, Simulation and Computation, 40: 744-758, 2011.], we investigate, in this paper, new confidence intervals for the difference between two normal population variances based on the generalized confidence interval of Weerahandi [Generalized Confidence Intervals. Journal of the American Statistical Association, 88(423): 899-905, 1993.] and the closed form method of variance estimation of Zou, Huo and Taleban [Simple confidence intervals for lognormal means and their differences with environmental applications. Environmetrics 20: 172-180, 2009]. Monte Carlo simulation results indicate that our proposed confidence intervals give a better coverage probability than that of the existing confidence interval. Also two new confidence intervals perform similarly based on their coverage probabilities and their average length widths.

Keywords: Confidence interval, generalized confidence interval, the closed form method of variance estimation, variance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2778
327 The Early Stages of the Standardization of Finnish Building Sector

Authors: A. Soikkeli

Abstract:

Early 20th century functionalism aimed at generalising living and rationalising construction, thus laying the foundation for the standardisation of construction components and products. From the 1930s onwards, all measurement and quality instructions for building products, different types of building components, descriptions of working methods complying with advisable building practises, planning, measurement and calculation guidelines, terminology, etc. were called standards. Standardisation was regarded as a necessary prerequisite for the mass production of housing.

This article examines the early stages of standardisation in Finland in the 1940s and 1950s, as reflected on the working history of an individual architect, ErkkiKoiso-Kanttila (1914-2006). In 1950 Koiso-Kanttila was appointed the Head of Design of the Finnish Association of Architects’ Building Standards Committee, a position which he held until 1958. His main responsibilities were the development of the RT Building Information File and compiling of the files.

Keywords: Architecture, Post WWII period, Reconstruction, Standardisation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1556
326 Comparative Analysis of Farm Enterprises Performance in Two Agro-Ecological Feuding Zone of Nigeria

Authors: Bolarinwa K.K., Oyeyinka R.A

Abstract:

The two agro-ecological zones became the focus of the study because of violent nature of the incessant conflict in the zones. The available register of farmers association was the sampling frame work where ten percent (61) farmers per state were randomly sampled. Data were collected and analysed using z-test. The research findings revealed tree crops and grains production enterprises ranked higher in Osun (rain fed zones) and Taraba states (savannah zones) respectively. Osun state entrepreneur felt the effect of the conflict on their enterprises more than Tarba state. The reasons adduced for severity of the conflict on enterprises are majority (77.0%) migrated and (75.5%) of them were not allowed to enter their farms during and when conflict deescalated unlike situation in Taraba state. The different in enterprises production level between the two agroecological zone was statistically significant at p<0.05. The conflict had severe impact on farm enterprises.

Keywords: Conflict, severity, entrepreneurs, farm enterprises and production level.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2040
325 Identification of Conserved Domains and Motifs for GRF Gene Family

Authors: Jafar Ahmadi, Nafiseh Noormohammadi, Sedigheh Fabriki Ourang

Abstract:

GRF, Growth regulating factor, genes encode a novel class of plant-specific transcription factors. The GRF proteins play a role in the regulation of cell numbers in young and growing tissues and may act as transcription activations in growth and development of plants. Identification of GRF genes and their expression are important in plants to performance of the growth and development of various organs. In this study, to better understanding the structural and functional differences of GRFs family, 45 GRF proteins sequences in A. thaliana, Z. mays, O. sativa, B. napus, B. rapa, H. vulgare and S. bicolor, have been collected and analyzed through bioinformatics data mining. As a result, in secondary structure of GRFs, the number of alpha helices was more than beta sheets and in all of them QLQ domains were completely in the biggest alpha helix. In all GRFs, QLQ and WRC domains were completely protected except in AtGRF9. These proteins have no trans-membrane domain and due to have nuclear localization signals act in nuclear and they are component of unstable proteins in the test tube.

Keywords: Domain, Gene Family, GRF, Motif.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2330
324 Identification of PIP Aquaporin Genes from Wheat

Authors: Sh. A. Yousif, M. Bhave

Abstract:

There is strong evidence that water channel proteins 'aquaporins (AQPs)' are central components in plant-water relations as well as a number of other physiological parameters. We had previously reported the isolation of 24 plasma membrane intrinsic protein (PIP) type AQPs. However, the gene numbers in rice and the polyploid nature of bread wheat indicated a high probability of further genes in the latter. The present work focused on identification of further AQP isoforms in bread wheat. With the use of altered primer design, we identified five genes homologous, designated PIP1;5b, PIP2;9b, TaPIP2;2, TaPIP2;2a, TaPIP2;2b. Sequence alignments indicate PIP1;5b, PIP2;9b are likely to be homeologues of two previously reported genes while the other three are new genes and could be homeologs of each other. The results indicate further AQP diversity in wheat and the sequence data will enable physical mapping of these genes to identify their genomes as well as genetic to determine their association with any quantitative trait loci (QTLs) associated with plant-water relation such as salinity or drought tolerance.

Keywords: Aquaporins, homeologues, PIP, wheat

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2037
323 Performing Diagnosis in Building with Partially Valid Heterogeneous Tests

Authors: Houda Najeh, Mahendra Pratap Singh, Stéphane Ploix, Antoine Caucheteux, Karim Chabir, Mohamed Naceur Abdelkrim

Abstract:

Building system is highly vulnerable to different kinds of faults and human misbehaviors. Energy efficiency and user comfort are directly targeted due to abnormalities in building operation. The available fault diagnosis tools and methodologies particularly rely on rules or pure model-based approaches. It is assumed that model or rule-based test could be applied to any situation without taking into account actual testing contexts. Contextual tests with validity domain could reduce a lot of the design of detection tests. The main objective of this paper is to consider fault validity when validate the test model considering the non-modeled events such as occupancy, weather conditions, door and window openings and the integration of the knowledge of the expert on the state of the system. The concept of heterogeneous tests is combined with test validity to generate fault diagnoses. A combination of rules, range and model-based tests known as heterogeneous tests are proposed to reduce the modeling complexity. Calculation of logical diagnoses coming from artificial intelligence provides a global explanation consistent with the test result. An application example shows the efficiency of the proposed technique: an office setting at Grenoble Institute of Technology.

Keywords: Heterogeneous tests, validity, building system, sensor grids, sensor fault, diagnosis, fault detection and isolation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 652
322 The Video Database for Teaching and Learning in Football Refereeing

Authors: M. Armenteros, A. Domínguez, M. Fernández, A. J. Benítez

Abstract:

The following paper describes the video database tool used by the Fédération Internationale de Football Association (FIFA) as part of the research project developed in collaboration with the Carlos III University of Madrid. The database project began in 2012, with the aim of creating an educational tool for the training of instructors, referees and assistant referees, and it has been used in all FUTURO III courses since 2013. The platform now contains 3,135 video clips of different match situations from FIFA competitions. It has 1,835 users (FIFA instructors, referees and assistant referees). In this work, the main features of the database are described, such as the use of a search tool and the creation of multimedia presentations and video quizzes. The database has been developed in MySQL, ActionScript, Ruby on Rails and HTML. This tool has been rated by users as "very good" in all courses, which prompt us to introduce it as an ideal tool for any other sport that requires the use of video analysis.

Keywords: Video database, FIFA, refereeing, e-learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1316
321 Photo Mosaic Smartphone Application in Client-Server Based Large-Scale Image Databases

Authors: Sang-Hun Lee, Bum-Soo Kim, Yang-Sae Moon, Jinho Kim

Abstract:

In this paper we present a photo mosaic smartphone application in client-server based large-scale image databases. Photo mosaic is not a new concept, but there are very few smartphone applications especially for a huge number of images in the client-server environment. To support large-scale image databases, we first propose an overall framework working as a client-server model. We then present a concept of image-PAA features to efficiently handle a huge number of images and discuss its lower bounding property. We also present a best-match algorithm that exploits the lower bounding property of image-PAA. We finally implement an efficient Android-based application and demonstrate its feasibility.

Keywords: smartphone applications; photo mosaic; similarity search; data mining; large-scale image databases.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1671
320 New Multisensor Data Fusion Method Based on Probabilistic Grids Representation

Authors: Zhichao Zhao, Yi Liu, Shunping Xiao

Abstract:

A new data fusion method called joint probability density matrix (JPDM) is proposed, which can associate and fuse measurements from spatially distributed heterogeneous sensors to identify the real target in a surveillance region. Using the probabilistic grids representation, we numerically combine the uncertainty regions of all the measurements in a general framework. The NP-hard multisensor data fusion problem has been converted to a peak picking problem in the grids map. Unlike most of the existing data fusion method, the JPDM method dose not need association processing, and will not lead to combinatorial explosion. Its convergence to the CRLB with a diminishing grid size has been proved. Simulation results are presented to illustrate the effectiveness of the proposed technique.

Keywords: Cramer-Rao lower bound (CRLB), data fusion, probabilistic grids, joint probability density matrix, localization, sensor network.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1803
319 On Speeding Up Support Vector Machines: Proximity Graphs Versus Random Sampling for Pre-Selection Condensation

Authors: Xiaohua Liu, Juan F. Beltran, Nishant Mohanchandra, Godfried T. Toussaint

Abstract:

Support vector machines (SVMs) are considered to be the best machine learning algorithms for minimizing the predictive probability of misclassification. However, their drawback is that for large data sets the computation of the optimal decision boundary is a time consuming function of the size of the training set. Hence several methods have been proposed to speed up the SVM algorithm. Here three methods used to speed up the computation of the SVM classifiers are compared experimentally using a musical genre classification problem. The simplest method pre-selects a random sample of the data before the application of the SVM algorithm. Two additional methods use proximity graphs to pre-select data that are near the decision boundary. One uses k-Nearest Neighbor graphs and the other Relative Neighborhood Graphs to accomplish the task.

Keywords: Machine learning, data mining, support vector machines, proximity graphs, relative-neighborhood graphs, k-nearestneighbor graphs, random sampling, training data condensation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1919
318 Belt Conveyor Dynamics in Transient Operation for Speed Control

Authors: D. He, Y. Pang, G. Lodewijks

Abstract:

Belt conveyors play an important role in continuous dry bulk material transport, especially at the mining industry. Speed control is expected to reduce the energy consumption of belt conveyors. Transient operation is the operation of increasing or decreasing conveyor speed for speed control. According to literature review, current research rarely takes the conveyor dynamics in transient operation into account. However, in belt conveyor speed control, the conveyor dynamic behaviors are significantly important since the poor dynamics might result in risks. In this paper, the potential risks in transient operation will be analyzed. An existing finite element model will be applied to build a conveyor model, and simulations will be carried out to analyze the conveyor dynamics. In order to realize the soft speed regulation, Harrison’s sinusoid acceleration profile will be applied, and Lodewijks estimator will be built to approximate the required acceleration time. A long inclined belt conveyor will be studied with two major simulations. The conveyor dynamics will be given.

Keywords: Belt conveyor, speed control, transient operation, dynamics

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2332
317 Yield Prediction Using Support Vectors Based Under-Sampling in Semiconductor Process

Authors: Sae-Rom Pak, Seung Hwan Park, Jeong Ho Cho, Daewoong An, Cheong-Sool Park, Jun Seok Kim, Jun-Geol Baek

Abstract:

It is important to predict yield in semiconductor test process in order to increase yield. In this study, yield prediction means finding out defective die, wafer or lot effectively. Semiconductor test process consists of some test steps and each test includes various test items. In other world, test data has a big and complicated characteristic. It also is disproportionably distributed as the number of data belonging to FAIL class is extremely low. For yield prediction, general data mining techniques have a limitation without any data preprocessing due to eigen properties of test data. Therefore, this study proposes an under-sampling method using support vector machine (SVM) to eliminate an imbalanced characteristic. For evaluating a performance, randomly under-sampling method is compared with the proposed method using actual semiconductor test data. As a result, sampling method using SVM is effective in generating robust model for yield prediction.

Keywords: Yield Prediction, Semiconductor Test Process, Support Vector Machine, Under Sampling

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2398
316 Financing Decision and Productivity Growth for the Venture Capital Industry Using High-Order Fuzzy Time Series

Authors: Shang-En Yu

Abstract:

Human society, there are many uncertainties, such as economic growth rate forecast of the financial crisis, many scholars have, since the the Song Chissom two scholars in 1993 the concept of the so-called fuzzy time series (Fuzzy Time Series)different mode to deal with these problems, a previous study, however, usually does not consider the relevant variables selected and fuzzy process based solely on subjective opinions the fuzzy semantic discrete, so can not objectively reflect the characteristics of the data set, in addition to carrying outforecasts are often fuzzy rules as equally important, failed to consider the importance of each fuzzy rule. For these reasons, the variable selection (Factor Selection) through self-organizing map (Self-Organizing Map, SOM) and proposed high-end weighted multivariate fuzzy time series model based on fuzzy neural network (Fuzzy-BPN), and using the the sequential weighted average operator (Ordered Weighted Averaging operator, OWA) weighted prediction. Therefore, in order to verify the proposed method, the Taiwan stock exchange (Taiwan Stock Exchange Corporation) Taiwan Weighted Stock Index (Taiwan Stock Exchange Capitalization Weighted Stock Index, TAIEX) as experimental forecast target, in order to filter the appropriate variables in the experiment Finally, included in other studies in recent years mode in conjunction with this study, the results showed that the predictive ability of this study further improve.

Keywords: Heterogeneity, residential mortgage loans, foreclosure.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1389
315 Food Quality Labels and their Perception by Consumers in the Czech Republic

Authors: Sarka Velcovska

Abstract:

The paper deals with quality labels used in the food products market, especially with labels of quality, labels of origin, and labels of organic farming. The aim of the paper is to identify perception of these labels by consumers in the Czech Republic. The first part refers to the definition and specification of food quality labels that are relevant in the Czech Republic. The second part includes the discussion of marketing research results. Data were collected with personal questioning method. Empirical findings on 150 respondents are related to consumer awareness and perception of national and European food quality labels used in the Czech Republic, attitudes to purchases of labelled products, and interest in information regarding the labels. Statistical methods, in the concrete Pearson´s chi-square test of independence, coefficient of contingency, and coefficient of association are used to determinate if significant differences do exist among selected demographic categories of Czech consumers.

Keywords: Food quality labels, quality labels awareness, quality labels perception, marketing research.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2327
314 A Comparison and Analysis of Name Matching Algorithms

Authors: Chakkrit Snae

Abstract:

Names are important in many societies, even in technologically oriented ones which use e.g. ID systems to identify individual people. Names such as surnames are the most important as they are used in many processes, such as identifying of people and genealogical research. On the other hand variation of names can be a major problem for the identification and search for people, e.g. web search or security reasons. Name matching presumes a-priori that the recorded name written in one alphabet reflects the phonetic identity of two samples or some transcription error in copying a previously recorded name. We add to this the lode that the two names imply the same person. This paper describes name variations and some basic description of various name matching algorithms developed to overcome name variation and to find reasonable variants of names which can be used to further increasing mismatches for record linkage and name search. The implementation contains algorithms for computing a range of fuzzy matching based on different types of algorithms, e.g. composite and hybrid methods and allowing us to test and measure algorithms for accuracy. NYSIIS, LIG2 and Phonex have been shown to perform well and provided sufficient flexibility to be included in the linkage/matching process for optimising name searching.

Keywords: Data mining, name matching algorithm, nominaldata, searching system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 11090
313 Unsupervised Text Mining Approach to Early Warning System

Authors: Ichihan Tai, Bill Olson, Paul Blessner

Abstract:

Traditional early warning systems that alarm against crisis are generally based on structured or numerical data; therefore, a system that can make predictions based on unstructured textual data, an uncorrelated data source, is a great complement to the traditional early warning systems. The Chicago Board Options Exchange (CBOE) Volatility Index (VIX), commonly referred to as the fear index, measures the cost of insurance against market crash, and spikes in the event of crisis. In this study, news data is consumed for prediction of whether there will be a market-wide crisis by predicting the movement of the fear index, and the historical references to similar events are presented in an unsupervised manner. Topic modeling-based prediction and representation are made based on daily news data between 1990 and 2015 from The Wall Street Journal against VIX index data from CBOE.

Keywords: Early Warning System, Knowledge Management, Topic Modeling, Market Prediction.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1920
312 The Ethio-Eritrea Claims Commission on Use of Force: Issue of Self-Defense or Violation of Sovereignty

Authors: Isaias Teklia Berhe

Abstract:

A decision that deals with international disputes, be it arbitral or judicial, has to properly reflect objectivity and coherence with existing rules of international law. This paper shows the decision of the Ethio-Eritrea Claims Commission on the jus ad bellum case is bereft of objectivity and coherence, which contributed a disservice to international law on many aspects. The Commission’s decision that holds Eritrea in contravention to Art 2(4) of the UN Charter based on Ethiopia’s contention is flawed. It fails to consider: the illegitimacy of an actual authority established over contested territory through hostile acts, the proper determination of effectivites under international law, the sanctity of colonially determined boundaries, Ethiopia’s prior firm political recognition and undergirds to respect colonial boundary, and Ethio-Eritrea Border Commission’s decision. The paper will also argue that the Commission confused Eritrea’s right of self-defense with the rule against the non-use of force to settle territorial disputes; wherefore its decision sanitizes or sterilizes unlawful change of territory resulted through unlawful use of force to the effect of advantaging aggressions. The paper likewise argues that the decision is so sacrilegious that it disregards the ossified legal finality of colonial boundaries. Moreover, its approach toward armed attack does not reflect the peculiarity of the jus ad bellum case rather it brings about definitional uncertainties and sustains the perception that the law on self-defense is unsettled.

Keywords: Armed attack, self-defense, territorial integrity, use of force.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1759