Search results for: pattern classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1940

Search results for: pattern classification

1370 An Analysis of Classification of Imbalanced Datasets by Using Synthetic Minority Over-Sampling Technique

Authors: Ghada A. Alfattni

Abstract:

Analysing unbalanced datasets is one of the challenges that practitioners in machine learning field face. However, many researches have been carried out to determine the effectiveness of the use of the synthetic minority over-sampling technique (SMOTE) to address this issue. The aim of this study was therefore to compare the effectiveness of the SMOTE over different models on unbalanced datasets. Three classification models (Logistic Regression, Support Vector Machine and Nearest Neighbour) were tested with multiple datasets, then the same datasets were oversampled by using SMOTE and applied again to the three models to compare the differences in the performances. Results of experiments show that the highest number of nearest neighbours gives lower values of error rates. 

Keywords: Imbalanced datasets, SMOTE, machine learning, logistic regression, support vector machine, nearest neighbour.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1315
1369 The Diet Adherence in Cardiovascular Disease Risk Factors Patients in the North of Iran Based on the Mediterranean Diet Adherence

Authors: Marjan Mahdavi-Roshan, Arsalan Salari, Mahboobeh Gholipour, Moona Naghshbandi

Abstract:

Background and objectives: Before any nutritional intervention, it is necessary to have the prospect of eating habits of people with cardiovascular risk factors. In this study, we assessed the adherence of healthy diet based on Mediterranean dietary pattern and related factors in adults in the north of Iran. Methods: This study was conducted on 550 men and women with cardiovascular risk factors that referred to Heshmat hospital in Rasht, northern Iran. Information was collected by interview and reading medical history and measuring anthropometric indexes. The Mediterranean Diet Adherence Screener was used for assessing dietary adherence, this screener was modified according to religious beliefs and culture of Iran. Results: The mean age of participants was 58±0.38 years. The mean of body mass index was 27±0.01 kg/m2, and the mean of waist circumference was 98±0.2 cm. The mean of dietary adherence was 5.76±0.07. 45% of participants had low adherence, and just 4% had suitable adherence. The mean of dietary adherence in men was significantly higher than women (p=0. 07). Participants in rural area and high educational participants insignificantly had an unsuitable dietary Adherence. There was no significant association between some cardiovascular disease risk factors and dietary adherence. Conclusion: Education to different group about dietary intake correction and using a Mediterranean dietary pattern that is similar to dietary intake in the north of Iran, for controlling cardiovascular disease is necessary.

Keywords: Dietary adherence, Mediterranean dietary pattern, cardiovascular disease, north of Iran.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 991
1368 Estimation Model of Dry Docking Duration Using Data Mining

Authors: Isti Surjandari, Riara Novita

Abstract:

Maintenance is one of the most important activities in the shipyard industry. However, sometimes it is not supported by adequate services from the shipyard, where inaccuracy in estimating the duration of the ship maintenance is still common. This makes estimation of ship maintenance duration is crucial. This study uses Data Mining approach, i.e., CART (Classification and Regression Tree) to estimate the duration of ship maintenance that is limited to dock works or which is known as dry docking. By using the volume of dock works as an input to estimate the maintenance duration, 4 classes of dry docking duration were obtained with different linear model and job criteria for each class. These linear models can then be used to estimate the duration of dry docking based on job criteria.

Keywords: Classification and regression tree (CART), data mining, dry docking, maintenance duration.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2434
1367 Preliminary Investigation on Combustion Characteristics of Rice Husk in FBC

Authors: W. Permchart, S. Tanatvanit

Abstract:

The experimental results on combustion of rice husk in a conical fluidized bed combustor (referred to as the conical FBC) using silica sand as the bed material are presented in this paper. The effects of excess combustion air and combustor loading as well as the sand bed height on the combustion pattern in FBC were investigated. Temperatures and gas concentrations (CO and NO) along over the combustor height as well as in the flue gas downstream from the ash collecting cyclone were measured. The results showed that the axial temperature profiles in FBC were explicitly affected by the combustor loading whereas the excess air and bed height were found to have minor influences on the temperature pattern. Meanwhile, the combustor loading and the excess air significantly affected the axial CO and NO concentration profiles; however, these profiles were almost independent of the bed height. The combustion and thermal efficiencies for this FBC were quantified for different operating conditions.

Keywords: Temperature, Combustor loading, Excess air, Bed height.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1633
1366 Liver Tumor Detection by Classification through FD Enhancement of CT Image

Authors: N. Ghatwary, A. Ahmed, H. Jalab

Abstract:

In this paper, an approach for the liver tumor detection in computed tomography (CT) images is represented. The detection process is based on classifying the features of target liver cell to either tumor or non-tumor. Fractional differential (FD) is applied for enhancement of Liver CT images, with the aim of enhancing texture and edge features. Later on, a fusion method is applied to merge between the various enhanced images and produce a variety of feature improvement, which will increase the accuracy of classification. Each image is divided into NxN non-overlapping blocks, to extract the desired features. Support vector machines (SVM) classifier is trained later on a supplied dataset different from the tested one. Finally, the block cells are identified whether they are classified as tumor or not. Our approach is validated on a group of patients’ CT liver tumor datasets. The experiment results demonstrated the efficiency of detection in the proposed technique.

Keywords: Fractional differential (FD), Computed Tomography (CT), fusion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1684
1365 Clustering Multivariate Empiric Characteristic Functions for Multi-Class SVM Classification

Authors: María-Dolores Cubiles-de-la-Vega, Rafael Pino-Mejías, Esther-Lydia Silva-Ramírez

Abstract:

A dissimilarity measure between the empiric characteristic functions of the subsamples associated to the different classes in a multivariate data set is proposed. This measure can be efficiently computed, and it depends on all the cases of each class. It may be used to find groups of similar classes, which could be joined for further analysis, or it could be employed to perform an agglomerative hierarchical cluster analysis of the set of classes. The final tree can serve to build a family of binary classification models, offering an alternative approach to the multi-class SVM problem. We have tested this dendrogram based SVM approach with the oneagainst- one SVM approach over four publicly available data sets, three of them being microarray data. Both performances have been found equivalent, but the first solution requires a smaller number of binary SVM models.

Keywords: Cluster Analysis, Empiric Characteristic Function, Multi-class SVM, R.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1881
1364 Scots Pine Needles as Bioindicators in Determining the Aerial Distribution Pattern of Sulphur Emissions around Industrial Plants

Authors: Risto Pöykiö, Jari Hietala, Hannu Nurmesniemi

Abstract:

In this study, the Scots pine (Pinus sylvestris L.) C needles (i.e. the current-year-needles) were used as bioindicators in determining the aerial distribution pattern of sulphur emissions around industrial point sources at Kemi, Northern Finland. The average sulphur concentration in the C needles was 897 mg/kg (d.w.), with a standard deviation of 118 mg/kg (d.w.) and range 740 – 1350 mg/kg (d.w.). According to results in this study, Scots pine needles (Pinus sylvestris L.) appear to be an ideal bioindicators for identifying atmospheric sulphur pollution derived from industrial plants and can complement the information provided by plant mapping studies around industrial plants.

Keywords: Emission, Sulphur, Scots Pine, Pinus sylvestris

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1748
1363 Customer Knowledge and Service Development, the Web 2.0 Role in Co-production

Authors: Roberto Boselli, Mirko Cesarini, Mario Mezzanzanica

Abstract:

The paper is concerned with relationships between SSME and ICTs and focuses on the role of Web 2.0 tools in the service development process. The research presented aims at exploring how collaborative technologies can support and improve service processes, highlighting customer centrality and value coproduction. The core idea of the paper is the centrality of user participation and the collaborative technologies as enabling factors; Wikipedia is analyzed as an example. The result of such analysis is the identification and description of a pattern characterising specific services in which users collaborate by means of web tools with value co-producers during the service process. The pattern of collaborative co-production concerning several categories of services including knowledge based services is then discussed.

Keywords: Service Interaction Patterns, Services Science, Web2.0 tools, Service Development Process.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1728
1362 Classification of Business Models of Italian Bancassurance by Balance Sheet Indicators

Authors: Andrea Bellucci, Martina Tofi

Abstract:

The aim of paper is to analyze business models of bancassurance in Italy for life business. The life insurance business is very developed in the Italian market and banks branches have 80% of the market share. Given its maturity, the life insurance market needs to consolidate its organizational form to allow for the development of non-life business, which nowadays collects few premiums but represents a great opportunity to enlarge the market share of bancassurance using its strength in the distribution channel while the market share of independent agents is decreasing. Starting with the main business model of bancassurance for life business, this paper will analyze the performances of life companies in the Italian market by balance sheet indicators and by main discriminant variables of business models. The study will observe trends from 2013 to 2015 for the Italian market by exploiting a database managed by Associazione Nazionale delle Imprese di Assicurazione (ANIA). The applied approach is based on a bottom-up analysis starting with variables and indicators to define business models’ classification. The statistical classification algorithm proposed by Ward is employed to design business models’ profiles. Results from the analysis will be a representation of the main business models built by their profile related to indicators. In that way, an unsupervised analysis is developed that has the limit of its judgmental dimension based on research opinion, but it is possible to obtain a design of effective business models.

Keywords: Balance sheet indicators, Bancassurance, business models, ward algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1263
1361 Target Signal Detection Using MUSIC Spectrum in Noise Environment

Authors: Sangjun Park, Sangbae Jeong, Moonsung Han, Minsoo hahn

Abstract:

In this paper, a target signal detection method using multiple signal classification (MUSIC) algorithm is proposed. The MUSIC algorithm is a subspace-based direction of arrival (DOA) estimation method. The algorithm detects the DOAs of multiple sources using the inverse of the eigenvalue-weighted eigen spectra. To apply the algorithm to target signal detection for GSC-based beamforming, we utilize its spectral response for the target DOA in noisy conditions. For evaluation of the algorithm, the performance of the proposed target signal detection method is compared with that of the normalized cross-correlation (NCC), the fixed beamforming, and the power ratio method. Experimental results show that the proposed algorithm significantly outperforms the conventional ones in receiver operating characteristics(ROC) curves.

Keywords: Beamforming, direction of arrival, multiple signal classification, target signal detection.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2542
1360 Hybrid Neural Network Methods for Lithology Identification in the Algerian Sahara

Authors: S. Chikhi, M. Batouche, H. Shout

Abstract:

In this paper, we combine a probabilistic neural method with radial-bias functions in order to construct the lithofacies of the wells DF01, DF02 and DF03 situated in the Triassic province of Algeria (Sahara). Lithofacies is a crucial problem in reservoir characterization. Our objective is to facilitate the experts' work in geological domain and to allow them to obtain quickly the structure and the nature of lands around the drilling. This study intends to design a tool that helps automatic deduction from numerical data. We used a probabilistic formalism to enhance the classification process initiated by a Self-Organized Map procedure. Our system gives lithofacies, from well-log data, of the concerned reservoir wells in an aspect easy to read by a geology expert who identifies the potential for oil production at a given source and so forms the basis for estimating the financial returns and economic benefits.

Keywords: Classification, Lithofacies, Probabilistic formalism, Reservoir characterization, Well-log data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1898
1359 Classification Based on Deep Neural Cellular Automata Model

Authors: Yasser F. Hassan

Abstract:

Deep learning structure is a branch of machine learning science and greet achievement in research and applications. Cellular neural networks are regarded as array of nonlinear analog processors called cells connected in a way allowing parallel computations. The paper discusses how to use deep learning structure for representing neural cellular automata model. The proposed learning technique in cellular automata model will be examined from structure of deep learning. A deep automata neural cellular system modifies each neuron based on the behavior of the individual and its decision as a result of multi-level deep structure learning. The paper will present the architecture of the model and the results of simulation of approach are given. Results from the implementation enrich deep neural cellular automata system and shed a light on concept formulation of the model and the learning in it.

Keywords: Cellular automata, neural cellular automata, deep learning, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 867
1358 Rank-Based Chain-Mode Ensemble for Binary Classification

Authors: Chongya Song, Kang Yen, Alexander Pons, Jin Liu

Abstract:

In the field of machine learning, the ensemble has been employed as a common methodology to improve the performance upon multiple base classifiers. However, the true predictions are often canceled out by the false ones during consensus due to a phenomenon called “curse of correlation” which is represented as the strong interferences among the predictions produced by the base classifiers. In addition, the existing practices are still not able to effectively mitigate the problem of imbalanced classification. Based on the analysis on our experiment results, we conclude that the two problems are caused by some inherent deficiencies in the approach of consensus. Therefore, we create an enhanced ensemble algorithm which adopts a designed rank-based chain-mode consensus to overcome the two problems. In order to evaluate the proposed ensemble algorithm, we employ a well-known benchmark data set NSL-KDD (the improved version of dataset KDDCup99 produced by University of New Brunswick) to make comparisons between the proposed and 8 common ensemble algorithms. Particularly, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in terms of the improvements toward the accuracy and reliability upon the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus is proved to be a more effective ensemble solution than the traditional consensus approach, which outperforms the 8 ensemble algorithms by 20% on almost all compared metrices which include accuracy, precision, recall, F1-score and area under receiver operating characteristic curve.

Keywords: Consensus, curse of correlation, imbalanced classification, rank-based chain-mode ensemble.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 735
1357 An Optimal Feature Subset Selection for Leaf Analysis

Authors: N. Valliammal, S.N. Geethalakshmi

Abstract:

This paper describes an optimal approach for feature subset selection to classify the leaves based on Genetic Algorithm (GA) and Kernel Based Principle Component Analysis (KPCA). Due to high complexity in the selection of the optimal features, the classification has become a critical task to analyse the leaf image data. Initially the shape, texture and colour features are extracted from the leaf images. These extracted features are optimized through the separate functioning of GA and KPCA. This approach performs an intersection operation over the subsets obtained from the optimization process. Finally, the most common matching subset is forwarded to train the Support Vector Machine (SVM). Our experimental results successfully prove that the application of GA and KPCA for feature subset selection using SVM as a classifier is computationally effective and improves the accuracy of the classifier.

Keywords: Optimization, Feature extraction, Feature subset, Classification, GA, KPCA, SVM and Computation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2244
1356 Monitoring CO2 and H2S Emission in Live Austrian and UK Concrete Sewer Pipes

Authors: Anna Romanova, Morteza A. Alani

Abstract:

Corrosion of concrete sewer pipes induced by sulfuric acid is an acknowledged problem and a ticking time-bomb to sewer operators. Whilst the chemical reaction of the corrosion process is well-understood, the indirect roles of other parameters in the corrosion process which are found in sewer environment are not highly reflected on. This paper reports on a field studies undertaken in Austria and United Kingdom, where the parameters of temperature, pH, H2S and CO2 were monitored over a period of time. The study establishes that (i) effluent temperature and pH have similar daily pattern and peak times, when examined in minutes scale; (ii) H2S and CO2 have an identical hourly pattern; (iii) H2S instant or shifted relation to effluent temperature is governed by the root mean square value of CO2.

Keywords: Concrete corrosion, carbon dioxide, hydrogen sulphide, sewer pipe, sulfuric acid.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2123
1355 Multidimensional Data Mining by Means of Randomly Travelling Hyper-Ellipsoids

Authors: Pavel Y. Tabakov, Kevin Duffy

Abstract:

The present study presents a new approach to automatic data clustering and classification problems in large and complex databases and, at the same time, derives specific types of explicit rules describing each cluster. The method works well in both sparse and dense multidimensional data spaces. The members of the data space can be of the same nature or represent different classes. A number of N-dimensional ellipsoids are used for enclosing the data clouds. Due to the geometry of an ellipsoid and its free rotation in space the detection of clusters becomes very efficient. The method is based on genetic algorithms that are used for the optimization of location, orientation and geometric characteristics of the hyper-ellipsoids. The proposed approach can serve as a basis for the development of general knowledge systems for discovering hidden knowledge and unexpected patterns and rules in various large databases.

Keywords: Classification, clustering, data minig, genetic algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1774
1354 Segmentation of Korean Words on Korean Road Signs

Authors: Lae-Jeong Park, Kyusoo Chung, Jungho Moon

Abstract:

This paper introduces an effective method of segmenting Korean text (place names in Korean) from a Korean road sign image. A Korean advanced directional road sign is composed of several types of visual information such as arrows, place names in Korean and English, and route numbers. Automatic classification of the visual information and extraction of Korean place names from the road sign images make it possible to avoid a lot of manual inputs to a database system for management of road signs nationwide. We propose a series of problem-specific heuristics that correctly segments Korean place names, which is the most crucial information, from the other information by leaving out non-text information effectively. The experimental results with a dataset of 368 road sign images show 96% of the detection rate per Korean place name and 84% per road sign image.

Keywords: Segmentation, road signs, characters, classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2753
1353 Neuron Efficiency in Fluid Dynamics and Prediction of Groundwater Reservoirs'' Properties Using Pattern Recognition

Authors: J. K. Adedeji, S. T. Ijatuyi

Abstract:

The application of neural network using pattern recognition to study the fluid dynamics and predict the groundwater reservoirs properties has been used in this research. The essential of geophysical survey using the manual methods has failed in basement environment, hence the need for an intelligent computing such as predicted from neural network is inevitable. A non-linear neural network with an XOR (exclusive OR) output of 8-bits configuration has been used in this research to predict the nature of groundwater reservoirs and fluid dynamics of a typical basement crystalline rock. The control variables are the apparent resistivity of weathered layer (p1), fractured layer (p2), and the depth (h), while the dependent variable is the flow parameter (F=λ). The algorithm that was used in training the neural network is the back-propagation coded in C++ language with 300 epoch runs. The neural network was very intelligent to map out the flow channels and detect how they behave to form viable storage within the strata. The neural network model showed that an important variable gr (gravitational resistance) can be deduced from the elevation and apparent resistivity pa. The model results from SPSS showed that the coefficients, a, b and c are statistically significant with reduced standard error at 5%.

Keywords: Neural network, gravitational resistance, pattern recognition, non-linear.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 802
1352 Emotion Recognition Using Neural Network: A Comparative Study

Authors: Nermine Ahmed Hendy, Hania Farag

Abstract:

Emotion recognition is an important research field that finds lots of applications nowadays. This work emphasizes on recognizing different emotions from speech signal. The extracted features are related to statistics of pitch, formants, and energy contours, as well as spectral, perceptual and temporal features, jitter, and shimmer. The Artificial Neural Networks (ANN) was chosen as the classifier. Working on finding a robust and fast ANN classifier suitable for different real life application is our concern. Several experiments were carried out on different ANN to investigate the different factors that impact the classification success rate. Using a database containing 7 different emotions, it will be shown that with a proper and careful adjustment of features format, training data sorting, number of features selected and even the ANN type and architecture used, a success rate of 85% or even more can be achieved without increasing the system complicity and the computation time

Keywords: Classification, emotion recognition, features extraction, feature selection, neural network

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4701
1351 A Study on the Application of Machine Learning and Deep Learning Techniques for Skin Cancer Detection

Authors: Hritwik Ghosh, Irfan Sadiq Rahat, Sachi Nandan Mohanty, J. V. R. Ravindra, Abdus Sobur

Abstract:

In the rapidly evolving landscape of medical diagnostics, the early detection and accurate classification of skin cancer remain paramount for effective treatment outcomes. This research delves into the transformative potential of artificial intelligence (AI), specifically deep learning (DL), as a tool for discerning and categorizing various skin conditions. Utilizing a diverse dataset of 3,000 images, representing nine distinct skin conditions, we confront the inherent challenge of class imbalance. This imbalance, where conditions like melanomas are over-represented, is addressed by incorporating class weights during the model training phase, ensuring an equitable representation of all conditions in the learning process. Our approach presents a hybrid model, amalgamating the strengths of two renowned convolutional neural networks (CNNs), VGG16 and ResNet50. These networks, pre-trained on the ImageNet dataset, are adept at extracting intricate features from images. By synergizing these models, our research aims to capture a holistic set of features, thereby bolstering classification performance. Preliminary findings underscore the hybrid model's superiority over individual models, showcasing its prowess in feature extraction and classification. Moreover, the research emphasizes the significance of rigorous data pre-processing, including image resizing, color normalization, and segmentation, in ensuring data quality and model reliability. In essence, this study illuminates the promising role of AI and DL in revolutionizing skin cancer diagnostics, offering insights into its potential applications in broader medical domains.

Keywords: Artificial intelligence, machine learning, deep learning, skin cancer, dermatology, convolutional neural networks, image classification, computer vision, healthcare technology, cancer detection, medical imaging.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1522
1350 Attention Multiple Instance Learning for Cancer Tissue Classification in Digital Histopathology Images

Authors: Afaf Alharbi, Qianni Zhang

Abstract:

The identification of malignant tissue in histopathological slides holds significant importance in both clinical settings and pathology research. This paper presents a methodology aimed at automatically categorizing cancerous tissue through the utilization of a multiple instance learning framework. This framework is specifically developed to acquire knowledge of the Bernoulli distribution of the bag label probability by employing neural networks. Furthermore, we put forward a neural network-based permutation-invariant aggregation operator, equivalent to attention mechanisms, which is applied to the multi-instance learning network. Through empirical evaluation on an openly available colon cancer histopathology dataset, we provide evidence that our approach surpasses various conventional deep learning methods.

Keywords: Attention Multiple Instance Learning, Multiple Instance Learning, transfer learning, histopathological slides, cancer tissue classification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 234
1349 Image Rotation Using an Augmented 2-Step Shear Transform

Authors: Hee-Choul Kwon, Heeyong Kwon

Abstract:

Image rotation is one of main pre-processing steps for image processing or image pattern recognition. It is implemented with a rotation matrix multiplication. It requires a lot of floating point arithmetic operations and trigonometric calculations, so it takes a long time to execute. Therefore, there has been a need for a high speed image rotation algorithm without two major time-consuming operations. However, the rotated image has a drawback, i.e. distortions. We solved the problem using an augmented two-step shear transform. We compare the presented algorithm with the conventional rotation with images of various sizes. Experimental results show that the presented algorithm is superior to the conventional rotation one.

Keywords: High speed rotation operation, image rotation, transform matrix, image processing, pattern recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2654
1348 Distributed Splay Suffix Arrays: A New Structure for Distributed String Search

Authors: Tu Kun, Gu Nai-jie, Bi Kun, Liu Gang, Dong Wan-li

Abstract:

As a structure for processing string problem, suffix array is certainly widely-known and extensively-studied. But if the string access pattern follows the “90/10" rule, suffix array can not take advantage of the fact that we often find something that we have just found. Although the splay tree is an efficient data structure for small documents when the access pattern follows the “90/10" rule, it requires many structures and an excessive amount of pointer manipulations for efficiently processing and searching large documents. In this paper, we propose a new and conceptually powerful data structure, called splay suffix arrays (SSA), for string search. This data structure combines the features of splay tree and suffix arrays into a new approach which is suitable to implementation on both conventional and clustered computers.

Keywords: suffix arrays, splay tree, string search, distributedalgorithm

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1778
1347 MIM: A Species Independent Approach for Classifying Coding and Non-Coding DNA Sequences in Bacterial and Archaeal Genomes

Authors: Achraf El Allali, John R. Rose

Abstract:

A number of competing methodologies have been developed to identify genes and classify DNA sequences into coding and non-coding sequences. This classification process is fundamental in gene finding and gene annotation tools and is one of the most challenging tasks in bioinformatics and computational biology. An information theory measure based on mutual information has shown good accuracy in classifying DNA sequences into coding and noncoding. In this paper we describe a species independent iterative approach that distinguishes coding from non-coding sequences using the mutual information measure (MIM). A set of sixty prokaryotes is used to extract universal training data. To facilitate comparisons with the published results of other researchers, a test set of 51 bacterial and archaeal genomes was used to evaluate MIM. These results demonstrate that MIM produces superior results while remaining species independent.

Keywords: Coding Non-coding Classification, Entropy, GeneRecognition, Mutual Information.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1728
1346 Study of the Effect of Project Management on Manufacturing and Production Projects

Authors: S.B. Ahmadi, Z. Moradpour, Gh. Liaghat

Abstract:

In this article the accumulated results out of the effects and length of the manufacture and production projects in the university and research standard have been settled with the usefulness definition of the process of project management for the accessibility to the proportional pattern in the “time and action" stages. Studies show that many problems confronted by the researchers in these projects are connected to the non-profiting of: 1) autonomous timing for gathering the educational theme, 2) autonomous timing for planning and pattern, presenting before the construction, and 3) autonomous timing for manufacture and sample presentation from the output. The result of this study indicates the division of every manufacture and production projects into three smaller autonomous projects from its kind, budget and autonomous expenditure, shape and order of the stages for the management of these kinds of projects. In this case study real result are compared with theoretical results.

Keywords: Project management, Manufacturing, production.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1676
1345 The Key Role of the Steroidal Hormones in the Pattern Distribution of the Epiphyseal Structure in Rabbit

Authors: Fatahian Dehkordi R.F, Parchami A.

Abstract:

Steroidal hormones with the efficient changes on the epiphyseal growth plate may influence tissue structure properties. Presents paper to investigate the effects of gonadectomy in the pattern distribution of the epiphyseal structure. Fifteen adult female New Zealand white rabbits were separated into three groups. One group was intact and others groups were selected for surgical operation. From these two groups, one group carried out steroidal administration. The results obtained showed that there is no statistically difference in the mean diameter of the growth plate cells between all three groups. The maximum value of the cartilage cells were allocated to the gonadectomized group and the minimum number were observed in Hormonal induced group significantly. Growth plate height was significantly greater in gonadectomized group than in two other groups.

Keywords: Steroidal hormones, Ovariectomy, Rabbit, Epiphyseal structure

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1281
1344 The Incidence of Obesity among Adult Women in Pekanbaru City, Indonesia, Related to High Fat Consumption, Stress Level, and Physical Activity

Authors: Yudia Mailani Putri, Martalena Purba, B. J. Istiti Kandarina

Abstract:

Background: Obesity has been recognized as a global health problem. Individuals classified as overweight and obese are increasing at an alarming rate. This condition is associated with psychological and physiological problems. as a person reaches adulthood, somatic growth ceases. At this stage, the human body has developed fully, to a stable state. As the capital of Riau Province in Indonesia, Pekanbaru is dominated by Malay ethnic population habitually consuming cholesterol-rich fatty foods as a daily menu, a trigger to the onset of obesity resulting in high prevalence of degenerative diseases. Research objectives: The aim of this study is elaborating the relationship between high-fat consumption pattern, stress level, physical activity and the incidence of obesity in adult women in Pekanbaru city. Research Methods: Among the combined research methods applied in this study, the first stage is quantitative observational, analytical cross-sectional research design with adult women aged 20-40 living in Pekanbaru city. The sample consists of 200 women with BMI≥25. Sample data is processed with univariate, bivariate (correlation and simple linear regression) and multivariate (multiple linear regression) analysis. The second phase is qualitative descriptive study purposive sampling by in-depth interviews. six participants withdrew from the study. Results: According to the results of the bivariate analysis, there are relationships between the incidence of obesity and the pattern of high fat foods consumption (energy intake (p≤0.000; r = 0.536), protein intake (p≤0.000; r=0.307), fat intake (p≤0.000; r=0.416), carbohydrate intake (p≤0.000; r=0.430), frequency of fatty food consumption (p≤0.000; r=0.506) and frequency of viscera foods consumption (p≤0.000; r=0.535). There is a relationship between physical activity and incidence of obesity (p≤0.000; r=-0.631). However, there is no relationship between the level of stress (p=0.741; r=0.019-) and the incidence of obesity. Physical activity is a predominant factor in the incidence of obesity in adult women in Pekanbaru city. Conclusion: There are relationships between high-fat food consumption pattern, physical activity and the incidence of obesity in Pekanbaru city whereas physical activity is a predominant factor in the occurrence of obesity, supported by the unchangeable pattern of high-fat foods consumption.

Keywords: Obesity, adult, high in fat, stress, physical activity, consumption pattern.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 820
1343 Probabilistic Crash Prediction and Prevention of Vehicle Crash

Authors: Lavanya Annadi, Fahimeh Jafari

Abstract:

Transportation brings immense benefits to society, but it also has its costs. Costs include the cost of infrastructure, personnel, and equipment, but also the loss of life and property in traffic accidents on the road, delays in travel due to traffic congestion, and various indirect costs in terms of air transport. This research aims to predict the probabilistic crash prediction of vehicles using Machine Learning due to natural and structural reasons by excluding spontaneous reasons, like overspeeding, etc., in the United States. These factors range from meteorological elements such as weather conditions, precipitation, visibility, wind speed, wind direction, temperature, pressure, and humidity, to human-made structures, like road structure components such as Bumps, Roundabouts, No Exit, Turning Loops, Give Away, etc. The probabilities are categorized into ten distinct classes. All the predictions are based on multiclass classification techniques, which are supervised learning. This study considers all crashes in all states collected by the US government. The probability of the crash was determined by employing Multinomial Expected Value, and a classification label was assigned accordingly. We applied three classification models, including multiclass Logistic Regression, Random Forest and XGBoost. The numerical results show that XGBoost achieved a 75.2% accuracy rate which indicates the part that is being played by natural and structural reasons for the crash. The paper has provided in-depth insights through exploratory data analysis.

Keywords: Road safety, crash prediction, exploratory analysis, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 88
1342 A Neural Network Classifier for Estimation of the Degree of Infestation by Late Blight on Tomato Leaves

Authors: Gizelle K. Vianna, Gabriel V. Cunha, Gustavo S. Oliveira

Abstract:

Foliage diseases in plants can cause a reduction in both quality and quantity of agricultural production. Intelligent detection of plant diseases is an essential research topic as it may help monitoring large fields of crops by automatically detecting the symptoms of foliage diseases. This work investigates ways to recognize the late blight disease from the analysis of tomato digital images, collected directly from the field. A pair of multilayer perceptron neural network analyzes the digital images, using data from both RGB and HSL color models, and classifies each image pixel. One neural network is responsible for the identification of healthy regions of the tomato leaf, while the other identifies the injured regions. The outputs of both networks are combined to generate the final classification of each pixel from the image and the pixel classes are used to repaint the original tomato images by using a color representation that highlights the injuries on the plant. The new images will have only green, red or black pixels, if they came from healthy or injured portions of the leaf, or from the background of the image, respectively. The system presented an accuracy of 97% in detection and estimation of the level of damage on the tomato leaves caused by late blight.

Keywords: Artificial neural networks, digital image processing, pattern recognition.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2554
1341 Investigating Activity Recognition Using 9-Axis Sensors and Filters in Wearable Devices

Authors: Jun Gil Ahn, Jong Kang Park, Jong Tae Kim

Abstract:

In this paper, we analyze major components of activity recognition (AR) in wearable device with 9-axis sensors and sensor fusion filters. 9-axis sensors commonly include 3-axis accelerometer, 3-axis gyroscope and 3-axis magnetometer. We chose sensor fusion filters as Kalman filter and Direction Cosine Matrix (DCM) filter. We also construct sensor fusion data from each activity sensor data and perform classification by accuracy of AR using Naïve Bayes and SVM. According to the classification results, we observed that the DCM filter and the specific combination of the sensing axes are more effective for AR in wearable devices while classifying walking, running, ascending and descending.

Keywords: Accelerometer, activity recognition, directional cosine matrix filter, gyroscope, Kalman filter, magnetometer.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1676