Search results for: Sparse datasets.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 326

Search results for: Sparse datasets.

86 Artificial Neural Network-Based Short-Term Load Forecasting for Mymensingh Area of Bangladesh

Authors: S. M. Anowarul Haque, Md. Asiful Islam

Abstract:

Electrical load forecasting is considered to be one of the most indispensable parts of a modern-day electrical power system. To ensure a reliable and efficient supply of electric energy, special emphasis should have been put on the predictive feature of electricity supply. Artificial Neural Network-based approaches have emerged to be a significant area of interest for electric load forecasting research. This paper proposed an Artificial Neural Network model based on the particle swarm optimization algorithm for improved electric load forecasting for Mymensingh, Bangladesh. The forecasting model is developed and simulated on the MATLAB environment with a large number of training datasets. The model is trained based on eight input parameters including historical load and weather data. The predicted load data are then compared with an available dataset for validation. The proposed neural network model is proved to be more reliable in terms of day-wise load forecasting for Mymensingh, Bangladesh.

Keywords: Load forecasting, artificial neural network, particle swarm optimization.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 635
85 Identifying Factors Contributing to the Spread of Lyme Disease: A Regression Analysis of Virginia’s Data

Authors: Fatemeh Valizadeh Gamchi, Edward L. Boone

Abstract:

This research focuses on Lyme disease, a widespread infectious condition in the United States caused by the bacterium Borrelia burgdorferi sensu stricto. It is critical to identify environmental and economic elements that are contributing to the spread of the disease. This study examined data from Virginia to identify a subset of explanatory variables significant for Lyme disease case numbers. To identify relevant variables and avoid overfitting, linear poisson, and regularization regression methods such as ridge, lasso, and elastic net penalty were employed. Cross-validation was performed to acquire tuning parameters. The methods proposed can automatically identify relevant disease count covariates. The efficacy of the techniques was assessed using four criteria on three simulated datasets. Finally, using the Virginia Department of Health’s Lyme disease dataset, the study successfully identified key factors, and the results were consistent with previous studies.

Keywords: Lyme disease, Poisson generalized linear model, Ridge regression, Lasso Regression, elastic net regression.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 54
84 A Comparison of SVM-based Criteria in Evolutionary Method for Gene Selection and Classification of Microarray Data

Authors: Rameswar Debnath, Haruhisa Takahashi

Abstract:

An evolutionary method whose selection and recombination operations are based on generalization error-bounds of support vector machine (SVM) can select a subset of potentially informative genes for SVM classifier very efficiently [7]. In this paper, we will use the derivative of error-bound (first-order criteria) to select and recombine gene features in the evolutionary process, and compare the performance of the derivative of error-bound with the error-bound itself (zero-order) in the evolutionary process. We also investigate several error-bounds and their derivatives to compare the performance, and find the best criteria for gene selection and classification. We use 7 cancer-related human gene expression datasets to evaluate the performance of the zero-order and first-order criteria of error-bounds. Though both criteria have the same strategy in theoretically, experimental results demonstrate the best criterion for microarray gene expression data.

Keywords: support vector machine, generalization error-bound, feature selection, evolutionary algorithm, microarray data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1496
83 DCBOR: A Density Clustering Based on Outlier Removal

Authors: A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, M. A. Ramadan

Abstract:

Data clustering is an important data exploration technique with many applications in data mining. We present an enhanced version of the well known single link clustering algorithm. We will refer to this algorithm as DCBOR. The proposed algorithm alleviates the chain effect by removing the outliers from the given dataset. So this algorithm provides outlier detection and data clustering simultaneously. This algorithm does not need to update the distance matrix, since the algorithm depends on merging the most k-nearest objects in one step and the cluster continues grow as long as possible under specified condition. So the algorithm consists of two phases; at the first phase, it removes the outliers from the input dataset. At the second phase, it performs the clustering process. This algorithm discovers clusters of different shapes, sizes, densities and requires only one input parameter; this parameter represents a threshold for outlier points. The value of the input parameter is ranging from 0 to 1. The algorithm supports the user in determining an appropriate value for it. We have tested this algorithm on different datasets contain outlier and connecting clusters by chain of density points, and the algorithm discovers the correct clusters. The results of our experiments demonstrate the effectiveness and the efficiency of DCBOR.

Keywords: Data Clustering, Clustering Algorithms, Handling Noise, Arbitrary Shape of Clusters.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1895
82 Convergence Analysis of Training Two-Hidden-Layer Partially Over-Parameterized ReLU Networks via Gradient Descent

Authors: Zhifeng Kong

Abstract:

Over-parameterized neural networks have attracted a great deal of attention in recent deep learning theory research, as they challenge the classic perspective of over-fitting when the model has excessive parameters and have gained empirical success in various settings. While a number of theoretical works have been presented to demystify properties of such models, the convergence properties of such models are still far from being thoroughly understood. In this work, we study the convergence properties of training two-hidden-layer partially over-parameterized fully connected networks with the Rectified Linear Unit activation via gradient descent. To our knowledge, this is the first theoretical work to understand convergence properties of deep over-parameterized networks without the equally-wide-hidden-layer assumption and other unrealistic assumptions. We provide a probabilistic lower bound of the widths of hidden layers and proved linear convergence rate of gradient descent. We also conducted experiments on synthetic and real-world datasets to validate our theory.

Keywords: Over-parameterization, Rectified Linear Units (ReLU), convergence, gradient descent, neural networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 817
81 Cross Project Software Fault Prediction at Design Phase

Authors: Pradeep Singh, Shrish Verma

Abstract:

Software fault prediction models are created by using the source code, processed metrics from the same or previous version of code and related fault data. Some company do not store and keep track of all artifacts which are required for software fault prediction. To construct fault prediction model for such company, the training data from the other projects can be one potential solution. Earlier we predicted the fault the less cost it requires to correct. The training data consists of metrics data and related fault data at function/module level. This paper investigates fault predictions at early stage using the cross-project data focusing on the design metrics. In this study, empirical analysis is carried out to validate design metrics for cross project fault prediction. The machine learning techniques used for evaluation is Naïve Bayes. The design phase metrics of other projects can be used as initial guideline for the projects where no previous fault data is available. We analyze seven datasets from NASA Metrics Data Program which offer design as well as code metrics. Overall, the results of cross project is comparable to the within company data learning.

Keywords: Software Metrics, Fault prediction, Cross project, Within project.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2486
80 Swarmed Discriminant Analysis for Multifunction Prosthesis Control

Authors: Rami N. Khushaba, Ahmed Al-Ani, Adel Al-Jumaily

Abstract:

One of the approaches enabling people with amputated limbs to establish some sort of interface with the real world includes the utilization of the myoelectric signal (MES) from the remaining muscles of those limbs. The MES can be used as a control input to a multifunction prosthetic device. In this control scheme, known as the myoelectric control, a pattern recognition approach is usually utilized to discriminate between the MES signals that belong to different classes of the forearm movements. Since the MES is recorded using multiple channels, the feature vector size can become very large. In order to reduce the computational cost and enhance the generalization capability of the classifier, a dimensionality reduction method is needed to identify an informative yet moderate size feature set. This paper proposes a new fuzzy version of the well known Fisher-s Linear Discriminant Analysis (LDA) feature projection technique. Furthermore, based on the fact that certain muscles might contribute more to the discrimination process, a novel feature weighting scheme is also presented by employing Particle Swarm Optimization (PSO) for estimating the weight of each feature. The new method, called PSOFLDA, is tested on real MES datasets and compared with other techniques to prove its superiority.

Keywords: Discriminant Analysis, Pattern Recognition, SignalProcessing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1511
79 Segmentation of Lungs from CT Scan Images for Early Diagnosis of Lung Cancer

Authors: Nisar Ahmed Memon, Anwar Majid Mirza, S.A.M. Gilani

Abstract:

Segmentation is an important step in medical image analysis and classification for radiological evaluation or computer aided diagnosis. The CAD (Computer Aided Diagnosis ) of lung CT generally first segment the area of interest (lung) and then analyze the separately obtained area for nodule detection in order to diagnosis the disease. For normal lung, segmentation can be performed by making use of excellent contrast between air and surrounding tissues. However this approach fails when lung is affected by high density pathology. Dense pathologies are present in approximately a fifth of clinical scans, and for computer analysis such as detection and quantification of abnormal areas it is vital that the entire and perfectly lung part of the image is provided and no part, as present in the original image be eradicated. In this paper we have proposed a lung segmentation technique which accurately segment the lung parenchyma from lung CT Scan images. The algorithm was tested against the 25 datasets of different patients received from Ackron Univeristy, USA and AGA Khan Medical University, Karachi, Pakistan.

Keywords: Computer Aided Diagnosis, Medical ImageProcessing, Region Growing, Segmentation, Thresholding,

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2544
78 Product Features Extraction from Opinions According to Time

Authors: Kamal Amarouche, Houda Benbrahim, Ismail Kassou

Abstract:

Nowadays, e-commerce shopping websites have experienced noticeable growth. These websites have gained consumers’ trust. After purchasing a product, many consumers share comments where opinions are usually embedded about the given product. Research on the automatic management of opinions that gives suggestions to potential consumers and portrays an image of the product to manufactures has been growing recently. After launching the product in the market, the reviews generated around it do not usually contain helpful information or generic opinions about this product (e.g. telephone: great phone...); in the sense that the product is still in the launching phase in the market. Within time, the product becomes old. Therefore, consumers perceive the advantages/ disadvantages about each specific product feature. Therefore, they will generate comments that contain their sentiments about these features. In this paper, we present an unsupervised method to extract different product features hidden in the opinions which influence its purchase, and that combines Time Weighting (TW) which depends on the time opinions were expressed with Term Frequency-Inverse Document Frequency (TF-IDF). We conduct several experiments using two different datasets about cell phones and hotels. The results show the effectiveness of our automatic feature extraction, as well as its domain independent characteristic.

Keywords: Opinion mining, product feature extraction, sentiment analysis, SentiWordNet.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1268
77 An In-Depth Analysis of Open Data Portals as an Emerging Public E-Service

Authors: Martin Lnenicka

Abstract:

Governments collect and produce large amounts of data. Increasingly, governments worldwide have started to implement open data initiatives and also launch open data portals to enable the release of these data in open and reusable formats. Therefore, a large number of open data repositories, catalogues and portals have been emerging in the world. The greater availability of interoperable and linkable open government data catalyzes secondary use of such data, so they can be used for building useful applications which leverage their value, allow insight, provide access to government services, and support transparency. The efficient development of successful open data portals makes it necessary to evaluate them systematic, in order to understand them better and assess the various types of value they generate, and identify the required improvements for increasing this value. Thus, the attention of this paper is directed particularly to the field of open data portals. The main aim of this paper is to compare the selected open data portals on the national level using content analysis and propose a new evaluation framework, which further improves the quality of these portals. It also establishes a set of considerations for involving businesses and citizens to create eservices and applications that leverage on the datasets available from these portals.

Keywords: Big data, content analysis, criteria comparison, data quality, open data, open data portals, public sector.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3031
76 ANN Based Model Development for Material Removal Rate in Dry Turning in Indian Context

Authors: Mangesh R. Phate, V. H. Tatwawadi

Abstract:

This paper is intended to develop an artificial neural network (ANN) based model of material removal rate (MRR) in the turning of ferrous and nonferrous material in a Indian small-scale industry. MRR of the formulated model was proved with the testing data and artificial neural network (ANN) model was developed for the analysis and prediction of the relationship between inputs and output parameters during the turning of ferrous and nonferrous materials. The input parameters of this model are operator, work-piece, cutting process, cutting tool, machine and the environment.

The ANN model consists of a three layered feedforward back propagation neural network. The network is trained with pairs of independent/dependent datasets generated when machining ferrous and nonferrous material. A very good performance of the neural network, in terms of contract with experimental data, was achieved. The model may be used for the testing and forecast of the complex relationship between dependent and the independent parameters in turning operations.

Keywords: Field data based model, Artificial neural network, Simulation, Convectional Turning, Material removal rate.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1934
75 Overall Student Satisfaction at Tabor School of Education: An Examination of Key Factors Based on the AUSSE SEQ

Authors: Francisco Ben, Tracey Price, Chad Morrison, Victoria Warren, Willy Gollan, Robyn Dunbar, Frank Davies, Mark Sorrell

Abstract:

This paper focuses particularly on the educational aspects that contribute to the overall educational satisfaction rated by Tabor School of Education students who participated in the Australasian Survey of Student Engagement (AUSSE) conducted by the Australian Council for Educational Research (ACER) in 2010, 2012 and 2013. In all three years of participation, Tabor ranked first especially in the area of overall student satisfaction. By using a single level path analysis in relation to the AUSSE datasets collected using the Student Engagement Questionnaire (SEQ) for Tabor School of Education, seven aspects that contribute to overall student satisfaction have been identified. There appears to be a direct causal link between aspects of the Supportive Learning Environment, Work Integrated Learning, Career Readiness, Academic Challenge, and overall educational satisfaction levels. A further three aspects, being Student and Staff Interactions, Active Learning, and Enriching Educational Experiences, indirectly influence overall educational satisfaction levels.

Keywords: Tertiary student satisfaction, tertiary educational experience, pre-service teacher education, path analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1194
74 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the rate of using technology that help discovering the diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. It has great potential to provide accurate medical diagnosis, to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied on such micro-array datasets to devise methods that can predict the occurrence of Leukemia disease. In this study, we compared the classification accuracy and response time among eleven decision tree methods and six rule classifier methods using five performance criteria. The experiment results show that the performance of Random Tree is producing better result. Also it takes lowest time to build model in tree classifier. The classification rules algorithms such as nearest- neighbor-like algorithm (NNge) is the best algorithm due to the high accuracy and it takes lowest time to build model in classification.

Keywords: Data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2498
73 Development of Fake News Model Using Machine Learning through Natural Language Processing

Authors: Sajjad Ahmed, Knut Hinkelmann, Flavio Corradini

Abstract:

Fake news detection research is still in the early stage as this is a relatively new phenomenon in the interest raised by society. Machine learning helps to solve complex problems and to build AI systems nowadays and especially in those cases where we have tacit knowledge or the knowledge that is not known. We used machine learning algorithms and for identification of fake news; we applied three classifiers; Passive Aggressive, Naïve Bayes, and Support Vector Machine. Simple classification is not completely correct in fake news detection because classification methods are not specialized for fake news. With the integration of machine learning and text-based processing, we can detect fake news and build classifiers that can classify the news data. Text classification mainly focuses on extracting various features of text and after that incorporating those features into classification. The big challenge in this area is the lack of an efficient way to differentiate between fake and non-fake due to the unavailability of corpora. We applied three different machine learning classifiers on two publicly available datasets. Experimental analysis based on the existing dataset indicates a very encouraging and improved performance.

Keywords: Fake news detection, types of fake news, machine learning, natural language processing, classification techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1453
72 A Hybrid Image Fusion Model for Generating High Spatial-Temporal-Spectral Resolution Data Using OLI-MODIS-Hyperion Satellite Imagery

Authors: Yongquan Zhao, Bo Huang

Abstract:

Spatial, Temporal, and Spectral Resolution (STSR) are three key characteristics of Earth observation satellite sensors; however, any single satellite sensor cannot provide Earth observations with high STSR simultaneously because of the hardware technology limitations of satellite sensors. On the other hand, a conflicting circumstance is that the demand for high STSR has been growing with the remote sensing application development. Although image fusion technology provides a feasible means to overcome the limitations of the current Earth observation data, the current fusion technologies cannot enhance all STSR simultaneously and provide high enough resolution improvement level. This study proposes a Hybrid Spatial-Temporal-Spectral image Fusion Model (HSTSFM) to generate synthetic satellite data with high STSR simultaneously, which blends the high spatial resolution from the panchromatic image of Landsat-8 Operational Land Imager (OLI), the high temporal resolution from the multi-spectral image of Moderate Resolution Imaging Spectroradiometer (MODIS), and the high spectral resolution from the hyper-spectral image of Hyperion to produce high STSR images. The proposed HSTSFM contains three fusion modules: (1) spatial-spectral image fusion; (2) spatial-temporal image fusion; (3) temporal-spectral image fusion. A set of test data with both phenological and land cover type changes in Beijing suburb area, China is adopted to demonstrate the performance of the proposed method. The experimental results indicate that HSTSFM can produce fused image that has good spatial and spectral fidelity to the reference image, which means it has the potential to generate synthetic data to support the studies that require high STSR satellite imagery.

Keywords: Hybrid spatial-temporal-spectral fusion, high resolution synthetic imagery, least square regression, sparse representation, spectral transformation.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1163
71 A Neuron Model of Facial Recognition and Detection of an Authorized Entity Using Machine Learning System

Authors: J. K. Adedeji, M. O. Oyekanmi

Abstract:

This paper has critically examined the use of Machine Learning procedures in curbing unauthorized access into valuable areas of an organization. The use of passwords, pin codes, user’s identification in recent times has been partially successful in curbing crimes involving identities, hence the need for the design of a system which incorporates biometric characteristics such as DNA and pattern recognition of variations in facial expressions. The facial model used is the OpenCV library which is based on the use of certain physiological features, the Raspberry Pi 3 module is used to compile the OpenCV library, which extracts and stores the detected faces into the datasets directory through the use of camera. The model is trained with 50 epoch run in the database and recognized by the Local Binary Pattern Histogram (LBPH) recognizer contained in the OpenCV. The training algorithm used by the neural network is back propagation coded using python algorithmic language with 200 epoch runs to identify specific resemblance in the exclusive OR (XOR) output neurons. The research however confirmed that physiological parameters are better effective measures to curb crimes relating to identities.

Keywords: Biometric characters, facial recognition, neural network, OpenCV.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 657
70 Local Spectrum Feature Extraction for Face Recognition

Authors: Muhammad Imran Ahmad, Ruzelita Ngadiran, Mohd Nazrin Md Isa, Nor Ashidi Mat Isa, Mohd Zaizu Ilyas, Raja Abdullah Raja Ahmad, Said Amirul Anwar Ab Hamid, Muzammil Jusoh

Abstract:

This paper presents two techniques, local feature extraction using image spectrum and low frequency spectrum modelling using GMM to capture the underlying statistical information to improve the performance of face recognition system. Local spectrum features are extracted using overlap sub block window that are mapped on the face image. For each of this block, spatial domain is transformed to frequency domain using DFT. A low frequency coefficient is preserved by discarding high frequency coefficients by applying rectangular mask on the spectrum of the facial image. Low frequency information is non- Gaussian in the feature space and by using combination of several Gaussian functions that has different statistical properties, the best feature representation can be modelled using probability density function. The recognition process is performed using maximum likelihood value computed using pre-calculated GMM components. The method is tested using FERET datasets and is able to achieved 92% recognition rates.

Keywords: Local features modelling, face recognition system, Gaussian mixture models.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2212
69 A Hybrid Nature Inspired Algorithm for Generating Optimal Query Plan

Authors: R. Gomathi, D. Sharmila

Abstract:

The emergence of the Semantic Web technology increases day by day due to the rapid growth of multiple web pages. Many standard formats are available to store the semantic web data. The most popular format is the Resource Description Framework (RDF). Querying large RDF graphs becomes a tedious procedure with a vast increase in the amount of data. The problem of query optimization becomes an issue in querying large RDF graphs. Choosing the best query plan reduces the amount of query execution time. To address this problem, nature inspired algorithms can be used as an alternative to the traditional query optimization techniques. In this research, the optimal query plan is generated by the proposed SAPSO algorithm which is a hybrid of Simulated Annealing (SA) and Particle Swarm Optimization (PSO) algorithms. The proposed SAPSO algorithm has the ability to find the local optimistic result and it avoids the problem of local minimum. Experiments were performed on different datasets by changing the number of predicates and the amount of data. The proposed algorithm gives improved results compared to existing algorithms in terms of query execution time.

Keywords: Semantic web, RDF, Query optimization, Nature inspired algorithms, PSO, SA.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2198
68 Implementing an Intuitive Reasoner with a Large Weather Database

Authors: Yung-Chien Sun, O. Grant Clark

Abstract:

In this paper, the implementation of a rule-based intuitive reasoner is presented. The implementation included two parts: the rule induction module and the intuitive reasoner. A large weather database was acquired as the data source. Twelve weather variables from those data were chosen as the “target variables" whose values were predicted by the intuitive reasoner. A “complex" situation was simulated by making only subsets of the data available to the rule induction module. As a result, the rules induced were based on incomplete information with variable levels of certainty. The certainty level was modeled by a metric called "Strength of Belief", which was assigned to each rule or datum as ancillary information about the confidence in its accuracy. Two techniques were employed to induce rules from the data subsets: decision tree and multi-polynomial regression, respectively for the discrete and the continuous type of target variables. The intuitive reasoner was tested for its ability to use the induced rules to predict the classes of the discrete target variables and the values of the continuous target variables. The intuitive reasoner implemented two types of reasoning: fast and broad where, by analogy to human thought, the former corresponds to fast decision making and the latter to deeper contemplation. . For reference, a weather data analysis approach which had been applied on similar tasks was adopted to analyze the complete database and create predictive models for the same 12 target variables. The values predicted by the intuitive reasoner and the reference approach were compared with actual data. The intuitive reasoner reached near-100% accuracy for two continuous target variables. For the discrete target variables, the intuitive reasoner predicted at least 70% as accurately as the reference reasoner. Since the intuitive reasoner operated on rules derived from only about 10% of the total data, it demonstrated the potential advantages in dealing with sparse data sets as compared with conventional methods.

Keywords: Artificial intelligence, intuition, knowledge acquisition, limited certainty.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1348
67 Comparative Study on Swarm Intelligence Techniques for Biclustering of Microarray Gene Expression Data

Authors: R. Balamurugan, A. M. Natarajan, K. Premalatha

Abstract:

Microarray gene expression data play a vital in biological processes, gene regulation and disease mechanism. Biclustering in gene expression data is a subset of the genes indicating consistent patterns under the subset of the conditions. Finding a biclustering is an optimization problem. In recent years, swarm intelligence techniques are popular due to the fact that many real-world problems are increasingly large, complex and dynamic. By reasons of the size and complexity of the problems, it is necessary to find an optimization technique whose efficiency is measured by finding the near optimal solution within a reasonable amount of time. In this paper, the algorithmic concepts of the Particle Swarm Optimization (PSO), Shuffled Frog Leaping (SFL) and Cuckoo Search (CS) algorithms have been analyzed for the four benchmark gene expression dataset. The experiment results show that CS outperforms PSO and SFL for 3 datasets and SFL give better performance in one dataset. Also this work determines the biological relevance of the biclusters with Gene Ontology in terms of function, process and component.

Keywords: Particle swarm optimization, Shuffled frog leaping, Cuckoo search, biclustering, gene expression data.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2632
66 Depth Camera Aided Dead-Reckoning Localization of Autonomous Mobile Robots in Unstructured Global Navigation Satellite System Denied Environments

Authors: David L. Olson, Stephen B. H. Bruder, Adam S. Watkins, Cleon E. Davis

Abstract:

In global navigation satellite system (GNSS) denied settings, such as indoor environments, autonomous mobile robots are often limited to dead-reckoning navigation techniques to determine their position, velocity, and attitude (PVA). Localization is typically accomplished by employing an inertial measurement unit (IMU), which, while precise in nature, accumulates errors rapidly and severely degrades the localization solution. Standard sensor fusion methods, such as Kalman filtering, aim to fuse precise IMU measurements with accurate aiding sensors to establish a precise and accurate solution. In indoor environments, where GNSS and no other a priori information is known about the environment, effective sensor fusion is difficult to achieve, as accurate aiding sensor choices are sparse. However, an opportunity arises by employing a depth camera in the indoor environment. A depth camera can capture point clouds of the surrounding floors and walls. Extracting attitude from these surfaces can serve as an accurate aiding source, which directly combats errors that arise due to gyroscope imperfections. This configuration for sensor fusion leads to a dramatic reduction of PVA error compared to traditional aiding sensor configurations. This paper provides the theoretical basis for the depth camera aiding sensor method, initial expectations of performance benefit via simulation, and hardware implementation thus verifying its veracity. Hardware implementation is performed on the Quanser Qbot 2™ mobile robot, with a Vector-Nav VN-200™ IMU and Kinect™ camera from Microsoft.

Keywords: Autonomous mobile robotics, dead reckoning, depth camera, inertial navigation, Kalman filtering, localization, sensor fusion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 648
65 Image Processing Approach for Detection of Three-Dimensional Tree-Rings from X-Ray Computed Tomography

Authors: Jorge Martinez-Garcia, Ingrid Stelzner, Joerg Stelzner, Damian Gwerder, Philipp Schuetz

Abstract:

Tree-ring analysis is an important part of the quality assessment and the dating of (archaeological) wood samples. It provides quantitative data about the whole anatomical ring structure, which can be used, for example, to measure the impact of the fluctuating environment on the tree growth, for the dendrochronological analysis of archaeological wooden artefacts and to estimate the wood mechanical properties. Despite advances in computer vision and edge recognition algorithms, detection and counting of annual rings are still limited to 2D datasets and performed in most cases manually, which is a time consuming, tedious task and depends strongly on the operator’s experience. This work presents an image processing approach to detect the whole 3D tree-ring structure directly from X-ray computed tomography imaging data. The approach relies on a modified Canny edge detection algorithm, which captures fully connected tree-ring edges throughout the measured image stack and is validated on X-ray computed tomography data taken from six wood species.

Keywords: Ring recognition, edge detection, X-ray computed tomography, dendrochronology.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 750
64 Investigations of Protein Aggregation Using Sequence and Structure Based Features

Authors: M. Michael Gromiha, A. Mary Thangakani, Sandeep Kumar, D. Velmurugan

Abstract:

The main cause of several neurodegenerative diseases such as Alzhemier, Parkinson and spongiform encephalopathies is formation of amyloid fibrils and plaques in proteins. We have analyzed different sets of proteins and peptides to understand the influence of sequence based features on protein aggregation process. The comparison of 373 pairs of homologous mesophilic and thermophilic proteins showed that aggregation prone regions (APRs) are present in both. But, the thermophilic protein monomers show greater ability to ‘stow away’ the APRs in their hydrophobic cores and protect them from solvent exposure. The comparison of amyloid forming and amorphous b-aggregating hexapeptides suggested distinct preferences for specific residues at the six positions as well as all possible combinations of nine residue pairs. The compositions of residues at different positions and residue pairs have been converted into energy potentials and utilized for distinguishing between amyloid forming and amorphous b-aggregating peptides. Our method could correctly identify the amyloid forming peptides at an accuracy of 95-100% in different datasets of peptides.

Keywords: Aggregation prone regions, amyloids, thermophilic proteins, amino acid residues, machine learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1462
63 Performance Analysis of HSDPA Systems using Low-Density Parity-Check (LDPC)Coding as Compared to Turbo Coding

Authors: K. Anitha Sheela, J. Tarun Kumar

Abstract:

HSDPA is a new feature which is introduced in Release-5 specifications of the 3GPP WCDMA/UTRA standard to realize higher speed data rate together with lower round-trip times. Moreover, the HSDPA concept offers outstanding improvement of packet throughput and also significantly reduces the packet call transfer delay as compared to Release -99 DSCH. Till now the HSDPA system uses turbo coding which is the best coding technique to achieve the Shannon limit. However, the main drawbacks of turbo coding are high decoding complexity and high latency which makes it unsuitable for some applications like satellite communications, since the transmission distance itself introduces latency due to limited speed of light. Hence in this paper it is proposed to use LDPC coding in place of Turbo coding for HSDPA system which decreases the latency and decoding complexity. But LDPC coding increases the Encoding complexity. Though the complexity of transmitter increases at NodeB, the End user is at an advantage in terms of receiver complexity and Bit- error rate. In this paper LDPC Encoder is implemented using “sparse parity check matrix" H to generate a codeword at Encoder and “Belief Propagation algorithm "for LDPC decoding .Simulation results shows that in LDPC coding the BER suddenly drops as the number of iterations increase with a small increase in Eb/No. Which is not possible in Turbo coding. Also same BER was achieved using less number of iterations and hence the latency and receiver complexity has decreased for LDPC coding. HSDPA increases the downlink data rate within a cell to a theoretical maximum of 14Mbps, with 2Mbps on the uplink. The changes that HSDPA enables includes better quality, more reliable and more robust data services. In other words, while realistic data rates are only a few Mbps, the actual quality and number of users achieved will improve significantly.

Keywords: AMC, HSDPA, LDPC, WCDMA, 3GPP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2015
62 Imputing Missing Data in Electronic Health Records: A Comparison of Linear and Non-Linear Imputation Models

Authors: Alireza Vafaei Sadr, Vida Abedi, Jiang Li, Ramin Zand

Abstract:

Missing data is a common challenge in medical research and can lead to biased or incomplete results. When the data bias leaks into models, it further exacerbates health disparities; biased algorithms can lead to misclassification and reduced resource allocation and monitoring as part of prevention strategies for certain minorities and vulnerable segments of patient populations, which in turn further reduce data footprint from the same population – thus, a vicious cycle. This study compares the performance of six imputation techniques grouped into Linear and Non-Linear models, on two different real-world electronic health records (EHRs) datasets, representing 17864 patient records. The mean absolute percentage error (MAPE) and root mean squared error (RMSE) are used as performance metrics, and the results show that the Linear models outperformed the Non-Linear models in terms of both metrics. These results suggest that sometimes Linear models might be an optimal choice for imputation in laboratory variables in terms of imputation efficiency and uncertainty of predicted values.

Keywords: EHR, Machine Learning, imputation, laboratory variables, algorithmic bias.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 61
61 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks

Authors: Jiajun Wang, Xiaoge Li

Abstract:

The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods mainly focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence of words in different ranges in the sentence, resulting in deviation when assigning relationship weight to other words other than aspect words. To solve these problems, we propose an aspect-level sentiment analysis model that combines a multi-channel convolutional network and graph convolutional network (GCN). Firstly, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and self-attention mechanism. Besides, a multi-channel convolutional network is used to extract the features of words in different ranges. Finally, a convolutional graph network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models, which shows that our model is better and more effective.

Keywords: Aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 417
60 Fuzzy Population-Based Meta-Heuristic Approaches for Attribute Reduction in Rough Set Theory

Authors: Mafarja Majdi, Salwani Abdullah, Najmeh S. Jaddi

Abstract:

One of the global combinatorial optimization problems in machine learning is feature selection. It concerned with removing the irrelevant, noisy, and redundant data, along with keeping the original meaning of the original data. Attribute reduction in rough set theory is an important feature selection method. Since attribute reduction is an NP-hard problem, it is necessary to investigate fast and effective approximate algorithms. In this paper, we proposed two feature selection mechanisms based on memetic algorithms (MAs) which combine the genetic algorithm with a fuzzy record to record travel algorithm and a fuzzy controlled great deluge algorithm, to identify a good balance between local search and genetic search. In order to verify the proposed approaches, numerical experiments are carried out on thirteen datasets. The results show that the MAs approaches are efficient in solving attribute reduction problems when compared with other meta-heuristic approaches.

Keywords: Rough Set Theory, Attribute Reduction, Fuzzy Logic, Memetic Algorithms, Record to Record Algorithm, Great Deluge Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1896
59 Comparison of Deep Convolutional Neural Networks Models for Plant Disease Identification

Authors: Megha Gupta, Nupur Prakash

Abstract:

Identification of plant diseases has been performed using machine learning and deep learning models on the datasets containing images of healthy and diseased plant leaves. The current study carries out an evaluation of some of the deep learning models based on convolutional neural network architectures for identification of plant diseases. For this purpose, the publicly available New Plant Diseases Dataset, an augmented version of PlantVillage dataset, available on Kaggle platform, containing 87,900 images has been used. The dataset contained images of 26 diseases of 14 different plants and images of 12 healthy plants. The CNN models selected for the study presented in this paper are AlexNet, ZFNet, VGGNet (four models), GoogLeNet, and ResNet (three models). The selected models are trained using PyTorch, an open-source machine learning library, on Google Colaboratory. A comparative study has been carried out to analyze the high degree of accuracy achieved using these models. The highest test accuracy and F1-score of 99.59% and 0.996, respectively, were achieved by using GoogLeNet with Mini-batch momentum based gradient descent learning algorithm.

Keywords: comparative analysis, convolutional neural networks, deep learning, plant disease identification

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 571
58 On-line Recognition of Isolated Gestures of Flight Deck Officers (FDO)

Authors: Deniz T. Sodiri, Venkat V S S Sastry

Abstract:

The paper presents an on-line recognition machine (RM) for continuous/isolated, dynamic and static gestures that arise in Flight Deck Officer (FDO) training. RM is based on generic pattern recognition framework. Gestures are represented as templates using summary statistics. The proposed recognition algorithm exploits temporal and spatial characteristics of gestures via dynamic programming and Markovian process. The algorithm predicts corresponding index of incremental input data in the templates in an on-line mode. Accumulated consistency in the sequence of prediction provides a similarity measurement (Score) between input data and the templates. The algorithm provides an intuitive mechanism for automatic detection of start/end frames of continuous gestures. In the present paper, we consider isolated gestures. The performance of RM is evaluated using four datasets - artificial (W TTest), hand motion (Yang) and FDO (tracker, vision-based ). RM achieves comparable results which are in agreement with other on-line and off-line algorithms such as hidden Markov model (HMM) and dynamic time warping (DTW). The proposed algorithm has the additional advantage of providing timely feedback for training purposes.

Keywords: On-line Recognition Algorithm, IsolatedDynamic/Static Gesture Recognition, On-line Markovian/DynamicProgramming, Training in Virtual Environments.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1294
57 Enhanced Multi-Intensity Analysis in Multi-Scenery Classification-Based Macro and Micro Elements

Authors: R. Bremananth

Abstract:

Several computationally challenging issues are encountered while classifying complex natural scenes. In this paper, we address the problems that are encountered in rotation invariance with multi-intensity analysis for multi-scene overlapping. In the present literature, various algorithms proposed techniques for multi-intensity analysis, but there are several restrictions in these algorithms while deploying them in multi-scene overlapping classifications. In order to resolve the problem of multi-scenery overlapping classifications, we present a framework that is based on macro and micro basis functions. This algorithm conquers the minimum classification false alarm while pigeonholing multi-scene overlapping. Furthermore, a quadrangle multi-intensity decay is invoked. Several parameters are utilized to analyze invariance for multi-scenery classifications such as rotation, classification, correlation, contrast, homogeneity, and energy. Benchmark datasets were collected for complex natural scenes and experimented for the framework. The results depict that the framework achieves a significant improvement on gray-level matrix of co-occurrence features for overlapping in diverse degree of orientations while pigeonholing multi-scene overlapping.

Keywords: Automatic classification, contrast, homogeneity, invariant analysis, multi-scene analysis, overlapping.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1077