Search results for: time series classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 21403

Search results for: time series classification

20833 Domain-Specific Deep Neural Network Model for Classification of Abnormalities on Chest Radiographs

Authors: Nkechinyere Joy Olawuyi, Babajide Samuel Afolabi, Bola Ibitoye

Abstract:

This study collected a preprocessed dataset of chest radiographs and formulated a deep neural network model for detecting abnormalities. It also evaluated the performance of the formulated model and implemented a prototype of the formulated model. This was with the view to developing a deep neural network model to automatically classify abnormalities in chest radiographs. In order to achieve the overall purpose of this research, a large set of chest x-ray images were sourced for and collected from the CheXpert dataset, which is an online repository of annotated chest radiographs compiled by the Machine Learning Research Group, Stanford University. The chest radiographs were preprocessed into a format that can be fed into a deep neural network. The preprocessing techniques used were standardization and normalization. The classification problem was formulated as a multi-label binary classification model, which used convolutional neural network architecture to make a decision on whether an abnormality was present or not in the chest radiographs. The classification model was evaluated using specificity, sensitivity, and Area Under Curve (AUC) score as the parameter. A prototype of the classification model was implemented using Keras Open source deep learning framework in Python Programming Language. The AUC ROC curve of the model was able to classify Atelestasis, Support devices, Pleural effusion, Pneumonia, A normal CXR (no finding), Pneumothorax, and Consolidation. However, Lung opacity and Cardiomegaly had a probability of less than 0.5 and thus were classified as absent. Precision, recall, and F1 score values were 0.78; this implies that the number of False Positive and False Negative is the same, revealing some measure of label imbalance in the dataset. The study concluded that the developed model is sufficient to classify abnormalities present in chest radiographs into present or absent.

Keywords: transfer learning, convolutional neural network, radiograph, classification, multi-label

Procedia PDF Downloads 125
20832 Strategies for Synchronizing Chocolate Conching Data Using Dynamic Time Warping

Authors: Fernanda A. P. Peres, Thiago N. Peres, Flavio S. Fogliatto, Michel J. Anzanello

Abstract:

Batch processes are widely used in food industry and have an important role in the production of high added value products, such as chocolate. Process performance is usually described by variables that are monitored as the batch progresses. Data arising from these processes are likely to display a strong correlation-autocorrelation structure, and are usually monitored using control charts based on multiway principal components analysis (MPCA). Process control of a new batch is carried out comparing the trajectories of its relevant process variables with those in a reference set of batches that yielded products within specifications; it is clear that proper determination of the reference set is key for the success of a correct signalization of non-conforming batches in such quality control schemes. In chocolate manufacturing, misclassifications of non-conforming batches in the conching phase may lead to significant financial losses. In such context, the accuracy of process control grows in relevance. In addition to that, the main assumption in MPCA-based monitoring strategies is that all batches are synchronized in duration, both the new batch being monitored and those in the reference set. Such assumption is often not satisfied in chocolate manufacturing process. As a consequence, traditional techniques as MPCA-based charts are not suitable for process control and monitoring. To address that issue, the objective of this work is to compare the performance of three dynamic time warping (DTW) methods in the alignment and synchronization of chocolate conching process variables’ trajectories, aimed at properly determining the reference distribution for multivariate statistical process control. The power of classification of batches in two categories (conforming and non-conforming) was evaluated using the k-nearest neighbor (KNN) algorithm. Real data from a milk chocolate conching process was collected and the following variables were monitored over time: frequency of soybean lecithin dosage, rotation speed of the shovels, current of the main motor of the conche, and chocolate temperature. A set of 62 batches with durations between 495 and 1,170 minutes was considered; 53% of the batches were known to be conforming based on lab test results and experts’ evaluations. Results showed that all three DTW methods tested were able to align and synchronize the conching dataset. However, synchronized datasets obtained from these methods performed differently when inputted in the KNN classification algorithm. Kassidas, MacGregor and Taylor’s (named KMT) method was deemed the best DTW method for aligning and synchronizing a milk chocolate conching dataset, presenting 93.7% accuracy, 97.2% sensitivity and 90.3% specificity in batch classification, being considered the best option to determine the reference set for the milk chocolate dataset. Such method was recommended due to the lowest number of iterations required to achieve convergence and highest average accuracy in the testing portion using the KNN classification technique.

Keywords: batch process monitoring, chocolate conching, dynamic time warping, reference set distribution, variable duration

Procedia PDF Downloads 165
20831 Preliminary Evaluation of Decommissioning Wastes for the First Commercial Nuclear Power Reactor in South Korea

Authors: Kyomin Lee, Joohee Kim, Sangho Kang

Abstract:

The commercial nuclear power reactor in South Korea, Kori Unit 1, which was a 587 MWe pressurized water reactor that started operation since 1978, was permanently shut down in June 2017 without an additional operating license extension. The Kori 1 Unit is scheduled to become the nuclear power unit to enter the decommissioning phase. In this study, the preliminary evaluation of the decommissioning wastes for the Kori Unit 1 was performed based on the following series of process: firstly, the plant inventory is investigated based on various documents (i.e., equipment/ component list, construction records, general arrangement drawings). Secondly, the radiological conditions of systems, structures and components (SSCs) are established to estimate the amount of radioactive waste by waste classification. Third, the waste management strategies for Kori Unit 1 including waste packaging are established. Forth, selection of the proper decontamination and dismantling (D&D) technologies is made considering the various factors. Finally, the amount of decommissioning waste by classification for Kori 1 is estimated using the DeCAT program, which was developed by KEPCO-E&C for a decommissioning cost estimation. The preliminary evaluation results have shown that the expected amounts of decommissioning wastes were less than about 2% and 8% of the total wastes generated (i.e., sum of clean wastes and radwastes) before/after waste processing, respectively, and it was found that the majority of contaminated material was carbon or alloy steel and stainless steel. In addition, within the range of availability of information, the results of the evaluation were compared with the results from the various decommissioning experiences data or international/national decommissioning study. The comparison results have shown that the radioactive waste amount from Kori Unit 1 decommissioning were much less than those from the plants decommissioned in U.S. and were comparable to those from the plants in Europe. This result comes from the difference of disposal cost and clearance criteria (i.e., free release level) between U.S. and non-U.S. The preliminary evaluation performed using the methodology established in this study will be useful as a important information in establishing the decommissioning planning for the decommissioning schedule and waste management strategy establishment including the transportation, packaging, handling, and disposal of radioactive wastes.

Keywords: characterization, classification, decommissioning, decontamination and dismantling, Kori 1, radioactive waste

Procedia PDF Downloads 208
20830 A Similarity Measure for Classification and Clustering in Image Based Medical and Text Based Banking Applications

Authors: K. P. Sandesh, M. H. Suman

Abstract:

Text processing plays an important role in information retrieval, data-mining, and web search. Measuring the similarity between the documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature the proposed measure takes the following three cases into account: (1) The feature appears in both documents; (2) The feature appears in only one document and; (3) The feature appears in none of the documents. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of our measure is evaluated on several real-world data sets for text classification and clustering problems, especially in banking and health sectors. The results show that the performance obtained by the proposed measure is better than that achieved by the other measures.

Keywords: document classification, document clustering, entropy, accuracy, classifiers, clustering algorithms

Procedia PDF Downloads 517
20829 Video on Demand (VOD) Industry in Iran: Study of Reasons of Increasing Film and Series Platforms

Authors: Narges Hamidipour

Abstract:

VOD, which stands for "video on demand", is one kind of watching movies and series on web platforms that, by using them, individuals can access lots of video content by paying abonnement. The first platform in Iran was funded in 2014, and in the last 10 years, it has become the main part of the movie and series industry. There are 374 VOD platforms in Iran, but just three of them are in the mainstream. However, in these years, they have been developed and famed in different ways. This article focuses on the reasons for this development in the past years. For the framework, "digital economy", "media industries," and "political economy" have been used with the interview method. In this research, some experts in SATRA (regulatory organization of inclusive audio and video media in Iran), owners or managers of VODs and some others who directly have been in the system conveyed their opinions. By the way, some documents and analysis statistics are invoked to reach complete results.

Keywords: digital economy, political economy, VOD, interview, iran

Procedia PDF Downloads 64
20828 Robotic Assisted vs Traditional Laparoscopic Partial Nephrectomy Peri-Operative Outcomes: A Comparative Single Surgeon Study

Authors: Gerard Bray, Derek Mao, Arya Bahadori, Sachinka Ranasinghe

Abstract:

The EAU currently recommends partial nephrectomy as the preferred management for localised cT1 renal tumours, irrespective of surgical approach. With the advent of robotic assisted partial nephrectomy, there is growing evidence that warm ischaemia time may be reduced compared to the traditional laparoscopic approach. There is still no clear differences between the two approaches with regards to other peri-operative and oncological outcomes. Current limitations in the field denote the lack of single surgeon series to compare the two approaches as other studies often include multiple operators of different experience levels. To the best of our knowledge, this study is the first single surgeon series comparing peri-operative outcomes of robotic assisted and laparoscopic PN. The current study aims to reduce intra-operator bias while maintaining an adequate sample size to assess the differences in outcomes between the two approaches. We retrospectively compared patient demographics, peri-operative outcomes, and renal function derangements of all partial nephrectomies undertaken by a single surgeon with experience in both laparoscopic and robotic surgery. Warm ischaemia time, length of stay, and acute renal function deterioration were all significantly reduced with robotic partial nephrectomy, compared to laparoscopic nephrectomy. This study highlights the benefits of robotic partial nephrectomy. Further prospective studies with larger sample sizes would be valuable additions to the current literature.

Keywords: partial nephrectomy, robotic assisted partial nephrectomy, warm ischaemia time, peri-operative outcomes

Procedia PDF Downloads 139
20827 Theoretical Discussion on the Classification of Risks in Supply Chain Management

Authors: Liane Marcia Freitas Silva, Fernando Augusto Silva Marins, Maria Silene Alexandre Leite

Abstract:

The adoption of a network structure, like in the supply chains, favors the increase of dependence between companies and, by consequence, their vulnerability. Environment disasters, sociopolitical and economical events, and the dynamics of supply chains elevate the uncertainty of their operation, favoring the occurrence of events that can generate break up in the operations and other undesired consequences. Thus, supply chains are exposed to various risks that can influence the profitability of companies involved, and there are several previous studies that have proposed risk classification models in order to categorize the risks and to manage them. The objective of this paper is to analyze and discuss thirty of these risk classification models by means a theoretical survey. The research method adopted for analyzing and discussion includes three phases: The identification of the types of risks proposed in each one of the thirty models, the grouping of them considering equivalent concepts associated to their definitions, and, the analysis of these risks groups, evaluating their similarities and differences. After these analyses, it was possible to conclude that, in fact, there is more than thirty risks types identified in the literature of Supply Chains, but some of them are identical despite of be used distinct terms to characterize them, because different criteria for risk classification are adopted by researchers. In short, it is observed that some types of risks are identified as risk source for supply chains, such as, demand risk, environmental risk and safety risk. On the other hand, other types of risks are identified by the consequences that they can generate for the supply chains, such as, the reputation risk, the asset depreciation risk and the competitive risk. These results are consequence of the disagreements between researchers on risk classification, mainly about what is risk event and about what is the consequence of risk occurrence. An additional study is in developing in order to clarify how the risks can be generated, and which are the characteristics of the components in a Supply Chain that leads to occurrence of risk.

Keywords: sisks classification, survey, supply chain management, theoretical discussion

Procedia PDF Downloads 631
20826 Discerning Divergent Nodes in Social Networks

Authors: Mehran Asadi, Afrand Agah

Abstract:

In data mining, partitioning is used as a fundamental tool for classification. With the help of partitioning, we study the structure of data, which allows us to envision decision rules, which can be applied to classification trees. In this research, we used online social network dataset and all of its attributes (e.g., Node features, labels, etc.) to determine what constitutes an above average chance of being a divergent node. We used the R statistical computing language to conduct the analyses in this report. The data were found on the UC Irvine Machine Learning Repository. This research introduces the basic concepts of classification in online social networks. In this work, we utilize overfitting and describe different approaches for evaluation and performance comparison of different classification methods. In classification, the main objective is to categorize different items and assign them into different groups based on their properties and similarities. In data mining, recursive partitioning is being utilized to probe the structure of a data set, which allow us to envision decision rules and apply them to classify data into several groups. Estimating densities is hard, especially in high dimensions, with limited data. Of course, we do not know the densities, but we could estimate them using classical techniques. First, we calculated the correlation matrix of the dataset to see if any predictors are highly correlated with one another. By calculating the correlation coefficients for the predictor variables, we see that density is strongly correlated with transitivity. We initialized a data frame to easily compare the quality of the result classification methods and utilized decision trees (with k-fold cross validation to prune the tree). The method performed on this dataset is decision trees. Decision tree is a non-parametric classification method, which uses a set of rules to predict that each observation belongs to the most commonly occurring class label of the training data. Our method aggregates many decision trees to create an optimized model that is not susceptible to overfitting. When using a decision tree, however, it is important to use cross-validation to prune the tree in order to narrow it down to the most important variables.

Keywords: online social networks, data mining, social cloud computing, interaction and collaboration

Procedia PDF Downloads 154
20825 Development of Residual Power Series Methods for Efficient Solutions of Stiff Differential Equations

Authors: Gebreegziabher Hailu

Abstract:

This paper presents the development of residual power series methods aimed at efficiently solving stiff differential equations, which pose significant challenges in numerical analysis due to their rapid changes in solution behavior. The RPSM is a numerical approach that generates polynomial-based approximate solutions without the need for linearization, discretization, or perturbation techniques, making it straightforward to implement and less prone to computational errors. We introduce an approach that utilizes power series expansions combined with residual minimization techniques to enhance convergence and stability. By analyzing the theoretical foundations of stiffness, we delve into the formulation of the residual power series method, detailing how it effectively captures the dynamics of stiff systems while maintaining computational efficiency. Numerical experiments demonstrate the method's superiority in terms of accuracy and computational cost when compared to traditional methods like implicit Runge-Kutta or multistep techniques. We also explore adaptive strategies within our framework to automatically adjust parameters based on the stiffness characteristics of the problem at hand. Ultimately, our findings contribute to the broader toolkit for tackling stiff differential equations, offering a robust alternative that promises to streamline computational workflows in various applied mathematics and engineering contexts.

Keywords: residual power series methods, stiff differential equoations, numerical approach, Runge Kutta methods

Procedia PDF Downloads 21
20824 Prediction on Housing Price Based on Deep Learning

Authors: Li Yu, Chenlu Jiao, Hongrun Xin, Yan Wang, Kaiyang Wang

Abstract:

In order to study the impact of various factors on the housing price, we propose to build different prediction models based on deep learning to determine the existing data of the real estate in order to more accurately predict the housing price or its changing trend in the future. Considering that the factors which affect the housing price vary widely, the proposed prediction models include two categories. The first one is based on multiple characteristic factors of the real estate. We built Convolution Neural Network (CNN) prediction model and Long Short-Term Memory (LSTM) neural network prediction model based on deep learning, and logical regression model was implemented to make a comparison between these three models. Another prediction model is time series model. Based on deep learning, we proposed an LSTM-1 model purely regard to time series, then implementing and comparing the LSTM model and the Auto-Regressive and Moving Average (ARMA) model. In this paper, comprehensive study of the second-hand housing price in Beijing has been conducted from three aspects: crawling and analyzing, housing price predicting, and the result comparing. Ultimately the best model program was produced, which is of great significance to evaluation and prediction of the housing price in the real estate industry.

Keywords: deep learning, convolutional neural network, LSTM, housing prediction

Procedia PDF Downloads 304
20823 Friction Stir Welding of Al-Mg-Mn Aluminum Alloy Plates: A Review

Authors: K. Subbaiah, C. V. Jayakumar

Abstract:

Friction stir welding is a solid state welding process. Friction stir welding process eliminates the defects found in fusion welding processes. It is environmentally friend process. 5000 and 6000 series aluminum alloys are widely used in the transportation industries. The Al-Mg-Mn (5000) and Al-Mg-Si (6000) alloys are preferably offer best combination of use in Marine construction. The medium strength and high corrosion resistant 5000 series alloys are the aluminum alloys, which are found maximum utility in the world. In this review, the tool pin profile, process parameters such as hardness, yield strength and tensile strength, and microstructural evolution of friction stir welding of Al-Mg-Mn alloys (5000 Series) have been discussed.

Keywords: Al-Mg-Mn alloys, friction stir welding, tool pin profile, microstructure and mechanical properties

Procedia PDF Downloads 440
20822 Development of a Computer Aided Diagnosis Tool for Brain Tumor Extraction and Classification

Authors: Fathi Kallel, Abdulelah Alabd Uljabbar, Abdulrahman Aldukhail, Abdulaziz Alomran

Abstract:

The brain is an important organ in our body since it is responsible about the majority actions such as vision, memory, etc. However, different diseases such as Alzheimer and tumors could affect the brain and conduct to a partial or full disorder. Regular diagnosis are necessary as a preventive measure and could help doctors to early detect a possible trouble and therefore taking the appropriate treatment, especially in the case of brain tumors. Different imaging modalities are proposed for diagnosis of brain tumor. The powerful and most used modality is the Magnetic Resonance Imaging (MRI). MRI images are analyzed by doctor in order to locate eventual tumor in the brain and describe the appropriate and needed treatment. Diverse image processing methods are also proposed for helping doctors in identifying and analyzing the tumor. In fact, a large Computer Aided Diagnostic (CAD) tools including developed image processing algorithms are proposed and exploited by doctors as a second opinion to analyze and identify the brain tumors. In this paper, we proposed a new advanced CAD for brain tumor identification, classification and feature extraction. Our proposed CAD includes three main parts. Firstly, we load the brain MRI. Secondly, a robust technique for brain tumor extraction is proposed. This technique is based on both Discrete Wavelet Transform (DWT) and Principal Component Analysis (PCA). DWT is characterized by its multiresolution analytic property, that’s why it was applied on MRI images with different decomposition levels for feature extraction. Nevertheless, this technique suffers from a main drawback since it necessitates a huge storage and is computationally expensive. To decrease the dimensions of the feature vector and the computing time, PCA technique is considered. In the last stage, according to different extracted features, the brain tumor is classified into either benign or malignant tumor using Support Vector Machine (SVM) algorithm. A CAD tool for brain tumor detection and classification, including all above-mentioned stages, is designed and developed using MATLAB guide user interface.

Keywords: MRI, brain tumor, CAD, feature extraction, DWT, PCA, classification, SVM

Procedia PDF Downloads 246
20821 A Comparative Study for Various Techniques Using WEKA for Red Blood Cells Classification

Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBC) are the most common types of blood cells and are the most intensively studied in cell biology. The lack of RBCs is a condition in which the amount of hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs will affect the exchange of oxygen. This paper presents a comparative study for various techniques for classifyig the red blood cells as normal, or abnormal (anemic) using WEKA. WEKA is an open source consists of different machine learning algorithms for data mining applications. The algorithm tested are Radial Basis Function neural network, Support vector machine, and K-Nearest Neighbors algorithm. Two sets of combined features were utilized for classification of blood cells images. The first set, exclusively consist of geometrical features, was used to identify whether the tested blood cell has a spherical shape or non-spherical cells. While the second set, consist mainly of textural features was used to recognize the types of the spherical cells. We have provided an evaluation based on applying these classification methods to our RBCs image dataset which were obtained from Serdang Hospital-Malaysia, and measuring the accuracy of test results. The best achieved classification rates are 97%, 98%, and 79% for Support vector machines, Radial Basis Function neural network, and K-Nearest Neighbors algorithm respectively

Keywords: red blood cells, classification, radial basis function neural networks, suport vector machine, k-nearest neighbors algorithm

Procedia PDF Downloads 479
20820 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah

Abstract:

Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, the semantic interpretation is a challenging task of HSI analysis. We focused in this paper on object classification as HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: The spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. Therefore, the high number of spectral bands and the low number of training samples pose the problem of the curse of dimensionality. In order to resolve this problem, we propose to introduce the process of dimensionality reduction trying to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on spatial hypergraph embedding model to represent higher order relationships with different weights of the spatial neighbors corresponding to the centroid of pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.

Keywords: dimensionality reduction, hyperspectral image, semantic interpretation, spatial hypergraph

Procedia PDF Downloads 305
20819 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 161
20818 Towards a Balancing Medical Database by Using the Least Mean Square Algorithm

Authors: Kamel Belammi, Houria Fatrim

Abstract:

imbalanced data set, a problem often found in real world application, can cause seriously negative effect on classification performance of machine learning algorithms. There have been many attempts at dealing with classification of imbalanced data sets. In medical diagnosis classification, we often face the imbalanced number of data samples between the classes in which there are not enough samples in rare classes. In this paper, we proposed a learning method based on a cost sensitive extension of Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weight and some rules of thumb to determine those weights. After the balancing phase, we applythe different classifiers (support vector machine (SVM), k- nearest neighbor (KNN) and multilayer neuronal networks (MNN)) for balanced data set. We have also compared the obtained results before and after balancing method.

Keywords: multilayer neural networks, k- nearest neighbor, support vector machine, imbalanced medical data, least mean square algorithm, diabetes

Procedia PDF Downloads 531
20817 Unsupervised Classification of DNA Barcodes Species Using Multi-Library Wavelet Networks

Authors: Abdesselem Dakhli, Wajdi Bellil, Chokri Ben Amar

Abstract:

DNA Barcode, a short mitochondrial DNA fragment, made up of three subunits; a phosphate group, sugar and nucleic bases (A, T, C, and G). They provide good sources of information needed to classify living species. Such intuition has been confirmed by many experimental results. Species classification with DNA Barcode sequences has been studied by several researchers. The classification problem assigns unknown species to known ones by analyzing their Barcode. This task has to be supported with reliable methods and algorithms. To analyze species regions or entire genomes, it becomes necessary to use similarity sequence methods. A large set of sequences can be simultaneously compared using Multiple Sequence Alignment which is known to be NP-complete. To make this type of analysis feasible, heuristics, like progressive alignment, have been developed. Another tool for similarity search against a database of sequences is BLAST, which outputs shorter regions of high similarity between a query sequence and matched sequences in the database. However, all these methods are still computationally very expensive and require significant computational infrastructure. Our goal is to build predictive models that are highly accurate and interpretable. This method permits to avoid the complex problem of form and structure in different classes of organisms. On empirical data and their classification performances are compared with other methods. Our system consists of three phases. The first is called transformation, which is composed of three steps; Electron-Ion Interaction Pseudopotential (EIIP) for the codification of DNA Barcodes, Fourier Transform and Power Spectrum Signal Processing. The second is called approximation, which is empowered by the use of Multi Llibrary Wavelet Neural Networks (MLWNN).The third is called the classification of DNA Barcodes, which is realized by applying the algorithm of hierarchical classification.

Keywords: DNA barcode, electron-ion interaction pseudopotential, Multi Library Wavelet Neural Networks (MLWNN)

Procedia PDF Downloads 316
20816 Determinants of Aggregate Electricity Consumption in Ghana: A Multivariate Time Series Analysis

Authors: Renata Konadu

Abstract:

In Ghana, electricity has become the main form of energy which all sectors of the economy rely on for their businesses. Therefore, as the economy grows, the demand and consumption of electricity also grow alongside due to the heavy dependence on it. However, since the supply of electricity has not increased to match the demand, there has been frequent power outages and load shedding affecting business performances. To solve this problem and advance policies to secure electricity in Ghana, it is imperative that those factors that cause consumption to increase be analysed by considering the three classes of consumers; residential, industrial and non-residential. The main argument, however, is that, export of electricity to other neighbouring countries should be included in the electricity consumption model and considered as one of the significant factors which can decrease or increase consumption. The author made use of multivariate time series data from 1980-2010 and econometric models such as Ordinary Least Squares (OLS) and Vector Error Correction Model. Findings show that GDP growth, urban population growth, electricity exports and industry value added to GDP were cointegrated. The results also showed that there is unidirectional causality from electricity export and GDP growth and Industry value added to GDP to electricity consumption in the long run. However, in the short run, there was found to be a directional causality among all the variables and electricity consumption. The results have useful implication for energy policy makers especially with regards to electricity consumption, demand, and supply.

Keywords: electricity consumption, energy policy, GDP growth, vector error correction model

Procedia PDF Downloads 433
20815 Enhanced Arabic Semantic Information Retrieval System Based on Arabic Text Classification

Authors: A. Elsehemy, M. Abdeen , T. Nazmy

Abstract:

Since the appearance of the Semantic web, many semantic search techniques and models were proposed to exploit the information in ontology to enhance the traditional keyword-based search. Many advances were made in languages such as English, German, French and Spanish. However, other languages such as Arabic are not fully supported yet. In this paper we present a framework for ontology based information retrieval for Arabic language. Our system consists of four main modules, namely query parser, indexer, search and a ranking module. Our approach includes building a semantic index by linking ontology concepts to documents, including an annotation weight for each link, to be used in ranking the results. We also augmented the framework with an automatic document categorizer, which enhances the overall document ranking. We have built three Arabic domain ontologies: Sports, Economic and Politics as example for the Arabic language. We built a knowledge base that consists of 79 classes and more than 1456 instances. The system is evaluated using the precision and recall metrics. We have done many retrieval operations on a sample of 40,316 documents with a size 320 MB of pure text. The results show that the semantic search enhanced with text classification gives better performance results than the system without classification.

Keywords: Arabic text classification, ontology based retrieval, Arabic semantic web, information retrieval, Arabic ontology

Procedia PDF Downloads 523
20814 Degree of Approximation of Functions Conjugate to Periodic Functions Belonging to Lipschitz Classes by Product Matrix Means

Authors: Smita Sonker

Abstract:

Various investigators have determined the degree of approximation of conjugate signals (functions) of functions belonging to different classes Lipα, Lip(α,p), Lip(ξ(t),p), W(Lr,ξ(t), (β ≥ 0)) by matrix summability means, lower triangular matrix operator, product means (i.e. (C,1)(E,1), (C,1)(E,q), (E,q)(C,1) (N,p,q)(E,1), and (E,q)(N,pn) of their conjugate trigonometric Fourier series. In this paper, we shall determine the degree of approximation of 2π-periodic function conjugate functions of f belonging to the function classes Lipα and W(Lr; ξ(t); (β ≥ 0)) by (C1.T) -means of their conjugate trigonometric Fourier series. On the other hand, we shall review above-mentioned work in the light of Lenski.

Keywords: signals, trigonometric fourier approximation, class W(L^r, \xi(t), conjugate fourier series

Procedia PDF Downloads 395
20813 Normal and Peaberry Coffee Beans Classification from Green Coffee Bean Images Using Convolutional Neural Networks and Support Vector Machine

Authors: Hira Lal Gope, Hidekazu Fukai

Abstract:

The aim of this study is to develop a system which can identify and sort peaberries automatically at low cost for coffee producers in developing countries. In this paper, the focus is on the classification of peaberries and normal coffee beans using image processing and machine learning techniques. The peaberry is not bad and not a normal bean. The peaberry is born in an only single seed, relatively round seed from a coffee cherry instead of the usual flat-sided pair of beans. It has another value and flavor. To make the taste of the coffee better, it is necessary to separate the peaberry and normal bean before green coffee beans roasting. Otherwise, the taste of total beans will be mixed, and it will be bad. In roaster procedure time, all the beans shape, size, and weight must be unique; otherwise, the larger bean will take more time for roasting inside. The peaberry has a different size and different shape even though they have the same weight as normal beans. The peaberry roasts slower than other normal beans. Therefore, neither technique provides a good option to select the peaberries. Defect beans, e.g., sour, broken, black, and fade bean, are easy to check and pick up manually by hand. On the other hand, the peaberry pick up is very difficult even for trained specialists because the shape and color of the peaberry are similar to normal beans. In this study, we use image processing and machine learning techniques to discriminate the normal and peaberry bean as a part of the sorting system. As the first step, we applied Deep Convolutional Neural Networks (CNN) and Support Vector Machine (SVM) as machine learning techniques to discriminate the peaberry and normal bean. As a result, better performance was obtained with CNN than with SVM for the discrimination of the peaberry. The trained artificial neural network with high performance CPU and GPU in this work will be simply installed into the inexpensive and low in calculation Raspberry Pi system. We assume that this system will be used in under developed countries. The study evaluates and compares the feasibility of the methods in terms of accuracy of classification and processing speed.

Keywords: convolutional neural networks, coffee bean, peaberry, sorting, support vector machine

Procedia PDF Downloads 144
20812 A Machine Learning Approach for Assessment of Tremor: A Neurological Movement Disorder

Authors: Rajesh Ranjan, Marimuthu Palaniswami, A. A. Hashmi

Abstract:

With the changing lifestyle and environment around us, the prevalence of the critical and incurable disease has proliferated. One such condition is the neurological disorder which is rampant among the old age population and is increasing at an unstoppable rate. Most of the neurological disorder patients suffer from some movement disorder affecting the movement of their body parts. Tremor is the most common movement disorder which is prevalent in such patients that infect the upper or lower limbs or both extremities. The tremor symptoms are commonly visible in Parkinson’s disease patient, and it can also be a pure tremor (essential tremor). The patients suffering from tremor face enormous trouble in performing the daily activity, and they always need a caretaker for assistance. In the clinics, the assessment of tremor is done through a manual clinical rating task such as Unified Parkinson’s disease rating scale which is time taking and cumbersome. Neurologists have also affirmed a challenge in differentiating a Parkinsonian tremor with the pure tremor which is essential in providing an accurate diagnosis. Therefore, there is a need to develop a monitoring and assistive tool for the tremor patient that keep on checking their health condition by coordinating them with the clinicians and caretakers for early diagnosis and assistance in performing the daily activity. In our research, we focus on developing a system for automatic classification of tremor which can accurately differentiate the pure tremor from the Parkinsonian tremor using a wearable accelerometer-based device, so that adequate diagnosis can be provided to the correct patient. In this research, a study was conducted in the neuro-clinic to assess the upper wrist movement of the patient suffering from Pure (Essential) tremor and Parkinsonian tremor using a wearable accelerometer-based device. Four tasks were designed in accordance with Unified Parkinson’s disease motor rating scale which is used to assess the rest, postural, intentional and action tremor in such patient. Various features such as time-frequency domain, wavelet-based and fast-Fourier transform based cross-correlation were extracted from the tri-axial signal which was used as input feature vector space for the different supervised and unsupervised learning tools for quantification of severity of tremor. A minimum covariance maximum correlation energy comparison index was also developed which was used as the input feature for various classification tools for distinguishing the PT and ET tremor types. An automatic system for efficient classification of tremor was developed using feature extraction methods, and superior performance was achieved using K-nearest neighbors and Support Vector Machine classifiers respectively.

Keywords: machine learning approach for neurological disorder assessment, automatic classification of tremor types, feature extraction method for tremor classification, neurological movement disorder, parkinsonian tremor, essential tremor

Procedia PDF Downloads 153
20811 Diagnosis of the Heart Rhythm Disorders by Using Hybrid Classifiers

Authors: Sule Yucelbas, Gulay Tezel, Cuneyt Yucelbas, Seral Ozsen

Abstract:

In this study, it was tried to identify some heart rhythm disorders by electrocardiography (ECG) data that is taken from MIT-BIH arrhythmia database by subtracting the required features, presenting to artificial neural networks (ANN), artificial immune systems (AIS), artificial neural network based on artificial immune system (AIS-ANN) and particle swarm optimization based artificial neural network (PSO-NN) classifier systems. The main purpose of this study is to evaluate the performance of hybrid AIS-ANN and PSO-ANN classifiers with regard to the ANN and AIS. For this purpose, the normal sinus rhythm (NSR), atrial premature contraction (APC), sinus arrhythmia (SA), ventricular trigeminy (VTI), ventricular tachycardia (VTK) and atrial fibrillation (AF) data for each of the RR intervals were found. Then these data in the form of pairs (NSR-APC, NSR-SA, NSR-VTI, NSR-VTK and NSR-AF) is created by combining discrete wavelet transform which is applied to each of these two groups of data and two different data sets with 9 and 27 features were obtained from each of them after data reduction. Afterwards, the data randomly was firstly mixed within themselves, and then 4-fold cross validation method was applied to create the training and testing data. The training and testing accuracy rates and training time are compared with each other. As a result, performances of the hybrid classification systems, AIS-ANN and PSO-ANN were seen to be close to the performance of the ANN system. Also, the results of the hybrid systems were much better than AIS, too. However, ANN had much shorter period of training time than other systems. In terms of training times, ANN was followed by PSO-ANN, AIS-ANN and AIS systems respectively. Also, the features that extracted from the data affected the classification results significantly.

Keywords: AIS, ANN, ECG, hybrid classifiers, PSO

Procedia PDF Downloads 442
20810 Estimating Tree Height and Forest Classification from Multi Temporal Risat-1 HH and HV Polarized Satellite Aperture Radar Interferometric Phase Data

Authors: Saurav Kumar Suman, P. Karthigayani

Abstract:

In this paper the height of the tree is estimated and forest types is classified from the multi temporal RISAT-1 Horizontal-Horizontal (HH) and Horizontal-Vertical (HV) Polarised Satellite Aperture Radar (SAR) data. The novelty of the proposed project is combined use of the Back-scattering Coefficients (Sigma Naught) and the Coherence. It uses Water Cloud Model (WCM). The approaches use two main steps. (a) Extraction of the different forest parameter data from the Product.xml, BAND-META file and from Grid-xxx.txt file come with the HH & HV polarized data from the ISRO (Indian Space Research Centre). These file contains the required parameter during height estimation. (b) Calculation of the Vegetation and Ground Backscattering, Coherence and other Forest Parameters. (c) Classification of Forest Types using the ENVI 5.0 Tool and ROI (Region of Interest) calculation.

Keywords: RISAT-1, classification, forest, SAR data

Procedia PDF Downloads 404
20809 AI-Driven Forecasting Models for Anticipating Oil Market Trends and Demand

Authors: Gaurav Kumar Sinha

Abstract:

The volatility of the oil market, influenced by geopolitical, economic, and environmental factors, presents significant challenges for stakeholders in predicting trends and demand. This article explores the application of artificial intelligence (AI) in developing robust forecasting models to anticipate changes in the oil market more accurately. We delve into various AI techniques, including machine learning, deep learning, and time series analysis, that have been adapted to analyze historical data and current market conditions to forecast future trends. The study evaluates the effectiveness of these models in capturing complex patterns and dependencies in market data, which traditional forecasting methods often miss. Additionally, the paper discusses the integration of external variables such as political events, economic policies, and technological advancements that influence oil prices and demand. By leveraging AI, stakeholders can achieve a more nuanced understanding of market dynamics, enabling better strategic planning and risk management. The article concludes with a discussion on the potential of AI-driven models in enhancing the predictive accuracy of oil market forecasts and their implications for global economic planning and strategic resource allocation.

Keywords: AI forecasting, oil market trends, machine learning, deep learning, time series analysis, predictive analytics, economic factors, geopolitical influence, technological advancements, strategic planning

Procedia PDF Downloads 33
20808 The Impact of Natural Resources on Financial Development: The Global Perspective

Authors: Remy Jonkam Oben

Abstract:

Using a time series approach, this study investigates how natural resources impact financial development from a global perspective over the 1980-2019 period. Some important determinants of financial development (economic growth, trade openness, population growth, and investment) have been added to the model as control variables. Unit root tests have revealed that all the variables are integrated into order one. Johansen's cointegration test has shown that the variables are in a long-run equilibrium relationship. The vector error correction model (VECM) has estimated the coefficient of the error correction term (ECT), which suggests that the short-run values of natural resources, economic growth, trade openness, population growth, and investment contribute to financial development converging to its long-run equilibrium level by a 23.63% annual speed of adjustment. The estimated coefficients suggest that global natural resource rent has a statistically-significant negative impact on global financial development in the long-run (thereby validating the financial resource curse) but not in the short-run. Causality test results imply that neither global natural resource rent nor global financial development Granger-causes each other.

Keywords: financial development, natural resources, resource curse hypothesis, time series analysis, Granger causality, global perspective

Procedia PDF Downloads 168
20807 Ensemble-Based SVM Classification Approach for miRNA Prediction

Authors: Sondos M. Hammad, Sherin M. ElGokhy, Mahmoud M. Fahmy, Elsayed A. Sallam

Abstract:

In this paper, an ensemble-based Support Vector Machine (SVM) classification approach is proposed. It is used for miRNA prediction. Three problems, commonly associated with previous approaches, are alleviated. These problems arise due to impose assumptions on the secondary structural of premiRNA, imbalance between the numbers of the laboratory checked miRNAs and the pseudo-hairpins, and finally using a training data set that does not consider all the varieties of samples in different species. We aggregate the predicted outputs of three well-known SVM classifiers; namely, Triplet-SVM, Virgo and Mirident, weighted by their variant features without any structural assumptions. An additional SVM layer is used in aggregating the final output. The proposed approach is trained and then tested with balanced data sets. The results of the proposed approach outperform the three base classifiers. Improved values for the metrics of 88.88% f-score, 92.73% accuracy, 90.64% precision, 96.64% specificity, 87.2% sensitivity, and the area under the ROC curve is 0.91 are achieved.

Keywords: MiRNAs, SVM classification, ensemble algorithm, assumption problem, imbalance data

Procedia PDF Downloads 347
20806 A Bayesian Population Model to Estimate Reference Points of Bombay-Duck (Harpadon nehereus) in Bay of Bengal, Bangladesh Using CMSY and BSM

Authors: Ahmad Rabby

Abstract:

The demographic trend analyses of Bombay-duck from time series catch data using CMSY and BSM for the first time in Bangladesh. During 2000-2018, CMSY indicates average lowest production in 2000 and highest in 2018. This has been used in the estimation of prior biomass by the default rules. Possible 31030 viable trajectories for 3422 r-k pairs were found by the CMSY analysis and the final estimates for intrinsic rate of population increase (r) was 1.19 year-1 with 95% CL= 0.957-1.48 year-1. The carrying capacity(k) of Bombay-duck was 283×103 tons with 95% CL=173×103 - 464×103 tons and MSY was 84.3×103tons year-1, 95% CL=49.1×103-145×103 tons year-1. Results from Bayesian state-space implementation of the Schaefer production model (BSM) using catch & CPUE data, found catchabilitiy coefficient(q) was 1.63 ×10-6 from lcl=1.27×10-6 to ucl=2.10×10-6 and r= 1.06 year-1 with 95% CL= 0.727 - 1.55 year-1, k was 226×103 tons with 95% CL=170×103-301×103 tons and MSY was 60×103 tons year-1 with 95% CL=49.9 ×103- 72.2 ×103 tons year-1. Results for Bombay-duck fishery management based on BSM assessment from time series catch data illustrated that, Fmsy=0.531 with 95% CL =0.364 - 0.775 (if B > 1/2 Bmsy then Fmsy =0.5r); Fmsy=0.531 with 95% CL =0.364-0.775 (r and Fmsy are linearly reduced if B < 1/2Bmsy). Biomass in 2018 was 110×103 tons with 2.5th to 97.5th percentile=82.3-155×103 tons. Relative biomass (B/Bmsy) in last year was 0.972 from 2.5th percentile to 97.5th percentile=0.728 -1.37. Fishing mortality in last year was 0.738 with 2.5th-97.5th percentile=0.525-1.37. Exploitation F/Fmsy was 1.39, from 2.5th to 97.5th percentile it was 0.988 -1.86. The biological reference points of B/BMSY was smaller than 1.0, while F/FMSY was higher than 1.0 revealed an over-exploitation of the fishery, indicating that more conservative management strategies are required for Bombay-duck fishery.

Keywords: biological reference points, catchability coefficient, carrying capacity, intrinsic rate of population increase

Procedia PDF Downloads 124
20805 The Shannon Entropy and Multifractional Markets

Authors: Massimiliano Frezza, Sergio Bianchi, Augusto Pianese

Abstract:

Introduced by Shannon in 1948 in the field of information theory as the average rate at which information is produced by a stochastic set of data, the concept of entropy has gained much attention as a measure of uncertainty and unpredictability associated with a dynamical system, eventually depicted by a stochastic process. In particular, the Shannon entropy measures the degree of order/disorder of a given signal and provides useful information about the underlying dynamical process. It has found widespread application in a variety of fields, such as, for example, cryptography, statistical physics and finance. In this regard, many contributions have employed different measures of entropy in an attempt to characterize the financial time series in terms of market efficiency, market crashes and/or financial crises. The Shannon entropy has also been considered as a measure of the risk of a portfolio or as a tool in asset pricing. This work investigates the theoretical link between the Shannon entropy and the multifractional Brownian motion (mBm), stochastic process which recently is the focus of a renewed interest in finance as a driving model of stochastic volatility. In particular, after exploring the current state of research in this area and highlighting some of the key results and open questions that remain, we show a well-defined relationship between the Shannon (log)entropy and the memory function H(t) of the mBm. In details, we allow both the length of time series and time scale to change over analysis to study how the relation modify itself. On the one hand, applications are developed after generating surrogates of mBm trajectories based on different memory functions; on the other hand, an empirical analysis of several international stock indexes, which confirms the previous results, concludes the work.

Keywords: Shannon entropy, multifractional Brownian motion, Hurst–Holder exponent, stock indexes

Procedia PDF Downloads 110
20804 Use of Gaussian-Euclidean Hybrid Function Based Artificial Immune System for Breast Cancer Diagnosis

Authors: Cuneyt Yucelbas, Seral Ozsen, Sule Yucelbas, Gulay Tezel

Abstract:

Due to the fact that there exist only a small number of complex systems in artificial immune system (AIS) that work out nonlinear problems, nonlinear AIS approaches, among the well-known solution techniques, need to be developed. Gaussian function is usually used as similarity estimation in classification problems and pattern recognition. In this study, diagnosis of breast cancer, the second type of the most widespread cancer in women, was performed with different distance calculation functions that euclidean, gaussian and gaussian-euclidean hybrid function in the clonal selection model of classical AIS on Wisconsin Breast Cancer Dataset (WBCD), which was taken from the University of California, Irvine Machine-Learning Repository. We used 3-fold cross validation method to train and test the dataset. According to the results, the maximum test classification accuracy was reported as 97.35% by using of gaussian-euclidean hybrid function for fold-3. Also, mean of test classification accuracies for all of functions were obtained as 94.78%, 94.45% and 95.31% with use of euclidean, gaussian and gaussian-euclidean, respectively. With these results, gaussian-euclidean hybrid function seems to be a potential distance calculation method, and it may be considered as an alternative distance calculation method for hard nonlinear classification problems.

Keywords: artificial immune system, breast cancer diagnosis, Euclidean function, Gaussian function

Procedia PDF Downloads 433