Search results for: classification model
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 18064

Search results for: classification model

17704 Molecular Topology and TLC Retention Behaviour of s-Triazines: QSRR Study

Authors: Lidija R. Jevrić, Sanja O. Podunavac-Kuzmanović, Strahinja Z. Kovačević

Abstract:

Quantitative structure-retention relationship (QSRR) analysis was used to predict the chromatographic behavior of s-triazine derivatives by using theoretical descriptors computed from the chemical structure. Fundamental basis of the reported investigation is to relate molecular topological descriptors with chromatographic behavior of s-triazine derivatives obtained by reversed-phase (RP) thin layer chromatography (TLC) on silica gel impregnated with paraffin oil and applied ethanol-water (φ = 0.5-0.8; v/v). Retention parameter (RM0) of 14 investigated s-triazine derivatives was used as dependent variable while simple connectivity index different orders were used as independent variables. The best QSRR model for predicting RM0 value was obtained with simple third order connectivity index (3χ) in the second-degree polynomial equation. Numerical values of the correlation coefficient (r=0.915), Fisher's value (F=28.34) and root mean square error (RMSE = 0.36) indicate that model is statistically significant. In order to test the predictive power of the QSRR model leave-one-out cross-validation technique has been applied. The parameters of the internal cross-validation analysis (r2CV=0.79, r2adj=0.81, PRESS=1.89) reflect the high predictive ability of the generated model and it confirms that can be used to predict RM0 value. Multivariate classification technique, hierarchical cluster analysis (HCA), has been applied in order to group molecules according to their molecular connectivity indices. HCA is a descriptive statistical method and it is the most frequently used for important area of data processing such is classification. The HCA performed on simple molecular connectivity indices obtained from the 2D structure of investigated s-triazine compounds resulted in two main clusters in which compounds molecules were grouped according to the number of atoms in the molecule. This is in agreement with the fact that these descriptors were calculated on the basis of the number of atoms in the molecule of the investigated s-triazine derivatives.

Keywords: s-triazines, QSRR, chemometrics, chromatography, molecular descriptors

Procedia PDF Downloads 388
17703 Performance Enrichment of Deep Feed Forward Neural Network and Deep Belief Neural Networks for Fault Detection of Automobile Gearbox Using Vibration Signal

Authors: T. Praveenkumar, Kulpreet Singh, Divy Bhanpuriya, M. Saimurugan

Abstract:

This study analysed the classification accuracy for gearbox faults using Machine Learning Techniques. Gearboxes are widely used for mechanical power transmission in rotating machines. Its rotating components such as bearings, gears, and shafts tend to wear due to prolonged usage, causing fluctuating vibrations. Increasing the dependability of mechanical components like a gearbox is hampered by their sealed design, which makes visual inspection difficult. One way of detecting impending failure is to detect a change in the vibration signature. The current study proposes various machine learning algorithms, with aid of these vibration signals for obtaining the fault classification accuracy of an automotive 4-Speed synchromesh gearbox. Experimental data in the form of vibration signals were acquired from a 4-Speed synchromesh gearbox using Data Acquisition System (DAQs). Statistical features were extracted from the acquired vibration signal under various operating conditions. Then the extracted features were given as input to the algorithms for fault classification. Supervised Machine Learning algorithms such as Support Vector Machines (SVM) and unsupervised algorithms such as Deep Feed Forward Neural Network (DFFNN), Deep Belief Networks (DBN) algorithms are used for fault classification. The fusion of DBN & DFFNN classifiers were architected to further enhance the classification accuracy and to reduce the computational complexity. The fault classification accuracy for each algorithm was thoroughly studied, tabulated, and graphically analysed for fused and individual algorithms. In conclusion, the fusion of DBN and DFFNN algorithm yielded the better classification accuracy and was selected for fault detection due to its faster computational processing and greater efficiency.

Keywords: deep belief networks, DBN, deep feed forward neural network, DFFNN, fault diagnosis, fusion of algorithm, vibration signal

Procedia PDF Downloads 105
17702 Investigation of Topic Modeling-Based Semi-Supervised Interpretable Document Classifier

Authors: Dasom Kim, William Xiu Shun Wong, Yoonjin Hyun, Donghoon Lee, Minji Paek, Sungho Byun, Namgyu Kim

Abstract:

There have been many researches on document classification for classifying voluminous documents automatically. Through document classification, we can assign a specific category to each unlabeled document on the basis of various machine learning algorithms. However, providing labeled documents manually requires considerable time and effort. To overcome the limitations, the semi-supervised learning which uses unlabeled document as well as labeled documents has been invented. However, traditional document classifiers, regardless of supervised or semi-supervised ones, cannot sufficiently explain the reason or the process of the classification. Thus, in this paper, we proposed a methodology to visualize major topics and class components of each document. We believe that our methodology for visualizing topics and classes of each document can enhance the reliability and explanatory power of document classifiers.

Keywords: data mining, document classifier, text mining, topic modeling

Procedia PDF Downloads 394
17701 Automatic Classification for the Degree of Disc Narrowing from X-Ray Images Using CNN

Authors: Kwangmin Joo

Abstract:

Automatic detection of lumbar vertebrae and classification method is proposed for evaluating the degree of disc narrowing. Prior to classification, deep learning based segmentation is applied to detect individual lumbar vertebra. M-net is applied to segment five lumbar vertebrae and fine-tuning segmentation is employed to improve the accuracy of segmentation. Using the features extracted from previous step, clustering technique, k-means clustering, is applied to estimate the degree of disc space narrowing under four grade scoring system. As preliminary study, techniques proposed in this research could help building an automatic scoring system to diagnose the severity of disc narrowing from X-ray images.

Keywords: Disc space narrowing, Degenerative disc disorders, Deep learning based segmentation, Clustering technique

Procedia PDF Downloads 119
17700 Interaction with Earth’s Surface in Remote Sensing

Authors: Spoorthi Sripad

Abstract:

Remote sensing is a powerful tool for acquiring information about the Earth's surface without direct contact, relying on the interaction of electromagnetic radiation with various materials and features. This paper explores the fundamental principle of "Interaction with Earth's Surface" in remote sensing, shedding light on the intricate processes that occur when electromagnetic waves encounter different surfaces. The absorption, reflection, and transmission of radiation generate distinct spectral signatures, allowing for the identification and classification of surface materials. The paper delves into the significance of the visible, infrared, and thermal infrared regions of the electromagnetic spectrum, highlighting how their unique interactions contribute to a wealth of applications, from land cover classification to environmental monitoring. The discussion encompasses the types of sensors and platforms used to capture these interactions, including multispectral and hyperspectral imaging systems. By examining real-world applications, such as land cover classification and environmental monitoring, the paper underscores the critical role of understanding the interaction with the Earth's surface for accurate and meaningful interpretation of remote sensing data.

Keywords: remote sensing, earth's surface interaction, electromagnetic radiation, spectral signatures, land cover classification, archeology and cultural heritage preservation

Procedia PDF Downloads 48
17699 Analysis of Matching Pursuit Features of EEG Signal for Mental Tasks Classification

Authors: Zin Mar Lwin

Abstract:

Brain Computer Interface (BCI) Systems have developed for people who suffer from severe motor disabilities and challenging to communicate with their environment. BCI allows them for communication by a non-muscular way. For communication between human and computer, BCI uses a type of signal called Electroencephalogram (EEG) signal which is recorded from the human„s brain by means of an electrode. The electroencephalogram (EEG) signal is an important information source for knowing brain processes for the non-invasive BCI. Translating human‟s thought, it needs to classify acquired EEG signal accurately. This paper proposed a typical EEG signal classification system which experiments the Dataset from “Purdue University.” Independent Component Analysis (ICA) method via EEGLab Tools for removing artifacts which are caused by eye blinks. For features extraction, the Time and Frequency features of non-stationary EEG signals are extracted by Matching Pursuit (MP) algorithm. The classification of one of five mental tasks is performed by Multi_Class Support Vector Machine (SVM). For SVMs, the comparisons have been carried out for both 1-against-1 and 1-against-all methods.

Keywords: BCI, EEG, ICA, SVM

Procedia PDF Downloads 270
17698 A GIS Based Approach in District Peshawar, Pakistan for Groundwater Vulnerability Assessment Using DRASTIC Model

Authors: Syed Adnan, Javed Iqbal

Abstract:

In urban and rural areas groundwater is the most economic natural source of drinking. Groundwater resources of Pakistan are degraded due to high population growth and increased industrial development. A study was conducted in district Peshawar to assess groundwater vulnerable zones using GIS based DRASTIC model. Six input parameters (groundwater depth, groundwater recharge, aquifer material, soil type, slope and hydraulic conductivity) were used in the DRASTIC model to generate the groundwater vulnerable zones. Each parameter was divided into different ranges or media types and a subjective rating from 1-10 was assigned to each factor where 1 represented very low impact on pollution potential and 10 represented very high impact. Weight multiplier from 1-5 was used to balance and enhance the importance of each factor. The DRASTIC model scores obtained varied from 47 to 147. Using quantile classification scheme these values were reclassified into three zones i.e. low, moderate and high vulnerable zones. The areas of these zones were calculated. The final result indicated that about 400 km2, 506 km2, and 375 km2 were classified as low, moderate, and high vulnerable areas, respectively. It is recommended that the most vulnerable zones should be treated on first priority to facilitate the inhabitants for drinking purposes.

Keywords: DRASTIC model, groundwater vulnerability, GIS in groundwater, drinking sources

Procedia PDF Downloads 443
17697 Plant Identification Using Convolution Neural Network and Vision Transformer-Based Models

Authors: Virender Singh, Mathew Rees, Simon Hampton, Sivaram Annadurai

Abstract:

Plant identification is a challenging task that aims to identify the family, genus, and species according to plant morphological features. Automated deep learning-based computer vision algorithms are widely used for identifying plants and can help users narrow down the possibilities. However, numerous morphological similarities between and within species render correct classification difficult. In this paper, we tested custom convolution neural network (CNN) and vision transformer (ViT) based models using the PyTorch framework to classify plants. We used a large dataset of 88,000 provided by the Royal Horticultural Society (RHS) and a smaller dataset of 16,000 images from the PlantClef 2015 dataset for classifying plants at genus and species levels, respectively. Our results show that for classifying plants at the genus level, ViT models perform better compared to CNN-based models ResNet50 and ResNet-RS-420 and other state-of-the-art CNN-based models suggested in previous studies on a similar dataset. ViT model achieved top accuracy of 83.3% for classifying plants at the genus level. For classifying plants at the species level, ViT models perform better compared to CNN-based models ResNet50 and ResNet-RS-420, with a top accuracy of 92.5%. We show that the correct set of augmentation techniques plays an important role in classification success. In conclusion, these results could help end users, professionals and the general public alike in identifying plants quicker and with improved accuracy.

Keywords: plant identification, CNN, image processing, vision transformer, classification

Procedia PDF Downloads 90
17696 Methods for Enhancing Ensemble Learning or Improving Classifiers of This Technique in the Analysis and Classification of Brain Signals

Authors: Seyed Mehdi Ghezi, Hesam Hasanpoor

Abstract:

This scientific article explores enhancement methods for ensemble learning with the aim of improving the performance of classifiers in the analysis and classification of brain signals. The research approach in this field consists of two main parts, each with its own strengths and weaknesses. The choice of approach depends on the specific research question and available resources. By combining these approaches and leveraging their respective strengths, researchers can enhance the accuracy and reliability of classification results, consequently advancing our understanding of the brain and its functions. The first approach focuses on utilizing machine learning methods to identify the best features among the vast array of features present in brain signals. The selection of features varies depending on the research objective, and different techniques have been employed for this purpose. For instance, the genetic algorithm has been used in some studies to identify the best features, while optimization methods have been utilized in others to identify the most influential features. Additionally, machine learning techniques have been applied to determine the influential electrodes in classification. Ensemble learning plays a crucial role in identifying the best features that contribute to learning, thereby improving the overall results. The second approach concentrates on designing and implementing methods for selecting the best classifier or utilizing meta-classifiers to enhance the final results in ensemble learning. In a different section of the research, a single classifier is used instead of multiple classifiers, employing different sets of features to improve the results. The article provides an in-depth examination of each technique, highlighting their advantages and limitations. By integrating these techniques, researchers can enhance the performance of classifiers in the analysis and classification of brain signals. This advancement in ensemble learning methodologies contributes to a better understanding of the brain and its functions, ultimately leading to improved accuracy and reliability in brain signal analysis and classification.

Keywords: ensemble learning, brain signals, classification, feature selection, machine learning, genetic algorithm, optimization methods, influential features, influential electrodes, meta-classifiers

Procedia PDF Downloads 69
17695 Text Emotion Recognition by Multi-Head Attention based Bidirectional LSTM Utilizing Multi-Level Classification

Authors: Vishwanath Pethri Kamath, Jayantha Gowda Sarapanahalli, Vishal Mishra, Siddhesh Balwant Bandgar

Abstract:

Recognition of emotional information is essential in any form of communication. Growing HCI (Human-Computer Interaction) in recent times indicates the importance of understanding of emotions expressed and becomes crucial for improving the system or the interaction itself. In this research work, textual data for emotion recognition is used. The text being the least expressive amongst the multimodal resources poses various challenges such as contextual information and also sequential nature of the language construction. In this research work, the proposal is made for a neural architecture to resolve not less than 8 emotions from textual data sources derived from multiple datasets using google pre-trained word2vec word embeddings and a Multi-head attention-based bidirectional LSTM model with a one-vs-all Multi-Level Classification. The emotions targeted in this research are Anger, Disgust, Fear, Guilt, Joy, Sadness, Shame, and Surprise. Textual data from multiple datasets were used for this research work such as ISEAR, Go Emotions, Affect datasets for creating the emotions’ dataset. Data samples overlap or conflicts were considered with careful preprocessing. Our results show a significant improvement with the modeling architecture and as good as 10 points improvement in recognizing some emotions.

Keywords: text emotion recognition, bidirectional LSTM, multi-head attention, multi-level classification, google word2vec word embeddings

Procedia PDF Downloads 171
17694 Common Orthodontic Indices and Classification in the United Kingdom

Authors: Ashwini Mohan, Haris Batley

Abstract:

An orthodontic index is used to rate or categorise an individual’s occlusion using a numeric or alphanumeric score. Indexing of malocclusions and their correction is important in epidemiology, diagnosis, communication between clinicians as well as their patients and assessing treatment outcomes. Many useful indices have been put forward, but to the author’s best knowledge, no one method to this day appears to be equally suitable for the use of epidemiologists, public health program planners and clinicians. This article describes the common clinical orthodontic indices and classifications used in United Kingdom.

Keywords: classification, indices, orthodontics, validity

Procedia PDF Downloads 143
17693 Detection and Classification of Myocardial Infarction Using New Extracted Features from Standard 12-Lead ECG Signals

Authors: Naser Safdarian, Nader Jafarnia Dabanloo

Abstract:

In this paper we used four features i.e. Q-wave integral, QRS complex integral, T-wave integral and total integral as extracted feature from normal and patient ECG signals to detection and localization of myocardial infarction (MI) in left ventricle of heart. In our research we focused on detection and localization of MI in standard ECG. We use the Q-wave integral and T-wave integral because this feature is important impression in detection of MI. We used some pattern recognition method such as Artificial Neural Network (ANN) to detect and localize the MI. Because these methods have good accuracy for classification of normal and abnormal signals. We used one type of Radial Basis Function (RBF) that called Probabilistic Neural Network (PNN) because of its nonlinearity property, and used other classifier such as k-Nearest Neighbors (KNN), Multilayer Perceptron (MLP) and Naive Bayes Classification. We used PhysioNet database as our training and test data. We reached over 80% for accuracy in test data for localization and over 95% for detection of MI. Main advantages of our method are simplicity and its good accuracy. Also we can improve accuracy of classification by adding more features in this method. A simple method based on using only four features which extracted from standard ECG is presented which has good accuracy in MI localization.

Keywords: ECG signal processing, myocardial infarction, features extraction, pattern recognition

Procedia PDF Downloads 451
17692 Using Deep Learning for the Detection of Faulty RJ45 Connectors on a Radio Base Station

Authors: Djamel Fawzi Hadj Sadok, Marrone Silvério Melo Dantas Pedro Henrique Dreyer, Gabriel Fonseca Reis de Souza, Daniel Bezerra, Ricardo Souza, Silvia Lins, Judith Kelner

Abstract:

A radio base station (RBS), part of the radio access network, is a particular type of equipment that supports the connection between a wide range of cellular user devices and an operator network access infrastructure. Nowadays, most of the RBS maintenance is carried out manually, resulting in a time consuming and costly task. A suitable candidate for RBS maintenance automation is repairing faulty links between devices caused by missing or unplugged connectors. A suitable candidate for RBS maintenance automation is repairing faulty links between devices caused by missing or unplugged connectors. This paper proposes and compares two deep learning solutions to identify attached RJ45 connectors on network ports. We named connector detection, the solution based on object detection, and connector classification, the one based on object classification. With the connector detection, we get an accuracy of 0:934, mean average precision 0:903. Connector classification, get a maximum accuracy of 0:981 and an AUC of 0:989. Although connector detection was outperformed in this study, this should not be viewed as an overall result as connector detection is more flexible for scenarios where there is no precise information about the environment and the possible devices. At the same time, the connector classification requires that information to be well-defined.

Keywords: radio base station, maintenance, classification, detection, deep learning, automation

Procedia PDF Downloads 194
17691 A Similarity Measure for Classification and Clustering in Image Based Medical and Text Based Banking Applications

Authors: K. P. Sandesh, M. H. Suman

Abstract:

Text processing plays an important role in information retrieval, data-mining, and web search. Measuring the similarity between the documents is an important operation in the text processing field. In this project, a new similarity measure is proposed. To compute the similarity between two documents with respect to a feature the proposed measure takes the following three cases into account: (1) The feature appears in both documents; (2) The feature appears in only one document and; (3) The feature appears in none of the documents. The proposed measure is extended to gauge the similarity between two sets of documents. The effectiveness of our measure is evaluated on several real-world data sets for text classification and clustering problems, especially in banking and health sectors. The results show that the performance obtained by the proposed measure is better than that achieved by the other measures.

Keywords: document classification, document clustering, entropy, accuracy, classifiers, clustering algorithms

Procedia PDF Downloads 514
17690 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The reason for conducting this research is to develop an algorithm that is capable of classifying news articles from the automobile industry, according to the competitive actions that they entail, with the use of Text Mining (TM) methods. It is needed to test how to properly preprocess the data for this research by preparing pipelines which fits each algorithm the best. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further, where several parameters of each algorithm are tested. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.

Keywords: Artificial Neural network, Competitive dynamics, Logistic Regression, Text classification, Text mining

Procedia PDF Downloads 116
17689 Improved Rare Species Identification Using Focal Loss Based Deep Learning Models

Authors: Chad Goldsworthy, B. Rajeswari Matam

Abstract:

The use of deep learning for species identification in camera trap images has revolutionised our ability to study, conserve and monitor species in a highly efficient and unobtrusive manner, with state-of-the-art models achieving accuracies surpassing the accuracy of manual human classification. The high imbalance of camera trap datasets, however, results in poor accuracies for minority (rare or endangered) species due to their relative insignificance to the overall model accuracy. This paper investigates the use of Focal Loss, in comparison to the traditional Cross Entropy Loss function, to improve the identification of minority species in the “255 Bird Species” dataset from Kaggle. The results show that, although Focal Loss slightly decreased the accuracy of the majority species, it was able to increase the F1-score by 0.06 and improve the identification of the bottom two, five and ten (minority) species by 37.5%, 15.7% and 10.8%, respectively, as well as resulting in an improved overall accuracy of 2.96%.

Keywords: convolutional neural networks, data imbalance, deep learning, focal loss, species classification, wildlife conservation

Procedia PDF Downloads 179
17688 Theoretical Discussion on the Classification of Risks in Supply Chain Management

Authors: Liane Marcia Freitas Silva, Fernando Augusto Silva Marins, Maria Silene Alexandre Leite

Abstract:

The adoption of a network structure, like in the supply chains, favors the increase of dependence between companies and, by consequence, their vulnerability. Environment disasters, sociopolitical and economical events, and the dynamics of supply chains elevate the uncertainty of their operation, favoring the occurrence of events that can generate break up in the operations and other undesired consequences. Thus, supply chains are exposed to various risks that can influence the profitability of companies involved, and there are several previous studies that have proposed risk classification models in order to categorize the risks and to manage them. The objective of this paper is to analyze and discuss thirty of these risk classification models by means a theoretical survey. The research method adopted for analyzing and discussion includes three phases: The identification of the types of risks proposed in each one of the thirty models, the grouping of them considering equivalent concepts associated to their definitions, and, the analysis of these risks groups, evaluating their similarities and differences. After these analyses, it was possible to conclude that, in fact, there is more than thirty risks types identified in the literature of Supply Chains, but some of them are identical despite of be used distinct terms to characterize them, because different criteria for risk classification are adopted by researchers. In short, it is observed that some types of risks are identified as risk source for supply chains, such as, demand risk, environmental risk and safety risk. On the other hand, other types of risks are identified by the consequences that they can generate for the supply chains, such as, the reputation risk, the asset depreciation risk and the competitive risk. These results are consequence of the disagreements between researchers on risk classification, mainly about what is risk event and about what is the consequence of risk occurrence. An additional study is in developing in order to clarify how the risks can be generated, and which are the characteristics of the components in a Supply Chain that leads to occurrence of risk.

Keywords: sisks classification, survey, supply chain management, theoretical discussion

Procedia PDF Downloads 624
17687 A Deep Reinforcement Learning-Based Secure Framework against Adversarial Attacks in Power System

Authors: Arshia Aflaki, Hadis Karimipour, Anik Islam

Abstract:

Generative Adversarial Attacks (GAAs) threaten critical sectors, ranging from fingerprint recognition to industrial control systems. Existing Deep Learning (DL) algorithms are not robust enough against this kind of cyber-attack. As one of the most critical industries in the world, the power grid is not an exception. In this study, a Deep Reinforcement Learning-based (DRL) framework assisting the DL model to improve the robustness of the model against generative adversarial attacks is proposed. Real-world smart grid stability data, as an IIoT dataset, test our method and improves the classification accuracy of a deep learning model from around 57 percent to 96 percent.

Keywords: generative adversarial attack, deep reinforcement learning, deep learning, IIoT, generative adversarial networks, power system

Procedia PDF Downloads 23
17686 Identification of High-Rise Buildings Using Object Based Classification and Shadow Extraction Techniques

Authors: Subham Kharel, Sudha Ravindranath, A. Vidya, B. Chandrasekaran, K. Ganesha Raj, T. Shesadri

Abstract:

Digitization of urban features is a tedious and time-consuming process when done manually. In addition to this problem, Indian cities have complex habitat patterns and convoluted clustering patterns, which make it even more difficult to map features. This paper makes an attempt to classify urban objects in the satellite image using object-oriented classification techniques in which various classes such as vegetation, water bodies, buildings, and shadows adjacent to the buildings were mapped semi-automatically. Building layer obtained as a result of object-oriented classification along with already available building layers was used. The main focus, however, lay in the extraction of high-rise buildings using spatial technology, digital image processing, and modeling, which would otherwise be a very difficult task to carry out manually. Results indicated a considerable rise in the total number of buildings in the city. High-rise buildings were successfully mapped using satellite imagery, spatial technology along with logical reasoning and mathematical considerations. The results clearly depict the ability of Remote Sensing and GIS to solve complex problems in urban scenarios like studying urban sprawl and identification of more complex features in an urban area like high-rise buildings and multi-dwelling units. Object-Oriented Technique has been proven to be effective and has yielded an overall efficiency of 80 percent in the classification of high-rise buildings.

Keywords: object oriented classification, shadow extraction, high-rise buildings, satellite imagery, spatial technology

Procedia PDF Downloads 144
17685 A Gene Selection Algorithm for Microarray Cancer Classification Using an Improved Particle Swarm Optimization

Authors: Arfan Ali Nagra, Tariq Shahzad, Meshal Alharbi, Khalid Masood Khan, Muhammad Mugees Asif, Taher M. Ghazal, Khmaies Ouahada

Abstract:

Gene selection is an essential step for the classification of microarray cancer data. Gene expression cancer data (DNA microarray) facilitates computing the robust and concurrent expression of various genes. Particle swarm optimization (PSO) requires simple operators and less number of parameters for tuning the model in gene selection. The selection of a prognostic gene with small redundancy is a great challenge for the researcher as there are a few complications in PSO based selection method. In this research, a new variant of PSO (Self-inertia weight adaptive PSO) has been proposed. In the proposed algorithm, SIW-APSO-ELM is explored to achieve gene selection prediction accuracies. This new algorithm balances the exploration capabilities of the improved inertia weight adaptive particle swarm optimization and the exploitation. The self-inertia weight adaptive particle swarm optimization (SIW-APSO) is used to search the solution. The SIW-APSO is updated with an evolutionary process in such a way that each particle iteratively improves its velocities and positions. The extreme learning machine (ELM) has been designed for the selection procedure. The proposed method has been to identify a number of genes in the cancer dataset. The classification algorithm contains ELM, K- centroid nearest neighbor (KCNN), and support vector machine (SVM) to attain high forecast accuracy as compared to the start-of-the-art methods on microarray cancer datasets that show the effectiveness of the proposed method.

Keywords: microarray cancer, improved PSO, ELM, SVM, evolutionary algorithms

Procedia PDF Downloads 77
17684 Algorithm for Modelling Land Surface Temperature and Land Cover Classification and Their Interaction

Authors: Jigg Pelayo, Ricardo Villar, Einstine Opiso

Abstract:

The rampant and unintended spread of urban areas resulted in increasing artificial component features in the land cover types of the countryside and bringing forth the urban heat island (UHI). This paved the way to wide range of negative influences on the human health and environment which commonly relates to air pollution, drought, higher energy demand, and water shortage. Land cover type also plays a relevant role in the process of understanding the interaction between ground surfaces with the local temperature. At the moment, the depiction of the land surface temperature (LST) at city/municipality scale particularly in certain areas of Misamis Oriental, Philippines is inadequate as support to efficient mitigations and adaptations of the surface urban heat island (SUHI). Thus, this study purposely attempts to provide application on the Landsat 8 satellite data and low density Light Detection and Ranging (LiDAR) products in mapping out quality automated LST model and crop-level land cover classification in a local scale, through theoretical and algorithm based approach utilizing the principle of data analysis subjected to multi-dimensional image object model. The paper also aims to explore the relationship between the derived LST and land cover classification. The results of the presented model showed the ability of comprehensive data analysis and GIS functionalities with the integration of object-based image analysis (OBIA) approach on automating complex maps production processes with considerable efficiency and high accuracy. The findings may potentially lead to expanded investigation of temporal dynamics of land surface UHI. It is worthwhile to note that the environmental significance of these interactions through combined application of remote sensing, geographic information tools, mathematical morphology and data analysis can provide microclimate perception, awareness and improved decision-making for land use planning and characterization at local and neighborhood scale. As a result, it can aid in facilitating problem identification, support mitigations and adaptations more efficiently.

Keywords: LiDAR, OBIA, remote sensing, local scale

Procedia PDF Downloads 280
17683 Modified Naive Bayes-Based Prediction Modeling for Crop Yield Prediction

Authors: Kefaya Qaddoum

Abstract:

Most of greenhouse growers desire a determined amount of yields in order to accurately meet market requirements. The purpose of this paper is to model a simple but often satisfactory supervised classification method. The original naive Bayes have a serious weakness, which is producing redundant predictors. In this paper, utilized regularization technique was used to obtain a computationally efficient classifier based on naive Bayes. The suggested construction, utilized L1-penalty, is capable of clearing redundant predictors, where a modification of the LARS algorithm is devised to solve this problem, making this method applicable to a wide range of data. In the experimental section, a study conducted to examine the effect of redundant and irrelevant predictors, and test the method on WSG data set for tomato yields, where there are many more predictors than data, and the urge need to predict weekly yield is the goal of this approach. Finally, the modified approach is compared with several naive Bayes variants and other classification algorithms (SVM and kNN), and is shown to be fairly good.

Keywords: tomato yield prediction, naive Bayes, redundancy, WSG

Procedia PDF Downloads 227
17682 A Comparative Study for Various Techniques Using WEKA for Red Blood Cells Classification

Authors: Jameela Ali, Hamid A. Jalab, Loay E. George, Abdul Rahim Ahmad, Azizah Suliman, Karim Al-Jashamy

Abstract:

Red blood cells (RBC) are the most common types of blood cells and are the most intensively studied in cell biology. The lack of RBCs is a condition in which the amount of hemoglobin level is lower than normal and is referred to as “anemia”. Abnormalities in RBCs will affect the exchange of oxygen. This paper presents a comparative study for various techniques for classifyig the red blood cells as normal, or abnormal (anemic) using WEKA. WEKA is an open source consists of different machine learning algorithms for data mining applications. The algorithm tested are Radial Basis Function neural network, Support vector machine, and K-Nearest Neighbors algorithm. Two sets of combined features were utilized for classification of blood cells images. The first set, exclusively consist of geometrical features, was used to identify whether the tested blood cell has a spherical shape or non-spherical cells. While the second set, consist mainly of textural features was used to recognize the types of the spherical cells. We have provided an evaluation based on applying these classification methods to our RBCs image dataset which were obtained from Serdang Hospital-Malaysia, and measuring the accuracy of test results. The best achieved classification rates are 97%, 98%, and 79% for Support vector machines, Radial Basis Function neural network, and K-Nearest Neighbors algorithm respectively

Keywords: red blood cells, classification, radial basis function neural networks, suport vector machine, k-nearest neighbors algorithm

Procedia PDF Downloads 472
17681 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, veracity, and come from a variety of sources. Public Administrations are using (big) data, implementing base registries, and enforcing data sharing within the entire government to deliver (big) data related integrated services, provision of insights to users, and for good governance. Government (Big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services like data storage, hosting services to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationship in the government (big) data ecosystem. We also discuss our research findings. We did not find too much published research articles about the government (big) data ecosystem, including its definition and classification of actors and their roles. Therefore, we lent ideas for the government (big) data ecosystem from numerous areas that include scientific research data, humanitarian data, open government data, industry data, in the literature.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 153
17680 Threat Analysis: A Technical Review on Risk Assessment and Management of National Testing Service (NTS)

Authors: Beenish Urooj, Ubaid Ullah, Sidra Riasat

Abstract:

National Testing Service-Pakistan (NTS) is an agency in Pakistan that conducts student success appraisal examinations. In this research paper, we must present a security model for the NTS organization. The security model will depict certain security countermeasures for a better defense against certain types of breaches and system malware. We will provide a security roadmap, which will help the company to execute its further goals to maintain security standards and policies. We also covered multiple aspects in securing the environment of the organization. We introduced the processes, architecture, data classification, auditing approaches, survey responses, data handling, and also training and awareness of risk for the company. The primary contribution is the Risk Survey, based on the maturity model meant to assess and examine employee training and knowledge of risks in the company's activities.

Keywords: NTS, risk assessment, threat factors, security, services

Procedia PDF Downloads 65
17679 Towards a Balancing Medical Database by Using the Least Mean Square Algorithm

Authors: Kamel Belammi, Houria Fatrim

Abstract:

imbalanced data set, a problem often found in real world application, can cause seriously negative effect on classification performance of machine learning algorithms. There have been many attempts at dealing with classification of imbalanced data sets. In medical diagnosis classification, we often face the imbalanced number of data samples between the classes in which there are not enough samples in rare classes. In this paper, we proposed a learning method based on a cost sensitive extension of Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weight and some rules of thumb to determine those weights. After the balancing phase, we applythe different classifiers (support vector machine (SVM), k- nearest neighbor (KNN) and multilayer neuronal networks (MNN)) for balanced data set. We have also compared the obtained results before and after balancing method.

Keywords: multilayer neural networks, k- nearest neighbor, support vector machine, imbalanced medical data, least mean square algorithm, diabetes

Procedia PDF Downloads 524
17678 Unsupervised Classification of DNA Barcodes Species Using Multi-Library Wavelet Networks

Authors: Abdesselem Dakhli, Wajdi Bellil, Chokri Ben Amar

Abstract:

DNA Barcode, a short mitochondrial DNA fragment, made up of three subunits; a phosphate group, sugar and nucleic bases (A, T, C, and G). They provide good sources of information needed to classify living species. Such intuition has been confirmed by many experimental results. Species classification with DNA Barcode sequences has been studied by several researchers. The classification problem assigns unknown species to known ones by analyzing their Barcode. This task has to be supported with reliable methods and algorithms. To analyze species regions or entire genomes, it becomes necessary to use similarity sequence methods. A large set of sequences can be simultaneously compared using Multiple Sequence Alignment which is known to be NP-complete. To make this type of analysis feasible, heuristics, like progressive alignment, have been developed. Another tool for similarity search against a database of sequences is BLAST, which outputs shorter regions of high similarity between a query sequence and matched sequences in the database. However, all these methods are still computationally very expensive and require significant computational infrastructure. Our goal is to build predictive models that are highly accurate and interpretable. This method permits to avoid the complex problem of form and structure in different classes of organisms. On empirical data and their classification performances are compared with other methods. Our system consists of three phases. The first is called transformation, which is composed of three steps; Electron-Ion Interaction Pseudopotential (EIIP) for the codification of DNA Barcodes, Fourier Transform and Power Spectrum Signal Processing. The second is called approximation, which is empowered by the use of Multi Llibrary Wavelet Neural Networks (MLWNN).The third is called the classification of DNA Barcodes, which is realized by applying the algorithm of hierarchical classification.

Keywords: DNA barcode, electron-ion interaction pseudopotential, Multi Library Wavelet Neural Networks (MLWNN)

Procedia PDF Downloads 312
17677 Enhancing Fall Detection Accuracy with a Transfer Learning-Aided Transformer Model Using Computer Vision

Authors: Sheldon McCall, Miao Yu, Liyun Gong, Shigang Yue, Stefanos Kollias

Abstract:

Falls are a significant health concern for older adults globally, and prompt identification is critical to providing necessary healthcare support. Our study proposes a new fall detection method using computer vision based on modern deep learning techniques. Our approach involves training a trans- former model on a large 2D pose dataset for general action recognition, followed by transfer learning. Specifically, we freeze the first few layers of the trained transformer model and train only the last two layers for fall detection. Our experimental results demonstrate that our proposed method outperforms both classical machine learning and deep learning approaches in fall/non-fall classification. Overall, our study suggests that our proposed methodology could be a valuable tool for identifying falls.

Keywords: healthcare, fall detection, transformer, transfer learning

Procedia PDF Downloads 127
17676 Machine Learning Approach for Yield Prediction in Semiconductor Production

Authors: Heramb Somthankar, Anujoy Chakraborty

Abstract:

This paper presents a classification study on yield prediction in semiconductor production using machine learning approaches. A complicated semiconductor production process is generally monitored continuously by signals acquired from sensors and measurement sites. A monitoring system contains a variety of signals, all of which contain useful information, irrelevant information, and noise. In the case of each signal being considered a feature, "Feature Selection" is used to find the most relevant signals. The open-source UCI SECOM Dataset provides 1567 such samples, out of which 104 fail in quality assurance. Feature extraction and selection are performed on the dataset, and useful signals were considered for further study. Afterward, common machine learning algorithms were employed to predict whether the signal yields pass or fail. The most relevant algorithm is selected for prediction based on the accuracy and loss of the ML model.

Keywords: deep learning, feature extraction, feature selection, machine learning classification algorithms, semiconductor production monitoring, signal processing, time-series analysis

Procedia PDF Downloads 101
17675 Enhanced Arabic Semantic Information Retrieval System Based on Arabic Text Classification

Authors: A. Elsehemy, M. Abdeen , T. Nazmy

Abstract:

Since the appearance of the Semantic web, many semantic search techniques and models were proposed to exploit the information in ontology to enhance the traditional keyword-based search. Many advances were made in languages such as English, German, French and Spanish. However, other languages such as Arabic are not fully supported yet. In this paper we present a framework for ontology based information retrieval for Arabic language. Our system consists of four main modules, namely query parser, indexer, search and a ranking module. Our approach includes building a semantic index by linking ontology concepts to documents, including an annotation weight for each link, to be used in ranking the results. We also augmented the framework with an automatic document categorizer, which enhances the overall document ranking. We have built three Arabic domain ontologies: Sports, Economic and Politics as example for the Arabic language. We built a knowledge base that consists of 79 classes and more than 1456 instances. The system is evaluated using the precision and recall metrics. We have done many retrieval operations on a sample of 40,316 documents with a size 320 MB of pure text. The results show that the semantic search enhanced with text classification gives better performance results than the system without classification.

Keywords: Arabic text classification, ontology based retrieval, Arabic semantic web, information retrieval, Arabic ontology

Procedia PDF Downloads 519