Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2171

Search results for: cataloguing and classification

1091 Understanding and Improving Neural Network Weight Initialization

Authors: Diego Aguirre, Olac Fuentes

Abstract:

In this paper, we present a taxonomy of weight initialization schemes used in deep learning. We survey the most representative techniques in each class and compare them in terms of overhead cost, convergence rate, and applicability. We also introduce a new weight initialization scheme. In this technique, we perform an initial feedforward pass through the network using an initialization mini-batch. Using statistics obtained from this pass, we initialize the weights of the network so that the following properties are met: 1) weight matrices are orthogonal; 2) ReLU layers produce a predetermined number of non-zero activations; 3) the output produced by each internal layer has unit variance; 4) weights in the last layer are chosen to minimize the error in the initial mini-batch. We evaluate our method on three popular architectures, and faster convergence rates are achieved on the MNIST, CIFAR-10/100, and ImageNet datasets when compared to state-of-the-art initialization techniques.
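As a rough illustration of properties 1 and 3 only, the sketch below builds a matrix with orthogonal rows and then rescales it using statistics from one pass over a mini-batch so that the ReLU output has unit variance. It is not the authors' algorithm: the function names, the Gram-Schmidt construction, and the single global scale factor are assumptions made for this example, and properties 2 and 4 are not covered.

```python
import math
import random


def orthogonal_matrix(n, rng):
    """Return an n x n matrix with orthonormal rows (Gram-Schmidt on Gaussian vectors)."""
    rows = []
    while len(rows) < n:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Remove the components of v that lie along the rows found so far.
        for r in rows:
            dot = sum(a * b for a, b in zip(v, r))
            v = [a - dot * b for a, b in zip(v, r)]
        norm = math.sqrt(sum(a * a for a in v))
        if norm > 1e-8:  # skip degenerate draws
            rows.append([a / norm for a in v])
    return rows


def layer_output(weights, batch):
    """Linear layer followed by ReLU; each row of weights is one output unit."""
    return [
        [max(0.0, sum(w * x for w, x in zip(row, sample))) for row in weights]
        for sample in batch
    ]


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def init_layer(n, batch, rng):
    """Orthogonal start, then one global rescale so the batch output has unit variance."""
    weights = orthogonal_matrix(n, rng)
    flat = [v for sample in layer_output(weights, batch) for v in sample]
    # ReLU is positively homogeneous, so scaling weights by s scales outputs by s.
    scale = 1.0 / math.sqrt(variance(flat))
    return [[w * scale for w in row] for row in weights]
```

After the rescale, the rows stay mutually orthogonal but no longer have unit norm, which is one reason a real scheme has to balance these properties more carefully than this sketch does.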

Keywords: deep learning, image classification, supervised learning, weight initialization


1090 The Awareness of Computer Science Students Regarding the Security of Location Based Games

Authors: Jacques Barnard, Magda Huisman, Gunther R. Drevin

Abstract:

Rapid expansion and development in the mobile technology market have created an opportunity for users to participate in location based games. As a consequence of this fast expanding market and new technology, it is important to be aware of the implications this has on security. This paper measures the impact on the security awareness of games’ participants, as well as on that of students at university level, with regard to their year of study and gamer classification. This serves to provide insight into whether there are discernible differences in the awareness of the security implications concerning these technologies. The data was collected via a web questionnaire that was to be completed yearly by students from the respective year groups. Results indicate a meaningful disparity in security awareness among students in the different years of study and research. This awareness, however, does not always impact on gamers.

Keywords: gamer classifications, location based games, location based data, security awareness


1089 Urbanization in Delhi: A Multiparameter Study

Authors: Ishu Surender, M. Amez Khair, Ishan Singh

Abstract:

Urbanization is a multidimensional phenomenon. It indicates the long-term shift of economies from rural to industrial. The significance of urbanization in modernization, socio-economic development, and poverty eradication is relevant in modern times. This paper aims to study the urbanization index model in the capital of India, Delhi, using demographic, infrastructural development, and economic development aspects. The urbanization index of all nine districts of Delhi will be determined using multiple parameters such as population density and the availability of health and education facilities. The definition of an urban area varies from city to city and requires periodic classification, which makes direct comparisons difficult. The urbanization index calculated in this paper can be employed to measure the urbanization of a district and compare the level of urbanization in different districts.
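The abstract does not state how the index is computed, so the following is only one plausible reading of a "normalized" multi-parameter index: each parameter is min-max scaled across districts and the scaled values are averaged with equal weights. The function name, the equal weighting, and the sample figures are all assumptions for illustration; the paper's keywords suggest that multiple regression is used instead of a plain average.

```python
def normalized_index(districts):
    """Average of min-max normalized parameters per district (equal weights assumed).

    districts maps a district name to a list of parameter values,
    e.g. [population density, health facilities per capita].
    """
    names = list(districts)
    n_params = len(districts[names[0]])
    scores = {name: 0.0 for name in names}
    for j in range(n_params):
        column = [districts[name][j] for name in names]
        lo, hi = min(column), max(column)
        for name, value in zip(names, column):
            # A constant parameter carries no information; score it as 0.
            scores[name] += (value - lo) / (hi - lo) if hi > lo else 0.0
    return {name: total / n_params for name, total in scores.items()}


# Hypothetical figures, not data from the paper.
example = {"A": [100, 2], "B": [300, 4], "C": [200, 6]}
index = normalized_index(example)
```

Because every parameter is rescaled to the range 0 to 1 before averaging, districts can be compared on one scale even when the raw parameters use different units.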

Keywords: multiparameter, population density, multiple regression, normalized urbanization index


1088 Motives for Reshoring from China to Europe: A Hierarchical Classification of Companies

Authors: Fabienne Fel, Eric Griette

Abstract:

Reshoring, whether concerning back-reshoring or near-reshoring, is a quite recent phenomenon. Despite the economic and political interest of this topic, academic research questioning determinants of reshoring remains rare. Our paper aims at contributing to fill this gap. In order to better understand the reasons for reshoring, we conducted a study among 280 French firms during spring 2016, three-quarters of which sourced, or source, in China. 105 firms in the sample have reshored all or part of their Chinese production or supply in recent years, and we aimed to establish a typology of the motives that drove them to this decision. We asked our respondents about the history of their Chinese supplies, their current reshoring strategies, and their motivations. Statistical analysis was performed with SPSS 22 and SPAD 8. Our results show that change in commercial and financial terms with China is the first motive explaining the current reshoring movement from this country (it applies to 54% of our respondents). A change in corporate strategy is the second motive (30% of our respondents); the reshoring decision follows a change in companies’ strategies (upgrading, implementation of a CSR policy, or a 'lean management' strategy). The third motive (14% of our sample) is a mere correction of the initial offshoring decision, considered as a mistake (under-estimation of hidden costs, non-quality and non-responsiveness problems). Some authors emphasize that developing a short supply chain, involving geographic proximity between design and production, gives a competitive advantage to companies wishing to offer innovative products. Admittedly 40% of our respondents indicate that this motive could have played a part in their decision to reshore, but this reason was not enough for any of them and is not an intrinsic motive leading to leaving Chinese suppliers. 
Having questioned our respondents about the importance given to various problems leading them to reshore, we then performed a Principal Components Analysis (PCA), associated with an Ascending Hierarchical Classification (AHC) based on the Ward criterion, so as to point out more specific motivations. Three main classes of companies should be distinguished:
- The 'Cost Killers' (23% of the sample), which reshore their supplies from China only because of higher procurement costs and so as to find lower costs elsewhere.
- The 'Realists' (50% of the sample), which give equal weight to increasing procurement costs in China and to the quality of their supplies (to a large extent). Companies in this class tend to take advantage of this changing environment to change their procurement strategy, seeking suppliers offering better quality and responsiveness.
- The 'Voluntarists' (26% of the sample), which choose to reshore their Chinese supplies regardless of higher Chinese costs, to obtain better quality and greater responsiveness.
We emphasize that if the main driver for reshoring from China is indeed higher local costs, it should not be regarded as an exclusive motivation; 77% of the companies in the sample are also seeking, sometimes exclusively, more reactive suppliers, attentive to quality, respect for the environment and intellectual property.
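To make the Ward criterion concrete, here is a minimal pure-Python sketch of ascending (agglomerative) hierarchical classification. At each step it merges the two clusters whose fusion least increases the within-cluster sum of squares. It omits the PCA step, and the function name and the sample points are hypothetical; the study itself used SPSS and SPAD.

```python
def ward_clusters(points, k):
    """Merge clusters bottom-up with the Ward criterion until k clusters remain.

    Returns a list of clusters, each a sorted list of point indices.
    """
    # Each cluster is (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members_a, centroid_a = clusters[a]
                members_b, centroid_b = clusters[b]
                dist2 = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
                # Ward cost: increase in within-cluster sum of squares after merging.
                size_a, size_b = len(members_a), len(members_b)
                cost = size_a * size_b / (size_a + size_b) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)
        _, a, b = best
        members_a, centroid_a = clusters[a]
        members_b, centroid_b = clusters[b]
        merged = members_a + members_b
        centroid = [
            (len(members_a) * x + len(members_b) * y) / len(merged)
            for x, y in zip(centroid_a, centroid_b)
        ]
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append((merged, centroid))
    return [sorted(members) for members, _ in clusters]


# Hypothetical two-dimensional scores for six respondents.
scores = [(0, 0), (0, 1), (10, 10), (10, 11), (20, 0), (21, 0)]
groups = ward_clusters(scores, 3)
```

In practice the input points would be the respondents' coordinates on the leading principal components, and the number of classes would be chosen by inspecting the dendrogram.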

Keywords: China, procurement, reshoring, strategy, supplies


1087 Merit Order of Indonesian Coal Mining Sources to Meet the Domestic Power Plants Demand

Authors: Victor Siahaan

Abstract:

Coal remains the most important energy source for electricity generation, taking the largest share of the energy mix in countries such as Indonesia. Its low generation cost and abundant resources keep it the first choice for base-load power. To sustain electricity production, it is necessary to know the volume of coal needed to ensure that all coal power plants (CPP) in a country can operate properly. To secure this volume, this study identifies coal mining sources in Indonesia, classifies the typical coal from each source, and determines the port of loading. Using these data, coal mining sources are then selected to feed particular CPPs based on the compatibility of the coal type and the lowest transport cost.

Keywords: merit order, Indonesian coal mine, electricity, power plant


1086 Effect of Clinical Depression on Automatic Speaker Verification

Authors: Sheeraz Memon, Namunu C. Maddage, Margaret Lech, Nicholas Allen

Abstract:

The effect of a clinical environment on the accuracy of speaker verification was tested. The speaker verification tests were performed within homogeneous environments containing clinically depressed speakers only and non-depressed speakers only, as well as within mixed environments containing different mixtures of both clinically depressed and non-depressed speakers. The speaker verification framework included MFCC features and the GMM modeling and classification method. The speaker verification experiments within homogeneous environments showed a 5.1% increase in the equal error rate (EER) within the clinically depressed environment when compared to the non-depressed environment. This indicates that clinical depression increases the intra-speaker variability and makes the speaker verification task more challenging. Experiments with mixed environments indicated that increasing the percentage of depressed individuals within a mixed environment increases the speaker verification equal error rates.
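For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (genuine speakers rejected). The sketch below estimates it by sweeping a threshold over observed scores; the function name and the scores are hypothetical and unrelated to the paper's data.

```python
def equal_error_rate(genuine, impostor):
    """Approximate the EER by testing every observed score as a threshold.

    A trial is accepted when its score is greater than or equal to the threshold.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine) | set(impostor)):
        far = sum(s >= threshold for s in impostor) / len(impostor)  # false accepts
        frr = sum(s < threshold for s in genuine) / len(genuine)     # false rejects
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical verification scores (higher means more likely the claimed speaker).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
rate = equal_error_rate(genuine_scores, impostor_scores)
```

With few trials the two error rates may never be exactly equal, so this sketch reports their average at the threshold where they are closest.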

Keywords: speaker verification, GMM, EM, clinical environment, clinical depression


1085 Investigating the Factors Affecting Generalization of Deep Learning Models for Plant Disease Detection

Authors: Praveen S. Muthukumarana, Achala C. Aponso

Abstract:

A large percentage of the global crop harvest is lost due to crop diseases. Timely identification and treatment of crop diseases is difficult in many developing nations due to a shortage of trained professionals in the field of agriculture. Many crop diseases can be accurately diagnosed by visual symptoms. In the past decade, deep learning has been successfully utilized in domains such as healthcare, but adoption in agriculture for plant disease detection is rare. The literature shows that models trained with popular datasets such as PlantVillage do not generalize well to real-world images. This paper attempts to find out how to build plant disease identification models that generalize well to real-world images.

Keywords: agriculture, convolutional neural network, deep learning, plant disease classification, plant disease detection, plant disease diagnosis


1084 An Ensemble-based Method for Vehicle Color Recognition

Authors: Saeedeh Barzegar Khalilsaraei, Manoocheher Kelarestaghi, Farshad Eshghi

Abstract:

The vehicle color, as a prominent and stable feature, helps to identify a vehicle more accurately. As a result, vehicle color recognition is of great importance in intelligent transportation systems. Unlike conventional methods, which use only a single Convolutional Neural Network (CNN) for feature extraction or classification, in this paper four CNNs, with different architectures that perform well on different classes, are trained to extract various features from the input image. To take advantage of the distinct capability of each network, the multiple outputs are combined using a stack generalization algorithm as an ensemble technique. As a result, the final model performs better than each CNN individually in vehicle color identification. The evaluation results, in terms of overall average accuracy and accuracy variance, show that the proposed method outperforms state-of-the-art rivals.

Keywords: Vehicle Color Recognition, Ensemble Algorithm, Stack Generalization, Convolutional Neural Network


1083 Groundwater Seepage Estimation into Amirkabir Tunnel Using Analytical Methods and DEM and SGR Method

Authors: Hadi Farhadian, Homayoon Katibeh

Abstract:

In this paper, groundwater seepage into the Amirkabir tunnel has been estimated using analytical and numerical methods for 14 different sections of the tunnel. The Site Groundwater Rating (SGR) method has also been applied for qualitative and quantitative classification of the tunnel sections. The results of the above-mentioned methods were compared with one another. The study shows reasonable agreement among the results of all the methods, except for two sections of the tunnel. In these two sections, there are significant discrepancies between the numerical and analytical results, mainly originating from model geometry and high overburden. SGR and the analytical and numerical calculations confirm the high concentration of seepage inflow in fault zones. The maximum seepage flow into the tunnel, which occurred in the crushed zone, has been estimated at 0.425 lit/sec/m using the analytical method and 0.628 lit/sec/m using the numerical method. Based on the SGR method, six of the 14 sections along the Amirkabir tunnel axis are found to be in the "No Risk" class, which is supported by analytical and numerical seepage values of less than 0.04 lit/sec/m.

Keywords: water Seepage, Amirkabir Tunnel, analytical method, DEM, SGR


1082 Pre-Industrial Local Architecture According to Natural Properties

Authors: Selin Küçük

Abstract:

Pre-industrial architecture is the integration of natural and subsequent properties through intelligence and experience. Since various settlements were relatively industrialized or non-industrialized at any given time, the term ‘pre-industrial’ does not refer to a definite period. Natural properties, which are the existing conditions and materials of the natural local environment, are climate, geomorphology and local materials. Subsequent properties, which are all anthropological comparatives, are the culture of societies, the requirements of people and the construction techniques that people use. Yet, after industrialization, technology took technique’s place, cultural effects were manipulated, requirements changed and local/natural properties almost disappeared from architecture. Technology is universal, global and spreads easily; conversely, technique depends on time and experience and should have a considerable cultural background. This research is about construction techniques according to the natural properties of a region and the classification of these techniques. Understanding local architecture is only possible by investigating its background, which is hard to reach. Architectural techniques always change, positively and negatively, over time. Archaeological layers of a region sometimes give more accurate information about the transformation of architecture. However, the natural properties of any region are the most helpful elements for understanding construction techniques. Many international sources from different cultures address local architecture by mentioning natural properties separately. Unfortunately, no literature deals with this subject systematically. This research aims to develop a clear perspective on local architecture by categorizing archetypes according to natural properties.
The ultimate goal of this research is to generate a clear, handbook-like classification of local architecture worldwide, independent of subsequent (anthropological) properties. Since local architecture is the most sustainable architecture with regard to its economic, ecological and sociological properties, there should be extensive information about construction techniques to learn from. Constructing the same buildings all over the world is one of the main criticisms of the modern architectural system. While this criticism continues, identical buildings without identity keep increasing. In the post-industrial period, technology widely took technique’s place, cultural effects were manipulated, requirements changed and natural local properties almost disappeared from architecture. This study does not propose that architects use local techniques, but it indicates the progress of pre-industrial architectural evolution, which is healthier, cheaper and natural. Migration from rural areas to developing/developed cities should be prohibited, so that culture and construction techniques can be preserved. Since big cities have a psychological, sensational and sociological impact on people, rural settlers can be convinced not to migrate by providing new buildings designed according to natural properties and by maintaining their settlements. Improving rural conditions would remove the economic and sociological gulf between cities and rural areas. The expected result is that, if there is no deformation (the adaptation of other traditional buildings because of migration) or assimilation in a climatic region, there should be very similar solutions in the same climatic regions of the world, even if there is no relationship (trade, communication, etc.) among them.

Keywords: climate zones, geomorphology, local architecture, local materials


1081 Identification of Breast Anomalies Based on Deep Convolutional Neural Networks and K-Nearest Neighbors

Authors: Ayyaz Hussain, Tariq Sadad

Abstract:

Breast cancer (BC) is one of the most widespread ailments among females globally. The early prognosis of BC can decrease the mortality rate. Exact findings of benign tumors can avoid unnecessary biopsies and further treatments of patients under investigation. However, due to variations in images, it is a tough job to isolate cancerous cases from normal and benign ones. Machine learning techniques are widely employed in the classification of BC patterns and prognosis. In this research, a deep convolutional neural network (DCNN) with the AlexNet architecture is employed to extract more discriminative features from breast tissues. To achieve higher accuracy, K-nearest neighbor (KNN) classifiers are employed as a substitute for the softmax layer in deep learning. The proposed model is tested on a widely used breast image database, the MIAS dataset, and achieved 99% accuracy.
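The idea of replacing a softmax layer with KNN can be sketched as follows: feature vectors taken from a trained network are stored with their labels, and a new sample receives the majority label of its k nearest stored vectors. The code below is a generic KNN classifier under that reading; the function name and the tiny two-dimensional "features" are hypothetical stand-ins, since real AlexNet features have thousands of dimensions.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors closest to query."""
    # Euclidean distance from the query to every stored feature vector.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors standing in for DCNN features.
features = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
labels = ["benign", "benign", "benign", "malignant", "malignant", "malignant"]
prediction = knn_predict(features, labels, (0.5, 0.5))
```

An odd k avoids tied votes in a two-class problem; the abstract does not say which k or distance metric the authors used.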

Keywords: breast cancer, DCNN, KNN, mammography


1080 User Requirements Analysis for the Development of Assistive Navigation Mobile Apps for Blind and Visually Impaired People

Authors: Paraskevi Theodorou, Apostolos Meliones

Abstract:

In the context of the development process of two assistive navigation mobile apps for blind and visually impaired people (BVI), an extensive qualitative analysis of the requirements of potential users has been conducted. The analysis was based on interviews with BVIs and aimed to elicit not only their needs with respect to autonomous navigation but also their preferences on specific features of the apps under development. The elicited requirements were structured into four main categories, namely, requirements concerning the capabilities, functionality and usability of the apps, as well as compatibility requirements with respect to other apps and services. The main categories were then further divided into nine sub-categories. This classification, along with its content, aims to become a useful tool for researchers and developers involved in the development of digital services for BVI.

Keywords: accessibility, assistive mobile apps, blind and visually impaired people, user requirements analysis


1079 A Deep Reinforcement Learning-Based Secure Framework against Adversarial Attacks in Power System

Authors: Arshia Aflaki, Hadis Karimipour, Anik Islam

Abstract:

Generative Adversarial Attacks (GAAs) threaten critical sectors, ranging from fingerprint recognition to industrial control systems. Existing Deep Learning (DL) algorithms are not robust enough against this kind of cyber-attack. As one of the most critical industries in the world, the power grid is not an exception. In this study, a Deep Reinforcement Learning-based (DRL) framework that assists the DL model in improving its robustness against generative adversarial attacks is proposed. Real-world smart grid stability data, as an IIoT dataset, is used to test our method, which improves the classification accuracy of a deep learning model from around 57 percent to 96 percent.

Keywords: generative adversarial attack, deep reinforcement learning, deep learning, IIoT, generative adversarial networks, power system


1078 Contextual Toxicity Detection with Data Augmentation

Authors: Julia Ive, Lucia Specia

Abstract:

Understanding and detecting toxicity is an important problem to support safer human interactions online. Our work focuses on the important problem of contextual toxicity detection, where automated classifiers are tasked with determining whether a short textual segment (usually a sentence) is toxic within its conversational context. We use “toxicity” as an umbrella term to denote a number of variants commonly named in the literature, including hate, abuse, and offence, among others. Detecting toxicity in context is a non-trivial problem and has been addressed by very few previous studies. These previous studies have analysed the influence of conversational context on human perception of toxicity in controlled experiments and concluded that humans rarely change their judgements in the presence of context. They have also evaluated contextual detection models based on state-of-the-art Deep Learning and Natural Language Processing (NLP) techniques. Counterintuitively, they reached the general conclusion that computational models tend to suffer performance degradation in the presence of context. We challenge these empirical observations by devising better contextual predictive models that also rely on NLP data augmentation techniques to create larger and better data. In our study, we start by further analysing the human perception of toxicity in conversational data (i.e., tweets), in the absence versus presence of context, in this case, previous tweets in the same conversational thread. We observed that the conclusions of previous work on human perception are mainly due to data issues: the contextual data available does not provide sufficient evidence that context is indeed important (even for humans). The data problem is common in current toxicity datasets: cases labelled as toxic are either obviously toxic (i.e., overt toxicity with swear words, racist words, etc.), and thus context is not needed for a decision, or are ambiguous, vague or unclear even in the presence of context; in addition, the data contains labelling inconsistencies.
To address this problem, we propose to automatically generate contextual samples where toxicity is not obvious (i.e., covert cases) without context or where different contexts can lead to different toxicity judgements for the same tweet. We generate toxic and non-toxic utterances conditioned on the context or on target tweets using a range of techniques for controlled text generation (e.g., Generative Adversarial Networks and steering techniques). On the contextual detection models, we posit that their poor performance is due to limitations of both the data they are trained on (the same problems stated above) and the architectures they use, which are not able to leverage context in effective ways. To improve on that, we propose text classification architectures that take the hierarchy of conversational utterances into account. In experiments benchmarking our models against previous models on existing and automatically generated data, we show that both data and architectural choices are very important. Our model achieves substantial performance improvements compared to baselines that are non-contextual or contextual but agnostic of the conversation structure.

Keywords: contextual toxicity detection, data augmentation, hierarchical text classification models, natural language processing


1077 Analytical Study of Data Mining Techniques for Software Quality Assurance

Authors: Mariam Bibi, Rubab Mehboob, Mehreen Sirshar

Abstract:

Satisfying the customer requirements is the ultimate goal of producing or developing any product. The quality of the product is decided on the basis of the level of customer satisfaction. Several techniques reported in the surveyed literature enhance the quality of the product through software defect prediction and by locating missing software requirements. Some mining techniques were proposed to assess individual performance indicators in a collaborative environment to reduce errors at the individual level. The basic intention is to produce a product with zero or few defects, thereby producing the best product quality-wise. In the analysis of the survey, techniques such as genetic algorithms, artificial neural networks, classification and clustering techniques, and decision trees are studied. The analysis found that these techniques contributed much to the improvement and enhancement of the quality of the product.

Keywords: data mining, defect prediction, missing requirements, software quality


1076 Identifying Promoters and Their Types Based on a Two-Layer Approach

Authors: Bin Liu

Abstract:

A prokaryotic promoter, consisting of two short DNA sequences located at the -35 and -10 positions, is responsible for controlling the initiation of gene expression. Different types of promoters have different functions, yet their consensus sequences are similar. In addition, consensus sequences may differ within the same type of promoter, which poses difficulties for promoter identification. Unfortunately, all existing computational methods treat promoter identification as a binary classification task and can only identify whether a query sequence belongs to a specific promoter type. It is desirable to develop computational methods for effectively identifying promoters and their types. Here, a two-layer predictor is proposed to deal with the problem. The first layer is designed to predict whether a given sequence is a promoter, and the second layer predicts the type of each sequence judged to be a promoter. Meanwhile, we also analyze the importance of features and sequence conservation in two aspects: promoter identification and promoter type identification. To the best of our knowledge, it is the first computational predictor to detect promoters and their types.
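The control flow of such a two-layer predictor can be sketched as a simple cascade: the second classifier runs only on sequences that the first one accepts. In the sketch below both layers are toy motif checks standing in for trained models (the keywords suggest random forests were used); the function names, the rules, and the labels are hypothetical.

```python
def two_layer_predict(sequence, is_promoter, promoter_type):
    """Layer 1 decides promoter vs. non-promoter; layer 2 types accepted sequences."""
    if not is_promoter(sequence):
        return "non-promoter"
    return promoter_type(sequence)


def toy_is_promoter(sequence):
    # Toy stand-in for a trained layer-1 model: look for the sigma70 -10 consensus.
    return "TATAAT" in sequence


def toy_promoter_type(sequence):
    # Toy stand-in for a trained layer-2 model: the sigma70 -35 consensus picks the label.
    return "sigma70" if "TTGACA" in sequence else "other"


label = two_layer_predict(
    "GGTTGACATTTTTTTTTTTTTTTTTTATAATGG", toy_is_promoter, toy_promoter_type
)
```

Passing the two layers in as functions keeps the cascade independent of how each layer is trained, so either one can be swapped without touching the other.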

Keywords: promoter, promoter type, random forest, sequence information


1075 Assessment of Taiwan Railway Occurrences Investigations Using Causal Factor Analysis System and Bayesian Network Modeling Method

Authors: Lee Yan Nian

Abstract:

Safety investigation is different from an administrative investigation in that the former is conducted by an independent agency and the purpose of such investigation is to prevent accidents in the future and not to apportion blame or determine liability. Before October 2018, Taiwan railway occurrences were investigated by local supervisory authority. Characteristics of this kind of investigation are that enforcement actions, such as administrative penalty, are usually imposed on those persons or units involved in occurrence. On October 21, 2018, due to a Taiwan Railway accident, which caused 18 fatalities and injured another 267, establishing an agency to independently investigate this catastrophic railway accident was quickly decided. The Taiwan Transportation Safety Board (TTSB) was then established on August 1, 2019 to take charge of investigating major aviation, marine, railway and highway occurrences. The objective of this study is to assess the effectiveness of safety investigations conducted by the TTSB. In this study, the major railway occurrence investigation reports published by the TTSB are used for modeling and analysis. According to the classification of railway occurrences investigated by the TTSB, accident types of Taiwan railway occurrences can be categorized into: derailment, fire, Signal Passed at Danger and others. A Causal Factor Analysis System (CFAS) developed by the TTSB is used to identify the influencing causal factors and their causal relationships in the investigation reports. All terminologies used in the CFAS are equivalent to the Human Factors Analysis and Classification System (HFACS) terminologies, except for “Technical Events” which was added to classify causal factors resulting from mechanical failure. Accordingly, the Bayesian network structure of each occurrence category is established based on the identified causal factors in the CFAS. 
In the Bayesian networks, the prior probabilities of the identified causal factors are obtained from the number of times they appear in the investigation reports. The Conditional Probability Table of each parent node is determined from domain experts’ experience and judgement. The resulting networks are quantitatively assessed under different scenarios to evaluate their forward prediction and backward diagnostic capabilities. Finally, the established Bayesian network of derailment is assessed using investigation reports of the same accident, which was investigated by the TTSB and the local supervisory authority respectively. Based on the assessment results, the findings of the administrative investigation are more closely tied to errors of front-line personnel than to organization-related factors. Safety investigation can identify not only unsafe acts of individuals but also in-depth causal factors of organizational influences. The results show that the proposed methodology can identify differences between safety investigation and administrative investigation. Therefore, effective intervention strategies in associated areas can be better addressed for safety improvement and future accident prevention through safety investigation.
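Forward prediction and backward diagnosis can be illustrated on the smallest possible network, one causal factor C with an arrow to one outcome E. Forward prediction marginalizes over C to get P(E); backward diagnosis applies Bayes' rule to get P(C | E) once E is observed. The function names and all probabilities below are invented for illustration and do not come from the TTSB reports.

```python
def forward_prediction(prior, p_e_given_c, p_e_given_not_c):
    """P(E): probability of the outcome, marginalizing over the causal factor."""
    return prior * p_e_given_c + (1 - prior) * p_e_given_not_c


def backward_diagnosis(prior, p_e_given_c, p_e_given_not_c):
    """P(C | E): probability the causal factor was present, given the outcome."""
    evidence = forward_prediction(prior, p_e_given_c, p_e_given_not_c)
    return prior * p_e_given_c / evidence


# Hypothetical numbers: P(C) = 0.3, P(E | C) = 0.8, P(E | not C) = 0.1.
p_outcome = forward_prediction(0.3, 0.8, 0.1)      # 0.3*0.8 + 0.7*0.1 = 0.31
p_cause = backward_diagnosis(0.3, 0.8, 0.1)        # 0.24 / 0.31, about 0.774
```

A full network repeats the same two operations across many nodes, which is why dedicated inference libraries are normally used instead of hand-written formulas.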

Keywords: administrative investigation, bayesian network, causal factor analysis system, safety investigation


1074 Automatic Identification and Classification of Contaminated Biodegradable Plastics using Machine Learning Algorithms and Hyperspectral Imaging Technology

Authors: Nutcha Taneepanichskul, Helen C. Hailes, Mark Miodownik

Abstract:

Plastic waste has emerged as a critical global environmental challenge, primarily driven by the prevalent use of conventional plastics derived from petrochemical refining and manufacturing processes in modern packaging. While these plastics serve vital functions, their persistence in the environment post-disposal poses significant threats to ecosystems. Addressing this issue necessitates new approaches, one of which involves the development of biodegradable plastics designed to degrade under controlled conditions, such as industrial composting facilities. It is imperative to note that compostable plastics are engineered for degradation within specific environments and are not suited for uncontrolled settings, including natural landscapes and aquatic ecosystems. The full benefits of compostable packaging are realized when it is subjected to industrial composting, preventing environmental contamination and waste stream pollution. Therefore, effective sorting technologies are essential to enhance composting rates for these materials and diminish the risk of contaminating recycling streams. This study leverages hyperspectral imaging technology (HSI) coupled with advanced machine learning algorithms to accurately identify various types of plastics, encompassing conventional variants like Polyethylene terephthalate (PET), Polypropylene (PP), Low density polyethylene (LDPE), High density polyethylene (HDPE) and biodegradable alternatives such as Polybutylene adipate terephthalate (PBAT), Polylactic acid (PLA), and Polyhydroxyalkanoates (PHA). The dataset is partitioned into three subsets: a training dataset comprising uncontaminated conventional and biodegradable plastics, a validation dataset encompassing contaminated plastics of both types, and a testing dataset featuring real-world packaging items in both pristine and contaminated states.
Five distinct machine learning algorithms, namely Partial Least Squares Discriminant Analysis (PLS-DA), Support Vector Machine (SVM), Convolutional Neural Network (CNN), Logistic Regression, and Decision Tree, were developed and evaluated for their classification performance. The Logistic Regression and CNN models exhibited the most promising outcomes, achieving an accuracy of 100% on the training and validation datasets and an accuracy exceeding 80% on the testing dataset. The successful implementation of this sorting technology within recycling and composting facilities holds the potential to significantly elevate recycling and composting rates. As a result, the envisioned circular economy for plastics can be established, thereby offering a viable solution to mitigate plastic pollution.

Keywords: biodegradable plastics, sorting technology, hyperspectral imaging technology, machine learning algorithms

Procedia PDF Downloads 79

1073 Assessment of DNA Sequence Encoding Techniques for Machine Learning Algorithms Using a Universal Bacterial Marker

Authors: Diego Santibañez Oyarce, Fernanda Bravo Cornejo, Camilo Cerda Sarabia, Belén Díaz Díaz, Esteban Gómez Terán, Hugo Osses Prado, Raúl Caulier-Cisterna, Jorge Vergara-Quezada, Ana Moya-Beltrán

Abstract:

The advent of high-throughput sequencing technologies has revolutionized genomics, generating vast amounts of genetic data that challenge traditional bioinformatics methods. Machine learning addresses these challenges by leveraging computational power to identify patterns and extract information from large datasets. However, biological sequence data, being symbolic and non-numeric, must be converted into numerical formats for machine learning algorithms to process effectively. So far, some encoding methods, such as one-hot encoding or k-mers, have been explored. This work proposes additional approaches for encoding DNA sequences in order to compare them with existing techniques and determine whether they provide improvements or whether current methods offer superior results. Data from the 16S rRNA gene, a universal marker, was used to analyze eight bacterial groups that are significant in the pulmonary environment and have clinical implications. The bacterial genera included in this analysis are Prevotella, Abiotrophia, Acidovorax, Streptococcus, Neisseria, Veillonella, Mycobacterium, and Megasphaera. These data were downloaded from the NCBI database in GenBank file format and then parsed to selectively extract relevant information from each file. For data encoding, a sequence normalization process was carried out as the first step. From approximately 22,000 initial data points, a subset was generated for testing purposes. Specifically, 55 sequences from each bacterial group met the length criteria, resulting in an initial sample of approximately 440 sequences. The sequences were encoded using different methods, including one-hot encoding, k-mers, Fourier transform, and Wavelet transform. Various machine learning algorithms, such as support vector machines, random forests, and neural networks, were trained to evaluate these encoding methods.
The performance of these models was assessed using multiple metrics, including the confusion matrix, ROC curve, and F1 Score, providing a comprehensive evaluation of their classification capabilities. The results show that accuracies between encoding methods vary by up to approximately 15%, with the Fourier transform obtaining the best results for the evaluated machine learning algorithms. These findings, supported by the detailed analysis using the confusion matrix, ROC curve, and F1 Score, provide valuable insights into the effectiveness of different encoding methods and machine learning algorithms for genomic data analysis, potentially improving the accuracy and efficiency of bacterial classification and related genomic studies.
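As an illustration of the two established encodings named in the abstract, the following is a minimal sketch of one-hot and k-mer count encoding. It is not the authors' implementation: the function names, the four-letter alphabet, the handling of ambiguous bases, and the choice of k are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

# Assumption: sequences use the four canonical DNA bases only.
ALPHABET = "ACGT"

def one_hot(seq):
    """Return one 4-element row per base; any other symbol (e.g. N) maps to all zeros."""
    return [[1 if base == letter else 0 for letter in ALPHABET] for base in seq.upper()]

def kmer_counts(seq, k=3):
    """Count overlapping k-mers and return a fixed-length vector of 4**k counts.

    The vector follows the lexicographic order of all possible k-mers,
    so sequences of different lengths map to vectors of the same size.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    vocabulary = ["".join(p) for p in product(ALPHABET, repeat=k)]
    return [counts[kmer] for kmer in vocabulary]

# Hypothetical toy sequence, not taken from the 16S rRNA data set.
example = "ACGTAC"
encoded = one_hot(example)          # 6 rows, one per base
vector = kmer_counts(example, k=2)  # 16 counts, one per possible 2-mer
```

One-hot encoding preserves position but produces an output whose size depends on sequence length, which is one reason a length criterion or normalization step is needed before training; k-mer counts discard position but give every sequence the same dimensionality.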

Keywords: DNA encoding, machine learning, Fourier transform, Fourier transformation

Procedia PDF Downloads 23

1072 Facial Emotion Recognition with Convolutional Neural Network Based Architecture

Authors: Koray U. Erbas

Abstract:

Neural networks are appealing for many applications since they are able to learn complex non-linear relationships between input and output data. As the number of neurons and layers in a neural network increases, it is possible to represent more complex relationships with automatically extracted features. Nowadays, Deep Neural Networks (DNNs) are widely used in Computer Vision problems such as classification, object detection, segmentation, and image editing. In this work, the Facial Emotion Recognition task is performed by a proposed Convolutional Neural Network (CNN)-based DNN architecture using the FER2013 dataset. Moreover, the effects of different hyperparameters (activation function, kernel size, initializer, batch size and network size) are investigated, and ablation study results for the pooling layer, dropout and batch normalization are presented.

Keywords: convolutional neural network, deep learning, deep learning based FER, facial emotion recognition

Procedia PDF Downloads 273

1071 Threat Analysis: A Technical Review on Risk Assessment and Management of National Testing Service (NTS)

Authors: Beenish Urooj, Ubaid Ullah, Sidra Riasat

Abstract:

National Testing Service-Pakistan (NTS) is an agency in Pakistan that conducts student success appraisal examinations. In this research paper, we present a security model for the NTS organization. The security model depicts security countermeasures for a better defense against certain types of breaches and system malware. We provide a security roadmap, which will help the company to pursue its further goals and maintain security standards and policies. We also cover multiple aspects of securing the environment of the organization. We introduce the processes, architecture, data classification, auditing approaches, survey responses, data handling, and training and awareness of risk for the company. The primary contribution is the Risk Survey, based on the maturity model, which is meant to assess and examine employee training and knowledge of risks in the company's activities.

Keywords: NTS, risk assessment, threat factors, security, services

Procedia PDF Downloads 70

1070 Contribution to the Study of Automatic Epileptiform Pattern Recognition in Long Term EEG Signals

Authors: Christine F. Boos, Fernando M. Azevedo

Abstract:

Electroencephalogram (EEG) is a record of the electrical activity of the brain that has many applications, such as monitoring alertness, coma and brain death; locating damaged areas of the brain after head injury, stroke and tumor; monitoring anesthesia depth; researching physiology and sleep disorders; researching epilepsy and localizing the seizure focus. Epilepsy is a chronic condition, or a group of diseases of high prevalence, still poorly explained by science and whose diagnosis is still predominantly clinical. The EEG recording is considered an important test for epilepsy investigation and its visual analysis is very often applied for clinical confirmation of epilepsy diagnosis. Moreover, this EEG analysis can also be used to help define the types of epileptic syndrome, determine the epileptiform zone, assist in the planning of drug treatment and provide additional information about the feasibility of surgical intervention. In the context of diagnosis confirmation, the analysis is made using long term EEG recordings that are at least 24 hours long and acquired by a minimum of 24 electrodes, in which the neurophysiologists perform a thorough visual evaluation of EEG screens in search of specific electrographic patterns called epileptiform discharges. Considering that the EEG screens usually display 10 seconds of the recording, the neurophysiologist has to evaluate 360 screens per hour of EEG or a minimum of 8,640 screens per long term EEG recording. Analyzing thousands of EEG screens in search of patterns that have a maximum duration of 200 ms is a very time consuming, complex and exhaustive task. Because of this, over the years several studies have proposed automated methodologies that could facilitate the neurophysiologists’ task of identifying epileptiform discharges, and a large number of these methodologies used neural networks for the pattern classification.
One of the differences between all of these methodologies is the type of input stimuli presented to the networks, i.e., how the EEG signal is introduced in the network. Five types of input stimuli have been commonly found in literature: raw EEG signal, morphological descriptors (i.e. parameters related to the signal’s morphology), Fast Fourier Transform (FFT) spectrum, Short-Time Fourier Transform (STFT) spectrograms and Wavelet Transform features. This study evaluates the application of these five types of input stimuli and compares the classification results of neural networks that were implemented using each of these inputs. The performance of using raw signal varied between 43 and 84% efficiency. The results of FFT spectrum and STFT spectrograms were quite similar with average efficiency being 73 and 77%, respectively. The efficiency of Wavelet Transform features varied between 57 and 81% while the descriptors presented efficiency values between 62 and 93%. After simulations we could observe that the best results were achieved when either morphological descriptors or Wavelet features were used as input stimuli.
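The screen counts quoted in the abstract follow directly from the 10-second display window and the 24-hour minimum recording length. The short sketch below reproduces that arithmetic; the function name and its parameters are illustrative only.

```python
def screens_to_review(recording_hours, seconds_per_screen=10):
    """Number of EEG screens a reader must inspect for a recording of the given length."""
    total_seconds = recording_hours * 3600
    return total_seconds // seconds_per_screen

per_hour = screens_to_review(1)        # screens per hour of EEG
per_recording = screens_to_review(24)  # screens per 24-hour recording

# A discharge lasting at most 200 ms occupies at most this fraction of one 10 s screen.
max_fraction_of_screen = 0.2 / 10
```

With these values, `per_hour` is 360 and `per_recording` is 8,640, matching the figures in the abstract, and a single discharge covers at most 2% of a screen.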

Keywords: Artificial neural network, electroencephalogram signal, pattern recognition, signal processing

Procedia PDF Downloads 528

1069 Machine Learning Approach for Lateralization of Temporal Lobe Epilepsy

Authors: Samira-Sadat JamaliDinan, Haidar Almohri, Mohammad-Reza Nazem-Zadeh

Abstract:

Lateralization of temporal lobe epilepsy (TLE) is very important for positive surgical outcomes. We propose a machine learning framework to identify the epileptogenic hemisphere in TLE cases using magnetoencephalography (MEG) coherence source imaging (CSI) and diffusion tensor imaging (DTI). Unlike most studies, which use classification algorithms, we propose an effective clustering approach to distinguish between normal and TLE cases. We apply the well-known Minkowski weighted K-Means (MWK-Means) technique as the clustering framework. To overcome the problem of poor initialization of K-Means, we use particle swarm optimization (PSO) to select the initial centroids of clusters prior to applying MWK-Means. We demonstrate that, compared to K-Means and MWK-Means applied independently, this approach improves the result on a benchmark data set.
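To make the clustering step more concrete, here is a minimal sketch of the feature-weighted Minkowski distance and the cluster assignment step as they are commonly described for MWK-Means. The exact formulation used by the authors is not given in the abstract, so the weighting form (weights raised to the power p), the function names, and the toy values are assumptions; the PSO initialization and the weight update step are not shown.

```python
def minkowski_weighted_distance(point, centroid, weights, p=2.0):
    """Sum over features of (w_v ** p) * |x_v - c_v| ** p.

    `weights` holds one weight per feature for the cluster that owns `centroid`.
    This form is an assumption based on the usual description of MWK-Means.
    """
    return sum((w ** p) * abs(x - c) ** p for x, c, w in zip(point, centroid, weights))

def assign(point, centroids, weights_per_cluster, p=2.0):
    """Return the index of the centroid with the smallest weighted distance."""
    distances = [
        minkowski_weighted_distance(point, c, w, p)
        for c, w in zip(centroids, weights_per_cluster)
    ]
    return distances.index(min(distances))

# Hypothetical two-feature example with equal feature weights in both clusters.
centroids = [[0.0, 0.0], [10.0, 10.0]]
weights_per_cluster = [[0.5, 0.5], [0.5, 0.5]]
nearest = assign([1.0, 2.0], centroids, weights_per_cluster)
```

Because each cluster carries its own feature weights, a feature that is noisy within one cluster can be down-weighted there without affecting the others; the quality of the final clusters still depends on the initial centroids, which is the part the abstract addresses with PSO.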

Keywords: temporal lobe epilepsy, machine learning, clustering, magnetoencephalography

Procedia PDF Downloads 155

1068 Enhancing Fall Detection Accuracy with a Transfer Learning-Aided Transformer Model Using Computer Vision

Authors: Sheldon McCall, Miao Yu, Liyun Gong, Shigang Yue, Stefanos Kollias

Abstract:

Falls are a significant health concern for older adults globally, and prompt identification is critical to providing necessary healthcare support. Our study proposes a new fall detection method using computer vision based on modern deep learning techniques. Our approach involves training a transformer model on a large 2D pose dataset for general action recognition, followed by transfer learning. Specifically, we freeze the first few layers of the trained transformer model and train only the last two layers for fall detection. Our experimental results demonstrate that our proposed method outperforms both classical machine learning and deep learning approaches in fall/non-fall classification. Overall, our study suggests that our proposed methodology could be a valuable tool for identifying falls.

Keywords: healthcare, fall detection, transformer, transfer learning

Procedia PDF Downloads 146

1067 Multimodal Characterization of Emotion within Multimedia Space

Authors: Dayo Samuel Banjo, Connice Trimmingham, Niloofar Yousefi, Nitin Agarwal

Abstract:

Technological advancement and its omnipresent connection have pushed humans past the boundaries and limitations of a computer screen, physical state, or geographical location. It has provided a range of avenues that facilitate human-computer interaction that was once inconceivable, such as audio and body language detection. Given the complex modalities of emotions, it becomes vital to study human-computer interaction, as it is the starting point for a thorough understanding of the emotional state of users and, in the context of social networks, the producers of multimodal information. This study first acknowledges the accuracy of classification found within multimodal emotion detection systems compared to unimodal solutions. Second, it explores the characterization of multimedia content produced based on their emotions and the coherence of emotion in different modalities by utilizing deep learning models to classify emotion across different modalities.

Keywords: affective computing, deep learning, emotion recognition, multimodal

Procedia PDF Downloads 156

1066 Intelligent Grading System of Apple Using Neural Network Arbitration

Authors: Ebenezer Obaloluwa Olaniyi

Abstract:

In this paper, an intelligent system has been designed to grade apples as either defective or healthy for production in food processing. The paper is divided into two phases. In the first phase, image processing techniques were employed to extract the necessary features from the apple images. These techniques include grayscale conversion and segmentation, where a threshold value is chosen to separate the foreground of the images from the background. Edge detection was then employed to bring out the features in the images. These extracted features were fed into the neural network in the second phase of the paper. The second phase is a classification phase, where a neural network is employed to separate defective apples from healthy apples. In this phase, the network was trained with back propagation and tested with a feed forward network. The recognition rate obtained shows that our system is more accurate and faster than previous work.
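The first-phase preprocessing described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's pipeline: the luminance weights are the standard ITU-R BT.601 coefficients, the cutoff of 128 and the light-background assumption are hypothetical, and edge detection and the neural network are omitted.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples to luminance using ITU-R BT.601 weights."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb_image]

def foreground_mask(gray_image, cutoff=128):
    """Mark pixels darker than the cutoff as foreground (1), the rest as background (0).

    Assumption: the apple is photographed against a light background.
    """
    return [[1 if pixel < cutoff else 0 for pixel in row] for row in gray_image]

# Hypothetical 2x2 image: two near-white background pixels and two red apple pixels.
image = [
    [(250, 250, 250), (200, 30, 30)],
    [(190, 40, 35), (245, 245, 245)],
]
gray = to_grayscale(image)
mask = foreground_mask(gray)
```

The resulting mask is what a later edge detector or feature extractor would operate on; in practice the cutoff would be tuned to the lighting conditions rather than fixed.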

Keywords: image processing, neural network, apple, intelligent system

Procedia PDF Downloads 398

1065 Continuous Improvement Programme as a Strategy for Technological Innovation in Developing Nations. Nigeria as a Case Study

Authors: Sefiu Adebowale Adewumi

Abstract:

Continuous improvement programme (CIP) adopts an approach to improve organizational performance with small incremental steps over time. In this approach, it is not the size of each step that is important, but the likelihood that the improvements will be ongoing. Many companies in developing nations are now complementing continuous improvement with innovation, which is the successful exploitation of new ideas. The focus area of CIP within an organization was related to the size of the organization and to its generic classification. Product quality was prevalent in the manufacturing industry, while manpower training and retraining and marketing strategy were emphasized for improvement in the service, transport and supply industries. However, a focus on innovation in raw materials, processes and methods is needed because these are the critical factors that influence product quality in the manufacturing industries.

Keywords: continuous improvement programme, developing countries, generic classifications, technological innovation

Procedia PDF Downloads 188

1064 Compression Strength of Treated Fine-Grained Soils with Epoxy or Cement

Authors: M. Mlhem

Abstract:

Geotechnical engineers face many problematic soils during construction, and they have the choice of replacing these soils with more appropriate soils or attempting to improve the engineering properties of the soil through a suitable soil stabilization technique. In most cases, improving soils is more environmentally friendly, easier and more economical than other solutions. Soil stabilization is applied by introducing a cementing agent or by injecting a substance to fill the pore volume. Chemical stabilizers are divided into two groups: traditional agents such as cement or lime and non-traditional agents such as polymers. This paper studies the effect of epoxy additives on the compression strength of four types of soil and then compares it with the effect of cement on the compression strength of the same soils. Overall, the epoxy additives are more effective in increasing the strength of different types of soil regardless of classification. On the other hand, there was no clear relation between the studied parameters (liquid limit, percentage passing the No. 200 sieve, and unit weight) and the strength of samples for the different types of soil.

Keywords: additives, clay, compression strength, epoxy, stabilization

Procedia PDF Downloads 127

1063 Artificial Intelligence Models for Detecting Spatiotemporal Crop Water Stress in Automating Irrigation Scheduling: A Review

Authors: Elham Koohi, Silvio Jose Gumiere, Hossein Bonakdari, Saeid Homayouni

Abstract:

Water used in agricultural crops can be managed by irrigation scheduling based on soil moisture levels and plant water stress thresholds. Automated irrigation scheduling limits crop physiological damage and yield reduction. Knowledge of crop water stress monitoring approaches can be effective in optimizing the use of agricultural water. Understanding the physiological mechanisms by which crops respond and adapt to water deficit ensures sustainable agricultural management and food supply. This aim could be achieved by analyzing and diagnosing crop characteristics and their interlinkage with the surrounding environment, including assessments of plant functional types (e.g., leaf area and structure, tree height, rate of evapotranspiration, rate of photosynthesis), controlling changes, and mapping irrigated areas. Calculating thresholds of soil water content parameters, crop water use efficiency, and nitrogen status makes irrigation scheduling decisions more accurate by preventing water limitations between irrigations. Combining Remote Sensing (RS), the Internet of Things (IoT), Artificial Intelligence (AI), and Machine Learning Algorithms (MLAs) can improve measurement accuracies and automate irrigation scheduling. This paper is a review that surveys about 100 recent research studies to analyze varied approaches in terms of providing high spatial and temporal resolution mapping, sensor-based Variable Rate Application (VRA) mapping, and the relation between spectral and thermal reflectance and different features of crop and soil. The other objective is to assess RS indices formed by choosing specific reflectance bands and identifying the correct spectral band to optimize classification techniques, and to analyze Proximal Optical Sensors (POSs) to control changes.
The innovation of this paper lies in categorizing evaluation methodologies for precision irrigation (applying the right practice, at the right place, at the right time, with the right quantity), controlled by soil moisture levels and the sensitivity of crops to water stress, into pre-processing, processing (retrieval algorithms), and post-processing parts. The main idea of this research is then to analyze the causes and/or magnitudes of error in the different approaches, within the three proposed parts, as reported by recent studies. Additionally, as an overall conclusion, the paper breaks down different approaches to optimizing indices, calibration methods for the sensors, thresholding and prediction models prone to errors, and improvements in classification accuracy for mapping changes.

Keywords: agricultural crops, crop water stress detection, irrigation scheduling, precision agriculture, remote sensing

Procedia PDF Downloads 71

1062 Challenges and Opportunities: One Stop Processing for the Automation of Indonesian Large-Scale Topographic Base Map Using Airborne LiDAR Data

Authors: Elyta Widyaningrum

Abstract:

LiDAR data acquisition has been recognized as one of the fastest solutions for providing the base data for topographic base mapping in Indonesia. The challenge of accelerating the provision of large-scale topographic base maps as a basis for development planning gives the opportunity to implement an automated scheme in the map production process. One stop processing will also help accelerate map provision, especially in conforming with the Indonesian fundamental spatial data catalog derived from ISO 19110 and with geospatial database integration. Thus, the automated LiDAR classification, DTM generation and feature extraction will be conducted in one GIS-software environment to form all layers of topographic base maps. The quality of the automated topographic base map will be assessed and analyzed based on its completeness, correctness, contiguity, consistency and possible customization.

Keywords: automation, GIS environment, LiDAR processing, map quality

Procedia PDF Downloads 368