Search results for: DNA sequences classification
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2742

Search results for: DNA sequences classification

2382 Continual Learning Using Data Generation for Hyperspectral Remote Sensing Scene Classification

Authors: Samiah Alammari, Nassim Ammour

Abstract:

When providing a massive number of tasks successively to a deep learning process, a good performance of the model requires preserving the previous tasks data to retrain the model for each upcoming classification. Otherwise, the model performs poorly due to the catastrophic forgetting phenomenon. To overcome this shortcoming, we developed a successful continual learning deep model for remote sensing hyperspectral image regions classification. The proposed neural network architecture encapsulates two trainable subnetworks. The first module adapts its weights by minimizing the discrimination error between the land-cover classes during the new task learning, and the second module tries to learn how to replicate the data of the previous tasks by discovering the latent data structure of the new task dataset. We conduct experiments on HSI dataset Indian Pines. The results confirm the capability of the proposed method.

Keywords: continual learning, data reconstruction, remote sensing, hyperspectral image segmentation

Procedia PDF Downloads 266
2381 Comparing the Apparent Error Rate of Gender Specifying from Human Skeletal Remains by Using Classification and Cluster Methods

Authors: Jularat Chumnaul

Abstract:

In forensic science, corpses from various homicides are different; there are both complete and incomplete, depending on causes of death or forms of homicide. For example, some corpses are cut into pieces, some are camouflaged by dumping into the river, some are buried, some are burned to destroy the evidence, and others. If the corpses are incomplete, it can lead to the difficulty of personally identifying because some tissues and bones are destroyed. To specify gender of the corpses from skeletal remains, the most precise method is DNA identification. However, this method is costly and takes longer so that other identification techniques are used instead. The first technique that is widely used is considering the features of bones. In general, an evidence from the corpses such as some pieces of bones, especially the skull and pelvis can be used to identify their gender. To use this technique, forensic scientists are required observation skills in order to classify the difference between male and female bones. Although this technique is uncomplicated, saving time and cost, and the forensic scientists can fairly accurately determine gender by using this technique (apparently an accuracy rate of 90% or more), the crucial disadvantage is there are only some positions of skeleton that can be used to specify gender such as supraorbital ridge, nuchal crest, temporal lobe, mandible, and chin. Therefore, the skeletal remains that will be used have to be complete. The other technique that is widely used for gender specifying in forensic science and archeology is skeletal measurements. The advantage of this method is it can be used in several positions in one piece of bones, and it can be used even if the bones are not complete. In this study, the classification and cluster analysis are applied to this technique, including the Kth Nearest Neighbor Classification, Classification Tree, Ward Linkage Cluster, K-mean Cluster, and Two Step Cluster. The data contains 507 particular individuals and 9 skeletal measurements (diameter measurements), and the performance of five methods are investigated by considering the apparent error rate (APER). The results from this study indicate that the Two Step Cluster and Kth Nearest Neighbor method seem to be suitable to specify gender from human skeletal remains because both yield small apparent error rate of 0.20% and 4.14%, respectively. On the other hand, the Classification Tree, Ward Linkage Cluster, and K-mean Cluster method are not appropriate since they yield large apparent error rate of 10.65%, 10.65%, and 16.37%, respectively. However, there are other ways to evaluate the performance of classification such as an estimate of the error rate using the holdout procedure or misclassification costs, and the difference methods can make the different conclusions.

Keywords: skeletal measurements, classification, cluster, apparent error rate

Procedia PDF Downloads 251
2380 Non-intrusive Hand Control of Drone Using an Inexpensive and Streamlined Convolutional Neural Network Approach

Authors: Evan Lowhorn, Rocio Alba-Flores

Abstract:

The purpose of this work is to develop a method for classifying hand signals and using the output in a drone control algorithm. To achieve this, methods based on Convolutional Neural Networks (CNN) were applied. CNN's are a subset of deep learning, which allows grid-like inputs to be processed and passed through a neural network to be trained for classification. This type of neural network allows for classification via imaging, which is less intrusive than previous methods using biosensors, such as EMG sensors. Classification CNN's operate purely from the pixel values in an image; therefore they can be used without additional exteroceptive sensors. A development bench was constructed using a desktop computer connected to a high-definition webcam mounted on a scissor arm. This allowed the camera to be pointed downwards at the desk to provide a constant solid background for the dataset and a clear detection area for the user. A MATLAB script was created to automate dataset image capture at the development bench and save the images to the desktop. This allowed the user to create their own dataset of 12,000 images within three hours. These images were evenly distributed among seven classes. The defined classes include forward, backward, left, right, idle, and land. The drone has a popular flip function which was also included as an additional class. To simplify control, the corresponding hand signals chosen were the numerical hand signs for one through five for movements, a fist for land, and the universal “ok” sign for the flip command. Transfer learning with PyTorch (Python) was performed using a pre-trained 18-layer residual learning network (ResNet-18) to retrain the network for custom classification. An algorithm was created to interpret the classification and send encoded messages to a Ryze Tello drone over its 2.4 GHz Wi-Fi connection. The drone’s movements were performed in half-meter distance increments at a constant speed. When combined with the drone control algorithm, the classification performed as desired with negligible latency when compared to the delay in the drone’s movement commands.

Keywords: classification, computer vision, convolutional neural networks, drone control

Procedia PDF Downloads 210
2379 Recommendations to Improve Classification of Grade Crossings in Urban Areas of Mexico

Authors: Javier Alfonso Bonilla-Chávez, Angélica Lozano

Abstract:

In North America, more than 2,000 people annually die in accidents related to railroad tracks. In 2020, collisions at grade crossings were the main cause of deaths related to railway accidents in Mexico. Railway networks have constant interaction with motor transport users, cyclists, and pedestrians, mainly in grade crossings, where is the greatest vulnerability and risk of accidents. Usually, accidents at grade crossings are directly related to risky behavior and non-compliance with regulations by motorists, cyclists, and pedestrians, especially in developing countries. Around the world, countries classify these crossings in different ways. In Mexico, according to their dangerousness (high, medium, or low), types A, B and C have been established, recommending for each one different type of auditive and visual signaling and gates, as well as horizontal and vertical signaling. This classification is based in a weighting, but regrettably, it is not explained how the weight values were obtained. A review of the variables and the current approach for the grade crossing classification is required, since it is inadequate for some crossings. In contrast, North America (USA and Canada) and European countries consider a broader classification so that attention to each crossing is addressed more precisely and equipment costs are adjusted. Lack of a proper classification, could lead to cost overruns in the equipment and a deficient operation. To exemplify the lack of a good classification, six crossings are studied, three located in the rural area of Mexico and three in Mexico City. These cases show the need of: improving the current regulations, improving the existing infrastructure, and implementing technological systems, including informative signals with nomenclature of the involved crossing and direct telephone line for reporting emergencies. This implementation is unaffordable for most municipal governments. Also, an inventory of the most dangerous grade crossings in urban and rural areas must be obtained. Then, an approach for improving the classification of grade crossings is suggested. This approach must be based on criteria design, characteristics of adjacent roads or intersections which can influence traffic flow through the crossing, accidents related to motorized and non-motorized vehicles, land use and land management, type of area, and services and economic activities in the zone where the grade crossings is located. An expanded classification of grade crossing in Mexico could reduce accidents and improve the efficiency of the railroad.

Keywords: accidents, grade crossing, railroad, traffic safety

Procedia PDF Downloads 108
2378 Tensor Deep Stacking Neural Networks and Bilinear Mapping Based Speech Emotion Classification Using Facial Electromyography

Authors: P. S. Jagadeesh Kumar, Yang Yung, Wenli Hu

Abstract:

Speech emotion classification is a dominant research field in finding a sturdy and profligate classifier appropriate for different real-life applications. This effort accentuates on classifying different emotions from speech signal quarried from the features related to pitch, formants, energy contours, jitter, shimmer, spectral, perceptual and temporal features. Tensor deep stacking neural networks were supported to examine the factors that influence the classification success rate. Facial electromyography signals were composed of several forms of focuses in a controlled atmosphere by means of audio-visual stimuli. Proficient facial electromyography signals were pre-processed using moving average filter, and a set of arithmetical features were excavated. Extracted features were mapped into consistent emotions using bilinear mapping. With facial electromyography signals, a database comprising diverse emotions will be exposed with a suitable fine-tuning of features and training data. A success rate of 92% can be attained deprived of increasing the system connivance and the computation time for sorting diverse emotional states.

Keywords: speech emotion classification, tensor deep stacking neural networks, facial electromyography, bilinear mapping, audio-visual stimuli

Procedia PDF Downloads 254
2377 Research on Reservoir Lithology Prediction Based on Residual Neural Network and Squeeze-and- Excitation Neural Network

Authors: Li Kewen, Su Zhaoxin, Wang Xingmou, Zhu Jian Bing

Abstract:

Conventional reservoir prediction methods ar not sufficient to explore the implicit relation between seismic attributes, and thus data utilization is low. In order to improve the predictive classification accuracy of reservoir lithology, this paper proposes a deep learning lithology prediction method based on ResNet (Residual Neural Network) and SENet (Squeeze-and-Excitation Neural Network). The neural network model is built and trained by using seismic attribute data and lithology data of Shengli oilfield, and the nonlinear mapping relationship between seismic attribute and lithology marker is established. The experimental results show that this method can significantly improve the classification effect of reservoir lithology, and the classification accuracy is close to 70%. This study can effectively predict the lithology of undrilled area and provide support for exploration and development.

Keywords: convolutional neural network, lithology, prediction of reservoir, seismic attributes

Procedia PDF Downloads 176
2376 Random Forest Classification for Population Segmentation

Authors: Regina Chua

Abstract:

To reduce the costs of re-fielding a large survey, a Random Forest classifier was applied to measure the accuracy of classifying individuals into their assigned segments with the fewest possible questions. Given a long survey, one needed to determine the most predictive ten or fewer questions that would accurately assign new individuals to custom segments. Furthermore, the solution needed to be quick in its classification and usable in non-Python environments. In this paper, a supervised Random Forest classifier was modeled on a dataset with 7,000 individuals, 60 questions, and 254 features. The Random Forest consisted of an iterative collection of individual decision trees that result in a predicted segment with robust precision and recall scores compared to a single tree. A random 70-30 stratified sampling for training the algorithm was used, and accuracy trade-offs at different depths for each segment were identified. Ultimately, the Random Forest classifier performed at 87% accuracy at a depth of 10 with 20 instead of 254 features and 10 instead of 60 questions. With an acceptable accuracy in prioritizing feature selection, new tools were developed for non-Python environments: a worksheet with a formulaic version of the algorithm and an embedded function to predict the segment of an individual in real-time. Random Forest was determined to be an optimal classification model by its feature selection, performance, processing speed, and flexible application in other environments.

Keywords: machine learning, supervised learning, data science, random forest, classification, prediction, predictive modeling

Procedia PDF Downloads 94
2375 Phylogenetic Analysis Based On the Internal Transcribed Spacer-2 (ITS2) Sequences of Diadegma semiclausum (Hymenoptera: Ichneumonidae) Populations Reveals Significant Adaptive Evolution

Authors: Ebraheem Al-Jouri, Youssef Abu-Ahmad, Ramasamy Srinivasan

Abstract:

The parasitoid, Diadegma semiclausum (Hymenoptera: Ichneumonidae) is one of the most effective exotic parasitoids of diamondback moth (DBM), Plutella xylostella in the lowland areas of Homs, Syria. Molecular evolution studies are useful tools to shed light on the molecular bases of insect geographical spread and adaptation to new hosts and environment and for designing better control strategies. In this study, molecular evolution analysis was performed based on the 42 nuclear internal transcribed spacer-2 (ITS2) sequences representing the D. semiclausum and eight other Diadegma spp. from Syria and worldwide. Possible recombination events were identified by RDP4 program. Four potential recombinants of the American D. insulare and D. fenestrale (Jeju) were detected. After detecting and removing recombinant sequences, the ratio of non-synonymous (dN) to synonymous (dS) substitutions per site (dN/dS=ɷ) has been used to identify codon positions involved in adaptive processes. Bayesian techniques were applied to detect selective pressures at a codon level by using five different approaches including: fixed effects likelihood (FEL), internal fixed effects likelihood (IFEL), random effects method (REL), mixed effects model of evolution (MEME) and Program analysis of maximum liklehood (PAML). Among the 40 positively selected amino acids (aa) that differed significantly between clades of Diadegma species, three aa under positive selection were only identified in D. semiclausum. Additionally, all D. semiclausum branches tree were highly found under episodic diversifying selection (EDS) at p≤0.05. Our study provide evidence that both recombination and positive selection have contributed to the molecular diversity of Diadegma spp. and highlights the significant contribution of D. semiclausum in adaptive evolution and influence the fitness in the DBM parasitoid.

Keywords: diadegma sp, DBM, ITS2, phylogeny, recombination, dN/dS, evolution, positive selection

Procedia PDF Downloads 416
2374 Genetic Algorithms for Feature Generation in the Context of Audio Classification

Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes

Abstract:

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a useful representation to machine learning tasks. In automatic audio classification tasks, this is interesting since the audio, usually complex information, needs to be transformed into a computationally convenient input to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have being used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.

Keywords: feature generation, feature learning, genetic algorithm, music information retrieval

Procedia PDF Downloads 434
2373 Effect of the Birth Order and Arrival of Younger Siblings on the Development of a Child: Evidence from India

Authors: Swati Srivastava, Ashish Kumar Upadhyay

Abstract:

Using longitudinal data from three waves of Young Lives Study and Ordinary Least Square methods, study has investigated the effect of birth order and arrival of younger siblings on child development in India. Study used child’s height for age z-score, weight for age z-score, BMI for age z-score, Peabody Picture Vocabulary Test (PPVT-Score)c, maths score, Early Grade Reading Assessment Test (ERGA) score, and memory score to measure the physical and cognitive development of child during wave-3. Findings suggest that having a high birth order is detrimental for child development and the gap between adjacent siblings is larger for children late in the birth sequences than early in the birth sequences. Study also reported that not only older siblings but arrival of younger siblings before assessment of test also reduces the development of a child. The effects become stronger in case of female children than male children.

Keywords: height for age z-score, weight for age z-score, BMI for z-score, PPVT score, math score, EGRA score, memory score, birth order, siblings, Young Lives Study, India

Procedia PDF Downloads 335
2372 Computational Model for Predicting Effective siRNA Sequences Using Whole Stacking Energy (ΔG) for Gene Silencing

Authors: Reena Murali, David Peter S.

Abstract:

The small interfering RNA (siRNA) alters the regulatory role of mRNA during gene expression by translational inhibition. Recent studies shows that up regulation of mRNA cause serious diseases like Cancer. So designing effective siRNA with good knockdown effects play an important role in gene silencing. Various siRNA design tools had been developed earlier. In this work, we are trying to analyze the existing good scoring second generation siRNA predicting tools and to optimize the efficiency of siRNA prediction by designing a computational model using Artificial Neural Network and whole stacking energy (ΔG), which may help in gene silencing and drug design in cancer therapy. Our model is trained and tested against a large data set of siRNA sequences. Validation of our results is done by finding correlation coefficient of experimental versus observed inhibition efficacy of siRNA. We achieved a correlation coefficient of 0.727 in our previous computational model and we could improve the correlation coefficient up to 0.753 when the threshold of whole tacking energy is greater than or equal to -32.5 kcal/mol.

Keywords: artificial neural network, double stranded RNA, RNA interference, short interfering RNA

Procedia PDF Downloads 526
2371 Machine Learning-Enabled Classification of Climbing Using Small Data

Authors: Nicholas Milburn, Yu Liang, Dalei Wu

Abstract:

Athlete performance scoring within the climbing do-main presents interesting challenges as the sport does not have an objective way to assign skill. Assessing skill levels within any sport is valuable as it can be used to mark progress while training, and it can help an athlete choose appropriate climbs to attempt. Machine learning-based methods are popular for complex problems like this. The dataset available was composed of dynamic force data recorded during climbing; however, this dataset came with challenges such as data scarcity, imbalance, and it was temporally heterogeneous. Investigated solutions to these challenges include data augmentation, temporal normalization, conversion of time series to the spectral domain, and cross validation strategies. The investigated solutions to the classification problem included light weight machine classifiers KNN and SVM as well as the deep learning with CNN. The best performing model had an 80% accuracy. In conclusion, there seems to be enough information within climbing force data to accurately categorize climbers by skill.

Keywords: classification, climbing, data imbalance, data scarcity, machine learning, time sequence

Procedia PDF Downloads 142
2370 Occurrence of Porcine circovirus Type 2 in Pigs of Eastern Cape Province South Africa

Authors: Kayode O. Afolabi, Benson C. Iweriebor, Anthony I. Okoh, Larry C. Obi

Abstract:

Porcine circovirus type 2 (PCV2) is the major etiological viral agent of porcine multisystemic wasting syndrome (PWMS) and other porcine circovirus-associated diseases (PCVAD) of great economic importance in pig industry globally. In an effort to determine the status of swine herds in the Province as regarding the ‘small but powerful’ viral pathogen; a total of 375 blood, faecal and nasal swab samples were obtained from seven pig farms (commercial and communal) in Amathole, O.R. Tambo and Chris-Hani District Municipalities of Eastern Cape Province between the year 2015 and 2016. Three hundred and thirty nine (339) samples out of the total sample were subjected to molecular screening using PCV2 specific primers by conventional polymerase chain reaction (PCR). Selected sequences were further analyzed and confirmed through genome sequencing and phylogenetic analyses. The data obtained revealed that 15.93% of the screened samples (54/339) from the swine herds of the studied areas were positive for PCV2; while the severity of occurrence of the viral pathogen as observed at farm level ranges from approximately 5.6% to 60% in the studied farms. The Majority, precisely 15 out of 17 (88%) analyzed sequences were found clustering with other PCV2b reference strains in the phylogenetic analysis. More interestingly, two other sequences obtained were also found clustering within PCV2d genogroup, which is presently another fast-spreading genotype with observable higher virulence in global swine herds. This finding confirmed the presence of this all-important viral pathogen in pigs of the region; which could result in a serious outbreak of PCVAD and huge economic loss at the instances of triggering factors if no appropriate measures are taken to curb its spread effectively.

Keywords: pigs, polymerase chain reaction, porcine circovirus type 2, South Africa

Procedia PDF Downloads 209
2369 An Approach for Vocal Register Recognition Based on Spectral Analysis of Singing

Authors: Aleksandra Zysk, Pawel Badura

Abstract:

Recognizing and controlling vocal registers during singing is a difficult task for beginner vocalist. It requires among others identifying which part of natural resonators is being used when a sound propagates through the body. Thus, an application has been designed allowing for sound recording, automatic vocal register recognition (VRR), and a graphical user interface providing real-time visualization of the signal and recognition results. Six spectral features are determined for each time frame and passed to the support vector machine classifier yielding a binary decision on the head or chest register assignment of the segment. The classification training and testing data have been recorded by ten professional female singers (soprano, aged 19-29) performing sounds for both chest and head register. The classification accuracy exceeded 93% in each of various validation schemes. Apart from a hard two-class clustering, the support vector classifier returns also information on the distance between particular feature vector and the discrimination hyperplane in a feature space. Such an information reflects the level of certainty of the vocal register classification in a fuzzy way. Thus, the designed recognition and training application is able to assess and visualize the continuous trend in singing in a user-friendly graphical mode providing an easy way to control the vocal emission.

Keywords: classification, singing, spectral analysis, vocal emission, vocal register

Procedia PDF Downloads 303
2368 Performance Comparison of Deep Convolutional Neural Networks for Binary Classification of Fine-Grained Leaf Images

Authors: Kamal KC, Zhendong Yin, Dasen Li, Zhilu Wu

Abstract:

Intra-plant disease classification based on leaf images is a challenging computer vision task due to similarities in texture, color, and shape of leaves with a slight variation of leaf spot; and external environmental changes such as lighting and background noises. Deep convolutional neural network (DCNN) has proven to be an effective tool for binary classification. In this paper, two methods for binary classification of diseased plant leaves using DCNN are presented; model created from scratch and transfer learning. Our main contribution is a thorough evaluation of 4 networks created from scratch and transfer learning of 5 pre-trained models. Training and testing of these models were performed on a plant leaf images dataset belonging to 16 distinct classes, containing a total of 22,265 images from 8 different plants, consisting of a pair of healthy and diseased leaves. We introduce a deep CNN model, Optimized MobileNet. This model with depthwise separable CNN as a building block attained an average test accuracy of 99.77%. We also present a fine-tuning method by introducing the concept of a convolutional block, which is a collection of different deep neural layers. Fine-tuned models proved to be efficient in terms of accuracy and computational cost. Fine-tuned MobileNet achieved an average test accuracy of 99.89% on 8 pairs of [healthy, diseased] leaf ImageSet.

Keywords: deep convolution neural network, depthwise separable convolution, fine-grained classification, MobileNet, plant disease, transfer learning

Procedia PDF Downloads 186
2367 Harmonic Data Preparation for Clustering and Classification

Authors: Ali Asheibi

Abstract:

The rapid increase in the size of databases required to store power quality monitoring data has demanded new techniques for analysing and understanding the data. One suggested technique to assist in analysis is data mining. Preparing raw data to be ready for data mining exploration take up most of the effort and time spent in the whole data mining process. Clustering is an important technique in data mining and machine learning in which underlying and meaningful groups of data are discovered. Large amounts of harmonic data have been collected from an actual harmonic monitoring system in a distribution system in Australia for three years. This amount of acquired data makes it difficult to identify operational events that significantly impact the harmonics generated on the system. In this paper, harmonic data preparation processes to better understanding of the data have been presented. Underlying classes in this data has then been identified using clustering technique based on the Minimum Message Length (MML) method. The underlying operational information contained within the clusters can be rapidly visualised by the engineers. The C5.0 algorithm was used for classification and interpretation of the generated clusters.

Keywords: data mining, harmonic data, clustering, classification

Procedia PDF Downloads 247
2366 Unravelling the Knot: Towards a Definition of ‘Digital Labor’

Authors: Marta D'Onofrio

Abstract:

The debate on the digitalization of the economy has raised questions about how both labor and the regulation of work processes are changing due to the introduction of digital technologies in the productive system. Within the literature, the term ‘digital labor’ is commonly used to identify the impact of digitalization on labor. Despite the wide use of this term, it is still not available an unambiguous definition of it, and this could create confusion in the use of terminology and in the attempts of classification. As a consequence, the purpose of this paper is to provide for a definition and to propose a classification of ‘digital labor’, resorting to the theoretical approach of organizational studies.

Keywords: digital labor, digitalization, data-driven algorithms, big data, organizational studies

Procedia PDF Downloads 153
2365 Predicting Potential Protein Therapeutic Candidates from the Gut Microbiome

Authors: Prasanna Ramachandran, Kareem Graham, Helena Kiefel, Sunit Jain, Todd DeSantis

Abstract:

Microbes that reside inside the mammalian GI tract, commonly referred to as the gut microbiome, have been shown to have therapeutic effects in animal models of disease. We hypothesize that specific proteins produced by these microbes are responsible for this activity and may be used directly as therapeutics. To speed up the discovery of these key proteins from the big-data metagenomics, we have applied machine learning techniques. Using amino acid sequences of known epitopes and their corresponding binding partners, protein interaction descriptors (PID) were calculated, making a positive interaction set. A negative interaction dataset was calculated using sequences of proteins known not to interact with these same binding partners. Using Random Forest and positive and negative PID, a machine learning model was trained and used to predict interacting versus non-interacting proteins. Furthermore, the continuous variable, cosine similarity in the interaction descriptors was used to rank bacterial therapeutic candidates. Laboratory binding assays were conducted to test the candidates for their potential as therapeutics. Results from binding assays reveal the accuracy of the machine learning prediction and are subsequently used to further improve the model.

Keywords: protein-interactions, machine-learning, metagenomics, microbiome

Procedia PDF Downloads 376
2364 Classification of Tropical Semi-Modules

Authors: Wagneur Edouard

Abstract:

Tropical algebra is the algebra constructed over an idempotent semifield S. We show here that every m-dimensional tropical module M over S with strongly independent basis can be embedded into Sm, and provide an algebraic invariant -the Γ-matrix of M- which characterises the isomorphy class of M. The strong independence condition also yields a significant improvement to the Whitney embedding for tropical torsion modules published earlier We also show that the strong independence of the basis of M is equivalent to the unique representation of elements of M. Numerous examples illustrate our results.

Keywords: classification, idempotent semi-modules, strong independence, tropical algebra

Procedia PDF Downloads 370
2363 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets

Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi

Abstract:

Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between different patient attributes and features and the disease, in this paper, we have tried to investigate the different classification methods for better diagnosis of BC in the early stages. In this regard, datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers have been studied. For this purpose, we have selected favorable features among the nine provided attributes from the clinical dataset by using a random forest algorithm. This dataset consists of both healthy controls and BC patients, and it was noted that glucose, BMI, resistin, and age have the most importance, respectively. Moreover, we have analyzed these features with various ML-based classifier methods, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM) along with NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that among different techniques, the SVM and MLP classifiers have the most accuracy, with amounts of 96% and 92%, respectively. These results divulged that the adopted procedure could be used effectively for the classification of cancer cells, and also it encourages further experimental investigations with more collected data for other types of cancers.

Keywords: breast cancer, diagnosis, machine learning, biomarker classification, neural network

Procedia PDF Downloads 135
2362 Engineering Parameters and Classification of Marly Soils of Tabriz

Authors: Amirali Mahouti, Hooshang Katebi

Abstract:

Enlargement of Tabriz metropolis to the east and north-east caused urban construction to be built on Marl layers and because of increase in excavations depth, further information of this layer is inescapable. Looking at geotechnical investigation shows there is not enough information about Tabriz Marl and this soil has been classified only by color. Tabriz Marl is lacustrine carbonate sediment outcrops, surrounds eastern, northern and southern region of city in the East Azerbaijan Province of Iran and is known as bed rock of city under alluvium sediments. This investigation aims to characterize geotechnical parameters of this soil to identify and set it in classification system of carbonated soils. For this purpose, specimens obtained from 80 locations over the city and subjected to physical and mechanical tests, such as Atterberg limits, density, moisture content, unconfined compression, direct shear and consolidation. CaCO3 content, organic content, PH, XRD, XRF, TGA and geophysical downhole tests also have been done on some of them.

Keywords: carbonated soils, classification of soils, mineralogy, physical and mechanical tests for Marls, Tabriz Marl

Procedia PDF Downloads 317
2361 Using New Machine Algorithms to Classify Iranian Musical Instruments According to Temporal, Spectral and Coefficient Features

Authors: Ronak Khosravi, Mahmood Abbasi Layegh, Siamak Haghipour, Avin Esmaili

Abstract:

In this paper, a study on classification of musical woodwind instruments using a small set of features selected from a broad range of extracted ones by the sequential forward selection method was carried out. Firstly, we extract 42 features for each record in the music database of 402 sound files belonging to five different groups of Flutes (end blown and internal duct), Single –reed, Double –reed (exposed and capped), Triple reed and Quadruple reed. Then, the sequential forward selection method is adopted to choose the best feature set in order to achieve very high classification accuracy. Two different classification techniques of support vector machines and relevance vector machines have been tested out and an accuracy of up to 96% can be achieved by using 21 time, frequency and coefficient features and relevance vector machine with the Gaussian kernel function.

Keywords: coefficient features, relevance vector machines, spectral features, support vector machines, temporal features

Procedia PDF Downloads 320
2360 Stabilization of Clay Soil Using A-3 Soil

Authors: Mohammed Mustapha Alhaji, Sadiku Salawu

Abstract:

A clay soil which classified under A-7-6 soil according to AASHTO soil classification system and CH according to the unified soil classification system was stabilized using A-3 soil (AASHTO soil classification system). The clay soil was replaced with 0%, 10%, 20% to 100% A-3 soil, compacted at both the BSL and BSH compaction energy level and using unconfined compressive strength as evaluation criteria. The MDD of the compactions at both the BSL and BSH compaction energy levels showed increase in MDD from 0% A-3 soil replacement to 40% A-3 soil replacement after which the values reduced to 100% A-3 soil replacement. The trend of the OMC with varied A-3 soil replacement is similar to that of MDD but in a reversed order. The OMC reduced from 0% A-3 soil replacement to 40% A-3 soil replacement after which the values increased to 100% A-3 soil replacement. This trend was attributed to the observed reduction in the void ratio from 0% A-3 soil replacement to 40% A-3 soil replacement after which the void ratio increased to 100% A-3 soil replacement. The maximum UCS for clay at varied A-3 soil replacement increased from 272 and 770kN/m2 for BSL and BSH compaction energy level at 0% A-3 soil replacement to 295 and 795kN/m2 for BSL and BSH compaction energy level respectively at 10% A-3 soil replacement after which the values reduced to 22 and 60kN/m2 for BSL and BSH compaction energy level respectively at 70% A-3 soil replacement. Beyond 70% A-3 soil replacement, the mixture cannot be moulded for UCS test.

Keywords: A-3 soil, clay minerals, pozzolanic action, stabilization

Procedia PDF Downloads 444
2359 Stereo Camera Based Speed-Hump Detection Process for Real Time Driving Assistance System in the Daytime

Authors: Hyun-Koo Kim, Yong-Hun Kim, Soo-Young Suk, Ju H. Park, Ho-Youl Jung

Abstract:

This paper presents an effective speed hump detection process at the day-time. we focus only on round types of speed humps in the day-time dynamic road environment. The proposed speed hump detection scheme consists mainly of two process as stereo matching and speed hump detection process. Our proposed process focuses to speed hump detection process. Speed hump detection process consist of noise reduction step, data fusion step, and speed hemp detection step. The proposed system is tested on Intel Core CPU with 2.80 GHz and 4 GB RAM tested in the urban road environments. The frame rate of test videos is 30 frames per second and the size of each frame of grabbed image sequences is 1280 pixels by 670 pixels. Using object-marked sequences acquired with an on-vehicle camera, we recorded speed humps and non-speed humps samples. Result of the tests, our proposed method can be applied in real-time systems by computation time is 13 ms. For instance; our proposed method reaches 96.1 %.

Keywords: data fusion, round types speed hump, speed hump detection, surface filter

Procedia PDF Downloads 510
2358 Using India’s Traditional Knowledge Digital Library on Traditional Tibetan Medicine

Authors: Chimey Lhamo, Ngawang Tsering

Abstract:

Traditional Tibetan medicine, known as Sowa Rigpa (Science of healing), originated more than 2500 years ago with an insightful background, and it has been growing significant attention in many Asian countries like China, India, Bhutan, and Nepal. Particularly, the Indian government has targeted Traditional Tibetan medicine as its major Indian medical system, including Ayurveda. Although Traditional Tibetan medicine has been growing interest and has a long history, it is not easily recognized worldwide because it exists only in the Tibetan language and it is neither accessible nor understood by patent examiners at the international patent office, data about Traditional Tibetan medicine is not yet broadly exist in the Internet. There has also been the exploitation of traditional Tibetan medicine increasing. The Traditional Knowledge Digital Library is a database aiming to prevent the patenting and misappropriation of India’s traditional medicine knowledge by using India’s Traditional knowledge Digital Library on Sowa Rigpa in order to prevent its exploitation at international patent with the help of information technology tools and an innovative classification systems-traditional knowledge resource classification (TKRC). As of date, more than 3000 Sowa Rigpa formulations have been transcribed into a Traditional Knowledge Digital Library database. In this paper, we are presenting India's Traditional Knowledge Digital Library for Traditional Tibetan medicine, and this database system helps to preserve and prevent the exploitation of Sowa Rigpa. Gradually it will be approved and accepted globally.

Keywords: traditional Tibetan medicine, India's traditional knowledge digital library, traditional knowledge resources classification, international patent classification

Procedia PDF Downloads 128
2357 A Review: Detection and Classification Defects on Banana and Apples by Computer Vision

Authors: Zahow Muoftah

Abstract:

Traditional manual visual grading of fruits has been one of the agricultural industry’s major challenges due to its laborious nature as well as inconsistency in the inspection and classification process. The main requirements for computer vision and visual processing are some effective techniques for identifying defects and estimating defect areas. Automated defect detection using computer vision and machine learning has emerged as a promising area of research with a high and direct impact on the visual inspection domain. Grading, sorting, and disease detection are important factors in determining the quality of fruits after harvest. Many studies have used computer vision to evaluate the quality level of fruits during post-harvest. Many studies have used computer vision to evaluate the quality level of fruits during post-harvest. Many studies have been conducted to identify diseases and pests that affect the fruits of agricultural crops. However, most previous studies concentrated solely on the diagnosis of a lesion or disease. This study focused on a comprehensive study to identify pests and diseases of apple and banana fruits using detection and classification defects on Banana and Apples by Computer Vision. As a result, the current article includes research from these domains as well. Finally, various pattern recognition techniques for detecting apple and banana defects are discussed.

Keywords: computer vision, banana, apple, detection, classification

Procedia PDF Downloads 106
2356 Interpretation of the Russia-Ukraine 2022 War via N-Gram Analysis

Authors: Elcin Timur Cakmak, Ayse Oguzlar

Abstract:

This study presents the results of the tweets sent by Twitter users on social media about the Russia-Ukraine war by bigram and trigram methods. On February 24, 2022, Russian President Vladimir Putin declared a military operation against Ukraine, and all eyes were turned to this war. Many people living in Russia and Ukraine reacted to this war and protested and also expressed their deep concern about this war as they felt the safety of their families and their futures were at stake. Most people, especially those living in Russia and Ukraine, express their views on the war in different ways. The most popular way to do this is through social media. Many people prefer to convey their feelings using Twitter, one of the most frequently used social media tools. Since the beginning of the war, it is seen that there have been thousands of tweets about the war from many countries of the world on Twitter. These tweets accumulated in data sources are extracted using various codes for analysis through Twitter API and analysed by Python programming language. The aim of the study is to find the word sequences in these tweets by the n-gram method, which is known for its widespread use in computational linguistics and natural language processing. The tweet language used in the study is English. The data set consists of the data obtained from Twitter between February 24, 2022, and April 24, 2022. The tweets obtained from Twitter using the #ukraine, #russia, #war, #putin, #zelensky hashtags together were captured as raw data, and the remaining tweets were included in the analysis stage after they were cleaned through the preprocessing stage. In the data analysis part, the sentiments are found to present what people send as a message about the war on Twitter. Regarding this, negative messages make up the majority of all the tweets as a ratio of %63,6. Furthermore, the most frequently used bigram and trigram word groups are found. Regarding the results, the most frequently used word groups are “he, is”, “I, do”, “I, am” for bigrams. Also, the most frequently used word groups are “I, do, not”, “I, am, not”, “I, can, not” for trigrams. In the machine learning phase, the accuracy of classifications is measured by Classification and Regression Trees (CART) and Naïve Bayes (NB) algorithms. The algorithms are used separately for bigrams and trigrams. We gained the highest accuracy and F-measure values by the NB algorithm and the highest precision and recall values by the CART algorithm for bigrams. On the other hand, the highest values for accuracy, precision, and F-measure values are achieved by the CART algorithm, and the highest value for the recall is gained by NB for trigrams.

Keywords: classification algorithms, machine learning, sentiment analysis, Twitter

Procedia PDF Downloads 73
2355 Maturity Classification of Oil Palm Fresh Fruit Bunches Using Thermal Imaging Technique

Authors: Shahrzad Zolfagharnassab, Abdul Rashid Mohamed Shariff, Reza Ehsani, Hawa Ze Jaffar, Ishak Aris

Abstract:

Ripeness estimation of oil palm fresh fruit is important processes that affect the profitableness and salability of oil palm fruits. The adulthood or ripeness of the oil palm fruits influences the quality of oil palm. Conventional procedure includes physical grading of Fresh Fruit Bunches (FFB) maturity by calculating the number of loose fruits per bunch. This physical classification of oil palm FFB is costly, time consuming and the results may have human error. Hence, many researchers try to develop the methods for ascertaining the maturity of oil palm fruits and thereby, deviously the oil content of distinct palm fruits without the need for exhausting oil extraction and analysis. This research investigates the potential of infrared images (Thermal Images) as a predictor to classify the oil palm FFB ripeness. A total of 270 oil palm fresh fruit bunches from most common cultivar of oil palm bunches Nigresens according to three maturity categories: under ripe, ripe and over ripe were collected. Each sample was scanned by the thermal imaging cameras FLIR E60 and FLIR T440. The average temperature of each bunches were calculated by using image processing in FLIR Tools and FLIR ThermaCAM researcher pro 2.10 environment software. The results show that temperature content decreased from immature to over mature oil palm FFBs. An overall analysis-of-variance (ANOVA) test was proved that this predictor gave significant difference between underripe, ripe and overripe maturity categories. This shows that the temperature as predictors can be good indicators to classify oil palm FFB. Classification analysis was performed by using the temperature of the FFB as predictors through Linear Discriminant Analysis (LDA), Mahalanobis Discriminant Analysis (MDA), Artificial Neural Network (ANN) and K- Nearest Neighbor (KNN) methods. The highest overall classification accuracy was 88.2% by using Artificial Neural Network. This research proves that thermal imaging and neural network method can be used as predictors of oil palm maturity classification.

Keywords: artificial neural network, maturity classification, oil palm FFB, thermal imaging

Procedia PDF Downloads 360
2354 Learning from Dendrites: Improving the Point Neuron Model

Authors: Alexander Vandesompele, Joni Dambre

Abstract:

The diversity in dendritic arborization, as first illustrated by Santiago Ramon y Cajal, has always suggested a role for dendrites in the functionality of neurons. In the past decades, thanks to new recording techniques and optical stimulation methods, it has become clear that dendrites are not merely passive electrical components. They are observed to integrate inputs in a non-linear fashion and actively participate in computations. Regardless, in simulations of neural networks dendritic structure and functionality are often overlooked. Especially in a machine learning context, when designing artificial neural networks, point neuron models such as the leaky-integrate-and-fire (LIF) model are dominant. These models mimic the integration of inputs at the neuron soma, and ignore the existence of dendrites. In this work, the LIF point neuron model is extended with a simple form of dendritic computation. This gives the LIF neuron increased capacity to discriminate spatiotemporal input sequences, a dendritic functionality as observed in another study. Simulations of the spiking neurons are performed using the Bindsnet framework. In the common LIF model, incoming synapses are independent. Here, we introduce a dependency between incoming synapses such that the post-synaptic impact of a spike is not only determined by the weight of the synapse, but also by the activity of other synapses. This is a form of short term plasticity where synapses are potentiated or depressed by the preceding activity of neighbouring synapses. This is a straightforward way to prevent inputs from simply summing linearly at the soma. To implement this, each pair of synapses on a neuron is assigned a variable,representing the synaptic relation. This variable determines the magnitude ofthe short term plasticity. These variables can be chosen randomly or, more interestingly, can be learned using a form of Hebbian learning. We use Spike-Time-Dependent-Plasticity (STDP), commonly used to learn synaptic strength magnitudes. If all neurons in a layer receive the same input, they tend to learn the same through STDP. Adding inhibitory connections between the neurons creates a winner-take-all (WTA) network. This causes the different neurons to learn different input sequences. To illustrate the impact of the proposed dendritic mechanism, even without learning, we attach five input neurons to two output neurons. One output neuron isa regular LIF neuron, the other output neuron is a LIF neuron with dendritic relationships. Then, the five input neurons are allowed to fire in a particular order. The membrane potentials are reset and subsequently the five input neurons are fired in the reversed order. As the regular LIF neuron linearly integrates its inputs at the soma, the membrane potential response to both sequences is similar in magnitude. In the other output neuron, due to the dendritic mechanism, the membrane potential response is different for both sequences. Hence, the dendritic mechanism improves the neuron’s capacity for discriminating spa-tiotemporal sequences. Dendritic computations improve LIF neurons even if the relationships between synapses are established randomly. Ideally however, a learning rule is used to improve the dendritic relationships based on input data. It is possible to learn synaptic strength with STDP, to make a neuron more sensitive to its input. Similarly, it is possible to learn dendritic relationships with STDP, to make the neuron more sensitive to spatiotemporal input sequences. Feeding structured data to a WTA network with dendritic computation leads to a significantly higher number of discriminated input patterns. Without the dendritic computation, output neurons are less specific and may, for instance, be activated by a sequence in reverse order.

Keywords: dendritic computation, spiking neural networks, point neuron model

Procedia PDF Downloads 133
2353 Amharic Text News Classification Using Supervised Learning

Authors: Misrak Assefa

Abstract:

The Amharic language is the second most widely spoken Semitic language in the world. There are several new overloaded on the web. Searching some useful documents from the web on a specific topic, which is written in the Amharic language, is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap in the domain of information that needs to be launch. This study attempts to design an automatic Amharic news classification using a supervised learning mechanism on four un-touch classes. To achieve this research, 4,182 news articles were used. Naive Bayes (NB) and Decision tree (j48) algorithms were used to classify the given Amharic dataset. In this paper, k-fold cross-validation is used to estimate the accuracy of the classifier. As a result, it shows those algorithms can be applicable in Amharic news categorization. The best average accuracy result is achieved by j48 decision tree and naïve Bayes is 95.2345 %, and 94.6245 % respectively using three categories. This research indicated that a typical decision tree algorithm is more applicable to Amharic news categorization.

Keywords: text categorization, supervised machine learning, naive Bayes, decision tree

Procedia PDF Downloads 209