Search results for: classification of matter
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3833

3533 A Spatial Hypergraph Based Semi-Supervised Band Selection Method for Hyperspectral Imagery Semantic Interpretation

Authors: Akrem Sellami, Imed Riadh Farah

Abstract:

Hyperspectral imagery (HSI) typically provides a wealth of information captured in a wide range of the electromagnetic spectrum for each pixel in the image. Hence, a pixel in HSI is a high-dimensional vector of intensities with a large spectral range and a high spectral resolution. Therefore, the semantic interpretation is a challenging task of HSI analysis. We focused in this paper on object classification as HSI semantic interpretation. However, HSI classification still faces some issues, among which are the following: The spatial variability of spectral signatures, the high number of spectral bands, and the high cost of true sample labeling. Therefore, the high number of spectral bands and the low number of training samples pose the problem of the curse of dimensionality. In order to resolve this problem, we propose to introduce the process of dimensionality reduction trying to improve the classification of HSI. The presented approach is a semi-supervised band selection method based on spatial hypergraph embedding model to represent higher order relationships with different weights of the spatial neighbors corresponding to the centroid of pixel. This semi-supervised band selection has been developed to select useful bands for object classification. The presented approach is evaluated on AVIRIS and ROSIS HSIs and compared to other dimensionality reduction methods. The experimental results demonstrate the efficacy of our approach compared to many existing dimensionality reduction methods for HSI classification.

Keywords: dimensionality reduction, hyperspectral image, semantic interpretation, spatial hypergraph

Procedia PDF Downloads 305
3532 Government (Big) Data Ecosystem: Definition, Classification of Actors, and Their Roles

Authors: Syed Iftikhar Hussain Shah, Vasilis Peristeras, Ioannis Magnisalis

Abstract:

Organizations, including governments, generate (big) data that are high in volume, velocity, and veracity and come from a variety of sources. Public administrations are using (big) data, implementing base registries, and enforcing data sharing across the entire government to deliver (big) data related integrated services, provide insights to users, and support good governance. Government (big) data ecosystem actors represent distinct entities that provide data, consume data, manipulate data to offer paid services, and extend data services such as data storage and hosting to other actors. In this research work, we perform a systematic literature review. The key objectives of this paper are to propose a robust definition of the government (big) data ecosystem and a classification of government (big) data ecosystem actors and their roles. We showcase a graphical view of actors, roles, and their relationships in the government (big) data ecosystem. We also discuss our research findings. We did not find many published research articles about the government (big) data ecosystem, including its definition and the classification of actors and their roles. Therefore, we borrowed ideas for the government (big) data ecosystem from numerous areas in the literature, including scientific research data, humanitarian data, open government data, and industry data.

Keywords: big data, big data ecosystem, classification of big data actors, big data actors roles, definition of government (big) data ecosystem, data-driven government, eGovernment, gaps in data ecosystems, government (big) data, public administration, systematic literature review

Procedia PDF Downloads 157
3531 Towards a Balancing Medical Database by Using the Least Mean Square Algorithm

Authors: Kamel Belammi, Houria Fatrim

Abstract:

Imbalanced data sets, a problem often found in real-world applications, can have a seriously negative effect on the classification performance of machine learning algorithms. There have been many attempts at dealing with the classification of imbalanced data sets. In medical diagnosis classification, we often face an imbalanced number of data samples between the classes, in which there are not enough samples in the rare classes. In this paper, we propose a learning method based on a cost-sensitive extension of the Least Mean Square (LMS) algorithm that penalizes errors of different samples with different weights, together with some rules of thumb to determine those weights. After the balancing phase, we apply different classifiers (support vector machine (SVM), k-nearest neighbor (KNN), and multilayer neural networks (MNN)) to the balanced data set. We also compare the results obtained before and after the balancing method.
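
The cost-sensitive LMS idea can be illustrated with a minimal sketch. It assumes a linear model trained with the standard Widrow-Hoff update, where each sample's error is scaled by a per-class cost. The inverse-class-frequency costs, the toy one-dimensional data, and the function names are illustrative assumptions; the abstract does not state the authors' actual rules of thumb.

```python
from collections import Counter


def inverse_frequency_costs(labels):
    """Hypothetical rule of thumb: cost = majority count / class count."""
    counts = Counter(labels)
    largest = max(counts.values())
    return {label: largest / count for label, count in counts.items()}


def train_cost_sensitive_lms(samples, labels, costs, lr=0.05, epochs=200):
    """Widrow-Hoff (LMS) updates with the error scaled by a per-class cost."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            output = sum(w * xi for w, xi in zip(weights, x)) + bias
            step = lr * costs[y] * (y - output)  # costlier classes move the model more
            weights = [w + step * xi for w, xi in zip(weights, x)]
            bias += step
    return weights, bias


def predict(weights, bias, x):
    """Threshold the linear output at zero: +1 is the rare class, -1 the majority."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias >= 0.0 else -1


# Toy imbalanced data: eight majority samples (-1) and two rare samples (+1).
samples = [[0.0], [0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [1.5], [1.7]]
labels = [-1] * 8 + [1] * 2
costs = inverse_frequency_costs(labels)
weights, bias = train_cost_sensitive_lms(samples, labels, costs)
```

With these costs, the two rare samples together carry the same total weight as the eight majority samples, so the decision boundary is not pulled toward the rare class.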

Keywords: multilayer neural networks, k-nearest neighbor, support vector machine, imbalanced medical data, least mean square algorithm, diabetes

Procedia PDF Downloads 528
3530 Unsupervised Classification of DNA Barcodes Species Using Multi-Library Wavelet Networks

Authors: Abdesselem Dakhli, Wajdi Bellil, Chokri Ben Amar

Abstract:

A DNA Barcode is a short mitochondrial DNA fragment made up of three subunits: a phosphate group, a sugar, and nucleic bases (A, T, C, and G). DNA Barcodes provide good sources of the information needed to classify living species. Such intuition has been confirmed by many experimental results. Species classification with DNA Barcode sequences has been studied by several researchers. The classification problem assigns unknown species to known ones by analyzing their Barcode. This task has to be supported with reliable methods and algorithms. To analyze species regions or entire genomes, it becomes necessary to use sequence similarity methods. A large set of sequences can be compared simultaneously using Multiple Sequence Alignment, which is known to be NP-complete. To make this type of analysis feasible, heuristics such as progressive alignment have been developed. Another tool for similarity search against a database of sequences is BLAST, which outputs shorter regions of high similarity between a query sequence and matched sequences in the database. However, all these methods are still computationally very expensive and require significant computational infrastructure. Our goal is to build predictive models that are highly accurate and interpretable. This method makes it possible to avoid the complex problem of form and structure in different classes of organisms. The models are evaluated on empirical data, and their classification performances are compared with other methods. Our system consists of three phases. The first is called transformation and is composed of three steps: Electron-Ion Interaction Pseudopotential (EIIP) for the codification of DNA Barcodes, Fourier Transform, and Power Spectrum Signal Processing. The second is called approximation, which is empowered by the use of Multi Library Wavelet Neural Networks (MLWNN). The third is called the classification of DNA Barcodes, which is realized by applying the algorithm of hierarchical classification.
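
The transformation phase described above can be sketched as follows. This is a minimal illustration, assuming the commonly tabulated EIIP values for the four nucleotides and a plain discrete Fourier transform; the function names and the example sequence are hypothetical, and the authors' exact signal-processing settings are not given in the abstract.

```python
import cmath

# Commonly tabulated EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(sequence):
    """Map each nucleotide to its EIIP value, giving a numeric signal."""
    return [EIIP[base] for base in sequence.upper()]


def power_spectrum(signal):
    """Squared magnitude of each discrete Fourier coefficient of the signal."""
    n = len(signal)
    spectrum = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


# Hypothetical barcode fragment, used only to exercise the two steps.
signal = eiip_encode("ATGCATGCAT")
spectrum = power_spectrum(signal)
```

The direct O(n^2) transform keeps the sketch dependency-free; a fast Fourier transform from a numerical library would be the usual choice for real barcode lengths.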

Keywords: DNA barcode, electron-ion interaction pseudopotential, Multi Library Wavelet Neural Networks (MLWNN)

Procedia PDF Downloads 315
3529 Enhanced Arabic Semantic Information Retrieval System Based on Arabic Text Classification

Authors: A. Elsehemy, M. Abdeen, T. Nazmy

Abstract:

Since the appearance of the Semantic Web, many semantic search techniques and models have been proposed to exploit the information in ontologies to enhance traditional keyword-based search. Many advances have been made in languages such as English, German, French and Spanish. However, other languages such as Arabic are not fully supported yet. In this paper, we present a framework for ontology-based information retrieval for the Arabic language. Our system consists of four main modules, namely a query parser, an indexer, a search module and a ranking module. Our approach includes building a semantic index by linking ontology concepts to documents, including an annotation weight for each link, to be used in ranking the results. We also augmented the framework with an automatic document categorizer, which enhances the overall document ranking. We have built three Arabic domain ontologies, Sports, Economic and Politics, as examples for the Arabic language. We built a knowledge base that consists of 79 classes and more than 1456 instances. The system is evaluated using the precision and recall metrics. We have performed many retrieval operations on a sample of 40,316 documents with a size of 320 MB of pure text. The results show that semantic search enhanced with text classification gives better performance than the system without classification.

Keywords: Arabic text classification, ontology based retrieval, Arabic semantic web, information retrieval, Arabic ontology

Procedia PDF Downloads 522
3528 Hybrid Transformer and Neural Network Configuration for Protein Classification Using Amino Acids

Authors: Nathan Labiosa, Aryan Kohli

Abstract:

This study introduces a hybrid machine learning model for classifying proteins, developed to address the complexities of protein sequence and structural analysis. Utilizing an architecture that combines a lightweight transformer with a concurrent neural network, the hybrid model leverages both sequential and intrinsic physical properties of proteins. Trained on a comprehensive dataset from the Research Collaboratory for Structural Bioinformatics Protein Data Bank, the model demonstrates a classification accuracy of 95%, outperforming existing methods by at least 15%. The high accuracy achieved demonstrates the potential of this approach to innovate protein classification, facilitating advancements in drug discovery and the development of personalized medicine. By enabling precise protein function prediction, the hybrid model allows for specialized strategies in therapeutic targeting and the exploration of protein dynamics in biological systems. Future work will focus on enhancing the model’s generalizability across diverse datasets and exploring the integration of more machine learning techniques to refine predictive capabilities further. The implications of this research offer potential breakthroughs in biomedical research and the broader field of protein engineering.

Keywords: amino acids, deep learning, enzymes, neural networks, protein classification, proteins, transformers

Procedia PDF Downloads 11
3527 Evaluation of Fuel Properties of Six Tropical Hardwood Timber Species for Briquettes

Authors: Stephen J. Mitchual, Kwasi Frimpong-Mensah, Nicholas A. Darkwa

Abstract:

The fuel potential of six tropical hardwood species, namely Triplochiton scleroxylon, Ceiba pentandra, Aningeria robusta, Terminalia superba, Celtis mildbreadii and Piptadenia africana, was studied. Properties studied include the species density, gross calorific value, volatile matter, ash, organic carbon, N, H, S, Cu, Pb, As and Cd content. Fuel properties were determined using standard laboratory methods. The results indicate that the Gross Calorific Value (GCV) of the species ranged from 20.16 to 22.22 MJ/kg and varied only slightly among species. Additionally, the GCV of these biomass materials was higher than that of other biomass materials such as wheat straw, rice straw, maize straw and sugar cane. The ash and volatile matter content varied from 0.6075 to 5.0407% and from 75.23% to 83.70%, respectively. The overall rating of the properties of the six biomass materials suggests that Piptadenia africana has the best fuel properties for use as briquettes and Aningeria robusta the worst. This study therefore suggests that a holistic assessment of a biomass material needs to be done before selecting it for fuel purposes.

Keywords: ash content, briquette, calorific value, elemental composition, species, volatile matter

Procedia PDF Downloads 414
3526 Stripping of Flavour-Active Compounds from Aqueous Food Streams: Effect of Liquid Matrix on Vapour-Liquid Equilibrium in a Beer-Like Solution

Authors: Ali Ammari, Karin Schroen

Abstract:

In brewing industries, stripping is a downstream process to separate volatiles from beer. Due to physicochemical similarities between flavour components, the selectivity of this method is not favourable. Besides, the presence of non-volatile compounds such as proteins and carbohydrates may affect the separation of flavours due to their retaining properties. In this work, using a stripping column with structured packing coupled with gas chromatography, the overall mass transfer coefficients along with their corresponding equilibrium data were investigated for a model solution consisting of water, ethanol, ethyl acetate and isoamyl acetate. Static headspace analysis was also employed to derive equilibrium data for flavours in the presence of beer dry matter. As expected, ethanol and dry matter showed retention properties; however, the effect of viscosity on the mass transfer coefficient was discarded because the viscosity of the solution decreased during stripping. The effects of ethanol and beer dry matter were mapped for use in designing stripping columns.

Keywords: flavour, headspace, Henry’s coefficient, mass transfer coefficient, stripping

Procedia PDF Downloads 191
3525 Estimating Tree Height and Forest Classification from Multi Temporal Risat-1 HH and HV Polarized Satellite Aperture Radar Interferometric Phase Data

Authors: Saurav Kumar Suman, P. Karthigayani

Abstract:

In this paper, tree height is estimated and forest types are classified from multi-temporal RISAT-1 Horizontal-Horizontal (HH) and Horizontal-Vertical (HV) Polarised Satellite Aperture Radar (SAR) data. The novelty of the proposed project is the combined use of the Back-scattering Coefficients (Sigma Naught) and the Coherence. It uses the Water Cloud Model (WCM). The approach uses three main steps. (a) Extraction of the different forest parameter data from the Product.xml and BAND-META files and from the Grid-xxx.txt file that come with the HH and HV polarized data from ISRO (Indian Space Research Organisation). These files contain the parameters required for height estimation. (b) Calculation of the Vegetation and Ground Backscattering, Coherence and other Forest Parameters. (c) Classification of Forest Types using the ENVI 5.0 Tool and ROI (Region of Interest) calculation.

Keywords: RISAT-1, classification, forest, SAR data

Procedia PDF Downloads 402
3524 Study of Rehydration Process of Dried Squash (Cucurbita pepo) at Different Temperatures and Dry Matter-Water Ratios

Authors: Sima Cheraghi Dehdezi, Nasser Hamdami

Abstract:

Air-drying is the most widely employed method for preserving fruits and vegetables. Most dried products must be rehydrated by immersion in water prior to use, so studying rehydration kinetics in order to optimize rehydration is of great importance. Rehydration typically consists of three simultaneous processes: the imbibition of water into the dried material, the swelling of the rehydrated product, and the leaching of soluble solids into the rehydration medium. In this research, squash (Cucurbita pepo) fruits were cut into slices 0.4 cm thick and 4 cm in diameter. The squash slices were then blanched in a steam chamber for 4 min. After cooling to room temperature, the slices were dehydrated in a hot air dryer, at an air flow of 1.5 m/s and an air temperature of 60°C, to a moisture content of 0.1065 kg H2O per kg d.m. Dehydrated samples were kept in polyethylene bags and stored at 4°C. Squash slices of specified weight were rehydrated by immersion in distilled water at different temperatures (25, 50, and 75°C) and various dry matter-water ratios (1:25, 1:50, and 1:100), with agitation at 100 rpm. At specified time intervals, up to 300 min, the squash samples were removed from the water, and the weight, moisture content and rehydration indices of the samples were determined. The texture characteristics were examined over a 180 min period. The results showed that rehydration time and temperature had significant effects on moisture content, water absorption capacity (WAC), dry matter holding capacity (DHC), rehydration ability (RA), maximum force and stress in dried squash slices. Dry matter-water ratio had a significant effect (p < 0.01) on all squash slice properties except DHC. Moisture content, WAC and RA of squash slices increased, whereas DHC and texture firmness (maximum force and stress) decreased with rehydration time. The maximum moisture content, WAC and RA and the minimum DHC, force and stress were observed in squash slices rehydrated in 75°C water. The lowest moisture content, WAC and RA and the highest DHC, force and stress were observed in squash slices immersed in water at a 1:100 dry matter-water ratio. In general, for all rehydration conditions, the highest water absorption rate occurred during the first minutes of the process, after which the rate decreased. The highest rehydration rate and amount of water absorption occurred at 75°C.

Keywords: dry matter-water ratio, squash, maximum force, rehydration ability

Procedia PDF Downloads 309
3523 Investigation of Influence of Maize Stover Components and Urea Treatment on Dry Matter Digestibility and Fermentation Kinetics Using in vitro Gas Techniques

Authors: Anon Paserakung, Chaloemphon Muangyen, Suban Foiklang, Yanin Opatpatanakit

Abstract:

Improving the nutritive value and digestibility of maize stover is an alternative way to increase its utilization in ruminants and reduce air pollution from open burning of maize stover in northern Thailand. In the present study, a 2x3 factorial arrangement in a completely randomized design was used to investigate the effects of maize stover component (whole and upper stover; cut above the 5th node) and urea treatment at levels of 0, 3, and 6% DM on dry matter digestibility and fermentation kinetics of maize stover using in vitro gas production. After 21 days of urea treatment, the results showed no interaction between maize stover component and urea treatment on 48h in vitro dry matter digestibility (IVDMD). IVDMD was unaffected by maize stover component (P > 0.05); average IVDMD was 55%. However, whole maize stover gave higher cumulative gas and gas kinetic parameters than upper stover (P<0.05). Treating maize stover by ensiling with urea resulted in a significant linear increase in IVDMD (P<0.05). IVDMD increased from 42.6% to 53.9% when the urea concentration increased from 0 to 3%, and the maximum IVDMD (65.1%) was observed when maize stover was ensiled with 6% urea. Treating maize stover with urea at levels of 0, 3, and 6% linearly increased cumulative gas production at 96h (31.1 vs 50.5 and 59.1 ml, respectively) and all gas kinetic parameters except the gas production from the immediately soluble fraction (P<0.50). The results indicate that treating maize stover with 6% urea enhances in vitro dry matter digestibility and fermentation kinetics. This study provides a practical approach to increasing the utilization of maize stover in feeding ruminant animals.

Keywords: maize stover, urea treatment, ruminant feed, gas production

Procedia PDF Downloads 219
3522 Ensemble-Based SVM Classification Approach for miRNA Prediction

Authors: Sondos M. Hammad, Sherin M. ElGokhy, Mahmoud M. Fahmy, Elsayed A. Sallam

Abstract:

In this paper, an ensemble-based Support Vector Machine (SVM) classification approach is proposed for miRNA prediction. Three problems commonly associated with previous approaches are alleviated. These problems arise from imposing assumptions on the secondary structure of pre-miRNA, from the imbalance between the numbers of laboratory-checked miRNAs and pseudo-hairpins, and from using a training data set that does not consider all the varieties of samples in different species. We aggregate the predicted outputs of three well-known SVM classifiers, namely Triplet-SVM, Virgo and Mirident, weighted by their variant features without any structural assumptions. An additional SVM layer is used in aggregating the final output. The proposed approach is trained and then tested with balanced data sets. The proposed approach outperforms the three base classifiers, achieving an F-score of 88.88%, accuracy of 92.73%, precision of 90.64%, specificity of 96.64%, sensitivity of 87.2%, and an area under the ROC curve of 0.91.
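
For readers less familiar with the reported metrics, the sketch below shows how F-score, accuracy, precision, specificity and sensitivity follow from the four confusion-matrix counts of a binary classifier. The counts in the example are made up for illustration and are not taken from the paper.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall or true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_score = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "f_score": f_score,
    }


# Hypothetical counts: 80 true positives, 20 false positives,
# 90 true negatives, 10 false negatives.
metrics = binary_metrics(tp=80, fp=20, tn=90, fn=10)
```

Sensitivity and specificity are reported separately because, on imbalanced miRNA data, accuracy alone can hide poor performance on the minority class.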

Keywords: MiRNAs, SVM classification, ensemble algorithm, assumption problem, imbalance data

Procedia PDF Downloads 344
3521 Use of Gaussian-Euclidean Hybrid Function Based Artificial Immune System for Breast Cancer Diagnosis

Authors: Cuneyt Yucelbas, Seral Ozsen, Sule Yucelbas, Gulay Tezel

Abstract:

Because only a small number of complex systems in artificial immune systems (AIS) handle nonlinear problems, nonlinear AIS approaches, among the well-known solution techniques, need to be developed. The Gaussian function is commonly used for similarity estimation in classification and pattern recognition problems. In this study, diagnosis of breast cancer, the second most widespread cancer in women, was performed with three different distance calculation functions, namely Euclidean, Gaussian and a Gaussian-Euclidean hybrid function, in the clonal selection model of classical AIS on the Wisconsin Breast Cancer Dataset (WBCD), which was taken from the University of California, Irvine Machine-Learning Repository. We used 3-fold cross validation to train and test on the dataset. According to the results, the maximum test classification accuracy was 97.35%, obtained with the Gaussian-Euclidean hybrid function for fold 3. The mean test classification accuracies were 94.78%, 94.45% and 95.31% with the Euclidean, Gaussian and Gaussian-Euclidean functions, respectively. With these results, the Gaussian-Euclidean hybrid function seems to be a promising distance calculation method, and it may be considered as an alternative for hard nonlinear classification problems.

Keywords: artificial immune system, breast cancer diagnosis, Euclidean function, Gaussian function

Procedia PDF Downloads 429
3520 Nutritional Evaluation of Different Quercus Species in Temperate Regions of Himachal Pradesh

Authors: Ankush Verma, Rohit Bishist

Abstract:

The present investigation was carried out at different locations in the Shimla and Kinnaur districts, and nutrient analysis was done in the laboratory of the Department of Silviculture and Agroforestry, Dr. Y.S. Parmar University of Horticulture and Forestry, Nauni, Distt. Solan, Himachal Pradesh during 2019-2020, with the objectives of studying the seasonal variation in the nutritive value of different Quercus species and the farmers’ preference rating of fodder tree species. At each location, leaf samples were collected at 3-month intervals from each Quercus species. The findings revealed that the nutritional traits of the leaves of different Quercus species varied among seasons throughout the year. The dry matter (61.12 to 64.99%), ether extract (4.07 to 4.42%), crude fibre (34.38 to 37.85%), neutral detergent fibre (57.70 to 61.54%), acid detergent fibre (44.64 to 48.51%), total ash (3.57 to 3.91%), acid insoluble ash (44.64 to 48.51%) and calcium (1.31 to 1.53%) increased with maturity in the leaves of different Quercus species, while crude protein (9.10 to 10.61%), nitrogen free extract (44.73 to 47.41%), organic matter (96.09 to 96.43%), and phosphorus (0.16 to 0.31%) decreased with advancing maturity. Maximum mean values for dry matter (65.05%), ether extract (4.45%), crude fibre (40.82%), neutral detergent fibre (61.48%), acid detergent fibre (48.44%), and organic matter (96.67%) among the Quercus species were recorded in Quercus ilex, while maximum mean values for crude protein (10.54%), nitrogen free extract (50.53%), total ash (4.05%), acid insoluble ash (0.59%), calcium (1.61%) and phosphorus (0.40%) were recorded in Quercus leucotrichophora.

Keywords: nutritional evaluation, fodder species, crude protein, carbohydrates

Procedia PDF Downloads 80
3519 Incorporating Information Gain in Regular Expressions Based Classifiers

Authors: Rosa L. Figueroa, Christopher A. Flores, Qing Zeng-Treitler

Abstract:

A regular expression consists of a sequence of characters that describes a text pattern. Usually, in clinical research, regular expressions are manually created by programmers together with domain experts. Lately, there have been several efforts to investigate how to generate them automatically. This article presents a text classification algorithm based on regexes. The algorithm, named REX, was designed and implemented as a simplified method for creating regexes to classify Spanish text automatically. In order to classify ambiguous cases, such as when multiple labels are assigned to a testing example, REX includes an information gain method. Two sets of data were used to evaluate the algorithm’s effectiveness in clinical text classification tasks. The results indicate that the regular expression based classifier proposed in this work performs statistically better in terms of accuracy and F-measure than Support Vector Machine and Naïve Bayes for both datasets.
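
One way information gain can resolve an ambiguous case is sketched below. This is not the REX algorithm itself, whose details the abstract does not give; it is a minimal illustration in which each hand-written regex carries a label, its information gain is computed from whether it matches each training text, and conflicts between matching regexes are settled in favour of the one with the highest gain. The rules, training phrases and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(pattern, texts, labels):
    """Entropy reduction from splitting the texts on 'pattern matches or not'."""
    matched = [y for t, y in zip(texts, labels) if re.search(pattern, t)]
    unmatched = [y for t, y in zip(texts, labels) if not re.search(pattern, t)]
    total = len(labels)
    remainder = (len(matched) / total) * entropy(matched)
    remainder += (len(unmatched) / total) * entropy(unmatched)
    return entropy(labels) - remainder


def classify(text, rules, gains, default=None):
    """Return the label of the matching rule with the highest information gain."""
    hits = [(gains[pattern], label) for pattern, label in rules if re.search(pattern, text)]
    if not hits:
        return default
    return max(hits, key=lambda hit: hit[0])[1]


# Hypothetical Spanish training phrases and hand-written rules.
texts = ["paciente fuma a diario", "fuma ocasionalmente", "no fuma",
         "niega tabaco", "no fuma ni bebe", "fumador activo"]
labels = ["smoker", "smoker", "non-smoker", "non-smoker", "non-smoker", "smoker"]
rules = [(r"\bfuma", "smoker"), (r"\bno fuma\b", "non-smoker")]
gains = {pattern: information_gain(pattern, texts, labels) for pattern, _ in rules}

label = classify("el paciente no fuma", rules, gains)
```

Both rules match the final phrase, so the labels conflict; the negated rule separates the training classes more cleanly, has the higher gain, and therefore decides the label.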

Keywords: information gain, regular expressions, smith-waterman algorithm, text classification

Procedia PDF Downloads 313
3518 One-Class Classification Approach Using Fukunaga-Koontz Transform and Selective Multiple Kernel Learning

Authors: Abdullah Bal

Abstract:

This paper presents a one-class classification (OCC) technique based on the Fukunaga-Koontz Transform (FKT) for binary classification problems. The FKT was originally a powerful tool for feature selection and ordering in two-class problems. To utilize the standard FKT for the data domain description problem (i.e., one-class classification), a set of non-class samples, lying outside the boundary that describes the positive class (target class) and is formed with limited training data, has been constructed synthetically. A tunnel-like decision boundary around the upper and lower borders of the target class samples has been designed using statistical properties of the feature vectors belonging to the training data. To capture higher-order statistics of the data and increase discrimination ability, the proposed method, termed one-class FKT (OC-FKT), has been extended to a nonlinear version via kernel machines, referred to as OC-KFKT for short. Multiple kernel learning (MKL) is a family of machine learning methods that tries to find an optimal combination of a set of sub-kernels to achieve a better result. However, the discriminative ability of some of the base kernels may be low, and an OC-KFKT designed with such kernels leads to unsatisfactory classification performance. To address this problem, the quality of the sub-kernels should be evaluated, and the weak kernels must be discarded before the final decision-making process. MKL/OC-FKT and selective MKL/OC-FKT frameworks, inspired by ensemble learning (EL), have been designed to weight and then select the sub-classifiers using the discriminability and diversity measured by eigenvalue ratios. The eigenvalue ratios have been assessed based on their regions in the FKT subspaces. Comparative experiments against state-of-the-art algorithms, performed on various low- and high-dimensional data, confirm the effectiveness of our techniques, especially under small sample size (SSS) conditions.

Keywords: ensemble methods, fukunaga-koontz transform, kernel-based methods, multiple kernel learning, one-class classification

Procedia PDF Downloads 14
3517 Online Handwritten Character Recognition for South Indian Scripts Using Support Vector Machines

Authors: Steffy Maria Joseph, Abdu Rahiman V, Abdul Hameed K. M.

Abstract:

Online handwritten character recognition is a challenging field in Artificial Intelligence. The classification success rate of current techniques decreases when the dataset involves similarity and complexity in stroke styles, number of strokes and variations in stroke characteristics. Malayalam is a complex South Indian language spoken by about 35 million people, especially in Kerala and the Lakshadweep islands. In this paper, we consider significant feature extraction for the similar stroke styles of Malayalam. The extracted feature set is also suitable for the recognition of other handwritten South Indian languages such as Tamil, Telugu and Kannada. A classification scheme based on support vector machines (SVM) is proposed to improve the accuracy of classification and recognition of online Malayalam handwritten characters. SVM classifiers are well suited to real-world applications. The contribution of various features to recognition accuracy is analysed. Performance for different SVM kernels is also studied. A graphical user interface has been developed for reading and displaying the characters. Different writing styles are taken for each of the 44 characters. Various features are extracted and used for classification after preprocessing of the input data samples. The highest recognition accuracy of 97% is obtained experimentally with the best feature combination and a polynomial kernel in SVM.

Keywords: SVM, MATLAB, Malayalam, South Indian scripts, online handwritten character recognition

Procedia PDF Downloads 571
3516 A t-SNE and UMAP Based Neural Network Image Classification Algorithm

Authors: Shelby Simpson, William Stanley, Namir Naba, Xiaodi Wang

Abstract:

Both t-SNE and UMAP are state-of-the-art tools that predominantly preserve local structure, that is, they group neighboring data points together, which provides a very informative visualization of heterogeneity in the data. In this research, we develop a t-SNE and UMAP based neural network image classification algorithm that embeds the original dataset into a corresponding low-dimensional dataset as a preprocessing step, then uses this embedded dataset as input to our specially designed neural network classifier for image classification. We use the Fashion MNIST data set, a labeled data set of images of clothing objects, in our experiments. t-SNE and UMAP are used for dimensionality reduction of the data set and thus produce low-dimensional embeddings. Furthermore, we feed the embeddings from t-SNE and UMAP into two neural networks. The accuracy of the models from the two neural networks is then compared to that of a dense neural network that does not use an embedding as input, to show which model classifies the images of clothing objects more accurately.

Keywords: t-SNE, UMAP, fashion MNIST, neural networks

Procedia PDF Downloads 193
3515 Geotechnical Properties and Compressibility Behavior of Organic Dredged Soils

Authors: Inci Develioglu, Hasan Firat Pulat

Abstract:

Sustainable development is one of the most important topics in today's world, and it is also an important research topic for geoenvironmental engineering. Dredging is performed to expand river and port channels, for flood control, and to provide access to harbors. Every year large amounts of sediment are dredged for these purposes. Dredged marine soils can be reused as filling materials, road and foundation embankments, construction materials and wildlife habitat developments. In this study, the geotechnical engineering properties and compressibility behavior of dredged soil obtained from Izmir Bay were investigated. Samples with four different organic matter contents were obtained, and particle size distribution, consistency limit, pH and specific gravity tests were performed. Consolidation tests were conducted to examine the effects of organic matter content (OMC) on the compressibility behavior of dredged soil. This study has shown that the OMC has an important effect on the engineering properties of dredged soils. The liquid and plastic limits increased with increasing OMC. The lowest specific gravity belonged to the sample with the maximum OMC. The specific gravity values ranged between 2.76 and 2.52. The maximum void ratio difference belongs to the sample with the highest OMC (De11% = 0.38). As the organic matter content of the samples increases, the change in the void ratio also increases. The compression index increases with increasing OMC.

Keywords: compressibility, consolidation, geotechnical properties, organic matter content, dredged soil

Procedia PDF Downloads 251
3514 Autonomous Vehicle Detection and Classification in High Resolution Satellite Imagery

Authors: Ali J. Ghandour, Houssam A. Krayem, Abedelkarim A. Jezzini

Abstract:

High-resolution satellite images and remote sensing can provide global information quickly compared to traditional methods of data collection. At such high resolution, a road is no longer a thin line, and objects such as cars and trees are easily identifiable. Automatic vehicle enumeration can be considered one of the most important applications in traffic management. In this paper, an autonomous vehicle detection and classification approach for highway environments is proposed. This approach consists mainly of three stages: (i) First, a set of preprocessing operations is applied, including soil, vegetation, and water suppression. (ii) Then, road network detection and delineation is implemented using a built-up area index, followed by several morphological operations. This step plays an important role in increasing the overall detection accuracy, since vehicle candidates are only those objects contained within the road networks. (iii) Multi-level Otsu segmentation is implemented in the last stage, resulting in vehicle detection and classification, where detected vehicles are classified into cars and trucks. An accuracy assessment is conducted over different study areas to show the efficiency of the proposed method, especially in highway environments.
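As a rough illustration of the multi-level Otsu step in stage (iii), the sketch below finds two thresholds that split 8-bit intensities into three classes by maximizing the between-class variance. The function names are hypothetical, and how the three intensity classes would map to road background, cars, and trucks is not stated in the abstract, so that mapping is left open here.

```python
from itertools import accumulate


def two_threshold_otsu(pixels, levels=256):
    """Return (t1, t2) maximizing between-class variance for three classes.

    Class 0: value <= t1, class 1: t1 < value <= t2, class 2: value > t2.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    # Prefix sums of counts and of count * intensity give O(1) class statistics.
    cum_n = [0] + list(accumulate(hist))
    cum_s = [0] + list(accumulate(i * h for i, h in enumerate(hist)))

    def weighted_sq_mean(lo, hi):
        # n * mean^2 for intensities in [lo, hi); zero for an empty class.
        n = cum_n[hi] - cum_n[lo]
        s = cum_s[hi] - cum_s[lo]
        return (s * s) / n if n else 0.0

    # Maximizing sum(n_k * mean_k^2) is equivalent to maximizing the
    # between-class variance, because the global mean is constant.
    best, best_t = -1.0, (0, 0)
    for t1 in range(levels - 2):
        for t2 in range(t1 + 1, levels - 1):
            score = (weighted_sq_mean(0, t1 + 1)
                     + weighted_sq_mean(t1 + 1, t2 + 1)
                     + weighted_sq_mean(t2 + 1, levels))
            if score > best:
                best, best_t = score, (t1, t2)
    return best_t


def classify(pixels, t1, t2):
    """Label each pixel 0, 1, or 2 according to the two thresholds."""
    return [0 if v <= t1 else 1 if v <= t2 else 2 for v in pixels]


# Toy data: three well-separated intensity groups (illustrative values only).
pixels = [10] * 50 + [100] * 30 + [200] * 20
t1, t2 = two_threshold_otsu(pixels)
labels = classify(pixels, t1, t2)
```

The exhaustive search over threshold pairs is affordable for 256 grey levels; a real pipeline would apply it only to pixels inside the detected road mask.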

Keywords: remote sensing, object identification, vehicle and road extraction, vehicle and road features-based classification

Procedia PDF Downloads 228
3513 Dynamic Distribution Calibration for Improved Few-Shot Image Classification

Authors: Majid Habib Khan, Jinwei Zhao, Xinhong Hei, Liu Jiedong, Rana Shahzad Noor, Muhammad Imran

Abstract:

Deep learning is increasingly employed in image classification, yet the scarcity and high cost of labeled data for training remain a challenge. Limited samples often lead to overfitting due to biased sample distribution. This paper introduces a dynamic distribution calibration method for few-shot learning. Initially, base and new class samples undergo normalization to mitigate disparate feature magnitudes. A pre-trained model then extracts feature vectors from both classes. The method dynamically selects distribution characteristics from base classes (both adjacent and remote) in the embedding space, using a threshold value approach for new class samples. Given the propensity of similar classes to share feature distributions like mean and variance, this research assumes a Gaussian distribution for feature vectors. Subsequently, distributional features of new class samples are calibrated using a corrected hyperparameter, derived from the distribution features of both adjacent and distant base classes. This calibration augments the new class sample set. The technique demonstrates significant improvements, with up to 4% accuracy gains in few-shot classification challenges, as evidenced by tests on miniImagenet and CUB datasets.
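The calibration idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' exact rule: it selects only base classes whose mean lies within a distance threshold of the new sample (the abstract also uses distant base classes), treats features as independent Gaussians, and uses a hypothetical additive constant `alpha` in place of the "corrected hyperparameter". All names are invented for the example.

```python
import math
import random


def calibrate(sample, base_stats, threshold, alpha=0.2):
    """Estimate a calibrated (mean, variance) for one new-class feature vector.

    base_stats maps a base-class name to (mean_vector, variance_vector).
    Base classes whose mean is within `threshold` (Euclidean distance) of the
    sample are selected; if none qualifies, the single nearest class is used.
    """
    dists = {name: math.dist(sample, mean)
             for name, (mean, _) in base_stats.items()}
    chosen = [name for name, d in dists.items() if d <= threshold]
    if not chosen:
        chosen = [min(dists, key=dists.get)]
    k = len(chosen)
    dim = len(sample)
    # Calibrated mean: average of the sample and the selected base means.
    mean = [(sample[j] + sum(base_stats[n][0][j] for n in chosen)) / (k + 1)
            for j in range(dim)]
    # Calibrated variance: average base variance plus a spread constant.
    var = [sum(base_stats[n][1][j] for n in chosen) / k + alpha
           for j in range(dim)]
    return mean, var


def augment(mean, var, count, rng):
    """Draw `count` synthetic feature vectors from the calibrated Gaussian."""
    return [[rng.gauss(m, math.sqrt(v)) for m, v in zip(mean, var)]
            for _ in range(count)]


# Toy statistics for two base classes (illustrative values only).
base_stats = {
    "a": ([0.0, 0.0], [1.0, 1.0]),
    "b": ([10.0, 10.0], [4.0, 4.0]),
}
mean, var = calibrate([1.0, 1.0], base_stats, threshold=3.0)
extra = augment(mean, var, count=5, rng=random.Random(0))
```

The sampled vectors would be added to the few labeled examples of the new class before training a classifier on top of the frozen feature extractor.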

Keywords: deep learning, computer vision, image classification, few-shot learning, threshold

Procedia PDF Downloads 60
3512 Facial Pose Classification Using Hilbert Space Filling Curve and Multidimensional Scaling

Authors: Mekamı Hayet, Bounoua Nacer, Benabderrahmane Sidahmed, Taleb Ahmed

Abstract:

Pose estimation is an important task in computer vision. Though the majority of existing solutions provide good accuracy, they are often overly complex and computationally expensive. From this perspective, we propose the use of dimensionality reduction techniques to address the problem of facial pose estimation. First, a face image is converted into a one-dimensional time series using a Hilbert space-filling curve; the approach then converts this time series into a symbolic representation. Next, a distance matrix is calculated between the symbolic series of an input learning dataset of images, to generate classifiers of frontal vs. profile face pose. The proposed method is evaluated on three public datasets. Experimental results show that our approach achieves a correct classification rate exceeding 97% with the K-NN algorithm.
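The first step, turning an image into a one-dimensional series along a Hilbert curve, can be sketched as below. It uses the standard index-to-coordinate conversion for a Hilbert curve and assumes a square image whose side is a power of two; the function names are hypothetical, and the later symbolic-representation step is not shown because the abstract does not specify it.

```python
def hilbert_d2xy(n, d):
    """Map index d along a Hilbert curve to (x, y) on an n x n grid.

    n must be a power of two and 0 <= d < n * n.
    """
    x = y = 0
    t = d
    s = 1
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                # Flip the current sub-square.
                x, y = s - 1 - x, s - 1 - y
            # Swap axes to rotate the sub-square.
            x, y = y, x
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def image_to_series(image):
    """Read a square image (list of rows) in Hilbert-curve order."""
    n = len(image)
    return [image[y][x] for x, y in (hilbert_d2xy(n, d) for d in range(n * n))]


# Tiny 2 x 2 example: the curve visits (0,0), (0,1), (1,1), (1,0) as (x, y).
series = image_to_series([[1, 2], [3, 4]])
```

Unlike a row-by-row scan, consecutive samples in the resulting series always come from neighboring pixels, which is the locality property that motivates using a space-filling curve here.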

Keywords: machine learning, pattern recognition, facial pose classification, time series

Procedia PDF Downloads 347
3511 COVID-19 Detection from Computed Tomography Images Using UNet Segmentation, Region Extraction, and Classification Pipeline

Authors: Kenan Morani, Esra Kaya Ayana

Abstract:

This study aimed to develop a novel pipeline for COVID-19 detection using a large and rigorously annotated database of computed tomography (CT) images. The pipeline consists of UNet-based segmentation, lung extraction, and a classification part, with optional slice removal techniques following the segmentation part. In this work, batch normalization was added to the original UNet model to produce a lighter model with better localization, which is then used to build a full pipeline for COVID-19 diagnosis. To evaluate the effectiveness of the proposed pipeline, various segmentation methods were compared in terms of their performance and complexity. The proposed segmentation method with batch normalization outperformed traditional methods and other alternatives, resulting in a higher Dice score on a publicly available dataset. Moreover, at the slice level, the proposed pipeline demonstrated high validation accuracy, indicating the efficiency of predicting 2D slices. At the patient level, the full approach exhibited higher validation accuracy and macro F1 score than other alternatives, surpassing the baseline. The classification component of the proposed pipeline uses a convolutional neural network (CNN) to make final diagnosis decisions. The COV19-CT-DB dataset, which contains a large number of CT scans with various types of slices and is rigorously annotated for COVID-19 detection, was used for classification. The proposed pipeline outperformed many other alternatives on the dataset.
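For reference, the two evaluation measures named above can be computed as in this minimal sketch. The function names and toy inputs are illustrative; masks are assumed to be flattened into sequences of 0/1 values.

```python
def dice_score(pred, truth):
    """Dice coefficient 2|A∩B| / (|A| + |B|) for two flat binary masks."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match.
    return 2 * inter / total if total else 1.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy examples (illustrative values only).
d = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
f = macro_f1([0, 0, 1, 1], [0, 1, 1, 1])
```

Macro averaging gives each class equal weight regardless of how many patients fall into it, which is why it is often reported alongside accuracy for COVID vs. non-COVID classification.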

Keywords: classification, computed tomography, lung extraction, macro F1 score, UNet segmentation

Procedia PDF Downloads 126
3510 Exploring Multi-Feature Based Action Recognition Using Multi-Dimensional Dynamic Time Warping

Authors: Guoliang Lu, Changhou Lu, Xueyong Li

Abstract:

In action recognition, previous studies have demonstrated the effectiveness of using multiple features to improve recognition performance. We focus on two practical issues: i) most studies directly concatenate or accumulate multiple features to evaluate the similarity between two actions. This could be too strong, since each kind of feature can differ in dimension, quantity, etc.; ii) in many studies, the employed classification methods lack a flexible and effective mechanism for adding new feature(s) to the classification. In this paper, we explore a unified scheme based on the recently proposed multi-dimensional dynamic time warping (MD-DTW). Experiments demonstrated the scheme's effectiveness in combining multiple features and its flexibility in adding new feature(s) to increase recognition performance. In addition, the explored scheme provides an open architecture for using new, advanced classification methods in the future to enhance action recognition.
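A minimal sketch of dynamic time warping over multi-dimensional frames is given below. It assumes each frame is a tuple of already-scaled feature values and uses the Euclidean distance between frames as the local cost; how the authors weight or normalize individual features is not stated in the abstract, and the function name is hypothetical.

```python
import math


def md_dtw(a, b):
    """DTW distance between two sequences of equal-length feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] is the best cumulative cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat frame of b
                                     cost[i][j - 1],      # repeat frame of a
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# Toy 2-D feature sequences (illustrative values only).
action = [(0, 0), (1, 1), (2, 2)]
slower = [(0, 0), (0, 0), (1, 1), (2, 2)]   # same action, first frame held
shifted = [(0, 1), (1, 2), (2, 3)]          # every frame offset by one unit
same = md_dtw(action, slower)
offset = md_dtw(action, shifted)
```

Adding a new feature in this formulation only lengthens each frame vector, so the alignment code itself does not change.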

Keywords: action recognition, multi features, dynamic time warping, feature combination

Procedia PDF Downloads 434
3509 Impacts of Salinity on CO2 Turnover in Some Gefara Soils of Libya

Authors: Fathi Elyaagubi

Abstract:

Salinization is a major threat to the productivity of agricultural land. The Gefara Plain, located in the northwest of Libya, accounts for about 80% of the total agricultural activity. The high water requirements of the population and of agriculture are depleting the groundwater aquifer, resulting in the intrusion of seawater in the first few kilometers along the coast. Due to increasing salinity in the groundwater used for irrigation, the soils of the Gefara Plain are becoming increasingly saline. This research paper investigated the sensitivity of these soils to increased salinity using CO2 evolution as an integrating measure of soil function. Soil was collected from four sites in the Gefara Plain: Almaya, Janzur, Gargaresh, and Tajura. Soil collected from Tajura had the highest background salinity, and Janzur had the highest organic matter content. All of the soils had relatively low organic matter content, ranging between 0.49% and 1.25%. The cumulative rate of 14CO2 evolution from 14C-labelled Lolium shoots (Lolium perenne L.) added to the soils decreased under water containing NaCl at concentrations of 20, 50, 70, 90, 150, and 200 mM, compared to the control, at every time of incubation at all four sites.

Keywords: soil salinity, gefara plain, organic matter, 14C-labelled lolium shoots

Procedia PDF Downloads 217
3508 Intelligent Transport System: Classification of Traffic Signs Using Deep Neural Networks in Real Time

Authors: Anukriti Kumar, Tanmay Singh, Dinesh Kumar Vishwakarma

Abstract:

Traffic control has been one of the most common and irritating problems since automobiles first hit the roads. Problems like traffic congestion impose a significant time burden around the world, and one significant solution to these problems can be the proper implementation of the Intelligent Transport System (ITS). It involves the integration of various tools like smart sensors, artificial intelligence, positioning technologies, and mobile data services to manage traffic flow, reduce congestion, and enhance the driver's ability to avoid accidents during adverse weather. Road and traffic sign recognition is an emerging field of research in ITS. The traffic sign classification problem needs to be solved, as it is a major step toward building semi-autonomous and autonomous driving systems. This work implements an approach to traffic sign classification by developing a Convolutional Neural Network (CNN) classifier using the GTSRB (German Traffic Sign Recognition Benchmark) dataset. Rather than using hand-crafted features, our model addresses the concerns of an exploding number of parameters and of data augmentation methods. Our model achieved an accuracy of around 97.6%, which is comparable to various state-of-the-art architectures.

Keywords: multiclass classification, convolution neural network, OpenCV

Procedia PDF Downloads 169
3507 A Systematic Literature Review on Security and Privacy Design Patterns

Authors: Ebtehal Aljedaani, Maha Aljohani

Abstract:

Privacy and security patterns are both important for developing software that protects users' data and privacy. Privacy patterns are designed to address common privacy problems, such as unauthorized data collection and disclosure. Security patterns are designed to protect software from attack and ensure reliability and trustworthiness. Using privacy and security patterns, software engineers can implement security and privacy by design principles, which means that security and privacy are considered throughout the software development process. These patterns are available to translate "security & privacy-by-design" into practical advice for software engineering. Previous research on privacy and security patterns has typically focused on one category of patterns at a time. This paper aims to bridge this gap by merging the two categories and identifying their similarities and differences. To do this, the authors conducted a systematic literature review of 25 research papers on privacy and security patterns. The papers were analysed based on the category of the pattern, the classification of the pattern, and the security requirements that the pattern addresses. This paper presents the results of a comprehensive review of privacy and security design patterns. The review is intended to help future IT designers understand the relationship between the two types of patterns and how to use them to design secure and privacy-preserving software. The paper provides a clear classification of privacy and security design patterns, along with examples of each type. The authors found that there is only one widely accepted classification of privacy design patterns, while there are several competing classifications of security design patterns. Three types of security design patterns were found to be the most commonly used.

Keywords: design patterns, security, privacy, classification of patterns, security patterns, privacy patterns

Procedia PDF Downloads 126
3506 Diagnosis and Analysis of Automated Liver and Tumor Segmentation on CT

Authors: R. R. Ramsheeja, R. Sreeraj

Abstract:

A wide range of medical imaging modalities is available today for viewing the internal structures of the human body, such as the liver, brain, and kidney. Computed tomography (CT) is one of the most significant of these modalities. This paper uses CT liver images to study automatic computer-aided techniques for calculating the volume of a liver tumor. A segmentation method for detecting the tumor in the CT scan is proposed. A Gaussian filter is used for denoising the liver image, and an adaptive thresholding algorithm is used for segmentation. A multiple region of interest (ROI) based method may help to characterize the different features, which has a significant impact on classification performance. Owing to the characteristics of liver tumor lesions, inherent difficulties appear in feature selection. For better performance, a novel system is introduced in which multiple-ROI-based feature selection and classification are performed. Obtaining relevant features for the Support Vector Machine (SVM) classifier is important for better generalization performance. The proposed system improves classification performance, which can be attributed to a significant reduction in the number of features used. Diagnosing liver cancer from CT images is inherently very difficult. Early detection of liver tumors is very helpful in saving human lives.

Keywords: computed tomography (CT), multiple region of interest(ROI), feature values, segmentation, SVM classification

Procedia PDF Downloads 507
3505 An Integrated Lightweight Naïve Bayes Based Webpage Classification Service for Smartphone Browsers

Authors: Mayank Gupta, Siba Prasad Samal, Vasu Kakkirala

Abstract:

The internet world and its priorities have changed considerably in the last decade. Browsing on smartphones has increased manifold and is set to grow much more. Users spend considerable time browsing different websites, which gives a great deal of insight into their preferences. Instead of presenting plain information, classifying different aspects of browsing such as Bookmarks, History, and Download Manager into useful categories would improve and enhance the user’s experience. Most classification solutions are server-side, which involves maintaining servers and other heavy resources. They have security constraints and may miss contextual data during classification. On-device classification solves many such problems, but the challenge is to achieve classification accuracy under resource constraints. On-device classification can be much more useful for personalization, reducing dependency on cloud connectivity, and improving privacy and security. This approach provides more relevant results than current standalone solutions because it uses content rendered by the browser, which is customized by the content provider based on the user’s profile. This paper proposes a Naive Bayes based lightweight classification engine targeted at resource-constrained devices. Our solution integrates with the web browser, which in turn triggers the classification algorithm. Whenever a user browses a webpage, the solution extracts DOM tree data from the browser’s rendering engine. This DOM data is dynamic, contextual, and secure, and can’t be replicated. The proposal extracts different features of the webpage and runs them through an algorithm that classifies the page into one of multiple categories. A Naive Bayes based engine is chosen for its inherent advantage of using limited resources compared to other classification algorithms such as Support Vector Machines and Neural Networks. Naive Bayes classification requires a small memory footprint and less computation, which suits the smartphone environment.
The solution can partition the model into multiple chunks, which reduces memory usage compared with loading the complete model. Classification of webpages through the integrated engine is faster, more relevant, and more energy-efficient than other standalone on-device solutions. The classification engine has been tested on Samsung Z3 Tizen hardware. The engine is integrated into the Tizen Browser, which uses the Chromium rendering engine. For this solution, an extensive dataset was sourced from dmoztools.net and cleaned. The cleaned dataset has 227.5K webpages, which are divided into 8 generic categories ('education', 'games', 'health', 'entertainment', 'news', 'shopping', 'sports', 'travel'). Our browser-integrated solution resulted in 15% less memory usage (due to the partition method) and 24% less power consumption in comparison with a standalone solution. The solution used 70% of the dataset for training the data model and the remaining 30% for testing. An average accuracy of ~96.3% is achieved across the above-mentioned 8 categories. The engine can be further extended to suggest dynamic tags and to use the classification for different use cases to enhance the browsing experience.
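To make the classification step concrete, here is a minimal multinomial Naive Bayes sketch with Laplace smoothing. It is only an illustration of the general technique: the class name, the whitespace tokenizer, and the toy training texts are assumptions, and the DOM feature extraction and model partitioning described in the abstract are not reproduced.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace-separated tokens."""

    def __init__(self, smoothing=1.0):
        self.smoothing = smoothing
        self.class_docs = Counter()              # documents seen per class
        self.word_counts = defaultdict(Counter)  # token counts per class
        self.vocab = set()

    def fit(self, docs, labels):
        for text, label in zip(docs, labels):
            tokens = text.lower().split()
            self.class_docs[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_docs.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.smoothing * vocab_size
            # Work in log space to avoid underflow on long pages.
            score = math.log(n_docs / total_docs)
            for tok in text.lower().split():
                if tok in self.vocab:  # ignore tokens never seen in training
                    score += math.log((counts[tok] + self.smoothing) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data (invented for illustration, not from the dmoz dataset).
docs = [
    "football match score goal",
    "league team goal win",
    "flight hotel booking beach",
    "beach resort flight deals",
]
labels = ["sports", "sports", "travel", "travel"]
model = NaiveBayes().fit(docs, labels)
```

Calling `model.predict("cheap flight and hotel")` picks the category whose word counts best explain the page text. The model is just a few count tables, which is consistent with the small memory footprint the abstract emphasizes.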

Keywords: chromium, lightweight engine, mobile computing, Naive Bayes, Tizen, web browser, webpage classification

Procedia PDF Downloads 162
3504 A Ratio-Weighted Decision Tree Algorithm for Imbalance Dataset Classification

Authors: Doyin Afolabi, Phillip Adewole, Oladipupo Sennaike

Abstract:

Most well-known classifiers, including the decision tree algorithm, can make predictions on balanced datasets efficiently. However, the decision tree algorithm tends to be biased toward the majority class on imbalanced datasets because of the skewed distribution of such datasets. To overcome this problem, this study proposes a weighted decision tree algorithm that aims to remove the bias toward the majority class while avoiding the reduction of majority observations in imbalanced dataset classification. The proposed weighted decision tree algorithm was tested on three imbalanced datasets: the cancer, German credit, and banknote datasets. The specificity, sensitivity, and accuracy metrics were used to evaluate the performance of the proposed decision tree algorithm on the datasets. The evaluation results show that, for some of the weights of our proposed decision tree, the specificity, sensitivity, and accuracy metrics gave better results than those of the ID3 decision tree and the decision tree induced with minority entropy on all three datasets.
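The three evaluation metrics can be computed from the confusion counts as in the sketch below. Treating the minority class as the positive class is an assumption made for this example, and the function name and toy labels are illustrative.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity, and accuracy for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == positive and p == positive:
            tp += 1
        elif t == positive:
            fn += 1
        elif p == positive:
            fp += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # true positive rate
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # true negative rate
        "accuracy": (tp + tn) / total if total else 0.0,
    }


# A classifier that always predicts the majority class on a 2:8 split.
y_true = [1] * 2 + [0] * 8
y_pred = [0] * 10
metrics = binary_metrics(y_true, y_pred)
```

In this toy case the accuracy is 0.8 even though no minority example is found (sensitivity 0.0), which is why sensitivity and specificity are reported alongside accuracy for imbalanced data.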

Keywords: data mining, decision tree, classification, imbalance dataset

Procedia PDF Downloads 127