Search results for: classification of filters
1598 An Adaptive Decomposition for the Variability Analysis of Observation Time Series in Geophysics
Authors: Olivier Delage, Thierry Portafaix, Hassan Bencherif, Guillaume Guimbretiere
Abstract:
Most observation data sequences in geophysics can be interpreted as resulting from the interaction of several physical processes at several time and space scales. As a consequence, measurements time series in geophysics have often characteristics of non-linearity and non-stationarity and thereby exhibit strong fluctuations at all time-scales and require a time-frequency representation to analyze their variability. Empirical Mode Decomposition (EMD) is a relatively new technic as part of a more general signal processing method called the Hilbert-Huang transform. This analysis method turns out to be particularly suitable for non-linear and non-stationary signals and consists in decomposing a signal in an auto adaptive way into a sum of oscillating components named IMFs (Intrinsic Mode Functions), and thereby acts as a bank of bandpass filters. The advantages of the EMD technic are to be entirely data driven and to provide the principal variability modes of the dynamics represented by the original time series. However, the main limiting factor is the frequency resolution that may give rise to the mode mixing phenomenon where the spectral contents of some IMFs overlap each other. To overcome this problem, J. Gilles proposed an alternative entitled “Empirical Wavelet Transform” (EWT) which consists in building from the segmentation of the original signal Fourier spectrum, a bank of filters. The method used is based on the idea utilized in the construction of both Littlewood-Paley and Meyer’s wavelets. The heart of the method lies in the segmentation of the Fourier spectrum based on the local maxima detection in order to obtain a set of non-overlapping segments. Because linked to the Fourier spectrum, the frequency resolution provided by EWT is higher than that provided by EMD and therefore allows to overcome the mode-mixing problem. 
On the other hand, although the EWT technique is able to detect the frequencies involved in the fluctuations of the original time series, it does not allow the detected frequencies to be associated with a specific mode of variability, as the EMD technique does. Because EMD is closer to the observation of physical phenomena than EWT, we propose here a new technique called EAWD (Empirical Adaptive Wavelet Decomposition), based on the coupling of the EMD and EWT techniques, which uses the spectral density content of the IMFs to optimize the segmentation of the Fourier spectrum required by EWT. In this study, the EMD and EWT techniques are described, and then the EAWD technique is presented. The results obtained by EMD, EWT and EAWD on time series of total ozone columns recorded at Reunion Island over the 1978-2019 period are compared and discussed. This study was carried out as part of the SOLSTYCE project, dedicated to the characterization and modeling of the underlying dynamics of time series issued from complex systems in atmospheric sciences.
Keywords: adaptive filtering, empirical mode decomposition, empirical wavelet transform, filter banks, mode-mixing, non-linear and non-stationary time series, wavelet
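The spectrum segmentation step described in this abstract (detect local maxima of the Fourier spectrum, then cut it into non-overlapping segments) can be illustrated with a minimal sketch. It is not the authors' EAWD method: the function name `segment_spectrum`, the midpoint rule for placing boundaries, and the toy spectrum are all assumptions made for illustration.

```python
def segment_spectrum(magnitude, n_bands):
    """Split a magnitude spectrum into bands around its strongest local maxima.

    Hypothetical helper: boundaries are placed midway between consecutive
    retained peaks, which is one simple choice among several possible rules.
    """
    # Interior local maxima: larger than the left neighbour, not smaller than the right.
    peaks = [i for i in range(1, len(magnitude) - 1)
             if magnitude[i - 1] < magnitude[i] >= magnitude[i + 1]]
    # Keep the n_bands strongest peaks, then restore frequency order.
    strongest = sorted(sorted(peaks, key=lambda i: magnitude[i], reverse=True)[:n_bands])
    # One boundary between each pair of neighbouring peaks.
    return [(a + b) / 2 for a, b in zip(strongest, strongest[1:])]


# Toy spectrum (made-up values) with peaks at indices 1, 4 and 7.
spectrum = [0.1, 0.9, 0.2, 0.1, 0.6, 0.1, 0.05, 0.8, 0.1]
boundaries = segment_spectrum(spectrum, 3)
print(boundaries)
```

Each pair of adjacent boundaries would then delimit the support of one band-pass filter in the bank.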
Procedia PDF Downloads 137
1597 Waste from Drinking Water Treatment: The Feasibility for Application in Building Materials
Authors: Marco Correa
Abstract:
The shrinking volumes of the surface water sources that supply most municipalities, the rising demand for treated water, and the disposal of effluents from the washing of decanters and filters at water treatment plants together drive a continuous search for environmentally sound solutions to these problems. The effluents generated by the water treatment industry need to be suitably processed before being returned to the environment or re-used. This article presents alternatives for dewatering sludge from water treatment plants (WTP) and for the eventual disposal of the drained sludge. Using a simple design methodology, it presents a full-scale case study of drainage in geotextile tanks, involving five sludge drainage tanks at the WTP of the city of Rio Verde. The aim is to reuse the water drained from the sludge, both at the beginning of the treatment process at the WTP and in less noble services such as watering the gardens of the local town hall. The sludge will be used in the production of building materials.
Keywords: dehydration, effluent discharges, re-use, sludge, WTP sludge
Procedia PDF Downloads 311
1596 Vineyard Soils of Karnataka - Characterization, Classification and Soil Site Suitability Evaluation
Authors: Harsha B. R., K. S. Anil Kumar
Abstract:
Land characterization, classification, and soil suitability evaluation of grapes-growing pedons were assessed at fifteen taluks covering four agro climatic zones of Karnataka. Study on problems and potentials of grapes cultivation in selected agro-climatic zones was carried out along with the plant sample analysis. Twenty soil profiles were excavated as study site based on the dominance of area falling under grapes production and existing spatial variability of soils. The detailed information of profiles and horizon wise soil samples were collected to study the morphological, physical, chemical, and fertility characteristics. Climatic analysis and water retention characteristics of soils of major grapes-growing areas were also done. Based on the characterisation and classification study, it was revealed that soils of Doddaballapur (Bangalore Blue and Wine grapes), Bangalore North (GKVK Farm, Rajankunte, and IIHR Farm), Devanahalli, Magadi, Hoskote, Chikkaballapur (Dilkush and Red globe), Yelaburga, Hagari Bommanahalli, Bagalkot (UHS farm) and Indi fall under the soil order Alfisol. Vijaypur pedon of northern dry zone was keyed out as Vertisols whereas, Jamkhandi and Athani as Inceptisols. Properties of Aridisols were observed in B. Bagewadi (Manikchaman and Thompson Seedless) and Afzalpur. Soil fertility status and its mapping using GIS technique revealed that all the nutrients were found to be in adequate range except nitrogen, potassium, zinc, iron, and boron, which indicated the need for application along with organic matter to improve the SOC status. Varieties differed among themselves in yield and plant nutrient composition depending on their age, climatic, soil, and management requirements. Bangalore North (GKVK farm) and Jamkhandi are having medium soil organic carbon stocks of 6.21 and 6.55 kg m⁻³, respectively. Soils of Bangalore North (Rajankunte) were highly suitable (S1) for grapes cultivation. Under northern Karnataka, Vijayapura, B. 
Bagewadi, Indi, and Afzalpur vineyards were good performers despite the limitations of fertility and free lime content.
Keywords: land characterization, suitability, soil orders, soil organic carbon stock
Procedia PDF Downloads 114
1595 Detection of Internal Mold Infection of Intact Tomatoes by Non-Destructive, Transmittance VIS-NIR Spectroscopy
Authors: K. Petcharaporn
Abstract:
The external characteristics of tomatoes, such as freshness, color and size, are typically used in quality control processes for tomato sorting. However, intact tomatoes with internal mold infection cannot be sorted on the basis of visual assessment and destructive methods alone. In this study, a non-destructive technique was used to predict the internal mold infection of intact tomatoes by using transmittance visible and near infrared (VIS-NIR) spectroscopy. Spectra of 200 samples, comprising 100 normal tomatoes and 100 mold-infected tomatoes, were acquired in the wavelength range of 665-955 nm. These data were used in conjunction with the partial least squares-discriminant analysis (PLS-DA) method to generate a classification model that separates intact tomato samples with internal mold infection from normal ones. For this task, the data were split into two groups: 140 samples were used as a training set and 60 samples as a test set. The spectra of normal and internally mold-infected tomatoes showed different features in the visible wavelength range. Combined spectral pretreatments of standard normal variate transformation (SNV) and smoothing (Savitzky-Golay) gave the optimal calibration model on the training set, with an accuracy of 85.0% (63 out of 71 for the normal samples and 56 out of 69 for the internal mold samples). The classification accuracy of the best model on the test set was 91.7% (29 out of 29 for the normal samples and 26 out of 31 for the internal mold samples). The results of this experiment showed that transmittance VIS-NIR spectroscopy can be used as a non-destructive technique to predict the internal mold infection of intact tomatoes.
Keywords: tomato, mold, quality, prediction, transmittance
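As a rough illustration of two details in this abstract, the sketch below applies a standard normal variate (SNV) transform to a single spectrum and recomputes the reported accuracies from the per-class counts. The helper names `snv` and `accuracy` are hypothetical, the use of the sample standard deviation is an assumption, and the PLS-DA model and Savitzky-Golay smoothing are not reproduced.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it by its own spread."""
    m = mean(spectrum)
    s = stdev(spectrum)  # assumption: sample standard deviation
    return [(x - m) / s for x in spectrum]


def accuracy(correct_per_class, total_per_class):
    """Overall accuracy in percent from per-class correct and total counts."""
    return 100.0 * sum(correct_per_class) / sum(total_per_class)


# Counts quoted in the abstract: (normal, internal mold).
train_acc = accuracy([63, 56], [71, 69])  # 119 correct out of 140
test_acc = accuracy([29, 26], [29, 31])   # 55 correct out of 60

print(round(train_acc, 1), round(test_acc, 1))
```

The per-class counts sum to the stated 140 training and 60 test samples, and the recomputed percentages agree with the 85.0% and 91.7% given in the abstract.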
Procedia PDF Downloads 363
1594 A Supervised Approach for Detection of Singleton Spam Reviews
Authors: Atefeh Heydari, Mohammadali Tavakoli, Naomie Salim
Abstract:
In recent years, online reviews have become the most important source of customers’ opinions. They are increasingly used by individuals and organisations to make purchase and business decisions. Unfortunately, for profit or fame, fraudsters produce deceptive reviews to hoodwink potential customers. Their activities mislead not only potential customers trying to make appropriate purchasing decisions and organisations trying to reshape their business, but also opinion mining techniques, by preventing them from reaching accurate results. Spam reviews can be divided into two main groups, i.e. multiple and singleton spam reviews. Detecting a singleton spam review, that is, the only review written by a user ID, is extremely challenging due to the lack of clues for detection purposes. Singleton spam reviews are very harmful, and various features and proofs used in multiple spam review detection are not applicable in this case. The current research aims to propose a novel supervised technique to detect singleton spam reviews. To achieve this, various features are proposed in this study and are combined with the most appropriate features extracted from the literature and employed in a classifier. In order to compare the performance of different classifiers, SVM and naive Bayes classification algorithms were used for model building. The results revealed that SVM was more accurate than naive Bayes and that the proposed technique is capable of detecting singleton spam reviews effectively.
Keywords: classification algorithms, Naïve Bayes, opinion review spam detection, singleton review spam detection, support vector machine
Procedia PDF Downloads 309
1593 Reconstructability Analysis for Landslide Prediction
Authors: David Percy
Abstract:
Landslides are a geologic phenomenon that affects a large number of inhabited places, and they are constantly monitored and studied to predict future occurrences. Reconstructability analysis (RA) is a methodology for extracting informative models from large volumes of data that works exclusively with discrete data. While RA has been used extensively in medical applications and social science, we are introducing it to the spatial sciences through applications like landslide prediction. Since RA works exclusively with discrete data, such as soil classification or bedrock type, working with continuous data, such as porosity, requires that these data be binned for inclusion in the model. RA constructs models of the data which pick out the most informative elements, independent variables (IVs), from each layer that predict the dependent variable (DV), landslide occurrence. Each layer included in the model retains its classification data as a primary encoding of the data. Unlike other machine learning algorithms that force the data into one-hot encoding schemes, RA works directly with the data as it is encoded, with the exception of continuous data, which must be binned. The usual physical and derived layers are included in the model, and testing our results against other published methodologies, such as neural networks, yields accuracy that is similar but with the advantage of a completely transparent model. The result of an RA session with a data set is a report on every combination of variables and the associated probability of landslide events occurring. In this way, every combination of informative states can be examined.
Keywords: reconstructability analysis, machine learning, landslides, raster analysis
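The binning requirement mentioned in this abstract (continuous layers such as porosity must be discretized before RA can use them) might look like the following sketch. Equal-width binning, the function name `bin_equal_width`, and the porosity values are assumptions; the abstract does not say which binning scheme the author used.

```python
def bin_equal_width(values, n_bins):
    """Map continuous values to integer states 0..n_bins-1 using equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    if width == 0:
        # All values identical: everything falls into a single state.
        return [0] * len(values)
    # The maximum value would land in bin n_bins, so clamp it into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Made-up porosity samples, discretized into three states (low, medium, high).
porosity = [0.05, 0.12, 0.20, 0.33, 0.41, 0.45]
states = bin_equal_width(porosity, 3)
print(states)
```

Categorical layers such as soil class or bedrock type would be passed through unchanged, as the abstract notes.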
Procedia PDF Downloads 66
1592 Detection of Internal Mold Infection of Intact For Tomatoes by Non-Destructive, Transmittance VIS-NIR Spectroscopy
Authors: K. Petcharaporn, N. Prathengjit
Abstract:
The external characteristics of tomatoes, such as freshness, color and size, are typically used in quality control processes for tomato sorting. However, intact tomatoes with internal mold infection cannot be sorted on the basis of visual assessment and destructive methods alone. In this study, a non-destructive technique was used to predict the internal mold infection of intact tomatoes by using transmittance visible and near infrared (VIS-NIR) spectroscopy. Spectra of 200 samples, comprising 100 normal tomatoes and 100 mold-infected tomatoes, were acquired in the wavelength range of 665-955 nm. These data were used in conjunction with the partial least squares-discriminant analysis (PLS-DA) method to generate a classification model that separates intact tomato samples with internal mold infection from normal ones. For this task, the data were split into two groups: 140 samples were used as a training set and 60 samples as a test set. The spectra of normal and internally mold-infected tomatoes showed different features in the visible wavelength range. Combined spectral pretreatments of standard normal variate transformation (SNV) and smoothing (Savitzky-Golay) gave the optimal calibration model on the training set, with an accuracy of 85.0% (63 out of 71 for the normal samples and 56 out of 69 for the internal mold samples). The classification accuracy of the best model on the test set was 91.7% (29 out of 29 for the normal samples and 26 out of 31 for the internal mold samples). The results of this experiment showed that transmittance VIS-NIR spectroscopy can be used as a non-destructive technique to predict the internal mold infection of intact tomatoes.
Keywords: tomato, mold, quality, prediction, transmittance
Procedia PDF Downloads 519
1591 Comparison of the Performance of GaInAsSb and GaSb Cells under Different Temperature Blackbody Radiations
Authors: Liangliang Tang, Chang Xu, Xingying Chen
Abstract:
GaInAsSb cells probably show better performance than GaSb cells in low-temperature thermophotovoltaic systems due to their lower bandgap; however, few experiments have demonstrated this so far. In this paper, numerical simulation is used to evaluate GaInAsSb and GaSb cells with similar structures under different radiation temperatures. We found that GaInAsSb cells with n-type emitters show slightly higher output power densities than GaSb cells with n-type emitters under blackbody radiation below 1,550 K, and that the power density of the latter cells surpasses that of the former above this temperature. Over the temperature range of 1,000~2,000 K, the efficiencies of GaSb cells are about twice those of GaInAsSb cells if perfect filters are used to prevent the emission of the non-absorbed long-wavelength photons. Several parameters that affect the GaInAsSb cell were analyzed, such as doping profiles, thickness of the GaInAsSb epitaxial layer and surface recombination velocity. The non-p junctions, i.e., n-type emitters are better for GaInAsSb cell fabrication, which is similar to that of GaSb cells.
Keywords: thermophotovoltaic cell, GaSb, GaInAsSb, diffused emitters
Procedia PDF Downloads 280
1590 Change Detection of Vegetative Areas Using Land Use Land Cover of Desertification Vulnerable Areas in Nigeria
Authors: T. Garba, Y. Y. Sabo A. Babanyara, K. G. Ilellah, A. K. Mutari
Abstract:
This study used the Normalized Difference Vegetation Index (NDVI) and maps compiled from the classification of Landsat TM and Landsat ETM images of 1986 and 1999, respectively, and NigeriaSat-1 images of 2007 to quantify changes in land use and land cover in selected areas of Nigeria covering 143,609 hectares that are threatened by the encroaching Sahara desert. The results of this investigation revealed a decrease in natural vegetation over the three time slices (1986, 1999 and 2007), which was characterised by an increase in high positive pixel values from 0.04 in 1986 to 0.22 and 0.32 in 1999 and 2007, respectively, and a decrease in natural vegetation from 74,411.60 ha in 1986 to 28,591.93 ha and 21,819.19 ha in 1999 and 2007, respectively. The same results also revealed a periodic trend in which there was a progressive increase in the cultivated area from 60,191.87 ha in 1986 to 104,376.07 ha in 1999 and a terminal decrease to 88,868.31 ha in 2007. These findings point to expansion of vegetated and cultivated areas in the initial period between 1988 and 1996 and reversal of these increases in the terminal period between 1988 and 1996. The study also revealed progressive expansion of built-up areas from 1,681.68 ha in 1986 to 2,661.82 ha in 1999 and to 3,765.35 ha in 2007. These results argue for the urgent need to protect and conserve the depleting natural vegetation by adopting sustainable human resource use practices, i.e. intensive farming, in order to minimize persistent depletion of natural vegetation.
Keywords: changes, classification, desertification, vegetation changes
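For readers unfamiliar with the index used in this abstract, NDVI is conventionally computed per pixel as (NIR - Red) / (NIR + Red). The sketch below shows that formula together with the relative change in natural vegetation implied by the reported areas. The function names and the example reflectances are assumptions; only the hectare figures come from the abstract.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel's reflectances."""
    total = nir + red
    # Guard against a zero denominator (e.g. a no-data pixel).
    return (nir - red) / total if total else 0.0


def percent_change(old, new):
    """Relative change from old to new, in percent."""
    return 100.0 * (new - old) / old


# Natural vegetation areas in hectares, as reported for 1986, 1999 and 2007.
natural_1986, natural_1999, natural_2007 = 74411.60, 28591.93, 21819.19

print(round(percent_change(natural_1986, natural_1999), 1))
print(round(percent_change(natural_1986, natural_2007), 1))
print(ndvi(0.5, 0.1))  # made-up reflectances for a densely vegetated pixel
```

On these figures, natural vegetation in 2007 covered roughly 70% less area than in 1986.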
Procedia PDF Downloads 387
1589 Diagnosis of the Heart Rhythm Disorders by Using Hybrid Classifiers
Authors: Sule Yucelbas, Gulay Tezel, Cuneyt Yucelbas, Seral Ozsen
Abstract:
In this study, we attempted to identify some heart rhythm disorders from electrocardiography (ECG) data taken from the MIT-BIH arrhythmia database by extracting the required features and presenting them to artificial neural network (ANN), artificial immune system (AIS), artificial neural network based on artificial immune system (AIS-ANN) and particle swarm optimization based artificial neural network (PSO-ANN) classifier systems. The main purpose of this study is to evaluate the performance of the hybrid AIS-ANN and PSO-ANN classifiers relative to the ANN and AIS. For this purpose, the normal sinus rhythm (NSR), atrial premature contraction (APC), sinus arrhythmia (SA), ventricular trigeminy (VTI), ventricular tachycardia (VTK) and atrial fibrillation (AF) data for each of the RR intervals were found. These data were then combined into pairs (NSR-APC, NSR-SA, NSR-VTI, NSR-VTK and NSR-AF), the discrete wavelet transform was applied to each of the two groups of data in a pair, and two different data sets with 9 and 27 features were obtained from each of them after data reduction. Afterwards, the data were first shuffled randomly, and then the 4-fold cross validation method was applied to create the training and testing data. The training and testing accuracy rates and training times were compared with each other. As a result, the performances of the hybrid classification systems, AIS-ANN and PSO-ANN, were seen to be close to the performance of the ANN system. The results of the hybrid systems were also much better than those of AIS. However, ANN had a much shorter training time than the other systems. In terms of training times, ANN was followed by PSO-ANN, AIS-ANN and AIS systems, respectively. Also, the features extracted from the data affected the classification results significantly.
Keywords: AIS, ANN, ECG, hybrid classifiers, PSO
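The evaluation protocol in this abstract (shuffle the data, then apply 4-fold cross validation) can be sketched as follows. The generator name `k_fold_indices`, the fixed seed, and the sample count are assumptions for illustration; the classifiers themselves are not reproduced.

```python
import random


def k_fold_indices(n_samples, k=4, seed=0):
    """Yield (train, test) index lists for shuffled k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once before splitting
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 10 samples: each sample is tested exactly once.
splits = list(k_fold_indices(10, k=4))
for train, test in splits:
    print(len(train), len(test))
```

Accuracy and training time would then be averaged over the four folds for each classifier being compared.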
Procedia PDF Downloads 442
1588 Treatment of Simulated Textile Wastewater Containing Reactive Azo Dyes Using Laboratory Scale Trickling Filter
Authors: Ayesha Irum, Sadia Mumtaz, Abdul Rehman, Iffat Naz, Safia Ahmed
Abstract:
The present study was conducted to evaluate the potential applicability of a biological trickling filter system for the treatment of simulated textile wastewater containing reactive azo dyes with a bacterial consortium under non-sterile conditions. The percentage decolorization for the treatment of wastewater containing structurally different dyes was found to be higher than 95% in all trials. The stable bacterial count of the biofilm on the stone media of the trickling filter during the treatment confirmed the presence, proliferation, dominance and involvement of the added microbial consortium in the treatment of textile wastewater. Results of physicochemical parameters revealed reductions in chemical oxygen demand (58.5-75.1%), sulphates (18.9-36.5%), and phosphates (63.6-73.0%). UV-Visible and FTIR spectroscopy confirmed that decolorization of the dye-containing wastewater was the ultimate consequence of biodegradation. Toxicological studies revealed the nontoxic nature of the degradative metabolites.
Keywords: biodegradation, textile dyes, waste water, trickling filters
Procedia PDF Downloads 433
1587 Detection of Curvilinear Structure via Recursive Anisotropic Diffusion
Authors: Sardorbek Numonov, Hyohun Kim, Dongwha Shin, Yeonseok Kim, Ji-Su Ahn, Dongeun Choi, Byung-Woo Hong
Abstract:
The detection of curvilinear structures often plays an important role in the analysis of images. In particular, it is considered as a crucial step for the diagnosis of chronic respiratory diseases to localize the fissures in chest CT imagery where the lung is divided into five lobes by the fissures that are characterized by linear features in appearance. However, the characteristic linear features for the fissures are often shown to be subtle due to the high intensity variability, pathological deformation or image noise involved in the imaging procedure, which leads to the uncertainty in the quantification of anatomical or functional properties of the lung. Thus, it is desired to enhance the linear features present in the chest CT images so that the distinctiveness in the delineation of the lobe is improved. We propose a recursive diffusion process that prefers coherent features based on the analysis of structure tensor in an anisotropic manner. The local image features associated with certain scales and directions can be characterized by the eigenanalysis of the structure tensor that is often regularized via isotropic diffusion filters. However, the isotropic diffusion filters involved in the computation of the structure tensor generally blur geometrically significant structure of the features leading to the degradation of the characteristic power in the feature space. Thus, it is required to take into consideration of local structure of the feature in scale and direction when computing the structure tensor. We apply an anisotropic diffusion in consideration of scale and direction of the features in the computation of the structure tensor that subsequently provides the geometrical structure of the features by its eigenanalysis that determines the shape of the anisotropic diffusion kernel. 
The recursive application of the anisotropic diffusion, with a kernel whose shape is derived from the structure tensor, leads to an anisotropic scale-space where the geometrical features are preserved via the eigenanalysis of the structure tensor computed from the diffused image. The recursive interaction between the anisotropic diffusion based on the geometry-driven kernels and the computation of the structure tensor that determines the shape of the diffusion kernels yields a scale-space where geometrical properties of the image structure are effectively characterized. We apply our recursive anisotropic diffusion algorithm to the detection of curvilinear structure in chest CT imagery, where the fissures present curvilinear features and define the boundary of lobes. It is shown that our algorithm yields precise detection of the fissures while overcoming the subtlety in defining the characteristic linear features. The quantitative evaluation demonstrates the robustness and effectiveness of the proposed algorithm for the detection of fissures in chest CT in terms of the false positive and true positive measures. The receiver operating characteristic curves indicate the potential of our algorithm as a segmentation tool in the clinical environment. This work was supported by the MISP (Ministry of Science and ICT), Korea, under the National Program for Excellence in SW (20170001000011001) supervised by the IITP (Institute for Information and Communications Technology Promotion).
Keywords: anisotropic diffusion, chest CT imagery, chronic respiratory disease, curvilinear structure, fissure detection, structure tensor
Procedia PDF Downloads 232
1586 Life Stage Customer Segmentation by Fine-Tuning Large Language Models
Authors: Nikita Katyal, Shaurya Uppal
Abstract:
This paper tackles the significant challenge of accurately classifying customers within a retailer’s customer base. Accurate classification is essential for developing targeted marketing strategies that effectively engage this important demographic. To address this issue, we propose a method that utilizes Large Language Models (LLMs). By employing LLMs, we analyze the metadata associated with product purchases derived from historical data to identify key product categories that act as distinguishing factors. These categories, such as baby food, eldercare products, or family-sized packages, offer valuable insights into the likely household composition of customers, including families with babies, families with kids/teenagers, families with pets, households caring for elders, or mixed households. We segment high-confidence customers into distinct categories by integrating historical purchase behavior with LLM-powered product classification. This paper asserts that life stage segmentation can significantly enhance e-commerce businesses’ ability to target the appropriate customers with tailored products and campaigns, thereby augmenting sales and improving customer retention. Additionally, the paper details the data sources, model architecture, and evaluation metrics employed for the segmentation task.
Keywords: LLMs, segmentation, product tags, fine-tuning, target segments, marketing communication
Procedia PDF Downloads 23
1585 A Survey on Lossless Compression of Bayer Color Filter Array Images
Authors: Alina Trifan, António J. R. Neves
Abstract:
Although most digital cameras acquire images in a raw format, based on a Color Filter Array that arranges RGB color filters on a square grid of photosensors, most image compression techniques do not use the raw data; instead, they use the RGB result of an interpolation algorithm applied to the raw data. This approach is inefficient: by performing a lossless compression of the raw data, followed by pixel interpolation, digital cameras could be more power efficient and provide images with increased resolution, given that the interpolation step could be shifted to an external processing unit. In this paper, we conduct a survey on the use of lossless compression algorithms with raw Bayer images. Moreover, in order to reduce the effect of the transitions between colors that increase the entropy of the raw Bayer image, we split the image into three new images corresponding to each channel (red, green and blue) and we study the same compression algorithms applied to each one individually. This simple pre-processing stage allows an improvement of more than 15% in predictive based methods.
Keywords: Bayer image, CFA, lossless compression, image coding standards
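The pre-processing stage described in this abstract, splitting the raw mosaic into one image per color channel before compression, could be sketched as below. The RGGB layout, the function name `split_bayer_rggb`, the way the two green samples of each 2x2 cell are packed into one plane, and the toy pixel values are all assumptions; the abstract does not specify these details.

```python
def split_bayer_rggb(mosaic):
    """Split a Bayer mosaic (list of rows) into red, green and blue planes.

    Assumes an RGGB layout: even rows are R G R G ..., odd rows are G B G B ...
    """
    red = [row[0::2] for row in mosaic[0::2]]
    blue = [row[1::2] for row in mosaic[1::2]]
    # Green sits at odd columns on even rows and at even columns on odd rows.
    green = [row[1::2] if y % 2 == 0 else row[0::2]
             for y, row in enumerate(mosaic)]
    return red, green, blue


# Toy 4x4 mosaic: 1x = red, 2x and 3x = green, 4x = blue samples.
mosaic = [
    [10, 20, 11, 21],
    [30, 40, 31, 41],
    [12, 22, 13, 23],
    [32, 42, 33, 43],
]
red, green, blue = split_bayer_rggb(mosaic)
print(red, green, blue)
```

Each plane contains samples of a single color only, which removes the color-to-color transitions that the abstract identifies as a source of extra entropy.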
Procedia PDF Downloads 321
1584 The Classification Accuracy of Finance Data through Holder Functions
Authors: Yeliz Karaca, Carlo Cattani
Abstract:
This study focuses on the local Holder exponent as a measure of the function regularity for time series related to finance data. In this study, the attributes of the finance dataset belonging to 13 countries (India, China, Japan, Sweden, France, Germany, Italy, Australia, Mexico, United Kingdom, Argentina, Brazil, USA) located in 5 different continents (Asia, Europe, Australia, North America and South America) have been examined. These countries are the ones mostly affected by the attributes with regard to financial development, covering a period from 2012 to 2017. Our study is concerned with the most important attributes that have an impact on the development of finance for the countries identified. Our method comprises the following stages: (a) among the multifractal methods and Brownian motion Holder regularity functions (polynomial, exponential), significant and self-similar attributes have been identified; (b) the significant and self-similar attributes have been applied to the Artificial Neural Network (ANN) algorithms (Feed Forward Back Propagation (FFBP) and Cascade Forward Back Propagation (CFBP)); (c) the outcomes of classification accuracy have been compared concerning the attributes which affect the countries’ financial development. This study has revealed, through the application of ANN algorithms, how the most significant attributes are identified within the relevant dataset via the Holder functions (polynomial and exponential function).
Keywords: artificial neural networks, finance data, Holder regularity, multifractals
Procedia PDF Downloads 246
1583 Artificial Intelligence Assisted Sentiment Analysis of Hotel Reviews Using Topic Modeling
Authors: Sushma Ghogale
Abstract:
With the surge in user-generated content, feedback and reviews on the internet, it has become both possible and important to know consumers' opinions about products and services. This data is important for both potential customers and the businesses providing the services. Data from social media is attracting significant attention and has become the most prominent channel for expressing unregulated opinions. Prospective customers look for reviews from experienced customers before deciding to buy a product or service. Several websites provide a platform for users to post their feedback for the provider and potential customers. However, the biggest challenge in analyzing such data is extracting latent features and providing term-level analysis of the data. This paper proposes an approach that uses topic modeling to classify the reviews into topics and sentiment analysis to mine the opinions. This approach can analyse and classify latent topics mentioned by reviewers on business sites, review sites, or social media, using topic modeling to identify the importance of each topic. It is followed by sentiment analysis to assess the satisfaction level for each topic. This approach provides a classification of hotel reviews using multiple machine learning techniques and compares different classifiers to mine the opinions of user reviews through sentiment analysis. The experiment concludes that the Multinomial Naïve Bayes classifier produces higher accuracy than the other classifiers.
Keywords: latent Dirichlet allocation, topic modeling, text classification, sentiment analysis
Procedia PDF Downloads 97
1582 Change Detection and Analysis of Desertification Processes in Semi Arid Land in Algeria Using Landsat Data
Authors: Zegrar Ahmed, Ghabi Mohamed
Abstract:
The degradation of arid and semi-arid ecosystems in Algeria has become a palpable fact that hinders progress and rural development. In these exceptionally fragile environments, vegetation is declining at an alarming rate and wind erosion dominates. The ecosystem is subjected to a long hot dry season and low annual average rainfall. The urgency of the fight against desertification is imposed by the very nature of the process, which tends to self-accelerate and, when human intervention is not forthcoming, results in irreversible situations that prevent any possibility of restoring these zones. These phenomena have led to different degradation processes, such as the destruction of vegetation, soil erosion, and deterioration of the physical environment. In this study, the work is mainly based on the criteria for classification and identification of physical parameters for spatial and multi-source analysis to determine the vulnerability of major steppe formations and their impact on desertification. We used Landsat data from two different dates, March 2010 and November 2014, in order to determine the changes in land cover, sand movement and land degradation for the diagnosis of the desertification phenomenon. Specific processes, including supervised classification, were applied to characterize the main steppe formations. An analysis of the vulnerability of plant communities was conducted to assign weights and identify areas most susceptible to desertification. Vegetation indices are used to characterize the steppe formations to determine changes in land use.
Keywords: remote sensing, GIS, ecosystem, degradation, desertification
Procedia PDF Downloads 339
1581 An Efficient Separation for Convolutive Mixtures
Authors: Salah Al-Din I. Badran, Samad Ahmadi, Dylan Menzies, Ismail Shahin
Abstract:
This paper describes a new, efficient blind source separation method that uses a non-uniform filter bank and a new structure with different sub-bands. The method reduces permutation and increases convergence speed compared with the full-band algorithm. Recently, some structures have been suggested to deal with two problems: reducing permutation and increasing the convergence speed of the adaptive algorithm for correlated input signals. The permutation problem is avoided by using adaptive filters of lower order than the full-band adaptive filter, operating at a sampling rate lower than that of the input signal. The signals decomposed by the analysis filter bank are less correlated in each sub-band than the full-band input signal, which can promote better convergence rates.
Keywords: Blind source separation, estimates, full-band, mixtures, sub-band
Procedia PDF Downloads 445
1580 Navigating Government Finance Statistics: Effortless Retrieval and Comparative Analysis through Data Science and Machine Learning
Authors: Kwaku Damoah
Abstract:
This paper presents a methodology and software application (App) designed to empower users in accessing, retrieving, and comparatively exploring data within the hierarchical network framework of the Government Finance Statistics (GFS) system. It explores the ease of navigating the GFS system and identifies the gaps filled by the new methodology and App. The GFS embodies a complex Hierarchical Network Classification (HNC) structure, encapsulating institutional units, revenues, expenses, assets, liabilities, and economic activities. Navigating this structure demands specialized knowledge, experience, and skill, posing a significant challenge for effective analytics and fiscal policy decision-making. Many professionals encounter difficulties deciphering these classifications, which hinders confident use of the system. This accessibility barrier prevents a vast number of professionals, students, policymakers, and members of the public from leveraging the abundant data and information within the GFS. An efficient methodology enabling users to access, navigate, and conduct exploratory comparisons was developed using the R programming language, data science analytics, and machine learning. The machine learning Fiscal Analytics App (FLOWZZ) democratizes access to advanced analytics through its user-friendly interface, breaking down expertise barriers.
Keywords: data science, data wrangling, drilldown analytics, government finance statistics, hierarchical network classification, machine learning, web application.
Procedia PDF Downloads 70
1579 Study on Filter for Semiconductor of Minimizing Damage by X-Ray Laminography
Authors: Chan Jong Park, Hye Min Park, Jeong Ho Kim, Ki Hyun Park, Koan Sik Joo
Abstract:
This research used the MCNPX simulation program to evaluate the utility of a filter that was developed to minimize the damage to a semiconductor device during defect testing with X-ray. The X-ray generator was designed using the MCNPX code, and the X-ray absorption spectrum of the semiconductor device was obtained based on the designed X-ray generator code. To evaluate the utility of the filter, the X-ray absorption rates of the semiconductor device were calculated and compared for Ag, Rh, Mo and V filters with thicknesses of 25 μm, 50 μm, and 75 μm. The results showed that the X-ray absorption rate varied with the type and thickness of the filter, ranging from 8.74% to 49.28%. The Rh filter showed the highest X-ray absorption rates of 29.8%, 15.18% and 8.74% for the above-mentioned filter thicknesses. As shown above, the characteristics of the X-ray absorption with respect to the type and thickness of the filter were identified using MCNPX simulation. With these results, both time and expense could be saved in the production of the desired filter. In the future, this filter will be produced, and its performance will be evaluated.
Keywords: X-ray, MCNPX, filter, semiconductor, damage
Procedia PDF Downloads 424
1578 Safety Considerations of Furanics for Sustainable Applications in Advanced Biorefineries
Authors: Anitha Muralidhara, Victor Engelen, Christophe Len, Pascal Pandard, Guy Marlair
Abstract:
Production of bio-based chemicals and materials from lignocellulosic biomass is gaining tremendous importance in advanced biorefineries, with the aim of progressively replacing petroleum-based chemicals in transportation fuels and commodity polymers. One such effort has resulted in the production of key furan derivatives (FD) such as furfural, HMF, and MMF via acid-catalyzed dehydration (ACD) of C6 and C5 sugars; these are further converted into key chemicals or intermediates (such as furandicarboxylic acid and furfuryl alcohol). In subsequent processes, many high-potential FD are produced that can be converted into high added-value polymers or high energy density biofuels. During ACD, an unavoidable polyfuranic byproduct called humins is generated. The family of FD is very large, with varying chemical structures and diverse physicochemical properties. Accordingly, the associated risk profiles may vary widely. Hazardous material (haz-mat) classification systems such as GHS (CLP in the EU) and the UN TDG Model Regulations for the transport of dangerous goods are among the preliminary requirements for all chemicals for their appropriate classification, labelling, packaging, safe storage, and transportation. Considering the growing application routes of FD, it is important to note the limited access to safety-related information (safety data sheets are available only for well-known compounds such as HMF and furfural) in these internationally recognized haz-mat classification systems. Moreover, these classifications do not necessarily provide information about the extent of risk involved when a chemical is used in a specific application. Factors such as thermal stability, speed of combustion, and chemical incompatibilities can equally influence the safety profile of a compound, yet they are clearly outside the scope of any haz-mat classification system.
Irrespective of their bio-based origin, FD have so far received inconsistent remarks concerning their toxicity profiles. With such inconsistencies, there is a fear that a large family of FD may follow the same extreme judgmental scenarios as ionic liquids, with some compounds ranked as extremely thermally stable, non-flammable, etc. Unless clarified, these messages could lead to misleading judgements when ranking a chemical based on its hazard rating. Safety is a key aspect of any sustainable biorefinery operation or facility, yet it is often underestimated or neglected. To fill these data gaps and to address ambiguities and discrepancies, the current study gives preliminary insights into the safety assessment of FD and their potential targeted by-products. Using information available in the literature and experimental results obtained, the physicochemical safety, environmental safety, and (scenario-based) fire safety profiles of key FD, as well as side streams such as humins and levulinic acid, will be considered. With this, the study focuses on defining patterns and trends that give coherent safety-related information for existing and newly synthesized FD on the market, for better functionality and sustainable applications.
Keywords: furanics, humins, safety, thermal and fire hazard, toxicity
Procedia PDF Downloads 166
1577 Application of MALDI-MS to Differentiate SARS-CoV-2 and Non-SARS-CoV-2 Symptomatic Infections in the Early and Late Phases of the Pandemic
Authors: Dmitriy Babenko, Sergey Yegorov, Ilya Korshukov, Aidana Sultanbekova, Valentina Barkhanskaya, Tatiana Bashirova, Yerzhan Zhunusov, Yevgeniya Li, Viktoriya Parakhina, Svetlana Kolesnichenko, Yeldar Baiken, Aruzhan Pralieva, Zhibek Zhumadilova, Matthew S. Miller, Gonzalo H. Hortelano, Anar Turmuhambetova, Antonella E. Chesca, Irina Kadyrova
Abstract:
Introduction: The rapidly evolving COVID-19 pandemic, along with the re-emergence of pathogens causing acute respiratory infections (ARI), has necessitated the development of novel diagnostic tools to differentiate various causes of ARI. MALDI-MS, due to its wide usage and affordability, has been proposed as a potential instrument for diagnosing SARS-CoV-2 versus non-SARS-CoV-2 ARI. The aim of this study was to investigate the potential of MALDI-MS in conjunction with a machine learning model to accurately distinguish between symptomatic infections caused by SARS-CoV-2 and non-SARS-CoV-2 during both the early and later phases of the pandemic. Furthermore, this study aimed to analyze mass spectrometry (MS) data obtained from nasal swabs of healthy individuals. Methods: We gathered mass spectra from 252 samples, comprising 108 SARS-CoV-2-positive samples obtained in 2020 (Covid 2020), 7 SARS-CoV-2-positive samples obtained in 2023 (Covid 2023), 71 samples from symptomatic individuals without SARS-CoV-2 (Control non-Covid ARVI), and 66 samples from healthy individuals (Control healthy). All the samples were subjected to RT-PCR testing. For data analysis, we employed the caret R package to train and test seven machine-learning algorithms: C5.0, KNN, NB, RF, SVM-L, SVM-R, and XGBoost. Training used nested cross-validation with randomized stratified splitting: five outer folds, each containing a ten-fold inner cross-validation repeated five times. Results: In this study, we utilized the Covid 2020 dataset as a case group and the non-Covid ARVI dataset as a control group to train and test various machine learning (ML) models. Among these models, XGBoost and SVM-R demonstrated the highest performance, with accuracy values of 0.97 [0.93; 0.97] and 0.95 [0.95; 0.97], specificity values of 0.86 [0.71; 0.93] and 0.86 [0.79; 0.87], and sensitivity values of 0.984 [0.984; 1.000] and 1.000 [0.968; 1.000], respectively.
When examining the Covid 2023 dataset, the Naive Bayes model achieved the highest classification accuracy of 43%, while XGBoost and SVM-R achieved accuracies of 14%. For the healthy control dataset, the accuracy of the models ranged from 0.27 [0.24; 0.32] for k-nearest neighbors to 0.44 [0.41; 0.45] for the Support Vector Machine with a radial basis function kernel. Conclusion: ML models trained on MALDI-MS of nasopharyngeal swabs obtained from patients with Covid during the initial phase of the pandemic, as well as from symptomatic non-Covid individuals, showed excellent classification performance, which aligns with the results of previous studies. However, when applied to swabs from healthy individuals and to a limited sample of patients with Covid in the late phase of the pandemic, the ML models exhibited lower classification accuracy.
Keywords: SARS-CoV-2, MALDI-TOF MS, ML models, nasopharyngeal swabs, classification
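The nested, stratified splitting described in the Methods can be illustrated with a small sketch. The study used the caret R package; the Python below is not that implementation, only an assumption-laden illustration of the index bookkeeping for five outer folds with a ten-fold inner cross-validation repeated five times. The helper names `stratified_folds` and `nested_splits` are hypothetical; the class sizes (108 Covid 2020, 71 non-Covid ARVI) come from the abstract.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign each sample index to one of k folds, keeping class proportions roughly equal."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal indices round-robin so every fold gets a near-equal share of the class.
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


def nested_splits(labels, outer_k=5, inner_k=10, repeats=5, seed=0):
    """Yield (outer_test, inner_validation) index lists for every inner split."""
    for outer_id, outer_test in enumerate(stratified_folds(labels, outer_k, seed)):
        held_out = set(outer_test)
        train = [i for i in range(len(labels)) if i not in held_out]
        train_labels = [labels[i] for i in train]
        for rep in range(repeats):
            inner_seed = seed + 1 + outer_id * repeats + rep
            for inner in stratified_folds(train_labels, inner_k, inner_seed):
                # Map positions within the training subset back to original indices.
                yield outer_test, [train[j] for j in inner]


# Class sizes from the abstract: 1 = Covid 2020, 0 = non-Covid ARVI.
labels = [1] * 108 + [0] * 71
outer = stratified_folds(labels, 5)
print([len(fold) for fold in outer])
print(sum(1 for _ in nested_splits(labels)))
```

The point of the nesting is that hyperparameters are tuned only on inner validation folds, so each outer test fold stays untouched until the final performance estimate.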
Procedia PDF Downloads 108
1576 Optimization Based Extreme Learning Machine for Watermarking of an Image in DWT Domain
Authors: Ram Pal Singh, Vikash Chaudhary, Monika Verma
Abstract:
In this paper, we propose an implementation of an optimization-based Extreme Learning Machine (ELM) for watermarking the B-channel of a color image in the discrete wavelet transform (DWT) domain. ELM, a regularization algorithm, is based on generalized single-hidden-layer feed-forward neural networks (SLFNs). However, the hidden layer parameters, generally called the feature mapping in the context of ELM, need not be tuned every time. This paper shows the embedding and extraction processes of the watermark with the help of ELM, and the results are compared with machine learning models already used for watermarking. Here, a cover image is divided into a suitable number of non-overlapping blocks of the required size, and DWT is applied to each block to transform it into the low-frequency sub-band domain. ELM provides a unified learning platform with a feature mapping, that is, a mapping between the hidden layer and the output layer of SLFNs, which is applied to watermark embedding and extraction in a cover image. ELM has widespread applications, ranging from binary and multiclass classification to regression and function estimation. Unlike SVM-based algorithms, which achieve suboptimal solutions with high computational complexity, ELM can provide better generalization performance with very small complexity. The efficacy of the optimization-based ELM algorithm is measured using quantitative and qualitative parameters on a watermarked image, even when the image is subjected to different types of geometrical and conventional attacks.
Keywords: BER, DWT, extreme learning machine (ELM), PSNR
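The block-wise DWT step described above can be sketched as follows. The abstract does not name the wavelet or the block size, so this sketch assumes a single-level orthonormal Haar transform and 2x2 blocks purely for illustration; `split_blocks` and `haar_ll` are hypothetical helper names, and the ELM embedding itself is not shown.

```python
def split_blocks(image, size):
    """Split a 2D list into non-overlapping size x size blocks, in row-major order.

    Assumes both image dimensions are multiples of size.
    """
    blocks = []
    for r in range(0, len(image), size):
        for c in range(0, len(image[0]), size):
            blocks.append([row[c:c + size] for row in image[r:r + size]])
    return blocks


def haar_ll(block):
    """Low-frequency (LL) sub-band of a single-level orthonormal 2D Haar transform.

    Each 2x2 cell (a, b, c, d) maps to (a + b + c + d) / 2.
    Assumes the block has even height and width.
    """
    return [
        [
            (block[r][c] + block[r][c + 1] + block[r + 1][c] + block[r + 1][c + 1]) / 2.0
            for c in range(0, len(block[0]), 2)
        ]
        for r in range(0, len(block), 2)
    ]


# Hypothetical 4x4 channel with pixel values 0..15 (not data from the paper).
channel = [[4 * r + c for c in range(4)] for r in range(4)]
blocks = split_blocks(channel, 2)
ll_bands = [haar_ll(block) for block in blocks]
print(ll_bands)
```

Watermark bits are typically embedded in low-frequency coefficients such as these because they survive compression and filtering better than high-frequency detail; which coefficients the authors actually modify is not stated in the abstract.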
Procedia PDF Downloads 311
1575 Information Technology Impacts on the Supply Chain Performance: Case Study Approach
Authors: Kajal Zarei
Abstract:
Supply chain management is becoming an increasingly important issue in many businesses today. Management deficiencies in different segments of the supply chain, a lack of streamlined processes, resistance to changing current systems and technologies, and a lack of advanced information systems have created a need for innovative research studies. To this end, information technology (IT) is becoming a major driver for overcoming supply chain limitations and deficiencies. The emergence of IT has provided an excellent opportunity to redefine the supply chain to be more effective and competitive. This paper investigates the impact of IT on two-digit industry codes in the International Standard Industrial Classification (ISIC) that operate in four groups of supply chains. First, the primary fields of the supply chain were investigated, and then paired comparisons of different industry parts were carried out. Using expert opinion and the Analytical Hierarchy Process (AHP), the status of industrial activities in Kurdistan Province in Iran was determined. The results revealed that the manufacturing and inventory fields were more important than other fields of the supply chain. In addition, IT has had a greater impact on the food and beverage industry, the chemical industry, the wood industry, wood products, and the production of basic metals. The results indicated the need for IT awareness in supply chain management; in other words, IT applications need to be developed for the identified industries.
Keywords: supply chain, information technology, analytical hierarchy process, two-digit codes, international standard industrial classification
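The paired comparisons and AHP weighting mentioned above can be illustrated with a short sketch. The abstract does not give the comparison matrices or say how the priorities were computed, so this assumes the row geometric-mean approximation and Saaty's random index for the consistency check. The three criteria and the judgment values are hypothetical.

```python
import math

# Saaty's random consistency index for matrix sizes 3 to 9.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Consistency ratio CR = CI / RI; values below about 0.1 are usually accepted."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    weighted = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(weighted[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


# Hypothetical pairwise judgments for three supply chain fields
# (manufacturing, inventory, distribution); not values from the study.
judgments = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 2.0],
    [1 / 4, 1 / 2, 1.0],
]
weights = ahp_weights(judgments)
print([round(w, 3) for w in weights])
print(round(consistency_ratio(judgments, weights), 3))
```

The example matrix is perfectly consistent, so the ratio is zero; real expert judgments rarely are, which is why the consistency check is normally reported alongside the weights.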
Procedia PDF Downloads 281
1574 A Novel Approach for the Analysis of Ground Water Quality by Using Classification Rules and Water Quality Index
Authors: Kamakshaiah Kolli, R. Seshadri
Abstract:
Water is a key resource in all economic activities, ranging from agriculture to industry. Only a tiny fraction of the planet's abundant water is available to us as fresh water. Assessment of water quality has always been paramount in the field of environmental quality management. It is the foundation for health, hygiene, progress, and prosperity. With the ever-increasing pressure of the human population, there is severe stress on water resources. Efficient water management is therefore essential to civil society for improving quality of life. The present study focuses on groundwater quality, sources of groundwater contamination, variation in groundwater quality, and its spatial distribution. Groundwater quality assessment is based on groundwater bodies and a representative monitoring network that enables the chemical status of a groundwater body to be determined. For this study, water samples were collected from various areas across the entire corporation area of Guntur. Water is required by all living organisms, and 1.7% of it is available as groundwater. Water has no calories or nutrients but is essential for various metabolic activities in the body. Chemical and physical parameters can be tested to determine the potability of groundwater. Electrical conductivity, pH, alkalinity, total alkalinity, TDS, calcium, magnesium, sodium, potassium, chloride, and sulphate were analyzed in groundwater from different areas of Guntur district. Our aim is to check whether the groundwater from these areas is potable. As multiple variables are present, a data mining technique using JRIP rules was employed to classify the groundwater.
Keywords: groundwater, water quality standards, potability, data mining, JRIP, PCA, classification
Procedia PDF Downloads 430
1573 Cheiloscopy and Dactylography in Relation to ABO Blood Groups: Egyptian vs. Malay Populations
Authors: Manal Hassan Abdel Aziz, Fatma Mohamed Magdy Badr El Dine, Nourhan Mohamed Mohamed Saeed
Abstract:
Establishing an association between lip print patterns, fingerprint patterns, and blood groups is of fundamental importance in forensic identification. The first aim of the current study was to determine the prevalent types of ABO blood groups, lip print patterns, and fingerprint patterns in both studied populations. The second was to analyze any relation found between the different print patterns and the blood groups, which would be valuable for identification purposes. The study was conducted on 60 healthy volunteers (30 males and 30 females) from each of the studied populations. Lip prints and fingerprints were obtained and classified according to Tsuchihashi's classification and Michael Kuchen’s classification, respectively. The results show that the ulnar loop was the most frequent fingerprint pattern in both populations. Blood group A was the most frequent among Egyptians, while blood groups O and B were predominant among Malaysians. Significant relations were observed between lip print patterns and fingerprints (in the second quadrant for Egyptian males and the first for Malaysian males). For Malaysian females, a statistically significant association was found in the fourth quadrant. Regarding the blood groups, 89.5% of ulnar loops were significantly related to blood group A among Egyptian males. The results showed an association between fingerprint patterns and lip prints, as well as between the ABO blood group and fingerprint patterns. However, further research with larger sample sizes is needed to confirm the current results.
Keywords: ABO, cheiloscopy, dactylography, Egyptians, Malaysians
Procedia PDF Downloads 219
1572 Algorithm for Modelling Land Surface Temperature and Land Cover Classification and Their Interaction
Authors: Jigg Pelayo, Ricardo Villar, Einstine Opiso
Abstract:
The rampant and unintended spread of urban areas has increased the artificial component features in the land cover types of the countryside and brought forth the urban heat island (UHI). This has paved the way for a wide range of negative influences on human health and the environment, commonly related to air pollution, drought, higher energy demand, and water shortage. Land cover type also plays a relevant role in understanding the interaction between ground surfaces and the local temperature. At the moment, the depiction of land surface temperature (LST) at city/municipality scale, particularly in certain areas of Misamis Oriental, Philippines, is inadequate to support efficient mitigation of and adaptation to the surface urban heat island (SUHI). Thus, this study applies Landsat 8 satellite data and low-density Light Detection and Ranging (LiDAR) products to map a quality automated LST model and crop-level land cover classification at a local scale, through a theoretical and algorithm-based approach utilizing the principle of data analysis subjected to a multi-dimensional image object model. The paper also aims to explore the relationship between the derived LST and land cover classification. The results of the presented model showed the ability of comprehensive data analysis and GIS functionalities, integrated with an object-based image analysis (OBIA) approach, to automate complex map production processes with considerable efficiency and high accuracy. The findings may potentially lead to expanded investigation of the temporal dynamics of land surface UHI.
It is worth noting that studying the environmental significance of these interactions through the combined application of remote sensing, geographic information tools, mathematical morphology, and data analysis can provide microclimate perception, awareness, and improved decision-making for land use planning and characterization at local and neighborhood scale. As a result, it can help identify problems and support mitigation and adaptation more efficiently.
Keywords: LiDAR, OBIA, remote sensing, local scale
Procedia PDF Downloads 282
1571 A Comparative Analysis of Hyper-Parameters Using Neural Networks for E-Mail Spam Detection
Authors: Syed Mahbubuz Zaman, A. B. M. Abrar Haque, Mehedi Hassan Nayeem, Misbah Uddin Sagor
Abstract:
Every day, e-mail is used by millions of people as an effective form of communication over the Internet. Although e-mail allows high-speed communication, it faces a constant threat known as spam. Spam e-mails, often called junk e-mails, are unsolicited and sent in bulk. These unsolicited e-mails cause security concerns among internet users because they expose them to inappropriate content. Static filters offer no guaranteed way to stop spammers, as they are bypassed very easily. In this paper, a smart system is proposed that uses neural networks to approach spam in a different way and also detects the most relevant features to help design the spam filter. In addition, different parameters for different neural network models are compared to determine which model works best within suitable parameters.
Keywords: long short-term memory, bidirectional long short-term memory, gated recurrent unit, natural language processing
Procedia PDF Downloads 205
1570 Time and Cost Prediction Models for Language Classification Over a Large Corpus on Spark
Authors: Jairson Barbosa Rodrigues, Paulo Romero Martins Maciel, Germano Crispim Vasconcelos
Abstract:
This paper investigates the performance impact of varying five factors (input data size, node number, cores, memory, and disks) when applying a distributed implementation of Naïve Bayes to text classification of a large corpus on the Spark big data processing framework. Problem: The algorithm's performance depends on multiple factors, and knowing the effects of each factor beforehand becomes especially critical because hardware is priced by time slice in cloud environments. Objectives: To explain the functional relationship between factors and performance and to develop linear predictor models for time and cost. Methods: The study applies the statistical principles of Design of Experiments (DoE), particularly the randomized two-level fractional factorial design with replications. This research involved 48 real clusters with different hardware arrangements. The metrics were analyzed using linear models for screening, ranking, and measurement of each factor's impact. Results: Our findings include prediction models and show some non-intuitive results: the small influence of cores and the neutrality of memory and disks on total execution time, and the non-significant impact of input data scale on costs, although it notably impacts the execution time.
Keywords: big data, design of experiments, distributed machine learning, natural language processing, spark
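A two-level fractional factorial design over the five factors named above can be sketched as follows. The abstract does not state which fraction or generator was used, so this assumes a half fraction (2^(5-1), 16 runs) with the fifth factor set to the product of the other four. The factor identifiers and the synthetic response are hypothetical and do not reproduce the paper's measurements.

```python
from itertools import product

# Hypothetical identifiers for the five factors named in the abstract.
FACTORS = ["data_size", "nodes", "cores", "memory", "disks"]


def half_fraction_design():
    """Build a 2^(5-1) design: full factorial in four factors, fifth = product of the four."""
    return [(a, b, c, d, a * b * c * d) for a, b, c, d in product((-1, 1), repeat=4)]


def main_effects(runs, responses):
    """Main effect of a factor = mean response at its high level minus mean at its low level."""
    half = len(runs) / 2  # each level appears in exactly half of the runs
    effects = {}
    for col, name in enumerate(FACTORS):
        high = sum(y for run, y in zip(runs, responses) if run[col] == 1)
        low = sum(y for run, y in zip(runs, responses) if run[col] == -1)
        effects[name] = (high - low) / half
    return effects


runs = half_fraction_design()
# Synthetic execution times: only data size and node count matter in this toy model.
times = [100 + 20 * run[0] - 10 * run[1] for run in runs]
print(main_effects(runs, times))
```

Because the design columns are balanced and mutually orthogonal, each main effect is estimated independently of the others; replicating the runs, as the study did, additionally gives an error estimate for significance testing.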
Procedia PDF Downloads 120
1569 Using Predictive Analytics to Identify First-Year Engineering Students at Risk of Failing
Authors: Beng Yew Low, Cher Liang Cha, Cheng Yong Teoh
Abstract:
Due to a lack of continual assessment or grade related data, identifying first-year engineering students in a polytechnic education at risk of failing is challenging. Our experience over the years tells us that there is no strong correlation between having good entry grades in Mathematics and the Sciences and excelling in hardcore engineering subjects. Hence, identifying students at risk of failure cannot be on the basis of entry grades in Mathematics and the Sciences alone. These factors compound the difficulty of early identification and intervention. This paper describes the development of a predictive analytics model in the early detection of students at risk of failing and evaluates its effectiveness. Data from continual assessments conducted in term one, supplemented by data of student psychological profiles such as interests and study habits, were used. Three classification techniques, namely Logistic Regression, K Nearest Neighbour, and Random Forest, were used in our predictive model. Based on our findings, Random Forest was determined to be the strongest predictor with an Area Under the Curve (AUC) value of 0.994. Correspondingly, the Accuracy, Precision, Recall, and F-Score were also highest among these three classifiers. Using this Random Forest Classification technique, students at risk of failure could be identified at the end of term one. They could then be assigned to a Learning Support Programme at the beginning of term two. This paper gathers the results of our findings. It also proposes further improvements that can be made to the model.
Keywords: continual assessment, predictive analytics, random forest, student psychological profile
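The evaluation measures named above (Accuracy, Precision, Recall, F-Score, and AUC) can be computed from a classifier's predicted scores as in the sketch below. It assumes a binary label where 1 means "at risk" and a 0.5 decision threshold; the labels and scores are made up for illustration and are not the study's data.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F-score for binary labels (1 = at risk)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


def auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative (ties count half)."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Made-up labels and predicted risk scores for eight students.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(classification_metrics(y_true, y_pred))
print(auc(y_true, scores))
```

AUC is threshold-independent, which is why it is often reported alongside the thresholded measures when the cost of missing an at-risk student differs from the cost of a false alarm.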
Procedia PDF Downloads 134