Search results for: LiDAR datasets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 846

Search results for: LiDAR datasets

696 Comparison Of Virtual Non-Contrast To True Non-Contrast Images Using Dual Layer Spectral Computed Tomography

Authors: O’Day Luke

Abstract:

Purpose: To validate virtual non-contrast reconstructions generated from dual-layer spectral computed tomography (DL-CT) data as an alternative for the acquisition of a dedicated true non-contrast dataset during multiphase contrast studies. Material and methods: Thirty-three patients underwent a routine multiphase clinical CT examination, using Dual-Layer Spectral CT, from March to August 2021. True non-contrast (TNC) and virtual non-contrast (VNC) datasets, generated from both portal venous and arterial phase imaging were evaluated. For every patient in both true and virtual non-contrast datasets, a region-of-interest (ROI) was defined in aorta, liver, fluid (i.e. gallbladder, urinary bladder), kidney, muscle, fat and spongious bone, resulting in 693 ROIs. Differences in attenuation for VNC and TNV images were compared, both separately and combined. Consistency between VNC reconstructions obtained from the arterial and portal venous phase was evaluated. Results: Comparison of CT density (HU) on the VNC and TNC images showed a high correlation. The mean difference between TNC and VNC images (excluding bone results) was 5.5 ± 9.1 HU and > 90% of all comparisons showed a difference of less than 15 HU. For all tissues but spongious bone, the mean absolute difference between TNC and VNC images was below 10 HU. VNC images derived from the arterial and the portal venous phase showed a good correlation in most tissue types. The aortic attenuation was somewhat dependent however on which dataset was used for reconstruction. Bone evaluation with VNC datasets continues to be a problem, as spectral CT algorithms are currently poor in differentiating bone and iodine. Conclusion: Given the increasing availability of DL-CT and proven accuracy of virtual non-contrast processing, VNC is a promising tool for generating additional data during routine contrast-enhanced studies. This study shows the utility of virtual non-contrast scans as an alternative for true non-contrast studies during multiphase CT, with potential for dose reduction, without loss of diagnostic information.

Keywords: dual-layer spectral computed tomography, virtual non-contrast, true non-contrast, clinical comparison

Procedia PDF Downloads 141
695 Long-Term Variabilities and Tendencies in the Zonally Averaged TIMED-SABER Ozone and Temperature in the Middle Atmosphere over 10°N-15°N

Authors: Oindrila Nath, S. Sridharan

Abstract:

Long-term (2002-2012) temperature and ozone measurements by Sounding of Atmosphere by Broadband Emission Radiometry (SABER) instrument onboard Thermosphere, Ionosphere, Mesosphere Energetics and Dynamics (TIMED) satellite zonally averaged over 10°N-15°N are used to study their long-term changes and their responses to solar cycle, quasi-biennial oscillation and El Nino Southern Oscillation. The region is selected to provide more accurate long-term trends and variabilities, which were not possible earlier with lidar measurements over Gadanki (13.5°N, 79.2°E), which are limited to cloud-free nights, whereas continuous data sets of SABER temperature and ozone are available. Regression analysis of temperature shows a cooling trend of 0.5K/decade in the stratosphere and that of 3K/decade in the mesosphere. Ozone shows a statistically significant decreasing trend of 1.3 ppmv per decade in the mesosphere although there is a small positive trend in stratosphere at 25 km. Other than this no significant ozone trend is observed in stratosphere. Negative ozone-QBO response (0.02ppmv/QBO), positive ozone-solar cycle (0.91ppmv/100SFU) and negative response to ENSO (0.51ppmv/SOI) have been found more in mesosphere whereas positive ozone response to ENSO (0.23ppmv/SOI) is pronounced in stratosphere (20-30 km). The temperature response to solar cycle is more positive (3.74K/100SFU) in the upper mesosphere and its response to ENSO is negative around 80 km and positive around 90-100 km and its response to QBO is insignificant at most of the heights. Composite monthly mean of ozone volume mixing ratio shows maximum values during pre-monsoon and post-monsoon season in middle stratosphere (25-30 km) and in upper mesosphere (85-95 km) around 10 ppmv. Composite monthly mean of temperature shows semi-annual variation with large values (~250-260 K) in equinox months and less values in solstice months in upper stratosphere and lower mesosphere (40-55 km) whereas the SAO becomes weaker above 55 km. The semi-annual variation again appears at 80-90 km, with large values in spring equinox and winter months. In the upper mesosphere (90-100 km), less temperature (~170-190 K) prevails in all the months except during September, when the temperature is slightly more. The height profiles of amplitudes of semi-annual and annual oscillations in ozone show maximum values of 6 ppmv and 2.5 ppmv respectively in upper mesosphere (80-100 km), whereas SAO and AO in temperature show maximum values of 5.8 K and 4.6 K in lower and middle mesosphere around 60-85 km. The phase profiles of both SAO and AO show downward progressions. These results are being compared with long-term lidar temperature measurements over Gadanki (13.5°N, 79.2°E) and the results obtained will be presented during the meeting.

Keywords: trends, QBO, solar cycle, ENSO, ozone, temperature

Procedia PDF Downloads 410
694 Analysis and Detection of Facial Expressions in Autism Spectrum Disorder People Using Machine Learning

Authors: Muhammad Maisam Abbas, Salman Tariq, Usama Riaz, Muhammad Tanveer, Humaira Abdul Ghafoor

Abstract:

Autism Spectrum Disorder (ASD) refers to a developmental disorder that impairs an individual's communication and interaction ability. Individuals feel difficult to read facial expressions while communicating or interacting. Facial Expression Recognition (FER) is a unique method of classifying basic human expressions, i.e., happiness, fear, surprise, sadness, disgust, neutral, and anger through static and dynamic sources. This paper conducts a comprehensive comparison and proposed optimal method for a continued research project—a system that can assist people who have Autism Spectrum Disorder (ASD) in recognizing facial expressions. Comparison has been conducted on three supervised learning algorithms EigenFace, FisherFace, and LBPH. The JAFFE, CK+, and TFEID (I&II) datasets have been used to train and test the algorithms. The results were then evaluated based on variance, standard deviation, and accuracy. The experiments showed that FisherFace has the highest accuracy for all datasets and is considered the best algorithm to be implemented in our system.

Keywords: autism spectrum disorder, ASD, EigenFace, facial expression recognition, FisherFace, local binary pattern histogram, LBPH

Procedia PDF Downloads 174
693 Principal Component Analysis Combined Machine Learning Techniques on Pharmaceutical Samples by Laser Induced Breakdown Spectroscopy

Authors: Kemal Efe Eseller, Göktuğ Yazici

Abstract:

Laser-induced breakdown spectroscopy (LIBS) is a rapid optical atomic emission spectroscopy which is used for material identification and analysis with the advantages of in-situ analysis, elimination of intensive sample preparation, and micro-destructive properties for the material to be tested. LIBS delivers short pulses of laser beams onto the material in order to create plasma by excitation of the material to a certain threshold. The plasma characteristics, which consist of wavelength value and intensity amplitude, depends on the material and the experiment’s environment. In the present work, medicine samples’ spectrum profiles were obtained via LIBS. Medicine samples’ datasets include two different concentrations for both paracetamol based medicines, namely Aferin and Parafon. The spectrum data of the samples were preprocessed via filling outliers based on quartiles, smoothing spectra to eliminate noise and normalizing both wavelength and intensity axis. Statistical information was obtained and principal component analysis (PCA) was incorporated to both the preprocessed and raw datasets. The machine learning models were set based on two different train-test splits, which were 70% training – 30% test and 80% training – 20% test. Cross-validation was preferred to protect the models against overfitting; thus the sample amount is small. The machine learning results of preprocessed and raw datasets were subjected to comparison for both splits. This is the first time that all supervised machine learning classification algorithms; consisting of Decision Trees, Discriminant, naïve Bayes, Support Vector Machines (SVM), k-NN(k-Nearest Neighbor) Ensemble Learning and Neural Network algorithms; were incorporated to LIBS data of paracetamol based pharmaceutical samples, and their different concentrations on preprocessed and raw dataset in order to observe the effect of preprocessing.

Keywords: machine learning, laser-induced breakdown spectroscopy, medicines, principal component analysis, preprocessing

Procedia PDF Downloads 87
692 A Review on Existing Challenges of Data Mining and Future Research Perspectives

Authors: Hema Bhardwaj, D. Srinivasa Rao

Abstract:

Technology for analysing, processing, and extracting meaningful data from enormous and complicated datasets can be termed as "big data." The technique of big data mining and big data analysis is extremely helpful for business movements such as making decisions, building organisational plans, researching the market efficiently, improving sales, etc., because typical management tools cannot handle such complicated datasets. Special computational and statistical issues, such as measurement errors, noise accumulation, spurious correlation, and storage and scalability limitations, are brought on by big data. These unique problems call for new computational and statistical paradigms. This research paper offers an overview of the literature on big data mining, its process, along with problems and difficulties, with a focus on the unique characteristics of big data. Organizations have several difficulties when undertaking data mining, which has an impact on their decision-making. Every day, terabytes of data are produced, yet only around 1% of that data is really analyzed. The idea of the mining and analysis of data and knowledge discovery techniques that have recently been created with practical application systems is presented in this study. This article's conclusion also includes a list of issues and difficulties for further research in the area. The report discusses the management's main big data and data mining challenges.

Keywords: big data, data mining, data analysis, knowledge discovery techniques, data mining challenges

Procedia PDF Downloads 110
691 A Hybrid Feature Selection Algorithm with Neural Network for Software Fault Prediction

Authors: Khalaf Khatatneh, Nabeel Al-Milli, Amjad Hudaib, Monther Ali Tarawneh

Abstract:

Software fault prediction identify potential faults in software modules during the development process. In this paper, we present a novel approach for software fault prediction by combining a feedforward neural network with particle swarm optimization (PSO). The PSO algorithm is employed as a feature selection technique to identify the most relevant metrics as inputs to the neural network. Which enhances the quality of feature selection and subsequently improves the performance of the neural network model. Through comprehensive experiments on software fault prediction datasets, the proposed hybrid approach achieves better results, outperforming traditional classification methods. The integration of PSO-based feature selection with the neural network enables the identification of critical metrics that provide more accurate fault prediction. Results shows the effectiveness of the proposed approach and its potential for reducing development costs and effort by detecting faults early in the software development lifecycle. Further research and validation on diverse datasets will help solidify the practical applicability of the new approach in real-world software engineering scenarios.

Keywords: feature selection, neural network, particle swarm optimization, software fault prediction

Procedia PDF Downloads 94
690 Violence Detection and Tracking on Moving Surveillance Video Using Machine Learning Approach

Authors: Abe Degale D., Cheng Jian

Abstract:

When creating automated video surveillance systems, violent action recognition is crucial. In recent years, hand-crafted feature detectors have been the primary method for achieving violence detection, such as the recognition of fighting activity. Researchers have also looked into learning-based representational models. On benchmark datasets created especially for the detection of violent sequences in sports and movies, these methods produced good accuracy results. The Hockey dataset's videos with surveillance camera motion present challenges for these algorithms for learning discriminating features. Image recognition and human activity detection challenges have shown success with deep representation-based methods. For the purpose of detecting violent images and identifying aggressive human behaviours, this research suggested a deep representation-based model using the transfer learning idea. The results show that the suggested approach outperforms state-of-the-art accuracy levels by learning the most discriminating features, attaining 99.34% and 99.98% accuracy levels on the Hockey and Movies datasets, respectively.

Keywords: violence detection, faster RCNN, transfer learning and, surveillance video

Procedia PDF Downloads 106
689 Using Satellite Images Datasets for Road Intersection Detection in Route Planning

Authors: Fatma El-Zahraa El-Taher, Ayman Taha, Jane Courtney, Susan Mckeever

Abstract:

Understanding road networks plays an important role in navigation applications such as self-driving vehicles and route planning for individual journeys. Intersections of roads are essential components of road networks. Understanding the features of an intersection, from a simple T-junction to larger multi-road junctions, is critical to decisions such as crossing roads or selecting the safest routes. The identification and profiling of intersections from satellite images is a challenging task. While deep learning approaches offer the state-of-the-art in image classification and detection, the availability of training datasets is a bottleneck in this approach. In this paper, a labelled satellite image dataset for the intersection recognition problem is presented. It consists of 14,692 satellite images of Washington DC, USA. To support other users of the dataset, an automated download and labelling script is provided for dataset replication. The challenges of construction and fine-grained feature labelling of a satellite image dataset is examined, including the issue of how to address features that are spread across multiple images. Finally, the accuracy of the detection of intersections in satellite images is evaluated.

Keywords: satellite images, remote sensing images, data acquisition, autonomous vehicles

Procedia PDF Downloads 144
688 A Hybrid Feature Selection and Deep Learning Algorithm for Cancer Disease Classification

Authors: Niousha Bagheri Khulenjani, Mohammad Saniee Abadeh

Abstract:

Learning from very big datasets is a significant problem for most present data mining and machine learning algorithms. MicroRNA (miRNA) is one of the important big genomic and non-coding datasets presenting the genome sequences. In this paper, a hybrid method for the classification of the miRNA data is proposed. Due to the variety of cancers and high number of genes, analyzing the miRNA dataset has been a challenging problem for researchers. The number of features corresponding to the number of samples is high and the data suffer from being imbalanced. The feature selection method has been used to select features having more ability to distinguish classes and eliminating obscures features. Afterward, a Convolutional Neural Network (CNN) classifier for classification of cancer types is utilized, which employs a Genetic Algorithm to highlight optimized hyper-parameters of CNN. In order to make the process of classification by CNN faster, Graphics Processing Unit (GPU) is recommended for calculating the mathematic equation in a parallel way. The proposed method is tested on a real-world dataset with 8,129 patients, 29 different types of tumors, and 1,046 miRNA biomarkers, taken from The Cancer Genome Atlas (TCGA) database.

Keywords: cancer classification, feature selection, deep learning, genetic algorithm

Procedia PDF Downloads 111
687 Disaggregation the Daily Rainfall Dataset into Sub-Daily Resolution in the Temperate Oceanic Climate Region

Authors: Mohammad Bakhshi, Firas Al Janabi

Abstract:

High resolution rain data are very important to fulfill the input of hydrological models. Among models of high-resolution rainfall data generation, the temporal disaggregation was chosen for this study. The paper attempts to generate three different rainfall resolutions (4-hourly, hourly and 10-minutes) from daily for around 20-year record period. The process was done by DiMoN tool which is based on random cascade model and method of fragment. Differences between observed and simulated rain dataset are evaluated with variety of statistical and empirical methods: Kolmogorov-Smirnov test (K-S), usual statistics, and Exceedance probability. The tool worked well at preserving the daily rainfall values in wet days, however, the generated data are cumulated in a shorter time period and made stronger storms. It is demonstrated that the difference between generated and observed cumulative distribution function curve of 4-hourly datasets is passed the K-S test criteria while in hourly and 10-minutes datasets the P-value should be employed to prove that their differences were reasonable. The results are encouraging considering the overestimation of generated high-resolution rainfall data.

Keywords: DiMoN Tool, disaggregation, exceedance probability, Kolmogorov-Smirnov test, rainfall

Procedia PDF Downloads 201
686 Constructing a Semi-Supervised Model for Network Intrusion Detection

Authors: Tigabu Dagne Akal

Abstract:

While advances in computer and communications technology have made the network ubiquitous, they have also rendered networked systems vulnerable to malicious attacks devised from a distance. These attacks or intrusions start with attackers infiltrating a network through a vulnerable host and then launching further attacks on the local network or Intranet. Nowadays, system administrators and network professionals can attempt to prevent such attacks by developing intrusion detection tools and systems using data mining technology. In this study, the experiments were conducted following the Knowledge Discovery in Database Process Model. The Knowledge Discovery in Database Process Model starts from selection of the datasets. The dataset used in this study has been taken from Massachusetts Institute of Technology Lincoln Laboratory. After taking the data, it has been pre-processed. The major pre-processing activities include fill in missed values, remove outliers; resolve inconsistencies, integration of data that contains both labelled and unlabelled datasets, dimensionality reduction, size reduction and data transformation activity like discretization tasks were done for this study. A total of 21,533 intrusion records are used for training the models. For validating the performance of the selected model a separate 3,397 records are used as a testing set. For building a predictive model for intrusion detection J48 decision tree and the Naïve Bayes algorithms have been tested as a classification approach for both with and without feature selection approaches. The model that was created using 10-fold cross validation using the J48 decision tree algorithm with the default parameter values showed the best classification accuracy. The model has a prediction accuracy of 96.11% on the training datasets and 93.2% on the test dataset to classify the new instances as normal, DOS, U2R, R2L and probe classes. The findings of this study have shown that the data mining methods generates interesting rules that are crucial for intrusion detection and prevention in the networking industry. Future research directions are forwarded to come up an applicable system in the area of the study.

Keywords: intrusion detection, data mining, computer science, data mining

Procedia PDF Downloads 296
685 Generation of High-Quality Synthetic CT Images from Cone Beam CT Images Using A.I. Based Generative Networks

Authors: Heeba A. Gurku

Abstract:

Introduction: Cone Beam CT(CBCT) images play an integral part in proper patient positioning in cancer patients undergoing radiation therapy treatment. But these images are low in quality. The purpose of this study is to generate high-quality synthetic CT images from CBCT using generative models. Material and Methods: This study utilized two datasets from The Cancer Imaging Archive (TCIA) 1) Lung cancer dataset of 20 patients (with full view CBCT images) and 2) Pancreatic cancer dataset of 40 patients (only 27 patients having limited view images were included in the study). Cycle Generative Adversarial Networks (GAN) and its variant Attention Guided Generative Adversarial Networks (AGGAN) models were used to generate the synthetic CTs. Models were evaluated by visual evaluation and on four metrics, Structural Similarity Index Measure (SSIM), Peak Signal Noise Ratio (PSNR) Mean Absolute Error (MAE) and Root Mean Square Error (RMSE), to compare the synthetic CT and original CT images. Results: For pancreatic dataset with limited view CBCT images, our study showed that in Cycle GAN model, MAE, RMSE, PSNR improved from 12.57to 8.49, 20.94 to 15.29 and 21.85 to 24.63, respectively but structural similarity only marginally increased from 0.78 to 0.79. Similar, results were achieved with AGGAN with no improvement over Cycle GAN. However, for lung dataset with full view CBCT images Cycle GAN was able to reduce MAE significantly from 89.44 to 15.11 and AGGAN was able to reduce it to 19.77. Similarly, RMSE was also decreased from 92.68 to 23.50 in Cycle GAN and to 29.02 in AGGAN. SSIM and PSNR also improved significantly from 0.17 to 0.59 and from 8.81 to 21.06 in Cycle GAN respectively while in AGGAN SSIM increased to 0.52 and PSNR increased to 19.31. In both datasets, GAN models were able to reduce artifacts, reduce noise, have better resolution, and better contrast enhancement. Conclusion and Recommendation: Both Cycle GAN and AGGAN were significantly able to reduce MAE, RMSE and PSNR in both datasets. However, full view lung dataset showed more improvement in SSIM and image quality than limited view pancreatic dataset.

Keywords: CT images, CBCT images, cycle GAN, AGGAN

Procedia PDF Downloads 83
684 Emotion Mining and Attribute Selection for Actionable Recommendations to Improve Customer Satisfaction

Authors: Jaishree Ranganathan, Poonam Rajurkar, Angelina A. Tzacheva, Zbigniew W. Ras

Abstract:

In today’s world, business often depends on the customer feedback and reviews. Sentiment analysis helps identify and extract information about the sentiment or emotion of the of the topic or document. Attribute selection is a challenging problem, especially with large datasets in actionable pattern mining algorithms. Action Rule Mining is one of the methods to discover actionable patterns from data. Action Rules are rules that help describe specific actions to be made in the form of conditions that help achieve the desired outcome. The rules help to change from any undesirable or negative state to a more desirable or positive state. In this paper, we present a Lexicon based weighted scheme approach to identify emotions from customer feedback data in the area of manufacturing business. Also, we use Rough sets and explore the attribute selection method for large scale datasets. Then we apply Actionable pattern mining to extract possible emotion change recommendations. This kind of recommendations help business analyst to improve their customer service which leads to customer satisfaction and increase sales revenue.

Keywords: actionable pattern discovery, attribute selection, business data, data mining, emotion

Procedia PDF Downloads 199
683 The Hurricane 'Bump': Measuring the Effects of Hurricanes on Wages in Southern Louisiana

Authors: Jasmine Latiolais

Abstract:

Much of the disaster-related literature finds a positive relationship between the impact of a natural disaster and the growth of wages. Panel datasets are often used to explore these effects. However, natural disasters do not impact a single variable in the economy. Rather, natural disasters affect all facets of the economy, simultaneously, upon impact. It is difficult to control for all factors that would be influenced by the impact of a natural disaster, which can lead to lead to omitted variable bias in those studies employing panel datasets. To address this issue of omitted variable bias, an interrupted time series analysis is used to test the short-run relationship between the impact of Hurricanes Katrina and Rita on parish wage levels in Southern Louisiana, inherently controlling for economic conditions. This study provides evidence that natural disasters do increase wages in the very short term (one quarter following the impact of the hurricane) but that these results are not seen in the longer term and are not robust. In addition, the significance of the coefficients changes depending on the parish. Overall, this study finds that previous literature on this topic may not be robust when considered through a time-series lens.

Keywords: economic recovery, local economies, local wage growth, natural disasters

Procedia PDF Downloads 132
682 Modeling Geogenic Groundwater Contamination Risk with the Groundwater Assessment Platform (GAP)

Authors: Joel Podgorski, Manouchehr Amini, Annette Johnson, Michael Berg

Abstract:

One-third of the world’s population relies on groundwater for its drinking water. Natural geogenic arsenic and fluoride contaminate ~10% of wells. Prolonged exposure to high levels of arsenic can result in various internal cancers, while high levels of fluoride are responsible for the development of dental and crippling skeletal fluorosis. In poor urban and rural settings, the provision of drinking water free of geogenic contamination can be a major challenge. In order to efficiently apply limited resources in the testing of wells, water resource managers need to know where geogenically contaminated groundwater is likely to occur. The Groundwater Assessment Platform (GAP) fulfills this need by providing state-of-the-art global arsenic and fluoride contamination hazard maps as well as enabling users to create their own groundwater quality models. The global risk models were produced by logistic regression of arsenic and fluoride measurements using predictor variables of various soil, geological and climate parameters. The maps display the probability of encountering concentrations of arsenic or fluoride exceeding the World Health Organization’s (WHO) stipulated concentration limits of 10 µg/L or 1.5 mg/L, respectively. In addition to a reconsideration of the relevant geochemical settings, these second-generation maps represent a great improvement over the previous risk maps due to a significant increase in data quantity and resolution. For example, there is a 10-fold increase in the number of measured data points, and the resolution of predictor variables is generally 60 times greater. These same predictor variable datasets are available on the GAP platform for visualization as well as for use with a modeling tool. The latter requires that users upload their own concentration measurements and select the predictor variables that they wish to incorporate in their models. In addition, users can upload additional predictor variable datasets either as features or coverages. Such models can represent an improvement over the global models already supplied, since (a) users may be able to use their own, more detailed datasets of measured concentrations and (b) the various processes leading to arsenic and fluoride groundwater contamination can be isolated more effectively on a smaller scale, thereby resulting in a more accurate model. All maps, including user-created risk models, can be downloaded as PDFs. There is also the option to share data in a secure environment as well as the possibility to collaborate in a secure environment through the creation of communities. In summary, GAP provides users with the means to reliably and efficiently produce models specific to their region of interest by making available the latest datasets of predictor variables along with the necessary modeling infrastructure.

Keywords: arsenic, fluoride, groundwater contamination, logistic regression

Procedia PDF Downloads 348
681 Plantation Forests Height Mapping Using Unmanned Aerial System

Authors: Shiming Li, Qingwang Liu, Honggan Wu, Jianbing Zhang

Abstract:

Plantation forests are useful for timber production, recreation, environmental protection and social development. Stands height is an important parameter for the estimation of forest volume and carbon stocks. Although lidar is suitable technology for the vertical parameters extraction of forests, but high costs make it not suitable for operational inventory. With the development of computer vision and photogrammetry, aerial photos from unmanned aerial system can be used as an alternative solution for height mapping. Structure-from-motion (SfM) photogrammetry technique can be used to extract DSM and DEM information. Canopy height model (CHM) can be achieved by subtraction DEM from DSM. Our result shows that overlapping aerial photos is a potential solution for plantation forests height mapping.

Keywords: forest height mapping, plantation forests, structure-from-motion photogrammetry, UAS

Procedia PDF Downloads 278
680 Life Cycle Datasets for the Ornamental Stone Sector

Authors: Isabella Bianco, Gian Andrea Blengini

Abstract:

The environmental impact related to ornamental stones (such as marbles and granites) is largely debated. Starting from the industrial revolution, continuous improvements of machineries led to a higher exploitation of this natural resource and to a more international interaction between markets. As a consequence, the environmental impact of the extraction and processing of stones has increased. Nevertheless, if compared with other building materials, ornamental stones are generally more durable, natural, and recyclable. From the scientific point of view, studies on stone life cycle sustainability have been carried out, but these are often partial or not very significant because of the high percentage of approximations and assumptions in calculations. This is due to the lack, in life cycle databases (e.g. Ecoinvent, Thinkstep, and ELCD), of datasets about the specific technologies employed in the stone production chain. For example, databases do not contain information about diamond wires, chains or explosives, materials commonly used in quarries and transformation plants. The project presented in this paper aims to populate the life cycle databases with specific data of specific stone processes. To this goal, the methodology follows the standardized approach of Life Cycle Assessment (LCA), according to the requirements of UNI 14040-14044 and to the International Reference Life Cycle Data System (ILCD) Handbook guidelines of the European Commission. The study analyses the processes of the entire production chain (from-cradle-to-gate system boundaries), including the extraction of benches, the cutting of blocks into slabs/tiles and the surface finishing. Primary data have been collected in Italian quarries and transformation plants which use technologies representative of the current state-of-the-art. Since the technologies vary according to the hardness of the stone, the case studies comprehend both soft stones (marbles) and hard stones (gneiss). In particular, data about energy, materials and emissions were collected in marble basins of Carrara and in Beola and Serizzo basins located in the province of Verbano Cusio Ossola. Data were then elaborated through an appropriate software to build a life cycle model. The model was realized setting free parameters that allow an easy adaptation to specific productions. Through this model, the study aims to boost the direct participation of stone companies and encourage the use of LCA tool to assess and improve the stone sector environmental sustainability. At the same time, the realization of accurate Life Cycle Inventory data aims at making available, to researchers and stone experts, ILCD compliant datasets of the most significant processes and technologies related to the ornamental stone sector.

Keywords: life cycle assessment, LCA datasets, ornamental stone, stone environmental impact

Procedia PDF Downloads 233
679 Evaluating Machine Learning Techniques for Activity Classification in Smart Home Environments

Authors: Talal Alshammari, Nasser Alshammari, Mohamed Sedky, Chris Howard

Abstract:

With the widespread adoption of the Internet-connected devices, and with the prevalence of the Internet of Things (IoT) applications, there is an increased interest in machine learning techniques that can provide useful and interesting services in the smart home domain. The areas that machine learning techniques can help advance are varied and ever-evolving. Classifying smart home inhabitants’ Activities of Daily Living (ADLs), is one prominent example. The ability of machine learning technique to find meaningful spatio-temporal relations of high-dimensional data is an important requirement as well. This paper presents a comparative evaluation of state-of-the-art machine learning techniques to classify ADLs in the smart home domain. Forty-two synthetic datasets and two real-world datasets with multiple inhabitants are used to evaluate and compare the performance of the identified machine learning techniques. Our results show significant performance differences between the evaluated techniques. Such as AdaBoost, Cortical Learning Algorithm (CLA), Decision Trees, Hidden Markov Model (HMM), Multi-layer Perceptron (MLP), Structured Perceptron and Support Vector Machines (SVM). Overall, neural network based techniques have shown superiority over the other tested techniques.

Keywords: activities of daily living, classification, internet of things, machine learning, prediction, smart home

Procedia PDF Downloads 357
678 A Lightweight Pretrained Encrypted Traffic Classification Method with Squeeze-and-Excitation Block and Sharpness-Aware Optimization

Authors: Zhiyan Meng, Dan Liu, Jintao Meng

Abstract:

Dependable encrypted traffic classification is crucial for improving cybersecurity and handling the growing amount of data. Large language models have shown that learning from large datasets can be effective, making pre-trained methods for encrypted traffic classification popular. However, attention-based pre-trained methods face two main issues: their large neural parameters are not suitable for low-computation environments like mobile devices and real-time applications, and they often overfit by getting stuck in local minima. To address these issues, we developed a lightweight transformer model, which reduces the computational parameters through lightweight vocabulary construction and Squeeze-and-Excitation Block. We use sharpness-aware optimization to avoid local minima during pre-training and capture temporal features with relative positional embeddings. Our approach keeps the model's classification accuracy high for downstream tasks. We conducted experiments on four datasets -USTC-TFC2016, VPN 2016, Tor 2016, and CICIOT 2022. Even with fewer than 18 million parameters, our method achieves classification results similar to methods with ten times as many parameters.

Keywords: sharpness-aware optimization, encrypted traffic classification, squeeze-and-excitation block, pretrained model

Procedia PDF Downloads 30
677 Gene Expression Profile Reveals Breast Cancer Proliferation and Metastasis

Authors: Nandhana Vivek, Bhaskar Gogoi, Ayyavu Mahesh

Abstract:

Breast cancer metastasis plays a key role in cancer progression and fatality. The present study examines the potential causes of metastasis in breast cancer by investigating the novel interactions between genes and their pathways. The gene expression profile of GSE99394, GSE1246464, and GSE103865 was downloaded from the GEO data repository to analyze the differentially expressed genes (DEGs). Protein-protein interactions, target factor interactions, pathways and gene relationships, and functional enrichment networks were investigated. The proliferation pathway was shown to be highly expressed in breast cancer progression and metastasis in all three datasets. Gene Ontology analysis revealed 11 DEGs as gene targets to control breast cancer metastasis: LYN, DLGAP5, CXCR4, CDC6, NANOG, IFI30, TXP2, AGTR1, MKI67, and FTH1. Upon studying the function, genomic and proteomic data, and pathway involvement of the target genes, DLGAP5 proved to be a promising candidate due to it being highly differentially expressed in all datasets. The study takes a unique perspective on the avenues through which DLGAP5 promotes metastasis. The current investigation helps pave the way in understanding the role DLGAP5 plays in metastasis, which leads to an increased incidence of death among breast cancer patients.

Keywords: genomics, metastasis, microarray, cancer

Procedia PDF Downloads 96
676 Geomorphology and Flood Analysis Using Light Detection and Ranging

Authors: George R. Puno, Eric N. Bruno

Abstract:

The natural landscape of the Philippine archipelago plus the current realities of climate change make the country vulnerable to flood hazards. Flooding becomes the recurring natural disaster in the country resulting to lose of lives and properties. Musimusi is among the rivers which exhibited inundation particularly at the inhabited floodplain portion of its watershed. During the event, rescue operations and distribution of relief goods become a problem due to lack of high resolution flood maps to aid local government unit identify the most affected areas. In the attempt of minimizing impact of flooding, hydrologic modelling with high resolution mapping is becoming more challenging and important. This study focused on the analysis of flood extent as a function of different geomorphologic characteristics of Musimusi watershed. The methods include the delineation of morphometric parameters in the Musimusi watershed using Geographic Information System (GIS) and geometric calculations tools. Digital Terrain Model (DTM) as one of the derivatives of Light Detection and Ranging (LiDAR) technology was used to determine the extent of river inundation involving the application of Hydrologic Engineering Center-River Analysis System (HEC-RAS) and Hydrology Modelling System (HEC-HMS) models. The digital elevation model (DEM) from synthetic Aperture Radar (SAR) was used to delineate watershed boundary and river network. Datasets like mean sea level, river cross section, river stage, discharge and rainfall were also used as input parameters. Curve number (CN), vegetation, and soil properties were calibrated based on the existing condition of the site. Results showed that the drainage density value of the watershed is low which indicates that the basin is highly permeable subsoil and thick vegetative cover. The watershed’s elongation ratio value of 0.9 implies that the floodplain portion of the watershed is susceptible to flooding. The bifurcation ratio value of 2.1 indicates higher risk of flooding in localized areas of the watershed. The circularity ratio value (1.20) indicates that the basin is circular in shape, high discharge of runoff and low permeability of the subsoil condition. The heavy rainfall of 167 mm brought by Typhoon Seniang last December 29, 2014 was characterized as high intensity and long duration, with a return period of 100 years produced 316 m3s-1 outflows. Portion of the floodplain zone (1.52%) suffered inundation with 2.76 m depth at the maximum. The information generated in this study is helpful to the local disaster risk reduction management council in monitoring the affected sites for more appropriate decisions so that cost of rescue operations and relief goods distribution is minimized.

Keywords: flooding, geomorphology, mapping, watershed

Procedia PDF Downloads 230
675 Facial Expression Phoenix (FePh): An Annotated Sequenced Dataset for Facial and Emotion-Specified Expressions in Sign Language

Authors: Marie Alaghband, Niloofar Yousefi, Ivan Garibay

Abstract:

Facial expressions are important parts of both gesture and sign language recognition systems. Despite the recent advances in both fields, annotated facial expression datasets in the context of sign language are still scarce resources. In this manuscript, we introduce an annotated sequenced facial expression dataset in the context of sign language, comprising over 3000 facial images extracted from the daily news and weather forecast of the public tv-station PHOENIX. Unlike the majority of currently existing facial expression datasets, FePh provides sequenced semi-blurry facial images with different head poses, orientations, and movements. In addition, in the majority of images, identities are mouthing the words, which makes the data more challenging. To annotate this dataset we consider primary, secondary, and tertiary dyads of seven basic emotions of "sad", "surprise", "fear", "angry", "neutral", "disgust", and "happy". We also considered the "None" class if the image’s facial expression could not be described by any of the aforementioned emotions. Although we provide FePh as a facial expression dataset of signers in sign language, it has a wider application in gesture recognition and Human Computer Interaction (HCI) systems.

Keywords: annotated facial expression dataset, gesture recognition, sequenced facial expression dataset, sign language recognition

Procedia PDF Downloads 159
674 On the Fourth-Order Hybrid Beta Polynomial Kernels in Kernel Density Estimation

Authors: Benson Ade Eniola Afere

Abstract:

This paper introduces a family of fourth-order hybrid beta polynomial kernels developed for statistical analysis. The assessment of these kernels' performance centers on two critical metrics: asymptotic mean integrated squared error (AMISE) and kernel efficiency. Through the utilization of both simulated and real-world datasets, a comprehensive evaluation was conducted, facilitating a thorough comparison with conventional fourth-order polynomial kernels. The evaluation procedure encompassed the computation of AMISE and efficiency values for both the proposed hybrid kernels and the established classical kernels. The consistently observed trend was the superior performance of the hybrid kernels when compared to their classical counterparts. This trend persisted across diverse datasets, underscoring the resilience and efficacy of the hybrid approach. By leveraging these performance metrics and conducting evaluations on both simulated and real-world data, this study furnishes compelling evidence in favour of the superiority of the proposed hybrid beta polynomial kernels. The discernible enhancement in performance, as indicated by lower AMISE values and higher efficiency scores, strongly suggests that the proposed kernels offer heightened suitability for statistical analysis tasks when compared to traditional kernels.

Keywords: AMISE, efficiency, fourth-order Kernels, hybrid Kernels, Kernel density estimation

Procedia PDF Downloads 70
673 Correlation between Speech Emotion Recognition Deep Learning Models and Noises

Authors: Leah Lee

Abstract:

This paper examines the correlation between deep learning models and emotions with noises to see whether or not noises mask emotions. The deep learning models used are plain convolutional neural networks (CNN), auto-encoder, long short-term memory (LSTM), and Visual Geometry Group-16 (VGG-16). Emotion datasets used are Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), Crowd-sourced Emotional Multimodal Actors Dataset (CREMA-D), Toronto Emotional Speech Set (TESS), and Surrey Audio-Visual Expressed Emotion (SAVEE). To make it four times bigger, audio set files, stretch, and pitch augmentations are utilized. From the augmented datasets, five different features are extracted for inputs of the models. There are eight different emotions to be classified. Noise variations are white noise, dog barking, and cough sounds. The variation in the signal-to-noise ratio (SNR) is 0, 20, and 40. In summation, per a deep learning model, nine different sets with noise and SNR variations and just augmented audio files without any noises will be used in the experiment. To compare the results of the deep learning models, the accuracy and receiver operating characteristic (ROC) are checked.

Keywords: auto-encoder, convolutional neural networks, long short-term memory, speech emotion recognition, visual geometry group-16

Procedia PDF Downloads 75
672 Regularizing Software for Aerosol Particles

Authors: Christine Böckmann, Julia Rosemann

Abstract:

We present an inversion algorithm that is used in the European Aerosol Lidar Network for the inversion of data collected with multi-wavelength Raman lidar. These instruments measure backscatter coefficients at 355, 532, and 1064 nm, and extinction coefficients at 355 and 532 nm. The algorithm is based on manually controlled inversion of optical data which allows for detailed sensitivity studies and thus provides us with comparably high quality of the derived data products. The algorithm allows us to derive particle effective radius, volume, surface-area concentration with comparably high confidence. The retrieval of the real and imaginary parts of the complex refractive index still is a challenge in view of the accuracy required for these parameters in climate change studies in which light-absorption needs to be known with high accuracy. Single-scattering albedo (SSA) can be computed from the retrieve microphysical parameters and allows us to categorize aerosols into high and low absorbing aerosols. From mathematical point of view the algorithm is based on the concept of using truncated singular value decomposition as regularization method. This method was adapted to work for the retrieval of the particle size distribution function (PSD) and is called hybrid regularization technique since it is using a triple of regularization parameters. The inversion of an ill-posed problem, such as the retrieval of the PSD, is always a challenging task because very small measurement errors will be amplified most often hugely during the solution process unless an appropriate regularization method is used. Even using a regularization method is difficult since appropriate regularization parameters have to be determined. Therefore, in a next stage of our work we decided to use two regularization techniques in parallel for comparison purpose. The second method is an iterative regularization method based on Pade iteration. Here, the number of iteration steps serves as the regularization parameter. We successfully developed a semi-automated software for spherical particles which is able to run even on a parallel processor machine. From a mathematical point of view, it is also very important (as selection criteria for an appropriate regularization method) to investigate the degree of ill-posedness of the problem which we found is a moderate ill-posedness. We computed the optical data from mono-modal logarithmic PSD and investigated particles of spherical shape in our simulations. We considered particle radii as large as 6 nm which does not only cover the size range of particles in the fine-mode fraction of naturally occurring PSD but also covers a part of the coarse-mode fraction of PSD. We considered errors of 15% in the simulation studies. For the SSA, 100% of all cases achieve relative errors below 12%. In more detail, 87% of all cases for 355 nm and 88% of all cases for 532 nm are well below 6%. With respect to the absolute error for non- and weak-absorbing particles with real parts 1.5 and 1.6 in all modes the accuracy limit +/- 0.03 is achieved. In sum, 70% of all cases stay below +/-0.03 which is sufficient for climate change studies.

Keywords: aerosol particles, inverse problem, microphysical particle properties, regularization

Procedia PDF Downloads 343
671 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) for classification of Diabetes disease dataset has been used. The aims of this study are to reduce the variance within attributes of diabetes dataset and to improve the classification accuracy of classifier algorithm transforming from non-linear separable datasets to linearly separable datasets. Pima Indians Diabetes dataset has two classes including normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of K-means clustering method and is one of most used clustering methods in data mining and machine learning applications. In this study, as the first stage, fuzzy C-means clustering process has been used for finding the centers of attributes in Pima Indians diabetes dataset and then weighted the dataset according to the ratios of the means of attributes to centers of theirs. Secondly, after weighting process, the classifier algorithms including support vector machine (SVM) and k-NN (k- nearest neighbor) classifiers have been used for classifying weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of Pima Indians diabetes dataset.

Keywords: fuzzy C-means clustering, fuzzy C-means clustering based attribute weighting, Pima Indians diabetes, SVM

Procedia PDF Downloads 413
670 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets

Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi

Abstract:

Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between different patient attributes and features and the disease, in this paper, we have tried to investigate the different classification methods for better diagnosis of BC in the early stages. In this regard, datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers have been studied. For this purpose, we have selected favorable features among the nine provided attributes from the clinical dataset by using a random forest algorithm. This dataset consists of both healthy controls and BC patients, and it was noted that glucose, BMI, resistin, and age have the most importance, respectively. Moreover, we have analyzed these features with various ML-based classifier methods, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM) along with NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that among different techniques, the SVM and MLP classifiers have the most accuracy, with amounts of 96% and 92%, respectively. These results divulged that the adopted procedure could be used effectively for the classification of cancer cells, and also it encourages further experimental investigations with more collected data for other types of cancers.

Keywords: breast cancer, diagnosis, machine learning, biomarker classification, neural network

Procedia PDF Downloads 135
669 Towards Real-Time Classification of Finger Movement Direction Using Encephalography Independent Components

Authors: Mohamed Mounir Tellache, Hiroyuki Kambara, Yasuharu Koike, Makoto Miyakoshi, Natsue Yoshimura

Abstract:

This study explores the practicality of using electroencephalographic (EEG) independent components to predict eight-direction finger movements in pseudo-real-time. Six healthy participants with individual-head MRI images performed finger movements in eight directions with two different arm configurations. The analysis was performed in two stages. The first stage consisted of using independent component analysis (ICA) to separate the signals representing brain activity from non-brain activity signals and to obtain the unmixing matrix. The resulting independent components (ICs) were checked, and those reflecting brain-activity were selected. Finally, the time series of the selected ICs were used to predict eight finger-movement directions using Sparse Logistic Regression (SLR). The second stage consisted of using the previously obtained unmixing matrix, the selected ICs, and the model obtained by applying SLR to classify a different EEG dataset. This method was applied to two different settings, namely the single-participant level and the group-level. For the single-participant level, the EEG dataset used in the first stage and the EEG dataset used in the second stage originated from the same participant. For the group-level, the EEG datasets used in the first stage were constructed by temporally concatenating each combination without repetition of the EEG datasets of five participants out of six, whereas the EEG dataset used in the second stage originated from the remaining participants. The average test classification results across datasets (mean ± S.D.) were 38.62 ± 8.36% for the single-participant, which was significantly higher than the chance level (12.50 ± 0.01%), and 27.26 ± 4.39% for the group-level which was also significantly higher than the chance level (12.49% ± 0.01%). The classification accuracy within [–45°, 45°] of the true direction is 70.03 ± 8.14% for single-participant and 62.63 ± 6.07% for group-level which may be promising for some real-life applications. Clustering and contribution analyses further revealed the brain regions involved in finger movement and the temporal aspect of their contribution to the classification. These results showed the possibility of using the ICA-based method in combination with other methods to build a real-time system to control prostheses.

Keywords: brain-computer interface, electroencephalography, finger motion decoding, independent component analysis, pseudo real-time motion decoding

Procedia PDF Downloads 138
668 Video Object Segmentation for Automatic Image Annotation of Ethernet Connectors with Environment Mapping and 3D Projection

Authors: Marrone Silverio Melo Dantas Pedro Henrique Dreyer, Gabriel Fonseca Reis de Souza, Daniel Bezerra, Ricardo Souza, Silvia Lins, Judith Kelner, Djamel Fawzi Hadj Sadok

Abstract:

The creation of a dataset is time-consuming and often discourages researchers from pursuing their goals. To overcome this problem, we present and discuss two solutions adopted for the automation of this process. Both optimize valuable user time and resources and support video object segmentation with object tracking and 3D projection. In our scenario, we acquire images from a moving robotic arm and, for each approach, generate distinct annotated datasets. We evaluated the precision of the annotations by comparing these with a manually annotated dataset, as well as the efficiency in the context of detection and classification problems. For detection support, we used YOLO and obtained for the projection dataset an F1-Score, accuracy, and mAP values of 0.846, 0.924, and 0.875, respectively. Concerning the tracking dataset, we achieved an F1-Score of 0.861, an accuracy of 0.932, whereas mAP reached 0.894. In order to evaluate the quality of the annotated images used for classification problems, we employed deep learning architectures. We adopted metrics accuracy and F1-Score, for VGG, DenseNet, MobileNet, Inception, and ResNet. The VGG architecture outperformed the others for both projection and tracking datasets. It reached an accuracy and F1-score of 0.997 and 0.993, respectively. Similarly, for the tracking dataset, it achieved an accuracy of 0.991 and an F1-Score of 0.981.

Keywords: RJ45, automatic annotation, object tracking, 3D projection

Procedia PDF Downloads 167
667 Leveraging SHAP Values for Effective Feature Selection in Peptide Identification

Authors: Sharon Li, Zhonghang Xia

Abstract:

Post-database search is an essential phase in peptide identification using tandem mass spectrometry (MS/MS) to refine peptide-spectrum matches (PSMs) produced by database search engines. These engines frequently face difficulty differentiating between correct and incorrect peptide assignments. Despite advances in statistical and machine learning methods aimed at improving the accuracy of peptide identification, challenges remain in selecting critical features for these models. In this study, two machine learning models—a random forest tree and a support vector machine—were applied to three datasets to enhance PSMs. SHAP values were utilized to determine the significance of each feature within the models. The experimental results indicate that the random forest model consistently outperformed the SVM across all datasets. Further analysis of SHAP values revealed that the importance of features varies depending on the dataset, indicating that a feature's role in model predictions can differ significantly. This variability in feature selection can lead to substantial differences in model performance, with false discovery rate (FDR) differences exceeding 50% between different feature combinations. Through SHAP value analysis, the most effective feature combinations were identified, significantly enhancing model performance.

Keywords: peptide identification, SHAP value, feature selection, random forest tree, support vector machine

Procedia PDF Downloads 23