Search results for: categorical datasets
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 825

Search results for: categorical datasets

375 Determination of the Effective Economic and/or Demographic Indicators in Classification of European Union Member and Candidate Countries Using Partial Least Squares Discriminant Analysis

Authors: Esra Polat

Abstract:

Partial Least Squares Discriminant Analysis (PLSDA) is a statistical method for classification and consists a classical Partial Least Squares Regression (PLSR) in which the dependent variable is a categorical one expressing the class membership of each observation. PLSDA can be applied in many cases when classical discriminant analysis cannot be applied. For example, when the number of observations is low and when the number of independent variables is high. When there are missing values, PLSDA can be applied on the data that is available. Finally, it is adapted when multicollinearity between independent variables is high. The aim of this study is to determine the economic and/or demographic indicators, which are effective in grouping the 28 European Union (EU) member countries and 7 candidate countries (including potential candidates Bosnia and Herzegovina (BiH) and Kosova) by using the data set obtained from database of the World Bank for 2014. Leaving the political issues aside, the analysis is only concerned with the economic and demographic variables that have the potential influence on country’s eligibility for EU entrance. Hence, in this study, both the performance of PLSDA method in classifying the countries correctly to their pre-defined groups (candidate or member) and the differences between the EU countries and candidate countries in terms of these indicators are analyzed. As a result of the PLSDA, the value of percentage correctness of 100 % indicates that overall of the 35 countries is classified correctly. Moreover, the most important variables that determine the statuses of member and candidate countries in terms of economic indicators are identified as 'external balance on goods and services (% GDP)', 'gross domestic savings (% GDP)' and 'gross national expenditure (% GDP)' that means for the 2014 economical structure of countries is the most important determinant of EU membership. Subsequently, the model validated to prove the predictive ability by using the data set for 2015. For prediction sample, %97,14 of the countries are correctly classified. An interesting result is obtained for only BiH, which is still a potential candidate for EU, predicted as a member of EU by using the indicators data set for 2015 as a prediction sample. Although BiH has made a significant transformation from a war-torn country to a semi-functional state, ethnic tensions, nationalistic rhetoric and political disagreements are still evident, which inhibit Bosnian progress towards the EU.

Keywords: classification, demographic indicators, economic indicators, European Union, partial least squares discriminant analysis

Procedia PDF Downloads 269
374 Accuracy Improvement of Traffic Participant Classification Using Millimeter-Wave Radar by Leveraging Simulator Based on Domain Adaptation

Authors: Tokihiko Akita, Seiichi Mita

Abstract:

A millimeter-wave radar is the most robust against adverse environments, making it an essential environment recognition sensor for automated driving. However, the reflection signal is sparse and unstable, so it is difficult to obtain the high recognition accuracy. Deep learning provides high accuracy even for them in recognition, but requires large scale datasets with ground truth. Specially, it takes a lot of cost to annotate for a millimeter-wave radar. For the solution, utilizing a simulator that can generate an annotated huge dataset is effective. Simulation of the radar is more difficult to match with real world data than camera image, and recognition by deep learning with higher-order features using the simulator causes further deviation. We have challenged to improve the accuracy of traffic participant classification by fusing simulator and real-world data with domain adaptation technique. Experimental results with the domain adaptation network created by us show that classification accuracy can be improved even with a few real-world data.

Keywords: millimeter-wave radar, object classification, deep learning, simulation, domain adaptation

Procedia PDF Downloads 80
373 Neologisms and Word-Formation Processes in Board Game Rulebook Corpus: Preliminary Results

Authors: Athanasios Karasimos, Vasiliki Makri

Abstract:

This research focuses on the design and development of the first text Corpus based on Board Game Rulebooks (BGRC) with direct application on the morphological analysis of neologisms and tendencies in word-formation processes. Corpus linguistics is a dynamic field that examines language through the lens of vast collections of texts. These corpora consist of diverse written and spoken materials, ranging from literature and newspapers to transcripts of everyday conversations. By morphologically analyzing these extensive datasets, morphologists can gain valuable insights into how language functions and evolves, as these extensive datasets can reflect the byproducts of inflection, derivation, blending, clipping, compounding, and neology. This entails scrutinizing how words are created, modified, and combined to convey meaning in a corpus of challenging, creative, and straightforward texts that include rules, examples, tutorials, and tips. Board games teach players how to strategize, consider alternatives, and think flexibly, which are critical elements in language learning. Their rulebooks reflect not only their weight (complexity) but also the language properties of each genre and subgenre of these games. Board games are a captivating realm where strategy, competition, and creativity converge. Beyond the excitement of gameplay, board games also spark the art of word creation. Word games, like Scrabble, Codenames, Bananagrams, Wordcraft, Alice in the Wordland, Once uUpona Time, challenge players to construct words from a pool of letters, thus encouraging linguistic ingenuity and vocabulary expansion. These games foster a love for language, motivating players to unearth obscure words and devise clever combinations. On the other hand, the designers and creators produce rulebooks, where they include their joy of discovering the hidden potential of language, igniting the imagination, and playing with the beauty of words, making these games a delightful fusion of linguistic exploration and leisurely amusement. In this research, more than 150 rulebooks in English from all types of modern board games, either language-independent or language-dependent, are used to create the BGRC. A representative sample of each genre (family, party, worker placement, deckbuilding, dice, and chance games, strategy, eurogames, thematic, role-playing, among others) was selected based on the score from BoardGameGeek, the size of the texts and the level of complexity (weight) of the game. A morphological model with morphological networks, multi-word expressions, and word-creation mechanics based on the complexity of the textual structure, difficulty, and board game category will be presented. In enabling the identification of patterns, trends, and variations in word formation and other morphological processes, this research aspires to make avail of this creative yet strict text genre so as to (a) give invaluable insight into morphological creativity and innovation that (re)shape the lexicon of the English language and (b) test morphological theories. Overall, it is shown that corpus linguistics empowers us to explore the intricate tapestry of language, and morphology in particular, revealing its richness, flexibility, and adaptability in the ever-evolving landscape of human expression.

Keywords: board game rulebooks, corpus design, morphological innovations, neologisms, word-formation processes

Procedia PDF Downloads 78
372 Fuzzy Population-Based Meta-Heuristic Approaches for Attribute Reduction in Rough Set Theory

Authors: Mafarja Majdi, Salwani Abdullah, Najmeh S. Jaddi

Abstract:

One of the global combinatorial optimization problems in machine learning is feature selection. It concerned with removing the irrelevant, noisy, and redundant data, along with keeping the original meaning of the original data. Attribute reduction in rough set theory is an important feature selection method. Since attribute reduction is an NP-hard problem, it is necessary to investigate fast and effective approximate algorithms. In this paper, we proposed two feature selection mechanisms based on memetic algorithms (MAs) which combine the genetic algorithm with a fuzzy record to record travel algorithm and a fuzzy controlled great deluge algorithm to identify a good balance between local search and genetic search. In order to verify the proposed approaches, numerical experiments are carried out on thirteen datasets. The results show that the MAs approaches are efficient in solving attribute reduction problems when compared with other meta-heuristic approaches.

Keywords: rough set theory, attribute reduction, fuzzy logic, memetic algorithms, record to record algorithm, great deluge algorithm

Procedia PDF Downloads 440
371 Aspect-Level Sentiment Analysis with Multi-Channel and Graph Convolutional Networks

Authors: Jiajun Wang, Xiaoge Li

Abstract:

The purpose of the aspect-level sentiment analysis task is to identify the sentiment polarity of aspects in a sentence. Currently, most methods mainly focus on using neural networks and attention mechanisms to model the relationship between aspects and context, but they ignore the dependence of words in different ranges in the sentence, resulting in deviation when assigning relationship weight to other words other than aspect words. To solve these problems, we propose a new aspect-level sentiment analysis model that combines a multi-channel convolutional network and graph convolutional network (GCN). Firstly, the context and the degree of association between words are characterized by Long Short-Term Memory (LSTM) and self-attention mechanism. Besides, a multi-channel convolutional network is used to extract the features of words in different ranges. Finally, a convolutional graph network is used to associate the node information of the dependency tree structure. We conduct experiments on four benchmark datasets. The experimental results are compared with those of other models, which shows that our model is better and more effective.

Keywords: aspect-level sentiment analysis, attention, multi-channel convolution network, graph convolution network, dependency tree

Procedia PDF Downloads 198
370 Manufacturing Anomaly Detection Using a Combination of Gated Recurrent Unit Network and Random Forest Algorithm

Authors: Atinkut Atinafu Yilma, Eyob Messele Sefene

Abstract:

Anomaly detection is one of the essential mechanisms to control and reduce production loss, especially in today's smart manufacturing. Quick anomaly detection aids in reducing the cost of production by minimizing the possibility of producing defective products. However, developing an anomaly detection model that can rapidly detect a production change is challenging. This paper proposes Gated Recurrent Unit (GRU) combined with Random Forest (RF) to detect anomalies in the production process in real-time quickly. The GRU is used as a feature detector, and RF as a classifier using the input features from GRU. The model was tested using various synthesis and real-world datasets against benchmark methods. The results show that the proposed GRU-RF outperforms the benchmark methods with the shortest time taken to detect anomalies in the production process. Based on the investigation from the study, this proposed model can eliminate or reduce unnecessary production costs and bring a competitive advantage to manufacturing industries.

Keywords: anomaly detection, multivariate time series data, smart manufacturing, gated recurrent unit network, random forest

Procedia PDF Downloads 99
369 Comparison of Deep Convolutional Neural Networks Models for Plant Disease Identification

Authors: Megha Gupta, Nupur Prakash

Abstract:

Identification of plant diseases has been performed using machine learning and deep learning models on the datasets containing images of healthy and diseased plant leaves. The current study carries out an evaluation of some of the deep learning models based on convolutional neural network (CNN) architectures for identification of plant diseases. For this purpose, the publicly available New Plant Diseases Dataset, an augmented version of PlantVillage dataset, available on Kaggle platform, containing 87,900 images has been used. The dataset contained images of 26 diseases of 14 different plants and images of 12 healthy plants. The CNN models selected for the study presented in this paper are AlexNet, ZFNet, VGGNet (four models), GoogLeNet, and ResNet (three models). The selected models are trained using PyTorch, an open-source machine learning library, on Google Colaboratory. A comparative study has been carried out to analyze the high degree of accuracy achieved using these models. The highest test accuracy and F1-score of 99.59% and 0.996, respectively, were achieved by using GoogLeNet with Mini-batch momentum based gradient descent learning algorithm.

Keywords: comparative analysis, convolutional neural networks, deep learning, plant disease identification

Procedia PDF Downloads 184
368 Multiple Linear Regression for Rapid Estimation of Subsurface Resistivity from Apparent Resistivity Measurements

Authors: Sabiu Bala Muhammad, Rosli Saad

Abstract:

Multiple linear regression (MLR) models for fast estimation of true subsurface resistivity from apparent resistivity field measurements are developed and assessed in this study. The parameters investigated were apparent resistivity (ρₐ), horizontal location (X) and depth (Z) of measurement as the independent variables; and true resistivity (ρₜ) as the dependent variable. To achieve linearity in both resistivity variables, datasets were first transformed into logarithmic domain following diagnostic checks of normality of the dependent variable and heteroscedasticity to ensure accurate models. Four MLR models were developed based on hierarchical combination of the independent variables. The generated MLR coefficients were applied to another data set to estimate ρₜ values for validation. Contours of the estimated ρₜ values were plotted and compared to the observed data plots at the colour scale and blanking for visual assessment. The accuracy of the models was assessed using coefficient of determination (R²), standard error (SE) and weighted mean absolute percentage error (wMAPE). It is concluded that the MLR models can estimate ρₜ for with high level of accuracy.

Keywords: apparent resistivity, depth, horizontal location, multiple linear regression, true resistivity

Procedia PDF Downloads 261
367 Importance of Location Selection of an Energy Storage System in a Smart Grid

Authors: Vanaja Rao

Abstract:

In the recent times, the need for the integration of Renewable Energy Sources (RES) in a Smart Grid is on the rise. As a result of this, associated energy storage systems are known to play important roles in sustaining the efficient operation of such RES like wind power and solar power. This paper investigates the importance of location selection of Energy Storage Systems (ESSs) in a Smart Grid. Three scenarios of ESS location is studied and analyzed in a Smart Grid, which are – 1. Near the generation/source, 2. In the middle of the Grid and, 3. Near the demand/consumption. This is explained with the aim of assisting any Distribution Network Operator (DNO) in deploying the ESSs in a power network, which will significantly help reduce the costs and time of planning and avoid any damages incurred as a result of installing them at an incorrect location of a Smart Grid. To do this, the outlined scenarios mentioned above are modelled and analyzed with the National Grid’s datasets of energy generation and consumption in the UK power network. As a result, the outcome of this analysis aims to provide a better overview for the location selection of the ESSs in a Smart Grid. This ensures power system stability and security along with the optimum usage of the ESSs.

Keywords: distribution networks, energy storage system, energy security, location planning, power stability, smart grid

Procedia PDF Downloads 286
366 Multi-Spectral Deep Learning Models for Forest Fire Detection

Authors: Smitha Haridasan, Zelalem Demissie, Atri Dutta, Ajita Rattani

Abstract:

Aided by the wind, all it takes is one ember and a few minutes to create a wildfire. Wildfires are growing in frequency and size due to climate change. Wildfires and its consequences are one of the major environmental concerns. Every year, millions of hectares of forests are destroyed over the world, causing mass destruction and human casualties. Thus early detection of wildfire becomes a critical component to mitigate this threat. Many computer vision-based techniques have been proposed for the early detection of forest fire using video surveillance. Several computer vision-based methods have been proposed to predict and detect forest fires at various spectrums, namely, RGB, HSV, and YCbCr. The aim of this paper is to propose a multi-spectral deep learning model that combines information from different spectrums at intermediate layers for accurate fire detection. A heterogeneous dataset assembled from publicly available datasets is used for model training and evaluation in this study. The experimental results show that multi-spectral deep learning models could obtain an improvement of about 4.68 % over those based on a single spectrum for fire detection.

Keywords: deep learning, forest fire detection, multi-spectral learning, natural hazard detection

Procedia PDF Downloads 227
365 Exploring Syntactic and Semantic Features for Text-Based Authorship Attribution

Authors: Haiyan Wu, Ying Liu, Shaoyun Shi

Abstract:

Authorship attribution is to extract features to identify authors of anonymous documents. Many previous works on authorship attribution focus on statistical style features (e.g., sentence/word length), content features (e.g., frequent words, n-grams). Modeling these features by regression or some transparent machine learning methods gives a portrait of the authors' writing style. But these methods do not capture the syntactic (e.g., dependency relationship) or semantic (e.g., topics) information. In recent years, some researchers model syntactic trees or latent semantic information by neural networks. However, few works take them together. Besides, predictions by neural networks are difficult to explain, which is vital in authorship attribution tasks. In this paper, we not only utilize the statistical style and content features but also take advantage of both syntactic and semantic features. Different from an end-to-end neural model, feature selection and prediction are two steps in our method. An attentive n-gram network is utilized to select useful features, and logistic regression is applied to give prediction and understandable representation of writing style. Experiments show that our extracted features can improve the state-of-the-art methods on three benchmark datasets.

Keywords: authorship attribution, attention mechanism, syntactic feature, feature extraction

Procedia PDF Downloads 124
364 A Study of ZY3 Satellite Digital Elevation Model Verification and Refinement with Shuttle Radar Topography Mission

Authors: Bo Wang

Abstract:

As the first high-resolution civil optical satellite, ZY-3 satellite is able to obtain high-resolution multi-view images with three linear array sensors. The images can be used to generate Digital Elevation Models (DEM) through dense matching of stereo images. However, due to the clouds, forest, water and buildings covered on the images, there are some problems in the dense matching results such as outliers and areas failed to be matched (matching holes). This paper introduced an algorithm to verify the accuracy of DEM that generated by ZY-3 satellite with Shuttle Radar Topography Mission (SRTM). Since the accuracy of SRTM (Internal accuracy: 5 m; External accuracy: 15 m) is relatively uniform in the worldwide, it may be used to improve the accuracy of ZY-3 DEM. Based on the analysis of mass DEM and SRTM data, the processing can be divided into two aspects. The registration of ZY-3 DEM and SRTM can be firstly performed using the conjugate line features and area features matched between these two datasets. Then the ZY-3 DEM can be refined by eliminating the matching outliers and filling the matching holes. The matching outliers can be eliminated based on the statistics on Local Vector Binning (LVB). The matching holes can be filled by the elevation interpolated from SRTM. Some works are also conducted for the accuracy statistics of the ZY-3 DEM.

Keywords: ZY-3 satellite imagery, DEM, SRTM, refinement

Procedia PDF Downloads 328
363 Attention-Based ResNet for Breast Cancer Classification

Authors: Abebe Mulugojam Negash, Yongbin Yu, Ekong Favour, Bekalu Nigus Dawit, Molla Woretaw Teshome, Aynalem Birtukan Yirga

Abstract:

Breast cancer remains a significant health concern, necessitating advancements in diagnostic methodologies. Addressing this, our paper confronts the notable challenges in breast cancer classification, particularly the imbalance in datasets and the constraints in the accuracy and interpretability of prevailing deep learning approaches. We proposed an attention-based residual neural network (ResNet), which effectively combines the robust features of ResNet with an advanced attention mechanism. Enhanced through strategic data augmentation and positive weight adjustments, this approach specifically targets the issue of data imbalance. The proposed model is tested on the BreakHis dataset and achieved accuracies of 99.00%, 99.04%, 98.67%, and 98.08% in different magnifications (40X, 100X, 200X, and 400X), respectively. We evaluated the performance by using different evaluation metrics such as precision, recall, and F1-Score and made comparisons with other state-of-the-art methods. Our experiments demonstrate that the proposed model outperforms existing approaches, achieving higher accuracy in breast cancer classification.

Keywords: residual neural network, attention mechanism, positive weight, data augmentation

Procedia PDF Downloads 77
362 Pre-Operative Psychological Factors Significantly Add to the Predictability of Chronic Narcotic Use: A Two Year Prospective Study

Authors: Dana El-Mughayyar, Neil Manson, Erin Bigney, Eden Richardson, Dean Tripp, Edward Abraham

Abstract:

Use of narcotics to treat pain has increased over the past two decades and is a contributing factor to the current public health crisis. Understanding the pre-operative risks of chronic narcotic use may be aided through investigation of psychological measures. The objective of the reported study is to determine predictors of narcotic use two years post-surgery in a thoracolumbar spine surgery population, including an array of psychological factors. A prospective observational study of 191 consecutively enrolled adult patients having undergone thoracolumbar spine surgery is presented. Baseline measures of interest included the Pain Catastrophizing Scale (PCS), Tampa Scale for Kinesiophobia, Multidimensional Scale for Perceived Social Support (MSPSS), Chronic Pain Acceptance Questionnaire (CPAQ-8), Oswestry Disability Index (ODI), Numeric Rating Scales for back and leg pain (NRS-B/L), SF-12’s Mental Component Summary (MCS), narcotic use and demographic variables. The post-operative measure of interest is narcotic use at 2-year follow-up. Narcotic use is collapsed into binary categories of use and no use. Descriptive statistics are run. Chi Square analysis is used for categorical variables and an ANOVA for continuous variables. Significant variables are built into a hierarchical logistic regression to determine predictors of post-operative narcotic use. Significance is set at α < 0.05. Results: A total of 27.23% of the sample were using narcotics two years after surgery. The regression model included ODI, NRS-Leg, time with condition, chief complaint, pre-operative drug use, gender, MCS, PCS subscale helplessness, and CPAQ subscale pain willingness and was significant χ² (13, N=191)= 54.99; p = .000. The model accounted for 39.6% of the variance in narcotic use and correctly predicted in 79.7% of cases. Psychological variables accounted for 9.6% of the variance over and above the other predictors. Conclusions: Managing chronic narcotic usage is central to the patient’s overall health and quality of life. Psychological factors in the preoperative period are significant predictors of narcotic use 2 years post-operatively. The psychological variables are malleable, potentially allowing surgeons to direct their patients to preventative resources prior to surgery.

Keywords: narcotics, psychological factors, quality of life, spine surgery

Procedia PDF Downloads 129
361 Global City Typologies: 300 Cities and Over 100 Datasets

Authors: M. Novak, E. Munoz, A. Jana, M. Nelemans

Abstract:

Cities and local governments the world over are interested to employ circular strategies as a means to bring about food security, create employment and increase resilience. The selection and implementation of circular strategies is facilitated by modeling the effects of strategies locally and understanding the impacts such strategies have had in other (comparable) cities and how that would translate locally. Urban areas are heterogeneous because of their geographic, economic, social characteristics, governance, and culture. In order to better understand the effect of circular strategies on urban systems, we create a dataset for over 300 cities around the world designed to facilitate circular strategy scenario modeling. This new dataset integrates data from over 20 prominent global national and urban data sources, such as the Global Human Settlements layer and International Labour Organisation, as well as incorporating employment data from over 150 cities collected bottom up from local departments and data providers. The dataset is made to be reproducible. Various clustering techniques are explored in the paper. The result is sets of clusters of cities, which can be used for further research, analysis, and support comparative, regional, and national policy making on circular cities.

Keywords: data integration, urban innovation, cluster analysis, circular economy, city profiles, scenario modelling

Procedia PDF Downloads 171
360 Clustering of Association Rules of ISIS & Al-Qaeda Based on Similarity Measures

Authors: Tamanna Goyal, Divya Bansal, Sanjeev Sofat

Abstract:

In world-threatening terrorist attacks, where early detection, distinction, and prediction are effective diagnosis techniques and for functionally accurate and precise analysis of terrorism data, there are so many data mining & statistical approaches to assure accuracy. The computational extraction of derived patterns is a non-trivial task which comprises specific domain discovery by means of sophisticated algorithm design and analysis. This paper proposes an approach for similarity extraction by obtaining the useful attributes from the available datasets of terrorist attacks and then applying feature selection technique based on the statistical impurity measures followed by clustering techniques on the basis of similarity measures. On the basis of degree of participation of attributes in the rules, the associative dependencies between the attacks are analyzed. Consequently, to compute the similarity among the discovered rules, we applied a weighted similarity measure. Finally, the rules are grouped by applying using hierarchical clustering. We have applied it to an open source dataset to determine the usability and efficiency of our technique, and a literature search is also accomplished to support the efficiency and accuracy of our results.

Keywords: association rules, clustering, similarity measure, statistical approaches

Procedia PDF Downloads 307
359 Burden of Communicable and Non-Communicable Disease in India: A Regional Analysis

Authors: Ajit Kumar Yadav, Priyanka Yadav, F. Ram

Abstract:

In present study is an effort to analyse the burden of diseases in the state. Disability Adjusted Life Years (DALY) is estimated non-communicable diseases. Multi-rounds (52nd, 60th and 71st round) of the National Sample Surveys (NSSO), conducted in 1995-96, 2004 and 2014 respectively, and Million Deaths Study (MDS) of 2001-03, 2006 and 2013-14 datasets are used. Descriptive and multivariate analyses are carried out to identify the determinants of different types of self-reported morbidity and DALY. The prevalence was higher for population aged 60 and above, among females, illiterates, and rich across the time period and for all the selected morbidities. The results were found to be significant at P<0.001. The estimation of DALY revealed that, the burden of communicable diseases was higher during infancy, noticeably among males than females in 2002. However, females aged 1-5 years were more vulnerable to report communicable diseases than the corresponding males. The age distribution of DALY indicates that individuals aged below 5 years and above 60 year were more susceptible to ill health. The growing incidence of non-communicable diseases especially among the older generations put additional burden on the health system in the state. The state has to grapple with the unsettled preventable infectious diseases in one hand and growing non-communicable in other hand.

Keywords: disease burden, non-communicable, communicable, India and region

Procedia PDF Downloads 238
358 Sparse Modelling of Cancer Patients’ Survival Based on Genomic Copy Number Alterations

Authors: Khaled M. Alqahtani

Abstract:

Copy number alterations (CNA) are variations in the structure of the genome, where certain regions deviate from the typical two chromosomal copies. These alterations are pivotal in understanding tumor progression and are indicative of patients' survival outcomes. However, effectively modeling patients' survival based on their genomic CNA profiles while identifying relevant genomic regions remains a statistical challenge. Various methods, such as the Cox proportional hazard (PH) model with ridge, lasso, or elastic net penalties, have been proposed but often overlook the inherent dependencies between genomic regions, leading to results that are hard to interpret. In this study, we enhance the elastic net penalty by incorporating an additional penalty that accounts for these dependencies. This approach yields smooth parameter estimates and facilitates variable selection, resulting in a sparse solution. Our findings demonstrate that this method outperforms other models in predicting survival outcomes, as evidenced by our simulation study. Moreover, it allows for a more meaningful interpretation of genomic regions associated with patients' survival. We demonstrate the efficacy of our approach using both real data from a lung cancer cohort and simulated datasets.

Keywords: copy number alterations, cox proportional hazard, lung cancer, regression, sparse solution

Procedia PDF Downloads 31
357 On the Cluster of the Families of Hybrid Polynomial Kernels in Kernel Density Estimation

Authors: Benson Ade Eniola Afere

Abstract:

Over the years, kernel density estimation has been extensively studied within the context of nonparametric density estimation. The fundamental components of kernel density estimation are the kernel function and the bandwidth. While the mathematical exploration of the kernel component has been relatively limited, its selection and development remain crucial. The Mean Integrated Squared Error (MISE), serving as a measure of discrepancy, provides a robust framework for assessing the effectiveness of any kernel function. A kernel function with a lower MISE is generally considered to perform better than one with a higher MISE. Hence, the primary aim of this article is to create kernels that exhibit significantly reduced MISE when compared to existing classical kernels. Consequently, this article introduces a cluster of hybrid polynomial kernel families. The construction of these proposed kernel functions is carried out heuristically by combining two kernels from the classical polynomial kernel family using probability axioms. We delve into the analysis of error propagation within these kernels. To assess their performance, simulation experiments, and real-life datasets are employed. The obtained results demonstrate that the proposed hybrid kernels surpass their classical kernel counterparts in terms of performance.

Keywords: classical polynomial kernels, cluster of families, global error, hybrid Kernels, Kernel density estimation, Monte Carlo simulation

Procedia PDF Downloads 77
356 Data Augmentation for Automatic Graphical User Interface Generation Based on Generative Adversarial Network

Authors: Xulu Yao, Moi Hoon Yap, Yanlong Zhang

Abstract:

As a branch of artificial neural network, deep learning is widely used in the field of image recognition, but the lack of its dataset leads to imperfect model learning. By analysing the data scale requirements of deep learning and aiming at the application in GUI generation, it is found that the collection of GUI dataset is a time-consuming and labor-consuming project, which is difficult to meet the needs of current deep learning network. To solve this problem, this paper proposes a semi-supervised deep learning model that relies on the original small-scale datasets to produce a large number of reliable data sets. By combining the cyclic neural network with the generated countermeasure network, the cyclic neural network can learn the sequence relationship and characteristics of data, make the generated countermeasure network generate reasonable data, and then expand the Rico dataset. Relying on the network structure, the characteristics of collected data can be well analysed, and a large number of reasonable data can be generated according to these characteristics. After data processing, a reliable dataset for model training can be formed, which alleviates the problem of dataset shortage in deep learning.

Keywords: GUI, deep learning, GAN, data augmentation

Procedia PDF Downloads 169
355 On Enabling Miner Self-Rescue with In-Mine Robots using Real-Time Object Detection with Thermal Images

Authors: Cyrus Addy, Venkata Sriram Siddhardh Nadendla, Kwame Awuah-Offei

Abstract:

Surface robots in modern underground mine rescue operations suffer from several limitations in enabling a prompt self-rescue. Therefore, the possibility of designing and deploying in-mine robots to expedite miner self-rescue can have a transformative impact on miner safety. These in-mine robots for miner self-rescue can be envisioned to carry out diverse tasks such as object detection, autonomous navigation, and payload delivery. Specifically, this paper investigates the challenges in the design of object detection algorithms for in-mine robots using thermal images, especially to detect people in real-time. A total of 125 thermal images were collected in the Missouri S&T Experimental Mine with the help of student volunteers using the FLIR TG 297 infrared camera, which were pre-processed into training and validation datasets with 100 and 25 images, respectively. Three state-of-the-art, pre-trained real-time object detection models, namely YOLOv5, YOLO-FIRI, and YOLOv8, were considered and re-trained using transfer learning techniques on the training dataset. On the validation dataset, the re-trained YOLOv8 outperforms the re-trained versions of both YOLOv5, and YOLO-FIRI.

Keywords: miner self-rescue, object detection, underground mine, YOLO

Procedia PDF Downloads 63
354 Influence of Pretreatment Magnetic Resonance Imaging on Local Therapy Decisions in Intermediate-Risk Prostate Cancer Patients

Authors: Christian Skowronski, Andrew Shanholtzer, Brent Yelton, Muayad Almahariq, Daniel J. Krauss

Abstract:

Prostate cancer has the third highest incidence rate and is the second leading cause of cancer death for men in the United States. Of the diagnostic tools available for intermediate-risk prostate cancer, magnetic resonance imaging (MRI) provides superior soft tissue delineation serving as a valuable tool for both diagnosis and treatment planning. Currently, there is minimal data regarding the practical utility of MRI for evaluation of intermediate-risk prostate cancer. As such, the National Comprehensive Cancer Network’s guidelines indicate MRI as optional in intermediate-risk prostate cancer evaluation. This project aims to elucidate whether MRI affects radiation treatment decisions for intermediate-risk prostate cancer. This was a retrospective study evaluating 210 patients with intermediate-risk prostate cancer, treated with definitive radiotherapy at our institution between 2019-2020. NCCN risk stratification criteria were used to define intermediate-risk prostate cancer. Patients were divided into two groups: those with pretreatment prostate MRI, and those without pretreatment prostate MRI. We compared the use of external beam radiotherapy, brachytherapy alone, brachytherapy boost, and androgen depravation therapy between the two groups. Inverse probability of treatment weighting was used to match the two groups for age, comorbidity index, American Urologic Association symptoms index, pretreatment PSA, grade group, and percent core involvement on prostate biopsy. Wilcoxon Rank Sum and Chi-squared tests were used to compare continuous and categorical variables. Of the patients who met the study’s eligibility criteria, 133 had a prostate MRI and 77 did not. Following propensity matching, there were no differences between baseline characteristics between the two groups. There were no statistically significant differences in treatments pursued between the two groups: 42% vs 47% were treated with brachytherapy alone, 40% vs 42% were treated with external beam radiotherapy alone, 18% vs 12% were treated with external beam radiotherapy with a brachytherapy boost, and 24% vs 17% received androgen deprivation therapy in the non-MRI and MRI groups, respectively. This analysis suggests that pretreatment MRI does not significantly impact radiation therapy or androgen deprivation therapy decisions in patients with intermediate-risk prostate cancer. Obtaining a pretreatment prostate MRI should be used judiciously and pursued only to answer a specific question, for which the answer is likely to impact treatment decision. Further follow up is needed to correlate MRI findings with their impacts on specific oncologic outcomes.

Keywords: magnetic resonance imaging, prostate cancer, definitive radiotherapy, gleason score 7

Procedia PDF Downloads 76
353 Regional Anesthesia in Carotid Surgery: A Single Center Experience

Authors: Daniel Thompson, Muhammad Peerbux, Sophie Cerutti, Hansraj Riteesh Bookun

Abstract:

Patients with carotid stenosis, which may be asymptomatic or symptomatic in the form of transient ischaemic attack (TIA), amaurosis fugax, or stroke, often require an endarterectomy to reduce stroke risk. Risks of this procedure include stroke, death, myocardial infarction, and cranial nerve damage. Carotid endarterectomy is most commonly performed under general anaesthetic, however, it can also be undertaken with a regional anaesthetic approach. Our tertiary centre generally performs carotid endarterectomy under regional anaesthetic. Our major tertiary hospital mostly utilises regional anaesthesia for carotid endarterectomy. We completed a cross-sectional analysis of all cases of carotid endarterectomy performed under regional anaesthesia across a 10-year period between January 2010 to March 2020 at our institution. 350 patients were included in this descriptive analysis, and demographic details for patients, indications for surgery, procedural details, length of surgery, and complications were collected. Data was cross tabulated and presented in frequency tables to describe these categorical variables. 263 of the 350 patients in the analysis were male, with a mean age of 71 ± 9. 172 patients had a history of ischaemic heart disease, 104 had diabetes mellitus, 318 had hypertension, and 17 patients had chronic kidney disease greater than Stage 3. 13.1% (46 patients) were current smokers, and the majority (63%) were ex-smokers. Most commonly, carotid endarterectomy was performed conventionally with patch arterioplasty 96% of the time (337 patients). The most common indication was TIA and stroke in 64% of patients, 18.9% were classified as asymptomatic, and 13.7% had amaurosis fugax. There were few general complications, with 9 wound complications/infections, 7 postoperative haematomas requiring return to theatre, 3 myocardial infarctions, 3 arrhythmias, 1 exacerbation of congestive heart failure, 1 chest infection, and 1 urinary tract infection. Specific complications to carotid endarterectomy included 3 strokes, 1 postoperative TIA, and 1 cerebral bleed. There were no deaths in our cohort. This analysis of a large cohort of patients from a major tertiary centre who underwent carotid endarterectomy under regional anaesthesia indicates the safety of such an approach for these patients. Regional anaesthesia holds the promise of less general respiratory and cardiac events compared to general anaesthesia, and in this vulnerable patient group, calls for comparative research between local and general anaesthesia in carotid surgery.

Keywords: anaesthesia, carotid endarterectomy, stroke, carotid stenosis

Procedia PDF Downloads 109
352 Load Forecasting Using Neural Network Integrated with Economic Dispatch Problem

Authors: Mariyam Arif, Ye Liu, Israr Ul Haq, Ahsan Ashfaq

Abstract:

High cost of fossil fuels and intensifying installations of alternate energy generation sources are intimidating main challenges in power systems. Making accurate load forecasting an important and challenging task for optimal energy planning and management at both distribution and generation side. There are many techniques to forecast load but each technique comes with its own limitation and requires data to accurately predict the forecast load. Artificial Neural Network (ANN) is one such technique to efficiently forecast the load. Comparison between two different ranges of input datasets has been applied to dynamic ANN technique using MATLAB Neural Network Toolbox. It has been observed that selection of input data on training of a network has significant effects on forecasted results. Day-wise input data forecasted the load accurately as compared to year-wise input data. The forecasted load is then distributed among the six generators by using the linear programming to get the optimal point of generation. The algorithm is then verified by comparing the results of each generator with their respective generation limits.

Keywords: artificial neural networks, demand-side management, economic dispatch, linear programming, power generation dispatch

Procedia PDF Downloads 175
351 Analysing Techniques for Fusing Multimodal Data in Predictive Scenarios Using Convolutional Neural Networks

Authors: Philipp Ruf, Massiwa Chabbi, Christoph Reich, Djaffar Ould-Abdeslam

Abstract:

In recent years, convolutional neural networks (CNN) have demonstrated high performance in image analysis, but oftentimes, there is only structured data available regarding a specific problem. By interpreting structured data as images, CNNs can effectively learn and extract valuable insights from tabular data, leading to improved predictive accuracy and uncovering hidden patterns that may not be apparent in traditional structured data analysis. In applying a single neural network for analyzing multimodal data, e.g., both structured and unstructured information, significant advantages in terms of time complexity and energy efficiency can be achieved. Converting structured data into images and merging them with existing visual material offers a promising solution for applying CNN in multimodal datasets, as they often occur in a medical context. By employing suitable preprocessing techniques, structured data is transformed into image representations, where the respective features are expressed as different formations of colors and shapes. In an additional step, these representations are fused with existing images to incorporate both types of information. This final image is finally analyzed using a CNN.

Keywords: CNN, image processing, tabular data, mixed dataset, data transformation, multimodal fusion

Procedia PDF Downloads 105
350 An Empirical Study on Switching Activation Functions in Shallow and Deep Neural Networks

Authors: Apoorva Vinod, Archana Mathur, Snehanshu Saha

Abstract:

Though there exists a plethora of Activation Functions (AFs) used in single and multiple hidden layer Neural Networks (NN), their behavior always raised curiosity, whether used in combination or singly. The popular AFs –Sigmoid, ReLU, and Tanh–have performed prominently well for shallow and deep architectures. Most of the time, AFs are used singly in multi-layered NN, and, to the best of our knowledge, their performance is never studied and analyzed deeply when used in combination. In this manuscript, we experiment with multi-layered NN architecture (both on shallow and deep architectures; Convolutional NN and VGG16) and investigate how well the network responds to using two different AFs (Sigmoid-Tanh, Tanh-ReLU, ReLU-Sigmoid) used alternately against a traditional, single (Sigmoid-Sigmoid, Tanh-Tanh, ReLUReLU) combination. Our results show that using two different AFs, the network achieves better accuracy, substantially lower loss, and faster convergence on 4 computer vision (CV) and 15 Non-CV (NCV) datasets. When using different AFs, not only was the accuracy greater by 6-7%, but we also accomplished convergence twice as fast. We present a case study to investigate the probability of networks suffering vanishing and exploding gradients when using two different AFs. Additionally, we theoretically showed that a composition of two or more AFs satisfies Universal Approximation Theorem (UAT).

Keywords: activation function, universal approximation function, neural networks, convergence

Procedia PDF Downloads 142
349 High Injury Prevalence in Adolescent Field Hockey Players: Implications for Future Practice

Authors: Pillay J. D., D. De Wit, J. F. Ducray

Abstract:

Field hockey is a popular international sport which is played in more than 100 countries across the world. Due to the nature of hockey, players repeatedly perform a combination of forward flexion and rotational movements of the spine in order to strike the ball. These movements have been shown to increase the risk of pain and injury to the lumbar spine. The aim of this study was to determine the prevalence and incidence of low back pain (LBP) in male adolescent field hockey players and the characteristics of LBP in terms of location, chronicity, disability, and treatment sought, as well as its association with selected risk factors. A survey was conducted on 112 male adolescent field hockey players in the eThekwini Municipality of KwaZulu-Natal, South Africa. The questionnaire contained sections on the demographics of participants, general characteristics of participants, health and lifestyle characteristics, low back pain patterns, treatment of low back pain, and the level of disability associated with LBP. The data were statistically analysed using IBM SPSS version 25 with statistical significance set at p-value <0.05. Descriptive statistics such as mean and standard deviation were used to summarise responses to continuous variables as appropriate. Categorical variables were described using frequency tables. Associations between risk factors and low back pain were tested using Pearson’s chi-square test and t-tests as appropriate. A total of 68 questionnaires were completed for analysis (67% participation rate); the period prevalence of LBP was 63.2% (35.0%:beginning of the season, 32.4%:mid-season, 22.1%: end of season). Incidence was 38.2%. The most common location for LBP was the middle low back region (39.5%), and the most common duration of pain was a few hours (32.6%). Most participants (79.1%) did not classify their pain as a disability, and only 44.2% of participants received medical treatment for their LBP. An interesting finding was the association between hydration and LBP (p = 0.050), i.e., those individuals who did not hydrate frequently during matches and training were significantly more likely to experience LBP. The results of this study, although limited to a select group of adolescents, showed a higher prevalence of LBP than that of previous studies. More importantly, even though most participants did not experience LBP classified as a disability, LBP still had a large impact on participants, as nearly half of the participants consulted with a medical professional for treatment. Need for the application of further strategies in the prevention and management of LBP in field hockey, such as adequate warm-up and cool-down, stretching exercises, rest between sessions, etc., are recommended as simple strategies to reduce LBP prevalence.

Keywords: adolescents, field hockey players, incidence, low back pain, prevalence, risk factors

Procedia PDF Downloads 48
348 SAMRA: Dataset in Al-Soudani Arabic Maghrebi Script for Recognition of Arabic Ancient Words Handwritten

Authors: Sidi Ahmed Maouloud, Cheikh Ba

Abstract:

Much of West Africa’s cultural heritage is written in the Al-Soudani Arabic script, which was widely used in West Africa before the time of European colonization. This Al-Soudani Arabic script is an African version of the Maghrebi script, in particular, the Al-Mebssout script. However, the local African qualities were incorporated into the Al-Soudani script in a way that gave it a unique African diversity and character. Despite the existence of several Arabic datasets in Oriental script, allowing for the analysis, layout, and recognition of texts written in these calligraphies, many Arabic scripts and written traditions remain understudied. In this paper, we present a dataset of words from Al-Soudani calligraphy scripts. This dataset consists of 100 images selected from three different manuscripts written in Al-Soudani Arabic script by different copyists. The primary source for this database was the libraries of Boston University and Cambridge University. This dataset highlights the unique characteristics of the Al-Soudani Arabic script as well as the new challenges it presents in terms of automatic word recognition of Arabic manuscripts. An HTR system based on a hybrid ANN (CRNN-CTC) is also proposed to test this dataset. SAMRA is a dataset of annotated Arabic manuscript words in the Al-Soudani script that can help researchers automatically recognize and analyze manuscript words written in this script.

Keywords: dataset, CRNN-CTC, handwritten words recognition, Al-Soudani Arabic script, HTR, manuscripts

Procedia PDF Downloads 103
347 Predictive Analytics in Traffic Flow Management: Integrating Temporal Dynamics and Traffic Characteristics to Estimate Travel Time

Authors: Maria Ezziani, Rabie Zine, Amine Amar, Ilhame Kissani

Abstract:

This paper introduces a predictive model for urban transportation engineering, which is vital for efficient traffic management. Utilizing comprehensive datasets and advanced statistical techniques, the model accurately forecasts travel times by considering temporal variations and traffic dynamics. Machine learning algorithms, including regression trees and neural networks, are employed to capture sequential dependencies. Results indicate significant improvements in predictive accuracy, particularly during peak hours and holidays, with the incorporation of traffic flow and speed variables. Future enhancements may integrate weather conditions and traffic incidents. The model's applications range from adaptive traffic management systems to route optimization algorithms, facilitating congestion reduction and enhancing journey reliability. Overall, this research extends beyond travel time estimation, offering insights into broader transportation planning and policy-making realms, empowering stakeholders to optimize infrastructure utilization and improve network efficiency.

Keywords: predictive analytics, traffic flow, travel time estimation, urban transportation, machine learning, traffic management

Procedia PDF Downloads 64
346 Constructing a Physics Guided Machine Learning Neural Network to Predict Tonal Noise Emitted by a Propeller

Authors: Arthur D. Wiedemann, Christopher Fuller, Kyle A. Pascioni

Abstract:

With the introduction of electric motors, small unmanned aerial vehicle designers have to consider trade-offs between acoustic noise and thrust generated. Currently, there are few low-computational tools available for predicting acoustic noise emitted by a propeller into the far-field. Artificial neural networks offer a highly non-linear and adaptive model for predicting isolated and interactive tonal noise. But neural networks require large data sets, exceeding practical considerations in modeling experimental results. A methodology known as physics guided machine learning has been applied in this study to reduce the required data set to train the network. After building and evaluating several neural networks, the best model is investigated to determine how the network successfully predicts the acoustic waveform. Lastly, a post-network transfer function is developed to remove discontinuity from the predicted waveform. Overall, methodologies from physics guided machine learning show a notable improvement in prediction performance, but additional loss functions are necessary for constructing predictive networks on small datasets.

Keywords: aeroacoustics, machine learning, propeller, rotor, neural network, physics guided machine learning

Procedia PDF Downloads 206