Commenced in January 2007

Frequency: Monthly

Edition: International

Paper Count: 2888

Search results for: satellite imagery classification

1928 Diversity in Finance Literature Revealed through the Lens of Machine Learning: A Topic Modeling Approach on Academic Papers

Abstract:

This paper aims to define a structured topography for finance researchers seeking to navigate the body of knowledge in their extrapolation of finance phenomena. To make sense of the body of knowledge in finance, a probabilistic topic modeling approach is applied to 6,000 abstracts of academic articles published in three top finance journals between 1976 and 2020. This approach combines machine learning techniques and natural language processing to statistically identify the connections between research articles and their shared topics, each described by relevant keywords. The topic modeling analysis reveals 35 coherent topics that depict the finance literature well and provide a comprehensive structure for the ongoing research themes. Comparing the extracted topics to the Journal of Economic Literature (JEL) classification system reveals a significant similarity between the characterizing keywords. On the other hand, we identify other topics that do not match the JEL classification despite being relevant in the finance literature.

Keywords: finance literature, textual analysis, topic modeling, perplexity

Procedia PDF Downloads 148

1927 A Framework for Auditing Multilevel Models Using Explainability Methods

Authors: Debarati Bhaumik, Diptish Dey

Abstract:

Multilevel models, increasingly deployed in industries such as insurance, food production, and entertainment within functions such as marketing and supply chain management, need to be transparent and ethical. Applications usually result in binary classification within groups or hierarchies based on a set of input features. Using open-source datasets, we demonstrate that popular explainability methods, such as SHAP and LIME, consistently underperform in accuracy when interpreting these models. They fail to predict the order of feature importance, the magnitudes, and occasionally even the nature of the feature contribution (negative versus positive contribution to the outcome). Besides accuracy, the computational intractability of SHAP for binomial classification is a cause for concern. For transparent and ethical applications of these hierarchical statistical models, sound audit frameworks need to be developed. In this paper, we propose an audit framework for technical assessment of multilevel regression models focusing on three aspects: (i) model assumptions & statistical properties, (ii) model transparency using different explainability methods, and (iii) discrimination assessment. To this end, we undertake a quantitative approach and compare intrinsic model methods with SHAP and LIME. The framework comprises a shortlist of KPIs, such as PoCE (Percentage of Correct Explanations) and MDG (Mean Discriminatory Gap) per feature, for each of these three aspects. A traffic light risk assessment method is furthermore coupled to these KPIs. The audit framework will assist regulatory bodies in performing conformity assessments of AI systems using multilevel binomial classification models at businesses. It will also help businesses deploying multilevel models to be future-proof and aligned with the European Commission’s proposed Regulation on Artificial Intelligence.

Keywords: audit, multilevel model, model transparency, model explainability, discrimination, ethics

Procedia PDF Downloads 77

1926 Large Neural Networks Learning From Scratch With Very Few Data and Without Explicit Regularization

Authors: Christoph Linse, Thomas Martinetz

Abstract:

Recent findings have shown that neural networks also generalize in over-parametrized regimes with zero training error. This is surprising, since it runs completely against traditional machine learning wisdom. In our empirical study, we fortify these findings in the domain of fine-grained image classification. We show that very large Convolutional Neural Networks with millions of weights do learn with only a handful of training samples and without image augmentation, explicit regularization or pretraining. We train the architectures ResNet18, ResNet101 and VGG19 on subsets of the difficult benchmark datasets Caltech101, CUB_200_2011, FGVCAircraft, Flowers102 and StanfordCars with 100 classes and more, perform a comprehensive comparative study and draw implications for the practical application of CNNs. Finally, we show that VGG19 with 140 million weights learns to distinguish airplanes and motorbikes with up to 95% accuracy using only 20 training samples per class.

Keywords: convolutional neural networks, fine-grained image classification, generalization, image recognition, over-parameterized, small data sets

Procedia PDF Downloads 70

1925 Developing an Advanced Algorithm Capable of Classifying News, Articles and Other Textual Documents Using Text Mining Techniques

Authors: R. B. Knudsen, O. T. Rasmussen, R. A. Alphinas

Abstract:

The aim of this research is to develop an algorithm capable of classifying news articles from the automobile industry according to the competitive actions they entail, using Text Mining (TM) methods. This requires testing how to preprocess the data properly by preparing the pipeline that best fits each algorithm. The pipelines are tested along with nine different classification algorithms in the realm of regression, support vector machines, and neural networks. Preliminary testing for identifying the optimal pipelines and algorithms resulted in the selection of two algorithms with two different pipelines. The two algorithms are Logistic Regression (LR) and Artificial Neural Network (ANN). These algorithms are optimized further by testing several parameters of each. The best result is achieved with the ANN. The final model yields an accuracy of 0.79, a precision of 0.80, a recall of 0.78, and an F1 score of 0.76. By removing three of the classes that created noise, the final algorithm is capable of reaching an accuracy of 94%.
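The four scores reported above can be computed from true and predicted labels as in the following minimal sketch. The abstract does not say how the per-class scores were averaged; macro averaging is assumed here, and the function name `macro_metrics` and the toy labels are illustrative only. Under macro averaging, the F1 score can fall below both precision and recall, which is one possible reading of the reported figures.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def safe_div(a, b):
        return a / b if b else 0.0

    precisions, recalls, f1s = [], [], []
    for c in labels:
        prec = safe_div(tp[c], tp[c] + fp[c])
        rec = safe_div(tp[c], tp[c] + fn[c])
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(safe_div(2 * prec * rec, prec + rec))

    n = len(labels)
    return {
        "accuracy": safe_div(sum(tp.values()), len(y_true)),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels (made up): macro F1 ends up below both macro precision and recall.
scores = macro_metrics(["a", "a", "b", "b"], ["a", "b", "b", "b"])
print(scores)
```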

Keywords: Artificial Neural network, Competitive dynamics, Logistic Regression, Text classification, Text mining

Procedia PDF Downloads 103

1924 Enhancing the Interpretation of Group-Level Diagnostic Results from Cognitive Diagnostic Assessment: Application of Quantile Regression and Cluster Analysis

Authors: Wenbo Du, Xiaomei Ma

Abstract:

With the empowerment of Cognitive Diagnostic Assessment (CDA), various domains of language testing and assessment have been investigated to extract more diagnostic information. Notably, most of the extant empirical CDA-based research places much emphasis on individual-level diagnosis, with very little concerned with learners’ group-level performance. Even though personalized diagnostic feedback is the unique feature that differentiates CDA from other assessment tools, group-level diagnostic information cannot be overlooked, in that it might be more practical in a classroom setting. Additionally, the group-level diagnostic information obtained via current CDA always results in a “flat pattern”, that is, the mastery and non-mastery of all tested skills account for the two highest proportions. In that case, the outcome brings little benefit beyond the original total score. To address these issues, the present study applies cluster analysis for group classification and quantile regression analysis to pinpoint learners’ performance at different proficiency levels (beginner, intermediate and advanced), and thus to enhance the interpretation of the CDA results extracted from a group of EFL learners’ reading performance on a diagnostic reading test designed by the PELDiaG research team from a key university in China. The results show that the EM method in cluster analysis yields more appropriate classification results than CDA, and that quantile regression analysis paints a more insightful picture of the characteristics of learners with different reading proficiencies. The findings are helpful and practical for instructors seeking to refine the EFL reading curriculum and tailor instructional plans based on the group classification results and quantile regression analysis. Meanwhile, these innovative statistical methods could also make up for the deficiencies of CDA and push forward the development of language testing and assessment in the future.

Keywords: cognitive diagnostic assessment, diagnostic feedback, EFL reading, quantile regression

Procedia PDF Downloads 135

1923 Modeling Engagement with Multimodal Multisensor Data: The Continuous Performance Test as an Objective Tool to Track Flow

Authors: Mohammad H. Taheri, David J. Brown, Nasser Sherkat

Abstract:

Engagement is one of the most important factors in determining successful outcomes and deep learning in students. Existing approaches to detect student engagement involve periodic human observations that are subject to inter-rater reliability. Our solution uses real-time multimodal multisensor data labeled by objective performance outcomes to infer the engagement of students. The study involves four students with a combined diagnosis of cerebral palsy and a learning disability who took part in a 3-month trial over 59 sessions. Multimodal multisensor data were collected while they participated in a continuous performance test. Eye gaze, electroencephalogram, body pose, and interaction data were used to create a model of student engagement through objective labeling from the continuous performance test outcomes. In order to achieve this, a type of continuous performance test is introduced, the Seek-X type. Nine features were extracted, including high-level handpicked compound features. Using leave-one-out cross-validation, a series of different machine learning approaches was evaluated. Overall, the random forest classification approach achieved the best classification results. Using random forest, 93.3% classification accuracy for engagement and 42.9% accuracy for disengagement were achieved. We compared these results to outcomes from different models: AdaBoost, decision tree, k-Nearest Neighbor, naïve Bayes, neural network, and support vector machine. We showed that using a multisensor approach achieved higher accuracy than using features from any reduced set of sensors. We found that using high-level handpicked features can improve the classification accuracy in every sensor mode. Our approach is robust to both sensor fallout and occlusions. The single most important sensor feature for the classification of engagement and distraction was shown to be eye gaze.
It has been shown that we can accurately predict the level of engagement of students with learning disabilities in a real-time approach that does not depend on human observation, is not subject to inter-rater reliability, and does not rely on a single mode of sensor input. This will help teachers design interventions for a heterogeneous group of students, where teachers cannot possibly attend to each of their individual needs. Our approach can be used to identify those with the greatest learning challenges so that all students are supported to reach their full potential.
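Leave-one-out cross-validation, the evaluation scheme named in this abstract, can be sketched as follows. This is an illustration only: a 1-nearest-neighbor rule stands in for the random forest used in the study, and the feature vectors and labels are made up.

```python
import math


def nearest_neighbor_predict(train_X, train_y, x):
    """Predict the label of x as the label of its closest training point."""
    best = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return train_y[best]


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, fit on the rest, and score the held-out sample."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_neighbor_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Hypothetical two-feature samples (e.g. a gaze score and a pose score).
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 1.1], [0.9, 1.0], [1.2, 0.8]]
y = ["disengaged"] * 3 + ["engaged"] * 3
print(leave_one_out_accuracy(X, y))
```

With only a handful of participants and sessions, holding out one sample at a time keeps almost all data available for training in every fold, which is a common reason for choosing this scheme.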

Keywords: affective computing in education, affect detection, continuous performance test, engagement, flow, HCI, interaction, learning disabilities, machine learning, multimodal, multisensor, physiological sensors, student engagement

Procedia PDF Downloads 77

1922 One-Dimensional Numerical Simulation of the Nonlinear Instability Behavior of an Electrified Viscoelastic Liquid Jet

Authors: Fang Li, Xie-Yuan Yin, Xie-Zhen Yin

Abstract:

Instability and breakup of electrified viscoelastic liquid jets are involved in various applications such as inkjet printing, fuel atomization, the pharmaceutical industry, electrospraying, and electrospinning. Studying the instability of electrified viscoelastic liquid jets is of theoretical and practical significance. We built a one-dimensional electrified viscoelastic model to study the nonlinear instability behavior of a perfectly conducting, slightly viscoelastic liquid jet under a radial electric field. The model is solved numerically by using an implicit finite difference scheme together with a boundary element method. It is found that under a radial electric field a viscoelastic liquid jet still evolves into a beads-on-string structure with a thin filament connecting two adjacent droplets, as in the absence of an electric field. A radial electric field exhibits limited influence on the decay of the filament thickness in the nonlinear evolution process of a viscoelastic jet, in contrast to its great enhancing effect on the linear instability of the jet. On the other hand, a radial electric field can induce axial non-uniformity of the first normal stress difference within the filament. In particular, the magnitude of the first normal stress difference near the midpoint of the filament can be greatly decreased by a radial electric field. Decreasing the extensional stress by a radial electric field may find applications in spraying, spinning, liquid bridges and others. In addition, the effect of a radial electric field on the formation of satellite droplets is investigated on the parametric plane of the dimensionless wave number and the electrical Bond number. It is found that satellite droplets may be formed for a larger axial wave number at a larger radial electric field. The present study helps us gain insight into the nonlinear instability characteristics of electrified viscoelastic liquid jets.

Keywords: nonlinear instability, one-dimensional models, radial electric fields, viscoelastic liquid jets

Procedia PDF Downloads 376

1921 Improved Rare Species Identification Using Focal Loss Based Deep Learning Models

Authors: Chad Goldsworthy, B. Rajeswari Matam

Abstract:

The use of deep learning for species identification in camera trap images has revolutionised our ability to study, conserve and monitor species in a highly efficient and unobtrusive manner, with state-of-the-art models achieving accuracies surpassing that of manual human classification. The high imbalance of camera trap datasets, however, results in poor accuracies for minority (rare or endangered) species due to their relative insignificance to the overall model accuracy. This paper investigates the use of Focal Loss, in comparison to the traditional Cross Entropy Loss function, to improve the identification of minority species in the “255 Bird Species” dataset from Kaggle. The results show that, although Focal Loss slightly decreased the accuracy of the majority species, it was able to increase the F1-score by 0.06 and improve the identification of the bottom two, five and ten (minority) species by 37.5%, 15.7% and 10.8%, respectively, while also improving overall accuracy by 2.96%.
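The difference between the two loss functions compared in this abstract can be illustrated for a single sample. The sketch below uses the standard focal loss form, FL(p) = -alpha * (1 - p)^gamma * log(p), where p is the probability the model assigns to the true class. The values gamma = 2 and alpha = 1 are assumptions for illustration; the abstract does not report the settings used.

```python
import math


def cross_entropy(p_true):
    """Cross-entropy loss for one sample, given the probability of its true class."""
    return -math.log(p_true)


def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss: cross-entropy scaled by (1 - p_true) ** gamma.

    gamma and alpha are illustrative defaults, not values from the paper.
    """
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)


# An easy sample (p = 0.9) is down-weighted far more than a hard one (p = 0.1).
for p in (0.9, 0.1):
    ratio = focal_loss(p) / cross_entropy(p)
    print(f"p={p}: focal/cross-entropy ratio = {ratio:.2f}")
```

Because well-classified samples contribute little to the total loss, training is steered toward the hard, typically minority-class, samples. With gamma = 0 the focal loss reduces to ordinary cross-entropy.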

Keywords: convolutional neural networks, data imbalance, deep learning, focal loss, species classification, wildlife conservation

Procedia PDF Downloads 170

1920 Evaluation of IMERG Performance at Estimating the Rainfall Properties through Convective and Stratiform Rain Events in a Semi-Arid Region of Mexico

Authors: Eric Muñoz de la Torre, Julián González Trinidad, Efrén González Ramírez

Abstract:

Rain varies greatly in its duration, intensity, and spatial coverage, so sub-daily rainfall data are important for various applications, including risk prevention. However, ground measurements are limited by the low and irregular density of rain gauges. An alternative is offered by Satellite Precipitation Products (SPPs) such as IMERG, which use passive microwave and infrared sensors to estimate rainfall; however, these SPPs have to be validated before their application. The aim of this study is to evaluate the performance of the IMERG (Integrated Multi-satellitE Retrievals for Global Precipitation Measurement) final run V06B SPP in a semi-arid region of Mexico, using sub-daily data from 4 automatic rain gauges (pluviographs) for October 2019 and June to September 2021. The Minimum Inter-event Time (MIT) criterion, with a dry period of 10 h, was used to separate unique rain events in order to evaluate the rainfall properties (depth, duration and intensity). Point-to-pixel analysis and continuous, categorical, and volumetric statistical metrics were used. Results show that IMERG is capable of estimating the rainfall depth with a slight overestimation but is unable to identify the real duration and intensity of the rain events, showing large overestimations and underestimations, respectively. Convective rain events made up 80 to 85% of the events in the study zone; the rest were stratiform rain events, classified by the depth magnitude variation of IMERG pixels and pluviographs. IMERG performed worse at detecting convective events but performed well at estimating stratiform rain events, which originate from cold fronts.
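The MIT criterion described above can be sketched as a simple pass over a rainfall series. The sketch assumes hourly depths in millimetres and treats a dry spell of at least `mit_hours` as the separator between events; the time resolution, the exact comparison (at least versus more than 10 h) and the function name are assumptions, not details from the study.

```python
def split_rain_events(hourly_mm, mit_hours=10):
    """Group wet hours into events; a dry spell of at least mit_hours ends an event."""
    events = []
    start = last_wet = None
    for hour, depth in enumerate(hourly_mm):
        if depth <= 0:
            continue
        # Number of dry hours since the previous wet hour.
        if start is not None and hour - last_wet - 1 >= mit_hours:
            events.append((start, last_wet))
            start = None
        if start is None:
            start = hour
        last_wet = hour
    if start is not None:
        events.append((start, last_wet))

    summary = []
    for first, last in events:
        total = sum(hourly_mm[first:last + 1])
        duration = last - first + 1  # hours, counting both end points
        summary.append({
            "start": first,
            "depth_mm": total,
            "duration_h": duration,
            "intensity_mm_h": total / duration,
        })
    return summary


# Made-up series: three wet hours, a 10-hour dry spell, then one more wet hour.
series = [0, 2.0, 1.0, 0, 0, 3.0] + [0] * 10 + [4.0, 0, 0]
for event in split_rain_events(series):
    print(event)
```

Applying the same function to both the gauge series and the collocated IMERG pixel series would yield per-event depth, duration and intensity that can then be compared.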

Keywords: IMERG, rainfall, rain gauge, remote sensing, statistical evaluation

Procedia PDF Downloads 51

1919 Spatial Data Mining by Decision Trees

Authors: Sihem Oujdi, Hafida Belbachir

Abstract:

Existing data mining methods cannot be applied directly to spatial data, because spatial data require consideration of spatial specificities such as spatial relationships. This paper focuses on classification with decision trees, one of the data mining techniques. We propose an extension of the C4.5 algorithm for spatial data, based on two different approaches: join materialization and querying the different tables on the fly. Similar work has been done on these two main approaches: the first, join materialization, favors processing time at the expense of memory space, whereas the second, querying the different tables on the fly, favors memory space at the expense of processing time. The modified C4.5 algorithm requires three input tables: a target table, a neighbor table, and a spatial join index that contains the possible spatial relationships among the objects in the target table and those in the neighbor table. The proposed algorithms are applied to a spatial data pattern in the accidentology domain. A comparative study of our approach with other works on classification by spatial decision trees will be detailed.
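C4.5 chooses its splits by gain ratio, that is, information gain divided by split information. The sketch below computes it over rows that are assumed to be the result of an already materialized join between the target table and the neighbor table. The attribute names and values are hypothetical and do not come from the paper's accident data.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(rows, attribute, target):
    """Information gain of `attribute` with respect to `target`, divided by split information."""
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    total = len(rows)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    gain = entropy([row[target] for row in rows]) - remainder
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical materialized join: each accident row already carries a spatial
# attribute derived from the neighbor table (here, proximity to an intersection).
rows = [
    {"near_intersection": "yes", "road": "urban", "severity": "high"},
    {"near_intersection": "yes", "road": "rural", "severity": "high"},
    {"near_intersection": "no", "road": "urban", "severity": "low"},
    {"near_intersection": "no", "road": "rural", "severity": "low"},
]
for name in ("near_intersection", "road"):
    print(name, gain_ratio(rows, name, "severity"))
```

In the on-the-fly variant, the spatial attribute would instead be looked up through the spatial join index each time a candidate split is evaluated, trading processing time for memory.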

Keywords: C4.5 algorithm, decision trees, S-CART, spatial data mining

Procedia PDF Downloads 600

1918 A Robust System for Foot Arch Type Classification from Static Foot Pressure Distribution Data Using Linear Discriminant Analysis

Authors: R. Periyasamy, Deepak Joshi, Sneh Anand

Abstract:

Foot posture assessment is important for evaluating foot type, which can cause gait and postural defects in all age groups. Although different methods are used for classification of foot arch type in clinical and research examination, there is no clear approach for selecting the most appropriate measurement system. Therefore, the aim of this study was to develop a system for evaluation of foot type as a clinical decision-making aid for diagnosis of flat and normal arch, based on the Arch Index (AI) and a foot pressure distribution parameter, the Power Ratio (PR). The accuracy of the system was evaluated for 27 subjects with ages ranging from 24 to 65 years. Foot area measurements (hind foot, mid foot, and forefoot) were acquired simultaneously from the foot pressure intensity image using a portable PedoPowerGraph system, and the image was analyzed in the frequency domain to obtain the foot pressure distribution parameter (PR). From our results, we obtain 100% classification accuracy of normal and flat foot by using the linear discriminant analysis method. We observe no misclassification of foot types when foot pressure distribution data are incorporated instead of using only the arch index (AI). We found that the mid-foot pressure distribution ratio data and the arch index (AI) value are well correlated with foot arch type based on visual analysis. Therefore, this paper suggests that the proposed system is accurate and makes it easy to determine foot arch type from the arch index (AI) together with mid-foot pressure distribution ratio data instead of physical area of contact. Hence, such a computational-tool-based system can help clinicians assess foot structure and cross-check their diagnosis of flat foot from mid-foot pressure distribution.

Keywords: arch index, computational tool, static foot pressure intensity image, foot pressure distribution, linear discriminant analysis

Procedia PDF Downloads 488

1917 Modified Naive Bayes-Based Prediction Modeling for Crop Yield Prediction

Authors: Kefaya Qaddoum

Abstract:

Most greenhouse growers desire a determined amount of yield in order to accurately meet market requirements. The purpose of this paper is to model a simple but often satisfactory supervised classification method. The original naive Bayes has a serious weakness, namely that it produces redundant predictors. In this paper, a regularization technique is used to obtain a computationally efficient classifier based on naive Bayes. The suggested construction, which uses an L1 penalty, is capable of removing redundant predictors; a modification of the LARS algorithm is devised to solve this problem, making the method applicable to a wide range of data. In the experimental section, a study is conducted to examine the effect of redundant and irrelevant predictors and to test the method on the WSG data set for tomato yields, where there are many more predictors than data points and the goal is to meet the urgent need to predict weekly yield. Finally, the modified approach is compared with several naive Bayes variants and other classification algorithms (SVM and kNN), and is shown to perform fairly well.

Keywords: tomato yield prediction, naive Bayes, redundancy, WSG

Procedia PDF Downloads 218

1916 Earthquake Classification in Molluca Collision Zone Using Conventional Statistical Methods

Authors: H. J. Wattimanela, U. S. Passaribu, A. N. T. Puspito, S. W. Indratno

Abstract:

The Molluca Collision Zone is located at the junction of the Eurasian, Australian, Pacific, and Philippine plates. The zone between the Sangihe arc to the west and the Halmahera arc to the east is an active collision zone, convex toward the Molluca Sea. This research analyzes the behavior of earthquake occurrence in the Molluca Collision Zone: the distribution of earthquakes in each partition region, the type of that distribution, the mean occurrence of earthquakes in each partition region, and the correlation between partition regions. We calculate the number of earthquakes using a partition method and analyze its behavior using conventional statistical methods. The data used are shallow earthquakes with magnitudes ≥ 4 SR for the period 1964-2013 in the Molluca Collision Zone. From the results, we can classify the partition regions based on the correlation into two classes: strong and very strong. This classification can be used for early warning systems in disaster management.

Keywords: molluca collision zone, partition regions, conventional statistical methods, earthquakes, classifications, disaster management

Procedia PDF Downloads 477

1915 Distangling Biological Noise in Cellular Images with a Focus on Explainability

Authors: Manik Sharma, Ganapathy Krishnamurthi

Abstract:

The cost of some drugs and medical treatments has risen so much in recent years that many patients are having to go without. A classification project could make researchers more efficient. One of the more surprising reasons behind the cost is how long it takes to bring new treatments to market. Despite improvements in technology and science, research and development continues to lag. In fact, finding a new treatment takes, on average, more than 10 years and costs hundreds of millions of dollars. If successful, we could dramatically improve the industry's ability to model cellular images according to their relevant biology, in turn greatly decreasing the cost of treatments and ensuring that these treatments get to patients faster. This work aims at solving a part of this problem by creating a cellular image classification model that can decipher the genetic perturbations in cells (occurring naturally or artificially). Another interesting question addressed is what makes the deep-learning model decide in a particular fashion, which can further help in demystifying the mechanism of action of certain perturbations and pave the way towards the explainability of the deep-learning model.

Keywords: cellular images, genetic perturbations, deep-learning, explainability

Procedia PDF Downloads 95

1914 Detection and Classification of Rubber Tree Leaf Diseases Using Machine Learning

Authors: Kavyadevi N., Kaviya G., Gowsalya P., Janani M., Mohanraj S.

Abstract:

Hevea brasiliensis, also known as the rubber tree, is one of the foremost crop assets in the world. One of the most significant advantages of the rubber plant in terms of air oxygenation is its capacity to reduce the likelihood of an individual developing respiratory allergies like asthma. To construct a system that can properly identify crop diseases and pests and then create a database of insecticides for each pest and disease, we must first give treatment for the illness that has been detected. In this article, we primarily examine three major leaf diseases because of their economic impact: bird's eye spot, algal spot, and powdery mildew. The recommended work focuses on disease identification on rubber tree leaves, which will be accomplished by employing one of the superior algorithms. The processing pipeline consists of input, preprocessing, image segmentation, feature extraction, and classification. We will use time-consuming procedures that they use to detect the sickness. As a consequence, the main ailments, underlying causes, and signs and symptoms of diseases that harm the rubber tree are covered in this study.

Keywords: image processing, python, convolutional neural network (CNN), machine learning

Procedia PDF Downloads 59

1913 Classifications of Sleep Apnea (Obstructive, Central, Mixed) and Hypopnea Events Using Wavelet Packet Transform and Support Vector Machines (SVM)

Authors: Benghenia Hadj Abd El Kader

Abstract:

Sleep apnea events, classified as obstructive, central, mixed, or hypopnea, are characterized by frequent breathing cessations or reductions in upper airflow during sleep. An advanced method for analyzing the patterning of biomedical signals to recognize obstructive sleep apnea and hypopnea is presented. To extract the characteristic parameters used for classifying the above-stated (obstructive, central, mixed) sleep apnea and hypopnea events, the proposed method first analyzes polysomnography signals such as the electrocardiogram (ECG) and electromyogram (EMG), and then classifies the events. The analysis is carried out using the wavelet transform technique in order to extract characteristic parameters, whereas classification is carried out by applying the SVM (support vector machine) technique. The obtained results show good recognition rates using characteristic parameters.
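A wavelet packet decomposition, as named in the title, splits both the approximation and the detail branch at every level, and the energy of each resulting node is a common choice of feature for a classifier. The sketch below uses the Haar wavelet for simplicity; the abstract does not state which mother wavelet, decomposition depth, or features were used, so all of these are assumptions.

```python
import math


def haar_step(signal):
    """One Haar analysis step: pairwise scaled sums (approximation) and differences (detail)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal) - 1, 2)]
    return approx, detail


def wavelet_packet_energies(signal, levels):
    """Energy of every node at depth `levels` of a Haar wavelet packet tree.

    The signal length is assumed to be divisible by 2 ** levels. Nodes are
    returned in natural (filter-bank) order, not sorted by frequency.
    """
    nodes = [list(signal)]
    for _ in range(levels):
        next_nodes = []
        for node in nodes:
            next_nodes.extend(haar_step(node))  # split both branches of the tree
        nodes = next_nodes
    return [sum(v * v for v in node) for node in nodes]


# Made-up signals: a constant segment and a rapidly alternating one.
print(wavelet_packet_energies([1.0] * 8, levels=2))
print(wavelet_packet_energies([1.0, -1.0] * 4, levels=2))
```

The constant signal concentrates its energy in the first node and the alternating signal in a different one, so the energy vector separates slow from fast activity. Such a vector could then be passed to an SVM.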

Keywords: obstructive, central, mixed, sleep apnea, hypopnea, ECG, EMG, wavelet transform, SVM classifier

Procedia PDF Downloads 354

1912 Combining Multiscale Patterns of Weather and Sea States into a Machine Learning Classifier for Mid-Term Prediction of Extreme Rainfall in North-Western Mediterranean Sea

Authors: Pinel Sebastien, Bourrin François, De Madron Du Rieu Xavier, Ludwig Wolfgang, Arnau Pedro

Abstract:

Heavy precipitation constitutes a major meteorological threat in the western Mediterranean. Research has investigated the relationship between precipitation and the states of the Mediterranean Sea and the atmosphere for short temporal windows. However, at a larger temporal scale, the precursor signals of heavy rainfall in the sea and atmosphere have drawn little attention. Moreover, despite ongoing improvements in numerical weather prediction, the medium-term forecasting of rainfall events remains a difficult task. Here, we aim to investigate the influence of early-spring environmental parameters on the following autumnal heavy precipitation. Hence, we develop a machine learning model to predict extreme autumnal rainfall with a 6-month lead time over the Spanish Catalan coastal area, based on i) the sea pattern (main current-LPC and Sea Surface Temperature-SST) at the mesoscale, ii) 4 European weather teleconnection patterns (NAO, WeMo, SCAND, MO) at the synoptic scale, and iii) the hydrological regime of the main local river (Rhône River). The accuracy of the developed model classifier is evaluated via statistical analysis based on classification accuracy, logarithmic and confusion matrix by comparing with rainfall estimates from rain gauges and satellite observations (CHIRPS-2.0). Sensitivity tests are carried out by changing the model configuration, such as sea SST, sea LPC, river regime, and synoptic atmosphere configuration. The sensitivity analysis suggests a negligible influence from the hydrological regime, unlike SST, LPC, and specific teleconnection weather patterns. Finally, this study illustrates how public datasets can be integrated into a machine learning model for heavy rainfall prediction, which may be of interest to local policymakers for management purposes.

Keywords: extreme hazards, sensitivity analysis, heavy rainfall, machine learning, sea-atmosphere modeling, precipitation forecasting

Procedia PDF Downloads 116

1911 Discrimination and Classification of Vestibular Neuritis Using Combined Fisher and Support Vector Machine Model

Authors: Amine Ben Slama, Aymen Mouelhi, Sondes Manoubi, Chiraz Mbarek, Hedi Trabelsi, Mounir Sayadi, Farhat Fnaiech

Abstract:

Vertigo is a sensation of feeling off balance; the cause of this symptom is very difficult to interpret and needs a complementary exam. Generally, vertigo is caused by an ear problem. Some of the most common causes include benign paroxysmal positional vertigo (BPPV), Meniere's disease and vestibular neuritis (VN). In clinical practice, different tests of the videonystagmographic (VNG) technique are used to detect the presence of vestibular neuritis (VN). The topographical diagnosis of this disease presents a large diversity in its characteristics, which creates a mixture of problems for the usual etiological analysis methods. In this study, a vestibular neuritis analysis method is proposed with videonystagmography (VNG) applications, using an estimation of pupil movements in the case of an uncontrolled motion to obtain efficient and reliable diagnosis results. First, the pupil displacement vectors are estimated using the Hough Transform (HT) to approximate the location of the pupil region. Then, temporal and frequency features are computed from the rotation angle variation of the pupil motion. Finally, optimized features are selected using Fisher criterion evaluation for discrimination and classification of the VN disease. Experimental results are analyzed using two categories: normal and pathologic. By classifying the reduced features using the Support Vector Machine (SVM), a classification accuracy of 94% is achieved. Compared to recent studies, the proposed expert system is extremely helpful and highly effective in resolving the problem of VNG analysis and providing an accurate diagnostic for medical devices.
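Feature selection by the Fisher criterion can be sketched for the two-class case (normal versus pathologic). Each feature is scored by the squared difference of its class means divided by the sum of its class variances, and the highest-scoring features are kept. The feature names and values below are made up, and the exact form of the criterion used in the study is not given in the abstract.

```python
from statistics import mean, pvariance


def fisher_score(values_a, values_b):
    """(mean_a - mean_b)^2 / (var_a + var_b) for one feature measured in two classes."""
    spread = pvariance(values_a) + pvariance(values_b)
    gap = (mean(values_a) - mean(values_b)) ** 2
    # Zero spread would divide by zero; treat such a feature as maximally separating.
    return gap / spread if spread > 0 else float("inf")


def rank_features(normal, pathologic):
    """Rank feature names by descending Fisher score; inputs map name -> list of values."""
    scores = {name: fisher_score(normal[name], pathologic[name]) for name in normal}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical features computed from the pupil rotation-angle signal.
normal = {"peak_freq": [1.0, 1.2, 0.8], "mean_angle": [5.0, 9.0, 1.0]}
pathologic = {"peak_freq": [3.0, 3.2, 2.8], "mean_angle": [6.0, 10.0, 2.0]}
print(rank_features(normal, pathologic))
```

Only the top-ranked features would then be passed to the SVM, which reduces the dimensionality of the classification problem.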

Keywords: nystagmus, vestibular neuritis, videonystagmographic system, VNG, Fisher criterion, support vector machine, SVM

Procedia PDF Downloads 126

1910 Machine Learning Techniques in Bank Credit Analysis

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner

Abstract:

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporary defaulters, meaning that the clients are overdue on their payments for up to 180 days. For each case, a total of 15 attributes were considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed so that the best result of each technique could be compared. Initially, the data were coded in thermometer code (for numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporary defaulter classification). However, the best accuracy does not always represent the best technique. For instance, in the classification of temporary defaulters, this technique was surpassed in terms of false positives by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.
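The two encodings mentioned in this abstract can be sketched as follows. The thresholds, category names, and the convention of one bit per threshold are assumptions for illustration; the abstract does not describe the bins used.

```python
def thermometer_code(value, thresholds):
    """One bit per threshold: 1 if the value exceeds it. Thresholds must be ascending."""
    return [1 if value > t else 0 for t in thresholds]


def dummy_code(value, categories):
    """One bit per category: 1 only at the position of the matching category."""
    return [1 if value == c else 0 for c in categories]


# Hypothetical attributes of one client record.
print(thermometer_code(250, [100, 200, 300]))                   # numerical attribute
print(dummy_code("retail", ["industry", "retail", "services"]))  # nominal attribute
```

Thermometer coding preserves order, since a larger value switches on a longer run of leading ones, while dummy coding imposes no order on the categories. Some dummy coding conventions drop one reference category and use one bit fewer; the version above keeps all categories.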

Keywords: artificial neural networks (ANNs), classifier algorithms, credit risk assessment, logistic regression, machine learning, support vector machines

Procedia PDF Downloads 89

1909 Machine Learning Approach for Yield Prediction in Semiconductor Production

Authors: Heramb Somthankar, Anujoy Chakraborty

Abstract:

This paper presents a classification study on yield prediction in semiconductor production using machine learning approaches. A complicated semiconductor production process is generally monitored continuously by signals acquired from sensors and measurement sites. A monitoring system contains a variety of signals, which together carry useful information, irrelevant information, and noise. When each signal is treated as a feature, feature selection is used to find the most relevant signals. The open-source UCI SECOM Dataset provides 1567 such samples, out of which 104 fail in quality assurance. Feature extraction and selection are performed on the dataset, and the useful signals are retained for further study. Afterward, common machine learning algorithms are employed to predict whether a sample passes or fails. The most suitable algorithm is selected for prediction based on the accuracy and loss of the ML model.
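
The class counts quoted above (104 failures among 1567 samples) imply a strongly imbalanced problem. The short sketch below works out the accuracy of a trivial classifier that always predicts "pass", as a reference point for reading accuracy figures. It is an illustrative calculation only; the abstract does not report this baseline, and balanced accuracy is brought in here as an assumption, not as a metric the authors used.

```python
def majority_baseline_accuracy(n_total, n_minority):
    """Accuracy obtained by always predicting the majority class."""
    return (n_total - n_minority) / n_total

def balanced_accuracy(tp, fn, tn, fp):
    """Mean of the recall on the positive class and the recall on the negative class."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

# Counts taken from the abstract: 1567 samples, 104 failures.
baseline = majority_baseline_accuracy(1567, 104)

# The same always-pass classifier, with "fail" as the positive class:
# it finds none of the 104 failures and all 1463 passes.
always_pass = balanced_accuracy(tp=0, fn=104, tn=1463, fp=0)
```

The baseline accuracy is about 93.4%, while the balanced accuracy of the same trivial classifier is 0.5, which is why accuracy alone can be misleading on this dataset.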

Keywords: deep learning, feature extraction, feature selection, machine learning classification algorithms, semiconductor production monitoring, signal processing, time-series analysis

Procedia PDF Downloads 95

1908 Pattern Recognition Based on Simulation of Chemical Senses (SCS)

Authors: Nermeen El Kashef, Yasser Fouad, Khaled Mahar

Abstract:

No AI-complete system can model the human brain or behavior without looking at the totality of the whole situation and incorporating a combination of senses. This paper proposes a Pattern Recognition model based on Simulation of Chemical Senses (SCS) for separation and classification of sign language. The model is based on the human taste controlling strategy. The main idea of the introduced model is motivated by the fact that the tongue first clusters an input substance into its basic tastes, and then the brain recognizes its flavor. To implement this strategy, a two-level architecture inspired by the taste system is proposed. The separation level of the architecture focuses on clustering hand postures, while the classification level recognizes the sign language. The efficiency of the proposed model is demonstrated experimentally by recognizing an American Sign Language (ASL) data set. The recognition accuracy obtained for ASL numbers is 92.9 percent.

Keywords: artificial intelligence, biocybernetics, gustatory system, sign language recognition, taste sense

Procedia PDF Downloads 276

1907 Unearthing Air Traffic Control Officers Decision Instructional Patterns From Simulator Data for Application in Human Machine Teams

Authors: Zainuddin Zakaria, Sun Woh Lye

Abstract:

Despite the continuous advancements in automated conflict resolution tools, there is still a low rate of adoption of automation among Air Traffic Control Officers (ATCOs). Trust in or acceptance of these tools and conformance to individual ATCO preferences in strategy execution for conflict resolution are two key factors that impact their use. This paper proposes a methodology to unearth and classify ATCO conflict resolution strategies from simulator data of trained and qualified ATCOs. The methodology involves the extraction of ATCO executive control actions and the establishment of a system of strategy resolution classification based on ATCO radar commands and prevailing flight parameters in deconflicting a pair of aircraft. Six main strategies used to handle various categories of conflict were identified and discussed. It was found that ATCOs were about twice as likely to choose only vertical maneuvers in conflict resolution as horizontal maneuvers or a combination of both vertical and horizontal maneuvers.

Keywords: air traffic control strategies, conflict resolution, simulator data, strategy classification system

Procedia PDF Downloads 131

1906 Orbit Determination from Two Position Vectors Using Finite Difference Method

Authors: Akhilesh Kumar, Sathyanarayan G., Nirmala S.

Abstract:

An unusual approach is developed to determine the orbits of satellites and space objects. Orbit determination is treated as a boundary value problem and solved using the finite difference method (FDM). Only the positions of the satellites/space objects at the two end times are known, and these are taken as boundary conditions. The finite difference technique is used to calculate the orbit between the end times. In this approach, the governing equation is the satellite's equation of motion with a perturbed acceleration. Using the finite difference method, the governing equations and boundary conditions are discretized. The resulting system of algebraic equations is solved using the Tri-Diagonal Matrix Algorithm (TDMA) until convergence is achieved. The methodology was tested and evaluated using all GPS satellite orbits from the National Geospatial-Intelligence Agency (NGA) precise product for DOY 125, 2023. For this purpose, twelve sets of two hours each were considered. Only the positions at the end times of each of the twelve sets are used as boundary conditions. The algorithm is applied to all GPS satellites, and the FDM results are compared with the NGA precise orbits. The maximum RSS error is 0.48 [m] for position and 0.43 [mm/sec] for velocity. The algorithm is also applied to the IRNSS satellites for DOY 220, 2023, where the maximum RSS error is 0.49 [m] for position and 0.28 [mm/sec] for velocity. Next, a simulation was carried out for a highly elliptical orbit for DOY 63, 2023, over a duration of 6 hours. The RSS of the difference is 0.92 [m] in position and 1.58 [mm/sec] in velocity for orbital speeds above 5 km/sec, whereas it is 0.13 [m] in position and 0.12 [mm/sec] in velocity for orbital speeds below 5 km/sec. The results show that the newly developed method is reliable and accurate.
Further applications of the developed methodology include missile and spacecraft targeting, orbit design (mission planning), space rendezvous and interception, space debris correlation, and navigation solutions.
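
The boundary value formulation described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses only two-body gravity (no perturbing accelerations), a straight-line initial guess, and a simple fixed-point iteration in which the central-difference equations are re-solved with the Thomas algorithm (TDMA) until the path stops changing. The function names, step count, and tolerance are hypothetical choices.

```python
import math

MU = 3.986004418e14  # Earth's gravitational parameter in m^3/s^2

def thomas(lower, diag, upper, rhs):
    """Solve a tridiagonal linear system with the Thomas algorithm (TDMA)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    c[0], d[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):  # forward elimination
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def two_body_accel(r):
    """Point-mass gravitational acceleration (assumption: no perturbations)."""
    norm = math.sqrt(sum(x * x for x in r))
    return [-MU * x / norm ** 3 for x in r]

def fd_orbit(r0, r1, duration, n_steps, tol=1e-3, max_iter=200):
    """Positions at n_steps + 1 equally spaced times between two known end positions."""
    h = duration / n_steps
    m = n_steps - 1  # number of unknown interior nodes
    # Initial guess: straight line between the two boundary positions.
    path = [[r0[k] + (r1[k] - r0[k]) * i / n_steps for k in range(3)]
            for i in range(n_steps + 1)]
    lower, diag, upper = [1.0] * m, [-2.0] * m, [1.0] * m
    for _ in range(max_iter):
        acc = [two_body_accel(p) for p in path]
        new_path = [list(p) for p in path]
        for k in range(3):
            # Central differences: r[i-1] - 2 r[i] + r[i+1] = h^2 * a(r[i])
            rhs = [h * h * acc[i][k] for i in range(1, n_steps)]
            rhs[0] -= r0[k]   # known boundary value moved to the right-hand side
            rhs[-1] -= r1[k]
            sol = thomas(lower, diag, upper, rhs)
            for i in range(1, n_steps):
                new_path[i][k] = sol[i - 1]
        change = max(abs(new_path[i][k] - path[i][k])
                     for i in range(n_steps + 1) for k in range(3))
        path = new_path
        if change < tol:  # converged: successive iterates agree to within tol metres
            break
    return path

# Example: a two-hour arc of a circular orbit at a GPS-like radius (illustrative values).
R = 26.56e6
n_mean = math.sqrt(MU / R ** 3)
T = 7200.0
start = [R, 0.0, 0.0]
end = [R * math.cos(n_mean * T), R * math.sin(n_mean * T), 0.0]
arc = fd_orbit(start, end, T, 240)
```

Because the discretized equations are nonlinear in the positions, the tridiagonal solve is repeated; for short arcs such as this two-hour example the fixed-point iteration contracts, but longer arcs may call for a Newton-type linearization instead.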

Keywords: finite difference method, grid generation, NavIC system, orbit perturbation

Procedia PDF Downloads 68

1905 Deep-Learning Coupled with Pragmatic Categorization Method to Classify the Urban Environment of the Developing World

Authors: Qianwei Cheng, A. K. M. Mahbubur Rahman, Anis Sarker, Abu Bakar Siddik Nayem, Ovi Paul, Amin Ahsan Ali, M. Ashraful Amin, Ryosuke Shibasaki, Moinul Zaber

Abstract:

Thomas Friedman, in his famous book, argued that the world in this 21st century is flat and will continue to be flatter. This is attributed to rapid globalization and the interdependence of humanity that engendered tremendous in-flow of human migration towards the urban spaces. In order to keep the urban environment sustainable, policy makers need to plan based on extensive analysis of the urban environment. With the advent of high definition satellite images, high resolution data, computational methods such as deep neural network analysis, and hardware capable of high-speed analysis, urban planning is seeing a paradigm shift. Legacy data on urban environments are now being complemented with high-volume, high-frequency data. However, the first step of understanding urban space lies in useful categorization of the space that is usable for data collection, analysis, and visualization. In this paper, we propose a pragmatic categorization method that is readily usable for machine analysis and show applicability of the methodology on a developing world setting. Categorization to plan sustainable urban spaces should encompass the buildings and their surroundings. However, the state-of-the-art is mostly dominated by classification of building structures, building types, etc. and largely represents the developed world. Hence, these methods and models are not sufficient for developing countries such as Bangladesh, where the surrounding environment is crucial for the categorization. Moreover, these categorizations propose small-scale classifications, which give limited information, have poor scalability and are slow to compute in real time. Our proposed method is divided into two steps: categorization and automation. We categorize the urban area in terms of informal and formal spaces and take the surrounding environment into account. A 50 km × 50 km Google Earth image of Dhaka, Bangladesh was visually annotated and categorized by an expert, and a map was subsequently drawn.
The categorization is based broadly on two dimensions: the state of urbanization and the architectural form of the urban environment. Consequently, the urban space is divided into four categories: 1) highly informal area; 2) moderately informal area; 3) moderately formal area; and 4) highly formal area. In total, sixteen sub-categories were identified. For semantic segmentation and automatic categorization, Google’s DeepLabV3+ model was used. The model uses the atrous convolution operation to analyze different layers of texture and shape. This allows us to enlarge the field of view of the filters to incorporate larger context. Imagery covering 70% of the urban space was used to train the model, and the remaining 30% was used for testing and validation. The model is able to segment with 75% accuracy and 60% Mean Intersection over Union (mIoU). The proposed categorization method is readily applicable for automatic use in both developing and developed world contexts. The method can be augmented for real-time socio-economic comparative analysis among cities. It can be an essential tool for policy makers to plan future sustainable urban spaces.
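
The two segmentation metrics reported above, pixel accuracy and mean Intersection over Union, can be computed from a confusion matrix as in the following minimal sketch. The labels are a made-up toy example over four classes (standing in for the four top-level categories); they are not the authors' data, and the helper names are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """cm[t][p] counts pixels whose true class is t and predicted class is p."""
    cm = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm

def pixel_accuracy(cm):
    """Fraction of pixels assigned to their true class."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def mean_iou(cm):
    """Average over classes of TP / (TP + FP + FN), skipping classes that never occur."""
    ious = []
    for c in range(len(cm)):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                  # true class c, predicted as something else
        fp = sum(row[c] for row in cm) - tp   # predicted c, true class is something else
        if tp + fp + fn:
            ious.append(tp / (tp + fp + fn))
    return sum(ious) / len(ious)

# Toy flattened label maps with classes 0..3 (illustrative only).
truth = [0, 0, 1, 1, 2, 2, 3, 3]
pred = [0, 1, 1, 1, 2, 3, 3, 3]
cm = confusion_matrix(truth, pred, 4)
acc = pixel_accuracy(cm)   # 6 of 8 pixels correct = 0.75
miou = mean_iou(cm)        # (1/2 + 2/3 + 1/2 + 2/3) / 4 = 7/12
```

In this toy case the accuracy is 0.75 while the mIoU is about 0.58, which shows why mIoU is typically the lower and stricter of the two figures: every misclassified pixel is penalized once as a false negative for its true class and once as a false positive for the predicted class.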

Keywords: semantic segmentation, urban environment, deep learning, urban building, classification

Procedia PDF Downloads 165

1904 Simulating the Surface Runoff for the Urbanized Watershed of Mula-Mutha River from Western Maharashtra, India

Authors: Anargha A. Dhorde, Deshpande Gauri, Amit G. Dhorde

Abstract:

The Mula-Mutha basin is one of the most rapidly urbanizing watersheds, in which two major urban centers, Pune and Pimpri-Chinchwad, have grown at a striking rate over the last two decades. Such changing land use/land cover (LULC) is prone to hydrological problems, and flash floods are a frequent eventuality in the lower reaches of the basin. The present research brings out the impact of varying LULC and impervious surfaces on urban surface hydrology and generates storm-runoff scenarios for the hydrological units. Two multi-temporal satellite images were processed, and supervised classification was performed with > 75% accuracy. The built-up area increased from 14.4% to 34.37% over the 28-year span, concentrated in and around the Pune-PCMC region. Impervious surfaces were obtained from population-calibrated multiple regression models. Almost 50% of the watershed area is impervious, which contributes to increased surface runoff and flash floods. The SCS-CN method was employed to calculate the surface runoff of the watershed. A statistical comparison between calculated and measured runoff values shows no significant difference. Increasing built-up and impervious surface areas due to rapid urbanization and industrialization may generate high runoff volumes in the basin, especially in the urbanized areas of the watershed and along the major transportation arteries. Simulations with 50 mm and 100 mm rainstorm depths show that most of the increase in runoff is confined to the highly urbanized areas. Considering the whole watershed area, 39 m³ of runoff was generated with 1'' of rainfall, whereas the urbanized areas of the basin alone (Pune and Pimpri-Chinchwad) generated 11154 m³ of runoff.
Such analysis is crucial in providing information on the intensity and location of runoff and flash floods, which is instrumental in formulating proper mitigation measures and rehabilitation strategies.
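
The SCS-CN calculation named in this abstract follows a standard closed-form relation between storm rainfall depth and direct runoff depth. The sketch below uses the common metric form with an initial abstraction of 0.2 S. The curve numbers and area fractions are hypothetical placeholders, since the abstract does not list the values used for the Mula-Mutha watershed.

```python
def scs_cn_runoff_mm(rain_mm, curve_number, ia_ratio=0.2):
    """Direct runoff depth (mm) from storm rainfall depth (mm) by the SCS-CN method."""
    s = 25400.0 / curve_number - 254.0  # potential maximum retention S, in mm
    ia = ia_ratio * s                   # initial abstraction Ia, in mm
    if rain_mm <= ia:
        return 0.0                      # all rainfall is abstracted, no runoff
    return (rain_mm - ia) ** 2 / (rain_mm - ia + s)

def composite_curve_number(fractions_and_cns):
    """Area-weighted curve number from (area fraction, CN) pairs."""
    return sum(f * cn for f, cn in fractions_and_cns)

# Hypothetical land-cover mix: 50% impervious (CN 98) and 50% pervious (CN 70).
cn_mixed = composite_curve_number([(0.5, 98), (0.5, 70)])

# Runoff depth for the two storm depths mentioned in the abstract.
q_50 = scs_cn_runoff_mm(50.0, cn_mixed)
q_100 = scs_cn_runoff_mm(100.0, cn_mixed)
```

A higher curve number means less retention, so a growing impervious fraction raises the composite CN and with it the runoff depth for the same storm. Multiplying the runoff depth by the contributing area converts it to a runoff volume.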

Keywords: land use/land cover, LULC, impervious surfaces, surface hydrology, storm-runoff scenarios

Procedia PDF Downloads 201

1903 Assimilating Multi-Mission Satellites Data into a Hydrological Model

Authors: Mehdi Khaki, Ehsan Forootan, Joseph Awange, Michael Kuhn

Abstract:

Terrestrial water storage, as a source of freshwater, plays an important role in human lives. Hydrological models offer important tools for simulating and predicting water storages at global and regional scales. However, their comparisons with 'reality' are imperfect mainly due to a high level of uncertainty in input data and limitations in accounting for all complex water cycle processes, uncertainties of (unknown) empirical model parameters, as well as the absence of high resolution (both spatially and temporally) data. Data assimilation can mitigate this drawback by incorporating new sets of observations into models. In this effort, we use multi-mission satellite-derived remotely sensed observations to improve the performance of World-Wide Water Resources Assessment system (W3RA) hydrological model for estimating terrestrial water storages. For this purpose, we assimilate total water storage (TWS) data from the Gravity Recovery And Climate Experiment (GRACE) and surface soil moisture data from the Advanced Microwave Scanning Radiometer for the Earth Observing System (AMSR-E) into W3RA. This is done to (i) improve model estimations of water stored in ground and soil moisture, and (ii) assess the impacts of each satellite of data (from GRACE and AMSR-E) and their combination on the final terrestrial water storage estimations. These data are assimilated into W3RA using the Ensemble Square-Root Filter (EnSRF) filtering technique over Mississippi Basin (the United States) and Murray-Darling Basin (Australia) between 2002 and 2013. In order to evaluate the results, independent ground-based groundwater and soil moisture measurements within each basin are used.

Keywords: data assimilation, GRACE, AMSR-E, hydrological model, EnSRF

Procedia PDF Downloads 271

1902 Reliability Analysis of Geometric Performance of Onboard Satellite Sensors: A Study on Location Accuracy

Authors: Ch. Sridevi, A. Chalapathi Rao, P. Srinivasulu

Abstract:

The location accuracy of data products is a critical parameter in assessing the geometric performance of satellite sensors. This study focuses on reliability analysis of onboard sensors to evaluate their location accuracy over time. The analysis utilizes field failure data and employs the Weibull distribution to determine the reliability and, in turn, to understand the improvements or degradations over a period of time. The analysis begins by scrutinizing the location accuracy error, which is the root mean square (RMS) error of the differences between ground control point coordinates observed on the product and on the map, and by identifying the failure data with reference to time. A significant challenge in this study is to thoroughly analyze the possibility of an infant mortality phase in the data. To address this, the Weibull distribution is utilized to determine whether the data exhibit an infant stage or have transitioned into the operational phase. The shape parameter beta plays a crucial role in identifying this stage. Additionally, determining the exact start of the operational phase and the end of the infant stage poses another challenge, as it is crucial to eliminate residual infant mortality or wear-out from the model, since either can significantly increase the total failure rate. To address this, an approach utilizing the well-established statistical Laplace test is applied to infer the behavior of sensors and to accurately ascertain the duration of different phases in the lifetime and the time required for stabilization. This approach also helps in understanding whether the bathtub curve model, which accounts for the different phases in the lifetime of a product, is appropriate for the data and whether the thresholds for the infant period and wear-out phase are accurately estimated, by validating the data in individual phases with Weibull distribution curve fitting analysis.
Once the operational phase is determined, reliability is assessed using Weibull analysis. This analysis not only provides insights into the reliability of individual sensors with regard to location accuracy over the required period of time, but also establishes a model that can be applied to automate similar analyses for various sensors and parameters using field failure data. Furthermore, the identification of the best-performing sensor through this analysis serves as a benchmark for future missions and designs, ensuring continuous improvement in sensor performance and reliability. Overall, this study provides a methodology to accurately determine the duration of different phases in the life data of individual sensors. It enables an assessment of the time required for stabilization and provides insights into the reliability during the operational phase and the commencement of the wear-out phase. By employing this methodology, designers can make informed decisions regarding sensor performance with regard to location accuracy, contributing to enhanced accuracy in satellite-based applications.
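
The two statistical tools named in this abstract can be sketched with their textbook formulas. This is an illustration under assumptions, not the authors' procedure: the Laplace statistic below is the time-truncated form, the Weibull functions use the two-parameter form with shape beta and scale eta, and the failure times and parameter values are invented for the example.

```python
import math

def laplace_trend_statistic(failure_times, observation_end):
    """Laplace test statistic for failure times observed over (0, observation_end].

    Strongly negative values indicate failures clustered early (improving reliability,
    as in an infant mortality phase); strongly positive values indicate failures
    clustered late (deterioration, as in wear-out).
    """
    n = len(failure_times)
    mean_time = sum(failure_times) / n
    return (mean_time - observation_end / 2.0) / (observation_end * math.sqrt(1.0 / (12.0 * n)))

def weibull_reliability(t, beta, eta):
    """Probability of surviving beyond time t for a two-parameter Weibull model."""
    return math.exp(-((t / eta) ** beta))

def weibull_hazard(t, beta, eta):
    """Instantaneous failure rate: decreasing for beta < 1, constant for beta = 1, increasing for beta > 1."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)

# Invented failure times (days) clustered near the start of a 1000-day observation window.
early_failures = [10.0, 20.0, 30.0, 40.0, 50.0]
u_stat = laplace_trend_statistic(early_failures, 1000.0)

# Invented Weibull parameters; beta < 1 corresponds to the infant mortality regime.
rate_day_1 = weibull_hazard(1.0, beta=0.5, eta=100.0)
rate_day_10 = weibull_hazard(10.0, beta=0.5, eta=100.0)
```

Under the null hypothesis of no trend the statistic is approximately standard normal, so values below about -1.96 or above about 1.96 are commonly read as a significant trend at the 5% level.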

Keywords: bathtub curve, geometric performance, Laplace test, location accuracy, reliability analysis, Weibull analysis

Procedia PDF Downloads 57

1901 Analysis of Sediment Distribution around Karang Sela Coral Reef Using Multibeam Backscatter

Authors: Razak Zakariya, Fazliana Mustajap, Lenny Sharinee Sakai

Abstract:

A sediment map is an important resource in the marine environment, and the sediment itself contains a wealth of information that can be used for other research. This study was conducted using a Reson T20 multibeam echo sounder on 15 August 2020 at Karang Sela (a coral reef area) at Pulau Bidong. The study aims to identify the sediment types around the coral reef using bathymetry and backscatter data. Sediment in the study area was collected as ground-truthing data to verify the classification of the seabed. A dry sieving method with a sieve shaker was used to analyze the sediment samples. PDS 2000 software was used for data acquisition, Qimera QPS version 2.4.5 was used for processing the bathymetry data, and FMGT QPS version 7.10 was used for processing the backscatter data. The backscatter data were then analyzed using the maximum likelihood classification tool in ArcGIS version 10.8. The results identified three types of sediment around the coral: very coarse sand, coarse sand, and medium sand.

Keywords: sediment type, MBES echo sounder, backscatter, ArcGIS

Procedia PDF Downloads 65

1900 Geostatistical Models to Correct Salinity of Soils from Landsat Satellite Sensor: Application to the Oran Region, Algeria

Authors: Dehni Abdellatif, Lounis Mourad

Abstract:

Applying spatial geostatistics, as used in materials science, precision agriculture, and agricultural statistics, makes it possible to manage and monitor water and groundwater quality in relation to salt-affected soil. Earlier work on data acquisition and spatial preparation of optical and multispectral data has facilitated the integration of models that correct electrical conductivity for soil temperature (soil horizons). For tomographic interpretation, this physical parameter was extracted by calibrating the thermal band (LANDSAT ETM+6) with a radiometric correction. The study area is the Oran region (northwestern Algeria). Several spectral indices are determined, such as the salinity and sodicity index, the Combined Spectral Reflectance Index (CSRI), the Normalized Difference Vegetation Index (NDVI), emissivity, albedo, and the Sodium Adsorption Ratio (SAR). Geostatistical modeling of electrical conductivity (salinity) appears to be a useful decision support system for estimating electrical resistivity corrected for the temperature of surface soils, using conversion models that substitute a reference temperature of 25°C (the constraint under which hydrochemical data are collected). The brightness temperatures extracted from satellite reflectance (LANDSAT ETM+) are used in consistency models to estimate electrical resistivity. The confusion arising from the effects of salt stress and water stress is removed, followed by seasonal application of geostatistical analysis with Geographic Information System (GIS) techniques to investigate and monitor the variation of electrical conductivity in the alluvial aquifer of Es-Sénia for the salt-affected soil.

Keywords: geostatistical modelling, landsat, brightness temperature, conductivity

Procedia PDF Downloads 427

1899 Classification of Political Affiliations by Reduced Number of Features

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

With the evolution of technology, the expression of opinions has shifted to the digital world. Politics, one of the hottest topics in opinion mining research, is combined here with behavior analysis to determine political affiliation in text, which is the subject of this paper. This study aims to classify text in news/blogs as either Republican or Democrat with the minimum number of features. As an initial set, 68 features, 64 of which are Linguistic Inquiry and Word Count (LIWC) features, are tested against 14 benchmark classification algorithms. In later experiments, the dimension of the feature vector is reduced using 7 feature selection algorithms. The results show that the Decision Tree, Rule Induction, and M5 Rule classifiers, when used with the SVM and IGR feature selection algorithms, performed best, with up to 82.5% accuracy on the given dataset. Further tests on a single feature and on the linguistic-based feature sets showed similar results. The feature “function”, an aggregate feature of the linguistic category, was found to be the most differentiating of the 68 features, achieving 81% accuracy by itself in classifying articles as either Republican or Democrat.

Keywords: feature selection, LIWC, machine learning, politics

Procedia PDF Downloads 370