Search results for: classification and regression tree
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5699

Search results for: classification and regression tree

5519 Faults Diagnosis by Thresholding and Decision tree with Neuro-Fuzzy System

Authors: Y. Kourd, D. Lefebvre

Abstract:

The monitoring of industrial processes is required to ensure operating conditions of industrial systems through automatic detection and isolation of faults. This paper proposes a method of fault diagnosis based on a neuro-fuzzy hybrid structure. This hybrid structure combines the selection of threshold and decision tree. The validation of this method is obtained with the DAMADICS benchmark. In the first phase of the method, a model will be constructed that represents the normal state of the system to fault detection. Signatures of the faults are obtained with residuals analysis and selection of appropriate thresholds. These signatures provide groups of non-separable faults. In the second phase, we build faulty models to see the flaws in the system that cannot be isolated in the first phase. In the latest phase we construct the tree that isolates these faults.

Keywords: decision tree, residuals analysis, ANFIS, fault diagnosis

Procedia PDF Downloads 593
5518 Allometric Models for Biomass Estimation in Savanna Woodland Area, Niger State, Nigeria

Authors: Abdullahi Jibrin, Aishetu Abdulkadir

Abstract:

The development of allometric models is crucial to accurate forest biomass/carbon stock assessment. The aim of this study was to develop a set of biomass prediction models that will enable the determination of total tree aboveground biomass for savannah woodland area in Niger State, Nigeria. Based on the data collected through biometric measurements of 1816 trees and destructive sampling of 36 trees, five species specific and one site specific models were developed. The sample size was distributed equally between the five most dominant species in the study site (Vitellaria paradoxa, Irvingia gabonensis, Parkia biglobosa, Anogeissus leiocarpus, Pterocarpus erinaceous). Firstly, the equations were developed for five individual species. Secondly these five species were mixed and were used to develop an allometric equation of mixed species. Overall, there was a strong positive relationship between total tree biomass and the stem diameter. The coefficient of determination (R2 values) ranging from 0.93 to 0.99 P < 0.001 were realised for the models; with considerable low standard error of the estimates (SEE) which confirms that the total tree above ground biomass has a significant relationship with the dbh. The F-test value for the biomass prediction models were also significant at p < 0.001 which indicates that the biomass prediction models are valid. This study recommends that for improved biomass estimates in the study site, the site specific biomass models should preferably be used instead of using generic models.

Keywords: allometriy, biomass, carbon stock , model, regression equation, woodland, inventory

Procedia PDF Downloads 419
5517 Analytical Authentication of Butter Using Fourier Transform Infrared Spectroscopy Coupled with Chemometrics

Authors: M. Bodner, M. Scampicchio

Abstract:

Fourier Transform Infrared (FT-IR) spectroscopy coupled with chemometrics was used to distinguish between butter samples and non-butter samples. Further, quantification of the content of margarine in adulterated butter samples was investigated. Fingerprinting region (1400-800 cm–1) was used to develop unsupervised pattern recognition (Principal Component Analysis, PCA), supervised modeling (Soft Independent Modelling by Class Analogy, SIMCA), classification (Partial Least Squares Discriminant Analysis, PLS-DA) and regression (Partial Least Squares Regression, PLS-R) models. PCA of the fingerprinting region shows a clustering of the two sample types. All samples were classified in their rightful class by SIMCA approach; however, nine adulterated samples (between 1% and 30% w/w of margarine) were classified as belonging both at the butter class and at the non-butter one. In the two-class PLS-DA model’s (R2 = 0.73, RMSEP, Root Mean Square Error of Prediction = 0.26% w/w) sensitivity was 71.4% and Positive Predictive Value (PPV) 100%. Its threshold was calculated at 7% w/w of margarine in adulterated butter samples. Finally, PLS-R model (R2 = 0.84, RMSEP = 16.54%) was developed. PLS-DA was a suitable classification tool and PLS-R a proper quantification approach. Results demonstrate that FT-IR spectroscopy combined with PLS-R can be used as a rapid, simple and safe method to identify pure butter samples from adulterated ones and to determine the grade of adulteration of margarine in butter samples.

Keywords: adulterated butter, margarine, PCA, PLS-DA, PLS-R, SIMCA

Procedia PDF Downloads 114
5516 Improve B-Tree Index’s Performance Using Lock-Free Hash Table

Authors: Zhanfeng Ma, Zhiping Xiong, Hu Yin, Zhengwei She, Aditya P. Gurajada, Tianlun Chen, Ying Li

Abstract:

Many RDBMS vendors use B-tree index to achieve high performance for point queries and range queries, and some of them also employ hash index to further enhance the performance as hash table is more efficient for point queries. However, there are extra overheads to maintain a separate hash index, for example, hash mapping for all data records must always be maintained, which results in more memory space consumption; locking, logging and other mechanisms are needed to guarantee ACID, which affects the concurrency and scalability of the system. To relieve the overheads, Hash Cached B-tree (HCB) index is proposed in this paper, which consists of a standard disk-based B-tree index and an additional in-memory lock-free hash table. Initially, only the B-tree index is constructed for all data records, the hash table is built on the fly based on runtime workload, only data records accessed by point queries are indexed using hash table, this helps reduce the memory footprint. Changes to hash table are done using compare-and-swap (CAS) without performing locking and logging, this helps improve the concurrency and avoid contention. The hash table is also optimized to be cache conscious. HCB index is implemented in SAP ASE database, compared with the standard B-tree index, early experiments and customer adoptions show significant performance improvement. This paper provides an overview of the design of HCB index and reports the experimental results.

Keywords: B-tree, compare-and-swap, lock-free hash table, point queries, range queries, SAP ASE database

Procedia PDF Downloads 258
5515 Probing Syntax Information in Word Representations with Deep Metric Learning

Authors: Bowen Ding, Yihao Kuang

Abstract:

In recent years, with the development of large-scale pre-trained lan-guage models, building vector representations of text through deep neural network models has become a standard practice for natural language processing tasks. From the performance on downstream tasks, we can know that the text representation constructed by these models contains linguistic information, but its encoding mode and extent are unclear. In this work, a structural probe is proposed to detect whether the vector representation produced by a deep neural network is embedded with a syntax tree. The probe is trained with the deep metric learning method, so that the distance between word vectors in the metric space it defines encodes the distance of words on the syntax tree, and the norm of word vectors encodes the depth of words on the syntax tree. The experiment results on ELMo and BERT show that the syntax tree is encoded in their parameters and the word representations they produce.

Keywords: deep metric learning, syntax tree probing, natural language processing, word representations

Procedia PDF Downloads 32
5514 Image Processing Approach for Detection of Three-Dimensional Tree-Rings from X-Ray Computed Tomography

Authors: Jorge Martinez-Garcia, Ingrid Stelzner, Joerg Stelzner, Damian Gwerder, Philipp Schuetz

Abstract:

Tree-ring analysis is an important part of the quality assessment and the dating of (archaeological) wood samples. It provides quantitative data about the whole anatomical ring structure, which can be used, for example, to measure the impact of the fluctuating environment on the tree growth, for the dendrochronological analysis of archaeological wooden artefacts and to estimate the wood mechanical properties. Despite advances in computer vision and edge recognition algorithms, detection and counting of annual rings are still limited to 2D datasets and performed in most cases manually, which is a time consuming, tedious task and depends strongly on the operator’s experience. This work presents an image processing approach to detect the whole 3D tree-ring structure directly from X-ray computed tomography imaging data. The approach relies on a modified Canny edge detection algorithm, which captures fully connected tree-ring edges throughout the measured image stack and is validated on X-ray computed tomography data taken from six wood species.

Keywords: ring recognition, edge detection, X-ray computed tomography, dendrochronology

Procedia PDF Downloads 184
5513 The Role of Urban Development Patterns for Mitigating Extreme Urban Heat: The Case Study of Doha, Qatar

Authors: Yasuyo Makido, Vivek Shandas, David J. Sailor, M. Salim Ferwati

Abstract:

Mitigating extreme urban heat is challenging in a desert climate such as Doha, Qatar, since outdoor daytime temperature area often too high for the human body to tolerate. Recent studies demonstrate that cities in arid and semiarid areas can exhibit ‘urban cool islands’ - urban areas that are cooler than the surrounding desert. However, the variation of temperatures as a result of the time of day and factors leading to temperature change remain at the question. To address these questions, we examined the spatial and temporal variation of air temperature in Doha, Qatar by conducting multiple vehicle-base local temperature observations. We also employed three statistical approaches to model surface temperatures using relevant predictors: (1) Ordinary Least Squares, (2) Regression Tree Analysis and (3) Random Forest for three time periods. Although the most important determinant factors varied by day and time, distance to the coast was the significant determinant at midday. A 70%/30% holdout method was used to create a testing dataset to validate the results through Pearson’s correlation coefficient. The Pearson’s analysis suggests that the Random Forest model more accurately predicts the surface temperatures than the other methods. We conclude with recommendations about the types of development patterns that show the greatest potential for reducing extreme heat in air climates.

Keywords: desert cities, tree-structure regression model, urban cool Island, vehicle temperature traverse

Procedia PDF Downloads 365
5512 Performance Analysis of Artificial Neural Network Based Land Cover Classification

Authors: Najam Aziz, Nasru Minallah, Ahmad Junaid, Kashaf Gul

Abstract:

Landcover classification using automated classification techniques, while employing remotely sensed multi-spectral imagery, is one of the promising areas of research. Different land conditions at different time are captured through satellite and monitored by applying different classification algorithms in specific environment. In this paper, a SPOT-5 image provided by SUPARCO has been studied and classified in Environment for Visual Interpretation (ENVI), a tool widely used in remote sensing. Then, Artificial Neural Network (ANN) classification technique is used to detect the land cover changes in Abbottabad district. Obtained results are compared with a pixel based Distance classifier. The results show that ANN gives the better overall accuracy of 99.20% and Kappa coefficient value of 0.98 over the Mahalanobis Distance Classifier.

Keywords: landcover classification, artificial neural network, remote sensing, SPOT 5

Procedia PDF Downloads 502
5511 Prediction of Energy Storage Areas for Static Photovoltaic System Using Irradiation and Regression Modelling

Authors: Kisan Sarda, Bhavika Shingote

Abstract:

This paper aims to evaluate regression modelling for prediction of Energy storage of solar photovoltaic (PV) system using Semi parametric regression techniques because there are some parameters which are known while there are some unknown parameters like humidity, dust etc. Here irradiation of solar energy is different for different places on the basis of Latitudes, so by finding out areas which give more storage we can implement PV systems at those places and our need of energy will be fulfilled. This regression modelling is done for daily, monthly and seasonal prediction of solar energy storage. In this, we have used R modules for designing the algorithm. This algorithm will give the best comparative results than other regression models for the solar PV cell energy storage.

Keywords: semi parametric regression, photovoltaic (PV) system, regression modelling, irradiation

Procedia PDF Downloads 350
5510 Urban and Rural Children’s Knowledge on Biodiversity in Bizkaia: Tree Identification Skills and Animal and Plant Listing

Authors: Joserra Díez, Ainhoa Meñika, Iñaki Sanz-Azkue, Arritokieta Ortuzar

Abstract:

Biodiversity provides humans with a great range of ecosystemic services; it is therefore an indispensable resource and a legacy to coming generations. However, in the last decades, the increasing exploitation of the Planet has caused a great loss of biodiversity and its acquaintance has decreased remarkably; especially in urbanized areas, due to the decreasing attachment of humans to nature. Yet, the Primary Education curriculum primes the identification of flora and fauna to guarantee the knowledge of children on their surroundings, so that they care for the environment as well as for themselves. In order to produce effective didactic material that meets the needs of both teachers and pupils, it is fundamental to diagnose the current situation. In the present work, the knowledge on biodiversity of 3rd cycle Primary Education students in Biscay (n=98) and its relation to the size of the town/city of their school is discussed. Two tests have been used with such aim: one for tree identification and the other one so that the students enumerated the species of trees and animals they knew. Results reveal that knowledge of students on tree identification is scarce regardless the size of the city/town and of their school. On the other hand, animal species are better known than tree species.

Keywords: biodiversity, population, tree identification, animal identification

Procedia PDF Downloads 153
5509 Scene Classification Using Hierarchy Neural Network, Directed Acyclic Graph Structure, and Label Relations

Authors: Po-Jen Chen, Jian-Jiun Ding, Hung-Wei Hsu, Chien-Yao Wang, Jia-Ching Wang

Abstract:

A more accurate scene classification algorithm using label relations and the hierarchy neural network was developed in this work. In many classification algorithms, it is assumed that the labels are mutually exclusive. This assumption is true in some specific problems, however, for scene classification, the assumption is not reasonable. Because there are a variety of objects with a photo image, it is more practical to assign multiple labels for an image. In this paper, two label relations, which are exclusive relation and hierarchical relation, were adopted in the classification process to achieve more accurate multiple label classification results. Moreover, the hierarchy neural network (hierarchy NN) is applied to classify the image and the directed acyclic graph structure is used for predicting a more reasonable result which obey exclusive and hierarchical relations. Simulations show that, with these techniques, a much more accurate scene classification result can be achieved.

Keywords: convolutional neural network, label relation, hierarchy neural network, scene classification

Procedia PDF Downloads 423
5508 Automated Prediction of HIV-associated Cervical Cancer Patients Using Data Mining Techniques for Survival Analysis

Authors: O. J. Akinsola, Yinan Zheng, Rose Anorlu, F. T. Ogunsola, Lifang Hou, Robert Leo-Murphy

Abstract:

Cervical Cancer (CC) is the 2nd most common cancer among women living in low and middle-income countries, with no associated symptoms during formative periods. With the advancement and innovative medical research, there are numerous preventive measures being utilized, but the incidence of cervical cancer cannot be truncated with the application of only screening tests. The mortality associated with this invasive cervical cancer can be nipped in the bud through the important role of early-stage detection. This study research selected an array of different top features selection techniques which was aimed at developing a model that could validly diagnose the risk factors of cervical cancer. A retrospective clinic-based cohort study was conducted on 178 HIV-associated cervical cancer patients in Lagos University teaching Hospital, Nigeria (U54 data repository) in April 2022. The outcome measure was the automated prediction of the HIV-associated cervical cancer cases, while the predictor variables include: demographic information, reproductive history, birth control, sexual history, cervical cancer screening history for invasive cervical cancer. The proposed technique was assessed with R and Python programming software to produce the model by utilizing the classification algorithms for the detection and diagnosis of cervical cancer disease. Four machine learning classification algorithms used are: the machine learning model was split into training and testing dataset into ratio 80:20. The numerical features were also standardized while hyperparameter tuning was carried out on the machine learning to train and test the data. Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), and K-Nearest Neighbor (KNN). Some fitting features were selected for the detection and diagnosis of cervical cancer diseases from selected characteristics in the dataset using the contribution of various selection methods for the classification cervical cancer into healthy or diseased status. The mean age of patients was 49.7±12.1 years, mean age at pregnancy was 23.3±5.5 years, mean age at first sexual experience was 19.4±3.2 years, while the mean BMI was 27.1±5.6 kg/m2. A larger percentage of the patients are Married (62.9%), while most of them have at least two sexual partners (72.5%). Age of patients (OR=1.065, p<0.001**), marital status (OR=0.375, p=0.011**), number of pregnancy live-births (OR=1.317, p=0.007**), and use of birth control pills (OR=0.291, p=0.015**) were found to be significantly associated with HIV-associated cervical cancer. On top ten 10 features (variables) considered in the analysis, RF claims the overall model performance, which include: accuracy of (72.0%), the precision of (84.6%), a recall of (84.6%) and F1-score of (74.0%) while LR has: an accuracy of (74.0%), precision of (70.0%), recall of (70.0%) and F1-score of (70.0%). The RF model identified 10 features predictive of developing cervical cancer. The age of patients was considered as the most important risk factor, followed by the number of pregnancy livebirths, marital status, and use of birth control pills, The study shows that data mining techniques could be used to identify women living with HIV at high risk of developing cervical cancer in Nigeria and other sub-Saharan African countries.

Keywords: associated cervical cancer, data mining, random forest, logistic regression

Procedia PDF Downloads 52
5507 The Influence of Forest Management Histories on Dead and Habitat Trees in the Old Growth Forest in Northern Iran

Authors: Kiomars Sefidi

Abstract:

Dead and habitat tree such as fallen logs, snags, stumps and cracks and loos bark etc. is regarded as an important ecological component of forests on which many forest dwelling species depend, yet its relation to management history in Caspian forest has gone unreported. The aim of research was to compare the amounts of dead tree and habitat in the forests with historically different intensities of management, including: forests with the long term implication of management (PS), the short-term implication of management (NS) which were compared with semi virgin forest (GS). The number of 405 individual dead and habitat trees were recorded and measured at 109 sampling locations. ANOVA revealed volume of the dead tree in the form and decay classes significantly differ within sites and dead volume in the semi virgin forest significantly higher than managed sites. Comparing the amount of dead and habitat tree in three sites showed that dead tree volume related with management history and significantly differ in three study sites. Also, the numbers of habitat trees including cavities, Cracks and loose bark and Fork split trees significantly vary among sites. Reaching their highest in virgin site and their lowest in the site with the long term implication of management, it was concluded that forest management cause reduction of the amount of dead and habitat tree. Forest management history affect the forest's ability to generate dead tree especially in a large size, thus managing this forest according to ecological sustainable principles require a commitment to maintaining stand structure that allow, continued generation of dead tree in a full range of size.

Keywords: forest biodiversity, cracks trees, fork split trees, sustainable management, Fagus orientalis, Iran

Procedia PDF Downloads 531
5506 New Segmentation of Piecewise Linear Regression Models Using Reversible Jump MCMC Algorithm

Authors: Suparman

Abstract:

Piecewise linear regression models are very flexible models for modeling the data. If the piecewise linear regression models are matched against the data, then the parameters are generally not known. This paper studies the problem of parameter estimation of piecewise linear regression models. The method used to estimate the parameters of picewise linear regression models is Bayesian method. But the Bayes estimator can not be found analytically. To overcome these problems, the reversible jump MCMC algorithm is proposed. Reversible jump MCMC algorithm generates the Markov chain converges to the limit distribution of the posterior distribution of the parameters of picewise linear regression models. The resulting Markov chain is used to calculate the Bayes estimator for the parameters of picewise linear regression models.

Keywords: regression, piecewise, Bayesian, reversible Jump MCMC

Procedia PDF Downloads 489
5505 Competitors’ Influence Analysis of a Retailer by Using Customer Value and Huff’s Gravity Model

Authors: Yepeng Cheng, Yasuhiko Morimoto

Abstract:

Customer relationship analysis is vital for retail stores, especially for supermarkets. The point of sale (POS) systems make it possible to record the daily purchasing behaviors of customers as an identification point of sale (ID-POS) database, which can be used to analyze customer behaviors of a supermarket. The customer value is an indicator based on ID-POS database for detecting the customer loyalty of a store. In general, there are many supermarkets in a city, and other nearby competitor supermarkets significantly affect the customer value of customers of a supermarket. However, it is impossible to get detailed ID-POS databases of competitor supermarkets. This study firstly focused on the customer value and distance between a customer's home and supermarkets in a city, and then constructed the models based on logistic regression analysis to analyze correlations between distance and purchasing behaviors only from a POS database of a supermarket chain. During the modeling process, there are three primary problems existed, including the incomparable problem of customer values, the multicollinearity problem among customer value and distance data, and the number of valid partial regression coefficients. The improved customer value, Huff’s gravity model, and inverse attractiveness frequency are considered to solve these problems. This paper presents three types of models based on these three methods for loyal customer classification and competitors’ influence analysis. In numerical experiments, all types of models are useful for loyal customer classification. The type of model, including all three methods, is the most superior one for evaluating the influence of the other nearby supermarkets on customers' purchasing of a supermarket chain from the viewpoint of valid partial regression coefficients and accuracy.

Keywords: customer value, Huff's Gravity Model, POS, Retailer

Procedia PDF Downloads 93
5504 An Investigation into Fraud Detection in Financial Reporting Using Sugeno Fuzzy Classification

Authors: Mohammad Sarchami, Mohsen Zeinalkhani

Abstract:

Always, financial reporting system faces some problems to win public ear. The increase in the number of fraud and representation, often combined with the bankruptcy of large companies, has raised concerns about the quality of financial statements. So, investors, legislators, managers, and auditors have focused on significant fraud detection or prevention in financial statements. This article aims to investigate the Sugeno fuzzy classification to consider fraud detection in financial reporting of accepted firms by Tehran stock exchange. The hypothesis is: Sugeno fuzzy classification may detect fraud in financial reporting by financial ratio. Hypothesis was tested using Matlab software. Accuracy average was 81/80 in Sugeno fuzzy classification; so the hypothesis was confirmed.

Keywords: fraud, financial reporting, Sugeno fuzzy classification, firm

Procedia PDF Downloads 219
5503 About the Case Portfolio Management Algorithms and Their Applications

Authors: M. Chumburidze, N. Salia, T. Namchevadze

Abstract:

This work deal with case processing problems in business. The task of strategic credit requirements management of cases portfolio is discussed. The information model of credit requirements in a binary tree diagram is considered. The algorithms to solve issues of prioritizing clusters of cases in business have been investigated. An implementation of priority queues to support case management operations has been presented. The corresponding pseudo codes for the programming application have been constructed. The tools applied in this development are based on binary tree ordering algorithms, optimization theory, and business management methods.

Keywords: credit network, case portfolio, binary tree, priority queue, stack

Procedia PDF Downloads 92
5502 Effect of Personality Traits on Classification of Political Orientation

Authors: Vesile Evrim, Aliyu Awwal

Abstract:

Today as in the other domains, there are an enormous number of political transcripts available in the Web which is waiting to be mined and used for various purposes such as statistics and recommendations. Therefore, automatically determining the political orientation on these transcripts becomes crucial. The methodologies used by machine learning algorithms to do the automatic classification are based on different features such as Linguistic. Considering the ideology differences between Liberals and Conservatives, in this paper, the effect of Personality Traits on political orientation classification is studied. This is done by considering the correlation between LIWC features and the BIG Five Personality Traits. Several experiments are conducted on Convote U.S. Congressional-Speech dataset with seven benchmark classification algorithms. The different methodologies are applied on selecting different feature sets that constituted by 8 to 64 varying number of features. While Neuroticism is obtained to be the most differentiating personality trait on classification of political polarity, when its top 10 representative features are combined with several classification algorithms, it outperformed the results presented in previous research.

Keywords: politics, personality traits, LIWC, machine learning

Procedia PDF Downloads 465
5501 Application Difference between Cox and Logistic Regression Models

Authors: Idrissa Kayijuka

Abstract:

The logistic regression and Cox regression models (proportional hazard model) at present are being employed in the analysis of prospective epidemiologic research looking into risk factors in their application on chronic diseases. However, a theoretical relationship between the two models has been studied. By definition, Cox regression model also called Cox proportional hazard model is a procedure that is used in modeling data regarding time leading up to an event where censored cases exist. Whereas the Logistic regression model is mostly applicable in cases where the independent variables consist of numerical as well as nominal values while the resultant variable is binary (dichotomous). Arguments and findings of many researchers focused on the overview of Cox and Logistic regression models and their different applications in different areas. In this work, the analysis is done on secondary data whose source is SPSS exercise data on BREAST CANCER with a sample size of 1121 women where the main objective is to show the application difference between Cox regression model and logistic regression model based on factors that cause women to die due to breast cancer. Thus we did some analysis manually i.e. on lymph nodes status, and SPSS software helped to analyze the mentioned data. This study found out that there is an application difference between Cox and Logistic regression models which is Cox regression model is used if one wishes to analyze data which also include the follow-up time whereas Logistic regression model analyzes data without follow-up-time. Also, they have measurements of association which is different: hazard ratio and odds ratio for Cox and logistic regression models respectively. A similarity between the two models is that they are both applicable in the prediction of the upshot of a categorical variable i.e. a variable that can accommodate only a restricted number of categories. In conclusion, Cox regression model differs from logistic regression by assessing a rate instead of proportion. The two models can be applied in many other researches since they are suitable methods for analyzing data but the more recommended is the Cox, regression model.

Keywords: logistic regression model, Cox regression model, survival analysis, hazard ratio

Procedia PDF Downloads 423
5500 Machine Learning Techniques in Bank Credit Analysis

Authors: Fernanda M. Assef, Maria Teresinha A. Steiner

Abstract:

The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and finally Support Vector Machines (SVM). For each method, different parameters were analyzed in order to obtain different results when the best of each technique was compared. Initially the data were coded in thermometer code (numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications. All these intrinsic details are discussed considering the results found, and an overview of what was presented is shown in the conclusion of this study.

Keywords: artificial neural networks (ANNs), classifier algorithms, credit risk assessment, logistic regression, machine Learning, support vector machines

Procedia PDF Downloads 76
5499 Automatic Adult Age Estimation Using Deep Learning of the ResNeXt Model Based on CT Reconstruction Images of the Costal Cartilage

Authors: Ting Lu, Ya-Ru Diao, Fei Fan, Ye Xue, Lei Shi, Xian-e Tang, Meng-jun Zhan, Zhen-hua Deng

Abstract:

Accurate adult age estimation (AAE) is a significant and challenging task in forensic and archeology fields. Attempts have been made to explore optimal adult age metrics, and the rib is considered a potential age marker. The traditional way is to extract age-related features designed by experts from macroscopic or radiological images followed by classification or regression analysis. Those results still have not met the high-level requirements for practice, and the limitation of using feature design and manual extraction methods is loss of information since the features are likely not designed explicitly for extracting information relevant to age. Deep learning (DL) has recently garnered much interest in imaging learning and computer vision. It enables learning features that are important without a prior bias or hypothesis and could be supportive of AAE. This study aimed to develop DL models for AAE based on CT images and compare their performance to the manual visual scoring method. Chest CT data were reconstructed using volume rendering (VR). Retrospective data of 2500 patients aged 20.00-69.99 years were obtained between December 2019 and September 2021. Five-fold cross-validation was performed, and datasets were randomly split into training and validation sets in a 4:1 ratio for each fold. Before feeding the inputs into networks, all images were augmented with random rotation and vertical flip, normalized, and resized to 224×224 pixels. ResNeXt was chosen as the DL baseline due to its advantages of higher efficiency and accuracy in image classification. Mean absolute error (MAE) was the primary parameter. Independent data from 100 patients acquired between March and April 2022 were used as a test set. The manual method completely followed the prior study, which reported the lowest MAEs (5.31 in males and 6.72 in females) among similar studies. CT data and VR images were used. The radiation density of the first costal cartilage was recorded using CT data on the workstation. The osseous and calcified projections of the 1 to 7 costal cartilages were scored based on VR images using an eight-stage staging technique. According to the results of the prior study, the optimal models were the decision tree regression model in males and the stepwise multiple linear regression equation in females. Predicted ages of the test set were calculated separately using different models by sex. A total of 2600 patients (training and validation sets, mean age=45.19 years±14.20 [SD]; test set, mean age=46.57±9.66) were evaluated in this study. Of ResNeXt model training, MAEs were obtained with 3.95 in males and 3.65 in females. Based on the test set, DL achieved MAEs of 4.05 in males and 4.54 in females, which were far better than the MAEs of 8.90 and 6.42 respectively, for the manual method. Those results showed that the DL of the ResNeXt model outperformed the manual method in AAE based on CT reconstruction of the costal cartilage and the developed system may be a supportive tool for AAE.

Keywords: forensic anthropology, age determination by the skeleton, costal cartilage, CT, deep learning

Procedia PDF Downloads 41
5498 Locus of Control, Metacognitive Knowledge, Metacognitive Regulation, and Student Performance in an Introductory Economics Course

Authors: Ahmad A. Kader

Abstract:

In the principles of Microeconomics course taught during the Fall Semester 2019, 158out of 179 students participated in the completion of two questionnaires and a survey describing their demographic and academic profiles. The two questionnaires include the 29 items of the Rotter Locus of Control Scale and the 52 items of the Schraw andDennisonMetacognitive Awareness Scale. The 52 items consist of 17 items describing knowledge of cognition and 37 items describing the regulation of cognition. The paper is intended to show the combined influence of locus of control, metacognitive knowledge, and metacognitive regulation on student performance. The survey covers variables that have been tested and recognized in economic education literature, which include GPA, gender, age, course level, race, student classification, whether the course was required or elective, employments, whether a high school economic course was taken, and attendance. Regression results show that of the economic education variables, GPA, classification, whether the course was required or elective, and attendance are the only significant variables in their influence on student grade. Of the educational psychology variables, the regression results show that the locus of control variable has a negative and significant effect, while the metacognitive knowledge variable has a positive and significant effect on student grade. Also, the adjusted R square value increased markedly with the addition of the locus of control, metacognitive knowledge, and metacognitive regulation variables to the regression equation. The t test results also show that students who are internally oriented and are high on the metacognitive knowledge scale significantly outperform students who are externally oriented and are low on the metacognitive knowledge scale. The implication of these results for educators is discussed in the paper.

Keywords: locus of control, metacognitive knowledge, metacognitive regulation, student performance, economic education

Procedia PDF Downloads 92
5497 Machine Learning-Driven Prediction of Cardiovascular Diseases: A Supervised Approach

Authors: Thota Sai Prakash, B. Yaswanth, Jhade Bhuvaneswar, Marreddy Divakar Reddy, Shyam Ji Gupta

Abstract:

Across the globe, there are a lot of chronic diseases, and heart disease stands out as one of the most perilous. Sadly, many lives are lost to this condition, even though early intervention could prevent such tragedies. However, identifying heart disease in its initial stages is not easy. To address this challenge, we propose an automated system aimed at predicting the presence of heart disease using advanced techniques. By doing so, we hope to empower individuals with the knowledge needed to take proactive measures against this potentially fatal illness. Our approach towards this problem involves meticulous data preprocessing and the development of predictive models utilizing classification algorithms such as Support Vector Machines (SVM), Decision Tree, and Random Forest. We assess the efficiency of every model based on metrics like accuracy, ensuring that we select the most reliable option. Additionally, we conduct thorough data analysis to reveal the importance of different attributes. Among the models considered, Random Forest emerges as the standout performer with an accuracy rate of 96.04% in our study.

Keywords: support vector machines, decision tree, random forest

Procedia PDF Downloads 4
5496 Effect of Tree Age on Fruit Quality of Different Cultivars of Sweet Orange

Authors: Muhammad Imran, Faheem Khadija, Zahoor Hussain, Raheel Anwar, M. Nawaz Khan, M. Raza Salik

Abstract:

Amongst citrus species, sweet orange (Citrus sinensis L. Osbeck) occupies a dominant position in the orange producing countries in the world. Sweet orange is widely consumed both as fresh fruit as well as juice and its global demand is attributed due to higher vitamin C and antioxidants. Fruit quality is most important for the external appearance and marketability of sweet orange fruit, especially for fresh consumption. There are so many factors affecting fruit quality, tree age is the most important one, but remains unexplored so far. The present study, we investigated the role of tree age on fruit quality of different cultivars of sweet oranges. The difference between fruit quality of 5-year young and 15-year old trees was discussed in the current study. In case of fruit weight, maximum fruit weight (238g) was recorded in 15-year old sweet orange cv. Sallustiana cultivar while minimum fruit weight (142g) was recorded in 5-year young tree of Succari sweet orange fruit. The results of the fruit diameter showed that the maximum fruit diameter (77.142mm) was recorded in 15-year old Sallustiana orange but the minimum fruit diameter (66.046mm) was observed in 5-year young tree of sweet orange cv. Succari. The minimum value of rind thickness (4.142mm) was noted in 15-year old tree of cv. Red blood. On the other hand maximum value of rind thickness was observed in 5-year young tree of cv. Sallustiana. The data regarding total soluble solids (TSS), acidity (TA), TSS/TA, juice content, rind, flavedo thickness, pH and fruit diameter have also been discussed.

Keywords: age, cultivars, fruit, quality, sweet orange (Citrus Sinensis L. Osbeck)

Procedia PDF Downloads 189
5495 Fault Tree Analysis (FTA) of CNC Turning Center

Authors: R. B. Patil, B. S. Kothavale, L. Y. Waghmode

Abstract:

Today, the CNC turning center becomes an important machine tool for manufacturing industry worldwide. However, as the breakdown of a single CNC turning center may result in the production of an entire plant being halted. For this reason, operations and preventive maintenance have to be minimized to ensure availability of the system. Indeed, improving the availability of the CNC turning center as a whole, objectively leads to a substantial reduction in production loss, operating, maintenance and support cost. In this paper, fault tree analysis (FTA) method is used for reliability analysis of CNC turning center. The major faults associated with the system and the causes for the faults are presented graphically. Boolean algebra is used for evaluating fault tree (FT) diagram and for deriving governing reliability model for CNC turning center. Failure data over a period of six years has been collected and used for evaluating the model. Qualitative and quantitative analysis is also carried out to identify critical sub-systems and components of CNC turning center. It is found that, at the end of the warranty period (one year), the reliability of the CNC turning center as a whole is around 0.61628.

Keywords: fault tree analysis (FTA), reliability analysis, risk assessment, hazard analysis

Procedia PDF Downloads 375
5494 Effect of Different Spacings on Growth Yield and Fruit Quality of Peach in the Sub-Tropics of India

Authors: Harminder Singh, Rupinder Kaur

Abstract:

Peach is primarily a temperate fruit, but its low chilling cultivars are grown quite successfully in the sub-tropical climate as well. The area under peach cultivation is picking up rapidly in the sub tropics of northern India due to higher return on a unit area basis, availability of suitable peach cultivar and their production technology. Information on the use of different training systems on peach in the sub tropics is inadequate. In this investigation, conducted at Punjab Agricultural University, Ludhiana (Punjab), India, the trees of the Shan-i-Punjab peach were planted at four different spacings i.e. 6.0x3.0m, 6.0x2.5m, 4.5x3.0m and 4.5x2.5m and were trained to central leader system. The total radiation interception and penetration in the upper and lower canopy parts were higher in 6x3.0m and 6x2.5m planted trees as compared to other spacings. Average radiation interception was maximum in the upper part of the tree canopy, and it decreased significantly with the depth of the canopy in all the spacings. Tree planted at wider spacings produced more vegetative (tree height, tree girth, tree spread and canopy volume) and reproductive growth (flower bud density, number of fruits and fruit yield) per tree but productivity was maximum in the closely planted trees. Fruits harvested from the wider spaced trees were superior in fruit quality (size, weight, colour, TSS and acidity) and matured earlier than those harvested from closed spaced trees.

Keywords: quality, radiation, spacings, yield

Procedia PDF Downloads 154
5493 Dust Holding Capacity of Some Selected Road Side Tree Species

Authors: Jitin Rahul, Manish Kumar Jain

Abstract:

Dust pollution refers to the various locations, activities, or factors which are responsible for the releasing of pollutants into the atmosphere. The sources of dust can be classified into two major categories anthropogenic sources (man-made sources) and natural sources. Dust kicked up by heavy vehicles (Bus, Truck, Loaders, Tankers, car etc.) travelling on highways may make up approximately 33-40% of air pollution. Plants naturally cleanse the atmosphere by absorbing gases and particulate matter plants (Leaves). Plants are very good pollution indicator and also very good for dust capturing (Dust controlling). Many types tree species like Azadirachta indica A. juss, Butea monosperma (Lam.) Kuntz., Ficus bengalensis (Linn)., Pterocarpus marspium (Roxb.), Terminalia arjuna (Roxb, exDC.), Dalbergia sissoo roxb., and Ficus religiosa (Linn.) generally occur in roadside. These selected tree spiciness can control the dust pollution or dust capturing. It is well known that plants absorb particulate pollutants and help in dust controlling. Some tree species like (Ficus bengalensis, Ficus religiosa and Azadirachta indica) are very effective and natural means for controlling air pollution.

Keywords: dust, pollution, road, tree species

Procedia PDF Downloads 299
5492 Stock Market Prediction by Regression Model with Social Moods

Authors: Masahiro Ohmura, Koh Kakusho, Takeshi Okadome

Abstract:

This paper presents a regression model with autocorrelated errors in which the inputs are social moods obtained by analyzing the adjectives in Twitter posts using a document topic model. The regression model predicts Dow Jones Industrial Average (DJIA) more precisely than autoregressive moving-average models.

Keywords: stock market prediction, social moods, regression model, DJIA

Procedia PDF Downloads 519
5491 Effect of Signal Acquisition Procedure on Imagined Speech Classification Accuracy

Authors: M.R Asghari Bejestani, Gh. R. Mohammad Khani, V.R. Nafisi

Abstract:

Imagined speech recognition is one of the most interesting approaches to BCI development and a lot of works have been done in this area. Many different experiments have been designed and hundreds of combinations of feature extraction methods and classifiers have been examined. Reported classification accuracies range from the chance level to more than 90%. Based on non-stationary nature of brain signals, we have introduced 3 classification modes according to time difference in inter and intra-class samples. The modes can explain the diversity of reported results and predict the range of expected classification accuracies from the brain signal accusation procedure. In this paper, a few samples are illustrated by inspecting results of some previous works.

Keywords: brain computer interface, silent talk, imagined speech, classification, signal processing

Procedia PDF Downloads 120
5490 Evaluation of Vehicle Classification Categories: Florida Case Study

Authors: Ren Moses, Jaqueline Masaki

Abstract:

This paper addresses the need for accurate and updated vehicle classification system through a thorough evaluation of vehicle class categories to identify errors arising from the existing system and proposing modifications. The data collected from two permanent traffic monitoring sites in Florida were used to evaluate the performance of the existing vehicle classification table. The vehicle data were collected and classified by the automatic vehicle classifier (AVC), and a video camera was used to obtain ground truth data. The Federal Highway Administration (FHWA) vehicle classification definitions were used to define vehicle classes from the video and compare them to the data generated by AVC in order to identify the sources of misclassification. Six types of errors were identified. Modifications were made in the classification table to improve the classification accuracy. The results of this study include the development of updated vehicle classification table with a reduction in total error by 5.1%, a step by step procedure to use for evaluation of vehicle classification studies and recommendations to improve FHWA 13-category rule set. The recommendations for the FHWA 13-category rule set indicate the need for the vehicle classification definitions in this scheme to be updated to reflect the distribution of current traffic. The presented results will be of interest to States’ transportation departments and consultants, researchers, engineers, designers, and planners who require accurate vehicle classification information for planning, designing and maintenance of transportation infrastructures.

Keywords: vehicle classification, traffic monitoring, pavement design, highway traffic

Procedia PDF Downloads 160