Search results for: classification and regression
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 5108

4418 The Prediction of Effective Equation on Drivers' Behavioral Characteristics of Lane Changing

Authors: Khashayar Kazemzadeh, Mohammad Hanif Dasoomi

Abstract:

With traffic volumes increasing, lane changing plays a crucial role in traffic flow. Lane changing depends on several factors, including road geometric design, speed, and drivers’ behavioral characteristics. A great deal of research has been carried out in these fields. Among these factors, this paper emphasizes the drivers’ behavioral characteristics of lane changing. Regression models are used to predict an effective equation for lane changing based on drivers’ personal characteristics.

Keywords: effective equation, lane changing, drivers’ behavioral characteristics, regression models

Procedia PDF Downloads 438
4417 Climate Changes in Albania and Their Effect on Cereal Yield

Authors: Lule Basha, Eralda Gjika

Abstract:

This study analyzes climate change in Albania and its potential effects on cereal yields. First, monthly temperature and rainfall in Albania were studied for the period 1960-2021. Climatic variables are important when modeling cereal yield behavior, especially when significant changes in weather conditions are observed. For this purpose, in the second part of the study, linear and nonlinear models explaining cereal yield are constructed for the same period, 1960-2021. Multiple linear regression analysis and the lasso regression method are applied to the data relating cereal yield to each independent variable: average temperature, average rainfall, fertilizer consumption, arable land, land under cereal production, and nitrous oxide emissions. In our regression model, heteroscedasticity is not observed, the data follow a normal distribution, and the correlation between factors is low, so multicollinearity is not a problem. Machine-learning methods, such as random forest, are used to predict cereal yield responses to climatic and other variables. Random forest showed higher accuracy than the other statistical models in the prediction of cereal yield. We found that changes in average temperature negatively affect cereal yield. The coefficients of fertilizer consumption, arable land, and land under cereal production are positive, indicating a positive effect on production. Our results show that the random forest method is an effective and versatile machine-learning method for cereal yield prediction compared to the other two methods.
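For reference, the two regression estimators named in the abstract can be written in their standard textbook form. This is a generic sketch, not the authors' exact specification: the symbols $y_i$ (cereal yield in year $i$), $x_{ij}$ (the $j$-th of the $p = 6$ predictors), and the penalty weight $\lambda$ are notation introduced here, and the $1/(2n)$ scaling is one common convention among several.

```latex
\hat{\beta}^{\mathrm{OLS}}
  = \arg\min_{\beta}\;
    \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}

\hat{\beta}^{\mathrm{lasso}}
  = \arg\min_{\beta}\;
    \Bigl\{ \frac{1}{2n} \sum_{i=1}^{n}
      \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}
      + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \Bigr\}
```

The $\ell_1$ penalty can shrink some coefficients exactly to zero, which is why lasso is often used alongside ordinary multiple regression when several candidate predictors are available.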

Keywords: cereal yield, climate change, machine learning, multiple regression model, random forest

Procedia PDF Downloads 82
4416 Hybrid Approach for Software Defect Prediction Using Machine Learning with Optimization Technique

Authors: C. Manjula, Lilly Florence

Abstract:

Software technology is developing rapidly, which drives the growth of various industries. Nowadays, software-based applications are widely adopted for business purposes. For any software company, developing reliable software is a challenging task because a faulty software module may harm the growth of the industry and the business. Hence, there is a need for techniques that can predict software defects early. Because manual prediction is complex, automated software defect prediction techniques have been introduced. These techniques learn patterns from previous software versions and use them to find defects in the current version. They have attracted researchers because identifying bugs in software has a significant impact on industrial growth. Several studies have been carried out on this basis, but achieving the desired defect prediction performance is still a challenging task. To address this issue, we present a machine learning based hybrid technique for software defect prediction. First, a Genetic Algorithm (GA) is presented in which an improved fitness function is used for better optimization of features in the data sets. These features are then processed by a Decision Tree (DT) classification model. Finally, an experimental study is presented in which results from the proposed GA-DT hybrid approach are compared with those from the DT classification technique. The results show that the proposed hybrid approach achieves better classification accuracy.

Keywords: decision tree, genetic algorithm, machine learning, software defect prediction

Procedia PDF Downloads 324
4415 Black-Box-Base Generic Perturbation Generation Method under Salient Graphs

Authors: Dingyang Hu, Dan Liu

Abstract:

Deep neural network (DNN) models are widely used in classification, prediction, and other task scenarios. To address the difficulties of generating generic adversarial perturbations for deep learning models under black-box conditions, a generic adversarial perturbation generation method based on a saliency map (CJsp) is proposed. It obtains salient image regions by counting the factors through which the input features of an image influence the output results. The method can be understood as a saliency map attack algorithm that obtains false classification results by reducing the weights of salient feature points. Experiments also demonstrate that this method achieves a high success rate in migration attacks and can generate adversarial samples in batches.

Keywords: adversarial sample, gradient, probability, black box

Procedia PDF Downloads 88
4414 Exploring the Spatial Relationship between Built Environment and Ride-hailing Demand: Applying Street-Level Images

Authors: Jingjue Bao, Ye Li, Yujie Qi

Abstract:

The explosive growth of ride-hailing has reshaped residents' travel behavior and plays a crucial role in urban mobility within the built environment. To contribute to research on the spatial variation of ride-hailing demand and its relationship to the built environment and socioeconomic factors, this study uses multi-source data from Haikou, China, to construct a Multi-scale Geographically Weighted Regression (MGWR) model that accounts for spatial scale heterogeneity. The regression results showed that the MGWR model had superior interpretability and reliability compared with the Geographically Weighted Regression (GWR) model, with an improvement of 3.4% in R² and a reduction in AIC from 4853 to 4787. Furthermore, to precisely identify the surrounding environment of each sampling point, the DeepLabv3+ model is employed to segment street-level images. Features extracted from these images are incorporated as variables in the regression model, further enhancing its rationality and accuracy, with a 7.78% improvement in R² compared with the MGWR model that considered only region-level variables. By integrating multi-scale geospatial data and advanced computer vision techniques, this study provides a comprehensive understanding of the spatial dynamics between ride-hailing demand and the urban built environment. The insights gained are expected to contribute significantly to urban transportation planning and policy making, as well as to ride-hailing platforms, facilitating the development of more efficient and effective mobility solutions in modern cities.
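The two fit statistics used above to compare GWR and MGWR can be illustrated with a minimal sketch. It assumes the plain least-squares forms of R² and AIC; GWR/MGWR software typically reports a corrected AIC based on an effective number of parameters, so the function names, the `n_params` argument, and the toy numbers here are illustrative assumptions only.

```python
import math

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - RSS / TSS."""
    mean_y = sum(y) / len(y)
    rss = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    tss = sum((a - mean_y) ** 2 for a in y)
    return 1.0 - rss / tss

def aic_least_squares(y, y_hat, n_params):
    """AIC for a Gaussian least-squares fit, up to an additive constant:
    n * ln(RSS / n) + 2 * k. Lower values indicate a better trade-off
    between fit and model complexity."""
    n = len(y)
    rss = sum((a - b) ** 2 for a, b in zip(y, y_hat))
    return n * math.log(rss / n) + 2 * n_params

# Hypothetical observed demand and predictions from two candidate models
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.1, 1.9, 3.2, 3.8]   # closer fit
model_b = [1.5, 1.5, 3.5, 3.5]   # coarser fit

print(r_squared(observed, model_a), r_squared(observed, model_b))
print(aic_least_squares(observed, model_a, 2),
      aic_least_squares(observed, model_b, 2))
```

With the same number of parameters, the model with the smaller residual sum of squares has both the higher R² and the lower AIC, which is the direction of improvement reported for MGWR.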

Keywords: travel behavior, ride-hailing, spatial relationship, built environment, street-level image

Procedia PDF Downloads 65
4413 Identity Verification Using k-NN Classifiers and Autistic Genetic Data

Authors: Fuad M. Alkoot

Abstract:

DNA data have been used in forensics for decades. However, current research looks at using DNA as a biometric identity verification modality. The goal is to improve the speed of identification. We use gene data that was initially collected for autism detection to determine whether, and how accurately, this data can be used for identification applications. Our main goal is to find out whether our data preprocessing technique yields data that is useful as a biometric identification tool. We experiment with the nearest neighbor classifier to identify subjects. Results show that the optimal classification rate is achieved when the test set is corrupted by normally distributed noise with zero mean and a standard deviation of 1. The classification rate remains close to optimal at higher noise levels, up to a standard deviation of 3. This shows that the data can be used for identity verification with high accuracy using a simple classifier such as the k-nearest neighbor (k-NN).
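The evaluation protocol described above (nearest-neighbor identification of a test set corrupted by zero-mean Gaussian noise) can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the Euclidean distance, the toy feature vectors, the subject labels, and the fixed random seed are all assumptions made for the example.

```python
import math
import random
from collections import Counter

def knn_predict(train_x, train_y, query, k=1):
    """Return the majority label among the k training points nearest to query."""
    ranked = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def add_gaussian_noise(vector, sigma, rng):
    """Corrupt each feature with zero-mean Gaussian noise of std. dev. sigma."""
    return [v + rng.gauss(0.0, sigma) for v in vector]

def classification_rate(train_x, train_y, test_x, test_y, sigma, k=1, seed=0):
    """Fraction of noisy test vectors assigned to the correct subject."""
    rng = random.Random(seed)
    hits = 0
    for x, y in zip(test_x, test_y):
        noisy = add_gaussian_noise(x, sigma, rng)
        hits += knn_predict(train_x, train_y, noisy, k) == y
    return hits / len(test_x)

# Hypothetical toy data: two subjects with well-separated feature vectors
train_x = [[0.0, 0.0], [0.5, 0.2], [20.0, 20.0], [19.5, 20.3]]
train_y = ["subject_a", "subject_a", "subject_b", "subject_b"]
test_x = [[0.2, 0.1], [19.8, 20.1]]
test_y = ["subject_a", "subject_b"]

for sigma in (0.0, 1.0, 3.0):
    print(sigma, classification_rate(train_x, train_y, test_x, test_y, sigma))
```

Sweeping `sigma` in this way shows how robust the identification is to measurement noise, which is the kind of comparison the abstract reports for standard deviations of 1 and 3.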

Keywords: biometrics, genetic data, identity verification, k nearest neighbor

Procedia PDF Downloads 245
4412 Food Intake Pattern and Nutritional Status of Preschool Children of Chakma Ethnic Community

Authors: Md Monoarul Haque

Abstract:

Nutritional status is a sensitive indicator of community health and nutrition among preschool children, especially the prevalence of undernutrition, which affects all dimensions of human development and leads to growth faltering in early life. The present study is an attempt to assess the food intake pattern and nutritional status of preschool Chakma tribe children. It was a cross-sectional, community-based study. The subjects were selected purposively. The study was conducted at Savar Upazilla of Rangamati, which is located in the Chittagong Division. Anthropometric data (height and weight) of the study subjects were collected by standard techniques. Nutritional status was measured using Z scores according to the WHO classification. The χ2 test, independent t-test, Pearson’s correlation, multiple regression, and logistic regression were performed at the P<0.05 level of significance. Statistical analyses were performed by appropriate univariate and multivariate techniques using SPSS for Windows 11.5. Moderate (-3SD to <-2SD) to severe (<-3SD) underweight was found in 23.8% of the study subjects, and 76.2% had normal weight for their age. Moderate (-3SD to <-2SD) to severe (<-3SD) stunting was found in only 25.6% of the children, and 74.4% were normal; moderate to severe wasting was found in 14.7%, whereas 85.3% were normal. A significant association was found between child nutritional status and monthly family income, mother's education, and the occupation of father and mother. Age, sex, and income of the family, education of the mother, and occupation of the father were significantly associated with WAZ and HAZ of the study subjects (P=0.0001, P=0.025, P=0.001 and P=0.0001, P=0.003, P=0.031, P=0.092, P=0.008). Most study subjects ate local small fish and some traditional tribal foods such as bashrool, jhijhipoka, and pork, which are very popular among tribal children. Energy, carbohydrate, and fat intake was significantly associated with HAZ, WAZ, BAZ, and MUACZ.
This study demonstrates that the malnutrition situation among these tribal children is much better than the national scenario in Bangladesh. A significant association was found between child nutritional status and monthly family income, mother's education, and the occupation of father and mother. Most of the study subjects ate local small fish and some traditional tribal foods. A significant association was also found between child nutritional status and dietary intake of energy, carbohydrate, and fat.
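The Z-score cut-offs quoted in the abstract (moderate: -3SD to <-2SD; severe: <-3SD) can be expressed as a small classification routine. The sketch below assumes the Z scores have already been computed against a reference population; the function names and the sample values are hypothetical and are not the study's data.

```python
from collections import Counter

def classify_z_score(z):
    """Map an anthropometric Z score to the categories used in the abstract."""
    if z < -3.0:
        return "severe"
    if z < -2.0:          # covers -3SD up to, but not including, -2SD
        return "moderate"
    return "normal"

def prevalence(z_scores):
    """Percentage of children in each category, rounded to one decimal."""
    counts = Counter(classify_z_score(z) for z in z_scores)
    n = len(z_scores)
    return {cat: round(100.0 * counts[cat] / n, 1)
            for cat in ("normal", "moderate", "severe")}

# Hypothetical weight-for-age Z scores for ten children
waz = [-0.4, -1.2, -2.5, 0.3, -3.4, -1.9, -0.8, -2.1, 0.0, -1.0]
print(prevalence(waz))
```

The same function applies to weight-for-age (underweight), height-for-age (stunting), and weight-for-height (wasting) Z scores, since the abstract uses the same cut-offs for each.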

Keywords: food intake pattern, nutritional status, preschool children, Chakma ethnic community

Procedia PDF Downloads 494
4411 The Impact of Cryptocurrency Classification on Money Laundering: Analyzing the Preferences of Criminals for Stable Coins, Utility Coins, and Privacy Tokens

Authors: Mohamed Saad, Huda Ismail

Abstract:

The purpose of this research is to examine the impact of cryptocurrency classification on money laundering crimes and to analyze how the preferences of criminals differ according to the type of digital currency used. Specifically, we aim to explore the roles of stablecoins, utility coins, and privacy tokens in facilitating or hindering money laundering activities and to identify the key factors that influence criminals' choices among these cryptocurrencies. To achieve our research objectives, we used a dataset of the most highly traded cryptocurrencies (32 currencies) published on CoinMarketCap for 2022. In addition to a comprehensive review of the existing literature on cryptocurrency and money laundering, with a focus on stablecoins, utility coins, and privacy tokens, we conducted several multivariate analyses. Our study reveals that the classification of a cryptocurrency plays a significant role in money laundering activities, as criminals tend to prefer certain types of digital currencies over others, depending on their specific needs and goals. Specifically, we found that stablecoins are more commonly used in money laundering due to their relatively stable value and low volatility, which makes them less risky to hold and transfer. Utility coins, on the other hand, are less frequently used in money laundering due to their lack of anonymity and limited liquidity. Finally, privacy tokens, such as Monero and Zcash, are increasingly becoming a preferred choice among criminals due to their high degree of privacy and untraceability. In summary, our study highlights the importance of understanding the nuances of cryptocurrency classification in the context of money laundering and provides insights into the preferences of criminals in using digital currencies for illegal activities. Based on our findings, we recommend that policymakers address the potential misuse of cryptocurrencies for money laundering.
By implementing measures to regulate stable coins, strengthening cross-border cooperation, fostering public-private partnerships, and increasing cooperation, policymakers can help prevent and detect money laundering activities involving digital currencies.

Keywords: crime, cryptocurrency, money laundering, tokens

Procedia PDF Downloads 80
4410 Post-Earthquake Road Damage Detection by SVM Classification from Quickbird Satellite Images

Authors: Moein Izadi, Ali Mohammadzadeh

Abstract:

Detecting damaged parts of roads after an earthquake is essential for coordinating rescuers. In this study, an approach is presented for the semi-automatic detection of damaged roads in a city using pre-event vector maps and both pre- and post-earthquake QuickBird satellite images. Damage is defined in this study as the debris of damaged buildings adjacent to the roads. Several spectral and texture features are used in the SVM classification step to detect damage. Finally, the proposed method is tested on QuickBird pan-sharpened images from the Bam City earthquake, and the results show that an overall accuracy of 81% and a kappa coefficient of 0.71 are achieved for damage detection. The obtained results indicate the efficiency and accuracy of the proposed approach.
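The two accuracy measures reported above, overall accuracy and the kappa coefficient, are both derived from a confusion matrix. The following minimal sketch uses the standard definitions (Cohen's kappa); the 2x2 matrix is a made-up example and does not reproduce the study's results.

```python
def overall_accuracy(cm):
    """Share of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement corrected for chance, (po - pe) / (1 - pe)."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    po = overall_accuracy(cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    pe = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (po - pe) / (1.0 - pe)

# Hypothetical confusion matrix: rows = reference, columns = predicted
# classes: [damaged, undamaged]
cm = [[40, 10],
      [10, 40]]
print(overall_accuracy(cm))  # 0.8
print(cohen_kappa(cm))       # approximately 0.6
```

Kappa is lower than overall accuracy here because half of the agreement in this balanced example would be expected by chance alone.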

Keywords: SVM classifier, disaster management, road damage detection, QuickBird images

Procedia PDF Downloads 615
4409 Land Cover Mapping Using Sentinel-2, Landsat-8 Satellite Images, and Google Earth Engine: A Study Case of the Beterou Catchment

Authors: Ella Sèdé Maforikan

Abstract:

Accurate land cover mapping is essential for effective environmental monitoring and natural resources management. This study focuses on assessing the classification performance of two satellite datasets and evaluating the impact of different input feature combinations on classification accuracy in the Beterou catchment, situated in the northern part of Benin. Landsat-8 and Sentinel-2 images from June 1, 2020, to March 31, 2021, were utilized. Employing the Random Forest (RF) algorithm on Google Earth Engine (GEE), a supervised classification categorized the land into five classes: forest, savannas, cropland, settlement, and water bodies. GEE was chosen due to its high-performance computing capabilities, mitigating computational burdens associated with traditional land cover classification methods. By eliminating the need for individual satellite image downloads and providing access to an extensive archive of remote sensing data, GEE facilitated efficient model training on remote sensing data. The study achieved commendable overall accuracy (OA), ranging from 84% to 85%, even without incorporating spectral indices and terrain metrics into the model. Notably, the inclusion of additional input sources, specifically terrain features like slope and elevation, enhanced classification accuracy. The highest accuracy was achieved with Sentinel-2 (OA = 91%, Kappa = 0.88), slightly surpassing Landsat-8 (OA = 90%, Kappa = 0.87). This underscores the significance of combining diverse input sources for optimal accuracy in land cover mapping. The methodology presented herein not only enables the creation of precise, expeditious land cover maps but also demonstrates the prowess of cloud computing through GEE for large-scale land cover mapping with remarkable accuracy. The study emphasizes the synergy of different input sources to achieve superior accuracy. 
As a future recommendation, the application of Light Detection and Ranging (LiDAR) technology is proposed to enhance vegetation type differentiation in the Beterou catchment. Additionally, a cross-comparison between Sentinel-2 and Landsat-8 for assessing long-term land cover changes is suggested.

Keywords: land cover mapping, Google Earth Engine, random forest, Beterou catchment

Procedia PDF Downloads 52
4408 A Case-Based Reasoning-Decision Tree Hybrid System for Stock Selection

Authors: Yaojun Wang, Yaoqing Wang

Abstract:

Stock selection is an important decision-making problem. Many machine learning and data mining technologies are employed to build automatic stock-selection systems. A profitable stock-selection system should consider both the stock’s investment value and the market timing. In this paper, we present a hybrid system for stock selection that addresses both. The system uses a case-based reasoning (CBR) model to perform the stock classification and a decision-tree model to help with market timing and stock selection. The experiments show that this hybrid system performs better than other techniques with regard to classification accuracy, average return, and the Sharpe ratio.

Keywords: case-based reasoning, decision tree, stock selection, machine learning

Procedia PDF Downloads 410
4407 Multi-Labeled Aromatic Medicinal Plant Image Classification Using Deep Learning

Authors: Tsega Asresa, Getahun Tigistu, Melaku Bayih

Abstract:

Computer vision is a subfield of artificial intelligence that allows computers and systems to extract meaning from digital images and video. It is used in a wide range of fields, including self-driving cars, video surveillance, medical diagnosis, manufacturing, law, agriculture, quality control, health care, facial recognition, and military applications. Aromatic medicinal plants are botanical raw materials used in cosmetics, medicines, health foods, essential oils, decoration, cleaning, and other natural health products for therapeutic and aromatic culinary purposes. These plants and their products not only serve as a valuable source of income for farmers and entrepreneurs but are also exported to earn valuable foreign currency. In Ethiopia, there is a lack of technologies for the classification and identification of aromatic medicinal plant parts and the disease types cured by aromatic medicinal plants. Farmers, industry personnel, academicians, and pharmacists find it difficult to identify plant parts and the disease types cured by plants before ingredient extraction in the laboratory. Manual plant identification is a time-consuming, labor-intensive, and lengthy process, and few studies have been conducted in the area to address these challenges. One way to overcome these problems is to develop a deep learning model for efficient identification of aromatic medicinal plant parts with their corresponding disease type. The objective of the proposed study is to identify aromatic medicinal plant parts and classify the disease types they cure using computer vision technology. Therefore, this research developed a model for the classification of aromatic medicinal plant parts and their disease type by exploring computer vision technology. Morphological characteristics are still the most important tools for the identification of plants. Leaves are the most widely used parts of plants, besides roots, flowers, fruits, and latex.
For this study, the researchers used RGB leaf images with a size of 128×128×3 and trained five cutting-edge models: a convolutional neural network (CNN), Inception V3, Residual Neural Network (ResNet), Mobile Network (MobileNet), and Visual Geometry Group (VGG). These models were chosen after a comprehensive review of the best-performing models. An 80/20 percentage split is used to evaluate the models, and classification metrics are used to compare them. The pre-trained Inception V3 model performs best, with training and validation accuracy of 99.8% and 98.7%, respectively.

Keywords: aromatic medicinal plant, computer vision, convolutional neural network, deep learning, plant classification, residual neural network

Procedia PDF Downloads 173
4406 Interpretation of the Russia-Ukraine 2022 War via N-Gram Analysis

Authors: Elcin Timur Cakmak, Ayse Oguzlar

Abstract:

This study presents the results of analyzing, by bigram and trigram methods, the tweets sent by Twitter users about the Russia-Ukraine war. On February 24, 2022, Russian President Vladimir Putin declared a military operation against Ukraine, and all eyes turned to this war. Many people living in Russia and Ukraine reacted to and protested against the war and expressed their deep concern, as they felt the safety of their families and their futures were at stake. Most people, especially those living in Russia and Ukraine, express their views on the war in different ways. The most popular way to do this is through social media. Many people prefer to convey their feelings using Twitter, one of the most frequently used social media tools. Since the beginning of the war, thousands of tweets about the war have been posted on Twitter from many countries of the world. These tweets are extracted through the Twitter API and analysed using the Python programming language. The aim of the study is to find the word sequences in these tweets by the n-gram method, which is widely used in computational linguistics and natural language processing. The tweet language used in the study is English. The data set consists of data obtained from Twitter between February 24, 2022, and April 24, 2022. The tweets obtained using the #ukraine, #russia, #war, #putin, and #zelensky hashtags together were captured as raw data, and the tweets remaining after cleaning in the preprocessing stage were included in the analysis stage. In the data analysis part, sentiments are identified to show what messages people send about the war on Twitter. Negative messages make up the majority of all the tweets, at 63.6%. Furthermore, the most frequently used bigram and trigram word groups are found.
Regarding the results, the most frequently used word groups are “he, is”, “I, do”, “I, am” for bigrams. Also, the most frequently used word groups are “I, do, not”, “I, am, not”, “I, can, not” for trigrams. In the machine learning phase, the accuracy of classifications is measured by Classification and Regression Trees (CART) and Naïve Bayes (NB) algorithms. The algorithms are used separately for bigrams and trigrams. We gained the highest accuracy and F-measure values by the NB algorithm and the highest precision and recall values by the CART algorithm for bigrams. On the other hand, the highest values for accuracy, precision, and F-measure values are achieved by the CART algorithm, and the highest value for the recall is gained by NB for trigrams.
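The bigram and trigram counting step described in this abstract can be sketched in a few lines of Python. The tokenization (lowercasing and whitespace splitting) and the three sample sentences are simplifying assumptions for illustration; the study's own preprocessing is not specified in that detail.

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous word sequences of length n, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def count_ngrams(texts, n):
    """Count n-grams across a collection of texts."""
    counts = Counter()
    for text in texts:
        tokens = text.lower().split()   # naive tokenizer (assumption)
        counts.update(ngrams(tokens, n))
    return counts

# Hypothetical cleaned tweets
tweets = ["I do not want war", "I do not know", "he is not here"]

bigrams = count_ngrams(tweets, 2)
trigrams = count_ngrams(tweets, 3)
print(bigrams.most_common(3))
print(trigrams.most_common(3))
```

The resulting counts can serve directly as features for classifiers such as CART or Naïve Bayes, with one feature per n-gram.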

Keywords: classification algorithms, machine learning, sentiment analysis, Twitter

Procedia PDF Downloads 66
4405 Development of a Computer Aided Diagnosis Tool for Brain Tumor Extraction and Classification

Authors: Fathi Kallel, Abdulelah Alabd Uljabbar, Abdulrahman Aldukhail, Abdulaziz Alomran

Abstract:

The brain is an important organ in our body since it is responsible for the majority of actions, such as vision and memory. However, different diseases, such as Alzheimer's disease and tumors, can affect the brain and lead to partial or full disorder. Regular diagnosis is necessary as a preventive measure and can help doctors detect a possible problem early and therefore choose the appropriate treatment, especially in the case of brain tumors. Different imaging modalities are proposed for the diagnosis of brain tumors. The most powerful and most used modality is Magnetic Resonance Imaging (MRI). MRI images are analyzed by doctors in order to locate a possible tumor in the brain and prescribe the appropriate treatment. Diverse image processing methods are also proposed to help doctors identify and analyze the tumor. In fact, a large number of Computer Aided Diagnostic (CAD) tools, including purpose-built image processing algorithms, are proposed and used by doctors as a second opinion to analyze and identify brain tumors. In this paper, we propose a new advanced CAD tool for brain tumor identification, classification, and feature extraction. Our proposed CAD includes three main parts. Firstly, we load the brain MRI. Secondly, a robust technique for brain tumor extraction is proposed. This technique is based on both the Discrete Wavelet Transform (DWT) and Principal Component Analysis (PCA). The DWT is characterized by its multiresolution analytic property, which is why it was applied to MRI images with different decomposition levels for feature extraction. Nevertheless, this technique has a main drawback: it requires a large amount of storage and is computationally expensive. To decrease the dimensions of the feature vector and the computing time, the PCA technique is used. In the last stage, according to the extracted features, the brain tumor is classified as either benign or malignant using the Support Vector Machine (SVM) algorithm.
A CAD tool for brain tumor detection and classification, including all the above-mentioned stages, is designed and developed using the MATLAB GUIDE user interface.
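The multiresolution property of the DWT mentioned above can be illustrated on a one-dimensional signal. The abstract does not state which wavelet was used, so this sketch assumes the Haar wavelet, the simplest case; the function names and the sample signal are illustrative, and a real MRI pipeline would apply a 2-D transform to image rows and columns.

```python
import math

def haar_step(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence.
    Returns (approximation, detail) coefficients, each half the input length."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Multi-level decomposition: [approx_L, detail_L, ..., detail_1].
    The signal length must be divisible by 2 ** levels."""
    details = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.insert(0, detail)   # coarsest detail ends up first
    return [approx] + details

# Hypothetical 1-D intensity profile
profile = [4.0, 4.0, 2.0, 2.0]
print(haar_decompose(profile, 2))
```

Each additional level halves the approximation length, which is why wavelet coefficients from several decomposition levels are often reduced further (for example with PCA) before classification.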

Keywords: MRI, brain tumor, CAD, feature extraction, DWT, PCA, classification, SVM

Procedia PDF Downloads 241
4404 Investigating the Influence of the Ferro Alloys Consumption on the Slab Product Standard Cost with Different Grades Using Regression Analysis (A Case Study of Iran's Iron and Steel Industry)

Authors: Iman Fakhrian, Ali Salehi Manzari

Abstract:

Consistent profitability is one of the most important priorities in manufacturing companies. One of the fundamental factors in increasing a company's profitability is cost management. Isfahan's Mobarakeh Steel Company is one of the largest producers of slab product grades in the Middle East. Raw material cost constitutes about 70% of the company's expenditures, and the costs of ferro alloys contribute a remarkable share of the raw material costs. This research aims to determine which ferro alloys have a significant effect on the variability of the standard cost of the slab product grades. The data used in this study were collected from the standard costing system of Isfahan's Mobarakeh Steel Company in 2022. The results of the regression analysis show that expense items 03020, 03045, 03125, 03130, and 03150 have a dominant role in the variability of the standard cost of the slab product grades. In other words, these ferro alloys have a noticeable and significant role in the variability of the standard cost of the slab product grades.

Keywords: consistent profitability, ferro alloys, slab product grades, regression analysis

Procedia PDF Downloads 59
4403 Classification of Business Models of Italian Bancassurance by Balance Sheet Indicators

Authors: Andrea Bellucci, Martina Tofi

Abstract:

The aim of this paper is to analyze business models of bancassurance in Italy for life business. The life insurance business is very developed in the Italian market, and bank branches hold 80% of the market share. Given its maturity, the life insurance market needs to consolidate its organizational form to allow for the development of non-life business, which nowadays collects few premiums but represents a great opportunity to enlarge the market share of bancassurance using its strength in the distribution channel while the market share of independent agents is decreasing. Starting with the main business model of bancassurance for life business, this paper analyzes the performance of life companies in the Italian market by balance sheet indicators and by the main discriminant variables of business models. The study observes trends from 2013 to 2015 for the Italian market by exploiting a database managed by Associazione Nazionale delle Imprese di Assicurazione (ANIA). The applied approach is a bottom-up analysis that starts with variables and indicators to define the classification of business models. The statistical classification algorithm proposed by Ward is employed to design the business models’ profiles. The result of the analysis is a representation of the main business models built from their profiles in terms of the indicators. In this way, an unsupervised analysis is developed; it is limited by its judgmental dimension, which rests on the researchers' opinion, but it makes it possible to obtain a design of effective business models.
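Ward's algorithm, which the abstract names as the classification method, is an agglomerative clustering procedure that repeatedly merges the two clusters whose union gives the smallest increase in within-cluster variance. The sketch below is a naive pure-Python version for illustration; the two-indicator company profiles are invented and the function names are assumptions, not the authors' implementation.

```python
def centroid(cluster):
    """Component-wise mean of the points in a cluster."""
    return [sum(col) / len(cluster) for col in zip(*cluster)]

def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge:
    |a||b| / (|a| + |b|) * squared distance between centroids."""
    ca, cb = centroid(a), centroid(b)
    dist_sq = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * dist_sq

def ward_clusters(points, n_clusters):
    """Merge clusters greedily by Ward's criterion until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        pairs = [(i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda ij: ward_cost(clusters[ij[0]], clusters[ij[1]]))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Hypothetical companies described by two balance sheet indicators
companies = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
print(ward_clusters(companies, 2))
```

In practice the indicators would usually be standardized first, since Ward's criterion is based on Euclidean distance and is sensitive to the scale of each variable.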

Keywords: bancassurance, business model, non life bancassurance, insurance business value drivers

Procedia PDF Downloads 289
4402 Comparison of Machine Learning and Deep Learning Algorithms for Automatic Classification of 80 Different Pollen Species

Authors: Endrick Barnacin, Jean-Luc Henry, Jimmy Nagau, Jack Molinie

Abstract:

Palynology is a field of interest in many disciplines due to its multiple applications: chronological dating, climatology, allergy treatment, and honey characterization. Unfortunately, the analysis of a pollen slide is a complicated and time-consuming task that requires the intervention of experts in the field, who are becoming increasingly rare due to economic and social conditions. That is why the need to automate this task is urgent. Many studies have investigated the subject using different standard image processing descriptors and sometimes hand-crafted ones. In this work, we make a comparative study between classical feature extraction methods (Shape, GLCM, LBP, and others) and Deep Learning (CNN, Autoencoders, Transfer Learning) to perform a recognition task over 80 regional pollen species. It has been found that Transfer Learning seems to be more precise than the other approaches.

Keywords: pollens identification, features extraction, pollens classification, automated palynology

Procedia PDF Downloads 124
4401 ANFIS Approach for Locating Faults in Underground Cables

Authors: Magdy B. Eteiba, Wael Ismael Wahba, Shimaa Barakat

Abstract:

This paper presents a fault identification, classification, and fault location estimation method based on the Discrete Wavelet Transform and the Adaptive Network Fuzzy Inference System (ANFIS) for medium voltage cables in the distribution system. Different faults and locations are simulated by ATP/EMTP, and selected features of the wavelet-transformed signals are then used as input for training the ANFIS. An accurate fault classifier and locator algorithm was then designed, trained, and tested using current samples only. The results obtained from the ANFIS output were compared with the real output, and the percentage error between the two was found to be less than three percent. Hence, it can be concluded that the proposed technique offers high accuracy in both fault classification and fault location.

Keywords: ANFIS, fault location, underground cable, wavelet transform

Procedia PDF Downloads 500
4400 Kernel-Based Double Nearest Proportion Feature Extraction for Hyperspectral Image Classification

Authors: Hung-Sheng Lin, Cheng-Hsuan Li

Abstract:

Over the past few years, kernel-based algorithms have been widely used to extend some linear feature extraction methods such as principal component analysis (PCA), linear discriminant analysis (LDA), and nonparametric weighted feature extraction (NWFE) to their nonlinear versions, kernel principal component analysis (KPCA), generalized discriminant analysis (GDA), and kernel nonparametric weighted feature extraction (KNWFE), respectively. These nonlinear feature extraction methods can detect nonlinear directions with the largest nonlinear variance or the largest class separability based on the given kernel function. Moreover, they have been applied to improve the target detection or the image classification of hyperspectral images. The double nearest proportion feature extraction (DNP) can effectively reduce the overlap effect and has good performance in hyperspectral image classification. The DNP structure is an extension of the k-nearest neighbor technique. For each sample, there are two corresponding nearest proportions of samples, the self-class nearest proportion and the other-class nearest proportion. The term “nearest proportion” used here considers both the local information and other more global information. With these settings, the effect of the overlap between the sample distributions can be reduced. Usually, the maximum likelihood estimator and the related unbiased estimator are not ideal estimators in high dimensional inference problems, particularly in small data-size situations. Hence, an improved estimator by shrinkage estimation (regularization) is proposed. Based on the DNP structure, LDA is included as a special case. In this paper, the kernel method is applied to extend DNP to kernel-based DNP (KDNP). In addition to the advantages of DNP, KDNP surpasses DNP in the experimental results.
According to the experiments on the real hyperspectral image data sets, the classification performance of KDNP is better than that of PCA, LDA, NWFE, and their kernel versions, KPCA, GDA, and KNWFE.
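The shrinkage (regularization) idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the authors' estimator, so the form below is an assumption: the sample covariance is blended with a scaled identity target, and the names `sample_covariance`, `shrink_covariance`, and `alpha` are hypothetical.

```python
def sample_covariance(rows):
    """Unbiased sample covariance of a list of equal-length observation rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def shrink_covariance(cov, alpha):
    """Blend cov with mu * I, where mu is the mean of the diagonal of cov.

    alpha = 0 returns cov unchanged; alpha = 1 returns mu * I.
    The identity target is an assumption, not the estimator from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    d = len(cov)
    mu = sum(cov[i][i] for i in range(d)) / d
    return [
        [(1.0 - alpha) * cov[i][j] + (alpha * mu if i == j else 0.0)
         for j in range(d)]
        for i in range(d)
    ]


# Toy data: two samples in three dimensions, so the sample covariance
# is necessarily singular (fewer samples than dimensions).
samples = [[1.0, 2.0, 0.5], [2.0, 0.0, 1.5]]
cov = sample_covariance(samples)
shrunk = shrink_covariance(cov, alpha=0.2)
```

With fewer samples than dimensions the sample covariance cannot be inverted, whereas the blended matrix is positive definite for any `alpha` greater than zero as long as the average variance is positive. The blend also leaves the trace (total variance) unchanged.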

Keywords: feature extraction, kernel method, double nearest proportion feature extraction, kernel double nearest feature extraction

Procedia PDF Downloads 334
4399 A Systematic Review of Situational Awareness and Cognitive Load Measurement in Driving

Authors: Aly Elshafei, Daniela Romano

Abstract:

With the development of autonomous vehicles, a human-machine interaction (HMI) system is needed for a safe transition of control when a takeover request (TOR) is required. An important part of the HMI system is the ability to monitor the level of situational awareness (SA) of any driver in real-time, in different scenarios, and without any pre-calibration. The purpose of this systematic review is to present state-of-the-art machine learning models used to measure SA. It also investigates the limitations of each type of sensor, the gaps, and the most suited sensor and computational model that can be used in driving applications. To the best of the authors’ knowledge, this is the first literature review identifying online and offline classification methods used to measure SA, explaining which measurements are subject or session-specific, and how many classifications can be done with each classification model. This information can be very useful for researchers measuring SA to identify the most suited model to measure SA for different applications.

Keywords: situational awareness, autonomous driving, gaze metrics, EEG, ECG

Procedia PDF Downloads 110
4398 Long-Term Indoor Air Monitoring for Students with Emphasis on Particulate Matter (PM2.5) Exposure

Authors: Seyedtaghi Mirmohammadi, Jamshid Yazdani, Syavash Etemadi Nejad

Abstract:

One of the main indoor air parameters in classrooms is dust pollution, which depends on the particle size and exposure duration. However, there is a lack of data about the exposure level to PM2.5 concentrations in rural area classrooms. The objective of the current study was exposure assessment for PM2.5 for students in the classrooms. One year of monitoring was carried out for fifteen schools by time-series sampling to evaluate the indoor air PM2.5 in the rural district of Sari city, Iran. A hygrometer and thermometer were used to measure some psychrometric parameters (temperature, relative humidity, and wind speed), and a Real-Time Dust Monitor (MicroDust Pro, Casella, UK) was used to monitor particulate matter (PM2.5) concentration. The results show that the mean indoor PM2.5 concentration in the studied classrooms was 135 µg/m³. The regression model indicated a positive correlation between indoor PM2.5 concentration and relative humidity, as well as with distance from the city center and classroom size. Meanwhile, the regression model revealed that the indoor PM2.5 concentration, the relative humidity, and dry bulb temperature were significant at the 0.05, 0.035, and 0.05 levels, respectively. A statistical predictive model was obtained from multiple regression modeling for indoor PM2.5 concentration and indoor psychrometric parameters.

Keywords: classrooms, concentration, humidity, particulate matters, regression

Procedia PDF Downloads 327
4397 Rank-Based Chain-Mode Ensemble for Binary Classification

Authors: Chongya Song, Kang Yen, Alexander Pons, Jin Liu

Abstract:

In the field of machine learning, the ensemble has been employed as a common methodology to improve performance over multiple base classifiers. However, the true predictions are often canceled out by the false ones during consensus due to a phenomenon called “curse of correlation”, which is represented as the strong interferences among the predictions produced by the base classifiers. In addition, the existing practices are still not able to effectively mitigate the problem of imbalanced classification. Based on the analysis of our experimental results, we conclude that the two problems are caused by some inherent deficiencies in the approach of consensus. Therefore, we create an enhanced ensemble algorithm which adopts a designed rank-based chain-mode consensus to overcome the two problems. In order to evaluate the proposed ensemble algorithm, we employ a well-known benchmark data set, NSL-KDD (the improved version of the KDDCup99 dataset produced by the University of New Brunswick), to make comparisons between the proposed and 8 common ensemble algorithms. In particular, each compared ensemble classifier uses the same 22 base classifiers, so that the differences in terms of the improvements toward the accuracy and reliability upon the base classifiers can be truly revealed. As a result, the proposed rank-based chain-mode consensus is shown to be a more effective ensemble solution than the traditional consensus approach, outperforming the 8 ensemble algorithms by 20% on almost all compared metrics, which include accuracy, precision, recall, F1-score, and area under the receiver operating characteristic curve.
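The cancellation effect described above can be illustrated with a plain majority-vote consensus. This is a sketch of the traditional baseline only, not the rank-based chain-mode consensus, whose details the abstract does not give. The toy labels and the names `majority_vote` and `accuracy` are invented for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one consensus label per sample.

    Ties are broken by choosing the smallest label (an arbitrary convention).
    """
    n_samples = len(predictions[0])
    consensus = []
    for i in range(n_samples):
        votes = Counter(p[i] for p in predictions)
        top = max(votes.values())
        consensus.append(min(label for label, c in votes.items() if c == top))
    return consensus


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy example: classifier A is right everywhere, while B and C share the
# same (correlated) mistake on the fourth sample and outvote A there.
y_true = [1, 0, 1, 1, 0]
clf_a = [1, 0, 1, 1, 0]
clf_b = [1, 0, 1, 0, 0]
clf_c = [1, 0, 1, 0, 0]

consensus = majority_vote([clf_a, clf_b, clf_c])
consensus_accuracy = accuracy(y_true, consensus)
```

In this toy case the consensus is less accurate than the best base classifier, because two correlated errors outweigh one correct vote.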

Keywords: consensus, curse of correlation, imbalance classification, rank-based chain-mode ensemble

Procedia PDF Downloads 128
4396 A Research on Tourism Market Forecast and Its Evaluation

Authors: Min Wei

Abstract:

Traditional forecasting methods for the tourism market pay more attention to the accuracy of the forecasts while ignoring the feasibility and operability of the forecasting results, which makes it difficult to test those results scientifically. With the application of the Linear Regression Model, this paper attempts to construct a scientific evaluation system for predictive value, both to ensure the accuracy and stability of the predicted value and to ensure the feasibility and operability of the forecasting results. The findings show that a scientific evaluation system can support the scientific concept of development and the coordinated, harmonious development of man and nature.

Keywords: linear regression model, tourism market, forecast, tourism economics

Procedia PDF Downloads 321
4395 Attention Multiple Instance Learning for Cancer Tissue Classification in Digital Histopathology Images

Authors: Afaf Alharbi, Qianni Zhang

Abstract:

The identification of malignant tissue in histopathological slides holds significant importance in both clinical settings and pathology research. This paper introduces a methodology aimed at automatically categorizing cancerous tissue through the utilization of a multiple-instance learning framework. This framework is specifically developed to acquire knowledge of the Bernoulli distribution of the bag label probability by employing neural networks. Furthermore, we put forward a neural network based permutation-invariant aggregation operator, equivalent to attention mechanisms, which is applied to the multi-instance learning network. Through empirical evaluation of an openly available colon cancer histopathology dataset, we provide evidence that our approach surpasses various conventional deep learning methods.
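A permutation-invariant, attention-style aggregation over the instances of a bag can be sketched as follows. The abstract does not give the operator's exact form, so the tanh scoring function and the parameters `V` and `w` are assumptions following a common attention-pooling formulation, and the numeric values are arbitrary.

```python
import math


def attention_pool(instances, V, w):
    """Aggregate a bag of instance vectors into one bag-level vector.

    Each instance h_k gets a score w . tanh(V h_k); the scores are passed
    through a softmax to obtain weights a_k, and the output is sum_k a_k h_k.
    V is an L x D matrix (list of rows) and w is a length-L vector.
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * ui for wi, ui in zip(w, hidden)))
    # Numerically stable softmax over the instance scores.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(instances[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, instances))
              for j in range(dim)]
    return pooled, weights


# Hypothetical bag of three 2-dimensional instance embeddings.
bag = [[0.2, 1.0], [1.5, -0.3], [0.0, 0.4]]
V = [[0.5, -1.0], [1.0, 0.3]]
w = [1.0, -0.5]

pooled, weights = attention_pool(bag, V, w)
pooled_reversed, _ = attention_pool(bag[::-1], V, w)
```

Because the weights depend only on each instance's own content and the output is a weighted sum, reordering the instances in the bag leaves the pooled vector unchanged up to floating-point rounding. In a trained model, `V` and `w` would be learned jointly with the rest of the network.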

Keywords: attention multiple instance learning, MIL and transfer learning, histopathological slides, cancer tissue classification

Procedia PDF Downloads 96
4394 Application of the Quantile Regression Approach to the Heterogeneity of the Fine Wine Prices

Authors: Charles-Olivier Amédée-Manesme, Benoit Faye, Eric Le Fur

Abstract:

In this paper, the heterogeneity of the Bordeaux Legends 50 wine market price segment is addressed. For this purpose, quantile regression is applied, with market segmentation based on wine bottle price quantile, and the hedonic price of wine attributes is computed for various price segments of the market. The approach is applied to a major privately held data set which consists of approximately 30,000 transactions over the 2003–2014 period. The findings suggest that the relative hedonic prices of several wine attributes differ significantly among deciles. In particular, the elasticity coefficient of the expert ratings shows strong variation among prices. Although, as suggested in the literature, expert ratings have a positive influence on wine price on average, they have a clearly decreasing impact over the quantiles. Finally, the lower the wine price, the higher the potential for price appreciation over time. Other variables such as chateaux or vintage are also shown to vary across the distribution of wine prices. While enhancing our understanding of the complex market dynamics that underlie Bordeaux wines’ price, this research provides empirical evidence that the QR approach adequately captures heterogeneity among wine price ranges, which simultaneously applies to wine stock, vintage, and auction house.
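Quantile regression replaces the squared error of ordinary regression with the asymmetric "pinball" loss. The sketch below shows only the simplest case, a constant predictor, where minimizing the loss recovers an empirical quantile. The prices are invented toy values, not data from the study, and `pinball_loss` and `best_constant` are hypothetical helper names.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss at quantile level tau in (0, 1)."""
    total = 0.0
    for y, q in zip(y_true, y_pred):
        diff = y - q
        # Under-prediction is weighted by tau, over-prediction by (1 - tau).
        total += tau * diff if diff >= 0 else (tau - 1.0) * diff
    return total / len(y_true)


def best_constant(y, tau):
    """Return the observed value that minimizes the pinball loss as a constant."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), tau))


# Invented, right-skewed toy prices.
prices = [10.0, 12.0, 15.0, 20.0, 30.0, 45.0, 80.0, 150.0, 400.0]

q10 = best_constant(prices, 0.1)
q50 = best_constant(prices, 0.5)
q90 = best_constant(prices, 0.9)
```

A full quantile regression minimizes the same loss over the coefficients of a linear predictor, so the estimated effect of an attribute is allowed to differ between low and high quantiles of the price distribution.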

Keywords: hedonics, market segmentation, quantile regression, heterogeneity, wine economics

Procedia PDF Downloads 330
4393 Urban Heat Island Intensity Assessment through Comparative Study on Land Surface Temperature and Normalized Difference Vegetation Index: A Case Study of Chittagong, Bangladesh

Authors: Tausif A. Ishtiaque, Zarrin T. Tasin, Kazi S. Akter

Abstract:

The current trend of urban expansion, especially in developing countries, has caused significant changes in land cover, which is generating great concern due to the widespread environmental degradation it brings. Energy consumption of cities is also increasing with the aggravated heat island effect. Distribution of land surface temperature (LST) is one of the most significant climatic parameters affected by urban land cover change. The recent increasing trend of LST is causing an elevated temperature profile in built-up areas with less vegetative cover. Gradual change in land cover, especially the decrease in vegetative cover, is enhancing the Urban Heat Island (UHI) effect in developing cities around the world. An increase in the amount of urban vegetation cover can be a useful solution for the reduction of UHI intensity. LST and Normalized Difference Vegetation Index (NDVI) have widely been accepted as reliable indicators of UHI and vegetation abundance, respectively. Chittagong, the second largest city of Bangladesh, has been a growth center due to rapid urbanization over the last several decades. This study assesses the intensity of UHI in Chittagong city by analyzing the relationship between LST and NDVI based on the type of land use/land cover (LULC) in the study area, applying an integrated approach of Geographic Information System (GIS), remote sensing (RS), and regression analysis. A land cover map is prepared through an interactive supervised classification using remotely sensed data from a Landsat ETM+ image along with NDVI differencing using ArcGIS. LST and NDVI values are extracted from the same image. The regression analysis between LST and NDVI indicates that within the study area, UHI is directly correlated with LST while negatively correlated with NDVI. This implies that surface temperature decreases as vegetation cover increases, along with a reduction in UHI intensity.
Moreover, there are noticeable differences in the relationship between LST and NDVI based on the type of LULC. In other words, depending on the type of land usage, increase in vegetation cover has a varying impact on the UHI intensity. This analysis will contribute to the formulation of sustainable urban land use planning decisions as well as suggesting suitable actions for mitigation of UHI intensity within the study area.
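The NDVI and regression step can be sketched with the standard definition NDVI = (NIR - Red) / (NIR + Red) followed by a simple least-squares fit of LST on NDVI. The pixel values below are invented for illustration and are not data from the study. The helper names are hypothetical, and returning 0.0 for a zero denominator is an arbitrary convention.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denom = nir + red
    return 0.0 if denom == 0 else (nir - red) / denom


def ols_slope_intercept(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Invented reflectance and temperature values for five pixels.
nir = [0.50, 0.45, 0.40, 0.30, 0.25]
red = [0.10, 0.15, 0.20, 0.25, 0.24]
lst_celsius = [26.0, 28.5, 30.0, 33.5, 35.0]

ndvi_values = [ndvi(n, r) for n, r in zip(nir, red)]
slope, intercept = ols_slope_intercept(ndvi_values, lst_celsius)
```

A negative slope in such a fit corresponds to the relationship reported in the abstract, where surface temperature falls as vegetation cover rises. Fitting the regression separately per LULC class would be one way to expose the class-dependent differences the abstract mentions.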

Keywords: land cover change, land surface temperature, normalized difference vegetation index, urban heat island

Procedia PDF Downloads 267
4392 Classification Based on Deep Neural Cellular Automata Model

Authors: Yasser F. Hassan

Abstract:

Deep learning is a branch of machine learning that has achieved great success in research and applications. Cellular neural networks are regarded as an array of nonlinear analog processors, called cells, connected in a way that allows parallel computation. The paper discusses how to use a deep learning structure to represent a neural cellular automata model. The proposed learning technique in the cellular automata model is examined from the perspective of deep learning structure. A deep neural cellular automata system modifies each neuron based on the behavior of the individual and its decision as a result of multi-level deep structure learning. The paper presents the architecture of the model and the simulation results of the approach. Results from the implementation enrich the deep neural cellular automata system and shed light on the concept formulation of the model and the learning in it.

Keywords: cellular automata, neural cellular automata, deep learning, classification

Procedia PDF Downloads 182
4391 Factors Affecting Green Consumption Behaviors of the Urban Residents in Hanoi, Vietnam

Authors: Phan Thi Song Thuong

Abstract:

This paper uses data from a survey on the green consumption behavior of Hanoi residents in October 2022. Data were gathered from a survey conducted in ten districts in the center of Hanoi, with 393 respondents. The hypotheses focus on understanding the factors that may affect green consumption behavior, such as demographic characteristics, concerns about the environment and health, people living around, self-efficiency, and mass media. A number of methods, such as the T-test, exploratory factor analysis, and a linear regression model, are used to test the hypotheses. Accordingly, the results show that gender, age, and education level have separate effects on the green consumption behavior of respondents.

Keywords: green consumption, urban residents, environment, sustainable, linear regression

Procedia PDF Downloads 114
4390 Recurrent Neural Networks for Classifying Outliers in Electronic Health Record Clinical Text

Authors: Duncan Wallace, M-Tahar Kechadi

Abstract:

In recent years, Machine Learning (ML) approaches have been successfully applied to an analysis of patient symptom data in the context of disease diagnosis, at least where such data is well codified. However, much of the data present in Electronic Health Records (EHR) are unlikely to prove suitable for classic ML approaches. Furthermore, as scores of data are widely spread across both hospitals and individuals, a decentralized, computationally scalable methodology is a priority. The focus of this paper is to develop a method to predict outliers in an out-of-hours healthcare provision center (OOHC). In particular, our research is based upon the early identification of patients who have underlying conditions which will cause them to repeatedly require medical attention. OOHC act as an ad-hoc delivery of triage and treatment, where interactions occur without recourse to a full medical history of the patient in question. Medical histories, relating to patients contacting an OOHC, may reside in several distinct EHR systems in multiple hospitals or surgeries, which are unavailable to the OOHC in question. As such, although a local solution is optimal for this problem, it follows that the data under investigation is incomplete, heterogeneous, and comprised mostly of noisy textual notes compiled during routine OOHC activities. Through the use of Deep Learning methodologies, the aim of this paper is to provide the means to identify patient cases, upon initial contact, which are likely to relate to such outliers. To this end, we compare the performance of Long Short-Term Memory, Gated Recurrent Units, and combinations of both with Convolutional Neural Networks. A further aim of this paper is to elucidate the discovery of such outliers by examining the exact terms which provide a strong indication of positive and negative case entries. While free-text is the principal data extracted from EHRs for classification, EHRs also contain normalized features. 
Although the specific demographical features treated within our corpus are relatively limited in scope, we examine whether it is beneficial to include such features among the inputs to our neural network, or whether these features are more successfully exploited in conjunction with a different form of a classifier. In this section, we compare the performance of randomly generated regression trees and support vector machines and determine the extent to which our classification program can be improved upon by using either of these machine learning approaches in conjunction with the output of our Recurrent Neural Network application. The output of our neural network is also used to help determine the most significant lexemes present within the corpus for determining high-risk patients. By combining the confidence of our classification program in relation to lexemes within true positive and true negative cases, with an inverse document frequency of the lexemes related to these cases, we can determine what features act as the primary indicators of frequent-attender and non-frequent-attender cases, providing a human interpretable appreciation of how our program classifies cases.
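The idea of weighting lexemes by classifier confidence and inverse document frequency can be sketched as below. The abstract does not state how the two quantities are combined, so the product of mean confidence and IDF is an assumption. The tokenized notes and confidence values are invented toy data, and `lexeme_scores` is a hypothetical helper.

```python
import math
from collections import defaultdict


def inverse_document_frequency(documents):
    """IDF per term: log(N / number of documents containing the term)."""
    n_docs = len(documents)
    doc_freq = defaultdict(int)
    for tokens in documents:
        for term in set(tokens):
            doc_freq[term] += 1
    return {term: math.log(n_docs / count) for term, count in doc_freq.items()}


def lexeme_scores(documents, confidences):
    """Score each term by (mean confidence of documents containing it) * IDF."""
    idf = inverse_document_frequency(documents)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tokens, conf in zip(documents, confidences):
        for term in set(tokens):
            totals[term] += conf
            counts[term] += 1
    return {term: (totals[term] / counts[term]) * idf[term] for term in idf}


# Invented tokenized notes with an invented classifier confidence per note.
notes = [
    ["chest", "pain", "patient"],
    ["patient", "repeat", "prescription"],
    ["patient", "cough"],
    ["repeat", "caller", "patient"],
]
confidences = [0.2, 0.9, 0.1, 0.8]

scores = lexeme_scores(notes, confidences)
```

Under this scheme a term that appears in every note receives a score of zero regardless of confidence, while a term concentrated in high-confidence notes ranks highest, which matches the intent of surfacing human-interpretable indicator terms.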

Keywords: artificial neural networks, data-mining, machine learning, medical informatics

Procedia PDF Downloads 117
4389 A Combination of Independent Component Analysis, Relative Wavelet Energy and Support Vector Machine for Mental State Classification

Authors: Nguyen The Hoang Anh, Tran Huy Hoang, Vu Tat Thang, T. T. Quyen Bui

Abstract:

Mental state classification is an important step for realizing a control system based on electroencephalography (EEG) signals, which could benefit many paralyzed people, including those with locked-in syndrome or Amyotrophic Lateral Sclerosis. Considering that EEG signals are nonstationary and often contaminated by various types of artifacts, classifying thoughts into correct mental states is not a trivial problem. In this work, our contribution is that we present and realize a novel model which integrates different techniques: independent component analysis (ICA), relative wavelet energy, and support vector machine (SVM) for the same task. We applied our model to classify thoughts in two types of experiments, with either two or three mental states. The experimental results show that the presented model outperforms other models using Artificial Neural Network, K-Nearest Neighbors, etc.
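Relative wavelet energy, one of the features named above, is the share of total signal energy carried by each wavelet decomposition band. The sketch below assumes a Haar wavelet and three decomposition levels, neither of which is stated in the abstract, and applies them to a synthetic signal rather than EEG data.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def relative_wavelet_energy(signal, levels):
    """Energy share per band: detail levels 1..levels, then the final approximation."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    energies = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        energies.append(sum(c * c for c in detail))
    energies.append(sum(c * c for c in current))
    total = sum(energies)
    if total == 0:
        raise ValueError("signal has zero energy")
    return [e / total for e in energies]


# Synthetic test signal: two sinusoids sampled at 64 points.
signal = [
    math.sin(2 * math.pi * 4 * t / 64) + 0.5 * math.sin(2 * math.pi * 16 * t / 64)
    for t in range(64)
]
rwe = relative_wavelet_energy(signal, levels=3)
```

The resulting vector sums to one and could serve as a compact feature vector for a classifier such as an SVM. In practice a wavelet library and a wavelet family chosen for EEG would likely replace the hand-written Haar transform.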

Keywords: EEG, ICA, SVM, wavelet

Procedia PDF Downloads 377