Search results for: features engineering methods for forecasting
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 20901

Search results for: features engineering methods for forecasting

20571 Designing Price Stability Model of Red Cayenne Pepper Price in Wonogiri District, Centre Java, Using ARCH/GARCH Method

Authors: Fauzia Dianawati, Riska W. Purnomo

Abstract:

Food and agricultural sector become the biggest sector contributing to inflation in Indonesia. Especially in Wonogiri district, red cayenne pepper was the biggest sector contributing to inflation on 2016. A national statistic proved that in recent five years red cayenne pepper has the highest average level of fluctuation among all commodities. Some factors, like supply chain, price disparity, production quantity, crop failure, and oil price become the possible factor causes high volatility level in red cayenne pepper price. Therefore, this research tries to find the key factor causing fluctuation on red cayenne pepper by using ARCH/GARCH method. The method could accommodate the presence of heteroscedasticity in time series data. At the end of the research, it is statistically found that the second level of supply chain becomes the biggest part contributing to inflation with 3,35 of coefficient in fluctuation forecasting model of red cayenne pepper price. This model could become a reference to the government to determine the appropriate policy in maintaining the price stability of red cayenne pepper.

Keywords: ARCH/GARCH, forecasting, red cayenne pepper, volatility, supply chain

Procedia PDF Downloads 186
20570 An Approach Based on Statistics and Multi-Resolution Representation to Classify Mammograms

Authors: Nebi Gedik

Abstract:

One of the significant and continual public health problems in the world is breast cancer. Early detection is very important to fight the disease, and mammography has been one of the most common and reliable methods to detect the disease in the early stages. However, it is a difficult task, and computer-aided diagnosis (CAD) systems are needed to assist radiologists in providing both accurate and uniform evaluation for mass in mammograms. In this study, a multiresolution statistical method to classify mammograms as normal and abnormal in digitized mammograms is used to construct a CAD system. The mammogram images are represented by wave atom transform, and this representation is made by certain groups of coefficients, independently. The CAD system is designed by calculating some statistical features using each group of coefficients. The classification is performed by using support vector machine (SVM).

Keywords: wave atom transform, statistical features, multi-resolution representation, mammogram

Procedia PDF Downloads 222
20569 A Nonlinear Feature Selection Method for Hyperspectral Image Classification

Authors: Pei-Jyun Hsieh, Cheng-Hsuan Li, Bor-Chen Kuo

Abstract:

For hyperspectral image classification, feature reduction is an important pre-processing for avoiding the Hughes phenomena due to the difficulty for collecting training samples. Hence, lots of researches developed feature selection methods such as F-score, HSIC (Hilbert-Schmidt Independence Criterion), and etc., to improve hyperspectral image classification. However, most of them only consider the class separability in the original space, i.e., a linear class separability. In this study, we proposed a nonlinear class separability measure based on kernel trick for selecting an appropriate feature subset. The proposed nonlinear class separability was formed by a generalized RBF kernel with different bandwidths with respect to different features. Moreover, it considered the within-class separability and the between-class separability. A genetic algorithm was applied to tune these bandwidths such that the smallest with-class separability and the largest between-class separability simultaneously. This indicates the corresponding feature space is more suitable for classification. In addition, the corresponding nonlinear classification boundary can separate classes very well. These optimal bandwidths also show the importance of bands for hyperspectral image classification. The reciprocals of these bandwidths can be viewed as weights of bands. The smaller bandwidth, the larger weight of the band, and the more importance for classification. Hence, the descending order of the reciprocals of the bands gives an order for selecting the appropriate feature subsets. In the experiments, three hyperspectral image data sets, the Indian Pine Site data set, the PAVIA data set, and the Salinas A data set, were used to demonstrate the selected feature subsets by the proposed nonlinear feature selection method are more appropriate for hyperspectral image classification. Only ten percent of samples were randomly selected to form the training dataset. All non-background samples were used to form the testing dataset. The support vector machine was applied to classify these testing samples based on selected feature subsets. According to the experiments on the Indian Pine Site data set with 220 bands, the highest accuracies by applying the proposed method, F-score, and HSIC are 0.8795, 0.8795, and 0.87404, respectively. However, the proposed method selects 158 features. F-score and HSIC select 168 features and 217 features, respectively. Moreover, the classification accuracies increase dramatically only using first few features. The classification accuracies with respect to feature subsets of 10 features, 20 features, 50 features, and 110 features are 0.69587, 0.7348, 0.79217, and 0.84164, respectively. Furthermore, only using half selected features (110 features) of the proposed method, the corresponding classification accuracy (0.84168) is approximate to the highest classification accuracy, 0.8795. For other two hyperspectral image data sets, the PAVIA data set and Salinas A data set, we can obtain the similar results. These results illustrate our proposed method can efficiently find feature subsets to improve hyperspectral image classification. One can apply the proposed method to determine the suitable feature subset first according to specific purposes. Then researchers can only use the corresponding sensors to obtain the hyperspectral image and classify the samples. This can not only improve the classification performance but also reduce the cost for obtaining hyperspectral images.

Keywords: hyperspectral image classification, nonlinear feature selection, kernel trick, support vector machine

Procedia PDF Downloads 263
20568 High Techno-Parks in the Economy of Azerbaijan and Their Management Problems

Authors: Rasim M. Alguliyev, Alovsat G. Aliyev, Roza O. Shahverdiyeva

Abstract:

The paper investigated the role and position of high techno-parks, which is one of the priorities of Azerbaijan. The main objectives, functions and features of the establishment of high-techno parks, as well as organization of the activity of the structural elements, which are the parking complex and their interactions were analyzed. The development, organization and management of high techno-parks were studied. The key features and functions of innovative structures’ management were explained. The need for a comprehensive management system for the development of high-techno parks was emphasized and the major problems were analyzed. In addition, the methods were proposed for the development of information systems supporting decision making in systematic and sustainable management of the parks.

Keywords: innovative development, innovation processes, innovation economy, innovation infrastructure, high technology park, efficient management, management decisions, information insurance

Procedia PDF Downloads 472
20567 Analyzing the Influence of Hydrometeorlogical Extremes, Geological Setting, and Social Demographic on Public Health

Authors: Irfan Ahmad Afip

Abstract:

This main research objective is to accurately identify the possibility for a Leptospirosis outbreak severity of a certain area based on its input features into a multivariate regression model. The research question is the possibility of an outbreak in a specific area being influenced by this feature, such as social demographics and hydrometeorological extremes. If the occurrence of an outbreak is being subjected to these features, then the epidemic severity for an area will be different depending on its environmental setting because the features will influence the possibility and severity of an outbreak. Specifically, this research objective was three-fold, namely: (a) to identify the relevant multivariate features and visualize the patterns data, (b) to develop a multivariate regression model based from the selected features and determine the possibility for Leptospirosis outbreak in an area, and (c) to compare the predictive ability of multivariate regression model and machine learning algorithms. Several secondary data features were collected locations in the state of Negeri Sembilan, Malaysia, based on the possibility it would be relevant to determine the outbreak severity in the area. The relevant features then will become an input in a multivariate regression model; a linear regression model is a simple and quick solution for creating prognostic capabilities. A multivariate regression model has proven more precise prognostic capabilities than univariate models. The expected outcome from this research is to establish a correlation between the features of social demographic and hydrometeorological with Leptospirosis bacteria; it will also become a contributor for understanding the underlying relationship between the pathogen and the ecosystem. The relationship established can be beneficial for the health department or urban planner to inspect and prepare for future outcomes in event detection and system health monitoring.

Keywords: geographical information system, hydrometeorological, leptospirosis, multivariate regression

Procedia PDF Downloads 115
20566 Credit Risk Prediction Based on Bayesian Estimation of Logistic Regression Model with Random Effects

Authors: Sami Mestiri, Abdeljelil Farhat

Abstract:

The aim of this current paper is to predict the credit risk of banks in Tunisia, over the period (2000-2005). For this purpose, two methods for the estimation of the logistic regression model with random effects: Penalized Quasi Likelihood (PQL) method and Gibbs Sampler algorithm are applied. By using the information on a sample of 528 Tunisian firms and 26 financial ratios, we show that Bayesian approach improves the quality of model predictions in terms of good classification as well as by the ROC curve result.

Keywords: forecasting, credit risk, Penalized Quasi Likelihood, Gibbs Sampler, logistic regression with random effects, curve ROC

Procedia PDF Downloads 542
20565 Enhancement of Long Term Peak Demand Forecast in Peninsular Malaysia Using Hourly Load Profile

Authors: Nazaitul Idya Hamzah, Muhammad Syafiq Mazli, Maszatul Akmar Mustafa

Abstract:

The peak demand forecast is crucial to identify the future generation plant up needed in the long-term capacity planning analysis for Peninsular Malaysia as well as for the transmission and distribution network planning activities. Currently, peak demand forecast (in Mega Watt) is derived from the generation forecast by using load factor assumption. However, a forecast using this method has underperformed due to the structural changes in the economy, emerging trends and weather uncertainty. The dynamic changes of these drivers will result in many possible outcomes of peak demand for Peninsular Malaysia. This paper will look into the independent model of peak demand forecasting. The model begins with the selection of driver variables to capture long-term growth. This selection and construction of variables, which include econometric, emerging trend and energy variables, will have an impact on the peak forecast. The actual framework begins with the development of system energy and load shape forecast by using the system’s hourly data. The shape forecast represents the system shape assuming all embedded technology and use patterns to continue in the future. This is necessary to identify the movements in the peak hour or changes in the system load factor. The next step would be developing the peak forecast, which involves an iterative process to explore model structures and variables. The final step is combining the system energy, shape, and peak forecasts into the hourly system forecast then modifying it with the forecast adjustments. Forecast adjustments are among other sales forecasts for electric vehicles, solar and other adjustments. The framework will result in an hourly forecast that captures growth, peak usage and new technologies. The advantage of this approach as compared to the current methodology is that the peaks capture new technology impacts that change the load shape.

Keywords: hourly load profile, load forecasting, long term peak demand forecasting, peak demand

Procedia PDF Downloads 172
20564 Analysis of the Engineering Judgement Influence on the Selection of Geotechnical Parameters Characteristic Values

Authors: K. Ivandic, F. Dodigovic, D. Stuhec, S. Strelec

Abstract:

A characteristic value of certain geotechnical parameter results from an engineering assessment. Its selection has to be based on technical principles and standards of engineering practice. It has been shown that the results of engineering assessment of different authors for the same problem and input data are significantly dispersed. A survey was conducted in which participants had to estimate the force that causes a 10 cm displacement at the top of a axially in-situ compressed pile. Fifty experts from all over the world took part in it. The lowest estimated force value was 42% and the highest was 133% of measured force resulting from a mentioned static pile load test. These extreme values result in significantly different technical solutions to the same engineering task. In case of selecting a characteristic value of a geotechnical parameter the importance of the influence of an engineering assessment can be reduced by using statistical methods. An informative annex of Eurocode 1 prescribes the method of selecting the characteristic values of material properties. This is followed by Eurocode 7 with certain specificities linked to selecting characteristic values of geotechnical parameters. The paper shows the procedure of selecting characteristic values of a geotechnical parameter by using a statistical method with different initial conditions. The aim of the paper is to quantify an engineering assessment in the example of determining a characteristic value of a specific geotechnical parameter. It is assumed that this assessment is a random variable and that its statistical features will be determined. For this purpose, a survey research was conducted among relevant experts from the field of geotechnical engineering. Conclusively, the results of the survey and the application of statistical method were compared.

Keywords: characteristic values, engineering judgement, Eurocode 7, statistical methods

Procedia PDF Downloads 296
20563 Assessing Future Offshore Wind Farms in the Gulf of Roses: Insights from Weather Research and Forecasting Model Version 4.2

Authors: Kurias George, Ildefonso Cuesta Romeo, Clara Salueña Pérez, Jordi Sole Olle

Abstract:

With the growing prevalence of wind energy there is a need, for modeling techniques to evaluate the impact of wind farms on meteorology and oceanography. This study presents an approach that utilizes the WRF (Weather Research and Forecasting )with that include a Wind Farm Parametrization model to simulate the dynamics around Parc Tramuntana project, a offshore wind farm to be located near the Gulf of Roses off the coast of Barcelona, Catalonia. The model incorporates parameterizations for wind turbines enabling a representation of the wind field and how it interacts with the infrastructure of the wind farm. Current results demonstrate that the model effectively captures variations in temeperature, pressure and in both wind speed and direction over time along with their resulting effects on power output from the wind farm. These findings are crucial for optimizing turbine placement and operation thus improving efficiency and sustainability of the wind farm. In addition to focusing on atmospheric interactions, this study delves into the wake effects within the turbines in the farm. A range of meteorological parameters were also considered to offer a comprehensive understanding of the farm's microclimate. The model was tested under different horizontal resolutions and farm layouts to scrutinize the wind farm's effects more closely. These experimental configurations allow for a nuanced understanding of how turbine wakes interact with each other and with the broader atmospheric and oceanic conditions. This modified approach serves as a potent tool for stakeholders in renewable energy, environmental protection, and marine spatial planning. environmental protection and marine spatial planning. It provides a range of information regarding the environmental and socio economic impacts of offshore wind energy projects.

Keywords: weather research and forecasting, wind turbine wake effects, environmental impact, wind farm parametrization, sustainability analysis

Procedia PDF Downloads 72
20562 Enhancement of Road Defect Detection Using First-Level Algorithm Based on Channel Shuffling and Multi-Scale Feature Fusion

Authors: Yifan Hou, Haibo Liu, Le Jiang, Wandong Su, Binqing Wang

Abstract:

Road defect detection is crucial for modern urban management and infrastructure maintenance. Traditional road defect detection methods mostly rely on manual labor, which is not only inefficient but also difficult to ensure their reliability. However, existing deep learning-based road defect detection models have poor detection performance in complex environments and lack robustness to multi-scale targets. To address this challenge, this paper proposes a distinct detection framework based on the one stage algorithm network structure. This article designs a deep feature extraction network based on RCSDarknet, which applies channel shuffling to enhance information fusion between tensors. Through repeated stacking of RCS modules, the information flow between different channels of adjacent layer features is enhanced to improve the model's ability to capture target spatial features. In addition, a multi-scale feature fusion mechanism with weighted dual flow paths was adopted to fuse spatial features of different scales, thereby further improving the detection performance of the model at different scales. To validate the performance of the proposed algorithm, we tested it using the RDD2022 dataset. The experimental results show that the enhancement algorithm achieved 84.14% mAP, which is 1.06% higher than the currently advanced YOLOv8 algorithm. Through visualization analysis of the results, it can also be seen that our proposed algorithm has good performance in detecting targets of different scales in complex scenes. The above experimental results demonstrate the effectiveness and superiority of the proposed algorithm, providing valuable insights for advancing real-time road defect detection methods.

Keywords: roads, defect detection, visualization, deep learning

Procedia PDF Downloads 6
20561 Implementation of a Multimodal Biometrics Recognition System with Combined Palm Print and Iris Features

Authors: Rabab M. Ramadan, Elaraby A. Elgallad

Abstract:

With extensive application, the performance of unimodal biometrics systems has to face a diversity of problems such as signal and background noise, distortion, and environment differences. Therefore, multimodal biometric systems are proposed to solve the above stated problems. This paper introduces a bimodal biometric recognition system based on the extracted features of the human palm print and iris. Palm print biometric is fairly a new evolving technology that is used to identify people by their palm features. The iris is a strong competitor together with face and fingerprints for presence in multimodal recognition systems. In this research, we introduced an algorithm to the combination of the palm and iris-extracted features using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Since the feature sets are non-homogeneous as features of different biometric modalities are used, these features will be concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature. The proposed algorithm will be applied to the Institute of Technology of Delhi (IITD) database and its performance will be compared with various iris recognition algorithms found in the literature.

Keywords: iris recognition, particle swarm optimization, feature extraction, feature selection, palm print, the Scale Invariant Feature Transform (SIFT)

Procedia PDF Downloads 235
20560 AS-Geo: Arbitrary-Sized Image Geolocalization with Learnable Geometric Enhancement Resizer

Authors: Huayuan Lu, Chunfang Yang, Ma Zhu, Baojun Qi, Yaqiong Qiao, Jiangqian Xu

Abstract:

Image geolocalization has great application prospects in fields such as autonomous driving and virtual/augmented reality. In practical application scenarios, the size of the image to be located is not fixed; it is impractical to train different networks for all possible sizes. When its size does not match the size of the input of the descriptor extraction model, existing image geolocalization methods usually directly scale or crop the image in some common ways. This will result in the loss of some information important to the geolocalization task, thus affecting the performance of the image geolocalization method. For example, excessive down-sampling can lead to blurred building contour, and inappropriate cropping can lead to the loss of key semantic elements, resulting in incorrect geolocation results. To address this problem, this paper designs a learnable image resizer and proposes an arbitrary-sized image geolocation method. (1) The designed learnable image resizer employs the self-attention mechanism to enhance the geometric features of the resized image. Firstly, it applies bilinear interpolation to the input image and its feature maps to obtain the initial resized image and the resized feature maps. Then, SKNet (selective kernel net) is used to approximate the best receptive field, thus keeping the geometric shapes as the original image. And SENet (squeeze and extraction net) is used to automatically select the feature maps with strong contour information, enhancing the geometric features. Finally, the enhanced geometric features are fused with the initial resized image, to obtain the final resized images. (2) The proposed image geolocalization method embeds the above image resizer as a fronting layer of the descriptor extraction network. It not only enables the network to be compatible with arbitrary-sized input images but also enhances the geometric features that are crucial to the image geolocalization task. Moreover, the triplet attention mechanism is added after the first convolutional layer of the backbone network to optimize the utilization of geometric elements extracted by the first convolutional layer. Finally, the local features extracted by the backbone network are aggregated to form image descriptors for image geolocalization. The proposed method was evaluated on several mainstream datasets, such as Pittsburgh30K, Tokyo24/7, and Places365. The results show that the proposed method has excellent size compatibility and compares favorably to recently mainstream geolocalization methods.

Keywords: image geolocalization, self-attention mechanism, image resizer, geometric feature

Procedia PDF Downloads 214
20559 A Hybrid Fuzzy Clustering Approach for Fertile and Unfertile Analysis

Authors: Shima Soltanzadeh, Mohammad Hosain Fazel Zarandi, Mojtaba Barzegar Astanjin

Abstract:

Diagnosis of male infertility by the laboratory tests is expensive and, sometimes it is intolerable for patients. Filling out the questionnaire and then using classification method can be the first step in decision-making process, so only in the cases with a high probability of infertility we can use the laboratory tests. In this paper, we evaluated the performance of four classification methods including naive Bayesian, neural network, logistic regression and fuzzy c-means clustering as a classification, in the diagnosis of male infertility due to environmental factors. Since the data are unbalanced, the ROC curves are most suitable method for the comparison. In this paper, we also have selected the more important features using a filtering method and examined the impact of this feature reduction on the performance of each methods; generally, most of the methods had better performance after applying the filter. We have showed that using fuzzy c-means clustering as a classification has a good performance according to the ROC curves and its performance is comparable to other classification methods like logistic regression.

Keywords: classification, fuzzy c-means, logistic regression, Naive Bayesian, neural network, ROC curve

Procedia PDF Downloads 336
20558 The Syntactic Features of Islamic Legal Texts and Their Implications for Translation

Authors: Rafat Y. Alwazna

Abstract:

Certain religious texts are deemed part of legal texts that are characterised by high sensitivity and sacredness. Amongst such religious texts are Islamic legal texts that are replete with Islamic legal terms that designate particular legal concepts peculiar to Islamic legal system and legal culture. However, from the syntactic perspective, Islamic legal texts prove lengthy, condensed and convoluted, with little use of punctuation system, but with an extensive use of subordinations and co-ordinations, which separate the main verb from the subject, and which, of course, carry a heavy load of legal detail. The present paper seeks to examine the syntactic features of Islamic legal texts through analysing a short text of Islamic jurisprudence in an attempt at exploring the syntactic features that characterise this type of legal text. A translation of this text into legal English is then exercised to find the translation implications that have emerged as a result of the English translation. Based on these implications, the paper compares and contrasts the syntactic features of Islamic legal texts to those of legal English texts. Finally, the present paper argues that there are a number of syntactic features of Islamic legal texts, such as nominalisation, passivisation, little use of punctuation system, the use of the Arabic cohesive device, etc., which are also possessed by English legal texts except for the last feature and with some variations. The paper also claims that when rendering an Islamic legal text into legal English, certain implications emerge, such as the necessity of a sentence break, the omission of the cohesive device concerned and the increase in the use of nominalisation, passivisation, passive participles, and so on.

Keywords: English legal texts, Islamic legal texts, nominalisation, participles, passivisation, syntactic features, translation implications

Procedia PDF Downloads 233
20557 Trabecular Bone Radiograph Characterization Using Fractal, Multifractal Analysis and SVM Classifier

Authors: I. Slim, H. Akkari, A. Ben Abdallah, I. Bhouri, M. Hedi Bedoui

Abstract:

Osteoporosis is a common disease characterized by low bone mass and deterioration of micro-architectural bone tissue, which provokes an increased risk of fracture. This work treats the texture characterization of trabecular bone radiographs. The aim was to analyze according to clinical research a group of 174 subjects: 87 osteoporotic patients (OP) with various bone fracture types and 87 control cases (CC). To characterize osteoporosis, Fractal and MultiFractal (MF) methods were applied to images for features (attributes) extraction. In order to improve the results, a new method of MF spectrum based on the q-stucture function calculation was proposed and a combination of Fractal and MF attributes was used. The Support Vector Machines (SVM) was applied as a classifier to distinguish between OP patients and CC subjects. The features fusion (fractal and MF) allowed a good discrimination between the two groups with an accuracy rate of 96.22%.

Keywords: fractal, micro-architecture analysis, multifractal, osteoporosis, SVM

Procedia PDF Downloads 392
20556 A Comparative Study of Optimization Techniques and Models to Forecasting Dengue Fever

Authors: Sudha T., Naveen C.

Abstract:

Dengue is a serious public health issue that causes significant annual economic and welfare burdens on nations. However, enhanced optimization techniques and quantitative modeling approaches can predict the incidence of dengue. By advocating for a data-driven approach, public health officials can make informed decisions, thereby improving the overall effectiveness of sudden disease outbreak control efforts. The National Oceanic and Atmospheric Administration and the Centers for Disease Control and Prevention are two of the U.S. Federal Government agencies from which this study uses environmental data. Based on environmental data that describe changes in temperature, precipitation, vegetation, and other factors known to affect dengue incidence, many predictive models are constructed that use different machine learning methods to estimate weekly dengue cases. The first step involves preparing the data, which includes handling outliers and missing values to make sure the data is prepared for subsequent processing and the creation of an accurate forecasting model. In the second phase, multiple feature selection procedures are applied using various machine learning models and optimization techniques. During the third phase of the research, machine learning models like the Huber Regressor, Support Vector Machine, Gradient Boosting Regressor (GBR), and Support Vector Regressor (SVR) are compared with several optimization techniques for feature selection, such as Harmony Search and Genetic Algorithm. In the fourth stage, the model's performance is evaluated using Mean Square Error (MSE), Mean Absolute Error (MAE), and Root Mean Square Error (RMSE) as assistance. Selecting an optimization strategy with the least number of errors, lowest price, biggest productivity, or maximum potential results is the goal. In a variety of industries, including engineering, science, management, mathematics, finance, and medicine, optimization is widely employed. An effective optimization method based on harmony search and an integrated genetic algorithm is introduced for input feature selection, and it shows an important improvement in the model's predictive accuracy. The predictive models with Huber Regressor as the foundation perform the best for optimization and also prediction.

Keywords: deep learning model, dengue fever, prediction, optimization

Procedia PDF Downloads 65
20555 Design and Implementation of Machine Learning Model for Short-Term Energy Forecasting in Smart Home Management System

Authors: R. Ramesh, K. K. Shivaraman

Abstract:

The main aim of this paper is to handle the energy requirement in an efficient manner by merging the advanced digital communication and control technologies for smart grid applications. In order to reduce user home load during peak load hours, utility applies several incentives such as real-time pricing, time of use, demand response for residential customer through smart meter. However, this method provides inconvenience in the sense that user needs to respond manually to prices that vary in real time. To overcome these inconvenience, this paper proposes a convolutional neural network (CNN) with k-means clustering machine learning model which have ability to forecast energy requirement in short term, i.e., hour of the day or day of the week. By integrating our proposed technique with home energy management based on Bluetooth low energy provides predicted value to user for scheduling appliance in advanced. This paper describes detail about CNN configuration and k-means clustering algorithm for short-term energy forecasting.

Keywords: convolutional neural network, fuzzy logic, k-means clustering approach, smart home energy management

Procedia PDF Downloads 304
20554 Breast Cancer Survivability Prediction via Classifier Ensemble

Authors: Mohamed Al-Badrashiny, Abdelghani Bellaachia

Abstract:

This paper presents a classifier ensemble approach for predicting the survivability of the breast cancer patients using the latest database version of the Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute. The system consists of two main components; features selection and classifier ensemble components. The features selection component divides the features in SEER database into four groups. After that it tries to find the most important features among the four groups that maximizes the weighted average F-score of a certain classification algorithm. The ensemble component uses three different classifiers, each of which models different set of features from SEER through the features selection module. On top of them, another classifier is used to give the final decision based on the output decisions and confidence scores from each of the underlying classifiers. Different classification algorithms have been examined; the best setup found is by using the decision tree, Bayesian network, and Na¨ıve Bayes algorithms for the underlying classifiers and Na¨ıve Bayes for the classifier ensemble step. The system outperforms all published systems to date when evaluated against the exact same data of SEER (period of 1973-2002). It gives 87.39% weighted average F-score compared to 85.82% and 81.34% of the other published systems. By increasing the data size to cover the whole database (period of 1973-2014), the overall weighted average F-score jumps to 92.4% on the held out unseen test set.

Keywords: classifier ensemble, breast cancer survivability, data mining, SEER

Procedia PDF Downloads 328
20553 Ensemble Methods in Machine Learning: An Algorithmic Approach to Derive Distinctive Behaviors of Criminal Activity Applied to the Poaching Domain

Authors: Zachary Blanks, Solomon Sonya

Abstract:

Poaching presents a serious threat to endangered animal species, environment conservations, and human life. Additionally, some poaching activity has even been linked to supplying funds to support terrorist networks elsewhere around the world. Consequently, agencies dedicated to protecting wildlife habitats have a near intractable task of adequately patrolling an entire area (spanning several thousand kilometers) given limited resources, funds, and personnel at their disposal. Thus, agencies need predictive tools that are both high-performing and easily implementable by the user to help in learning how the significant features (e.g. animal population densities, topography, behavior patterns of the criminals within the area, etc) interact with each other in hopes of abating poaching. This research develops a classification model using machine learning algorithms to aid in forecasting future attacks that is both easy to train and performs well when compared to other models. In this research, we demonstrate how data imputation methods (specifically predictive mean matching, gradient boosting, and random forest multiple imputation) can be applied to analyze data and create significant predictions across a varied data set. Specifically, we apply these methods to improve the accuracy of adopted prediction models (Logistic Regression, Support Vector Machine, etc). Finally, we assess the performance of the model and the accuracy of our data imputation methods by learning on a real-world data set constituting four years of imputed data and testing on one year of non-imputed data. This paper provides three main contributions. First, we extend work done by the Teamcore and CREATE (Center for Risk and Economic Analysis of Terrorism Events) research group at the University of Southern California (USC) working in conjunction with the Department of Homeland Security to apply game theory and machine learning algorithms to develop more efficient ways of reducing poaching. This research introduces ensemble methods (Random Forests and Stochastic Gradient Boosting) and applies it to real-world poaching data gathered from the Ugandan rain forest park rangers. Next, we consider the effect of data imputation on both the performance of various algorithms and the general accuracy of the method itself when applied to a dependent variable where a large number of observations are missing. Third, we provide an alternate approach to predict the probability of observing poaching both by season and by month. The results from this research are very promising. We conclude that by using Stochastic Gradient Boosting to predict observations for non-commercial poaching by season, we are able to produce statistically equivalent results while being orders of magnitude faster in computation time and complexity. Additionally, when predicting potential poaching incidents by individual month vice entire seasons, boosting techniques produce a mean area under the curve increase of approximately 3% relative to previous prediction schedules by entire seasons.

Keywords: ensemble methods, imputation, machine learning, random forests, statistical analysis, stochastic gradient boosting, wildlife protection

Procedia PDF Downloads 291
20552 A Fuzzy-Rough Feature Selection Based on Binary Shuffled Frog Leaping Algorithm

Authors: Javad Rahimipour Anaraki, Saeed Samet, Mahdi Eftekhari, Chang Wook Ahn

Abstract:

Feature selection and attribute reduction are crucial problems, and widely used techniques in the field of machine learning, data mining and pattern recognition to overcome the well-known phenomenon of the Curse of Dimensionality. This paper presents a feature selection method that efficiently carries out attribute reduction, thereby selecting the most informative features of a dataset. It consists of two components: 1) a measure for feature subset evaluation, and 2) a search strategy. For the evaluation measure, we have employed the fuzzy-rough dependency degree (FRFDD) of the lower approximation-based fuzzy-rough feature selection (L-FRFS) due to its effectiveness in feature selection. As for the search strategy, a modified version of a binary shuffled frog leaping algorithm is proposed (B-SFLA). The proposed feature selection method is obtained by hybridizing the B-SFLA with the FRDD. Nine classifiers have been employed to compare the proposed approach with several existing methods over twenty two datasets, including nine high dimensional and large ones, from the UCI repository. The experimental results demonstrate that the B-SFLA approach significantly outperforms other metaheuristic methods in terms of the number of selected features and the classification accuracy.

Keywords: binary shuffled frog leaping algorithm, feature selection, fuzzy-rough set, minimal reduct

Procedia PDF Downloads 225
20551 Support Vector Regression Combined with Different Optimization Algorithms to Predict Global Solar Radiation on Horizontal Surfaces in Algeria

Authors: Laidi Maamar, Achwak Madani, Abdellah El Ahdj Abdellah

Abstract:

The aim of this work is to use Support Vector regression (SVR) combined with dragonfly, firefly, Bee Colony and particle swarm Optimization algorithm to predict global solar radiation on horizontal surfaces in some cities in Algeria. Combining these optimization algorithms with SVR aims principally to enhance accuracy by fine-tuning the parameters, speeding up the convergence of the SVR model, and exploring a larger search space efficiently; these parameters are the regularization parameter (C), kernel parameters, and epsilon parameter. By doing so, the aim is to improve the generalization and predictive accuracy of the SVR model. Overall, the aim is to leverage the strengths of both SVR and optimization algorithms to create a more powerful and effective regression model for various cities and under different climate conditions. Results demonstrate close agreement between predicted and measured data in terms of different metrics. In summary, SVM has proven to be a valuable tool in modeling global solar radiation, offering accurate predictions and demonstrating versatility when combined with other algorithms or used in hybrid forecasting models.

Keywords: support vector regression (SVR), optimization algorithms, global solar radiation prediction, hybrid forecasting models

Procedia PDF Downloads 35
20550 Is Presence of Psychotic Features Themselves Carry a Risk for Metabolic Syndrome?

Authors: Rady A., Elsheshai A., Elsawy M., Nagui R.

Abstract:

Background and Aim: Metabolic syndrome affect around 20% of general population , authors have incriminated antipsychotics as serious risk factor that may provoke such derangement. The aim of our study is to assess metabolic syndrome in patients presenting psychotic features (delusions and hallucinations) whether schizophrenia or mood disorder and compare results in terms of drug naïf, on medication and healthy control. Subjects and Methods: The study recruited 40 schizophrenic patients, half of them drug naïf and the other half on antipsychotics, 40 patients with mood disorder with psychotic features, half of them drug naïf and the other half on medication, 20 healthy control. Exclusion criteria were put in order to exclude patients having already endocrine or metabolic disorders that my interfere with results obtain to minimize confusion bias. Metabolic syndrome assessed by measuring parameters including weight, body mass index, waist circumference, triglyceride level, HDL, fasting glucose, fasting insulin and insulin resistance Results: No difference was found when comparing drug naïf to those on medication in both schizophrenic and psychotic mood disorder arms, schizophrenic patients whether on medication or drug naïf should difference with control group for fasting glucose, schizophrenic patients on medication also showed difference in insulin resistance compared to control group. On the other hand, patients with psychotic mood disorder whether drug naïf or on medication showed difference from control group for fasting insulin level. Those on medication also differed from control for insulin resistance Conclusion: Our study didn’t reveal difference in metabolic syndrome among patients with psychotic features whether on medication or drug naïf. Only patients with Psychotic features on medication showed insulin resistance. Schizophrenic patients drug naïf or on medication tend to show higher fasting glucose while psychotic mood disorder whether drug naïf or on medication tend to show higher fasting insulin. This study suggest that presence of psychotic features themselves regardless being on medication or not carries a risk for insulin resistance and metabolic syndrome. Limitation: This study is limited by number of participants and larger numbers in future studies should be included in order to extrapolate results. Cohort longitudinal studies are needed in order to evaluate such hypothesis.

Keywords: schizophrenia, metabolic syndrome, psychosis, insulin, resistance

Procedia PDF Downloads 535
20549 Enhanced Thai Character Recognition with Histogram Projection Feature Extraction

Authors: Benjawan Rangsikamol, Chutimet Srinilta

Abstract:

This research paper deals with extraction of Thai character features using the proposed histogram projection so as to improve the recognition performance. The process starts with transformation of image files into binary files before thinning. After character thinning, the skeletons are entered into the proposed extraction using histogram projection (horizontal and vertical) to extract unique features which are inputs of the subsequent recognition step. The recognition rate with the proposed extraction technique is as high as 97 percent since the technique works very well with the idiosyncrasies of Thai characters.

Keywords: character recognition, histogram projection, multilayer perceptron, Thai character features extraction

Procedia PDF Downloads 464
20548 Load Forecasting Using Neural Network Integrated with Economic Dispatch Problem

Authors: Mariyam Arif, Ye Liu, Israr Ul Haq, Ahsan Ashfaq

Abstract:

High cost of fossil fuels and intensifying installations of alternate energy generation sources are intimidating main challenges in power systems. Making accurate load forecasting an important and challenging task for optimal energy planning and management at both distribution and generation side. There are many techniques to forecast load but each technique comes with its own limitation and requires data to accurately predict the forecast load. Artificial Neural Network (ANN) is one such technique to efficiently forecast the load. Comparison between two different ranges of input datasets has been applied to dynamic ANN technique using MATLAB Neural Network Toolbox. It has been observed that selection of input data on training of a network has significant effects on forecasted results. Day-wise input data forecasted the load accurately as compared to year-wise input data. The forecasted load is then distributed among the six generators by using the linear programming to get the optimal point of generation. The algorithm is then verified by comparing the results of each generator with their respective generation limits.

Keywords: artificial neural networks, demand-side management, economic dispatch, linear programming, power generation dispatch

Procedia PDF Downloads 188
20547 The Term Spread Impact on Economic Activity for Transition Economies: Case of Georgia

Authors: L. Totladze

Abstract:

The role of financial sector in supporting economic growth and development is well acknowledged. The term spread (the difference between the yields on long-term and short-term Treasury securities) has been found useful for predicting economic variables as output growth, inflation, industrial production, consumption. The temp spread is one of the leading economic indicators according to NBER methodology. Leading economic indicators are widely used in forecasting of economic activity. Many empirical studies find that the term spread predicts future economic activity. The article shortly explains how the term spread might predict future economic activity. This paper analyses the dynamics of the spread between short and long-term interest rates in countries with transition economies. The research paper analyses term spread dynamics in Georgia and compare it with post-communist countries and transition economies spread dynamics. In Georgia, the banking sector plays an important and dominant role in the financial sector, especially with respect to the mobilization of savings and provision of credit and may impact on economic activity. For this purpose, we study the impact of the term spread on economic growth in Georgia.

Keywords: forecasting, leading economic indicators, term spread, transition economies

Procedia PDF Downloads 176
20546 A Review of Serious Games Characteristics: Common and Specific Aspects

Authors: B. Ben Amara, H. Mhiri Sellami

Abstract:

Serious games adoption is increasing in multiple fields, including health, education, and business. In the same way, many research studied serious games (SGs) for various purposes such as classification, positive impacts, or learning outcomes. Although most of these research examine SG characteristics (SGCs) for conducting their studies, to author’s best knowledge, there is no consensus about features neither in number not in the description. In this paper, we conduct a literature review to collect essential game attributes regardless of the application areas and the study objectives. Firstly, we aimed to define Common SGCs (CSGCs) that characterize the game aspect, by gathering features having the same meanings. Secondly, we tried to identify specific features related to the application area or to the study purpose as a serious aspect. The findings suggest that any type of SG can be defined by a number of CSGCs depicting the gaming side, such as adaptability and rules. In addition, we outlined a number of specific SGCs describing the 'serious' aspect, including specific needs of the domain and indented outcomes. In conclusion, our review showed that it is possible to bridge the research gap due to the lack of consensus by using CSGCs. Moreover, these features facilitate the design and development of successful serious games in any domain and provide a foundation for further research in this area.

Keywords: serious game characteristics, serious games common aspects, serious games features, serious games outcomes

Procedia PDF Downloads 135
20545 Evaluation of Technology Tools for Mathematics Instruction by Novice Elementary Teachers

Authors: Christopher J. Johnston

Abstract:

This paper presents the finding of a research study in which novice (first and second year) elementary teachers (grades Kindergarten – six) evaluated various mathematics Virtual Manipulatives, websites, and Applets (tools) for use in mathematics instruction. Participants identified the criteria they used for evaluating these types of resources and provided recommendations for or against five pre-selected tools. During the study, participants participated in three data collection activities: (1) A brief Likert-scale survey which gathered information about their attitudes toward technology use; (2) An identification of criteria for evaluating technology tools; and (3) A review of five pre-selected technology tools in light of their self-identified criteria. Data were analyzed qualitatively using four theoretical categories (codes): Software Features (41%), Mathematics (26%), Learning (22%), and Motivation (11%). These four theoretical categories were then grouped into two broad categories: Content and Instruction (Mathematics and Learning), and Surface Features (Software Features and Motivation). These combined, broad categories suggest novice teachers place roughly the same weight on pedagogical features as they do technological features. Implications for mathematics teacher educators are discussed, and suggestions for future research are provided.

Keywords: mathematics education, novice teachers, technology, virtual manipulatives

Procedia PDF Downloads 133
20544 Forecasting Age-Specific Mortality Rates and Life Expectancy at Births for Malaysian Sub-Populations

Authors: Syazreen N. Shair, Saiful A. Ishak, Aida Y. Yusof, Azizah Murad

Abstract:

In this paper, we forecast age-specific Malaysian mortality rates and life expectancy at births by gender and ethnic groups including Malay, Chinese and Indian. Two mortality forecasting models are adopted the original Lee-Carter model and its recent modified version, the product ratio coherent model. While the first forecasts the mortality rates for each subpopulation independently, the latter accounts for the relationship between sub-populations. The evaluation of both models is performed using the out-of-sample forecast errors which are mean absolute percentage errors (MAPE) for mortality rates and mean forecast errors (MFE) for life expectancy at births. The best model is then used to perform the long-term forecasts up to the year 2030, the year when Malaysia is expected to become an aged nation. Results suggest that in terms of overall accuracy, the product ratio model performs better than the original Lee-Carter model. The association of lower mortality group (Chinese) in the subpopulation model can improve the forecasts of high mortality groups (Malay and Indian).

Keywords: coherent forecasts, life expectancy at births, Lee-Carter model, product-ratio model, mortality rates

Procedia PDF Downloads 218
20543 Design and Fabrication of a Scaffold with Appropriate Features for Cartilage Tissue Engineering

Authors: S. S. Salehi, A. Shamloo

Abstract:

Poor ability of cartilage tissue when experiencing a damage leads scientists to use tissue engineering as a reliable and effective method for regenerating or replacing damaged tissues. An artificial tissue should have some features such as biocompatibility, biodegradation and, enough mechanical properties like the original tissue. In this work, a composite hydrogel is prepared by using natural and synthetic materials that has high porosity. Mechanical properties of different combinations of polymers such as modulus of elasticity were tested, and a hydrogel with good mechanical properties was selected. Bone marrow derived mesenchymal stem cells were also seeded into the pores of the sponge, and the results showed the adhesion and proliferation of cells within the hydrogel after one month. In comparison with previous works, this study offers a new and efficient procedure for the fabrication of cartilage like tissue and further cartilage repair.

Keywords: cartilage tissue engineering, hydrogel, mechanical strength, mesenchymal stem cell

Procedia PDF Downloads 300
20542 Identification of Vessel Class with Long Short-Term Memory Using Kinematic Features in Maritime Traffic Control

Authors: Davide Fuscà, Kanan Rahimli, Roberto Leuzzi

Abstract:

Preventing abuse and illegal activities in a given area of the sea is a very difficult and expensive task. Artificial intelligence offers the possibility to implement new methods to identify the vessel class type from the kinematic features of the vessel itself. The task strictly depends on the quality of the data. This paper explores the application of a deep, long short-term memory model by using AIS flow only with a relatively low quality. The proposed model reaches high accuracy on detecting nine vessel classes representing the most common vessel types in the Ionian-Adriatic Sea. The model has been applied during the Adriatic-Ionian trial period of the international EU ANDROMEDA H2020 project to identify vessels performing behaviors far from the expected one depending on the declared type.

Keywords: maritime surveillance, artificial intelligence, behavior analysis, LSTM

Procedia PDF Downloads 231