Search results for: accuracy
3330 Forecasting Stock Prices Based on the Residual Income Valuation Model: Evidence from a Time-Series Approach
Authors: Chen-Yin Kuo, Yung-Hsin Lee
Abstract:
Previous studies applying the residual income valuation (RIV) model generally use panel data and single-equation models to forecast stock prices. In contrast, this paper uses Taiwan longitudinal data to estimate multi-equation time-series models such as the Vector Autoregressive (VAR) model and the Vector Error Correction Model (VECM), and conducts out-of-sample forecasting. Further, this work assesses their forecasting performance with two instruments. Consistent with extant research, the major finding shows that VECM outperforms the other three models in forecasting for three stock sectors over all horizons. This implies that an error correction term containing long-run information helps improve forecasting accuracy. Moreover, the composite pattern shows that at longer horizons, VECM produces a greater reduction in errors and performs substantially better than VAR.
Keywords: residual income valuation model, vector error correction model, out of sample forecasting, forecasting accuracy
Procedia PDF Downloads 316
3329 Amharic Text News Classification Using Supervised Learning
Authors: Misrak Assefa
Abstract:
The Amharic language is the second most widely spoken Semitic language in the world. There is an overload of news on the web, and searching for useful documents on a specific topic written in Amharic is a challenging task. Hence, document categorization is required for managing and filtering important information. In the classification of Amharic text news, there is still a gap that needs to be addressed. This study attempts to design an automatic Amharic news classifier using a supervised learning mechanism on four previously untouched classes. For this research, 4,182 news articles were used. Naive Bayes (NB) and decision tree (j48) algorithms were used to classify the given Amharic dataset, and k-fold cross-validation was used to estimate the accuracy of the classifiers. The results show that these algorithms are applicable to Amharic news categorization. The best average accuracies achieved by the j48 decision tree and naive Bayes are 95.2345% and 94.6245%, respectively, using three categories. This research indicates that a typical decision tree algorithm is more applicable to Amharic news categorization.
Keywords: text categorization, supervised machine learning, naive Bayes, decision tree
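The k-fold cross-validation mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the `evaluate_fold` callback are hypothetical, the value of k is not given in the abstract, and shuffling before splitting is omitted for brevity.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(n_samples, k, evaluate_fold):
    """Average the score returned by evaluate_fold(train_idx, test_idx) over k folds.

    evaluate_fold is a hypothetical callback that trains a classifier on
    train_idx and returns its accuracy on test_idx.
    """
    folds = k_fold_indices(n_samples, k)
    scores = []
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one forms the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(evaluate_fold(train_idx, test_idx))
    return sum(scores) / k
```

Each article is used for testing exactly once, so the averaged score uses all 4,182 articles without ever scoring a model on data it was trained on.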
Procedia PDF Downloads 209
3328 A Simple and Easy-To-Use Tool for Detecting Outer Contour of Leukocytes Based on Image Processing Techniques
Authors: Retno Supriyanti, Best Leader Nababan, Yogi Ramadhani, Wahyu Siswandari
Abstract:
Blood cell morphology is an important parameter in a hematology test. Currently, in developing countries, much hematology is done manually, either by physicians or laboratory staff. Owing to the limitations of the human eye, examination by manual methods results in lower precision and accuracy. In addition, manual hematology testing further complicates diagnosis in areas that lack competent medical personnel. This research aims to develop a simple computer-based tool for detecting blood cell morphology. In this paper, we focus on detecting the outer contour of leukocytes. The results show that the system we developed is promising for detecting blood cell morphology automatically. By implementing this method, the problems of accuracy, precision and the limited availability of medical staff are expected to be addressed.
Keywords: morphology operation, developing countries, hematology test, limitation of medical personnel
Procedia PDF Downloads 337
3327 Margin-Based Feed-Forward Neural Network Classifiers
Authors: Xiaohan Bookman, Xiaoyan Zhu
Abstract:
The Margin-Based Principle was proposed long ago, and it has been shown to reduce structural risk and improve performance in both theory and practice. Meanwhile, the feed-forward neural network is a traditional classifier that is currently very popular in its deeper architectures. However, the training algorithm of feed-forward neural networks is derived from the Widrow-Hoff Principle, which minimizes the squared error. In this paper, we propose a new training algorithm for feed-forward neural networks based on the Margin-Based Principle, which can effectively improve the accuracy and generalization ability of neural network classifiers with fewer labeled samples and a flexible network. We conducted experiments on four UCI open data sets and achieved good results, as expected. In conclusion, our model can handle sparser labeled and higher-dimensional data sets with high accuracy, while modifying an existing ANN method to use our method is easy and requires almost no work.
Keywords: Max-Margin Principle, Feed-Forward Neural Network, classifier, structural risk
Procedia PDF Downloads 341
3326 WebAppShield: An Approach Exploiting Machine Learning to Detect SQLi Attacks in an Application Layer in Run-time
Authors: Ahmed Abdulla Ashlam, Atta Badii, Frederic Stahl
Abstract:
In recent years, SQL injection attacks have been identified as being prevalent against web applications. They affect network security and user data, which leads to a considerable loss of money and data every year. This paper presents the use of machine learning classification algorithms to classify filtered login inputs as "SQLi" or "Non-SQLi", thus increasing the reliability and accuracy of deciding whether an operation is an attack or a valid operation. A method, Web-App auto-generated twin data structure replication Shielding against SQLi attacks (WebAppShield), has been developed; it verifies all users and prevents attackers (SQLi attacks) from entering or accessing the database, so that only operations the machine learning module predicts as "Non-SQLi" reach it. A special login form has been developed with a special instance of data validation; this verification process secures the web application from its early stages. The system has been tested and validated; up to 99% of SQLi attacks have been prevented.
Keywords: SQL injection, attacks, web application, accuracy, database
Procedia PDF Downloads 151
3325 Cognitive Methods for Detecting Deception During the Criminal Investigation Process
Authors: Laid Fekih
Abstract:
Background: It is difficult to detect lying, deception, and misrepresentation just by looking at verbal or non-verbal expression during the criminal investigation process, despite the common belief that one can tell whether a person is lying or telling the truth from the way they act or behave. Detecting lies and deception during criminal investigation needs more study and research to overcome the difficulties facing investigators. Method: The present study aimed to identify the effectiveness of cognitive methods and techniques in detecting deception during criminal investigation. It adopted the quasi-experimental method and covered a sample of 20 defendants distributed randomly into two homogeneous groups: an experimental group of 10 defendants subjected to criminal investigation using cognitive techniques to detect deception, and a second experimental group of 10 defendants subjected to the direct investigation method. The tool used is a guided interview based on models of investigative questions according to the cognitive deception detection approach, which consists of three techniques of Vrij (imposing cognitive burden, encouraging the interviewee to provide more information, and asking unexpected questions), and the Direct Investigation Method. Results: Results revealed a significant difference between the two groups in terms of lie detection accuracy, in favour of the defendants investigated using cognitive techniques. The cognitive deception detection approach produced superior total accuracy rates both with human observers and through an analysis of objective criteria. It also produced superior accuracy in truth detection (71%) and deception detection (70%) compared to the direct investigation method (truth detection: 52%; deception detection: 49%).
Conclusion: The study suggests that practitioners who use a cognitive deception detection technique will correctly classify more individuals than when they use a direct investigation method.
Keywords: the cognitive lie detection approach, deception, criminal investigation, mental health
Procedia PDF Downloads 66
3324 Predicting Wealth Status of Households Using Ensemble Machine Learning Algorithms
Authors: Habtamu Ayenew Asegie
Abstract:
Wealth, as opposed to income or consumption, implies a more stable and permanent status. Natural and human-made difficulties diminish households' economies and put their well-being at risk. Hence, governments and humanitarian agencies offer considerable resources for poverty and malnutrition reduction efforts. One key factor in the effectiveness of such efforts is the accuracy with which low-income or poor populations can be identified. This study therefore aims to predict a household’s wealth status using ensemble machine learning (ML) algorithms. Design science research methodology (DSRM) is employed, and four ML algorithms, Random Forest (RF), Adaptive Boosting (AdaBoost), Light Gradient Boosted Machine (LightGBM), and Extreme Gradient Boosting (XGBoost), have been used to train models. The Ethiopian Demographic and Health Survey (EDHS) dataset is accessed for this purpose from the Central Statistical Agency (CSA)'s database. Various data pre-processing techniques were employed, and the model training has been conducted using scikit-learn Python library functions. Model evaluation is executed using various metrics such as accuracy, precision, recall, F1-score, area under the receiver operating characteristic curve (AUC-ROC), and subjective evaluations by domain experts. An optimal subset of hyper-parameters for the algorithms was selected through the grid search function for the best prediction. The RF model performed better than the rest of the algorithms, achieving an accuracy of 96.06%, and is better suited as a solution model for our purpose. Following RF, the LightGBM, XGBoost, and AdaBoost algorithms have an accuracy of 91.53%, 88.44%, and 58.55%, respectively.
The findings suggest that some features, such as ‘Age of household head’, ‘Total children ever born’ in a family, ‘Main roof material’ of their house, ‘Region’ they lived in, whether a household uses ‘Electricity’ or not, and ‘Type of toilet facility’ of a household, are determinant factors that should be a focal point for economic policymakers. The determinant risk factors, extracted rules, and designed artifact achieved 82.28% in the domain experts’ evaluation. Overall, the study shows ML techniques are effective in predicting the wealth status of households.
Keywords: ensemble machine learning, households wealth status, predictive model, wealth status prediction
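The evaluation metrics named in this abstract (accuracy, precision, recall, F1-score) follow directly from confusion-matrix counts. The sketch below is illustrative only: it covers the binary case, whereas wealth status is probably multi-class and would need per-class averaging, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from binary confusion counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Reporting precision and recall alongside accuracy matters when classes are imbalanced, since a model can reach high accuracy while missing most of the minority class.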
Procedia PDF Downloads 38
3323 Detection of Powdery Mildew Disease in Strawberry Using Image Texture and Supervised Classifiers
Authors: Sultan Mahmud, Qamar Zaman, Travis Esau, Young Chang
Abstract:
Strawberry powdery mildew (PM) is a serious disease that has a significant impact on strawberry production. Field scouting is still a major way to find PM disease, but it is labor intensive and makes monitoring disease severity almost impossible. To reduce the loss caused by PM disease and achieve faster automatic detection, this paper proposes a detection approach based on image texture, with classification by support vector machines (SVMs) and k-nearest neighbors (kNNs). The methodology is based on image processing and is composed of five main steps: image acquisition, pre-processing, segmentation, feature extraction and classification. Two strawberry fields were used in this study. Images of healthy leaves and leaves infected with PM (Sphaerotheca macularis) disease were acquired under artificial cloud lighting conditions. Colour thresholding was utilized to segment all images before textural analysis. The colour co-occurrence matrix (CCM) was introduced for extraction of textural features. Forty textural features related to a physiological parameter of leaves were extracted from the CCM of National Television System Committee (NTSC) luminance, hue, saturation and intensity (HSI) images. The normalized feature data were utilized for training and validation, respectively, using the developed classifiers. The classifiers were tested with internal, external and cross-validations. The best classifier was selected based on performance and accuracy. Experimental results showed that the SVMs classifier achieved 98.33%, 85.33%, 87.33%, 93.33% and 95.0% accuracy on internal, external-I, external-II, 4-fold cross and 5-fold cross-validation, respectively. The kNNs, by contrast, achieved 90.0%, 72.00%, 74.66%, 89.33% and 90.3% classification accuracy, respectively.
The outcome of this study demonstrated that SVMs classified PM disease with the highest overall accuracy of 91.86% and 1.1211 seconds of processing time. Overall, the results indicate that the proposed approach can significantly support accurate and automatic identification and recognition of strawberry PM disease with the SVMs classifier.
Keywords: powdery mildew, image processing, textural analysis, color co-occurrence matrix, support vector machines, k-nearest neighbors
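The reported overall SVM accuracy of 91.86% appears to be the arithmetic mean of the five validation accuracies listed in the abstract. That reading is an inference from the numbers, not something the authors state. The short check below reproduces it; the kNN mean is computed only for comparison and is not reported in the abstract.

```python
# Accuracies (%) as listed in the abstract, in the order:
# internal, external-I, external-II, 4-fold cross, 5-fold cross-validation.
svm_accuracies = [98.33, 85.33, 87.33, 93.33, 95.0]
knn_accuracies = [90.0, 72.00, 74.66, 89.33, 90.3]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


svm_mean = mean(svm_accuracies)  # 459.32 / 5
knn_mean = mean(knn_accuracies)  # 416.29 / 5

print(f"SVM mean accuracy: {svm_mean:.2f}%")
print(f"kNN mean accuracy: {knn_mean:.2f}%")
```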
Procedia PDF Downloads 120
3322 Evaluating Factors Affecting Audiologists’ Diagnostic Performance in Auditory Brainstem Response Reading: Training and Experience
Authors: M. Zaitoun, S. Cumming, A. Purcell
Abstract:
This study aims to determine whether audiologists' experience characteristics in ABR (Auditory Brainstem Response) reading are associated with their performance in interpreting ABR results. Fifteen ABR traces with varying degrees of hearing level were presented twice, making a total of 30. Audiologists were asked to determine the hearing threshold for each of the cases after completing a brief survey regarding their experience and training in ABR administration. Sixty-one audiologists completed all tasks. Correlations between audiologists’ performance measures and experience variables suggested significant associations (p < 0.05) between training period in ABR testing and audiologists’ performance in terms of both sensitivity and accuracy. In addition, the number of years conducting ABR testing correlated with specificity. No other correlations approached significance. While there are relatively few significant correlations between ABR performance and experience, accuracy in ABR reading is associated with audiologists’ length of experience and period of training. To improve audiologists’ performance in reading ABR results, the importance of training should be emphasized, and standardized levels and periods of training in ABR testing should be set.
Keywords: ABR, audiology, performance, training, experience
Procedia PDF Downloads 166
3321 Structural Equation Modeling Semiparametric in Modeling the Accuracy of Payment Time for Customers of Credit Bank in Indonesia
Authors: Adji Achmad Rinaldo Fernandes
Abstract:
The research was conducted to apply semiparametric SEM modeling to the timeliness of credit payment. Semiparametric SEM is structural modeling that combines parametric and nonparametric approaches. The analysis method in this research is semiparametric SEM with a nonparametric approach using a truncated spline. The data were obtained through questionnaires distributed to Bank X mortgage debtors and are confidential. The study used 3 variables: one exogenous variable, one intervening endogenous variable, and one endogenous variable. The results showed that (1) the effect of capacity and willingness to pay variables on timeliness of payment is significant, (2) modeling the capacity variable on willingness to pay also produces a significant estimate, (3) the effect of the capacity variable on the timeliness of payment variable is not influenced by the willingness to pay variable as an intervening variable, (4) the R^2 value of 0.763 or 76.33% indicates that the model has good predictive relevance.
Keywords: structural equation modeling semiparametric, credit bank, accuracy of payment time, willingness to pay
Procedia PDF Downloads 44
3320 Flood-prone Urban Area Mapping Using Machine Learning, a Case Study of M'sila City (Algeria)
Authors: Medjadj Tarek, Ghribi Hayet
Abstract:
This study aims to develop a flood sensitivity assessment tool using machine learning (ML) techniques and a geographic information system (GIS). Its importance lies in integrating GIS and ML techniques for mapping flood risks, which helps decision-makers identify the most vulnerable areas and take the necessary precautions against this type of natural disaster. To reach this goal, we study the case of the city of M'sila, which is among the areas most vulnerable to floods. This study produced a map of flood-prone areas using a methodology that compares three machine learning algorithms: the XGBoost model, the Random Forest algorithm and the K Nearest Neighbour algorithm. They achieved accuracies of 97.92, 95 and 93.75, respectively. The flood-prone areas were mapped with the first model (XGBoost), which gave the greatest accuracy.
Keywords: Geographic information systems (GIS), machine learning (ML), emergency mapping, flood disaster management
Procedia PDF Downloads 95
3319 Machine Learning Driven Analysis of Kepler Objects of Interest to Identify Exoplanets
Authors: Akshat Kumar, Vidushi
Abstract:
This paper identifies 27 KOIs that have a high probability of being confirmed, 26 of which are currently classified as candidates and one as a false positive. For this purpose, 11 machine learning algorithms were implemented on the cumulative Kepler dataset sourced from the NASA Exoplanet Archive. The best-performing models were HistGradientBoosting and XGBoost, with a test accuracy of 93.5%, and the lowest-performing model was Gaussian NB, with a test accuracy of 54%. To test model performance, the F1 score, cross-validation score and ROC curve were calculated. Based on the learned models, the significant characteristics for confirming exoplanets were identified, with emphasis on the object’s transit and stellar properties; these characteristics were koi_count, koi_prad, koi_period, koi_dor, koi_ror, and koi_smass, which were later used to filter out the potential KOIs. The paper also calculates the Earth similarity index based on the planetary radius and equilibrium temperature for each KOI identified to aid in their classification.
Keywords: Kepler objects of interest, exoplanets, space exploration, machine learning, earth similarity index, transit photometry
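An Earth similarity index built from radius and temperature could be computed along the following lines. This sketch uses the commonly cited weighted-product form of the index; the weight exponents (0.57 for radius, 5.58 for temperature) and the Earth reference values (1 Earth radius, 255 K equilibrium temperature) are assumptions taken from the general literature, not from this abstract, and the paper may use different ones.

```python
def earth_similarity_index(radius, temperature,
                           ref_radius=1.0, ref_temperature=255.0,
                           w_radius=0.57, w_temperature=5.58):
    """Weighted-product similarity to Earth in the range (0, 1].

    radius is in Earth radii and temperature in kelvin. The reference
    values and weights are assumed defaults, not values from the abstract.
    """
    terms = [
        (radius, ref_radius, w_radius),
        (temperature, ref_temperature, w_temperature),
    ]
    n = len(terms)
    esi = 1.0
    for value, reference, weight in terms:
        # Each factor is 1 when the value equals the Earth reference and
        # shrinks toward 0 as the relative difference grows.
        esi *= (1.0 - abs(value - reference) / (value + reference)) ** (weight / n)
    return esi
```

A planet matching both reference values scores exactly 1. The larger temperature weight means a temperature mismatch lowers the score faster than a comparable radius mismatch.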
Procedia PDF Downloads 75
3318 Multiphase Equilibrium Characterization Model For Hydrate-Containing Systems Based On Trust-Region Method Non-Iterative Solving Approach
Authors: Zhuoran Li, Guan Qin
Abstract:
A robust and efficient compositional equilibrium characterization model for hydrate-containing systems is required, especially for time-critical simulations such as subsea pipeline flow assurance analysis and compositional simulation in hydrate reservoirs. A multiphase flash calculation framework, which combines a Gibbs energy minimization function and the cubic plus association (CPA) EoS, is developed to describe the highly non-ideal phase behavior of hydrate-containing systems. A non-iterative eigenvalue problem-solving approach for the trust-region sub-problem is selected to guarantee efficiency. The developed flash model is based on the state-of-the-art objective function proposed by Michelsen to minimize the Gibbs energy of the multiphase system. A hydrate-containing system always contains polar components (such as water and hydrate inhibitors), which introduce hydrogen bonds that influence phase behavior. Thus, the CPA EoS is utilized to compute the thermodynamic parameters. The solid solution theory proposed by van der Waals and Platteeuw is applied to represent hydrate phase parameters. The trust-region method, combined with the non-iterative eigenvalue approach to the trust-region sub-problem, is utilized to ensure fast convergence. The accuracy of the developed multiphase flash model is validated against three available models (one published and two commercial models). Hundreds of published equilibrium experimental data points for hydrate-containing systems are collected to act as the standard group for the accuracy test. The accuracy comparison shows that our model outperforms two of the models and has calculation accuracy comparable to CSMGem. An efficiency test has also been carried out. Because the trust-region method can determine the optimization step's direction and size simultaneously, fast solution progress can be obtained. The comparison results show that fewer iterations are needed to optimize the objective function with trust-region methods than with line search methods. The non-iterative eigenvalue approach is also faster than the conventional iterative solving algorithm for the trust-region sub-problem, further improving the calculation efficiency. A new thermodynamic framework of the multiphase flash model for hydrate-containing systems has been constructed in this work. Sensitivity analysis and numerical experiments have been carried out to demonstrate the accuracy and efficiency of this model. Furthermore, implementing this model on top of the thermodynamic models currently used in the oil and gas industry is simple.
Keywords: equation of state, hydrates, multiphase equilibrium, trust-region method
Procedia PDF Downloads 172
3317 Machine Learning Techniques in Bank Credit Analysis
Authors: Fernanda M. Assef, Maria Teresinha A. Steiner
Abstract:
The aim of this paper is to compare and discuss better classifier algorithm options for credit risk assessment by applying different Machine Learning techniques. Using records from a Brazilian financial institution, this study uses a database of 5,432 companies that are clients of the bank, where 2,600 clients are classified as non-defaulters, 1,551 are classified as defaulters and 1,281 are temporarily defaulters, meaning that the clients are overdue on their payments for up to 180 days. For each case, a total of 15 attributes was considered for a one-against-all assessment using four different techniques: Artificial Neural Networks Multilayer Perceptron (ANN-MLP), Artificial Neural Networks Radial Basis Functions (ANN-RBF), Logistic Regression (LR) and Support Vector Machines (SVM). For each method, different parameters were analyzed, and the best result of each technique was then compared. Initially the data were coded in thermometer code (for numerical attributes) or dummy coding (for nominal attributes). The methods were then evaluated for each parameter, and the best result of each technique was compared in terms of accuracy, false positives, false negatives, true positives and true negatives. This comparison showed that the best method, in terms of accuracy, was ANN-RBF (79.20% for non-defaulter classification, 97.74% for defaulters and 75.37% for the temporarily defaulter classification). However, the best accuracy does not always represent the best technique. For instance, on the classification of temporarily defaulters, this technique, in terms of false positives, was surpassed by SVM, which had the lowest rate (0.07%) of false positive classifications.
All these details are discussed in light of the results found, and an overview of what was presented is given in the conclusion of this study.
Keywords: artificial neural networks (ANNs), classifier algorithms, credit risk assessment, logistic regression, machine learning, support vector machines
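The two encodings mentioned in this abstract can be illustrated with a small sketch. The thresholds and category names below are hypothetical, since the abstract does not describe how the 15 attributes were binned. The dummy coding shown uses one indicator per category; some definitions drop a reference category instead.

```python
def thermometer_code(value, thresholds):
    """Encode a numeric value as one bit per threshold it reaches.

    Larger values switch on more leading bits, which preserves ordering:
    with thresholds [10, 20, 30, 40], the value 35 becomes [1, 1, 1, 0].
    """
    return [1 if value >= t else 0 for t in thresholds]


def dummy_code(category, categories):
    """Encode a nominal value as one indicator bit per known category."""
    return [1 if category == c else 0 for c in categories]
```

Thermometer coding suits numerical attributes because neighbouring bins share most of their bits, while dummy coding treats nominal categories as unordered.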
Procedia PDF Downloads 103
3316 Solutions for Large Diameter Piles Stiffness Used in Offshore Wind Turbine Farms
Authors: M. H. Aissa, Amar Bouzid Dj
Abstract:
Many countries are now planning to build new wind farms with high capacity, up to 5 MW. Consequently, the size of the foundations increases. These structures are subject to fatigue damage from environmental loading, mainly due to wind and waves, as well as from cyclic loading imposed at the rotational frequency (1P) through mass and aerodynamic imbalances and at the blade passing frequency (3P) of the wind turbine, which makes their dynamic behavior very sensitive. That is why the natural frequency must be determined accurately from the existing soil data and the foundation stiffness, both of which are sources of uncertainty, to avoid resonance of the system. This paper presents analytical expressions for the stiffness of large-diameter foundations under linear soil behavior for different soil stiffness profiles. To check the accuracy of the proposed formulas, a mathematical model based on non-dimensional parameters is used to calculate the natural frequency, taking into account soil-structure interaction (SSI), and the result is compared with the p-y method and with frequencies measured at North Sea wind farms.
Keywords: offshore wind turbines, semi analytical FE analysis, p-y curves, piles foundations
Procedia PDF Downloads 466
3315 Analyzing Current Transformer’s Transient and Steady State Behavior for Different Burden’s Using LabVIEW Data Acquisition Tool
Abstract:
Current transformers (CTs) are used to transform large primary currents to a small secondary current. Since most standard equipment is not designed to handle large primary currents, CTs play an important part in any electrical system for the purposes of metering and protection, both of which are integral to power systems. Nowadays, owing to advances in solid-state technology, the operating times of protective relays have come down from a few seconds to a few cycles. In such a scenario, it becomes important to study the transient response of current transformers, as it plays a vital role in the operation of protective devices. This paper shows the steady-state and transient behavior of current transformers and how it changes with the connected burden. The transient and steady-state responses are captured using the data acquisition software LabVIEW, and analysis is done on the real-time data gathered. Variation of current transformer characteristics with changes in burden is discussed.
Keywords: accuracy, accuracy limiting factor, burden, current transformer, instrument security factor
Procedia PDF Downloads 343
3314 Algorithm Research on Traffic Sign Detection Based on Improved EfficientDet
Authors: Ma Lei-Lei, Zhou You
Abstract:
To address the low detection accuracy of deep learning algorithms in traffic sign detection, this paper proposes a traffic sign detection algorithm based on an improved EfficientDet. Multi-head self-attention is introduced in the minimum-resolution layer of the EfficientDet backbone to achieve effective aggregation of local and global depth information. This study also proposes an improved feature fusion pyramid with additional vertical cross-layer connections, which improves the performance of the model while adding only a small amount of complexity. Balanced L1 Loss is introduced to replace the original regression loss function, Smooth L1 Loss, which solves the problem of balance in the loss function. Experimental results show that the proposed algorithm is suitable for the task of traffic sign detection. Compared with other models, the improved EfficientDet has the best detection accuracy. Although it does not lead on test speed, it still meets the real-time requirement.
Keywords: convolutional neural network, transformer, feature pyramid networks, loss function
Procedia PDF Downloads 97
3313 Accurate Mass Segmentation Using U-Net Deep Learning Architecture for Improved Cancer Detection
Authors: Ali Hamza
Abstract:
Accurate segmentation of breast ultrasound images is of paramount importance in enhancing the diagnostic capabilities of breast cancer detection. This study presents an approach utilizing the U-Net architecture for segmenting breast ultrasound images aimed at improving the accuracy and reliability of mass identification within the breast tissue. The proposed method encompasses a multi-stage process. Initially, preprocessing techniques are employed to refine image quality and diminish noise interference. Subsequently, the U-Net architecture, a deep learning convolutional neural network (CNN), is employed for pixel-wise segmentation of regions of interest corresponding to potential breast masses. The U-Net's distinctive architecture, characterized by a contracting and expansive pathway, enables accurate boundary delineation and detailed feature extraction. To evaluate the effectiveness of the proposed approach, an extensive dataset of breast ultrasound images is employed, encompassing diverse cases. Quantitative performance metrics such as the Dice coefficient, Jaccard index, sensitivity, specificity, and Hausdorff distance are employed to comprehensively assess the segmentation accuracy. Comparative analyses against traditional segmentation methods showcase the superiority of the U-Net architecture in capturing intricate details and accurately segmenting breast masses. The outcomes of this study emphasize the potential of the U-Net-based segmentation approach in bolstering breast ultrasound image analysis. The method's ability to reliably pinpoint mass boundaries holds promise for aiding radiologists in precise diagnosis and treatment planning. However, further validation and integration within clinical workflows are necessary to ascertain their practical clinical utility and facilitate seamless adoption by healthcare professionals. 
In conclusion, leveraging the U-Net architecture for breast ultrasound image segmentation showcases a robust framework that can significantly enhance diagnostic accuracy and advance the field of breast cancer detection. This approach represents a pivotal step towards empowering medical professionals with a more potent tool for early and accurate breast cancer diagnosis.
Keywords: image segmentation, U-Net, deep learning, breast cancer detection, diagnostic accuracy, mass identification, convolutional neural network
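Two of the overlap metrics named in this abstract, the Dice coefficient and the Jaccard index, can be computed from a predicted and a ground-truth binary mask as sketched below. The function name is hypothetical, masks are assumed to be flattened to equal-length sequences of 0/1 values, and returning 1.0 when both masks are empty is a convention chosen here, not one taken from the abstract.

```python
def dice_and_jaccard(pred_mask, true_mask):
    """Return (dice, jaccard) for two flattened binary masks of equal length."""
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of pixels")
    intersection = sum(1 for p, t in zip(pred_mask, true_mask) if p and t)
    pred_area = sum(1 for p in pred_mask if p)
    true_area = sum(1 for t in true_mask if t)
    union = pred_area + true_area - intersection
    if union == 0:
        # Both masks are empty: treat as perfect agreement (assumed convention).
        return 1.0, 1.0
    dice = 2 * intersection / (pred_area + true_area)
    jaccard = intersection / union
    return dice, jaccard
```

The two scores are monotonically related (Dice = 2J / (1 + J)), so they rank segmentations identically; Dice is simply more forgiving of partial overlap.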
Procedia PDF Downloads 84
3312 Building Scalable and Accurate Hybrid Kernel Mapping Recommender
Authors: Hina Iqbal, Mustansar Ali Ghazanfar, Sandor Szedmak
Abstract:
Recommender systems use artificial intelligence practices for filtering obscure information and can predict whether a user will like a specified item. Kernel Mapping Recommender (KMR) systems have been proposed as accurate, state-of-the-art algorithms that address recommender system design challenges such as the long tail, cold start, and sparsity. The aim of this research is to propose a hybrid framework that can efficiently integrate different versions of the KMR algorithm, namely item-based and user-based KMR. We have proposed various heuristic algorithms that integrate the different versions of KMR into a unified framework, resulting in improved accuracy and the elimination of problems associated with conventional recommender systems. We tested our system on a publicly available movies dataset and benchmarked it against KMR. The results (in terms of accuracy, precision, recall, F1 measure and ROC metrics) reveal that the proposed algorithm is quite accurate, especially under cold-start and sparse scenarios.
Keywords: Kernel Mapping Recommender Systems, hybrid recommender systems, cold start, sparsity, long tail
Procedia PDF Downloads 3383311 Capability of Available Seismic Soil Liquefaction Potential Assessment Models Based on Shear-Wave Velocity Using Banchu Case History
Authors: Nima Pirhadi, Yong Bo Shao, Xusheng Wa, Jianguo Lu
Abstract:
Several models based on the simplified method introduced by Seed and Idriss (1971) have been developed to assess the liquefaction potential of saturated sandy soils. The procedure involves determining the cyclic resistance of the soil, expressed as the cyclic resistance ratio (CRR), and comparing it with the earthquake loading, expressed as the cyclic stress ratio (CSR). Of all the methods for determining CRR, those using shear-wave velocity (Vs) are common because of their low sensitivity to the penetration resistance reduction caused by fines content (FC). To evaluate the capability of the Vs-based models, new data from the Bachu-Jianshi earthquake case history were collected, and the prediction results of the models were compared with the measured results; the accuracy of the models is then discussed via three criteria and graphs. The evaluation demonstrates reasonable accuracy of the models in the Banchu region.
Keywords: seismic liquefaction, banchu-jiashi earthquake, shear-wave velocity, liquefaction potential evaluation
Procedia PDF Downloads 237
3310 Using Photogrammetric Techniques to Map the Mars Surface
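The CRR-versus-CSR comparison described in the abstract above can be sketched as follows. The CSR expression is the commonly quoted form of the Seed and Idriss simplified method, CSR = 0.65 (a_max/g)(σ_v/σ'_v) r_d; the function names, the example input values, and the choice to pass the stress reduction coefficient r_d and the CRR in as plain numbers are assumptions for illustration. The abstract does not say which Vs-based CRR correlations were evaluated, so none is implemented here.

```python
def cyclic_stress_ratio(a_max_g, sigma_v, sigma_v_eff, r_d):
    """Simplified-method CSR.

    a_max_g      peak horizontal ground acceleration as a fraction of g
    sigma_v      total vertical stress at the depth of interest
    sigma_v_eff  effective vertical stress at the same depth (same units)
    r_d          depth-dependent stress reduction coefficient (supplied by caller)
    """
    if sigma_v_eff <= 0:
        raise ValueError("effective vertical stress must be positive")
    return 0.65 * a_max_g * (sigma_v / sigma_v_eff) * r_d


def factor_of_safety(crr, csr):
    """Factor of safety against liquefaction; values below 1 indicate predicted liquefaction."""
    if csr <= 0:
        raise ValueError("CSR must be positive")
    return crr / csr


if __name__ == "__main__":
    # Hypothetical site values, not data from the case history.
    csr = cyclic_stress_ratio(a_max_g=0.2, sigma_v=100.0, sigma_v_eff=60.0, r_d=0.95)
    fs = factor_of_safety(crr=0.15, csr=csr)
    print(f"CSR = {csr:.3f}, FS = {fs:.3f}, liquefaction predicted: {fs < 1.0}")
```

A model's prediction at each site (liquefied or not) can then be compared with the field observation, which is the kind of comparison the abstract describes.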
Authors: Ahmed Elaksher, Islam Omar
Abstract:
For many years, the surface of Mars has been a mystery to scientists. Lately, with the help of geospatial data and photogrammetric procedures, researchers have been able to gain some insights into this planet. Two of the most important data sources for exploring Mars are the High Resolution Imaging Science Experiment (HiRISE) and the Mars Orbiter Laser Altimeter (MOLA). HiRISE is one of six science instruments carried by the Mars Reconnaissance Orbiter, launched August 12, 2005, and managed by NASA. The MOLA sensor is a laser altimeter carried by the Mars Global Surveyor (MGS), launched on November 7, 1996. In this project, we used MOLA-based DEMs to orthorectify HiRISE optical images in order to generate a more accurate and trustworthy surface of Mars. The MOLA data were interpolated using the kriging interpolation technique. Corresponding tie points were digitized from both datasets and employed in co-registering the datasets using GIS analysis tools. We employed three different 3D-to-2D transformation models: the parallel projection (3D affine) transformation model, the extended parallel projection transformation model, and the Direct Linear Transformation (DLT) model. The tie points were split into two sets: Ground Control Points (GCPs), used to estimate the transformation parameters using least squares adjustment techniques, and check points (ChkPs), used to evaluate the computed transformation parameters. Results were evaluated using the RMSEs between the precise horizontal coordinates of the digitized check points and those estimated through the transformation models using the computed transformation parameters. For each set of GCPs, three different configurations of GCPs and check points were tested, and average RMSEs are reported. It was found that for the 2D transformation models, average RMSEs were in the range of five meters.
Increasing the number of GCPs from six to ten points improved the accuracy of the results by about two and a half meters. Further increasing the number of GCPs did not improve the results significantly. Using the 3D-to-2D transformation parameters provided an accuracy of two to three meters. The best results were reported using the DLT transformation model. However, increasing the number of GCPs did not have a substantial effect. The results support the use of the DLT model, as it provides the accuracy required by the ASPRS large-scale mapping standards. However, a well-distributed set of GCPs is key to achieving such accuracy. The model is simple to apply and does not need substantial computation.
Keywords: mars, photogrammetry, MOLA, HiRISE
Procedia PDF Downloads 57
3309 COVID-19 Analysis with Deep Learning Model Using Chest X-Rays Images
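The check-point evaluation described in the abstract above amounts to an RMSE over coordinate differences. The sketch below computes a combined (radial) horizontal RMSE from reference and estimated check-point coordinates; the function name, the (x, y) tuple layout, and the choice of a combined rather than per-axis RMSE are assumptions for illustration, since the abstract does not specify them.

```python
import math


def horizontal_rmse(reference, estimated):
    """Radial horizontal RMSE between two lists of (x, y) check-point coordinates."""
    if len(reference) != len(estimated) or not reference:
        raise ValueError("need two non-empty coordinate lists of equal length")

    # Squared horizontal distance between each reference point and its estimate.
    squared_errors = [
        (xr - xe) ** 2 + (yr - ye) ** 2
        for (xr, yr), (xe, ye) in zip(reference, estimated)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    # Hypothetical coordinates in meters, not values from the study.
    digitized = [(0.0, 0.0), (10.0, 10.0)]
    transformed = [(3.0, 4.0), (10.0, 10.0)]
    print(horizontal_rmse(digitized, transformed))
```

Averaging this value over several GCP and check-point configurations gives the kind of average RMSE the abstract reports.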
Authors: Uma Maheshwari V., Rajanikanth Aluvalu, Kumar Gautam
Abstract:
COVID-19 is a highly contagious viral infection with major worldwide health implications, and the global economy has suffered as a result. The spread of this pandemic disease can be slowed if positive patients are found early, so COVID-19 prediction is beneficial for identifying patients who are at risk. Deep learning and machine learning algorithms for COVID prediction using X-rays have the potential to be extremely useful in addressing the scarcity of doctors and clinicians in remote places. In this paper, a convolutional neural network (CNN) with deep layers is presented for recognizing COVID-19 patients using real-world datasets. We gathered around 6000 X-ray scan images from various sources and split them into two categories: normal and COVID-impacted. Our model examines chest X-ray images to recognize such patients. X-rays are commonly available and affordable, and our findings show that X-ray analysis is effective in COVID diagnosis. The predictions performed well, with an average accuracy of 99% on training images and 88% on X-ray test images.
Keywords: deep CNN, COVID-19 analysis, feature extraction, feature map, accuracy
Procedia PDF Downloads 79
3308 Classification of Political Affiliations by Reduced Number of Features
Authors: Vesile Evrim, Aliyu Awwal
Abstract:
With the evolution of technology, the expression of opinions has shifted to the digital world. Politics, one of the hottest topics in opinion mining research, has merged with behavior analysis for determining affiliation in text, which constitutes the subject of this paper. This study aims to classify text in news and blogs as either Republican or Democrat with the minimum number of features. As an initial set, 68 features, 64 of which are Linguistic Inquiry and Word Count (LIWC) features, are tested against 14 benchmark classification algorithms. In later experiments, the dimensionality of the feature vector is reduced using 7 feature selection algorithms. The results show that the Decision Tree, Rule Induction, and M5 Rule classifiers, when used with the SVM and IGR feature selection algorithms, performed best, with up to 82.5% accuracy on a given dataset. Further tests on a single feature and on the linguistic-based feature sets showed similar results. The feature “function”, an aggregate feature of the linguistic category, was found to be the most differentiating of the 68 features, achieving 81% accuracy by itself in classifying articles as either Republican or Democrat.
Keywords: feature selection, LIWC, machine learning, politics
Procedia PDF Downloads 382
3307 Specific Emitter Identification Based on Refined Composite Multiscale Dispersion Entropy
Authors: Shaoying Guo, Yanyun Xu, Meng Zhang, Weiqing Huang
Abstract:
Wireless communication networks are developing rapidly, so wireless security is becoming more and more important. Specific emitter identification (SEI) is a vital part of wireless communication security as a technique for identifying unique transmitters. In this paper, an SEI method based on multiscale dispersion entropy (MDE) and refined composite multiscale dispersion entropy (RCMDE) is proposed. The MDE and RCMDE algorithms are used to extract features for the identification of five wireless devices, and a cross-validation support vector machine (CV-SVM) is used as the classifier. The experimental results show that the total identification accuracy is 99.3%, even at a low signal-to-noise ratio (SNR) of 5 dB, which shows that MDE and RCMDE can describe the communication signal series well. In addition, compared with other methods, the proposed method is effective and provides better accuracy and stability for SEI.
Keywords: cross-validation support vector machine, refined composite multiscale dispersion entropy, specific emitter identification, transient signal, wireless communication device
Procedia PDF Downloads 129
3306 A Novel Heuristic for Analysis of Large Datasets by Selecting Wrapper-Based Features
Authors: Bushra Zafar, Usman Qamar
Abstract:
Large sample sizes and high dimensionality reduce the effectiveness of conventional data mining methodologies. Data mining techniques are important tools for collecting useful information from a variety of databases; they provide supervised learning in the form of classification to design models that describe vital data classes, where the structure of the classifier is based on the class attribute. Classification efficiency and accuracy are often influenced to a great extent by noisy and undesirable features in real application data sets. The inherent nature of a data set greatly masks its quality analysis and leaves us with quite few practical approaches to use. To our knowledge, for the first time, we present a new approach for investigating the structure and quality of datasets by providing a targeted analysis that localizes their noisy and irrelevant features. Machine learning relies heavily on feature selection as a pre-processing step, which allows us to select a small subset of the available features by reducing the feature space according to a certain evaluation criterion. The primary objective of this study is to trim down the scope of the given data sample by searching for a small set of important features that may result in good classification performance. For this purpose, a heuristic for wrapper-based feature selection using a genetic algorithm is used, together with an external classifier for discriminative feature selection. A feature is selected based on its number of occurrences in the chosen chromosomes. A sample dataset has been used to demonstrate the proposed idea effectively. With the proposed method, the average accuracy across different datasets is about 95%. Experimental results illustrate that the proposed algorithm increases the accuracy of prediction of different diseases.
Keywords: data mining, genetic algorithm, KNN algorithms, wrapper based feature selection
Procedia PDF Downloads 316
3305 Investigating Data Normalization Techniques in Swarm Intelligence Forecasting for Energy Commodity Spot Price
Authors: Yuhanis Yusof, Zuriani Mustaffa, Siti Sakira Kamaruddin
Abstract:
Data mining is a fundamental technique for identifying patterns in large data sets. The extracted facts and patterns contribute to various domains such as marketing, forecasting, and medicine. Prior to mining, data are consolidated so that the resulting mining process may be more efficient. This study investigates the effect of different data normalization techniques, namely Min-Max, Z-score, and decimal scaling, on swarm-based forecasting models. The recent swarm intelligence algorithms employed include the Grey Wolf Optimizer (GWO) and Artificial Bee Colony (ABC). Forecasting models are then developed to predict the daily spot price of crude oil and gasoline. Results showed that GWO works better with the Z-score normalization technique, while ABC produces better accuracy with Min-Max. Nevertheless, GWO is superior to ABC, as its model generates the highest accuracy for both crude oil and gasoline prices. Such a result indicates that GWO is a promising competitor in the family of swarm intelligence algorithms.
Keywords: artificial bee colony, data normalization, forecasting, Grey Wolf optimizer
Procedia PDF Downloads 475
3304 A Trend Based Forecasting Framework of the ATA Method and Its Performance on the M3-Competition Data
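The three normalization techniques compared in the abstract above have standard textbook definitions, sketched below for a one-dimensional price series. The function names, the use of the population standard deviation for the Z-score, and the error raised for a constant series are choices made for this example and are not taken from the paper.

```python
import math


def min_max(values):
    """Rescale values linearly to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for a constant series")
    return [(v - lo) / (hi - lo) for v in values]


def z_score(values):
    """Centre on the mean and divide by the population standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0:
        raise ValueError("z-score is undefined for a constant series")
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by the smallest power of ten that brings every |value| below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values]


if __name__ == "__main__":
    # Hypothetical daily spot prices, not data from the study.
    prices = [20.0, 50.0, 80.0]
    print(min_max(prices))
    print(z_score(prices))
    print(decimal_scaling(prices))
```

Min-max and decimal scaling bound the output range, whereas the Z-score does not, which is one reason different optimizers may respond differently to each technique.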
Authors: H. Taylan Selamlar, I. Yavuz, G. Yapar
Abstract:
It is difficult to make predictions, especially about the future, and making accurate predictions is not always easy. However, better predictions remain the foundation of all science; therefore, the development of accurate, robust, and reliable forecasting methods is very important. Numerous forecasting methods have been proposed and studied in the literature. There are still two dominant forecasting methods, Box-Jenkins ARIMA and Exponential Smoothing (ES), and new methods are still derived from or inspired by them. After more than 50 years of widespread use, exponential smoothing is still one of the most practically relevant forecasting methods available, owing to its simplicity, robustness, and accuracy as an automatic forecasting procedure, especially in the famous M-Competitions. Despite their success and widespread use in many areas, ES models have some shortcomings that negatively affect the accuracy of forecasts. Therefore, a new forecasting method, called the ATA method, is proposed in this study to cope with these shortcomings. This new method is obtained from traditional ES models by modifying the smoothing parameters; therefore, both methods have similar structural forms, and ATA can be easily adapted to all of the individual ES models. However, ATA has many advantages due to its innovative new weighting scheme. In this paper, the focus is on modeling the trend component and handling seasonality patterns by utilizing classical decomposition. The ATA method is therefore extended to higher-order ES methods for additive, multiplicative, additive damped, and multiplicative damped trend components. The proposed models are called ATA trended models, and their predictive performance is compared to that of their counterpart ES models on the M3 competition data set, since it is still the most recent and comprehensive time-series data collection available.
It is shown that the models outperform their counterparts in almost all settings, and that when model selection is carried out amongst these trended models, ATA outperforms all of the competitors in the M3 competition for both short-term and long-term forecasting horizons when the models’ forecasting accuracies are compared based on popular error metrics.
Keywords: accuracy, exponential smoothing, forecasting, initial value
Procedia PDF Downloads 177
3303 A Supervised Approach for Word Sense Disambiguation Based on Arabic Diacritics
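For orientation, the baseline that the abstract above builds on, simple exponential smoothing with a fixed smoothing parameter, can be sketched as follows. This is the classical ES recursion only; it is not the ATA method, whose modified weighting scheme is not specified in the abstract. The function name and the choice to initialise the level with the first observation are assumptions for this example.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation.

    Recursion: level_t = alpha * x_t + (1 - alpha) * level_{t-1},
    with the first level set to the first observation (an assumed initial value).
    """
    if not series:
        raise ValueError("series must not be empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")

    levels = [float(series[0])]
    for x in series[1:]:
        levels.append(alpha * x + (1.0 - alpha) * levels[-1])
    return levels


if __name__ == "__main__":
    # The last level serves as the one-step-ahead forecast.
    print(simple_exponential_smoothing([10.0, 12.0, 14.0], alpha=0.5))
```

The dependence of the result on the starting level is why "initial value" appears among the keywords of ES-related work.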
Authors: Alaa Alrakaf, Sk. Md. Mizanur Rahman
Abstract:
Over the last two decades, Arabic natural language processing (ANLP) has become increasingly important. One of the key issues in ANLP is ambiguity. In Arabic, different pronunciations of one word may have different meanings. Furthermore, ambiguity also has an impact on the effectiveness and efficiency of Machine Translation (MT): it has limited the usefulness and accuracy of translation from Arabic to English. The lack of Arabic resources makes the ambiguity problem more complicated. Additionally, the orthographic level of representation cannot specify the exact meaning of a word. This paper examines the diacritics of the Arabic language and uses them to disambiguate a word. The proposed approach to word sense disambiguation uses a Diacritizer application to diacritize Arabic text and then finds the most accurate sense of an ambiguous word using a Naïve Bayes classifier. Our experimental study shows that using Arabic diacritics with a Naïve Bayes classifier enhances the accuracy of choosing the appropriate sense by 23% and also decreases the ambiguity in machine translation.
Keywords: Arabic natural language processing, machine learning, machine translation, Naive bayes classifier, word sense disambiguation
Procedia PDF Downloads 358
3302 Resistivity Tomography Optimization Based on Parallel Electrode Linear Back Projection Algorithm
Authors: Yiwei Huang, Chunyu Zhao, Jingjing Ding
Abstract:
Electrical Resistivity Tomography has been widely used in medicine and geology, for example in imaging lung impedance and analyzing soil impedance. Linear Back Projection is the core algorithm of Electrical Resistivity Tomography, but traditional Linear Back Projection cannot make full use of the information in the electric field. In this paper, a Parallel Electrode Linear Back Projection imaging method for Electrical Resistivity Tomography is proposed. By changing the connection mode of the electrodes, it generates an electric field distribution that is not linearly related to that of traditional Linear Back Projection, captures new information, and improves the imaging accuracy without increasing the number of electrodes. The simulation results show that the accuracy of the image obtained by the inverse operation of Parallel Electrode Linear Back Projection can be improved by about 20%.
Keywords: electrical resistivity tomography, finite element simulation, image optimization, parallel electrode linear back projection
Procedia PDF Downloads 153
3301 SC-LSH: An Efficient Indexing Method for Approximate Similarity Search in High Dimensional Space
Authors: Sanaa Chafik, Imane Daoudi, Mounim A. El Yacoubi, Hamid El Ouardi
Abstract:
Locality Sensitive Hashing (LSH) is one of the most promising techniques for solving the nearest neighbour search problem in high-dimensional space. Euclidean LSH is the most popular variation of LSH and has been successfully applied in many multimedia applications. However, Euclidean LSH has limitations that affect its structure and query performance. Its main limitation is large memory consumption: a large number of hash tables is required to achieve good accuracy. In this paper, we propose a new hashing algorithm that overcomes the storage space problem and improves query time while keeping accuracy similar to that achieved by the original Euclidean LSH. Experimental results on a real large-scale dataset show that the proposed approach achieves good performance and consumes less memory than Euclidean LSH.
Keywords: approximate nearest neighbor search, content based image retrieval (CBIR), curse of dimensionality, locality sensitive hashing, multidimensional indexing, scalability
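To make the baseline in the abstract above concrete, the sketch below builds one hash function of the usual Euclidean (p-stable) LSH form, h(v) = floor((a · v + b) / w), with Gaussian projection vectors a and random offsets b, and concatenates k of them into a bucket key. It illustrates the original Euclidean LSH only, under the assumption that the abstract refers to this standard construction; it is not the proposed SC-LSH, whose details the abstract does not give. The function name, parameter names, and example values are hypothetical.

```python
import math
import random


def make_euclidean_lsh(dim, k, w, seed=0):
    """Return a function mapping a dim-dimensional vector to a k-part bucket key."""
    rng = random.Random(seed)
    # One Gaussian projection vector and one offset in [0, w] per hash component.
    projections = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]
    offsets = [rng.uniform(0.0, w) for _ in range(k)]

    def hash_vector(v):
        if len(v) != dim:
            raise ValueError("vector has the wrong dimension")
        return tuple(
            math.floor((sum(a_i * v_i for a_i, v_i in zip(a, v)) + b) / w)
            for a, b in zip(projections, offsets)
        )

    return hash_vector


if __name__ == "__main__":
    h = make_euclidean_lsh(dim=4, k=3, w=2.0, seed=1)
    print(h([0.1, 0.2, 0.3, 0.4]))
    print(h([0.1, 0.2, 0.3, 0.41]))  # a nearby vector often, but not always, shares the key
```

Each hash table stores its own set of projections and buckets, so using many tables to raise recall is what drives the memory consumption the abstract mentions.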
Procedia PDF Downloads 321