Search results for: algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 1934

554 Three Tier Indoor Localization System for Digital Forensics

Authors: Dennis L. Owuor, Okuthe P. Kogeda, Johnson I. Agbinya

Abstract:

Mobile localization has attracted a great deal of attention recently due to the introduction of wireless networks. Although several localization algorithms and systems have been implemented and discussed in the literature, very few researchers have exploited the gap that exists between indoor localization, tracking, external storage of location information and outdoor localization for the purpose of digital forensics during and after a disaster. The contribution of this paper lies in the implementation of a robust system that is capable of locating and tracking mobile device users and storing their location information, for both indoor and partially outdoor settings, in the cloud. The system can be used during a disaster to track and locate mobile phone users. The developed system is a mobile application for Android users, built with Android, Hypertext Preprocessor (PHP), Cascading Style Sheets (CSS), JavaScript and MATLAB. Using the Waterfall model of software development, we have implemented a three-level system that is able to track, locate and store mobile device information in a secure database (cloud) in near real time. The outcome of the study showed that the developed system is efficient with regard to tracking and locating mobile devices. The system is also flexible, i.e. it can be used in any building with few adjustments. Finally, the system is accurate both indoors and outdoors in terms of locating and tracking mobile devices.

Keywords: indoor localization, digital forensics, fingerprinting, tracking, cloud

Procedia PDF Downloads 302
553 Machine Learning for Targeting of Conditional Cash Transfers: Improving the Effectiveness of Proxy Means Tests to Identify Future School Dropouts and the Poor

Authors: Cristian Crespo

Abstract:

Conditional cash transfers (CCTs) have been targeted towards the poor. Thus, their targeting assessments check whether these schemes have been allocated to low-income households or individuals. However, CCTs have more than one goal and target group. An additional goal of CCTs is to increase school enrolment. Hence, students at risk of dropping out of school also are a target group. This paper analyses whether one of the most common targeting mechanisms of CCTs, a proxy means test (PMT), is suitable to identify the poor and future school dropouts. The PMT is compared with alternative approaches that use the outputs of a predictive model of school dropout. This model was built using machine learning algorithms and rich administrative datasets from Chile. The paper shows that using machine learning outputs in conjunction with the PMT increases targeting effectiveness by identifying more students who are either poor or future dropouts. This joint targeting approach increases effectiveness in different scenarios except when the social valuation of the two target groups largely differs. In these cases, the most likely optimal approach is to solely adopt the targeting mechanism designed to find the highly valued group.

Keywords: conditional cash transfers, machine learning, poverty, proxy means tests, school dropout prediction, targeting

Procedia PDF Downloads 173
552 Pushing the Boundary of Parallel Tractability for Ontology Materialization via Boolean Circuits

Authors: Zhangquan Zhou, Guilin Qi

Abstract:

Materialization is an important reasoning service for applications built on the Web Ontology Language (OWL). To make materialization efficient in practice, current research focuses on deciding the tractability of an ontology language and designing parallel reasoning algorithms. However, some well-known large-scale ontologies, such as YAGO, have been shown to perform well under parallel reasoning even though they are expressed in ontology languages that are not parallelly tractable, i.e., for which reasoning is inherently sequential in the worst case. This motivates us to study the problem of parallel tractability of ontology materialization from a theoretical perspective. That is, we aim to identify the ontologies for which materialization is parallelly tractable, i.e., in the complexity class NC. Since NC is defined in terms of Boolean circuits, which are widely used to investigate parallel computing problems, we first transform the problem of materialization into the evaluation of Boolean circuits, and then study the problem of parallel tractability based on circuits. In this work, we focus on datalog rewritable ontology languages. We use Boolean circuits to identify two classes of datalog rewritable ontologies (called parallelly tractable classes) such that materialization over them is parallelly tractable. We further investigate the parallel tractability of materialization of a datalog rewritable OWL fragment, DHL (Description Horn Logic). Based on the above results, we analyze real-world datasets and show that many ontologies expressed in DHL belong to the parallelly tractable classes.

Keywords: ontology materialization, parallel reasoning, datalog, Boolean circuit

Procedia PDF Downloads 239
551 Teaching Tools for Web Processing Services

Authors: Rashid Javed, Hardy Lehmkuehler, Franz Josef-Behr

Abstract:

Web Processing Services (WPS) are of growing concern in geoinformation research. However, teaching about them is difficult because of the generally complex circumstances of their use, which limit the possibilities for hands-on exercises on Web Processing Services. To support understanding, a Training Tools Collection was initiated at the University of Applied Sciences Stuttgart (HFT). It is limited to the scope of geostatistical interpolation of sample point data, where different algorithms can be used, such as IDW and Nearest Neighbor. The Tools Collection aims to support understanding of the scope, definition and deployment of Web Processing Services. For example, it is necessary to characterize the input of interpolation by the data set, the parameters for the algorithm and the interpolation results (here a grid of interpolated values is assumed). This paper reports on first experiences using a pilot installation, which was intended to find suitable software interfaces for later full implementations and to draw conclusions on potential user interface characteristics. Experience was gained with Deegree software, one of several service suites (collections). Being strictly programmed in Java, Deegree offers several OGC-compliant service implementations that also promise to be of benefit for the project. The mentioned parameters for a WPS were formalized following the paradigm that any meaningful component will be defined in terms of suitable standards. For example, the data output can be defined as a GML file. However, the choice of meaningful information pieces and user interactions is not free but partially determined by the selected WPS processing suite.
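The abstract names IDW (inverse distance weighting) as one of the interpolation algorithms and assumes a grid of interpolated values as the result. A minimal sketch of that idea follows; it is not the Tools Collection's implementation. The function names `idw` and `idw_grid`, the `power` parameter and the sample data are assumptions made for illustration.

```python
import math


def idw(samples, x, y, power=2.0):
    """Inverse distance weighting at (x, y).

    samples: list of (xi, yi, value) tuples (hypothetical sample points).
    power:   distance exponent; 2.0 is a common default, assumed here.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for xi, yi, value in samples:
        distance = math.hypot(x - xi, y - yi)
        if distance == 0.0:
            # The query point coincides with a sample: return it exactly.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


def idw_grid(samples, xs, ys, power=2.0):
    """Interpolate on a regular grid; returns one row of values per y."""
    return [[idw(samples, x, y, power) for x in xs] for y in ys]


if __name__ == "__main__":
    points = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0)]
    print(idw_grid(points, xs=[0.0, 1.0, 2.0], ys=[0.0, 1.0]))
```

In a WPS setting, the sample list, the exponent and the grid definition would correspond to the process inputs, and the returned grid to the process output.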

Keywords: deegree, interpolation, IDW, web processing service (WPS)

Procedia PDF Downloads 329
550 Multi-Objective Evolutionary Computation Based Feature Selection Applied to Behaviour Assessment of Children

Authors: F. Jiménez, R. Jódar, M. Martín, G. Sánchez, G. Sciavicco

Abstract:

Attribute or feature selection is one of the basic strategies to improve the performance of data classification tasks and, at the same time, to reduce the complexity of classifiers, and it is a particularly fundamental one when the number of attributes is relatively high. Its application to unsupervised classification is restricted to a limited number of experiments in the literature. Evolutionary computation has already proven itself to be a very effective choice to consistently reduce the number of attributes towards a better classification rate and a simpler semantic interpretation of the inferred classifiers. We present a feature selection wrapper model composed of a multi-objective evolutionary algorithm, the clustering method Expectation-Maximization (EM), and the classifier C4.5 for the unsupervised classification of data extracted from a psychological test named BASC-II (Behavior Assessment System for Children - II ed.) with two objectives: maximizing the likelihood of the clustering model and maximizing the accuracy of the obtained classifier. We present a methodology to integrate feature selection for unsupervised classification, model evaluation, decision making (to choose the most satisfactory model by means of an a posteriori process in a multi-objective context), and testing. We compare the performance of the classifiers obtained by the multi-objective evolutionary algorithms ENORA and NSGA-II, and the best solution is then validated by the psychologists that collected the data.

Keywords: evolutionary computation, feature selection, classification, clustering

Procedia PDF Downloads 339
549 Using Deep Learning Real-Time Object Detection Convolution Neural Networks for Fast Fruit Recognition in the Tree

Authors: K. Bresilla, L. Manfrini, B. Morandi, A. Boini, G. Perulli, L. C. Grappadelli

Abstract:

Image/video processing for fruit in the tree using hard-coded feature extraction algorithms has shown high accuracy in recent years. While accurate, these approaches are computationally intensive and, even with high-end hardware, too slow for real-time systems. This paper details the use of deep convolution neural networks (CNNs), specifically an algorithm (YOLO - You Only Look Once) with 24+2 convolution layers. Using deep-learning techniques eliminated the need for hard-coding specific features for specific fruit shapes, color and/or other attributes. This CNN is trained on more than 5000 images of apple and pear fruits on a 960-core GPU (Graphics Processing Unit). The testing set showed an accuracy of 90%. After this, the trained model was transferred to an embedded device (Raspberry Pi gen.3) with a camera for more portability. Based on the correlation between the number of visible or detected fruits in one frame and the real number of fruits on one tree, a model was created to accommodate this error rate. The processing and detection speed of the whole platform was higher than 40 frames per second. This speed is fast enough for any grasping/harvesting robotic arm or other real-time applications.

Keywords: artificial intelligence, computer vision, deep learning, fruit recognition, harvesting robot, precision agriculture

Procedia PDF Downloads 386
548 Time Series Forecasting (TSF) Using Various Deep Learning Models

Authors: Jimeng Shi, Mahek Jain, Giri Narasimhan

Abstract:

Time Series Forecasting (TSF) is used to predict the target variables at a future time point based on the learning from previous time points. To keep the problem tractable, learning methods use data from a fixed-length window in the past as an explicit input. In this paper, we study how the performance of predictive models changes as a function of different look-back window sizes and different amounts of time to predict into the future. We also consider the performance of the recent attention-based Transformer models, which have had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (RNN, LSTM, GRU, and Transformer) along with a baseline method. The (hourly) dataset we used is the Beijing Air Quality Dataset from the UCI website, which includes a multivariate time series of many factors measured on an hourly basis for a period of 5 years (2010-14). For each model, we also report on the relationship between the performance and the look-back window sizes and the number of predicted time points into the future. Our experiments suggest that Transformer models have the best performance, with the lowest Mean Absolute Errors (MAE = 14.599, 23.273) and Root Mean Square Errors (RMSE = 23.573, 38.131) for most of our single-step and multi-step predictions. The best size for the look-back window to predict 1 hour into the future appears to be one day, while 2 or 4 days perform best for predicting 3 hours into the future.
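The look-back window and prediction horizon described in this abstract can be illustrated with a short sketch. It is a simplified, univariate version written for illustration only (the paper works with a multivariate series), and the names `make_windows`, `look_back` and `horizon` are assumptions, not the authors' code. The two error measures are the standard mean absolute error and root mean square error.

```python
import math


def make_windows(series, look_back, horizon):
    """Split a series into (input window, target) training pairs.

    Each input holds `look_back` consecutive values; the target is the
    value `horizon` steps after the end of that window.
    """
    inputs, targets = [], []
    for start in range(len(series) - look_back - horizon + 1):
        end = start + look_back
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


if __name__ == "__main__":
    hourly = [1, 2, 3, 4, 5, 6]  # toy values, not the Beijing dataset
    print(make_windows(hourly, look_back=3, horizon=1))
    print(make_windows(hourly, look_back=3, horizon=2))
```

With hourly data, a one-day look-back window corresponds to `look_back=24`, and predicting 3 hours ahead corresponds to `horizon=3`.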

Keywords: air quality prediction, deep learning algorithms, time series forecasting, look-back window

Procedia PDF Downloads 126
547 Interval Bilevel Linear Fractional Programming

Authors: F. Hamidi, N. Amiri, H. Mishmast Nehi

Abstract:

The Bilevel Programming (BP) model has been presented for a decision making process that consists of two decision makers in a hierarchical structure. In fact, BP is a model for a static two-person game (the leader player in the upper level and the follower player in the lower level) wherein each player tries to optimize his/her personal objective function under dependent constraints; this game is sequential and non-cooperative. The decision making variables are divided between the two players and one’s choice affects the other’s benefit and choices. In other words, BP consists of two nested optimization problems with two objective functions (upper and lower) where the constraint region of the upper level problem is implicitly determined by the lower level problem. In real cases, the coefficients of an optimization problem may not be precise, i.e. they may be intervals. In this paper we develop an algorithm for solving interval bilevel linear fractional programming problems, that is, bilevel problems in which both objective functions are linear fractional, the coefficients are intervals and the common constraint region is a polyhedron. From the original problem, the best and the worst bilevel linear fractional problems are derived and then, using the extended Charnes and Cooper transformation, each fractional problem is reduced to a linear problem. We can then find the best and the worst optimal values of the leader objective function by two algorithms.
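For orientation, the classical Charnes-Cooper transformation for a single-level linear fractional program is sketched below. The abstract refers to an extended variant for the bilevel interval case, which is not reproduced here; the symbols $c$, $d$, $\alpha$, $\beta$, $A$, $b$ are generic notation assumed for this illustration, and the denominator is assumed to be strictly positive on a bounded feasible set.

```latex
\begin{align*}
\text{(LFP)}\quad & \max_{x}\ \frac{c^{\top}x + \alpha}{d^{\top}x + \beta}
  \quad \text{s.t.}\quad Ax \le b,\ x \ge 0,
  \qquad d^{\top}x + \beta > 0 \text{ on the feasible set},\\[4pt]
\text{substitution:}\quad & t = \frac{1}{d^{\top}x + \beta}, \qquad y = t\,x,\\[4pt]
\text{(LP)}\quad & \max_{y,\,t}\ c^{\top}y + \alpha t
  \quad \text{s.t.}\quad Ay - b\,t \le 0,\quad d^{\top}y + \beta t = 1,\quad y \ge 0,\ t \ge 0 .
\end{align*}
```

An optimal solution $(y^{*}, t^{*})$ of the linear problem with $t^{*} > 0$ gives $x^{*} = y^{*}/t^{*}$ for the fractional problem, with the same objective value.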

Keywords: best and worst optimal solutions, bilevel programming, fractional, interval coefficients

Procedia PDF Downloads 417
546 Vehicular Speed Detection Camera System Using Video Stream

Authors: C. A. Anser Pasha

Abstract:

In this paper, a new Vehicular Speed Detection Camera System (SDCS) is presented that is applicable as an alternative to traditional radars with the same or even better accuracy. The real-time measurement and analysis of various traffic parameters such as speed and number of vehicles are increasingly required in traffic control and management. Image processing techniques are now considered an attractive and flexible method for automatic analysis and data collection in traffic engineering. Various algorithms based on image processing techniques have been applied to detect multiple vehicles and track them. The SDCS process can be divided into three successive phases. The first phase is object detection, which uses a hybrid algorithm that combines an adaptive background subtraction technique with a three-frame differencing algorithm, which rectifies the major drawback of using only adaptive background subtraction. The second phase is object tracking, which consists of three successive operations - object segmentation, object labeling, and object center extraction. The object tracking operation takes into consideration the different possible scenarios of the moving object: simple tracking, the object has left the scene, the object has entered the scene, the object is crossed by another object, and one object leaves while another enters the scene. The third phase is speed calculation, in which the speed is calculated from the number of frames consumed by the object to pass through the scene.
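Two of the steps named in this abstract, three-frame differencing and the speed calculation from a frame count, can be sketched as follows. This is an illustration under stated assumptions, not the paper's hybrid detector: the adaptive background subtraction part is omitted, frames are plain nested lists of grayscale values, and the `threshold`, `fps` and `scene_length_m` parameters are hypothetical.

```python
def three_frame_difference(prev, curr, nxt, threshold=25):
    """Binary motion mask for the middle frame.

    A pixel counts as moving only if it differs from both the previous
    and the next frame by more than `threshold` (an assumed value).
    """
    mask = []
    for row_prev, row_curr, row_next in zip(prev, curr, nxt):
        mask.append([
            1 if abs(c - p) > threshold and abs(n - c) > threshold else 0
            for p, c, n in zip(row_prev, row_curr, row_next)
        ])
    return mask


def speed_kmh(frames_in_scene, fps, scene_length_m):
    """Speed from the number of frames an object needs to cross the scene.

    Assumes the camera frame rate and the real-world length of the
    observed road section are known.
    """
    seconds = frames_in_scene / fps
    return scene_length_m / seconds * 3.6


if __name__ == "__main__":
    f1 = [[0, 0, 0]]
    f2 = [[0, 100, 0]]
    f3 = [[0, 0, 100]]
    print(three_frame_difference(f1, f2, f3))
    print(speed_kmh(frames_in_scene=30, fps=30, scene_length_m=20))
```

Requiring a change against both neighbouring frames suppresses the trailing "ghost" region that a simple two-frame difference leaves behind a moving object.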

Keywords: radar, image processing, detection, tracking, segmentation

Procedia PDF Downloads 431
545 Intelligent Recognition of Diabetes Disease via FCM Based Attribute Weighting

Authors: Kemal Polat

Abstract:

In this paper, an attribute weighting method called fuzzy C-means clustering based attribute weighting (FCMAW) has been used for classification of the diabetes disease dataset. The aims of this study are to reduce the variance within the attributes of the diabetes dataset and to improve the classification accuracy of the classifier algorithm by transforming non-linearly separable datasets into linearly separable ones. The Pima Indians Diabetes dataset has two classes: normal subjects (500 instances) and diabetes subjects (268 instances). Fuzzy C-means clustering is an improved version of the K-means clustering method and is one of the most used clustering methods in data mining and machine learning applications. In this study, as the first stage, the fuzzy C-means clustering process has been used to find the centers of the attributes in the Pima Indians diabetes dataset, and the dataset has then been weighted according to the ratios of the means of the attributes to their centers. Secondly, after the weighting process, classifier algorithms including support vector machine (SVM) and k-NN (k-nearest neighbor) classifiers have been used for classifying the weighted Pima Indians diabetes dataset. Experimental results show that the proposed attribute weighting method (FCMAW) has obtained very promising results in the classification of the Pima Indians diabetes dataset.

Keywords: fuzzy C-means clustering, fuzzy C-means clustering based attribute weighting, Pima Indians diabetes, SVM

Procedia PDF Downloads 382
544 The Searching Artificial Intelligence: Neural Evidence on Consumers' Less Aversion to Algorithm-Recommended Search Product

Authors: Zhaohan Xie, Yining Yu, Mingliang Chen

Abstract:

As research has shown a convergent tendency for aversion to AI recommendation, it is imperative to find a way to promote AI usage and better harness the technology. In the context of e-commerce, this study has found evidence that people show less aversion to algorithms when they recommend search products than when they recommend experience products. This is due to people’s different attribution of mind to AI versus humans, as suggested by mind perception theory. While people hold the belief that an algorithm has sufficient capability to think and calculate, which makes it competent to evaluate search product attributes that can be obtained before actual use, they doubt its capability to sense and feel, which is essential for evaluating experience product attributes that must be assessed after experience in person. The result of the behavioral investigation (Study 1, N=112) validated that consumers show low purchase intention towards experience products recommended by AI. A further consumer neuroscience study (Study 2, N=26) using event-related potentials (ERP) showed that consumers have a higher level of cognitive conflict when faced with an AI-recommended experience product, as reflected by a larger N2 component, while the effect disappears for search products. This research has implications for the effective employment of AI recommenders, and it extends the literature on e-commerce and marketing communication.

Keywords: algorithm recommendation, consumer behavior, e-commerce, event-related potential, experience product, search product

Procedia PDF Downloads 105
543 From Wave-Powered Propulsion to Flight with Membrane Wings: Insights Powered by High-Fidelity Immersed Boundary Methods based FSI Simulations

Authors: Rajat Mittal, Jung Hee Seo, Jacob Turner, Harshal Raut

Abstract:

The perpetual advancement in computational capabilities, coupled with the continuous evolution of software tools and numerical algorithms, is creating novel avenues for research, exploration, and application at the nexus of computational fluid and structural mechanics. Fish leverage their remarkably flexible bodies and fins to harness energy from vortices, propelling themselves with an elegance and efficiency that captivates engineers. Bats fly with unparalleled agility and speed by using their flexible membrane wings. Wave-assisted propulsion (WAP) systems, utilizing elastically mounted hydrofoils, convert wave energy into thrust. Each of these problems involves a complex and elegant interplay between fluid dynamics and structural mechanics. Historically, investigations into such phenomena were constrained by available tools, but modern computational advancements now facilitate exploration of these multi-physics challenges with an unprecedented level of fidelity, precision, and realism. In this work, the author will discuss projects that harness the capabilities of high-fidelity sharp-interface immersed boundary methods to address a spectrum of engineering and biological challenges involving fluid-structure interaction.

Keywords: immersed boundary methods, CFD, bioflight, fluid structure interaction

Procedia PDF Downloads 28
542 Comparative Performance of Artificial Bee Colony Based Algorithms for Wind-Thermal Unit Commitment

Authors: P. K. Singhal, R. Naresh, V. Sharma

Abstract:

This paper presents three optimization models, namely the New Binary Artificial Bee Colony (NBABC) algorithm, NBABC with Local Search (NBABC-LS), and NBABC with Genetic Crossover (NBABC-GC), for solving the Wind-Thermal Unit Commitment (WTUC) problem. The uncertain nature of the wind power is incorporated using the Weibull probability density function, which is used to calculate the overestimation and underestimation costs associated with the wind power fluctuation. The NBABC algorithm utilizes a mechanism based on the dissimilarity measure between binary strings for generating the binary solutions in the WTUC problem. In the NBABC algorithm, an intelligent scout bee phase is proposed that replaces the abandoned solution with the global best solution. The local search operator exploits the neighboring region of the current solutions, whereas the integration of genetic crossover with the NBABC algorithm increases the diversity in the search space and thus avoids the problem of local trapping encountered with the NBABC algorithm. These models are then used to decide the units' on/off status, whereas the lambda iteration method is used to dispatch the hourly load demand among the committed units. The effectiveness of the proposed models is validated on an IEEE 10-unit thermal system combined with a wind farm over a planning period of 24 hours.
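The lambda iteration method mentioned in this abstract dispatches a load among committed units by searching for a common incremental cost. The sketch below shows the textbook form with bisection on lambda, under assumptions not taken from the paper: quadratic fuel costs a + b·P + c·P² per unit, no wind or ramp constraints, and hypothetical coefficient values. The name `lambda_dispatch` and the dictionary keys are illustrative.

```python
def lambda_dispatch(units, demand, tol=1e-6, max_iter=200):
    """Economic dispatch of `demand` among committed units.

    units: list of dicts with keys "b", "c" (cost coefficients of
           a + b*P + c*P**2, with c > 0) and "p_min", "p_max" (limits).
    Returns (lambda, [P_1, ..., P_n]).
    """
    def output(unit, lam):
        # Incremental cost b + 2cP equals lambda, clipped to the limits.
        p = (lam - unit["b"]) / (2.0 * unit["c"])
        return min(max(p, unit["p_min"]), unit["p_max"])

    if not (sum(u["p_min"] for u in units) <= demand
            <= sum(u["p_max"] for u in units)):
        raise ValueError("demand cannot be met by the committed units")

    # Bracket lambda between the smallest and largest incremental cost.
    lo = min(u["b"] + 2.0 * u["c"] * u["p_min"] for u in units)
    hi = max(u["b"] + 2.0 * u["c"] * u["p_max"] for u in units)
    lam = (lo + hi) / 2.0
    for _ in range(max_iter):
        lam = (lo + hi) / 2.0
        total = sum(output(u, lam) for u in units)
        if abs(total - demand) < tol:
            break
        if total < demand:
            lo = lam  # too little generation: raise lambda
        else:
            hi = lam  # too much generation: lower lambda
    return lam, [output(u, lam) for u in units]


if __name__ == "__main__":
    committed = [
        {"b": 10.0, "c": 0.05, "p_min": 0.0, "p_max": 100.0},
        {"b": 12.0, "c": 0.05, "p_min": 0.0, "p_max": 100.0},
    ]
    print(lambda_dispatch(committed, demand=100.0))
```

In a unit commitment loop, a routine of this kind would be called once per hour for each candidate on/off schedule, using only the units that the schedule has switched on.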

Keywords: artificial bee colony algorithm, economic dispatch, unit commitment, wind power

Procedia PDF Downloads 351
541 Real-Time Multi-Vehicle Tracking Application at Intersections Based on Feature Selection in Combination with Color Attribution

Authors: Qiang Zhang, Xiaojian Hu

Abstract:

This work on multi-vehicle tracking combines feature selection with color attribution to track vehicles in video efficiently and with minimal error. It aims to present a simple and fast, yet accurate and robust, solution to problems such as the inaccurate and untimely responses of statistics-based adaptive traffic control systems in the intersection scenario. In this study, a real-time tracking system is proposed for multi-vehicle tracking in the intersection scene. Considering the complexity and application feasibility of the algorithm, in the object detection step, the detection results provided by virtual loops were post-processed and then used as the input for the tracker. For the tracker, lightweight methods were designed to extract and select features and incorporate them into the adaptive color tracking (ACT) framework. The approved online feature selection algorithms are integrated into the mature ACT system with good compatibility. The proposed feature selection methods and multi-vehicle tracking method are evaluated on KITTI datasets and show efficient vehicle tracking performance when compared to other state-of-the-art approaches in the same category. The system also performs excellently on video sequences recorded at the intersection. Furthermore, the presented vehicle tracking system is suitable for surveillance applications.

Keywords: real-time, multi-vehicle tracking, feature selection, color attribution

Procedia PDF Downloads 120
540 Implementation of a Multimodal Biometrics Recognition System with Combined Palm Print and Iris Features

Authors: Rabab M. Ramadan, Elaraby A. Elgallad

Abstract:

With extensive application, unimodal biometric systems have to face a diversity of problems such as signal and background noise, distortion, and environment differences. Therefore, multimodal biometric systems are proposed to solve the above stated problems. This paper introduces a bimodal biometric recognition system based on the extracted features of the human palm print and iris. Palm print biometrics is a fairly new, evolving technology that is used to identify people by their palm features. The iris is a strong competitor, together with face and fingerprints, for presence in multimodal recognition systems. In this research, we introduce an algorithm for combining the palm and iris features extracted using a texture-based descriptor, the Scale Invariant Feature Transform (SIFT). Since the feature sets are non-homogeneous, as features of different biometric modalities are used, these features will be concatenated to form a single feature vector. Particle swarm optimization (PSO) is used as a feature selection technique to reduce the dimensionality of the feature vector. The proposed algorithm will be applied to the Indian Institute of Technology Delhi (IITD) database and its performance will be compared with various iris recognition algorithms found in the literature.

Keywords: iris recognition, particle swarm optimization, feature extraction, feature selection, palm print, the Scale Invariant Feature Transform (SIFT)

Procedia PDF Downloads 193
539 An Optimization Algorithm for Reducing the Liquid Oscillation in the Moving Containers

Authors: Reza Babajanivalashedi, Stefania Lo Feudo, Jean-Luc Dion

Abstract:

Liquid sloshing is a crucial problem for the dynamics of moving containers in the packaging industries. Sloshing issues have so far mainly been modeled within the framework of fluid dynamics or by using equivalent mechanical models with different kinds of movements and shapes of containers. Nevertheless, these approaches do not allow the shape of the free surface of the liquid to be determined when the moving containers have an irregular shape, so experimental measurements may be required. If there is too much slosh in the moving tank, the liquid can be splashed out on the packages. So, the free surface oscillation must be controlled or reduced to eliminate the splashing. The purpose of this research is to propose an optimization algorithm for finding an optimum command law to reduce surface elevation. In the first step, the free surface of the liquid is simulated based on the separation of variables and weak formulation models. Then genetic and gradient algorithms are developed for finding the optimum command law. The optimum command law is compared with existing command laws, and the results show that there is a significant difference in surface oscillation between the optimum and existing command laws. This algorithm is applicable to different varieties of bottles when a camera is used to detect the liquid elevation, and it can produce new command laws for different kinds of tanks to reduce the surface oscillation and remove the splashing phenomenon.

Keywords: sloshing phenomenon, separation variables, weak formulation, optimization algorithm, command law

Procedia PDF Downloads 117
538 Classification of Potential Biomarkers in Breast Cancer Using Artificial Intelligence Algorithms and Anthropometric Datasets

Authors: Aref Aasi, Sahar Ebrahimi Bajgani, Erfan Aasi

Abstract:

Breast cancer (BC) continues to be the most frequent cancer in females and causes the highest number of cancer-related deaths in women worldwide. Inspired by recent advances in studying the relationship between different patient attributes and features and the disease, in this paper we investigate different classification methods for better diagnosis of BC in the early stages. In this regard, datasets from the University Hospital Centre of Coimbra were chosen, and different machine learning (ML)-based and neural network (NN) classifiers have been studied. For this purpose, we have selected favorable features among the nine provided attributes from the clinical dataset by using a random forest algorithm. This dataset consists of both healthy controls and BC patients, and it was noted that glucose, BMI, resistin, and age have the most importance, in that order. Moreover, we have analyzed these features with various ML-based classifier methods, including Decision Tree (DT), K-Nearest Neighbors (KNN), eXtreme Gradient Boosting (XGBoost), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machine (SVM), along with an NN-based Multi-Layer Perceptron (MLP) classifier. The results revealed that, among the different techniques, the SVM and MLP classifiers achieve the highest accuracy, at 96% and 92%, respectively. These results indicate that the adopted procedure could be used effectively for the classification of cancer cells, and they encourage further experimental investigations with more collected data for other types of cancers.

Keywords: breast cancer, diagnosis, machine learning, biomarker classification, neural network

Procedia PDF Downloads 96
537 Automatic Early Breast Cancer Segmentation Enhancement by Image Analysis and Hough Transform

Authors: David Jurado, Carlos Ávila

Abstract:

Detection of early signs of breast cancer development is crucial to quickly diagnose the disease and to define adequate treatment to increase the survival probability of the patient. Computer Aided Detection systems (CADs), along with modern data techniques such as Machine Learning (ML) and Neural Networks (NN), have shown an overall improvement in digital mammography cancer diagnosis, reducing the false positive and false negative rates and becoming important tools for the diagnostic evaluations performed by specialized radiologists. However, ML and NN-based algorithms rely on datasets that might bring issues to the segmentation tasks. In the present work, an automatic segmentation and detection algorithm is described. This algorithm uses image processing techniques along with the Hough transform to automatically identify microcalcifications that are highly correlated with breast cancer development in the early stages. Along with image processing, automatic segmentation of high-contrast objects is done using edge extraction and the circle Hough transform. This provides the geometrical features needed for an automatic mask design which extracts statistical features of the regions of interest. The results shown in this study demonstrate the potential of this tool for further diagnostics and classification of mammographic images due to its low sensitivity to noisy images and low-contrast mammograms.

Keywords: breast cancer, segmentation, X-ray imaging, hough transform, image analysis

Procedia PDF Downloads 43
536 Comparative Analysis of Classification Methods in Determining Non-Active Student Characteristics in Indonesia Open University

Authors: Dewi Juliah Ratnaningsih, Imas Sukaesih Sitanggang

Abstract:

Classification is one of data mining techniques that aims to discover a model from training data that distinguishes records into the appropriate category or class. Data mining classification methods can be applied in education, for example, to determine the classification of non-active students in Indonesia Open University. This paper presents a comparison of three methods of classification: Naïve Bayes, Bagging, and C.45. The criteria used to evaluate the performance of three methods of classification are stratified cross-validation, confusion matrix, the value of the area under the ROC Curve (AUC), Recall, Precision, and F-measure. The data used for this paper are from the non-active Indonesia Open University students in registration period of 2004.1 to 2012.2. Target analysis requires that non-active students were divided into 3 groups: C1, C2, and C3. Data analyzed are as many as 4173 students. Results of the study show: (1) Bagging method gave a high degree of classification accuracy than Naïve Bayes and C.45, (2) the Bagging classification accuracy rate is 82.99 %, while the Naïve Bayes and C.45 are 80.04 % and 82.74 % respectively, (3) the result of Bagging classification tree method has a large number of nodes, so it is quite difficult in decision making, (4) classification of non-active Indonesia Open University student characteristics uses algorithms C.45, (5) based on the algorithm C.45, there are 5 interesting rules which can describe the characteristics of non-active Indonesia Open University students.
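Several of the evaluation criteria listed in this abstract (accuracy, Recall, Precision, F-measure) can be derived directly from a confusion matrix. The sketch below shows the standard per-class definitions for a three-class problem such as C1, C2, C3. The matrix values are made up for illustration and are not the study's results; the function names are assumptions.

```python
def accuracy(confusion):
    """Share of records on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total


def per_class_metrics(confusion):
    """Precision, recall and F-measure for each class.

    confusion[i][j] is the number of records whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    results = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp
        fp = sum(confusion[i][k] for i in range(n)) - tp
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f_measure = 2 * precision * recall / (precision + recall)
        else:
            f_measure = 0.0
        results.append((precision, recall, f_measure))
    return results


if __name__ == "__main__":
    # Hypothetical counts for classes C1, C2, C3 (rows: true class).
    matrix = [[8, 2, 0],
              [1, 7, 2],
              [0, 1, 9]]
    print(accuracy(matrix))
    for name, scores in zip(["C1", "C2", "C3"], per_class_metrics(matrix)):
        print(name, scores)
```

Under stratified cross-validation, the confusion matrices of the individual folds would typically be summed before these measures are computed.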

Keywords: comparative analysis, data mining, classification, Bagging, Naïve Bayes, C4.5, non-active students, Indonesia Open University

Procedia PDF Downloads 288
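
The evaluation criteria listed in the abstract above (confusion matrix, accuracy, recall, precision, F-measure) can be computed as in the following minimal sketch. The class labels C1 to C3 follow the abstract, but the counts in `confusion` are made up for illustration and are not the study's data.

```python
def accuracy(confusion):
    """Share of records on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def per_class_metrics(confusion, labels):
    """Return {label: (precision, recall, f_measure)}.

    confusion[i][j] is the number of records whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                      # missed members of class k
        fp = sum(row[k] for row in confusion) - tp       # wrongly assigned to class k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f_measure = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f_measure)
    return metrics


# Hypothetical counts, rows = true class, columns = predicted class.
labels = ["C1", "C2", "C3"]
confusion = [
    [50, 5, 5],
    [10, 20, 10],
    [0, 5, 45],
]
overall = accuracy(confusion)
by_class = per_class_metrics(confusion, labels)
```

With stratified cross-validation, one such matrix is typically accumulated over all folds before the metrics are computed.
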
535 Crop Classification using Unmanned Aerial Vehicle Images

Authors: Iqra Yaseen

Abstract:

Image processing in the context of computer vision, one of the well-known areas of computer science and engineering, has been essential to automation. In remote sensing, medical science, and many other fields, it has made it easier to uncover previously undiscovered facts. Grading of diverse items is now possible because of neural network algorithms, categorization, and digital image processing. Its use in the classification of agricultural products, particularly in the grading of seeds or grains and their cultivars, is widely recognized. A grading and sorting system saves time and ensures consistency and uniformity. Global population growth has led to an increase in demand for food staples, biofuel, and other agricultural products. To meet this demand, available resources must be used and managed more effectively. Image processing is rapidly growing in the field of agriculture. Many applications have been developed using this approach for crop identification and classification, land and disease detection, and for measuring other crop parameters. Vegetation localization is the basis for performing these tasks. It helps to identify the area where the crop is present. The productivity of the agriculture industry can be increased via image processing based on Unmanned Aerial Vehicle (UAV) photography and satellite imagery. In this paper, we apply machine learning techniques such as Convolutional Neural Networks, deep learning, image processing, classification, and You Only Look Once (YOLO) to a UAV imaging dataset to divide the crops into distinct groups and choose the best way to use it.

Keywords: image processing, UAV, YOLO, CNN, deep learning, classification

Procedia PDF Downloads 64
534 Antibacterial Evaluation, in Silico ADME and QSAR Studies of Some Benzimidazole Derivatives

Authors: Strahinja Kovačević, Lidija Jevrić, Miloš Kuzmanović, Sanja Podunavac-Kuzmanović

Abstract:

In this paper, various derivatives of benzimidazole have been evaluated against the Gram-negative bacterium Escherichia coli. For all investigated compounds, the minimum inhibitory concentration (MIC) was determined. Quantitative structure-activity relationship (QSAR) analysis attempts to find consistent relationships between the variations in the values of molecular properties and the biological activity for a series of compounds, so that these rules can be used to evaluate new chemical entities. The correlation between MIC and some absorption, distribution, metabolism and excretion (ADME) parameters was investigated, and mathematical models for predicting the antibacterial activity of this class of compounds were developed. The quality of the multiple linear regression (MLR) models was validated by the leave-one-out (LOO) technique, as well as by the calculation of statistical parameters for the developed models, and the results are discussed on the basis of the statistical data. The results of this study indicate that ADME parameters have a significant effect on the antibacterial activity of this class of compounds. Principal component analysis (PCA) and agglomerative hierarchical clustering algorithms (HCA) confirmed that the investigated molecules can be classified into groups on the basis of the ADME parameters: Madin-Darby Canine Kidney cell permeability (MDCK), plasma protein binding (PPB%), human intestinal absorption (HIA%) and human colon carcinoma cell permeability (Caco-2).

Keywords: benzimidazoles, QSAR, ADME, in silico

Procedia PDF Downloads 345
533 Hardware in the Loop Platform for Virtual Commissioning: Case Study of a Hydraulic-Press Model Simulated in Real-Time

Authors: Jorge Rodriguez-Guerra, Carlos Calleja, Aron Pujana, Ana Maria Macarulla

Abstract:

Hydraulic-press commissioning consumes a great number of man-hours because it takes place several miles away from where the press was designed. This is exacerbated by control designers’ lack of knowledge of the final controller gains before they start working with the machine. Virtual commissioning has been postulated as an optimal solution to deal with this lack of knowledge. Here, a case study is presented in which a controller is set up against a real-time model of a hydraulic press. The press model is designed following manufacturer specifications and is embedded in a real-time simulator. This methodology ensures that the model achieves responses similar to those of the real machine that would be installed in industry. A deterministic communication protocol is in charge of the bidirectional information transmission between the real-time model and the controller. This platform allows the engineer to test and verify the final control responses with exactly the same hardware that is going to be installed in the hydraulic press; in other words, to carry out a virtual commissioning of the electro-hydraulic actuator. The Hardware in the Loop (HiL) platform validates the designed control algorithms under laboratory conditions that are harmless to the machine, which allows embedding them afterwards in the industrial environment without further modifications.

Keywords: deterministic communication protocol, electro-hydraulic actuator, hardware in the loop, real-time, virtual commissioning

Procedia PDF Downloads 113
532 Production Planning for Animal Food Industry under Demand Uncertainty

Authors: Pirom Thangchitpianpol, Suttipong Jumroonrut

Abstract:

This research investigates the distribution of demand for animal food and the optimum amount of food production at minimum cost. The data consist of customer purchase orders for laying-hen food, the price of laying-hen food, the unit cost of food inventory, and costs incurred when the food is out of stock, such as fines, overtime, and urgent purchases of material. They were collected from January 1990 to December 2013 from a factory in Nakhonratchasima province. The collected data are analyzed in order to explore the distribution of the monthly food demand for laying hens and to determine the inventory rate per unit. The results are used in a stochastic linear programming model for aggregate planning, from which the optimum production at minimum cost can be obtained. Programming algorithms in MATLAB and tools in the Linprog software are used to obtain the solution. The distribution of the food demand for laying hens and random numbers are used in the model. The study shows that the monthly food demand for laying hens follows a normal distribution and reports the monthly average production amount (unit: 30 kg) from January to December. The minimum average total cost over 12 months is Baht 62,329,181.77. Therefore, the production planning can reduce the cost by 14.64% relative to the actual cost.

Keywords: animal food, stochastic linear programming, aggregate planning, production planning, demand uncertainty

Procedia PDF Downloads 353
531 BeamGA Median: A Hybrid Heuristic Search Approach

Authors: Ghada Badr, Manar Hosny, Nuha Bintayyash, Eman Albilali, Souad Larabi Marie-Sainte

Abstract:

The median problem is widely applied to derive the most reasonable rearrangement phylogenetic tree for many species. More specifically, the problem is concerned with finding a permutation that minimizes the sum of distances between itself and a set of three signed permutations. Genomes with an equal number of genes but a different gene order can be represented as permutations. In this paper, an algorithm, namely BeamGA median, is proposed that uses a heuristic search approach (local beam) as an initialization step to generate a number of solutions and then applies a Genetic Algorithm (GA) to refine the solutions, aiming to achieve a better median with the smallest possible reversal distance from the three original permutations. In this approach, any genome rearrangement distance can be applied; in this paper, we use the reversal distance. To the best of our knowledge, the proposed approach has not been applied before to solve the median problem. Our approach considers a true biological evolution scenario by applying the concept of common intervals during the GA optimization process. This allows us to imitate true biological behavior and improve the convergence time of the genetic approach. We were able to handle permutations with a large number of genes within an acceptable running time and with the same or better accuracy compared to existing algorithms.

Keywords: median problem, phylogenetic tree, permutation, genetic algorithm, beam search, genome rearrangement distance

Procedia PDF Downloads 240
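
The objective in the abstract above, a candidate permutation scored by the sum of its distances to three signed permutations, can be sketched as follows. Because the abstract notes that any genome rearrangement distance can be plugged in, this sketch substitutes the much simpler breakpoint distance for the reversal distance the authors use; the function names and the example genomes are hypothetical, and none of this reproduces BeamGA itself.

```python
def reverse_segment(perm, i, j):
    """Signed reversal: reverse perm[i..j] (inclusive) and flip every sign."""
    return perm[:i] + [-g for g in reversed(perm[i:j + 1])] + perm[j + 1:]


def adjacencies(perm):
    """All adjacencies of a signed permutation framed by 0 and n+1,
    stored in both reading directions."""
    framed = [0] + list(perm) + [len(perm) + 1]
    pairs = set()
    for a, b in zip(framed, framed[1:]):
        pairs.add((a, b))
        pairs.add((-b, -a))  # the same adjacency read on the opposite strand
    return pairs


def breakpoint_distance(p, q):
    """Number of adjacencies of p that are not conserved in q."""
    q_pairs = adjacencies(q)
    framed = [0] + list(p) + [len(p) + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if (a, b) not in q_pairs)


def median_score(candidate, genomes, distance=breakpoint_distance):
    """Sum of distances from the candidate median to the given genomes."""
    return sum(distance(candidate, g) for g in genomes)


# Hypothetical genomes over four genes.
identity = [1, 2, 3, 4]
one_reversal = reverse_segment(identity, 1, 2)   # [1, -3, -2, 4]
score = median_score(identity, [identity, one_reversal, one_reversal])
```

A search procedure such as a beam search or a genetic algorithm would call `median_score` repeatedly on candidate permutations and keep those with the lowest value.
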
530 Road Traffic Accidents Analysis in Mexico City through Crowdsourcing Data and Data Mining Techniques

Authors: Gabriela V. Angeles Perez, Jose Castillejos Lopez, Araceli L. Reyes Cabello, Emilio Bravo Grajales, Adriana Perez Espinosa, Jose L. Quiroz Fabian

Abstract:

Road traffic accidents are among the principal causes of traffic congestion, causing human losses, damage to health and the environment, economic losses and material damage. Traditional studies of road traffic accidents in urban zones require a very high investment of time and money; additionally, the results are not current. However, nowadays in many countries, crowdsourced GPS-based traffic and navigation apps have emerged as an important low-cost source of information for studies of road traffic accidents and the urban congestion caused by them. In this article, we identified the zones, roads and specific times in Mexico City (CDMX) in which the largest number of road traffic accidents were concentrated during 2016. We built a database compiling information obtained from the social network known as Waze. The methodology employed was Knowledge Discovery in Databases (KDD) for the discovery of patterns in the accident reports, using data mining techniques with the help of Weka. The selected algorithms were Expectation-Maximization (EM), to obtain the ideal number of clusters for the data, and k-means as a grouping method. Finally, the results were visualized with the Geographic Information System QGIS.

Keywords: data mining, k-means, road traffic accidents, Waze, Weka

Procedia PDF Downloads 377
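
The k-means grouping step mentioned in the abstract above can be sketched in a few lines. This is a plain-Python illustration, not the Weka implementation used in the study: the coordinates are made up, the initial centroids are supplied by hand, and the EM step that chooses the number of clusters is not shown.

```python
import math


def kmeans(points, initial_centroids, max_iterations=100):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid recomputation until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]

    def nearest(p):
        return min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))

    for _ in range(max_iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(p)].append(p)
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[k]          # keep an empty cluster's centroid
            for k, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated

    labels = [nearest(p) for p in points]
    return centroids, labels


# Hypothetical accident locations on an arbitrary planar grid.
reports = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(reports, initial_centroids=[(0, 0), (10, 10)])
```

For real longitude and latitude data, a projected coordinate system or a great-circle distance would be a more appropriate choice than the Euclidean distance used here.
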
529 Charting Sentiments with Naive Bayes and Logistic Regression

Authors: Jummalla Aashrith, N. L. Shiva Sai, K. Bhavya Sri

Abstract:

The swift progress of web technology has not only amassed a vast reservoir of internet data but also triggered a substantial surge in data generation. The internet has metamorphosed into one of the dynamic hubs for online education, idea dissemination, and opinion-sharing. Notably, the widely utilized social networking platform Twitter is experiencing considerable expansion, providing users with the ability to share viewpoints, participate in discussions spanning diverse communities, and broadcast messages on a global scale. The upswing in online engagement has sparked significant interest in subjectivity analysis, particularly when it comes to Twitter data. This research is committed to delving into sentiment analysis, focusing specifically on the realm of Twitter. It aims to offer valuable insights into deciphering information within tweets, where opinions manifest in a highly unstructured and diverse manner, spanning a spectrum from positivity to negativity, occasionally punctuated by neutral expressions. Within this document, we offer a comprehensive exploration and comparative assessment of modern approaches to opinion mining. Employing a range of machine learning algorithms such as Naive Bayes and Logistic Regression, our investigation plunges into the domain of Twitter data streams. We delve into overarching challenges and applications inherent in the realm of subjectivity analysis over Twitter.

Keywords: machine learning, sentiment analysis, visualisation, python

Procedia PDF Downloads 22
528 A Clustering Algorithm for Massive Texts

Authors: Ming Liu, Chong Wu, Bingquan Liu, Lei Chen

Abstract:

Internet users face a massive amount of textual data every day. Organizing texts into categories can help users dig useful information out of large-scale text collections. Clustering is one of the most promising tools for categorizing texts because it is unsupervised. Unfortunately, most traditional clustering algorithms lose quality on large-scale text collections. This is mainly attributable to the high-dimensional vectors generated from texts. To cluster large-scale text collections effectively and efficiently, this paper proposes a vector reconstruction based clustering algorithm. Only the features that can represent the cluster are preserved in the cluster’s representative vector. This algorithm alternately repeats two sub-processes until it converges. One is the partial tuning sub-process, where each feature’s weight is fine-tuned by an iterative process. To accelerate clustering, an intersection based similarity measurement and its corresponding neuron adjustment function are proposed and implemented in this sub-process. The other is the overall tuning sub-process, where the features are reallocated among different clusters. In this sub-process, the features that are useless for representing the cluster are removed from the cluster’s representative vector. Experimental results on three text collections (two small-scale and one large-scale) demonstrate that our algorithm obtains high quality on both small-scale and large-scale text collections.

Keywords: vector reconstruction, large-scale text clustering, partial tuning sub-process, overall tuning sub-process

Procedia PDF Downloads 404
527 Nonlinear Analysis in Investigating the Complexity of Neurophysiological Data during Reflex Behavior

Authors: Juliana A. Knocikova

Abstract:

Methods of nonlinear signal analysis are based on the finding that random behavior can arise in deterministic nonlinear systems with a few degrees of freedom. In dynamical systems, entropy is usually understood as a rate of information production. Changes in the temporal dynamics of physiological data indicate how the system evolves in time, and thus the level of new signal pattern generation. During the last decades, many algorithms have been introduced to assess patterns of physiological responses to external stimuli. However, reflex responses are usually characterized by short periods of time. This characteristic represents a great limitation for the usual methods of nonlinear analysis. To solve the problem of short recordings, the approximate entropy parameter has been introduced as a measure of system complexity. A low value of this parameter reflects regularity and predictability in the analyzed time series. On the other hand, an increase in this parameter means unpredictability and random behavior, hence a higher system complexity. Reduced neurophysiological data complexity has been observed repeatedly when analyzing electroneurogram and electromyogram activities during defence reflex responses. Quantitative phrenic neurogram changes are also obvious during severe hypoxia, as well as during airway reflex episodes. In conclusion, the approximate entropy parameter serves as a convenient tool for the analysis of reflex behavior characterized by short time series.

Keywords: approximate entropy, neurophysiological data, nonlinear dynamics, reflex

Procedia PDF Downloads 278
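
The approximate entropy parameter discussed in the abstract above can be computed with the standard template-matching definition, sketched below. The embedding dimension `m`, the tolerance `r` (taken here as an absolute value; it is often set to a fraction of the signal's standard deviation), and both test signals are assumptions for illustration, not values from the study.

```python
import math
import random


def _phi(series, m, r):
    """Average log-frequency with which length-m templates recur within r."""
    count = len(series) - m + 1
    templates = [series[i:i + m] for i in range(count)]
    total = 0.0
    for a in templates:
        # Chebyshev distance between templates; the self-match is included,
        # so the logarithm is always defined.
        matches = sum(
            1 for b in templates
            if max(abs(x - y) for x, y in zip(a, b)) <= r
        )
        total += math.log(matches / count)
    return total / count


def approximate_entropy(series, m=2, r=0.2):
    """ApEn(m, r) = phi_m(r) - phi_{m+1}(r)."""
    if len(series) < m + 2:
        raise ValueError("series is too short for the chosen m")
    return _phi(series, m, r) - _phi(series, m + 1, r)


# Hypothetical short recordings: a strictly periodic signal and a noisy one.
regular = [0.0, 1.0] * 20
rng = random.Random(0)
irregular = [rng.random() for _ in range(40)]

apen_regular = approximate_entropy(regular)
apen_irregular = approximate_entropy(irregular)
```

Consistent with the abstract, the periodic signal yields a value close to zero, while the noisy signal yields a larger one.
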
526 Distances over Incomplete Diabetes and Breast Cancer Data Based on Bhattacharyya Distance

Authors: Loai AbdAllah, Mahmoud Kaiyal

Abstract:

Missing values in real-world datasets are a common problem. Many algorithms have been developed to deal with this problem; most of them replace the missing values with a fixed value computed from the observed values. In our work, we used a distance function based on the Bhattacharyya distance, which measures the similarity of two probability distributions, to measure the distance between objects with missing values. The proposed distance distinguishes between known and unknown values: the distance between two known values is the Mahalanobis distance, whereas when one of them is missing, the distance is computed based on the distribution of the known values for the coordinate that contains the missing value. This method was integrated with Wikaya, a digital health company developing a platform that helps to improve the prevention of chronic diseases such as diabetes and cancer. In order for Wikaya’s recommendation system to work, the distance between users needs to be measured. Since there are missing values in the collected data, there is a need for a distance function that measures distances between incomplete user profiles. To evaluate the accuracy of the proposed distance function in reflecting the actual similarity between different objects when some of them contain missing values, we integrated it within the framework of the k nearest neighbors (kNN) classifier, since its computation is based only on the similarity between objects. To validate this, we ran the algorithm over diabetes and breast cancer datasets, standard benchmark datasets from the UCI repository. Our experiments show that the kNN classifier using our proposed distance function outperforms kNN using other existing methods.

Keywords: missing values, incomplete data, distance, incomplete diabetes data

Procedia PDF Downloads 187
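
The Bhattacharyya distance that the abstract above builds on can be written down for two discrete probability distributions as in the sketch below. It shows only the textbook definition; how the authors combine it with the Mahalanobis distance for coordinates with missing values is not specified in the abstract and is not reproduced here. The example histograms are hypothetical.

```python
import math


def bhattacharyya_distance(p, q):
    """D_B(p, q) = -ln( sum_i sqrt(p_i * q_i) ) for two discrete
    distributions defined over the same bins."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of bins")
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    if coefficient == 0.0:
        return math.inf                     # no overlap at all
    # Guard against rounding pushing the coefficient slightly above 1.
    return -math.log(min(coefficient, 1.0))


# Hypothetical normalised histograms of one feature in two groups.
group_a = [0.1, 0.4, 0.5]
group_b = [0.3, 0.4, 0.3]
distance_ab = bhattacharyya_distance(group_a, group_b)
```

The distance is zero for identical distributions, grows as the overlap shrinks, and is symmetric in its two arguments.
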
525 Infodemic Detection on Social Media with a Multi-Dimensional Deep Learning Framework

Authors: Raymond Xu, Cindy Jingru Wang

Abstract:

Social media has become a globally connected and influential platform. Social media data, such as tweets, can help predict the spread of pandemics and provide individuals and healthcare providers with early warnings. Public psychological reactions and opinions can be efficiently monitored by AI models that track the progression of dominant topics on Twitter. However, statistics show that as the coronavirus spreads, so does an infodemic of misinformation due to pandemic-related factors such as unemployment and lockdowns. Social media algorithms are often biased toward outrage by promoting content that people have an emotional reaction to and are likely to engage with. This can influence users’ attitudes and cause confusion. Therefore, social media is a double-edged sword. Combating fake news and biased content has become one of the essential tasks. This research analyzes the variety of methods used for fake news detection, covering random forest, logistic regression, support vector machines, decision tree, naive Bayes, BoW, TF-IDF, LDA, CNN, RNN, LSTM, DeepFake, and hierarchical attention network. The performance of each method is analyzed. Based on these models’ achievements and limitations, a multi-dimensional AI framework is proposed to achieve higher accuracy in infodemic detection, especially for pandemic-related news. The model is trained on contextual content, images, and news metadata.

Keywords: artificial intelligence, fake news detection, infodemic detection, image recognition, sentiment analysis

Procedia PDF Downloads 203