Search results for: clustering algorithm
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3927

2397 A Review on Comparative Analysis of Path Planning and Collision Avoidance Algorithms

Authors: Divya Agarwal, Pushpendra S. Bharti

Abstract:

Autonomous mobile robots (AMR) are expected to serve as smart tools for operations in every automation industry. Path planning and obstacle avoidance are the backbone of AMR, as robots have to reach their goal location while avoiding obstacles and traversing an optimized path defined according to criteria such as distance, time or energy. Path planning can be classified into global and local path planning, where environmental information is known and unknown/partially known, respectively. A number of sensors are used for data collection. A number of algorithms, such as artificial potential field (APF), rapidly exploring random trees (RRT), bidirectional RRT, the fuzzy approach, pure pursuit, the A* algorithm, vector field histogram (VFH) and a modified local path planning algorithm, have been used in the last three decades for path planning and obstacle avoidance for AMR. This paper makes an attempt to review some of the path planning and obstacle avoidance algorithms used in the field of AMR. The review includes comparative analysis of simulation and mathematical computations of path planning and obstacle avoidance algorithms using MATLAB 2018a. From the review, it could be concluded that different algorithms may complete the same task (i.e. with a different set of instructions) in less or more time, space, effort, etc.

Keywords: path planning, obstacle avoidance, autonomous mobile robots, algorithms

2396 Design of Personal Job Recommendation Framework on Smartphone Platform

Authors: Chayaporn Kaensar

Abstract:

Recently, Job Recommender Systems have gained much attention in industry since they solve the problem of information overload on recruiting websites. Therefore, we propose an Extended Personalized Job System that is capable of providing appropriate jobs for job seekers and recommending suitable information to them using Data Mining Techniques and a Dynamic User Profile. On the other hand, companies can also interact with the system to publish and update job information. This system supports various platforms, such as a web application and an Android mobile application. In this paper, User Profiles, Implicit User Action, User Feedback, and Clustering Techniques from the WEKA libraries are implemented for this application. In addition, open source tools like the Yii Web Application Framework, the Bootstrap Front End Framework and Android Mobile Technology were also applied.

Keywords: recommendation, user profile, data mining, web and mobile technology

2395 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the use of technology that helps discover diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. This has great potential to provide accurate medical diagnosis and to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied to such microarray datasets to devise methods that can predict the occurrence of leukemia. In this study, we compared the classification accuracy and response time among eleven decision tree methods and six rule classifier methods using five performance criteria. The experimental results show that Random Tree produces better results and takes the lowest time to build a model among the tree classifiers. Among the classification rule algorithms, the nearest-neighbor-like algorithm (NNge) is the best due to its high accuracy, and it takes the lowest time to build a model.

Keywords: data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data

2394 Using Jumping Particle Swarm Optimization for Optimal Operation of Pump in Water Distribution Networks

Authors: R. Rajabpour, N. Talebbeydokhti, M. H. Ahmadi

Abstract:

Carefully scheduling the operations of pumps can result in significant energy savings. Schedules can be defined either implicitly, in terms of other elements of the network such as tank levels, or explicitly, by specifying the time during which each pump is on/off. In this study, two new explicit representations based on time-controlled triggers were analyzed, where the maximum number of pump switches was established beforehand, and the schedule may contain fewer switches than the maximum. The optimal operation of pumping stations was determined using a Jumping Particle Swarm Optimization (JPSO) algorithm to achieve the minimum energy cost. The model integrates the JPSO optimizer and the EPANET hydraulic network solver. The optimal pump operation schedule of the VanZyl water distribution system was determined using the proposed model and compared with those from Genetic and Ant Colony algorithms. The results indicate that the proposed model utilizing the JPSO algorithm outperformed the others and is a versatile management model for the operation of real-world water distribution systems.

Keywords: JPSO, operation, optimization, water distribution system

2393 Reusing Assessments Tests by Generating Arborescent Test Groups Using a Genetic Algorithm

Authors: Ovidiu Domşa, Nicolae Bold

Abstract:

Using Information and Communication Technologies (ICT) notions in education and in the three basic processes of education (teaching, learning and assessment) can bring benefits to pupils and to the professional development of teachers. In this matter, we refer to these notions as concepts taken from the informatics area and apply them to the domain of education. These notions refer to genetic algorithms and arborescent structures, used in the specific process of assessment or evaluation. This paper uses these kinds of notions to generate subtrees from a main tree of tests related to each other by their degree of difficulty. These subtrees must contain the highest number of connections between the nodes and the lowest number of missing edges (which are subtrees of the main tree) and, in the particular case of the non-existence of a subtree with no missing edges, the subtrees which have the lowest (minimal) number of missing edges between the nodes, where a node is a test and an edge is a direct connection between two tests which differ by one degree of difficulty. The subtrees are represented as sequences. The tests are the same (a number coding a test represents that test in every sequence) and they are reused for each sequence of tests.

Keywords: chromosome, genetic algorithm, subtree, test

2392 Orbit Determination from Two Position Vectors Using Finite Difference Method

Authors: Akhilesh Kumar, Sathyanarayan G., Nirmala S.

Abstract:

An unusual approach is developed to determine the orbit of satellites/space objects. The determination of orbits is treated as a boundary value problem and is solved using the finite difference method (FDM). Only the positions of the satellites/space objects at two end times are known, and these are taken as boundary conditions. The finite difference technique is used to calculate the orbit between the end times. In this approach, the governing equation is defined as the satellite's equation of motion with a perturbed acceleration. Using the finite difference method, the governing equations and boundary conditions are discretized. The resulting system of algebraic equations is solved using the Tri Diagonal Matrix Algorithm (TDMA) until convergence is achieved. The methodology has been tested and evaluated using all GPS satellite orbits from the National Geospatial-Intelligence Agency (NGA) precise product for DOY 125, 2023. For this purpose, twelve sets of two hours each have been taken into consideration. Only the positions at the end times of each of the twelve sets are considered as boundary conditions. The algorithm is applied to all GPS satellites. The results achieved using FDM are compared with the NGA precise orbits. The maximum RSS error is 0.48 [m] for position and 0.43 [mm/sec] for velocity. The present algorithm is also applied to the IRNSS satellites for DOY 220, 2023. The maximum RSS error is 0.49 [m] for position and 0.28 [mm/sec] for velocity. Next, a simulation has been done for a Highly Elliptical orbit for DOY 63, 2023, for a duration of 6 hours. The RSS of the difference is 0.92 [m] in position and 1.58 [mm/sec] in velocity for orbital speeds of more than 5 km/sec, whereas it is 0.13 [m] in position and 0.12 [mm/sec] in velocity for orbital speeds of less than 5 km/sec. The results show that the newly created method is reliable and accurate.
Further applications of the developed methodology include missile and spacecraft targeting, orbit design (mission planning), space rendezvous and interception, space debris correlation, and navigation solutions.
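The discretize-then-solve step described in this abstract can be illustrated on a much simpler problem. The sketch below is not the authors' orbit solver: it assumes a one-dimensional, linear boundary value problem y''(t) = a(t) with known end values, discretizes it with central differences, and solves the resulting tridiagonal system with the Thomas algorithm (TDMA). The function names `thomas_solve` and `solve_bvp` are hypothetical. A perturbed orbit problem is nonlinear and three-dimensional, so it would need repeated solves until convergence, as the abstract states.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system with the Thomas algorithm (TDMA).

    lower[i] multiplies x[i-1] (lower[0] is unused) and
    upper[i] multiplies x[i+1] (upper[-1] is unused).
    """
    n = len(diag)
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    # Forward sweep: eliminate the sub-diagonal.
    c_prime[0] = upper[0] / diag[0]
    d_prime[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c_prime[i - 1]
        c_prime[i] = upper[i] / denom
        d_prime[i] = (rhs[i] - lower[i] * d_prime[i - 1]) / denom
    # Back substitution.
    x = [0.0] * n
    x[-1] = d_prime[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d_prime[i] - c_prime[i] * x[i + 1]
    return x


def solve_bvp(accel, t0, t1, y0, y1, n_interior):
    """Solve y''(t) = accel(t), y(t0) = y0, y(t1) = y1 by central differences."""
    h = (t1 - t0) / (n_interior + 1)
    times = [t0 + (i + 1) * h for i in range(n_interior)]
    # Each interior node gives: y[i-1] - 2*y[i] + y[i+1] = h^2 * accel(t[i]).
    lower = [1.0] * n_interior
    diag = [-2.0] * n_interior
    upper = [1.0] * n_interior
    rhs = [h * h * accel(t) for t in times]
    # Move the known boundary values to the right-hand side.
    rhs[0] -= y0
    rhs[-1] -= y1
    return times, thomas_solve(lower, diag, upper, rhs)


if __name__ == "__main__":
    # Constant acceleration of -2 with zero end values: exact solution t * (1 - t).
    times, values = solve_bvp(lambda t: -2.0, 0.0, 1.0, 0.0, 0.0, 3)
    for t, y in zip(times, values):
        print(t, y)
```

The Thomas algorithm is used here because the central-difference stencil only couples each node to its two neighbours, so the system can be solved in time linear in the number of nodes.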

Keywords: finite difference method, grid generation, NavIC system, orbit perturbation

2391 Cooperative Sensing for Wireless Sensor Networks

Authors: Julien Romieux, Fabio Verdicchio

Abstract:

Wireless Sensor Networks (WSNs), which sense environmental data with battery-powered nodes, require multi-hop communication. This power-demanding task adds an extra workload that is unfairly distributed across the network. As a result, nodes run out of battery at different times: this requires an impractical individual node maintenance scheme. Therefore we investigate a new Cooperative Sensing approach that extends the WSN operational life and allows a more practical network maintenance scheme (where all nodes deplete their batteries almost at the same time). We propose a novel cooperative algorithm that derives a piecewise representation of the sensed signal while controlling approximation accuracy. Simulations show that our algorithm increases WSN operational life and spreads communication workload evenly. Results convey a counterintuitive conclusion: distributing workload fairly amongst nodes may not decrease the network power consumption and yet extend the WSN operational life. This is achieved as our cooperative approach decreases the workload of the most burdened cluster in the network.

Keywords: cooperative signal processing, signal representation and approximation, power management, wireless sensor networks

2390 Path Planning for Multiple Unmanned Aerial Vehicles Based on Adaptive Probabilistic Sampling Algorithm

Authors: Long Cheng, Tong He, Iraj Mantegh, Wen-Fang Xie

Abstract:

Path planning is essential for unmanned aerial vehicles (UAVs) with autonomous navigation in unknown environments. In this paper, an adaptive probabilistic sampling algorithm is proposed for GPS-denied environments, which can be utilized for the autonomous navigation system of multiple UAVs in a dynamically changing structured environment. This method can be used for Unmanned Aircraft Systems Traffic Management (UTM) solutions and in autonomous urban aerial mobility, where a number of platforms are expected to share the airspace. A path network is initially built offline based on the available environment map, and on-board sensor systems on the flying UAVs are used for continuous situational awareness and to inform changes in the path network. Simulation results based on MATLAB and Gazebo in different scenarios, together with measurements of the algorithm's performance, show the high efficiency and accuracy of the proposed technique in unknown environments.

Keywords: path planning, adaptive probabilistic sampling, obstacle avoidance, multiple unmanned aerial vehicles, unknown environments

2389 Gray Level Image Encryption

Authors: Roza Afarin, Saeed Mozaffari

Abstract:

The aim of this paper is image encryption using a Genetic Algorithm (GA). The proposed encryption method consists of two phases. In the modification phase, pixel locations are altered to reduce correlation among adjacent pixels. Then, pixel values are changed in the diffusion phase to encrypt the input image. Both phases are performed by a GA with binary chromosomes. For the modification phase, these binary patterns are generated by the Local Binary Pattern (LBP) operator, while for the diffusion phase the binary chromosomes are obtained by Bit Plane Slicing (BPS). The initial population in the GA includes rows and columns of the input image. Instead of subjective selection of parents from this initial population, a random generator with a predefined key is utilized. This key is necessary to decrypt the coded image and reconstruct the initial input image. The fitness function is defined as the average number of transitions from 0 to 1 in the LBP image and as histogram uniformity in the modification and diffusion phases, respectively. Randomness of the encrypted image is measured by entropy, correlation coefficients and histogram analysis. Experimental results show that the proposed method is fast enough and can be used effectively for image encryption.

Keywords: correlation coefficients, genetic algorithm, image encryption, image entropy

2388 SEM Image Classification Using CNN Architectures

Authors: Güzin Tirkeş, Özge Tekin, Kerem Kurtuluş, Y. Yekta Yurtseven, Murat Baran

Abstract:

A scanning electron microscope (SEM) is a type of electron microscope mainly used in nanoscience and nanotechnology. Automatic image recognition and classification are among the general areas of application concerning SEM. In line with these usages, the present paper proposes a deep learning algorithm that classifies SEM images into nine categories by means of an online application to simplify the process. The NFFA-EUROPE - 100% SEM data set, containing approximately 21,000 images, was used to train and test the algorithm with an 80% / 20% split, respectively. Validation was carried out using a separate data set obtained from the Middle East Technical University (METU) in Turkey. To increase the accuracy of the results, the Inception ResNet-V2 model was used with a fine-tuning approach. By using a confusion matrix, it was observed that the coated-surface category has a negative effect on the accuracy of the results since it contains other categories in the data set, thereby confusing the model when detecting category-specific patterns. For this reason, the coated-surface category was removed from the training data set, increasing accuracy to as much as 96.5%.

Keywords: convolutional neural networks, deep learning, image classification, scanning electron microscope

2387 Extracting Actions with Improved Part of Speech Tagging for Social Networking Texts

Authors: Yassine Jamoussi, Ameni Youssfi, Henda Ben Ghezala

Abstract:

With the growing interest in social networking, the interaction of social actors has evolved into a source of knowledge on which it becomes possible to perform context-aware reasoning. Information extraction from social networks, especially Twitter and Facebook, is one of the problems in this area. To extract text from social networks, we need several lexical features and large-scale word clustering. We attempt to expand an existing tokenizer and to develop our own tagger in order to support the incorrect words currently found on Facebook and Twitter. Our goal in this work is to benefit from the lexical features developed for Twitter and online conversational text in previous works, and to develop an extraction model for constructing a huge knowledge base built on actions.

Keywords: social networking, information extraction, part-of-speech tagging, natural language processing

2386 Authentication Based on Hand Movement by Low Dimensional Space Representation

Authors: Reut Lanyado, David Mendlovic

Abstract:

Most biological methods for authentication require special equipment, and some of them are easy to fake. We proposed a method for authentication based on hand movement while typing a sentence, recorded with a regular camera. This technique uses the full video of the hand, which is harder to fake. In the first phase, we tracked the hand joints in each frame. Next, we represented a single frame for each individual using our Pose Agnostic Rotation and Movement (PARM) dimensional space. Then, we represented a full video of hand movement in a fixed low dimensional space using this method: Fixed Dimension Video by Interpolation Statistics (FDVIS). Finally, we identified each individual in the FDVIS representation using unsupervised clustering and supervised methods. Accuracy exceeds 96% for 80 individuals when using supervised KNN.

Keywords: authentication, feature extraction, hand recognition, security, signal processing

2385 Quality of Service Based Routing Algorithm for Real Time Applications in MANETs Using Ant Colony and Fuzzy Logic

Authors: Farahnaz Karami

Abstract:

Routing is an important, challenging task in mobile ad hoc networks due to node mobility, lack of central control, unstable links, and limited resources. Ant colony optimization has been found to be an attractive technique for routing in Mobile Ad Hoc Networks (MANETs). However, existing swarm intelligence based routing protocols find an optimal path by considering only one or two route selection metrics without considering correlations among such parameters, making them unsuitable on their own for routing real-time applications. Fuzzy logic combines multiple route selection parameters containing uncertain information or imprecise data, but does not naturally have the multipath routing property needed to provide load balancing. The objective of this paper is to design a routing algorithm using fuzzy logic and ant colony that can solve some of the routing problems in mobile ad hoc networks, such as optimizing node energy consumption to increase network lifetime, reducing the link failure rate to increase packet delivery reliability, and providing load balancing to optimize available bandwidth. In the proposed algorithm, the path information is given to the fuzzy inference system by ants. Based on the available path information and considering the parameters required for quality of service (QoS), the fuzzy cost of each path is calculated and the optimal paths are selected. NS2.35 simulation tools are used for simulation, and the results are compared and evaluated against the newest QoS based algorithms in MANETs according to packet delivery ratio, end-to-end delay and routing overhead ratio criteria. The simulation results show significant improvement in the performance of these networks in terms of decreasing end-to-end delay and routing overhead ratio, and also increasing packet delivery ratio.

Keywords: mobile ad hoc networks, routing, quality of service, ant colony, fuzzy logic

2384 A Monolithic Arbitrary Lagrangian-Eulerian Finite Element Strategy for Partly Submerged Solid in Incompressible Fluid with Mortar Method for Modeling the Contact Surface

Authors: Suman Dutta, Manish Agrawal, C. S. Jog

Abstract:

Accurate computation of hydrodynamic forces on floating structures and their deformation finds application in ocean and naval engineering and wave energy harvesting. This manuscript presents a monolithic, finite element strategy for fluid-structure interaction involving hyper-elastic solids partly submerged in an incompressible fluid. A velocity-based Arbitrary Lagrangian-Eulerian (ALE) formulation has been used for the fluid and a displacement-based Lagrangian approach has been used for the solid. The flexibility of the ALE technique permits us to treat the free surface of the fluid as a Lagrangian entity. At the interface, the continuity of displacement, velocity and traction is enforced using the mortar method. In the mortar method, the constraints are enforced in a weak sense using the Lagrange multiplier method. In the literature, the mortar method has been shown to be robust in solving various contact mechanics problems. The time-stepping strategy used in this work reduces to the generalized trapezoidal rule in the Eulerian setting. In the Lagrangian limit, in the absence of external load, the algorithm conserves the linear and angular momentum and the total energy of the system. The use of monolithic coupling with an energy-conserving time-stepping strategy gives an unconditionally stable algorithm and allows the user to take large time steps. All the governing equations and boundary conditions have been mapped to the reference configuration. The use of the exact tangent stiffness matrix ensures that the algorithm converges quadratically within each time step. The robustness and good performance of the proposed method are demonstrated by solving benchmark problems from the literature.

Keywords: ALE, floating body, fluid-structure interaction, monolithic, mortar method

2383 Integrating Natural Language Processing (NLP) and Machine Learning in Lung Cancer Diagnosis

Authors: Mehrnaz Mostafavi

Abstract:

The assessment and categorization of incidental lung nodules present a considerable challenge in healthcare, often necessitating resource-intensive multiple computed tomography (CT) scans for growth confirmation. This research addresses this issue by introducing a distinct computational approach leveraging radiomics and deep-learning methods. However, understanding local services is essential before implementing these advancements. With diverse tracking methods in place, there is a need for efficient and accurate identification approaches, especially in the context of managing lung nodules alongside pre-existing cancer scenarios. This study explores the integration of text-based algorithms in medical data curation, indicating their efficacy in conjunction with machine learning and deep-learning models for identifying lung nodules. Combining medical images with text data has demonstrated superior data retrieval compared to using each modality independently. While deep learning and text analysis show potential in detecting previously missed nodules, challenges persist, such as increased false positives. The presented research introduces a Structured-Query-Language (SQL) algorithm designed for identifying pulmonary nodules in a tertiary cancer center, externally validated at another hospital. Leveraging natural language processing (NLP) and machine learning, the algorithm categorizes lung nodule reports based on sentence features, aiming to facilitate research and assess clinical pathways. The hypothesis posits that the algorithm can accurately identify lung nodule CT scans and predict concerning nodule features using machine-learning classifiers. Through a retrospective observational study spanning a decade, CT scan reports were collected, and an algorithm was developed to extract and classify data. Results underscore the complexity of lung nodule cohorts in cancer centers, emphasizing the importance of careful evaluation before assuming a metastatic origin. 
The SQL and NLP algorithms demonstrated high accuracy in identifying lung nodule sentences, indicating potential for local service evaluation and research dataset creation. Machine-learning models exhibited strong accuracy in predicting concerning changes in lung nodule scan reports. While limitations include variability in disease group attribution, the potential for correlation rather than causality in clinical findings, and the need for further external validation, the algorithm's accuracy and potential to support clinical decision-making and healthcare automation represent a significant stride in lung nodule management and research.

Keywords: lung cancer diagnosis, structured-query-language (SQL), natural language processing (NLP), machine learning, CT scans

2382 Capacitated Multiple Allocation P-Hub Median Problem on a Cluster Based Network under Congestion

Authors: Çağrı Özgün Kibiroğlu, Zeynep Turgut

Abstract:

This paper considers a hub location problem where the network service area is partitioned into predetermined zones (represented by given node clusters) and the capacity levels of potential hub nodes are determined a priori as a hub selection criterion, in order to investigate the effect of congestion on the network. The objective is to design the hub network by determining all required hub locations in the node clusters and allocating non-hub nodes to hubs such that the total cost, including transportation cost, opening cost of hubs and penalty cost for exceeding the capacity level at hubs, is minimized. A mixed integer linear programming model is developed by introducing additional constraints to the traditional model of the capacitated multiple allocation hub location problem, and is empirically tested.

Keywords: hub location problem, p-hub median problem, clustering, congestion

2381 A Fuzzy Multiobjective Model for Bed Allocation Optimized by Artificial Bee Colony Algorithm

Authors: Jalal Abdulkareem Sultan, Abdulhakeem Luqman Hasan

Abstract:

With the development of competition among health care systems, hospitals face more and more pressure. Meanwhile, resource allocation has a vital effect on achieving competitive advantages in hospitals. Selecting the appropriate number of beds is one of the most important tasks in hospital management. However, in real situations, bed allocation is a complex multiple objective problem involving different items, with vagueness and randomness in the data. Hence, research on the bed allocation problem that considers multiple departments, nursing hours, and stochastic information about the arrival and service of patients is relatively scarce. In this paper, we develop a fuzzy multiobjective bed allocation model to handle uncertainty and multiple departments. Fuzzy objectives and weights are simultaneously applied to help managers select suitable numbers of beds for different departments. The proposed model is solved using Artificial Bee Colony (ABC), which is a very effective algorithm. The paper describes an application of the model to a public hospital in Iraq. The results indicate that the fuzzy multi-objective model provides a suitable framework for bed allocation and optimum use.

Keywords: bed allocation problem, fuzzy logic, artificial bee colony, multi-objective optimization

2380 Design and Field Programmable Gate Array Implementation of Radio Frequency Identification for Boosting up Tag Data Processing

Authors: G. Rajeshwari, V. D. M. Jabez Daniel

Abstract:

Radio Frequency Identification (RFID) systems are used for automated identification in various applications such as automobiles, health care and security. RFID is also called an automated data collection technology. RFID readers are placed in an area to scan a large number of tags and cover a wide distance. The placement of the RFID elements may result in several types of collisions. A major challenge in RFID systems is collision avoidance. In previous works, collisions were avoided by using algorithms such as ALOHA and the tree algorithm. This work proposes collision reduction and increased throughput through a reading enhancement method combined with the tree algorithm. The reading enhancement is done by improving the interrogation procedure and increasing the data handling capacity of the RFID reader with parallel processing. The work is simulated in Verilog using Xilinx ISE 14.5. By implementing this in the RFID system, we are able to achieve high throughput and avoid collisions at the reader at the same instant of time. The overall system efficiency is thereby increased.

Keywords: antenna, anti-collision protocols, data management system, reader, reading enhancement, tag

2379 Optimisation of Intermodal Transport Chain of Supermarkets on Isle of Wight, UK

Authors: Jingya Liu, Yue Wu, Jiabin Luo

Abstract:

This work investigates an intermodal transportation system for delivering goods from a Regional Distribution Centre to supermarkets on the Isle of Wight (IOW) via the port of Southampton or Portsmouth in the UK. We consider this integrated logistics chain as a 3-echelon transportation system. In such a system, there are two types of transport methods used to deliver goods across the Solent Channel: one is accompanied transport, which is used by most supermarkets on the IOW, such as Spar, Lidl and Co-operative food; the other is unaccompanied transport, which is used by Aldi. Five transport scenarios are studied based on different transport modes and ferry routes. The aim is to determine an optimal delivery plan for supermarkets of different business scales on the IOW, in order to minimise the total running cost, fuel consumption and carbon emissions. The problem is modelled as a vehicle routing problem with time windows and solved by a genetic algorithm. The computational results suggest that accompanied transport is more cost efficient for small and medium business-scale supermarket chains on the IOW, while unaccompanied transport has the potential to improve the efficiency and effectiveness of large business-scale supermarket chains.

Keywords: genetic algorithm, intermodal transport system, Isle of Wight, optimization, supermarket

2378 Diagnose of the Future of Family Businesses Based on the Study of Spanish Family Businesses Founders

Authors: Fernando Doral

Abstract:

Family businesses are a key phenomenon within the business landscape. Nevertheless, the concept involves two terms (“family” and “business”) which are nowadays rapidly evolving. Consequently, it isn't easy to diagnose whether family business will be a growing or decreasing phenomenon, which is the objective of this study. For that purpose, a sample of 50 Spanish-established companies from various sectors was taken. Different factors were identified for each enterprise, related to the profile of the founders, such as age, the number of sons and daughters, or the support received from the family when starting the business. That information was taken as an input for a clustering method to identify groups, which could help define the founders' profiles. That characterization served as a basis to identify three factors whose evolution should be analyzed: family structures, business landscape and entrepreneurs' motivations. The analysis of the evolution of these three factors seems to indicate a negative tendency for family businesses. Therefore the consequent diagnosis of this study is to consider family businesses a declining phenomenon.

Keywords: business diagnose, business trends, family business, family business founders

2377 Data Mining Techniques for Anti-Money Laundering

Authors: M. Sai Veerendra

Abstract:

Today, money laundering (ML) poses a serious threat not only to financial institutions but also to the nation. This criminal activity is becoming more and more sophisticated and seems to have moved from the cliché of drug trafficking to financing terrorism and surely not forgetting personal gain. Most of the financial institutions internationally have been implementing anti-money laundering solutions (AML) to fight investment fraud activities. However, traditional investigative techniques consume numerous man-hours. Recently, data mining approaches have been developed and are considered as well-suited techniques for detecting ML activities. Within the scope of a collaboration project on developing a new data mining solution for AML Units in an international investment bank in Ireland, we survey recent data mining approaches for AML. In this paper, we present not only these approaches but also give an overview on the important factors in building data mining solutions for AML activities.

Keywords: data mining, clustering, money laundering, anti-money laundering solutions

Procedia PDF Downloads 524
2376 2D Hexagonal Cellular Automata: The Complexity of Forms

Authors: Vural Erdogan

Abstract:

We created two-dimensional hexagonal cellular automata to obtain complexity from simple rules, as in Conway’s Game of Life. Considering the Game of Life rules, Wolfram's work on life-like structures, and John von Neumann's self-replication, self-maintenance and self-reproduction problems, we developed 2-state and 3-state hexagonal growing algorithms that reach large populations from random initial states. Unlike the Game of Life, we used cellular automata with six-cell neighbourhoods instead of eight- or four-cell neighbourhoods. The first simulations examined whether we are able to obtain oscillators, blinkers, and gliders. Inspired by Wolfram's 1D cellular automata complexity and life-like structures, we simulated 2D synchronous, discrete, deterministic cellular automata to reach life-like forms with 2-state cells. We explain how the life-like formations and the oscillators contribute to initiating self-maintenance together with self-reproduction and self-replication. After comparing simulation results, we decided to develop the algorithm a step further. Appending a new state to the same algorithm that we used for reaching life-like structures led us to experiment with new branching and fractal forms. All these studies aim to demonstrate that complex life forms might come from uncomplicated rules.
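
As a rough illustration of a synchronous, deterministic, 2-state automaton with a six-cell neighbourhood, the following Python sketch stores live cells as axial (q, r) coordinates. The birth and survival counts are hypothetical placeholders chosen for the example; the abstract does not state the rules actually used.

```python
from collections import Counter

# Six neighbour offsets in axial (q, r) coordinates on a hexagonal grid.
HEX_NEIGHBOURS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]


def step(live, birth=frozenset({2}), survive=frozenset({3, 4})):
    """Advance a 2-state hexagonal automaton by one synchronous step.

    `live` is a set of (q, r) cells. `birth` and `survive` are assumed,
    illustrative neighbour counts, not the rules from the paper.
    """
    # Count, for every cell adjacent to a live cell, how many live neighbours it has.
    counts = Counter()
    for q, r in live:
        for dq, dr in HEX_NEIGHBOURS:
            counts[(q + dq, r + dr)] += 1
    # A live cell with no live neighbours never appears in `counts` and so dies.
    return {
        cell
        for cell, n in counts.items()
        if (cell in live and n in survive) or (cell not in live and n in birth)
    }


pair = {(0, 0), (1, 0)}
print(step(pair))        # the two cells adjacent to both members of the pair
print(step(step(pair)))  # back to the original pair under these assumed rules
```

Under these assumed rules, two adjacent cells form a period-2 oscillator, which is the kind of structure the first simulations looked for.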

Keywords: hexagonal cellular automata, self-replication, self-reproduction, self-maintenance

Procedia PDF Downloads 136
2375 Framework for Detecting External Plagiarism from Monolingual Documents: Use of Shallow NLP and N-Gram Frequency Comparison

Authors: Saugata Bose, Ritambhra Korpal

Abstract:

The internet has increased copy-paste behaviour among students as well as researchers, leading to documents with different levels of plagiarism. For this reason, much research is focused on detecting plagiarism automatically. In this paper, an initiative is discussed in which Natural Language Processing (NLP) techniques and supervised machine learning algorithms are combined to detect plagiarized texts. The major emphasis is on constructing a framework that successfully detects external plagiarism in monolingual texts. To detect plagiarism, an n-gram frequency comparison approach has been implemented to construct the model framework. The framework is based on 120 characteristics extracted while pre-processing the documents using an NLP approach. Afterwards, filter metrics are applied to select the most relevant characteristics, and a supervised classification learning algorithm is then used to classify the documents into four levels of plagiarism. A confusion matrix was built to estimate the false positives and false negatives. Our plagiarism framework achieved a very high accuracy score.
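
A word n-gram frequency comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions: whitespace tokenization, trigrams by default, and cosine similarity as the comparison measure are all choices made for this example, since the abstract does not specify them, and the function names are hypothetical.

```python
import math
from collections import Counter


def word_ngrams(text, n=3):
    """Return a frequency table of word n-grams (lower-cased, whitespace-split)."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram frequency tables, in [0, 1]."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # at least one text is shorter than n words
    return dot / (norm_a * norm_b)


source = "the quick brown fox jumps over the lazy dog"
suspect = "yesterday the quick brown fox jumps over a fence"
score = cosine_similarity(word_ngrams(source), word_ngrams(suspect))
print(round(score, 3))
```

A score like this would be one candidate feature among many; the framework described above feeds its selected characteristics to a supervised classifier rather than thresholding a single similarity value.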

Keywords: lexical matching, shallow NLP, supervised machine learning algorithm, word n-gram

Procedia PDF Downloads 341
2374 Cost Sensitive Feature Selection in Decision-Theoretic Rough Set Models for Customer Churn Prediction: The Case of Telecommunication Sector Customers

Authors: Emel Kızılkaya Aydogan, Mihrimah Ozmen, Yılmaz Delice

Abstract:

The telecommunications sector in the global market is undergoing continuous change and development. In this sector, churn analysis techniques are commonly used to analyse why some customers terminate their service subscriptions prematurely. Customer churn is highly significant in this sector since it causes substantial business loss. Many companies conduct research in order to prevent losses while increasing customer loyalty. Although a large quantity of accumulated data is available in this sector, its usefulness is limited by data quality and relevance. In this paper, a cost-sensitive feature selection framework is developed with the aim of obtaining feature reducts to predict customer churn. The framework is a cost-based optional pre-processing stage that removes redundant features for churn management. In addition, this cost-based feature selection algorithm is applied in a telecommunication company in Turkey, and the results obtained with this algorithm are presented.

Keywords: churn prediction, data mining, decision-theoretic rough set, feature selection

Procedia PDF Downloads 428
2373 The Data Quality Model for the IoT based Real-time Water Quality Monitoring Sensors

Authors: Rabbia Idrees, Ananda Maiti, Saurabh Garg, Muhammad Bilal Amin

Abstract:

IoT devices are the basic building blocks of an IoT network; they generate an enormous volume of real-time, high-speed data that helps organizations and companies make intelligent decisions. Integrating this data from multiple sources and transferring it to the appropriate client is fundamental to IoT development. Handling this huge number of devices, along with the huge volume of data, is very challenging. IoT devices are battery-powered and resource-constrained, and to provide energy-efficient communication they go to sleep or wake up periodically and aperiodically, depending on traffic loads, to reduce energy consumption. Sometimes these devices get disconnected due to battery depletion. If a node is not available in the network, the IoT network provides incomplete, missing, and inaccurate data. Moreover, many IoT applications, such as vehicle tracking and patient tracking, require the IoT devices to be mobile. Due to this mobility, if the distance of the device from the sink node becomes greater than required, the connection is lost. Following such disconnections, other devices join the network to replace the devices that have broken down or left. This makes IoT devices dynamic in nature, which brings uncertainty and unreliability to the IoT network and hence produces poor-quality data. Because of this dynamic nature, the actual reason for abnormal data is not known. If data are of poor quality, decisions are likely to be unsound. It is therefore highly important to process data and estimate data quality before using it in IoT applications. In the past, many researchers tried to estimate data quality and provided several machine learning (ML), stochastic and statistical methods to analyse stored data in the data processing layer, without focusing on the challenges and issues that arise from the dynamic nature of IoT devices and how it impacts data quality.
This research provides a comprehensive review of the impact of the dynamic nature of IoT devices on data quality and presents a data quality model that can deal with this challenge and produce good-quality data. The data quality model is presented for sensors monitoring water quality. DBSCAN clustering and weather sensors are used to build the model. An extensive study was carried out on the relationship between the data of weather sensors and of sensors monitoring the water quality of lakes and beaches. A detailed theoretical analysis is presented of the correlation between the independent data streams of the two sets of sensors. With the help of this analysis and DBSCAN, a data quality model is prepared. This model encompasses five dimensions of data quality: outlier detection and removal, completeness, patterns of missing values, and accuracy, which is checked with the help of the cluster's position. Finally, statistical analysis is performed on the clusters formed by DBSCAN, and consistency is evaluated through the Coefficient of Variation (CoV).
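
To make the last two steps concrete, here is a minimal, dependency-free Python sketch of DBSCAN on one-dimensional sensor readings, followed by a Coefficient of Variation computed on a resulting cluster. The readings, `eps`, and `min_pts` are invented for illustration, the sample standard deviation is an assumed choice for the CoV, and none of this is taken from the paper's actual model.

```python
import statistics


def dbscan_1d(values, eps, min_pts):
    """Label each reading with a cluster id (0, 1, ...) or -1 for noise."""
    n = len(values)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # All readings within eps of reading i (including i itself).
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


def coefficient_of_variation(xs):
    """Sample standard deviation divided by the mean (assumes a non-zero mean)."""
    return statistics.stdev(xs) / statistics.mean(xs)


# Hypothetical pH readings with one obvious outlier.
readings = [7.0, 7.1, 7.2, 7.1, 12.5, 7.0]
labels = dbscan_1d(readings, eps=0.3, min_pts=3)
cluster0 = [v for v, lab in zip(readings, labels) if lab == 0]
print(labels)
print(coefficient_of_variation(cluster0))
```

Readings labelled -1 are the outlier candidates, and a low CoV within a cluster would be read as consistent data.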

Keywords: clustering, data quality, DBSCAN, and Internet of things (IoT)

Procedia PDF Downloads 121
2372 Application of Random Forest Model in The Prediction of River Water Quality

Authors: Turuganti Venkateswarlu, Jagadeesh Anmala

Abstract:

Excessive runoff from various non-point-source land uses and from point sources is rapidly degrading the water quality of streams in the Upper Green River watershed, Kentucky, USA. It is essential to maintain the stream water quality, as the river basin is one of the major freshwater sources in this province. It is also important to understand the water quality parameters (WQPs) quantitatively and qualitatively, along with their important features, as stream water is sensitive to climatic events and land-use practices. In this paper, a model was developed for predicting one of the significant WQPs, Fecal Coliform (FC), from precipitation, temperature, urban land use factor (ULUF), agricultural land use factor (ALUF), and forest land use factor (FLUF) using the Random Forest (RF) algorithm. The RF model, a novel ensemble learning algorithm, can also derive feature importance characteristics from the given model inputs for different combinations. The model's outcomes showed a good correlation between FC and climate events and land use factors (R² = 0.94), and indicated that precipitation and temperature are the primary influencing factors for FC.

Keywords: water quality, land use factors, random forest, fecal coliform

Procedia PDF Downloads 181
2371 Training of Future Computer Science Teachers Based on Machine Learning Methods

Authors: Meruert Serik, Nassipzhan Duisegaliyeva, Danara Tleumagambetova

Abstract:

The article highlights and describes the characteristic features of real-time face detection in images and videos using machine learning algorithms. The research work was reviewed by students of the educational programs "6B01511-Computer Science", "7M01511-Computer Science", "7M01525-STEM Education," and "8D01511-Computer Science" of Eurasian National University named after L.N. Gumilyov. As a result, the advantages and disadvantages of the Haar Cascade (Haar Cascade OpenCV), HoG SVM (Histogram of Oriented Gradients, Support Vector Machine), and MMOD CNN Dlib (Max-Margin Object Detection, convolutional neural network) detectors used for face detection were determined. Dlib is a general-purpose cross-platform software library written in C++. It includes detectors used for face detection. The Cascade OpenCV algorithm is efficient for fast face detection. The considered work forms the basis for the development of machine learning methods by future computer science teachers.

Keywords: algorithm, artificial intelligence, education, machine learning

Procedia PDF Downloads 59
2370 Sinusoidal Roughness Elements in a Square Cavity

Authors: Muhammad Yousaf, Shoaib Usman

Abstract:

Numerical studies were conducted using the Lattice Boltzmann Method (LBM) to study natural convection in a square cavity in the presence of roughness. An algorithm based on a single relaxation time Bhatnagar-Gross-Krook (BGK) model of the LBM was developed. Roughness was introduced on both the hot and cold walls in the form of sinusoidal roughness elements. The study was conducted for a Newtonian fluid of Prandtl number (Pr) 1.0. The Rayleigh number (Ra) range from 10^3 to 10^6, in the laminar regime, was explored. The thermal and hydrodynamic behavior of the fluid was analyzed using a differentially heated square cavity with roughness elements present on both the hot and cold walls. Neumann boundary conditions were imposed on the horizontal walls, with the vertical walls isothermal. The roughness elements were at the same boundary condition as the corresponding walls. The computational algorithm was validated against previous benchmark studies performed with different numerical methods, and good agreement was found. Results indicate that the maximum reduction in the average heat transfer was 16.66 percent at Ra = 10^5.

Keywords: Lattice Boltzmann method, natural convection, Nusselt number, Rayleigh number, roughness

Procedia PDF Downloads 515
2369 Performance Prediction Methodology of Slow Aging Assets

Authors: M. Ben Slimene, M.-S. Ouali

Abstract:

Asset management of urban infrastructures faces a multitude of challenges that need to be overcome to obtain a reliable measurement of performances. Predicting the performance of slowly aging systems is one of those challenges, which helps the asset manager to investigate specific failure modes and to undertake the appropriate maintenance and rehabilitation interventions to avoid catastrophic failures as well as to optimize the maintenance costs. This article presents a methodology for modeling the deterioration of slowly degrading assets based on an operating history. It consists of extracting degradation profiles by grouping together assets that exhibit similar degradation sequences using an unsupervised classification technique derived from artificial intelligence. The obtained clusters are used to build the performance prediction models. This methodology is applied to a sample of a stormwater drainage culvert dataset.

Keywords: artificial intelligence, clustering, culvert, regression model, slow degradation

Procedia PDF Downloads 89
2368 Optimization of Traffic Agent Allocation for Minimizing Bus Rapid Transit Cost on Simplified Jakarta Network

Authors: Gloria Patricia Manurung

Abstract:

The Jakarta Bus Rapid Transit (BRT) system, which was established in 2009 to reduce private vehicle usage and ease rush-hour gridlock throughout the Greater Jakarta area, has failed to achieve its purpose. With the gradually increasing number of privately owned vehicles and road space reduced by BRT lane construction, private vehicle users intrude into the exclusive BRT lanes, creating local traffic along the BRT network. The cost of invaded BRT lanes becomes the same as that of the road network, making BRT, which is supposed to be the main public transportation in the city, unreliable. Efforts have been made to guard critical lanes against invasion by allocating traffic agents at several intersections, leading to improved congestion levels along those lanes. Given a fixed number of traffic agents, this study uses an analytical approach to find the best deployment strategy of traffic agents on a simplified Jakarta road network for minimizing the BRT link cost, which is expected to improve the time reliability of the BRT system. A user-equilibrium traffic assignment model is used to reproduce the origin-destination demand flow on the network, and the optimum solution can conventionally be obtained with a brute-force algorithm. This method's main constraint is that the traffic assignment simulation time escalates exponentially as the number of agents and the network size increase. Our proposed metaheuristic and heuristic algorithms show a linear increase in simulation time and yield a minimized BRT cost approaching that of the brute-force optimization. Further analysis of the overall network link cost should be performed to see the impact of traffic agent deployment on the network system.
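
The contrast between exhaustive search and a greedy heuristic for placing a fixed number of agents can be sketched as follows. This Python example is purely illustrative: the intersection names, the additive `toy_cost` function, and its savings values are hypothetical stand-ins for the BRT link cost that the study obtains from user-equilibrium traffic assignment.

```python
from itertools import combinations


def brute_force(sites, k, cost):
    """Evaluate every k-subset of sites and return the cheapest one."""
    return set(min(combinations(sites, k), key=lambda s: cost(frozenset(s))))


def greedy(sites, k, cost):
    """Add one agent at a time, always at the site that lowers cost the most."""
    chosen = set()
    for _ in range(k):
        best = min(
            (s for s in sites if s not in chosen),
            key=lambda s: cost(frozenset(chosen | {s})),
        )
        chosen.add(best)
    return chosen


# Hypothetical savings in BRT link cost when an agent guards each intersection.
SAVINGS = {"A": 5, "B": 3, "C": 8, "D": 1}


def toy_cost(guarded):
    """Assumed additive cost model; a real cost would come from traffic assignment."""
    return 100 - sum(SAVINGS[s] for s in guarded)


sites = list(SAVINGS)
print(brute_force(sites, 2, toy_cost))
print(greedy(sites, 2, toy_cost))
```

Brute force evaluates every combination of k sites, while the greedy loop needs roughly (number of sites) × k cost evaluations. With an additive cost like this toy one the two agree; with interacting link costs the greedy result is only an approximation.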

Keywords: traffic assignment, user equilibrium, greedy algorithm, optimization

Procedia PDF Downloads 215