Search results for: algorithms and data structure
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 31146

30936 A Comparison of Methods for Neural Network Aggregation

Authors: John Pomerat, Aviv Segev

Abstract:

Recently, deep learning has had many theoretical breakthroughs. For deep learning to be successful in industry, however, there need to be practical algorithms capable of handling the many real-world hiccups that prevent the immediate application of a learning algorithm. Although AI promises to revolutionize the healthcare industry, getting access to patient data in order to train learning algorithms has not been easy. One proposed solution to this is data-sharing. In this paper, we propose an alternative protocol, based on multi-party computation, to train deep learning models while maintaining both the privacy and security of training data. We examine three methods of training neural networks in this way: transfer learning, average ensemble learning, and series network learning. We compare these methods to the equivalent model obtained through data-sharing across two different experiments. Additionally, we address the security concerns of this protocol. While the motivating example is healthcare, our findings regarding multi-party computation of neural network training are purely theoretical and have use-cases outside the domain of healthcare.
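
As a rough illustration of the average ensemble idea named in this abstract, the sketch below averages the class-probability vectors of several separately trained models. The function name and the sample outputs are hypothetical; the abstract does not describe the authors' actual protocol at this level of detail.

```python
def average_ensemble(prob_vectors):
    """Average class-probability vectors produced by separately trained models."""
    if not prob_vectors:
        raise ValueError("need at least one model output")
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    return [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]


# Hypothetical outputs of three models, each trained on one party's private data.
party_outputs = [[0.7, 0.3], [0.6, 0.4], [0.5, 0.5]]
combined = average_ensemble(party_outputs)
# Index of the class with the highest averaged probability.
predicted_class = max(range(len(combined)), key=combined.__getitem__)
```

Only model outputs are combined here, so no party needs to see another party's training records.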

Keywords: neural network aggregation, multi-party computation, transfer learning, average ensemble learning

Procedia PDF Downloads 138
30935 General Architecture for Automation of Machine Learning Practices

Authors: U. Borasi, Amit Kr. Jain, Rakesh, Piyush Jain

Abstract:

Data collection, data preparation, model training, model evaluation, and deployment are all processes in a typical machine learning workflow. Training data needs to be gathered and organised. This often entails collecting a sizable dataset and cleaning it to remove or correct any inaccurate or missing information. Once acquired, the data must be pre-processed before it can be used in the machine learning model. This often entails actions like scaling or normalising the data, handling outliers, selecting appropriate features, reducing dimensionality, etc. The pre-processed data is then used to train a model with some machine learning algorithm. After the model has been trained, it needs to be assessed by determining metrics like accuracy, precision, and recall on a test dataset. Every time a new model is built, both data pre-processing and model training, two crucial processes in the machine learning (ML) workflow, must be carried out. Moreover, various machine learning algorithms can be employed for every single approach to data pre-processing, generating a large set of combinations to choose from. For example, for every method of handling missing values (dropping records, replacing with the mean, etc.), for every scaling technique, and for every combination of features selected, a different algorithm can be used. As a result, in order to get the optimum outcomes, these tasks are frequently repeated in different combinations. This paper suggests a simple architecture for organizing this large “combination set of pre-processing steps and algorithms” into an automated workflow which simplifies the task of carrying out all possibilities.
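
The "combination set" described in this abstract can be sketched as a Cartesian product of option pools. The pool contents and the function name below are placeholders chosen for illustration; they are not the paper's operator pool or scheduler.

```python
from itertools import product

# Hypothetical option pools; the names are placeholders for illustration only.
missing_value_methods = ["drop_records", "fill_mean"]
scalers = ["none", "standardize", "min_max"]
algorithms = ["logistic_regression", "random_forest"]


def build_configurations(*option_pools):
    """Return every combination of one choice per pool, in a stable order."""
    return list(product(*option_pools))


configurations = build_configurations(missing_value_methods, scalers, algorithms)
for config in configurations[:3]:
    print(config)
```

Each tuple is one candidate pipeline that an automated workflow could run and score, which is why the number of runs grows multiplicatively with every added option.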

Keywords: machine learning, automation, AUTOML, architecture, operator pool, configuration, scheduler

Procedia PDF Downloads 35
30934 Genetic Algorithms Based ACPS Safety

Authors: Emine Laarouchi, Daniela Cancila, Laurent Soulier, Hakima Chaouchi

Abstract:

Cyber-physical systems (CPS) such as drones have proved their efficiency in supporting emergency applications. For these particular applications, travel time and autonomous navigation algorithms are of paramount importance, especially when missions are performed in urban environments with high obstacle density. In this context, however, safety properties are not properly addressed. Our ambition is to optimize the system safety level under autonomous navigation systems while preserving the performance of the CPS. To this end, we introduce genetic algorithms into the autonomous navigation process of the drone to better infer its trajectory given the possible obstacles. We first model the desired safety requirements through a cost function and then seek to optimize it through genetic algorithms (GA). The main advantage of using GA is that different parameters can be considered together, for example, the battery level for navigation system selection. Our tests show that introducing GA into autonomous navigation systems minimizes the risk of safety loss. Finally, although our simulation has been tested for autonomous drones, our approach and results could be extended to other autonomous navigation systems such as autonomous cars, robots, etc.

Keywords: safety, unmanned aerial vehicles, CPS, ACPS, drones, path planning, genetic algorithms

Procedia PDF Downloads 164
30933 Adaptive Process Monitoring for Time-Varying Situations Using Statistical Learning Algorithms

Authors: Seulki Lee, Seoung Bum Kim

Abstract:

Statistical process control (SPC) is a practical and effective method for quality control. The most important and widely used technique in SPC is a control chart. The main goal of a control chart is to detect any assignable changes that affect the quality output. Most conventional control charts, such as Hotelling’s T2 charts, are commonly based on the assumption that the quality characteristics follow a multivariate normal distribution. However, in modern complicated manufacturing systems, appropriate control chart techniques that can efficiently handle nonnormal processes are required. To overcome the shortcomings of conventional control charts for nonnormal processes, several methods have been proposed to combine statistical learning algorithms and multivariate control charts. Statistical learning-based control charts, such as support vector data description (SVDD)-based charts and k-nearest neighbors-based charts, have shown improved performance in nonnormal situations compared to that of the T2 chart. Besides the nonnormal property, time-varying operations are also quite common in real manufacturing fields because of various factors such as product and set-point changes, seasonal variations, catalyst degradation, and sensor drifting. However, traditional control charts cannot accommodate future condition changes of the process because they are formulated based on the data recorded in the early stage of the process. In the present paper, we propose an SVDD algorithm-based control chart, which is capable of adaptively monitoring time-varying and nonnormal processes. We reformulated the SVDD algorithm into a time-adaptive SVDD algorithm by adding a weighting factor that reflects time-varying situations. Moreover, we defined the updating region for the efficient model-updating structure of the control chart. The proposed control chart simultaneously allows efficient model updates and timely detection of out-of-control signals.
The effectiveness and applicability of the proposed chart were demonstrated through experiments with the simulated data and the real data from the metal frame process in mobile device manufacturing.

Keywords: multivariate control chart, nonparametric method, support vector data description, time-varying process

Procedia PDF Downloads 278
30932 A Bayesian Multivariate Microeconometric Model for Estimation of Price Elasticity of Demand

Authors: Jefferson Hernandez, Juan Padilla

Abstract:

Estimation of price elasticity of demand is a valuable tool for the task of price setting. Given its relevance, it is an active field for microeconomic and statistical research. Price elasticity in the oil and gas industry, in particular for fuels sold in gas stations, has proven to be a challenging topic given the market and state restrictions and the underlying correlation structures between the types of fuels sold by the same gas station. This paper explores the Lotka-Volterra model for the problem of price elasticity estimation in the context of fuels; in addition, multivariate random effects are introduced with the purpose of dealing with errors, e.g., measurement or missing data errors. In order to model the underlying correlation structures, the Inverse-Wishart, Hierarchical Half-t and LKJ distributions are studied. Here, the Bayesian paradigm through Markov Chain Monte Carlo (MCMC) algorithms for model estimation is considered. Simulation studies covering a wide range of situations were performed in order to evaluate parameter recovery for the proposed models and algorithms. Results revealed that the proposed algorithms recovered all model parameters quite well. Also, a real data set analysis was performed in order to illustrate the proposed approach.

Keywords: price elasticity, volume, correlation structures, Bayesian models

Procedia PDF Downloads 140
30931 Fast and Efficient Algorithms for Evaluating Uniform and Nonuniform Lagrange and Newton Curves

Authors: Taweechai Nuntawisuttiwong, Natasha Dejdumrong

Abstract:

Newton-Lagrange interpolations are widely used in numerical analysis. However, their construction requires quadratic computational time. In computer aided geometric design (CAGD), some polynomial curves, namely Wang-Ball, DP and Dejdumrong curves, have linear time complexity algorithms. Thus, the computational time for Newton-Lagrange interpolations can be reduced by applying the algorithms of Wang-Ball, DP and Dejdumrong curves. In order to use the Wang-Ball, DP and Dejdumrong algorithms, it is first necessary to convert Newton-Lagrange polynomials into Wang-Ball, DP or Dejdumrong polynomials. In this work, algorithms for converting both uniform and non-uniform Newton-Lagrange polynomials into Wang-Ball, DP and Dejdumrong polynomials are investigated. Thus, the computational time for representing Newton-Lagrange polynomials can be reduced to linear complexity. In addition, CAGD curves can be used in other ways to modify Newton-Lagrange curves.
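
For context on the quadratic construction cost mentioned in this abstract, here is a minimal sketch of classical Newton interpolation: building the divided-difference coefficients takes O(n^2) operations, while evaluating the resulting Newton form takes O(n). This is only the textbook baseline; it does not implement the conversion to Wang-Ball, DP or Dejdumrong polynomials that the paper investigates, and the function names are illustrative.

```python
def divided_differences(xs, ys):
    """Return Newton coefficients f[x0], f[x0,x1], ... in O(n^2) time."""
    coeffs = list(ys)
    n = len(xs)
    for order in range(1, n):
        # Update from the end so lower-order entries are still available.
        for i in range(n - 1, order - 1, -1):
            coeffs[i] = (coeffs[i] - coeffs[i - 1]) / (xs[i] - xs[i - order])
    return coeffs


def newton_eval(xs, coeffs, x):
    """Evaluate the Newton form with a Horner-like scheme in O(n) time."""
    result = coeffs[-1]
    for i in range(len(coeffs) - 2, -1, -1):
        result = result * (x - xs[i]) + coeffs[i]
    return result


# Three samples of y = x^2; the interpolant reproduces the parabola.
nodes = [0.0, 1.0, 2.0]
values = [0.0, 1.0, 4.0]
newton_coeffs = divided_differences(nodes, values)
value_at_3 = newton_eval(nodes, newton_coeffs, 3.0)
```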

Keywords: Lagrange interpolation, linear complexity, monomial matrix, Newton interpolation

Procedia PDF Downloads 211
30930 Evaluate the Influence of Culture on the Choice of Capital Structure Management Companies

Authors: Sahar Jami, Iman Valizadeh

Abstract:

Purpose: The aim of this study was to evaluate the influence of culture on the choice of capital structure by the management of companies listed on the Tehran Stock Exchange. Methods: This was a retrospective, document-based study using after-the-event data, performed in 1394. Elimination sampling (screening) was used to select the sample, which comprised 123 companies. Results: The results showed that the culture variables and return on equity have a significant positive impact on capital structure (ROA, QTobins), while the financial leverage and firm size variables have a significant negative impact on capital structure (ROA, QTobins).

Keywords: culture management, capital structure, ROA, QTobins, variables of culture

Procedia PDF Downloads 444
30929 Image Segmentation Techniques: Review

Authors: Lindani Mbatha, Suvendi Rimer, Mpho Gololo

Abstract:

Image segmentation is the process of dividing an image into several sections, such as the background and the foreground. It is a critical technique in both image-processing tasks and computer vision. Most image segmentation algorithms have been developed for gray-scale images, and little research has been done and few algorithms have been developed for color images. Most image segmentation algorithms or techniques vary based on the input data and the application. Nearly all of the techniques are unsuitable for noisy environments. Most of the work that has been done uses the Markov Random Field (MRF), which involves considerable computation and is said to be robust to noise. In recent years, image segmentation has been used to tackle problems such as easy processing of an image, interpretation of the contents of an image, and easy analysis of an image. This article reviews and summarizes some of the image segmentation techniques and algorithms that have been developed in past years. The techniques include neural networks (CNN), edge-based techniques, region growing, clustering, and thresholding techniques, among others. The advantages and disadvantages of medical ultrasound image segmentation techniques are also discussed. The article also addresses the applications of image segmentation and potential future developments around it. This review concludes that no technique is perfectly suitable for the segmentation of all types of images, but the use of hybrid techniques yields more accurate and efficient results.

Keywords: clustering-based, convolution-network, edge-based, region-growing

Procedia PDF Downloads 63
30928 Adapting an Accurate Reverse-time Migration Method to USCT Imaging

Authors: Brayden Mi

Abstract:

Reverse time migration has been widely used in the petroleum exploration industry to reveal subsurface images and to detect rock and fluid properties since the early 1980s. The seismic technology involves the construction of a velocity model through interpretive model construction, seismic tomography, or full waveform inversion, and the application of the reverse-time propagation of acquired seismic data and the original wavelet used in the acquisition. The methodology has matured from 2D, simple media to handle present-day full 3D imaging challenges in extremely complex geological conditions. Conventional ultrasound computed tomography (USCT) utilizes travel-time inversion to reconstruct the velocity structure of an organ. With the velocity structure, USCT data can be migrated with the “bend-ray” method, also known as migration. Its seismic application counterpart is called Kirchhoff depth migration, in which the source of reflective energy is traced by ray-tracing and summed to produce a subsurface image. It is well known that ray-tracing-based migration has severe limitations in strongly heterogeneous media and irregular acquisition geometries. Reverse time migration (RTM), on the other hand, fully accounts for the wave phenomena, including multiple arrivals and turning rays due to complex velocity structure. It has the capability to fully reconstruct the image detectable in its acquisition aperture. RTM algorithms typically require a rather accurate velocity model and demand high computing power, and may not be applicable to real-time imaging as normally required in day-to-day medical operations. However, with the improvement of computing technology, such a computational bottleneck may not present a challenge in the near future. Present-day RTM algorithms are typically implemented from a flat datum for the seismic industry. They can be modified to accommodate any acquisition geometry and aperture, as long as sufficient illumination is provided.
Such flexibility of RTM can be conveniently implemented for the application in USCT imaging if the spatial coordinates of the transmitters and receivers are known and enough data is collected to provide full illumination. This paper proposes an implementation of a full 3D RTM algorithm for USCT imaging to produce an accurate 3D acoustic image based on the Phase-shift-plus-interpolation (PSPI) method for wavefield extrapolation. In this method, each acquired data set (shot) is propagated back in time, and a known ultrasound wavelet is propagated forward in time, with PSPI wavefield extrapolation and a piece-wise constant velocity model of the organ (breast). The imaging condition is then applied to produce a partial image. Although each image is subject to the limitation of its own illumination aperture, the stack of multiple partial images will produce a full image of the organ, with a much-reduced noise level if compared with individual partial images.

Keywords: illumination, reverse time migration (RTM), ultrasound computed tomography (USCT), wavefield extrapolation

Procedia PDF Downloads 53
30927 Assessing Performance of Data Augmentation Techniques for a Convolutional Network Trained for Recognizing Humans in Drone Images

Authors: Masood Varshosaz, Kamyar Hasanpour

Abstract:

In recent years, we have seen growing interest in recognizing humans in drone images for post-disaster search and rescue operations. Deep learning algorithms have shown great promise in this area, but they often require large amounts of labeled data to train the models. To keep the data acquisition cost low, augmentation techniques can be used to create additional data from existing images. Many such techniques can generate variations of an original image to improve the performance of deep learning algorithms. While data augmentation is generally assumed to improve the accuracy and robustness of the models, it is important to ensure that the performance gains are not outweighed by the additional computational cost or complexity of implementing the techniques. To this end, it is important to evaluate the impact of data augmentation on the performance of the deep learning models. In this paper, we evaluated most of the currently available 2D data augmentation techniques on a standard convolutional network which was trained for recognizing humans in drone images. The techniques include rotation, scaling, random cropping, flipping, shifting, and their combination. The results showed that the augmented models perform 1-3% better than a base network. However, as the augmented images only contain the human parts already visible in the original images, a new data augmentation approach is needed to include the invisible parts of the human body. Thus, we suggest a new method that employs simulated 3D human models to generate new data for training the network.
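
A few of the 2D augmentations listed in this abstract (flipping, rotation, shifting) can be sketched on an image stored as a nested list of pixel values. The function names, the 90-degree rotation step and the zero fill value are assumptions made for illustration; the abstract does not state the parameters the authors used.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def shift_right(image, pixels, fill=0):
    """Shift content right by `pixels` columns (0 <= pixels <= width), padding with `fill`."""
    width = len(image[0])
    return [[fill] * pixels + row[: width - pixels] for row in image]


def augment(image):
    """Return the original image plus three augmented variants."""
    return [
        image,
        flip_horizontal(image),
        rotate_90_clockwise(image),
        shift_right(image, 1),
    ]


# Tiny 2x2 stand-in for a drone image.
sample = [[1, 2], [3, 4]]
variants = augment(sample)
```

Every variant reuses pixels that are already in the original, which matches the limitation the abstract points out: such transforms cannot reveal body parts that were not visible in the source image.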

Keywords: human recognition, deep learning, drones, disaster mitigation

Procedia PDF Downloads 73
30926 Development of a Shape Based Estimation Technology Using Terrestrial Laser Scanning

Authors: Gichun Cha, Byoungjoon Yu, Jihwan Park, Minsoo Park, Junghyun Im, Sehwan Park, Sujung Sin, Seunghee Park

Abstract:

The goal of this research is to estimate structural shape change using terrestrial laser scanning. This study develops data reduction and shape change estimation algorithms for large-capacity scan data. The point cloud of scan data was converted to voxels and sampled. The shape estimation technique is studied to detect changes in the patterns of structures such as skyscrapers, bridges, and tunnels based on large point cloud data. The point cloud analysis applies the octree data structure to speed up post-processing for change detection. The point cloud data is a relative representative value of shape information, and it is used as a model for detecting point cloud changes in a data structure. The shape estimation model aims to provide a technology that can detect not only normal but also immediate structural changes in the event of disasters such as earthquakes, typhoons, and fires, thereby preventing major accidents caused by aging and disasters. The study is expected to improve the efficiency of structural health monitoring and maintenance.
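
The voxel-based data reduction mentioned in this abstract can be sketched as follows: points are bucketed by the voxel they fall into, and each occupied voxel is represented by one point. Using the centroid as the representative and a flat hash grid instead of an octree are simplifying assumptions of this sketch, not details taken from the paper.

```python
import math
from collections import defaultdict


def voxel_downsample(points, voxel_size):
    """Map each occupied voxel index to the centroid of the points inside it."""
    buckets = defaultdict(list)
    for point in points:
        # Integer voxel index along each axis.
        key = tuple(math.floor(coord / voxel_size) for coord in point)
        buckets[key].append(point)
    return {
        key: tuple(sum(axis) / len(members) for axis in zip(*members))
        for key, members in buckets.items()
    }


# Three hypothetical scan points; the first two share a voxel.
cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.5, 0.2, 0.2)]
reduced = voxel_downsample(cloud, voxel_size=1.0)
```

Comparing the occupied voxel keys of two scans taken at different times is one simple way to flag regions where the shape has changed.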

Keywords: terrestrial laser scanning, point cloud, shape information model, displacement measurement

Procedia PDF Downloads 211
30925 An Enhanced Harmony Search (ENHS) Algorithm for Solving Optimization Problems

Authors: Talha A. Taj, Talha A. Khan, M. Imran Khalid

Abstract:

Optimization techniques attract researchers to formulate a problem and determine its optimum solution. This paper presents an Enhanced Harmony Search (ENHS) algorithm for solving optimization problems. The proposed algorithm converges faster and is more efficient than the standard Harmony Search (HS) algorithm. The paper discusses the novel techniques in detail and also provides the strategy for tuning the decisive parameters that affect the efficiency of the ENHS algorithm. The algorithm is tested on various benchmark functions, a real-world optimization problem and a constrained objective function. The results of ENHS are also compared to those of standard HS and various other optimization algorithms. The ENHS algorithm proves to be significantly better and more efficient than the other algorithms. The simulation and testing of the algorithms are performed in MATLAB.
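
For readers unfamiliar with the baseline, the following is a minimal sketch of the standard Harmony Search (HS) loop, not the enhanced ENHS variant, whose modifications the abstract does not spell out. It is written in Python rather than MATLAB, and the parameter defaults (memory size, HMCR, PAR, bandwidth) are illustrative assumptions.

```python
import random


def harmony_search(cost, bounds, memory_size=10, hmcr=0.9, par=0.3,
                   bandwidth=0.1, iterations=500, seed=0):
    """Minimize `cost` over box `bounds` with a basic Harmony Search loop."""
    rng = random.Random(seed)
    # Harmony memory: random initial solutions and their costs.
    memory = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(memory_size)]
    costs = [cost(harmony) for harmony in memory]
    for _ in range(iterations):
        candidate = []
        for dim, (lo, hi) in enumerate(bounds):
            if rng.random() < hmcr:
                # Memory consideration: reuse a stored value for this dimension.
                value = rng.choice(memory)[dim]
                if rng.random() < par:
                    # Pitch adjustment: small random perturbation.
                    value += rng.uniform(-bandwidth, bandwidth)
            else:
                # Random selection from the whole range.
                value = rng.uniform(lo, hi)
            candidate.append(min(hi, max(lo, value)))
        candidate_cost = cost(candidate)
        worst = max(range(memory_size), key=costs.__getitem__)
        if candidate_cost < costs[worst]:
            # Replace the worst stored harmony when the new one is better.
            memory[worst], costs[worst] = candidate, candidate_cost
    best = min(range(memory_size), key=costs.__getitem__)
    return memory[best], costs[best]


def sphere(x):
    """Simple benchmark function with its minimum at the origin."""
    return sum(v * v for v in x)


search_bounds = [(-5.0, 5.0), (-5.0, 5.0)]
best_solution, best_cost = harmony_search(sphere, search_bounds)
```

Because a stored harmony is only ever replaced by a better one, the best cost in memory never gets worse from one iteration to the next.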

Keywords: optimization, harmony search algorithm, MATLAB, electronic

Procedia PDF Downloads 440
30924 A Strategy for the Application of Second-Order Monte Carlo Algorithms to Petroleum Exploration and Production Projects

Authors: Obioma Uche

Abstract:

Due to the recent volatility in oil & gas prices as well as increased development of non-conventional resources, it has become even more essential to critically evaluate the profitability of petroleum prospects prior to making any investment decisions. Traditionally, simple Monte Carlo (MC) algorithms have been used to randomly sample probability distributions of economic and geological factors (e.g. price, OPEX, CAPEX, reserves, productive life, etc.) in order to obtain probability distributions for profitability metrics such as Net Present Value (NPV). In recent years, second-order MC algorithms have been shown to offer an advantage over simple MC techniques due to the added consideration of uncertainties associated with the probability distributions of the relevant variables. Here, a strategy for the application of the second-order MC technique to a case study is demonstrated to analyze its effectiveness as a tool for portfolio management.
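
The difference between simple and second-order Monte Carlo described here can be sketched with two nested loops: the outer loop samples the uncertain parameters of a probability distribution, and the inner loop samples the variable itself. Every number below (price range, volume, OPEX, CAPEX, discount rate) is a made-up placeholder, only the price is treated as uncertain, and the simplified NPV formula is an assumption of this sketch rather than the case study's model.

```python
import random


def npv(price, volume, opex, capex, years, rate):
    """Net present value of a constant yearly cash flow after an upfront cost."""
    yearly_cash = (price - opex) * volume
    return -capex + sum(yearly_cash / (1 + rate) ** t for t in range(1, years + 1))


def second_order_mc(outer_runs=50, inner_runs=200, seed=0):
    """Return one mean NPV per outer draw of the price distribution's parameters."""
    rng = random.Random(seed)
    mean_npvs = []
    for _ in range(outer_runs):
        # Outer loop: the distribution parameters are themselves uncertain.
        price_mean = rng.uniform(50.0, 70.0)
        price_sd = rng.uniform(5.0, 15.0)
        samples = []
        for _ in range(inner_runs):
            # Inner loop: ordinary Monte Carlo for one fixed parameter set.
            price = max(0.0, rng.gauss(price_mean, price_sd))
            samples.append(npv(price, 1e5, 20.0, 1e7, 10, 0.1))
        mean_npvs.append(sum(samples) / inner_runs)
    return mean_npvs


npv_means = second_order_mc()
```

The spread of `npv_means` reflects uncertainty about the input distribution itself, which a single-loop simulation with fixed parameters would not show.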

Keywords: Monte Carlo algorithms, portfolio management, profitability, risk analysis

Procedia PDF Downloads 308
30923 Dissimilarity-Based Coloring for Symbolic and Multivariate Data Visualization

Authors: K. Umbleja, M. Ichino, H. Yaguchi

Abstract:

In this paper, we propose a coloring method for multivariate data visualization using parallel coordinates, based on dissimilarity and tree structure information gathered during hierarchical clustering. The proposed method is an extension of proximity-based coloring, which suffers from a few undesired side effects if the hierarchical tree structure is not a balanced tree. We describe the algorithm for assigning colors based on dissimilarity information, show the application of the proposed method to three commonly used datasets, and compare the results with proximity-based coloring. We found our proposed method to be especially beneficial for symbolic data visualization, where many individual objects have already been aggregated into a single symbolic object.

Keywords: data visualization, dissimilarity-based coloring, proximity-based coloring, symbolic data

Procedia PDF Downloads 149
30922 Mechanical Properties of Ancient Timber Structure Based on the Non Destructive Test Method: A Study to Feiyun Building, Shanxi, China

Authors: Annisa Dewanti Putri, Wang Juan, Y. Qing Shan

Abstract:

Structural assessment is a crucial part of the work on ancient timber structures, since this phase serves as the reference for the maintenance and preservation phase. The mechanical properties of a structure are an important component of the structural assessment of a building. Feiyun, as one of the specially preserved buildings in China, will become one of the pioneers of timber structure building assessment. The 3-storey building, which is located in Shanxi Province, consists of a complex ancient timber structure. Due to its condition and for preservation purposes, assessments (visual inspections, a non-destructive test and a semi non-destructive test) were conducted. Data from the stress wave measurement, the moisture content analyzer, and the micro-drilling resistance meter are used to predict the mechanical properties. As a result, the mechanical properties can be used in the next phase as a reference for structural damage solutions.

Keywords: ancient structure, mechanical properties, non destructive test, stress wave, structural assessment, timber structure

Procedia PDF Downloads 450
30921 Genetic Algorithms for Feature Generation in the Context of Audio Classification

Authors: José A. Menezes, Giordano Cabral, Bruno T. Gomes

Abstract:

Choosing good features is an essential part of machine learning. Recent techniques aim to automate this process. For instance, feature learning intends to learn the transformation of raw data into a representation useful for machine learning tasks. In automatic audio classification tasks, this is interesting since the audio, usually complex information, needs to be transformed into an input that is computationally convenient to process. Another technique tries to generate features by searching a feature space. Genetic algorithms, for instance, have been used to generate audio features by combining or modifying them. We find this approach particularly interesting and, despite the undeniable advances of feature learning approaches, we wanted to take a step forward in the use of genetic algorithms to find audio features, combining them with more conventional methods, like PCA, and inserting search control mechanisms, such as constraints over a confusion matrix. This work presents the results obtained on particular audio classification problems.

Keywords: feature generation, feature learning, genetic algorithm, music information retrieval

Procedia PDF Downloads 412
30920 Digital Mapping as a Tool for Finding Cities' DNA

Authors: Sanja Peter

Abstract:

Transformation of urban environments can be compared to evolutionary processes. Systematic digital mapping of historical data can enable capturing some of these processes and their outcomes. For example, it may help reveal the structure of a city’s historical DNA. Gathering historical data for automatic processing may provide a basis for cultural algorithms. Gothenburg City Museum is trying to make the city’s heritage information accessible through GIS platforms and is now partnering with academic institutions to find appropriate methods for making knowledge of the city’s historical fabric accessible. Hopefully, this will be carried out through a project called Digital Twin Cities. One part of this large project, concerning matters of cultural heritage, will be carried out in collaboration with Chalmers University of Technology. The aim is to create a layered map showing historical developments of the city and to extract quantitative data about its built heritage, above and below the earth. It will allow interpreting the information from historic maps through, for example, names of the streets and places, geography, structural changes in the urban fabric and information gathered from archaeologists’ excavations. Through the study of these geographical, historical and local metamorphoses, the urban environment will reveal its metaphorical DNA or its meme (Dawkins).

Keywords: Gothenburg, mapping, cultural heritage, city history

Procedia PDF Downloads 118
30919 Downscaling Daily Temperature with Neuroevolutionary Algorithm

Authors: Min Shi

Abstract:

State-of-the-art research with Artificial Neural Networks (ANNs) for the downscaling of General Circulation Models (GCMs) mainly uses the back-propagation algorithm as a training approach. This paper introduces another training approach for ANNs, the evolutionary algorithm. The combined algorithm is called the neuroevolutionary (NE) algorithm. We investigate and evaluate the use of NE algorithms in statistical downscaling by generating temperature estimates at interior points given information from a lattice of surrounding locations. The results of our experiments indicate that NE algorithms can be efficient alternative downscaling methods for daily temperatures.

Keywords: temperature, downscaling, artificial neural networks, evolutionary algorithms

Procedia PDF Downloads 330
30918 Geospatial Network Analysis Using Particle Swarm Optimization

Authors: Varun Singh, Mainak Bandyopadhyay, Maharana Pratap Singh

Abstract:

The shortest path (SP) problem concerns finding the shortest path from a specific origin to a specified destination in a given network while minimizing the total cost associated with the path. This problem has widespread applications. Important applications of the SP problem include vehicle routing in transportation systems, particularly in the field of in-vehicle Route Guidance Systems (RGS), and the traffic assignment problem in transportation planning. Evolutionary methods such as Genetic Algorithms (GA), Ant Colony Optimization, and Particle Swarm Optimization (PSO) have been applied to complex optimization problems to overcome the shortcomings of existing shortest path analysis methods. Various researchers have reported that PSO performs better than other evolutionary optimization algorithms in terms of success rate and solution quality. Further, Geographic Information Systems (GIS) have emerged as key information systems for geospatial data analysis and visualization. This research paper focuses on the application of PSO to the shortest path problem between multiple points of interest (POI), based on spatial data of Allahabad City and traffic speed data collected using GPS. Geovisualization of the analysis results is carried out in GIS.

Keywords: particle swarm optimization, GIS, traffic data, outliers

Procedia PDF Downloads 456
30917 Global Convergence of a Modified Three-Term Conjugate Gradient Algorithms

Authors: Belloufi Mohammed, Sellami Badreddine

Abstract:

This paper deals with a new nonlinear modified three-term conjugate gradient algorithm for solving large-scale unconstrained optimization problems. The search direction of the algorithms from this class has three terms and is computed as a modification of the classical conjugate gradient algorithms to satisfy both the descent and the conjugacy conditions. An example of a three-term conjugate gradient algorithm from this class, obtained as a modification of the classical and well-known Hestenes and Stiefel algorithm or of the CG_DESCENT conjugate gradient algorithm by Hager and Zhang, and satisfying both the descent and the conjugacy conditions, is presented. Under mild conditions, we prove that the modified three-term conjugate gradient algorithm with Wolfe type line search is globally convergent. Preliminary numerical results show the proposed method is very promising.

Keywords: unconstrained optimization, three-term conjugate gradient, sufficient descent property, line search

Procedia PDF Downloads 348
30916 Study of Adaptive Filtering Algorithms and the Equalization of Radio Mobile Channel

Authors: Said Elkassimi, Said Safi, B. Manaut

Abstract:

This paper presents a study of three algorithms: an equalization algorithm that equalizes the transmission channel under the ZF and MMSE criteria, applied to the Bran A channel, and the LMS and RLS adaptive filtering algorithms, which estimate the parameters of the equalizer filter, i.e., they perform channel estimation, thereby tracking the temporal variations of the channel and reducing the error in the transmitted signal. We evaluate the performance of the equalizer under the ZF and MMSE criteria in the noiseless case and compare the performance of the LMS and RLS algorithms.
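
The LMS update mentioned in this abstract can be sketched as a short channel-identification loop: the filter output is compared with the observed signal, and the weights are nudged along the error. The two-tap channel, the step size and the noiseless setting are hypothetical values chosen for illustration, not the Bran A channel or the paper's configuration.

```python
import random


def lms_identify(inputs, desired, num_taps, step_size):
    """Estimate FIR filter taps with the least-mean-squares (LMS) update."""
    weights = [0.0] * num_taps
    window = [0.0] * num_taps  # most recent input sample first
    for x, d in zip(inputs, desired):
        window = [x] + window[:-1]
        estimate = sum(w * u for w, u in zip(weights, window))
        error = d - estimate
        # LMS rule: move each weight along the error times its input sample.
        weights = [w + step_size * error * u for w, u in zip(weights, window)]
    return weights


# Hypothetical noiseless two-tap channel driven by a random training signal.
rng = random.Random(0)
channel = [0.8, -0.3]
signal = [rng.uniform(-1.0, 1.0) for _ in range(3000)]
received = [
    channel[0] * signal[n] + (channel[1] * signal[n - 1] if n > 0 else 0.0)
    for n in range(len(signal))
]
estimated = lms_identify(signal, received, num_taps=2, step_size=0.05)
```

RLS pursues the same goal with a recursively updated inverse correlation matrix, which typically converges in fewer samples at a higher cost per update.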

Keywords: adaptive filtering second equalizer, LMS, RLS Bran A, Proakis (B) MMSE, ZF

Procedia PDF Downloads 298
30915 Arabic Text Representation and Classification Methods: Current State of the Art

Authors: Rami Ayadi, Mohsen Maraoui, Mounir Zrigui

Abstract:

In this paper, we present a brief overview of the current state of the art for Arabic text representation and classification methods. We decompose work on Arabic text classification (TC) into four categories. First, we describe some algorithms applied to the classification of Arabic text. Second, we cite the major works comparing classification algorithms applied to Arabic text. Third, we mention some authors who propose new classification methods. Finally, we investigate the impact of preprocessing on Arabic TC.

Keywords: text classification, Arabic, impact of preprocessing, classification algorithms

Procedia PDF Downloads 445
30914 Multi-Cluster Overlapping K-Means Extension Algorithm (MCOKE)

Authors: Said Baadel, Fadi Thabtah, Joan Lu

Abstract:

Clustering involves the partitioning of n objects into k clusters. Many clustering algorithms use hard-partitioning techniques where each object is assigned to exactly one cluster. In this paper, we propose an overlapping algorithm, MCOKE, which allows objects to belong to one or more clusters. The algorithm differs from fuzzy clustering techniques because objects that overlap are assigned a membership value of 1 (one) as opposed to a fuzzy membership degree. The algorithm also differs from other overlapping algorithms, which require a similarity threshold to be defined a priori, something that can be difficult for novice users to determine.
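
The difference between hard partitioning and overlapping membership with value 1 can be sketched as follows. The abstract does not state how MCOKE decides that an object overlaps, so the rule used here is an assumption: after plain k-means, an object belongs to every cluster whose centroid lies within a global radius, taken as the largest distance between any object and its nearest centroid. The function names and the sample points are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain k-means (Lloyd's algorithm) starting from the given centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda j: math.dist(p, centroids[j]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its members (keep it if empty).
        centroids = [
            tuple(sum(c) / len(members) for c in zip(*members))
            if members else centroids[j]
            for j, members in enumerate(clusters)
        ]
    return centroids


def overlapping_membership(points, centroids):
    """Binary membership matrix: 1 for every centroid within the global radius."""
    # Assumed rule: radius = largest distance from any point to its nearest centroid.
    radius = max(min(math.dist(p, c) for c in centroids) for p in points)
    return [[1 if math.dist(p, c) <= radius else 0 for c in centroids]
            for p in points]


points = [(0, 0), (1, 0), (2, 0), (4, 0), (5, 0), (6, 0), (7, 0), (11, 0)]
centroids = kmeans(points, [(1, 0), (6, 0)])
membership = overlapping_membership(points, centroids)
for p, row in zip(points, membership):
    print(p, row)
```

Each row contains only 0s and 1s, so an object that falls inside two clusters is a full member of both, in contrast to the fractional degrees a fuzzy method would assign. No user-supplied similarity threshold appears in the sketch because the radius is derived from the k-means result itself.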

Keywords: data mining, k-means, MCOKE, overlapping

Procedia PDF Downloads 543
30913 Design and Implementation of a Software Platform Based on Artificial Intelligence for Product Recommendation

Authors: Giuseppina Settanni, Antonio Panarese, Raffaele Vaira, Maurizio Galiano

Abstract:

Nowadays, artificial intelligence is used successfully in academia and industry for its ability to learn from large amounts of data. In particular, in recent years the use of machine learning algorithms in the field of e-commerce has spread worldwide. In this research study, a prototype software platform was designed and implemented in order to suggest to users the products most suitable for their needs. The platform includes a chatbot and a recommender system based on artificial intelligence algorithms that provide suggestions and decision support to the customer. Recommendation systems perform the important function of automatically filtering and personalizing information, thus helping to manage the information overload to which the user is exposed on a daily basis. Recently, international research has experimented with machine learning technologies with the aim of increasing the potential of traditional recommendation systems. Specifically, support vector machine algorithms have been implemented in combination with natural language processing techniques that allow users to interact with the system, express their requests, and receive suggestions. An interested user can access the web platform on the internet using a computer, tablet, or mobile phone, register, provide the necessary information, and view the products that the system deems most appropriate for them. The platform also integrates a dashboard that gives access to its various functions in an intuitive and simple way. Artificial intelligence algorithms have been implemented and trained on historical data collected from user browsing. Finally, the testing phase made it possible to validate the implemented model, which will be further tested by letting customers use it.

Keywords: machine learning, recommender system, software platform, support vector machine

Procedia PDF Downloads 114
30912 Optimization of Proton Exchange Membrane Fuel Cell Parameters Based on Modified Particle Swarm Algorithms

Authors: M. Dezvarei, S. Morovati

Abstract:

In recent years, the increasing use of electrical energy has opened a wide field for investigating new methods of producing clean electricity with high reliability and cost management. Fuel cells are a new generation of clean sources that produce electricity and thermal energy together, with high performance and no environmental pollution. Given the expansion of fuel cell usage in different industrial networks, the identification and optimization of their parameters is highly significant. This paper presents the optimization of proton exchange membrane fuel cell (PEMFC) parameters based on modified particle swarm optimization with real-valued mutation (RVM) and clonal algorithms. The mathematical equations of this type of fuel cell are presented as the main model structure in the optimization process. Parameters optimized with the clonal and RVM algorithms are compared with the desired values in the presence and absence of measurement noise. This paper shows that these methods can improve on the performance of traditional optimization methods. Simulation results are used to analyze and compare the performance of these methodologies in optimizing the proton exchange membrane fuel cell parameters.
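
The general idea of identifying model parameters with a particle swarm that includes a mutation step can be sketched as below. This is a minimal sketch under several assumptions, none of which come from the paper: the toy voltage curve `model` stands in for the PEMFC equations, the mutation is a Gaussian perturbation applied with probability `mutation_rate`, and the swarm constants `w`, `c1`, `c2` are common textbook choices. The clonal variant is not shown.

```python
import math
import random


def pso_with_mutation(loss, bounds, particles=30, iterations=200,
                      w=0.7, c1=1.5, c2=1.5, mutation_rate=0.1, seed=0):
    """Minimize loss over a box with PSO plus a Gaussian mutation step."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]           # personal bests
    best_val = [loss(p) for p in pos]
    g = min(range(particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    for _ in range(iterations):
        for i in range(particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (w * vel[i][d]
                             + c1 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + c2 * rng.random() * (g_pos[d] - pos[i][d]))
                x = pos[i][d] + vel[i][d]
                # Assumed mutation: occasional Gaussian jump scaled to the range.
                if rng.random() < mutation_rate:
                    x += rng.gauss(0.0, 0.1 * (hi - lo))
                pos[i][d] = min(hi, max(lo, x))  # clamp to the search box
            val = loss(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < g_val:
                    g_pos, g_val = pos[i][:], val
    return g_pos, g_val


def model(params, current):
    """Toy voltage curve (hypothetical, not the paper's PEMFC model)."""
    e0, b, r = params
    return e0 - b * math.log(current) - r * current


currents = [0.1 * k for k in range(1, 11)]
true_params = (1.0, 0.05, 0.2)                      # made-up reference values
measured = [model(true_params, i) for i in currents]  # noise-free "measurements"


def loss(params):
    """Mean squared error between the model output and the measurements."""
    return sum((model(params, i) - v) ** 2
               for i, v in zip(currents, measured)) / len(currents)


bounds = [(0.5, 1.5), (0.0, 0.2), (0.0, 0.5)]
best_params, best_loss = pso_with_mutation(loss, bounds)
print(best_params, best_loss)
```

Adding noise to `measured` before fitting would mimic the comparison in the presence of measurement noise that the abstract mentions.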

Keywords: clonal algorithm, proton exchange membrane fuel cell (PEMFC), particle swarm optimization (PSO), real-valued mutation (RVM)

Procedia PDF Downloads 328
30911 Data Poisoning Attacks on Federated Learning and Preventive Measures

Authors: Beulah Rani Inbanathan

Abstract:

In the present era, numerous outcomes make it evident that data privacy is being compromised in various ways. Conventional machine learning relies on a centralized server: data is given as input, analyzed by the algorithms on that server, and outputs are predicted. However, the user must send the data each time so that the algorithm can analyze it and predict the output, which leaves the data prone to threats. The solution to this issue is federated learning, where only the models get updated while the data resides on the local machine and is not exchanged with the other local models. Nevertheless, even these local models are open to data poisoning, as various experiments have made clear. This paper examines the many ways in which data poisoning occurs and the evidence that it remains prevalent. It covers poisoning attacks on IoT devices, edge devices, autoregressive models, and industrial IoT systems, along with a few points on how these attacks could be avoided in order to protect data that is personal, sensitive, or harmful when exposed.

Keywords: data poisoning, federated learning, Internet of Things, edge computing

Procedia PDF Downloads 68
30910 Inferring Human Mobility in India Using Machine Learning

Authors: Asra Yousuf, Ajaykumar Tannirkulum

Abstract:

Inferring rural-urban migration trends can help design effective policies that promote better urban planning and rural development. In this paper, we describe how machine learning algorithms can be applied to predict people's internal migration decisions. We use data collected from household surveys in Tamil Nadu to train our model. To measure the performance of the model, we use data on past migration from the National Sample Survey Organisation of India. The factors used for training the model include the socioeconomic characteristics of each individual, such as age, gender, place of residence, outstanding loans, and strength of the household, as well as their past migration history. We perform a comparative analysis of a number of machine learning algorithms to determine their prediction accuracy. Our results show that machine learning algorithms provide stronger prediction accuracy than statistical models. Our goal through this research is to propose the use of data science techniques in understanding human decisions and behaviour in developing countries.

Keywords: development, migration, internal migration, machine learning, prediction

Procedia PDF Downloads 251
30909 Aggregation Scheduling Algorithms in Wireless Sensor Networks

Authors: Min Kyung An

Abstract:

In Wireless Sensor Networks, which consist of tiny wireless sensor nodes with limited battery power, one of the most fundamental applications is data aggregation, which collects nearby environmental conditions and aggregates the data to a designated destination, called a sink node. Because the nodes have limited energy, important issues in data aggregation are time efficiency and energy consumption, and therefore the related problem, named Minimum Latency Aggregation Scheduling (MLAS), has been the focus of many researchers. Its objective is to compute the minimum latency schedule, that is, a schedule with the minimum number of timeslots such that the sink node can receive the aggregated data from all the other nodes without any collision or interference. For this problem, two interference models, the graph model and the more realistic physical interference model known as Signal-to-Interference-Noise-Ratio (SINR), have been adopted with different power models, uniform power and non-uniform power (with power control or without power control), and different antenna models, omni-directional and directional. In this survey article, as the problem has been proven to be NP-hard, we present and compare several state-of-the-art approximation algorithms in various models, using latency as the performance measure.
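
To make the notion of a collision-free aggregation schedule concrete, the sketch below builds one greedily under the graph interference model. It is a simple heuristic written for illustration, not one of the approximation algorithms surveyed in the article, and it carries no latency guarantee. The assumptions are: the aggregation tree (`parent`) is given, a node sends once after all of its children have sent, and a receiver must not be adjacent to any other node sending in the same timeslot. The function name and the sample network are hypothetical.

```python
def greedy_aggregation_schedule(adjacency, parent):
    """Return a list of timeslots, each a list of (sender, receiver) pairs."""
    pending_children = {v: 0 for v in adjacency}
    for child, par in parent.items():
        pending_children[par] += 1
    unsent = set(parent)  # every node except the sink sends exactly once
    schedule = []
    while unsent:
        slot, senders, receivers = [], set(), set()
        # A node is ready once it has heard from all of its children, so a
        # ready node is never the receiver of another ready node in this slot.
        ready = sorted(v for v in unsent if pending_children[v] == 0)
        for v in ready:
            r = parent[v]
            # v must not disturb a receiver that is already active (this also
            # blocks two siblings from sending to the same parent at once).
            if any(u in receivers for u in adjacency[v]):
                continue
            # r must not hear any other sender in this slot.
            if any(u in senders for u in adjacency[r]):
                continue
            slot.append((v, r))
            senders.add(v)
            receivers.add(r)
        for v, r in slot:
            unsent.remove(v)
            pending_children[r] -= 1
        schedule.append(slot)
    return schedule


# Hypothetical network: node 0 is the sink; the 4-2 link causes interference only.
adjacency = {
    0: [1, 2],
    1: [0, 3, 4],
    2: [0, 4, 5],
    3: [1],
    4: [1, 2],
    5: [2],
}
parent = {1: 0, 2: 0, 3: 1, 4: 1, 5: 2}

schedule = greedy_aggregation_schedule(adjacency, parent)
for t, slot in enumerate(schedule, start=1):
    print("timeslot", t, slot)
```

The number of timeslots in the returned list is the latency of the schedule, which is the quantity the MLAS problem seeks to minimize. Under the SINR model the two adjacency checks would be replaced by a test on the signal-to-interference-plus-noise ratio at each receiver.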

Keywords: data aggregation, convergecast, gathering, approximation, interference, omni-directional, directional

Procedia PDF Downloads 207
30908 Improved Classification Procedure for Imbalanced and Overlapped Situations

Authors: Hankyu Lee, Seoung Bum Kim

Abstract:

The issue of imbalance and overlap in the class distribution has become important in various applications of data mining. An imbalanced dataset is a special case in classification problems in which the number of observations of one class (i.e., the major class) heavily exceeds the number of observations of the other class (i.e., the minor class). An overlapped dataset is one in which many observations are shared between the two classes. Imbalanced and overlapped data are frequently found in many real examples, including fraud and abuse patients in healthcare, quality prediction in manufacturing, text classification, oil spill detection, remote sensing, and so on. The class imbalance and overlap problem is a challenging issue because it degrades the performance of most standard classification algorithms. In this study, we propose a classification procedure that can effectively handle imbalanced and overlapped datasets by splitting the data space into three parts (nonoverlapping, light overlapping, and severe overlapping) and applying a classification algorithm in each part. These three parts were determined based on the Hausdorff distance and the margin of the modified support vector machine. An experimental study was conducted to examine the properties of the proposed method and to compare it with other classification algorithms. The results showed that the proposed method outperformed the competitors under various imbalanced and overlapped situations. Moreover, the applicability of the proposed method was demonstrated through an experiment with real data.

Keywords: classification, imbalanced data with class overlap, split data space, support vector machine

Procedia PDF Downloads 289
30907 A Numerical Description of a Fibre Reinforced Concrete Using a Genetic Algorithm

Authors: Henrik L. Funke, Lars Ulke-Winter, Sandra Gelbrich, Lothar Kroll

Abstract:

This work reports on an approach for the automatic adaptation of concrete formulations based on genetic algorithms (GA) to optimize a wide range of different fit-functions. To achieve this goal, a method was developed which provides a numerical description of a fibre reinforced concrete (FRC) mixture with regard to the production technology and the property spectrum of the concrete. In a first step, the FRC mixture with seven fixed components was characterized by varying the amounts of the components. For that purpose, ten concrete mixtures were prepared and tested. The testing procedure comprised flow spread, compressive strength, and bending tensile strength. The analysis and approximation of the determined data were carried out by GAs. The aim was to obtain a closed mathematical expression which best describes the given seven-point cloud of FRC by applying a Gene Expression Programming with Free Coefficients (GEP-FC) strategy. The seven-parametric FRC-mixtures model generated according to this method correlated well with the measured data. The developed procedure can be used to find closed mathematical expressions for concrete mixtures based on the measured data.

Keywords: concrete design, fibre reinforced concrete, genetic algorithms, GEP-FC

Procedia PDF Downloads 250