Search results for: distributed algorithms
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 3815

3515 A Near-Optimal Domain Independent Approach for Detecting Approximate Duplicates

Authors: Abdelaziz Fellah, Allaoua Maamir

Abstract:

We propose a domain-independent merging-cluster filter approach, complemented with a set of algorithms, for identifying approximate duplicate entities efficiently and accurately within a single data source and across multiple data sources. The near-optimal merging-cluster filter (MCF) approach is based on the well-tuned Monge-Elkan algorithm and extended with an affine variant of the Smith-Waterman similarity measure. We then present constant, variable, and function threshold algorithms that work conceptually in a divide-merge filtering fashion to detect near duplicates as hierarchical clusters along with their corresponding representatives. The algorithms take recursive refinement approaches in the spirit of filtering, merging, and updating cluster representatives to detect approximate duplicates at each level of the cluster tree. Experiments show the high effectiveness and accuracy of the MCF approach in detecting approximate duplicates: it outperforms the seminal Monge-Elkan algorithm on several real-world benchmarks and generated datasets.
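The Monge-Elkan measure that this abstract builds on can be sketched in a few lines: each token of one string is matched to its most similar token in the other, and the best scores are averaged. The sketch below is only an illustration, not the authors' MCF method. As an assumption, it uses Python's `difflib.SequenceMatcher` as a stand-in for the inner token similarity, whereas the abstract describes an affine variant of Smith-Waterman; the function names are hypothetical.

```python
from difflib import SequenceMatcher


def token_similarity(a: str, b: str) -> float:
    """Inner similarity in [0, 1]; a stand-in for the affine Smith-Waterman variant."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def monge_elkan(left: str, right: str) -> float:
    """Average, over tokens of `left`, of the best match among tokens of `right`."""
    left_tokens = left.split()
    right_tokens = right.split()
    if not left_tokens or not right_tokens:
        return 0.0
    best_scores = [
        max(token_similarity(a, b) for b in right_tokens) for a in left_tokens
    ]
    return sum(best_scores) / len(best_scores)


if __name__ == "__main__":
    # Token order does not matter, so reordered names still match perfectly.
    print(monge_elkan("John Smith", "Smith John"))
    print(monge_elkan("Jon Smith", "John Smyth"))
```

Note that the measure is asymmetric in general: swapping the arguments can change the score when the two strings have different numbers of tokens.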

Keywords: data mining, data cleaning, approximate duplicates, near-duplicates detection, data mining applications and discovery

Procedia PDF Downloads 358
3514 Project Progress Prediction in Software Development Integrating Time Prediction Algorithms and Large Language Modeling

Authors: Dong Wu, Michael Grenn

Abstract:

Managing software projects effectively is crucial for meeting deadlines, ensuring quality, and managing resources well. Traditional methods often struggle to predict project timelines accurately because of uncertain schedules and complex data. This study addresses these challenges by combining time prediction algorithms with Large Language Models (LLMs). It uses real-world software project data to construct and validate a model. The model takes detailed project progress data, such as task completion dynamics, team interaction, and development metrics, as its input and outputs predictions of project timelines. To evaluate the effectiveness of this model, a comprehensive methodology is employed, involving simulations and practical applications in a variety of real-world software project scenarios. This multifaceted evaluation strategy is designed to validate the model's role in enhancing forecast accuracy and elevating overall management efficiency, particularly in complex software project environments. The results indicate that the integration of time prediction algorithms with LLMs has the potential to optimize software project progress management. These quantitative results suggest the effectiveness of the method in practical applications. In conclusion, this study demonstrates that integrating time prediction algorithms with LLMs can significantly improve the predictive accuracy and efficiency of software project management. This offers an advanced project management tool for the industry, with the potential to improve operational efficiency, optimize resource allocation, and ensure timely project completion.

Keywords: software project management, time prediction algorithms, large language models (LLMs), forecast accuracy, project progress prediction

Procedia PDF Downloads 43
3513 Evaluation of Classification Algorithms for Diagnosis of Asthma in Iranian Patients

Authors: Taha SamadSoltani, Peyman Rezaei Hachesu, Marjan GhaziSaeedi, Maryam Zolnoori

Abstract:

Introduction: Data mining is defined as a process of finding patterns and relationships in the data in a database in order to build predictive models. The application of data mining has extended into many sectors, including healthcare services. Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. It applies various techniques and algorithms that differ in accuracy and precision. The purpose of this study was to apply knowledge discovery and data mining techniques to the diagnosis of asthma based on patient symptoms and history. Method: Data mining includes several steps and decisions to be made by the user. It starts with building an understanding of the scope and application of previous knowledge in the area and identifying the KD process from the point of view of the stakeholders, and it finishes with acting on the discovered knowledge by applying it, integrating it with other systems, and documenting and reporting it. In this study, a stepwise methodology was followed to achieve a logical outcome. Results: The sensitivity, specificity, and accuracy of the KNN, SVM, Naïve Bayes, NN, classification tree, and CN2 algorithms and of related similar studies were evaluated, and ROC curves were plotted to show the performance of the system. Conclusion: The results show that asthma can be diagnosed with approximately ninety percent accuracy based on demographic and clinical data. The study also showed that methods based on pattern discovery and data mining have a higher sensitivity than expert and knowledge-based systems. On the other hand, medical guidelines and evidence-based medicine should be the basis of diagnostic methods; it is therefore recommended that machine learning algorithms be used in combination with knowledge-based algorithms.

Keywords: asthma, data mining, classification, machine learning

Procedia PDF Downloads 422
3512 Modeling of Virtual Power Plant

Authors: Muhammad Fanseem E. M., Rama Satya Satish Kumar, Indrajeet Bhausaheb Bhavar, Deepak M.

Abstract:

Keeping the right balance of electricity between the supply and demand sides of the grid is one of the most important objectives of electrical grid operation. Power generation and demand forecasting are the core of power management and generation scheduling. Conventional power systems were built around large, centralized generating units. A certain level of balance was possible because generation kept up with power demand. However, integrating renewable energy sources into power networks has proven difficult because of their intermittent nature. The power imbalance caused by rising demand and peak loads is negatively affecting power quality and dependability. Demand-side management and demand response were among the solutions: they keep generation the same but alter, reschedule, or completely shed the load. However, shedding or rescheduling the load is not efficient. This is where virtual power plants become significant. A virtual power plant integrates distributed generation, dispatchable load, and distributed energy storage organically by using complementary control approaches and communication technologies. This would eventually increase the utilization rate and financial advantages of distributed energy resources. Most of the literature on virtual power plant models ignores technical limitations and models the plant from a financial or commercial viewpoint. Therefore, this paper aims to address the modeling intricacies of VPPs and their technical limitations, shedding light on a holistic understanding of this innovative power management approach.

Keywords: cost optimization, distributed energy resources, dynamic modeling, model quality tests, power system modeling

Procedia PDF Downloads 23
3511 Implicit Force Control of a Position Controlled Robot - A Comparison with Explicit Algorithms

Authors: Alexander Winkler, Jozef Suchý

Abstract:

This paper investigates simple implicit force control algorithms that can be realized with industrial robots. Many published approaches are difficult to implement in commercial robot controllers because they require access to the robot joint torques or use the complete dynamic model of the manipulator. In previous work, we dealt with explicit force control of a position controlled robot. Well-known schemes of implicit force control are stiffness control, damping control, and impedance control. With such algorithms, the contact force cannot be set directly. Instead, it results from the controller impedance, the environment impedance, and the commanded robot motion/position. The relationships between these properties are worked out in detail in this paper for the chosen implicit approaches. The approaches have been adapted to be implementable on a position controlled robot. The behaviors of stiffness control and damping control are verified by practical experiments. For this purpose, a suitable test bed was configured. Using the full mechanical impedance within the controller structure is not practical when the robot is in physical contact with the environment. This fact is verified by simulation.

Keywords: robot force control, stiffness control, damping control, impedance control, stability

Procedia PDF Downloads 491
3510 Performance Evaluation of Task Scheduling Algorithm on LCQ Network

Authors: Zaki Ahmad Khan, Jamshed Siddiqui, Abdus Samad

Abstract:

The scheduling and mapping of tasks on a set of processors is considered a critical problem in parallel and distributed computing systems. This paper deals with the problem of dynamic scheduling on a special type of multiprocessor architecture known as the Linear Crossed Cube (LCQ) network. The proposed multiprocessor is a hybrid network that combines the features of both linear and cube-based architectures. Two standard dynamic scheduling schemes, namely Minimum Distance Scheduling (MDS) and Two Round Scheduling (TRS), are implemented on the LCQ network. Parallel tasks are mapped, and the load imbalance is evaluated on different sets of processors in the LCQ network. The simulation results are evaluated and analyzed thoroughly to obtain the best solution for the given network in terms of the load imbalance left and the execution time. Other performance metrics, such as speedup and efficiency, are also evaluated for the given dynamic algorithms.

Keywords: dynamic algorithm, load imbalance, mapping, task scheduling

Procedia PDF Downloads 420
3509 Performance of Non-Deterministic Structural Optimization Algorithms Applied to a Steel Truss Structure

Authors: Ersilio Tushaj

Abstract:

Finding an efficient solution that satisfies the optimality condition is an important issue in structural engineering design. The new structural design codes rely on a design methodology that seeks to exploit the full resources of the construction material. In recent years, non-deterministic or meta-heuristic structural optimization algorithms have been developed widely in the research community. These methods search for the optimum by simulating a natural phenomenon, such as survival of the fittest, the immune system, swarm intelligence, or the cooling process of molten metal through annealing. The best known of these techniques are genetic algorithms, simulated annealing, evolution strategies, particle swarm optimization, tabu search, ant colony optimization, harmony search, and big bang-big crunch optimization. In this study, five of these algorithms are applied to the optimum weight design of a steel truss structure with variable geometry but fixed topology. The design process selects optimum distances and section sizes from a set of commercial steel profiles. Deflection limitations, buckling, and allowable stress constraints are considered in the formulation of the design problem. The approach is repeated starting from different initial populations. The topology of the design problem is taken from an existing steel structure. The optimization process helps the engineer achieve good final solutions while avoiding the repetitive, time-consuming evaluation of alternative designs. The algorithms used, the optimal solutions obtained, the number of iterations, and the minimal weight designs are reported in the paper. Based on these results, the amount of steel that could be saved by combining structural analysis with non-deterministic optimization methods is estimated.

Keywords: structural optimization, non-deterministic methods, truss structures, steel truss

Procedia PDF Downloads 188
3508 Incorporating Multiple Supervised Learning Algorithms for Effective Intrusion Detection

Authors: Umar Albalawi, Sang C. Suh, Jinoh Kim

Abstract:

As the Internet continues to expand its usage with an enormous number of applications, cyber-threats have increased significantly. Accurate and timely detection of malicious traffic is therefore a critical security concern in today’s Internet. One approach to intrusion detection is to use Machine Learning (ML) techniques. Several methods based on ML algorithms have been introduced over the past years, but they are largely limited in terms of detection accuracy and/or the time and space complexity needed to run them. In this work, we present a novel method for intrusion detection that incorporates a set of supervised learning algorithms. The proposed technique provides high accuracy and outperforms existing techniques that simply utilize a single learning method. In addition, our technique relies on partial flow information (rather than full information) for detection; it is thus lightweight and desirable for online operations, with the property of early identification. Using the publicly available mid-Atlantic CCDC intrusion dataset, we show that our proposed technique yields a detection rate over 99% with a very low false alarm rate (0.4%).

Keywords: intrusion detection, supervised learning, traffic classification, computer networks

Procedia PDF Downloads 316
3507 Short Text Classification Using Part of Speech Feature to Analyze Students' Feedback of Assessment Components

Authors: Zainab Mutlaq Ibrahim, Mohamed Bader-El-Den, Mihaela Cocea

Abstract:

Students' textual feedback can hold unique patterns and useful information about the learning process, including the advantages and disadvantages of teaching methods, assessment components, facilities, and other aspects of teaching. The results of analysing such feedback can form a key input for institutions’ decision makers to advance and update their systems accordingly. This paper proposes a data mining framework for analysing end-of-unit general textual feedback using the part of speech (PoS) feature with four machine learning algorithms: support vector machines, decision tree, random forest, and naive Bayes. The proposed framework has two tasks. The first is to use the above algorithms to build an optimal model that automatically classifies the whole data set into two subsets: one related to assessment practices (assessment related) and the other containing the non-assessment related data. The second is to use the same algorithms to build an optimal model that automatically detects the sentiment of the whole data set and of the new data subsets. The significance of this paper is that it compares the performance of the above four algorithms using the part of speech feature with the performance of the same algorithms using the n-grams feature. The paper follows the Knowledge Discovery and Data Mining (KDDM) framework to construct the classification and sentiment analysis models; the framework consists of understanding the assessment domain, cleaning and pre-processing the data set, selecting and running the data mining algorithm, interpreting mined patterns, and consolidating the discovered knowledge. The experimental results show that models using either feature performed very well on the first task. On the second task, however, models that used the part of speech feature underperformed in comparison with models that used unigrams and bigrams.

Keywords: assessment, part of speech, sentiment analysis, student feedback

Procedia PDF Downloads 107
3506 Comparative Analysis of Reinforcement Learning Algorithms for Autonomous Driving

Authors: Migena Mana, Ahmed Khalid Syed, Abdul Malik, Nikhil Cherian

Abstract:

In recent years, advancements in deep learning have enabled researchers to tackle the problem of self-driving cars. Car companies use huge datasets to train their deep learning models to make autonomous cars a reality. However, this approach has a drawback: the state space of possible actions for a car is so huge that there cannot be a dataset for every possible road scenario. To overcome this problem, reinforcement learning (RL) is investigated in this research. Since the problem of autonomous driving can be modeled in a simulation, it lends itself naturally to the domain of reinforcement learning. The advantage of this approach is that we can model different and complex road scenarios in a simulation without having to deploy in the real world. The autonomous agent can learn to drive by finding the optimal policy. This learned model can then be easily deployed in a real-world setting. In this project, we focus on three RL algorithms: Q-learning, Deep Deterministic Policy Gradient (DDPG), and Proximal Policy Optimization (PPO). To model the environment, we have used TORCS (The Open Racing Car Simulator), which provides us with a strong foundation to test our model. The inputs to the algorithms are the sensor data provided by the simulator, such as velocity, distance from the side pavement, etc. The outcome of this research project is a comparative analysis of these algorithms. Based on the comparison, the PPO algorithm gives the best results. With the PPO algorithm, the reward is greater, and the acceleration, steering angle, and braking are more stable than with the other algorithms, which means that the agent learns to drive in a better and more efficient way in this case. Additionally, we have produced a dataset taken from the training of the agent with the DDPG and PPO algorithms. It contains all the steps of the agent during one full training run in the form: (all input values, acceleration, steering angle, brake, loss, reward).
This study can serve as a base for more complex road scenarios. Furthermore, it can be extended to the field of computer vision, using images to find the best policy.
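Of the three algorithms named in this abstract, Q-learning is the simplest to illustrate. The sketch below shows the standard tabular update rule, Q(s, a) ← Q(s, a) + α [r + γ max Q(s', ·) − Q(s, a)]. It is not the authors' implementation: the state and action labels are hypothetical, and it assumes the continuous TORCS sensor readings have already been discretized into hashable states, which the abstract does not describe.

```python
from collections import defaultdict


def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning step and return the updated Q(state, action)."""
    # Value of the greedy action in the next state (0.0 for unseen pairs).
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


if __name__ == "__main__":
    # Hypothetical discrete actions and state labels, for illustration only.
    actions = ["accelerate", "brake", "steer_left", "steer_right"]
    q_table = defaultdict(float)
    value = q_learning_update(q_table, "on_track", "accelerate", 1.0,
                              "on_track_fast", actions, alpha=0.5, gamma=0.9)
    print(value)
```

DDPG and PPO replace the table with neural networks so that continuous actions such as steering angle can be handled directly, which a lookup table cannot do without discretization.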

Keywords: autonomous driving, DDPG (deep deterministic policy gradient), PPO (proximal policy optimization), reinforcement learning

Procedia PDF Downloads 119
3505 Markowitz and Implementation of a Multi-Objective Evolutionary Technique Applied to the Colombia Stock Exchange (2009-2015)

Authors: Feijoo E. Colomine Duran, Carlos E. Peñaloza Corredor

Abstract:

Modeling the selection of financial investment components (the portfolio) poses a variety of problems that can be addressed with optimization techniques under evolutionary schemes. By its nature, the problem of selecting investment components involves a dichotomous relationship between two opposing elements: the return of the portfolio and the risk incurred by choosing it. This relationship was modeled by Markowitz as a mean (return) - variance (risk) problem, i.e., return must be maximized and risk minimized. This research included the study and implementation of multi-objective evolutionary techniques to solve these problems, taking as its experimental framework the equities market of the Colombia Stock Exchange between 2009 and 2015. Three multiobjective evolutionary algorithms, namely the Nondominated Sorting Genetic Algorithm II (NSGA-II), the Strength Pareto Evolutionary Algorithm 2 (SPEA2), and the Indicator-Based Evolutionary Algorithm (IBEA), were compared using two well-known performance measures: the Hypervolume indicator and the R_2 indicator. A nonparametric statistical analysis based on the Wilcoxon rank-sum test was also carried out. The comparative analysis further includes an evaluation, through the Sharpe ratio, of the financial efficiency of the investment portfolios chosen by the various algorithms. It is shown that the portfolio provided by the algorithms mentioned above is very well positioned relative to the different stock indices provided by the Colombia Stock Exchange.
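The two Markowitz objectives and the Sharpe ratio mentioned in this abstract can be written down directly. The sketch below is a minimal illustration in plain Python, not the authors' code: the function names, the example weights, and the return and covariance figures are all made up for demonstration, and the risk-free rate defaults to zero as an assumption.

```python
from math import sqrt


def portfolio_return(weights, mean_returns):
    """Expected portfolio return: sum of w_i * mu_i (objective to maximize)."""
    return sum(w * r for w, r in zip(weights, mean_returns))


def portfolio_variance(weights, cov):
    """Portfolio variance w^T C w (objective to minimize)."""
    n = len(weights)
    return sum(weights[i] * cov[i][j] * weights[j]
               for i in range(n) for j in range(n))


def sharpe_ratio(weights, mean_returns, cov, risk_free=0.0):
    """Excess return per unit of volatility."""
    volatility = sqrt(portfolio_variance(weights, cov))
    if volatility == 0.0:
        raise ValueError("Sharpe ratio is undefined for a zero-variance portfolio")
    return (portfolio_return(weights, mean_returns) - risk_free) / volatility


if __name__ == "__main__":
    # Illustrative two-asset example with uncorrelated assets.
    weights = [0.5, 0.5]
    mean_returns = [0.10, 0.20]
    cov = [[0.04, 0.00], [0.00, 0.16]]
    print(portfolio_return(weights, mean_returns))
    print(portfolio_variance(weights, cov))
    print(sharpe_ratio(weights, mean_returns, cov))
```

A multi-objective evolutionary algorithm such as NSGA-II would treat the first two functions as competing objectives over the weight vector and return a set of trade-off portfolios rather than a single answer.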

Keywords: finance, optimization, portfolio, Markowitz, evolutionary algorithms

Procedia PDF Downloads 266
3504 A Study of Issues and Mitigations on Distributed Denial of Service and Medical Internet of Things Devices

Authors: Robin Singh, Jing-Chiou Liou

Abstract:

Internet of Things (IoT) devices are used heavily as part of our everyday routines. Through improved communication and automated procedures, their popularity has helped users raise the quality of their work. These devices are used in healthcare to better collect patients’ data for their treatment. They are generally considered safe and secure. However, loopholes may exist that manufacturers need to identify before attackers take advantage of them. For this study, we focused on two medical IoT devices: pacemakers and hearing aids. The aim of this paper is to identify whether there is any likelihood of these medical devices being hijacked and used as a botnet in Distributed Denial-of-Service attacks. Moreover, some mitigation strategies are being proposed to better secure

Keywords: cybersecurity, DDoS, IoT, medical devices

Procedia PDF Downloads 53
3503 Optimal Load Control Strategy in the Presence of Stochastically Dependent Renewable Energy Sources

Authors: Mahmoud M. Othman, Almoataz Y. Abdelaziz, Yasser G. Hegazy

Abstract:

This paper presents a load control strategy based on a modification of the Big Bang Big Crunch optimization method. The proposed strategy aims to determine the optimal load to be controlled and the corresponding time of control in order to minimize the energy purchased from the substation. The presented strategy helps the distribution network operator rely on renewable energy sources in supplying the system demand. The renewable energy sources used in the presented study are modeled using the diagonal band Copula method and the sequential Monte Carlo method in order to accurately consider the multivariate stochastic dependence between wind power, photovoltaic power, and the system demand. The proposed algorithms are implemented in the MATLAB environment and tested on the IEEE 37-node feeder. Several case studies are carried out, and the subsequent discussions show the effectiveness of the proposed algorithm.

Keywords: big bang big crunch, distributed generation, load control, optimization, planning

Procedia PDF Downloads 319
3502 An Efficient FPGA Realization of FIR Filter Using Distributed Arithmetic

Authors: M. Iruleswari, A. Jeyapaul Murugan

Abstract:

The Finite Impulse Response (FIR) filter is the most fundamental component in many Digital Signal Processing (DSP) applications because of its linear phase, stability, and regular structure. Designing a high-speed and hardware-efficient FIR filter is a very challenging task, as the complexity increases with the filter order. Most applications require higher order filters, but the memory usage of the filter increases exponentially with its order. Multipliers occupy a large chip area and need high computation time. Multiplier-less memory-based techniques have gained popularity over the past two decades due to their high throughput and reduced dynamic power consumption. This paper describes the design and implementation of a highly efficient Look-Up Table (LUT) based circuit that implements a FIR filter using the distributed arithmetic algorithm. The result is a multiplier-less FIR filter. For higher order filters, the LUT can be subdivided into a number of smaller LUTs to reduce memory usage. The performance of various filter orders with different address lengths is analyzed using the Xilinx 14.5 synthesis tool. The proposed design provides lower latency, lower memory usage, and high throughput.
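The distributed arithmetic idea in this abstract can be illustrated in software: the inner product of fixed coefficients with input samples is computed bit-serially from a precomputed table of coefficient sums, using only lookups, shifts, and additions. The sketch below is a behavioral model under stated assumptions, not the authors' FPGA design: it assumes integer coefficients and signed two's complement samples of a fixed bit width, and the function names are hypothetical.

```python
def build_lut(coeffs):
    """LUT[addr] = sum of coeffs[i] for every bit i that is set in addr."""
    taps = len(coeffs)
    return [
        sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
        for addr in range(1 << taps)
    ]


def da_inner_product(samples, lut, bits):
    """Compute sum(coeffs[i] * samples[i]) without multiplications.

    `samples` are signed integers that fit in `bits`-bit two's complement.
    """
    mask = (1 << bits) - 1
    acc = 0
    for b in range(bits):
        # Gather bit b of every sample into one LUT address.
        addr = 0
        for i, x in enumerate(samples):
            addr |= (((x & mask) >> b) & 1) << i
        term = lut[addr] << b
        # The most significant bit carries negative weight in two's complement.
        acc += -term if b == bits - 1 else term
    return acc


if __name__ == "__main__":
    coeffs = [1, -2, 3, 4]
    window = [5, -3, 7, 2]  # most recent input samples, one per tap
    lut = build_lut(coeffs)
    print(da_inner_product(window, lut, bits=8))
    print(sum(c * x for c, x in zip(coeffs, window)))
```

The table has 2^N entries for N taps, which is the exponential memory growth the abstract refers to. Splitting the taps into groups, each with its own smaller table whose outputs are added, trades a few extra adders for far less memory.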

Keywords: finite impulse response, distributed arithmetic, field programmable gate array, look-up table

Procedia PDF Downloads 432
3501 EnumTree: An Enumerative Biclustering Algorithm for DNA Microarray Data

Authors: Haifa Ben Saber, Mourad Elloumi

Abstract:

In a number of domains, such as DNA microarray data analysis, we need to cluster the rows (genes) and columns (conditions) of a data matrix simultaneously to identify groups of constant rows with a group of columns. This kind of clustering is called biclustering. Biclustering algorithms are used extensively in DNA microarray data analysis, and more effective biclustering algorithms are highly desirable. We introduce a new algorithm called Enumerative tree (EnumTree) for biclustering of binary microarray data. EnumTree is an algorithm that adopts the approach of enumerating biclusters. This algorithm extracts all biclusters of consistently good quality. The main idea of EnumTree is the construction of a new tree structure to adequately represent the different biclusters discovered during the enumeration process. This algorithm adopts the strategy of discovering all biclusters at a time. The performance of the proposed algorithm is assessed using both synthetic and real DNA microarray data; our algorithm outperforms other biclustering algorithms for binary microarray data. Biclusters with different numbers of rows. Moreover, we test the biological significance using a gene annotation web tool to show that our proposed method is able to produce biologically relevant biclusters.

Keywords: DNA microarray, biclustering, gene expression data, tree, data mining

Procedia PDF Downloads 344
3500 Image Encryption Using Eureqa to Generate an Automated Mathematical Key

Authors: Halima Adel Halim Shnishah, David Mulvaney

Abstract:

Applying traditional symmetric cryptography algorithms while computing encryption and decryption provides immunity to secret keys against different attacks. One popular technique for generating automated secret keys is evolutionary computing using the Eureqa API tool, which gained attention in 2013. In this paper, we generate automated secret keys for image encryption and decryption using the Eureqa API (a tool used in evolutionary computing). The Eureqa API models pseudo-random input data obtained from a suitable source to generate secret keys. The generated secret keys are validated by performing various statistical tests (histogram, chi-square, correlation of two adjacent pixels, correlation between original and encrypted images, entropy, and key sensitivity). Experimental results obtained from methods including histogram analysis, correlation coefficient, entropy, and key sensitivity show that the proposed image encryption algorithms are secure and reliable, with the potential to be adapted for secure image communication applications.

Keywords: image encryption algorithms, Eureqa, statistical measurements, automated key generation

Procedia PDF Downloads 455
3499 A Novel Guided Search Based Multi-Objective Evolutionary Algorithm

Authors: A. Baviskar, C. Sandeep, K. Shankar

Abstract:

Solving multi-objective optimization problems requires fast convergence and good spread. Though existing Evolutionary Algorithms (EAs) are able to achieve this, the computational effort can be reduced further by hybridizing them with innovative strategies. This study focuses on converging to the Pareto front faster while adopting the advantages of the Strength Pareto Evolutionary Algorithm-II (SPEA-II) for a better spread. Two different approaches based on optimizing the objective functions independently are implemented. In the first method, the decision variables corresponding to the optima of the individual objective functions are strategically used to guide the search towards the Pareto front. In the second method, boundary points of the Pareto front are calculated, and their decision variables are seeded into the initial population. Both methods are applied to different constrained and unconstrained multi-objective test functions. It is observed that the proposed guided search based algorithm gives better convergence and diversity than several well-known existing algorithms (such as NSGA-II and SPEA-II) in considerably fewer iterations.
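The Pareto front that this abstract refers to is the set of solutions not dominated by any other solution. The sketch below illustrates only that underlying concept, assuming every objective is to be minimized; it is not the guided search or seeding strategy proposed by the authors, and the function names are hypothetical.

```python
def dominates(a, b):
    """True if objective vector `a` Pareto-dominates `b` (minimization).

    `a` must be no worse than `b` in every objective and strictly better in one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points):
    """Return the points that no other point dominates (quadratic-time filter)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


if __name__ == "__main__":
    candidates = [(1, 5), (2, 2), (3, 3), (5, 1)]
    print(non_dominated(candidates))
```

In this example, (3, 3) is removed because (2, 2) is better in both objectives, while the remaining three points are mutually incomparable trade-offs. The two extreme points, (1, 5) and (5, 1), correspond to the kind of boundary points the abstract describes seeding into the initial population.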

Keywords: boundary points, evolutionary algorithms (EAs), guided search, strength Pareto evolutionary algorithm-II (SPEA-II)

Procedia PDF Downloads 242
3498 Inferential Reasoning for Heterogeneous Multi-Agent Mission

Authors: Sagir M. Yusuf, Chris Baber

Abstract:

We describe issues bedeviling the coordination of heterogeneous multi-agent missions (in which agents carry different sensors), such as belief conflict and situation reasoning. We applied Bayesian inferential reasoning and reasoning over agents' presumptions to address these issues of belief variation and situation-based reasoning among heterogeneous agents. A Bayesian Belief Network (BBN) was used to model the agents' belief conflicts due to sensor variations. Simulation experiments were designed, and cases from agents’ missions were used to train the BBN using gradient descent and expectation-maximization algorithms. The output network is a well-trained BBN for making inferences for both agents and human experts. We claim that the prediction capacity of the Bayesian learning algorithm improves with the amount of training data, and we argue that it enhances multi-agent robustness and resolves agents’ sensor conflicts.

Keywords: distributed constraint optimization problem, multi-agent system, multi-robot coordination, autonomous system, swarm intelligence

Procedia PDF Downloads 115
3497 Feature Weighting Comparison Based on Clustering Centers in the Detection of Diabetic Retinopathy

Authors: Kemal Polat

Abstract:

In this paper, three feature weighting methods have been used to improve the classification performance for diabetic retinopathy (DR). To classify diabetic retinopathy, features extracted from the output of several retinal image processing algorithms, such as image-level, lesion-specific, and anatomical components, have been fed into the classifier algorithms. The dataset used in this study has been taken from the University of California, Irvine (UCI) machine learning repository. Three feature weighting methods, namely fuzzy c-means clustering based feature weighting, subtractive clustering based feature weighting, and Gaussian mixture clustering based feature weighting, have been used and compared with each other in the classification of DR. After feature weighting, five different classifier algorithms have been used: multi-layer perceptron (MLP), k-nearest neighbor (k-NN), decision tree, support vector machine (SVM), and Naïve Bayes. The hybrid method combining subtractive clustering based feature weighting with the decision tree classifier achieved a classification accuracy of 100% in the screening of DR. These results demonstrate that the proposed hybrid scheme is very promising for medical data set classification.

Keywords: machine learning, data weighting, classification, data mining

Procedia PDF Downloads 301
3496 A Method for Compression of Short Unicode Strings

Authors: Masoud Abedi, Abbas Malekpour, Peter Luksch, Mohammad Reza Mojtabaei

Abstract:

The use of short texts in communication has been increasing greatly in recent years. The use of different languages in short texts has made Unicode strings compulsory. These strings need twice the space of common strings; hence, applying compression algorithms to accelerate transmission and reduce cost is worthwhile. However, compression methods such as gzip, bzip2, or PAQ are not appropriate because of their high overhead data size. The Huffman algorithm is one of the rare algorithms that is effective in reducing the size of short Unicode strings. In this paper, an algorithm is proposed for the compression of very short Unicode strings. First, every new character to be sent to a destination is inserted into the proposed mapping table. At the beginning, every character is new. If a character is repeated for the same destination, it is not considered a new character. Next, the new characters, together with the mapping values of repeated characters, are arranged through a specific technique and specially formatted for transmission. The results obtained from an assessment on a set of short Persian and Arabic strings indicate that the proposed algorithm outperforms the Huffman algorithm in size reduction.
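The mapping-table idea in this abstract can be sketched roughly as follows: the sender keeps one table per destination, sends a character literally the first time it appears, and sends its table index on every later occurrence. The abstract does not specify the arrangement technique or the transmitted format, so the token layout below (`"new"` and `"ref"` tuples) and the function names are purely illustrative assumptions, not the authors' encoding.

```python
def encode(text, table):
    """Encode `text` using the per-destination `table` (char -> index).

    The table is updated in place so later messages can reuse earlier entries.
    """
    tokens = []
    for ch in text:
        if ch in table:
            tokens.append(("ref", table[ch]))   # repeated character: send index
        else:
            table[ch] = len(table)              # new character: register it
            tokens.append(("new", ch))
    return tokens


def decode(tokens, table):
    """Rebuild the text; `table` is the receiver's list of characters seen so far."""
    out = []
    for kind, value in tokens:
        if kind == "new":
            table.append(value)
            out.append(value)
        else:
            out.append(table[value])
    return "".join(out)


if __name__ == "__main__":
    sender_table, receiver_table = {}, []
    for message in ["سلام", "سلام دنیا"]:
        tokens = encode(message, sender_table)
        print(tokens)
        print(decode(tokens, receiver_table))
```

Because both sides grow their tables in the same order, a small index can stand in for a multi-byte Unicode character in every later message to the same destination. A real format would also need to pack the tokens into bits, which is where the actual size reduction would come from.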

Keywords: Algorithms, Data Compression, Decoding, Encoding, Huffman Codes, Text Communication

Procedia PDF Downloads 320
3495 The Clustering of Multiple Sclerosis Subgroups through L2 Norm Multifractal Denoising Technique

Authors: Yeliz Karaca, Rana Karabudak

Abstract:

Multifractal denoising techniques are used to identify significant attributes by removing noise from a dataset. Magnetic resonance (MR) imaging is the most sensitive method for identifying chronic disorders of the nervous system such as Multiple Sclerosis (MS). MRI and Expanded Disability Status Scale (EDSS) data belonging to 120 individuals who have one of the subgroups of MS (Relapsing Remitting MS (RRMS), Secondary Progressive MS (SPMS), Primary Progressive MS (PPMS)), as well as 19 healthy individuals in the control group, have been used in this study. The study comprises the following stages: (i) The L2 Norm Multifractal Denoising technique, one of the multifractal techniques, has been applied to the MS data (MRI and EDSS), yielding a new dataset. (ii) The new MS dataset obtained with the L2 Norm Multifractal Denoising technique has been given as input to the K-Means and Fuzzy C Means (FCM) clustering algorithms, which are unsupervised methods, and their clustering performances have been compared. (iii) Identifying the significant attributes in the MS dataset through the L2 Norm Multifractal Denoising technique and then applying the K-Means and FCM algorithms to the MS subgroups and the control group of healthy individuals has yielded excellent performance. According to the clustering results for the MS subgroups, both K-Means and FCM cluster successfully when the L2 Norm Multifractal Denoising technique is applied to the MS dataset. Clustering with K-Means and FCM has been more successful on the dataset whose significant attributes were obtained by L2 Norm denoising (L2_Norm MS Data Set).

Keywords: clinical decision support, clustering algorithms, multiple sclerosis, multifractal techniques

Procedia PDF Downloads 133
3494 On the Application of Heuristics of the Traveling Salesman Problem for the Task of Restoring the DNA Matrix

Authors: Boris Melnikov, Dmitrii Chaikovskii, Elena Melnikova

Abstract:

The traveling salesman problem (TSP) is a well-known optimization problem that seeks to find the shortest possible route that visits a set of points and returns to the starting point. In this paper, we apply some heuristics of the TSP for the task of restoring the DNA matrix. This restoration problem is often considered in biocybernetics. For it, we must recover the matrix of distances between DNA sequences if not all the elements of the matrix under consideration are known at the input. We consider the possibility of using this method in the testing of distance calculation algorithms between a pair of DNAs to restore the partially filled matrix.
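
The abstract does not say which TSP heuristics are applied, so the following is only a sketch of one classic heuristic, nearest neighbour, operating on a fully known distance matrix. The function names and the small matrix in the usage note are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Sequence


def nearest_neighbour_tour(dist: Sequence[Sequence[float]], start: int = 0) -> List[int]:
    """Greedy tour: from each point, move to the closest unvisited point."""
    tour = [start]
    unvisited = set(range(len(dist))) - {start}
    while unvisited:
        here = tour[-1]
        # Ties are broken by the smaller index to keep the result deterministic.
        nxt = min(unvisited, key=lambda j: (dist[here][j], j))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def tour_length(dist: Sequence[Sequence[float]], tour: Sequence[int]) -> float:
    """Length of the closed route, including the return to the start."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))
```

For the symmetric matrix `[[0, 1, 4, 3], [1, 0, 2, 5], [4, 2, 0, 1], [3, 5, 1, 0]]`, starting at point 0 gives the tour `[0, 1, 2, 3]` with a closed length of 7. A greedy tour like this is not guaranteed to be the shortest one.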

Keywords: optimization problems, DNA matrix, partially filled matrix, traveling salesman problem, heuristic algorithms

Procedia PDF Downloads 120
3493 Distributed Processing for Content Based Lecture Video Retrieval on Hadoop Framework

Authors: U. S. N. Raju, Kothuri Sai Kiran, Meena G. Kamal, Vinay Nikhil Pabba, Suresh Kanaparthi

Abstract:

There is a huge amount of lecture video data available for public use, and many more lecture videos are being created and uploaded every day. Searching for videos on required topics in this huge database is a challenging task. Therefore, an efficient method for video retrieval is needed. An approach for automated video indexing and video search in large lecture video archives is presented. As the amount of lecture video data is huge, it is very inefficient to do the processing in a centralized computation framework. Hence, the Hadoop framework for distributed computing on Big Video Data is used. The first step in the process is automatic video segmentation and key-frame detection to offer a visual guideline for video content navigation. In the next step, we extract textual metadata by applying video Optical Character Recognition (OCR) technology to the key-frames. The OCR output and the detected slide text line types are used for keyword extraction, by which both video-level and segment-level keywords are extracted for content-based video browsing and search. The performance of the indexing process can be improved for a large database by using distributed computing on the Hadoop framework.

Keywords: video lectures, big video data, video retrieval, hadoop

Procedia PDF Downloads 493
3492 Data Mining in Medicine Domain Using Decision Trees and Vector Support Machine

Authors: Djamila Benhaddouche, Abdelkader Benyettou

Abstract:

In this paper, we use data mining to extract biomedical knowledge. In general, complex biomedical data collected in population studies are treated by statistical methods; although these are robust, they are not sufficient in themselves to harness the potential wealth of the data. To that end, we used two learning algorithms: Decision Trees and Support Vector Machine (SVM). These supervised classification methods are used to diagnose thyroid disease. In this context, we propose to promote the study and use of symbolic data mining techniques.

Keywords: biomedical data, learning, classifier, algorithms decision tree, knowledge extraction

Procedia PDF Downloads 513
3491 Predication Model for Leukemia Diseases Based on Data Mining Classification Algorithms with Best Accuracy

Authors: Fahd Sabry Esmail, M. Badr Senousy, Mohamed Ragaie

Abstract:

In recent years, there has been an explosion in the use of technology that helps in discovering diseases. For example, DNA microarrays allow us for the first time to obtain a "global" view of the cell. They have great potential to provide accurate medical diagnosis and to help in finding the right treatment and cure for many diseases. Various classification algorithms can be applied to such microarray datasets to devise methods that can predict the occurrence of leukemia. In this study, we compared the classification accuracy and response time of eleven decision tree methods and six rule classifier methods using five performance criteria. The experimental results show that Random Tree produces better results and also takes the least time to build a model among the tree classifiers. Among the classification rule algorithms, the nearest-neighbor-like algorithm (NNge) is the best, owing to its high accuracy and the least time taken to build a model.

Keywords: data mining, classification techniques, decision tree, classification rule, leukemia diseases, microarray data

Procedia PDF Downloads 292
3490 MapReduce Logistic Regression Algorithms with RHadoop

Authors: Byung Ho Jung, Dong Hoon Lim

Abstract:

Logistic regression is a statistical method for analyzing a dataset in which one or more independent variables determine an outcome. Logistic regression is used extensively in numerous disciplines, including the medical and social science fields. In this paper, we address the problem of estimating the parameters of logistic regression on the MapReduce framework with RHadoop, which integrates the R and Hadoop environments and is applicable to large-scale data. There exist three learning algorithms for logistic regression, namely the gradient descent method, the cost minimization method, and the Newton-Raphson method. The Newton-Raphson method does not require a learning rate, while the gradient descent and cost minimization methods require a manually chosen learning rate. The experimental results demonstrated that our learning algorithms using RHadoop scale well and efficiently process large data sets on commodity hardware. We also compared the performance of our Newton-Raphson method with the gradient descent and cost minimization methods. The results showed that our Newton-Raphson method appeared to be the most robust on all data tested.
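
The contrast between the two update rules can be made concrete with a small sketch. It is written in plain Python rather than RHadoop, handles only one feature plus an intercept, and uses invented toy data; the names `map_chunk`, `reduce_chunks`, and `CHUNKS` are illustrative assumptions. The point it tries to show is that the gradient and Hessian are sums over records, so each data chunk can be processed independently (a map step) and the partial sums added afterwards (a reduce step).

```python
import math
from typing import List, Sequence, Tuple

Record = Tuple[float, int]  # (feature x, label y in {0, 1})


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def map_chunk(chunk: Sequence[Record], b0: float, b1: float) -> Tuple[float, ...]:
    """Map step: gradient (g0, g1) and Hessian (h00, h01, h11) sums of one chunk."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for x, y in chunk:
        p = sigmoid(b0 + b1 * x)
        w = p * (1.0 - p)
        g0 += p - y
        g1 += (p - y) * x
        h00 += w
        h01 += w * x
        h11 += w * x * x
    return g0, g1, h00, h01, h11


def reduce_chunks(parts: Sequence[Tuple[float, ...]]) -> Tuple[float, ...]:
    """Reduce step: add the per-chunk sums component by component."""
    return tuple(sum(values) for values in zip(*parts))


def gradient_step(chunks, b0: float, b1: float, learning_rate: float):
    """Gradient descent: the step size must be chosen by hand."""
    g0, g1, _, _, _ = reduce_chunks([map_chunk(c, b0, b1) for c in chunks])
    return b0 - learning_rate * g0, b1 - learning_rate * g1


def newton_step(chunks, b0: float, b1: float):
    """Newton-Raphson: the inverse Hessian scales the step, so no learning rate."""
    g0, g1, h00, h01, h11 = reduce_chunks([map_chunk(c, b0, b1) for c in chunks])
    det = h00 * h11 - h01 * h01
    d0 = (h11 * g0 - h01 * g1) / det
    d1 = (h00 * g1 - h01 * g0) / det
    return b0 - d0, b1 - d1


# Toy data split into two chunks, standing in for two data partitions.
CHUNKS: List[List[Record]] = [
    [(0.0, 0), (1.0, 0), (2.0, 1)],
    [(1.0, 1), (2.0, 0), (3.0, 1), (4.0, 1)],
]


def fit_newton(chunks, iterations: int = 25):
    b0 = b1 = 0.0
    for _ in range(iterations):
        b0, b1 = newton_step(chunks, b0, b1)
    return b0, b1


def fit_gradient_descent(chunks, learning_rate: float = 0.05, iterations: int = 5000):
    b0 = b1 = 0.0
    for _ in range(iterations):
        b0, b1 = gradient_step(chunks, b0, b1, learning_rate)
    return b0, b1
```

On this toy data both fits are expected to approach the same coefficients, with Newton-Raphson needing far fewer iterations. The closed-form 2x2 solve is a simplification that only works for two parameters; a general implementation would solve a linear system.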

Keywords: big data, logistic regression, MapReduce, RHadoop

Procedia PDF Downloads 245
3489 Parametric Analysis of Lumped Devices Modeling Using Finite-Difference Time-Domain

Authors: Felipe M. de Freitas, Icaro V. Soares, Lucas L. L. Fortes, Sandro T. M. Gonçalves, Úrsula D. C. Resende

Abstract:

SPICE-based simulators are quite robust and widely used for the simulation of electronic circuits. Their algorithms support linear and non-linear lumped components, and they can handle a large number of encapsulated elements. Despite the great potential of SPICE-based simulators in the analysis of quasi-static electromagnetic field interaction, that is, at low frequency, they are limited when applied to microwave hybrid circuits in which there are both lumped and distributed elements. Usually, the spatial discretization of the FDTD (Finite-Difference Time-Domain) method is done according to the actual size of the element under analysis. After spatial discretization, the Courant Stability Criterion determines the maximum time step accepted for that spatial discretization and for the propagation velocity of the wave. This criterion guarantees the stability conditions for the leapfrogging of the Yee algorithm; however, it is known that the stability of the complete FDTD procedure depends on factors other than just the stability of the Yee algorithm, because the FDTD program needs other algorithms in order to be useful in engineering problems. Examples of these algorithms are Absorbing Boundary Conditions (ABCs), excitation sources, subcellular techniques, lumped elements, and non-uniform or non-orthogonal meshes. In this work, the influence of the stability of the FDTD method on the modeling of lumped elements such as resistive sources, resistors, capacitors, inductors, and diodes is evaluated. This paper therefore proposes the electromagnetic modeling of electronic components in order to create models that satisfy the needs of circuit simulations over ultra-wide frequency ranges.
The models of the resistive source, the resistor, the capacitor, the inductor, and the diode, among the mathematical models for lumped components in the LE-FDTD method (Lumped-Element Finite-Difference Time-Domain), are evaluated through a parametric analysis of the size of the Yee cells that discretize the lumped components. In this way, an ideal cell size is sought so that the analysis in the FDTD environment is in closer agreement with the expected circuit behavior while maintaining the stability conditions of the method. Based on the mathematical models and the theoretical basis of the required extensions of the FDTD method, the computational implementation of the models is carried out in the Matlab® environment. The Mur boundary condition is used as the absorbing boundary of the FDTD method. The model is validated by comparing the results obtained by the FDTD method, through the electric field values and the currents in the components, with the analytical results using circuit parameters.
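
For reference, the Courant limit mentioned above can be computed directly for a three-dimensional Yee grid, where the time step must satisfy dt <= 1 / (v * sqrt(1/dx^2 + 1/dy^2 + 1/dz^2)). The sketch below is in Python rather than the Matlab used by the authors; the function name, the `safety` factor, and the 1 mm cell in the usage note are illustrative assumptions.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def courant_time_step(dx: float, dy: float, dz: float,
                      velocity: float = C0, safety: float = 1.0) -> float:
    """Largest stable time step for a 3D Yee grid, scaled by a safety factor <= 1."""
    if not 0.0 < safety <= 1.0:
        raise ValueError("safety must be in (0, 1]")
    inverse_cell = math.sqrt(1.0 / dx**2 + 1.0 / dy**2 + 1.0 / dz**2)
    return safety / (velocity * inverse_cell)
```

For a cubic cell of 1 mm this reduces to `dx / (C0 * sqrt(3))`, so halving the cell size halves the admissible time step. That coupling is one reason a parametric study of the cell size also changes the cost of the simulation.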

Keywords: hybrid circuits, LE-FDTD, lumped element, parametric analysis

Procedia PDF Downloads 123
3488 Safe and Scalable Framework for Participation of Nodes in Smart Grid Networks in a P2P Exchange of Short-Term Products

Authors: Maciej Jedrzejczyk, Karolina Marzantowicz

Abstract:

Over the last few years, the traditional utility value chain has been transforming into unbundled markets. Increased distributed generation of energy is one of the considerable challenges faced by Smart Grid networks. New sources of energy introduce volatile demand response, which has a considerable impact on traditional middlemen in the E&U market. The purpose of this research is to search for ways to allow near-real-time electricity markets to transact with surplus energy based on accurate, time-synchronous measurements. The proposed framework evaluates the use of secure peer-to-peer (P2P) communication and distributed transaction ledgers to provide a flat hierarchy and to allow real-time insights into present and forecasted grid operations, as well as the state and health of the network. The objective is to achieve dynamic grid operations with more efficient resource usage, higher security of supply, and a longer grid infrastructure life cycle. The methods used for this study are based on a comparative analysis of different distributed ledger technologies in terms of scalability, transaction performance, pluggability with external data sources, data transparency, privacy, end-to-end security, and adaptability to various market topologies. The intended output of this research is the design of a framework for a safer, more efficient, and scalable Smart Grid network that bridges the gap between traditional components of the energy network and individual energy producers. The results of this study are ready for detailed measurement testing, a likely follow-up in separate studies. New platforms for Smart Grids that achieve measurable efficiencies will allow for the development of new types of Grid KPIs, multi-smart-grid branches, markets, and businesses.

Keywords: autonomous agents, Distributed computing, distributed ledger technologies, large scale systems, micro grids, peer-to-peer networks, Self-organization, self-stabilization, smart grids

Procedia PDF Downloads 268
3487 Pareto System of Optimal Placement and Sizing of Distributed Generation in Radial Distribution Networks Using Particle Swarm Optimization

Authors: Sani M. Lawal, Idris Musa, Aliyu D. Usman

Abstract:

This paper adopts the Pareto approach from multi-objective optimization, in which the optimum is a set of solutions in the search space. The paper aims to present an optimal placement of Distributed Generation (DG) in radial distribution networks, with an optimal size, for minimizing power loss and voltage deviation as well as maximizing the voltage profile of the networks. These problems are formulated using particle swarm optimization (PSO) as a constrained nonlinear optimization problem with both the locations and the sizes of DG being continuous. The objective functions adopted are the total active power loss function and the voltage deviation function. The multi-objective nature of the problem made it necessary to form a multi-objective function in search of a solution that consists of both the DG location and size. The proposed PSO algorithm is used to determine the optimal placement and size of DG in a distribution network. The output indicates that the PSO technique has an edge over other types of search methods due to its effectiveness and computational efficiency. The proposed method is tested on the standard IEEE 34-bus system and validated with the 33-bus test distribution network. Results indicate that the sizing and location of DG are system dependent and should be optimally selected before installing the distributed generators in the system, and that an improvement in the voltage profile and a reduction in power loss have been achieved.
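
The notion of a Pareto set for the two objectives named above can be sketched as a simple non-dominated filter. The sketch does not implement PSO or any power-flow calculation; the candidate values in the usage note are invented numbers, and the names `dominates` and `pareto_front` are illustrative assumptions.

```python
from typing import List, Tuple

# (power loss, voltage deviation); both objectives are to be minimized.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep every candidate that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]
```

For the invented candidates `[(0.20, 0.05), (0.15, 0.08), (0.25, 0.04), (0.22, 0.09), (0.15, 0.10)]`, the filter keeps the first three: `(0.22, 0.09)` is dominated by `(0.20, 0.05)`, and `(0.15, 0.10)` is dominated by `(0.15, 0.08)`. In a PSO setting, a filter of this kind could be applied to the objective values of the particles.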

Keywords: distributed generation, pareto, particle swarm optimization, power loss, voltage deviation

Procedia PDF Downloads 328
3486 A Research Using Remote Monitoring Technology for Pump Output Monitoring in Distributed Fuel Stations in Nigeria

Authors: Ofoegbu Ositadinma Edward

Abstract:

This research paper discusses a web-based monitoring system that enables effective monitoring of fuel pump output and sales volume from distributed fuel stations under the domain of a single company or organization. The traditional method of operation used by these organizations in Nigeria is non-automated, and accounting for dispensed product is usually approximate and manual, as there is little or no technology in place to provide information on the state of affairs in the station, both to on-ground staff and to supervisory staff who are not physically present in the station. This results in unaccountable losses in product and revenue as well as slow decision making. Remote monitoring technology, a vast research field with numerous application areas incorporating various data collation techniques and sensor networks, can be applied to reliably provide information on fuel pump status in distributed fuel stations. The proposed system relies upon a microcontroller, a keypad, and a pump to emulate the traditional fuel dispenser. A web-enabled PC with an accompanying graphical user interface (GUI), designed using Visual Basic, is connected to the microcontroller via the serial port and provides the web implementation.

Keywords: fuel pump, microcontroller, GUI, web

Procedia PDF Downloads 404