Search results for: Parallel computation.
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 989

Search results for: Parallel computation.

899 Detecting the Edge of Multiple Images in Parallel

Authors: Prakash K. Aithal, U. Dinesh Acharya, Rajesh Gopakumar

Abstract:

Edge is variation of brightness in an image. Edge detection is useful in many application areas such as finding forests, rivers from a satellite image, detecting broken bone in a medical image etc. The paper discusses about finding edge of multiple aerial images in parallel. The proposed work tested on 38 images 37 colored and one monochrome image. The time taken to process N images in parallel is equivalent to time taken to process 1 image in sequential. Message Passing Interface (MPI) and Open Computing Language (OpenCL) is used to achieve task and pixel level parallelism respectively.

Keywords: Edge detection, multicore, GPU, openCL, MPI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2339
898 Analysis of Long-Term File System Activities on Cluster Systems

Authors: Hyeyoung Cho, Sungho Kim, Sik Lee

Abstract:

I/O workload is a critical and important factor to analyze I/O pattern and to maximize file system performance. However to measure I/O workload on running distributed parallel file system is non-trivial due to collection overhead and large volume of data. In this paper, we measured and analyzed file system activities on two large-scale cluster systems which had TFlops level high performance computation resources. By comparing file system activities of 2009 with those of 2006, we analyzed the change of I/O workloads by the development of system performance and high-speed network technology.

Keywords: I/O workload, Lustre, GPFS, Cluster File System

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1461
897 Recursive Wiener-Khintchine Theorem

Authors: Khalid M. Aamir, Mohammad A. Maud

Abstract:

Power Spectral Density (PSD) computed by taking the Fourier transform of auto-correlation functions (Wiener-Khintchine Theorem) gives better result, in case of noisy data, as compared to the Periodogram approach. However, the computational complexity of Wiener-Khintchine approach is more than that of the Periodogram approach. For the computation of short time Fourier transform (STFT), this problem becomes even more prominent where computation of PSD is required after every shift in the window under analysis. In this paper, recursive version of the Wiener-Khintchine theorem has been derived by using the sliding DFT approach meant for computation of STFT. The computational complexity of the proposed recursive Wiener-Khintchine algorithm, for a window size of N, is O(N).

Keywords: Power Spectral Density (PSD), Wiener-KhintchineTheorem, Periodogram, Short Time Fourier Transform (STFT), TheSliding DFT.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2484
896 An Improved Method to Compute Sparse Graphs for Traveling Salesman Problem

Authors: Y. Wang

Abstract:

The Traveling salesman problem (TSP) is NP-hard in combinatorial optimization. The research shows the algorithms for TSP on the sparse graphs have the shorter computation time than those for TSP according to the complete graphs. We present an improved iterative algorithm to compute the sparse graphs for TSP by frequency graphs computed with frequency quadrilaterals. The iterative algorithm is enhanced by adjusting two parameters of the algorithm. The computation time of the algorithm is O(CNmaxn2) where C is the iterations, Nmax is the maximum number of frequency quadrilaterals containing each edge and n is the scale of TSP. The experimental results showed the computed sparse graphs generally have less than 5n edges for most of these Euclidean instances. Moreover, the maximum degree and minimum degree of the vertices in the sparse graphs do not have much difference. Thus, the computation time of the methods to resolve the TSP on these sparse graphs will be greatly reduced.

Keywords: Frequency quadrilateral, iterative algorithm, sparse graph, traveling salesman problem.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1010
895 Singularity Loci of Actuation Schemes for 3RRR Planar Parallel Manipulator

Authors: S. Ramana Babu, V. Ramachandra Raju, K. Ramji

Abstract:

This paper presents the effect of actuation schemes on the performance of parallel manipulators and also how the singularity loci have been changed in the reachable workspace of the manipulator with the choice of actuation scheme to drive the manipulator. The performance of the eight possible actuation schemes of 3RRR planar parallel manipulator is compared with each other. The optimal design problem is formulated to find the manipulator geometry that maximizes the singularity free conditioned workspace for all the eight actuation cases, the optimization problem is solved by using genetic algorithms.

Keywords: Actuation schemes, GCI, genetic algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1625
894 Unrelated Parallel Machines Scheduling Problem Using an Ant Colony Optimization Approach

Authors: Y. K. Lin, H. T. Hsieh, F. Y. Hsieh

Abstract:

Total weighted tardiness is a measure of customer satisfaction. Minimizing it represents satisfying the general requirement of on-time delivery. In this research, we consider an ant colony optimization (ACO) algorithm to solve the problem of scheduling unrelated parallel machines to minimize total weighted tardiness. The problem is NP-hard in the strong sense. Computational results show that the proposed ACO algorithm is giving promising results compared to other existing algorithms.

Keywords: ant colony optimization, total weighted tardiness, unrelated parallel machines.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1890
893 Bounds on Reliability of Parallel Computer Interconnection Systems

Authors: Ranjan Kumar Dash, Chita Ranjan Tripathy

Abstract:

The evaluation of residual reliability of large sized parallel computer interconnection systems is not practicable with the existing methods. Under such conditions, one must go for approximation techniques which provide the upper bound and lower bound on this reliability. In this context, a new approximation method for providing bounds on residual reliability is proposed here. The proposed method is well supported by two algorithms for simulation purpose. The bounds on residual reliability of three different categories of interconnection topologies are efficiently found by using the proposed method

Keywords: Parallel computer network, reliability, probabilisticgraph, interconnection networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1343
892 Minimizing Makespan Subject to Budget Limitation in Parallel Flow Shop

Authors: Amin Sahraeian

Abstract:

One of the criteria in production scheduling is Make Span, minimizing this criteria causes more efficiently use of the resources specially machinery and manpower. By assigning some budget to some of the operations the operation time of these activities reduces and affects the total completion time of all the operations (Make Span). In this paper this issue is practiced in parallel flow shops. At first we convert parallel flow shop to a network model and by using a linear programming approach it is identified in order to minimize make span (the completion time of the network) which activities (operations) are better to absorb the predetermined and limited budget. Minimizing the total completion time of all the activities in the network is equivalent to minimizing make span in production scheduling.

Keywords: parallel flow shop, make span, linear programming, budget

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1779
891 Aerodynamic Coefficients Prediction from Minimum Computation Combinations Using OpenVSP Software

Authors: Marine Segui, Ruxandra Mihaela Botez

Abstract:

OpenVSP is an aerodynamic solver developed by National Aeronautics and Space Administration (NASA) that allows building a reliable model of an aircraft. This software performs an aerodynamic simulation according to the angle of attack of the aircraft makes between the incoming airstream, and its speed. A reliable aerodynamic model of the Cessna Citation X was designed but it required a lot of computation time. As a consequence, a prediction method was established that allowed predicting lift and drag coefficients for all Mach numbers and for all angles of attack, exclusively for stall conditions, from a computation of three angles of attack and only one Mach number. Aerodynamic coefficients given by the prediction method for a Cessna Citation X model were finally compared with aerodynamics coefficients obtained using a complete OpenVSP study.

Keywords: Aerodynamic, coefficient, cruise, improving, longitudinal, OpenVSP, solver, time.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1438
890 Design and Implementation of Shared Memory based Parallel File System Logging Method for High Performance Computing

Authors: Hyeyoung Cho, Sungho Kim, SangDong Lee

Abstract:

I/O workload is a critical and important factor to analyze I/O pattern and file system performance. However tracing I/O operations on the fly distributed parallel file system is non-trivial due to collection overhead and a large volume of data. In this paper, we design and implement a parallel file system logging method for high performance computing using shared memory-based multi-layer scheme. It minimizes the overhead with reduced logging operation response time and provides efficient post-processing scheme through shared memory. Separated logging server can collect sequential logs from multiple clients in a cluster through packet communication. Implementation and evaluation result shows low overhead and high scalability of this architecture for high performance parallel logging analysis.

Keywords: I/O workload, PVFS, I/O Trace.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1560
889 Novel Ridge Orientation Based Approach for Fingerprint Identification Using Co-Occurrence Matrix

Authors: Mehran Yazdi, Zahra Adelpour, Batoul Bahraini, Yasaman Keshtkar Jahromi

Abstract:

In this paper we use the property of co-occurrence matrix in finding parallel lines in binary pictures for fingerprint identification. In our proposed algorithm, we reduce the noise by filtering the fingerprint images and then transfer the fingerprint images to binary images using a proper threshold. Next, we divide the binary images into some regions having parallel lines in the same direction. The lines in each region have a specific angle that can be used for comparison. This method is simple, performs the comparison step quickly and has a good resistance in the presence of the noise.

Keywords: Parallel lines detection, co-occurrence matrix, fingerprint identification.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1357
888 Fast Painting with Different Colors Using Cross Correlation in the Frequency Domain

Authors: Hazem M. El-Bakry

Abstract:

In this paper, a new technique for fast painting with different colors is presented. The idea of painting relies on applying masks with different colors to the background. Fast painting is achieved by applying these masks in the frequency domain instead of spatial (time) domain. New colors can be generated automatically as a result from the cross correlation operation. This idea was applied successfully for faster specific data (face, object, pattern, and code) detection using neural algorithms. Here, instead of performing cross correlation between the input input data (e.g., image, or a stream of sequential data) and the weights of neural networks, the cross correlation is performed between the colored masks and the background. Furthermore, this approach is developed to reduce the computation steps required by the painting operation. The principle of divide and conquer strategy is applied through background decomposition. Each background is divided into small in size subbackgrounds and then each sub-background is processed separately by using a single faster painting algorithm. Moreover, the fastest painting is achieved by using parallel processing techniques to paint the resulting sub-backgrounds using the same number of faster painting algorithms. In contrast to using only faster painting algorithm, the speed up ratio is increased with the size of the background when using faster painting algorithm and background decomposition. Simulation results show that painting in the frequency domain is faster than that in the spatial domain.

Keywords: Fast Painting, Cross Correlation, Frequency Domain, Parallel Processing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1795
887 Parallel Vector Processing Using Multi Level Orbital DATA

Authors: Nagi Mekhiel

Abstract:

Many applications use vector operations by applying single instruction to multiple data that map to different locations in conventional memory. Transferring data from memory is limited by access latency and bandwidth affecting the performance gain of vector processing. We present a memory system that makes all of its content available to processors in time so that processors need not to access the memory, we force each location to be available to all processors at a specific time. The data move in different orbits to become available to other processors in higher orbits at different time. We use this memory to apply parallel vector operations to data streams at first orbit level. Data processed in the first level move to upper orbit one data element at a time, allowing a processor in that orbit to apply another vector operation to deal with serial code limitations inherited in all parallel applications and interleaved it with lower level vector operations.

Keywords: Memory organization, parallel processors, serial code, vector processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1062
886 A Method to Annotate Programs with High-Level Knowledge of Computation

Authors: Nobuhiko Hishinuma, Jun Igari, Rentaro Yoshioka

Abstract:

When programming in languages such as C, Java, etc., it is difficult to reconstruct the programmer's ideas only from the program code. This occurs mainly because, much of the programmer's ideas behind the implementation are not recorded in the code during implementation. For example, physical aspects of computation such as spatial structures, activities, and meaning of variables are not required as instructions to the computer and are often excluded. This makes the future reconstruction of the original ideas difficult. AIDA, which is a multimedia programming language based on the cyberFilm model, can solve these problems allowing to describe ideas behind programs using advanced annotation methods as a natural extension to programming. In this paper, a development environment that implements the AIDA language is presented with a focus on the annotation methods. In particular, an actual scientific numerical computation code is created and the effects of the annotation methods are analyzed.

Keywords: cyberFilm, development environment, knowledge engineering, multimedia programming language

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1282
885 A Survey on Performance Tools for OpenMP

Authors: Mubarak S. Mohsen, Rosni Abdullah, Yong M. Teo

Abstract:

Advances in processors architecture, such as multicore, increase the size of complexity of parallel computer systems. With multi-core architecture there are different parallel languages that can be used to run parallel programs. One of these languages is OpenMP which embedded in C/Cµ or FORTRAN. Because of this new architecture and the complexity, it is very important to evaluate the performance of OpenMP constructs, kernels, and application program on multi-core systems. Performance is the activity of collecting the information about the execution characteristics of a program. Performance tools consists of at least three interfacing software layers, including instrumentation, measurement, and analysis. The instrumentation layer defines the measured performance events. The measurement layer determines what performance event is actually captured and how it is measured by the tool. The analysis layer processes the performance data and summarizes it into a form that can be displayed in performance tools. In this paper, a number of OpenMP performance tools are surveyed, explaining how each is used to collect, analyse, and display data collection.

Keywords: Parallel performance tools, OpenMP, multi-core.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1922
884 Isotropic Stress Distribution in Cu/(001) Fe Two Sheets

Authors: A. Derardja, L. Baroura, M. Brioua

Abstract:

The nanotechnology based on epitaxial systems includes single or arranged misfit dislocations. In general, whatever is the type of dislocation or the geometry of the array formed by the dislocations; it is important for experimental studies to know exactly the stress distribution for which there is no analytical expression [1, 2]. This work, using a numerical analysis, deals with relaxation of epitaxial layers having at their interface a periodic network of edge misfit dislocations. The stress distribution is estimated by using isotropic elasticity. The results show that the thickness of the two sheets is a crucial parameter in the stress distributions and then in the profile of the two sheets. A comparative study between the case of single dislocation and the case of parallel network shows that the layers relaxed better when the interface is covered by a parallel arrangement of misfit. Consequently, a single dislocation at the interface produces an important stress field which can be reduced by inserting a parallel network of dislocations with suitable periodicity.

Keywords: Parallel array of misfit, interface, isotropic elasticity, single crystalline substrates, coherent interface

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1571
883 Unsupervised Feature Learning by Pre-Route Simulation of Auto-Encoder Behavior Model

Authors: Youngjae Jin, Daeshik Kim

Abstract:

This paper describes a cycle accurate simulation results of weight values learned by an auto-encoder behavior model in terms of pre-route simulation. Given the results we visualized the first layer representations with natural images. Many common deep learning threads have focused on learning high-level abstraction of unlabeled raw data by unsupervised feature learning. However, in the process of handling such a huge amount of data, the learning method’s computation complexity and time limited advanced research. These limitations came from the fact these algorithms were computed by using only single core CPUs. For this reason, parallel-based hardware, FPGAs, was seen as a possible solution to overcome these limitations. We adopted and simulated the ready-made auto-encoder to design a behavior model in VerilogHDL before designing hardware. With the auto-encoder behavior model pre-route simulation, we obtained the cycle accurate results of the parameter of each hidden layer by using MODELSIM. The cycle accurate results are very important factor in designing a parallel-based digital hardware. Finally this paper shows an appropriate operation of behavior model based pre-route simulation. Moreover, we visualized learning latent representations of the first hidden layer with Kyoto natural image dataset.

Keywords: Auto-encoder, Behavior model simulation, Digital hardware design, Pre-route simulation, Unsupervised feature learning.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2690
882 Groebner Bases Computation in Boolean Rings is P-SPACE

Authors: Quoc-Nam Tran

Abstract:

The theory of Groebner Bases, which has recently been honored with the ACM Paris Kanellakis Theory and Practice Award, has become a crucial building block to computer algebra, and is widely used in science, engineering, and computer science. It is wellknown that Groebner bases computation is EXP-SPACE in a general polynomial ring setting. However, for many important applications in computer science such as satisfiability and automated verification of hardware and software, computations are performed in a Boolean ring. In this paper, we give an algorithm to show that Groebner bases computation is PSPACE in Boolean rings. We also show that with this discovery, the Groebner bases method can theoretically be as efficient as other methods for automated verification of hardware and software. Additionally, many useful and interesting properties of Groebner bases including the ability to efficiently convert the bases for different orders of variables making Groebner bases a promising method in automated verification.

Keywords: Algorithm, Complexity, Groebner basis, Applications of Computer Science.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1960
881 On Fault Diagnosis of Asynchronous Sequential Machines with Parallel Composition

Authors: Jung-Min Yang

Abstract:

Fault diagnosis of composite asynchronous sequential machines with parallel composition is addressed in this paper. An adversarial input can infiltrate one of two submachines comprising the composite asynchronous machine, causing an unauthorized state transition. The objective is to characterize the condition under which the controller can diagnose any fault occurrence. Two control configurations, state feedback and output feedback, are considered in this paper. In the case of output feedback, the exact estimation of the state is impossible since the current state is inaccessible and the output feedback is given as the form of burst. A simple example is provided to demonstrate the proposed methodology.

Keywords: Asynchronous sequential machines, parallel composition, fault diagnosis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 970
880 Active Segment Selection Method in EEG Classification Using Fractal Features

Authors: Samira Vafaye Eslahi

Abstract:

BCI (Brain Computer Interface) is a communication machine that translates brain massages to computer commands. These machines with the help of computer programs can recognize the tasks that are imagined. Feature extraction is an important stage of the process in EEG classification that can effect in accuracy and the computation time of processing the signals. In this study we process the signal in three steps of active segment selection, fractal feature extraction, and classification. One of the great challenges in BCI applications is to improve classification accuracy and computation time together. In this paper, we have used student’s 2D sample t-statistics on continuous wavelet transforms for active segment selection to reduce the computation time. In the next level, the features are extracted from some famous fractal dimension estimation of the signal. These fractal features are Katz and Higuchi. In the classification stage we used ANFIS (Adaptive Neuro-Fuzzy Inference System) classifier, FKNN (Fuzzy K-Nearest Neighbors), LDA (Linear Discriminate Analysis), and SVM (Support Vector Machines). We resulted that active segment selection method would reduce the computation time and Fractal dimension features with ANFIS analysis on selected active segments is the best among investigated methods in EEG classification.

Keywords: EEG, Student’s t- statistics, BCI, Fractal Features, ANFIS, FKNN.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2120
879 Conditions for Fault Recovery of Interconnected Asynchronous Sequential Machines with State Feedback

Authors: Jung–Min Yang

Abstract:

In this paper, fault recovery for parallel interconnected asynchronous sequential machines is studied. An adversarial input can infiltrate into one of two submachines comprising parallel composition of the considered asynchronous sequential machine, causing an unauthorized state transition. The control objective is to elucidate the condition for the existence of a corrective controller that makes the closed-loop system immune against any occurrence of adversarial inputs. In particular, an efficient existence condition is presented that does not need the complete modeling of the interconnected asynchronous sequential machine.

Keywords: Asynchronous sequential machines, parallel composition, corrective control, fault tolerance.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 839
878 DNA Computing for an Absolute 1-Center Problem: An Evolutionary Approach

Authors: Zuwairie Ibrahim, Yusei Tsuboi, Osamu Ono, Marzuki Khalid

Abstract:

Deoxyribonucleic Acid or DNA computing has emerged as an interdisciplinary field that draws together chemistry, molecular biology, computer science and mathematics. Thus, in this paper, the possibility of DNA-based computing to solve an absolute 1-center problem by molecular manipulations is presented. This is truly the first attempt to solve such a problem by DNA-based computing approach. Since, part of the procedures involve with shortest path computation, research works on DNA computing for shortest path Traveling Salesman Problem, in short, TSP are reviewed. These approaches are studied and only the appropriate one is adapted in designing the computation procedures. This DNA-based computation is designed in such a way that every path is encoded by oligonucleotides and the path-s length is directly proportional to the length of oligonucleotides. Using these properties, gel electrophoresis is performed in order to separate the respective DNA molecules according to their length. One expectation arise from this paper is that it is possible to verify the instance absolute 1-center problem using DNA computing by laboratory experiments.

Keywords: DNA computing, operation research, 1-center problem.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1462
877 Hybrid Prefix Adder Architecture for Minimizing the Power Delay Product

Authors: P.Ramanathan, P.T.Vanathi

Abstract:

Parallel Prefix addition is a technique for improving the speed of binary addition. Due to continuing integrating intensity and the growing needs of portable devices, low-power and highperformance designs are of prime importance. The classical parallel prefix adder structures presented in the literature over the years optimize for logic depth, area, fan-out and interconnect count of logic circuits. In this paper, a new architecture for performing 8-bit, 16-bit and 32-bit Parallel Prefix addition is proposed. The proposed prefix adder structures is compared with several classical adders of same bit width in terms of power, delay and number of computational nodes. The results reveal that the proposed structures have the least power delay product when compared with its peer existing Prefix adder structures. Tanner EDA tool was used for simulating the adder designs in the TSMC 180 nm and TSMC 130 nm technologies.

Keywords: Parallel Prefix Adder (PPA), Dot operator, Semi-Dotoperator, Complementary Metal Oxide Semiconductor (CMOS), Odd-dot operator, Even-dot operator, Odd-semi-dot operator andEven-semi-dot operator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1726
876 The Riemann Barycenter Computation and Means of Several Matrices

Authors: Miklos Palfia

Abstract:

An iterative definition of any n variable mean function is given in this article, which iteratively uses the two-variable form of the corresponding two-variable mean function. This extension method omits recursivity which is an important improvement compared with certain recursive formulas given before by Ando-Li-Mathias, Petz- Temesi. Furthermore it is conjectured here that this iterative algorithm coincides with the solution of the Riemann centroid minimization problem. Certain simulations are given here to compare the convergence rate of the different algorithms given in the literature. These algorithms will be the gradient and the Newton mehod for the Riemann centroid computation.

Keywords: Means, matrix means, operator means, geometric mean, Riemannian center of mass.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1788
875 Some Computational Results on MPI Parallel Implementation of Dense Simplex Method

Authors: El-Said Badr, Mahmoud Moussa, Konstantinos Paparrizos, Nikolaos Samaras, Angelo Sifaleras

Abstract:

There are two major variants of the Simplex Algorithm: the revised method and the standard, or tableau method. Today, all serious implementations are based on the revised method because it is more efficient for sparse linear programming problems. Moreover, there are a number of applications that lead to dense linear problems so our aim in this paper is to present some computational results on parallel implementation of dense Simplex Method. Our implementation is implemented on a SMP cluster using C programming language and the Message Passing Interface MPI. Preliminary computational results on randomly generated dense linear programs support our results.

Keywords: Linear Programming, MPI, Parallel Implementation, Simplex Algorithm.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2049
874 A New High Speed Neural Model for Fast Character Recognition Using Cross Correlation and Matrix Decomposition

Authors: Hazem M. El-Bakry

Abstract:

Neural processors have shown good results for detecting a certain character in a given input matrix. In this paper, a new idead to speed up the operation of neural processors for character detection is presented. Such processors are designed based on cross correlation in the frequency domain between the input matrix and the weights of neural networks. This approach is developed to reduce the computation steps required by these faster neural networks for the searching process. The principle of divide and conquer strategy is applied through image decomposition. Each image is divided into small in size sub-images and then each one is tested separately by using a single faster neural processor. Furthermore, faster character detection is obtained by using parallel processing techniques to test the resulting sub-images at the same time using the same number of faster neural networks. In contrast to using only faster neural processors, the speed up ratio is increased with the size of the input image when using faster neural processors and image decomposition. Moreover, the problem of local subimage normalization in the frequency domain is solved. The effect of image normalization on the speed up ratio of character detection is discussed. Simulation results show that local subimage normalization through weight normalization is faster than subimage normalization in the spatial domain. The overall speed up ratio of the detection process is increased as the normalization of weights is done off line.

Keywords: Fast Character Detection, Neural Processors, Cross Correlation, Image Normalization, Parallel Processing.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1537
873 Using the PGAS Programming Paradigm for Biological Sequence Alignment on a Chip Multi-Threading Architecture

Authors: M. Bakhouya, S. A. Bahra, T. El-Ghazawi

Abstract:

The Partitioned Global Address Space (PGAS) programming paradigm offers ease-of-use in expressing parallelism through a global shared address space while emphasizing performance by providing locality awareness through the partitioning of this address space. Therefore, the interest in PGAS programming languages is growing and many new languages have emerged and are becoming ubiquitously available on nearly all modern parallel architectures. Recently, new parallel machines with multiple cores are designed for targeting high performance applications. Most of the efforts have gone into benchmarking but there are a few examples of real high performance applications running on multicore machines. In this paper, we present and evaluate a parallelization technique for implementing a local DNA sequence alignment algorithm using a PGAS based language, UPC (Unified Parallel C) on a chip multithreading architecture, the UltraSPARC T1.

Keywords: Partitioned Global Address Space, Unified Parallel C, Multicore machines, Multi-threading Architecture, Sequence alignment.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1390
872 Fault Diagnosis of Nonlinear Systems Using Dynamic Neural Networks

Authors: E. Sobhani-Tehrani, K. Khorasani, N. Meskin

Abstract:

This paper presents a novel integrated hybrid approach for fault diagnosis (FD) of nonlinear systems. Unlike most FD techniques, the proposed solution simultaneously accomplishes fault detection, isolation, and identification (FDII) within a unified diagnostic module. At the core of this solution is a bank of adaptive neural parameter estimators (NPE) associated with a set of singleparameter fault models. The NPEs continuously estimate unknown fault parameters (FP) that are indicators of faults in the system. Two NPE structures including series-parallel and parallel are developed with their exclusive set of desirable attributes. The parallel scheme is extremely robust to measurement noise and possesses a simpler, yet more solid, fault isolation logic. On the contrary, the series-parallel scheme displays short FD delays and is robust to closed-loop system transients due to changes in control commands. Finally, a fault tolerant observer (FTO) is designed to extend the capability of the NPEs to systems with partial-state measurement.

Keywords: Hybrid fault diagnosis, Dynamic neural networks, Nonlinear systems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2221
871 Parallel Particle Swarm Optimization Optimized LDI Controller with Lyapunov Stability Criterion for Nonlinear Structural Systems

Authors: P.-W. Tsai, W.-L. Hong, C.-W. Chen, C.-Y. Chen

Abstract:

In this paper, we present a neural-network (NN) based approach to represent a nonlinear Tagagi-Sugeno (T-S) system. A linear differential inclusion (LDI) state-space representation is utilized to deal with the NN models. Taking advantage of the LDI representation, the stability conditions and controller design are derived for a class of nonlinear structural systems. Moreover, the concept of utilizing the Parallel Particle Swarm Optimization (PPSO) algorithm to solve the common P matrix under the stability criteria is given in this paper.

Keywords: Lyapunov Stability, Parallel Particle Swarm Optimization, Linear Differential Inclusion, Artificial Intelligence.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1865
870 Simulation Modeling of Manufacturing Systems for the Serial Route and the Parallel One

Authors: Tadeusz Witkowski, Paweł Antczak, Arkadiusz Antczak

Abstract:

In the paper we discuss the influence of the route flexibility degree, the open rate of operations and the production type coefficient on makespan. The flexible job-open shop scheduling problem FJOSP (an extension of the classical job shop scheduling) is analyzed. For the analysis of the production process we used a hybrid heuristic of the GRASP (greedy randomized adaptive search procedure) with simulated annealing algorithm. Experiments with different levels of factors have been considered and compared. The GRASP+SA algorithm has been tested and illustrated with results for the serial route and the parallel one.

Keywords: Makespan, open shop, route flexibility, serial and parallel route, simulation modeling, type of production.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1814