Search results for: parallel simulations
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 2939

2879 Parallel Computing: Offloading Matrix Multiplication to GPU

Authors: Bharath R., Tharun Sai N., Bhuvan G.

Abstract:

This project focuses on developing a Parallel Computing method aimed at optimizing matrix multiplication through GPU acceleration. Addressing algorithmic challenges, GPU programming intricacies, and integration issues, the project aims to enhance efficiency and scalability. The methodology involves algorithm design, GPU programming, and optimization techniques. Future plans include advanced optimizations, extended functionality, and integration with high-level frameworks. User engagement is emphasized through user-friendly interfaces, open-source collaboration, and continuous refinement based on feedback. The project's impact extends to significantly improving matrix multiplication performance in scientific computing and machine learning applications.

Keywords: matrix multiplication, parallel processing, cuda, performance boost, neural networks

Procedia PDF Downloads 12
2878 Performance Evaluation of Parallel Surface Modeling and Generation on Actual and Virtual Multicore Systems

Authors: Nyeng P. Gyang

Abstract:

Even though past, current, and future trends suggest that multicore and cloud computing systems are increasingly prevalent, this class of parallel systems remains underutilized in general and is barely used for research on parallel Delaunay triangulation for parallel surface modeling and generation in particular. This study evaluated the performance of actual (physical) and virtual (cloud) multicore machines at executing algorithms that implement various parallelization strategies of the incremental insertion technique of the Delaunay triangulation algorithm. T-tests were run on the collected data to determine whether differences in performance metrics (including execution time, speedup, and efficiency) were statistically significant. Results show that the actual machine is approximately twice as fast as the virtual machine at executing the same programs for the various parallelization strategies. Results, which also characterize the scalability of the parallelization strategies, show that some of the performance differences between these systems across different runs of the algorithms were statistically significant. A few pseudo-superlinear speedup results, computed from the raw data collected, are not true superlinear speedup values. These pseudo-superlinear values, which arise from one way of computing speedups, disappear and give way to asymmetric speedups, which are the accurate kind of speedups that occur in the experiments performed.
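The abstract does not give its formulas, so the following is only a minimal sketch of the textbook definitions of speedup and efficiency that such evaluations usually rely on. The function names and the sample timings are hypothetical and are not taken from the paper.

```python
def speedup(serial_time: float, parallel_time: float) -> float:
    """Textbook speedup: serial run time divided by parallel run time."""
    if serial_time <= 0 or parallel_time <= 0:
        raise ValueError("run times must be positive")
    return serial_time / parallel_time


def efficiency(serial_time: float, parallel_time: float, cores: int) -> float:
    """Textbook efficiency: speedup divided by the number of cores used."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return speedup(serial_time, parallel_time) / cores


def is_superlinear(serial_time: float, parallel_time: float, cores: int) -> bool:
    """A speedup larger than the core count is called superlinear."""
    return speedup(serial_time, parallel_time) > cores


if __name__ == "__main__":
    # Hypothetical timings in seconds, for illustration only.
    print(speedup(120.0, 20.0))        # 6.0
    print(efficiency(120.0, 20.0, 8))  # 0.75
```

Which serial baseline is used in the numerator is a design choice, and the abstract suggests that this choice can produce apparent superlinear values.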

Keywords: cloud computing systems, multicore systems, parallel Delaunay triangulation, parallel surface modeling and generation

Procedia PDF Downloads 178
2877 Solid Particles Transport and Deposition Prediction in a Turbulent Impinging Jet Using the Lattice Boltzmann Method and a Probabilistic Model on GPU

Authors: Ali Abdul Kadhim, Fue Lien

Abstract:

Solid particle distribution on an impingement surface has been simulated utilizing a graphics processing unit (GPU). An in-house computational fluid dynamics (CFD) code has been developed to investigate a 3D turbulent impinging jet using the lattice Boltzmann method (LBM) in conjunction with large eddy simulation (LES) and the multiple relaxation time (MRT) models. This paper proposes an improvement to the LBM-cellular automata (LBM-CA) probabilistic method. In the current model, the fluid flow utilizes the D3Q19 lattice, while the particle model employs the D3Q27 lattice. The particle numbers are defined at the same regular LBM nodes, and the transport of particles from one node to its neighboring nodes is determined in accordance with the particle bulk density and velocity by considering all the external forces. The previous models distribute particles at each time step without considering the local velocity and the number of particles at each node. The present model overcomes the deficiencies of the previous LBM-CA models and, therefore, can better capture the dynamic interaction between particles and the surrounding turbulent flow field. Despite the increasing popularity of the LBM-MRT-CA model in simulating complex multiphase fluid flows, this approach is still expensive in terms of the memory size and computational time required to perform 3D simulations. To improve the throughput of each simulation, a single GeForce GTX TITAN X GPU is used in the present work. The CUDA parallel programming platform and the CuRAND library are utilized to form an efficient LBM-CA algorithm. The methodology was first validated against a benchmark test case involving particle deposition on a square cylinder confined in a duct. The flow was unsteady and laminar at Re = 200 (Re is the Reynolds number), and simulations were conducted for different Stokes numbers. The present LBM solutions agree well with other results available in the open literature.
The GPU code was then used to simulate the particle transport and deposition in a turbulent impinging jet at Re = 10,000. The simulations were conducted for L/D = 2, 4, and 6, where L is the nozzle-to-surface distance and D is the jet diameter. The effect of changing the Stokes number on the particle deposition profile was studied at different L/D ratios. For comparative studies, another in-house serial CPU code was also developed, coupling LBM with the classical Lagrangian particle dispersion model. Agreement between results obtained with the LBM-CA and LBM-Lagrangian models and the experimental data is generally good. The present GPU approach achieves a speedup ratio of about 350 against the serial code running on a single CPU.

Keywords: CUDA, GPU parallel programming, LES, lattice Boltzmann method, MRT, multi-phase flow, probabilistic model

Procedia PDF Downloads 175
2876 An Improved Many Worlds Quantum Genetic Algorithm

Authors: Li Dan, Zhao Junsuo, Zhang Wenjun

Abstract:

Aiming at the shortcomings of the Quantum Genetic Algorithm, such as its tendency to fall into local optima on multimodal function optimization problems and its vulnerability to premature convergence due to the lack of close relationships between individuals, the paper presents an Improved Many Worlds Quantum Genetic Algorithm (IMWQGA). The paper uses the concept of Many Worlds; derives populations through the parallel evolution of parallel worlds; proposes updating the population according to the main body; and adopts transition methods such as parallel transition, backtracking, and travel forth. In addition, the algorithm proposes a quantum training operator and a combinatorial optimization operator as new operators for the quantum genetic algorithm.

Keywords: quantum genetic algorithm, many worlds, quantum training operator, combinatorial optimization operator

Procedia PDF Downloads 707
2875 Parallel Computation of the Covariance-Matrix

Authors: Claude Tadonki

Abstract:

We address issues related to the computation of the covariance matrix. In its canonical expression, this matrix is likely to be ill-conditioned, which raises serious numerical issues. The underlying linear system, which should therefore be solved by iterative approaches, becomes computationally challenging. A large number of iterations is expected before reaching the level of convergence needed to meet the required accuracy. In addition, this linear system needs to be solved at each iteration, following the general form of the covariance matrix. Taken together, it follows that the associated matrix-vector product must be computed as fast as possible. This is the purpose of this work, in which we consider and discuss skillful formulations of the problem, then propose a parallel implementation of the matrix-vector product involved. Numerical and performance-oriented discussions are provided based on experimental evaluations.
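The abstract does not describe how the matrix-vector product is parallelized. As an illustration only, the sketch below shows one common decomposition, a row-block partition, using Python's standard `concurrent.futures`. The function names are hypothetical, and CPython threads will not give a real speedup for this pure-Python arithmetic; the point is the partitioning pattern, not performance.

```python
from concurrent.futures import ThreadPoolExecutor


def _block_product(rows, vector):
    """Multiply a contiguous block of matrix rows by the vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in rows]


def parallel_matvec(matrix, vector, workers=4):
    """Row-partitioned matrix-vector product y = A x (illustrative sketch)."""
    if any(len(row) != len(vector) for row in matrix):
        raise ValueError("every row must have the same length as the vector")
    if not matrix:
        return []
    workers = max(1, min(workers, len(matrix)))
    chunk = -(-len(matrix) // workers)  # ceiling division: rows per block
    blocks = [matrix[i:i + chunk] for i in range(0, len(matrix), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields block results in submission order, so rows stay ordered.
        partials = pool.map(_block_product, blocks, [vector] * len(blocks))
        return [y for part in partials for y in part]


if __name__ == "__main__":
    print(parallel_matvec([[1, 2], [3, 4], [5, 6]], [1, 1], workers=2))
```

Each block writes to a disjoint part of the result, so the workers need no synchronization beyond the final concatenation.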

Keywords: covariance-matrix, multicore, numerical computing, parallel computing

Procedia PDF Downloads 284
2874 Study of the Vertical Handoff in Heterogeneous Networks and Implement Based on Opnet

Authors: Wafa Benaatou, Adnane Latif

Abstract:

In this document, we study in detail the performance of vertical handover in WLAN, WiMAX, and UMTS networks, then examine the vertical handoff procedure, and conclude with simulations that highlight handover performance in heterogeneous networks. The goal of vertical handover is to provide several accesses in real time across heterogeneous networks. This allows a user to use several networks (such as WLAN, UMTS, and WiMAX) in parallel, and the system to switch automatically to another base station without disconnecting, as if there were no interruption and with as little data loss as possible.

Keywords: vertical handoff, WLAN, UMTS, WiMAX, heterogeneous

Procedia PDF Downloads 358
2873 Resistivity Tomography Optimization Based on Parallel Electrode Linear Back Projection Algorithm

Authors: Yiwei Huang, Chunyu Zhao, Jingjing Ding

Abstract:

Electrical Resistivity Tomography has been widely used in medicine and geology, for example in imaging lung impedance and analyzing soil impedance. Linear Back Projection is the core algorithm of Electrical Resistivity Tomography, but traditional Linear Back Projection cannot make full use of the information in the electric field. In this paper, a Parallel Electrode Linear Back Projection imaging method for Electrical Resistivity Tomography is proposed. By changing the connection mode of the electrodes, it generates an electric field distribution that is not linearly related to that of traditional Linear Back Projection, captures new information, and improves imaging accuracy without increasing the number of electrodes. The simulation results show that the accuracy of the image reconstructed by the inverse operation of Parallel Electrode Linear Back Projection can be improved by about 20%.

Keywords: electrical resistivity tomography, finite element simulation, image optimization, parallel electrode linear back projection

Procedia PDF Downloads 120
2872 Series-Parallel Systems Reliability Optimization Using Genetic Algorithm and Statistical Analysis

Authors: Essa Abrahim Abdulgader Saleem, Thien-My Dao

Abstract:

The main objective of this paper is to optimize series-parallel system reliability using a Genetic Algorithm (GA) and statistical analysis, considering system reliability constraints that involve the redundancy numbers of selected components, total cost, and total weight. To perform this work, first, the mathematical model that maximizes system reliability subject to maximum system cost and maximum system weight constraints is presented; second, a statistical analysis is used to optimize the GA parameters; and third, the GA is used to optimize series-parallel system reliability. The objective is to determine a strategy for choosing the redundancy level of each subsystem that maximizes the overall system reliability subject to total cost and total weight constraints. Finally, the reliability optimization results for the series-parallel system case study are shown, and comparisons with previous results are presented to demonstrate the performance of our GA.
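The abstract does not state its model explicitly. A minimal sketch of the standard series-parallel reliability formula, R = ∏ᵢ (1 − (1 − rᵢ)^nᵢ), is given below, assuming identical and independent components within each subsystem. The linear cost and weight constraints, the function names, and the sample numbers are assumptions made for illustration, not the paper's formulation.

```python
from math import prod


def system_reliability(component_reliability, redundancy):
    """Reliability of subsystems in series, each with n_i parallel components.

    Assumes independent, identical components inside each subsystem.
    """
    if len(component_reliability) != len(redundancy):
        raise ValueError("one redundancy level is needed per subsystem")
    if any(n < 1 for n in redundancy):
        raise ValueError("each subsystem needs at least one component")
    return prod(1 - (1 - r) ** n
                for r, n in zip(component_reliability, redundancy))


def is_feasible(redundancy, unit_cost, unit_weight, max_cost, max_weight):
    """Check assumed linear constraints: total cost and total weight limits."""
    total_cost = sum(c * n for c, n in zip(unit_cost, redundancy))
    total_weight = sum(w * n for w, n in zip(unit_weight, redundancy))
    return total_cost <= max_cost and total_weight <= max_weight


if __name__ == "__main__":
    # Hypothetical two-subsystem example.
    print(system_reliability([0.9, 0.8], [2, 1]))
    print(is_feasible([2, 1], [3.0, 5.0], [1.0, 2.0], 12.0, 5.0))
```

A GA for this problem would typically use `system_reliability` as the fitness of a candidate redundancy vector and reject or penalize candidates for which `is_feasible` is false.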

Keywords: reliability, optimization, meta-heuristic, genetic algorithm, redundancy

Procedia PDF Downloads 312
2871 The Comparison of Dismount Skill between National and International Men’s Artistic Gymnastics in Parallel Bars Apparatus

Authors: Chen ChihYu, Tang Wen Tzu, Chen Kuang Hui

Abstract:

Aim: To compare the dismount skill between Taiwanese and elite international gymnasts in parallel bars following the 2017-2020 code of points. Methods: The gymnasts who advanced to the parallel bars event finals of four competitions (World Championships, Universiade, the National Games of Taiwan, and the National Intercollegiate Athletic Games of Taiwan) in both 2017 and 2019 were selected for this study. The dismount skill of parallel bars was analyzed, and the average difficulty score was compared by one-way ANOVA. Descriptive statistics were applied to present the type of dismount skill and the difficulty of each gymnast in these four competitions. The data from World Championships and Universiade were combined as the international group (INT), and the data from the Taiwanese National Games and National Intercollegiate Athletic Games were combined as the national group (NAT). The differences between INT and NAT were analyzed by the Chi-square test. The statistical significance of this study was set at α = 0.05. Results: i) There was a significant difference in the mean parallel bars dismount difficulty across these four competitions, as analyzed by one-way ANOVA. The dismount scores of both World Championships and Universiade were significantly higher than those of the Taiwanese National Games and National Intercollegiate Athletic Games (0.58±0.08 & 0.56±0.08 > 0.42±0.06 & 0.40±0.06, p < 0.05). ii) Most of the gymnasts in World Championships and Universiade selected the 0.6-point skill as the parallel bars dismount element, while in the Taiwanese National Games and the National Intercollegiate Athletic Games, most of the gymnasts performed the 0.4-point dismount skill. iii) The Chi-square test showed a significant difference in the selection of parallel bars dismount skill. The INT group used the E or E+ difficulty element as the dismount skill, and the NAT group selected the D or D- difficulty element.
Conclusion: The level of parallel bars dismount among Taiwanese gymnasts is inferior to that of elite international gymnasts. It is suggested that Taiwanese gymnasts practice the F difficulty dismount (double salto forward tucked with half twist) in the future.

Keywords: Artistic Gymnastics World Championships, dismount, difficulty score, element

Procedia PDF Downloads 114
2870 Task Scheduling on Parallel System Using Genetic Algorithm

Authors: Jasbir Singh Gill, Baljit Singh

Abstract:

Scheduling and mapping an application task graph on multiprocessor parallel systems is considered a crucial NP-complete problem. Many genetic algorithms have been proposed to solve such problems. In this paper, two genetic-approach-based algorithms have been designed and developed, with and without task duplication. The proposed algorithms work on two fitness functions. The first, task fitness, is used to minimize the total finish time of the schedule (schedule length), while the second, process fitness, is concerned with allocating tasks to the most efficient processor from the list of available processors (load balance). The proposed genetic algorithms have been implemented and experimentally evaluated against other popular, widely used state-of-the-art algorithms.
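To make the schedule-length fitness concrete, here is a minimal sketch of how the finish time of one candidate schedule could be evaluated. It assumes zero communication cost, no task duplication, and a candidate given as a precedence-respecting task order plus a task-to-processor assignment. The data layout and function name are hypothetical, not the authors' encoding.

```python
def schedule_length(order, assignment, duration, predecessors):
    """Finish time (makespan) of a list schedule, ignoring communication costs.

    order        -- tasks in an order that respects precedence
    assignment   -- mapping task -> processor
    duration     -- mapping task -> execution time
    predecessors -- mapping task -> iterable of tasks that must finish first
    """
    finish = {}
    processor_free = {}
    for task in order:
        preds = predecessors.get(task, ())
        if any(p not in finish for p in preds):
            raise ValueError(f"task {task!r} is scheduled before a predecessor")
        # A task starts when its inputs are ready and its processor is idle.
        ready = max((finish[p] for p in preds), default=0.0)
        proc = assignment[task]
        start = max(ready, processor_free.get(proc, 0.0))
        finish[task] = start + duration[task]
        processor_free[proc] = finish[task]
    return max(finish.values(), default=0.0)


if __name__ == "__main__":
    duration = {"A": 2, "B": 3, "C": 1}
    predecessors = {"B": ["A"], "C": ["A"]}
    two_procs = {"A": "P0", "B": "P0", "C": "P1"}
    one_proc = {"A": "P0", "B": "P0", "C": "P0"}
    print(schedule_length(["A", "B", "C"], two_procs, duration, predecessors))
    print(schedule_length(["A", "B", "C"], one_proc, duration, predecessors))
```

A genetic algorithm would call such a function once per chromosome and prefer candidates with a smaller return value.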

Keywords: parallel computing, task scheduling, task duplication, genetic algorithm

Procedia PDF Downloads 310
2869 A Framework of Dynamic Rule Selection Method for Dynamic Flexible Job Shop Problem by Reinforcement Learning Method

Authors: Rui Wu

Abstract:

In the volatile modern manufacturing environment, new orders occur randomly at any time, and pre-emptive methods are infeasible. This calls for a real-time scheduling method that can produce a reasonably good schedule quickly. The dynamic Flexible Job Shop problem is an NP-hard scheduling problem that combines the dynamic Job Shop problem with the Parallel Machine problem. A Flexible Job Shop contains different work centres. Each work centre contains parallel machines that can process certain operations. Many algorithms, such as genetic algorithms or simulated annealing, have been proposed to solve static Flexible Job Shop problems. However, the time efficiency of these methods is low, and they are not feasible for dynamic scheduling problems. Therefore, a dynamic rule selection scheduling system based on reinforcement learning is proposed in this research, in which the dynamic Flexible Job Shop problem is divided into several parallel machine problems to decrease its complexity. Firstly, the features of jobs, machines, work centres, and flexible job shops are selected to describe the status of the dynamic Flexible Job Shop problem at each decision point in each work centre. Secondly, a reinforcement learning framework using a double-layer deep Q-learning network is applied to select proper composite dispatching rules based on the status of each work centre. Then, based on the selected composite dispatching rule, an available operation is selected from the waiting buffer and assigned to an available machine in each work centre. Finally, the proposed algorithm is compared with well-known dispatching rules on the objectives of mean tardiness, mean flow time, mean waiting time, or mean percentage of waiting time in the real-time Flexible Job Shop problem. The simulation results showed that the proposed framework has reasonable performance and time efficiency.

Keywords: dynamic scheduling problem, flexible job shop, dispatching rules, deep reinforcement learning

Procedia PDF Downloads 73
2868 Fault Diagnosis of Nonlinear Systems Using Dynamic Neural Networks

Authors: E. Sobhani-Tehrani, K. Khorasani, N. Meskin

Abstract:

This paper presents a novel integrated hybrid approach for fault diagnosis (FD) of nonlinear systems. Unlike most FD techniques, the proposed solution simultaneously accomplishes fault detection, isolation, and identification (FDII) within a unified diagnostic module. At the core of this solution is a bank of adaptive neural parameter estimators (NPE) associated with a set of single-parameter fault models. The NPEs continuously estimate unknown fault parameters (FP) that are indicators of faults in the system. Two NPE structures, series-parallel and parallel, are developed, each with its own set of desirable attributes. The parallel scheme is extremely robust to measurement noise and possesses a simpler, yet more solid, fault isolation logic. In contrast, the series-parallel scheme displays short FD delays and is robust to closed-loop system transients due to changes in control commands. Finally, a fault tolerant observer (FTO) is designed to extend the capability of the NPEs to systems with partial-state measurement.

Keywords: hybrid fault diagnosis, dynamic neural networks, nonlinear systems, fault tolerant observer

Procedia PDF Downloads 364
2867 Arc Interruption Design for DC High Current/Low SC Fuses via Simulation

Authors: Ali Kadivar, Kaveh Niayesh

Abstract:

This report summarizes a simulation-based approach to estimate the current interruption behavior of a fuse element utilized in a DC network protecting battery banks under different stresses. Due to the internal resistance of the batteries, the short circuit current is very close to the nominal current, which makes the fuse design tricky. The base configuration considered in this report consists of five fuse units in parallel. The simulations are performed using a multi-physics software package, COMSOL® 5.6, and the necessary material parameters have been calculated using two other software packages. The first phase of the simulation starts with the heating of the fuse elements resulting from the current flow through the fusing element. In this phase, the heat transfer between the metallic strip and the adjacent materials results in melting and evaporation of the filler and housing before the aluminum strip is evaporated and the current flow in the evaporated strip is cut off, or an arc is eventually initiated. The initiated arc starts to expand, so the entire metallic strip is ablated, and a long arc of around 20 mm is created within the first 3 milliseconds after arc initiation (v_elongation = 6.6 m/s). The final stage of the simulation is related to the arc simulation and its interaction with the external circuitry. Because of the strong ablation of the filler material and venting of the arc caused by the melting and evaporation of the filler and housing before an arc initiates, the arc is assumed to burn in almost pure ablated material. To be able to precisely model this arc, one more step related to the derivation of the transport coefficients of the plasma in ablated urethane was necessary. The results indicate that an arc current interruption, in this case, will not be achieved within the first tens of milliseconds. In a further study, considering two series elements, the arc was interrupted within a few milliseconds.
A very important aspect in this context is the potential impact of many broken strips parallel to the one where the arc occurs. The generated arcing voltage is also applied to the other broken strips connected in parallel with the arcing path. As the gap between the other strips is very small, a large voltage of a few hundred volts generated during the current interruption may eventually lead to a breakdown of another gap. As two arcs in parallel are not stable, one of the arcs will extinguish, and the total current will be carried by one single arc again. This process may be repeated several times if the generated voltage is very large. The ultimate result would be that the current interruption may be delayed.
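As a quick arithmetic check of the quoted elongation velocity, the sketch below divides the arc length by the elapsed time. With the abstract's rounded figures of about 20 mm in 3 ms, the mean velocity comes out near 6.7 m/s, which is consistent with the stated 6.6 m/s given that the arc length is approximate. The function name is hypothetical.

```python
def mean_elongation_velocity(arc_length_mm: float, elapsed_ms: float) -> float:
    """Mean arc elongation velocity in m/s from a length in mm and a time in ms."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed time must be positive")
    length_m = arc_length_mm / 1000.0
    elapsed_s = elapsed_ms / 1000.0
    return length_m / elapsed_s


if __name__ == "__main__":
    # Rounded values from the abstract: about 20 mm within 3 ms.
    print(round(mean_elongation_velocity(20.0, 3.0), 2))
```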

Keywords: DC network, high current / low SC fuses, FEM simulation, parallel fuses

Procedia PDF Downloads 40
2866 A Genetic Algorithm for the Load Balance of Parallel Computational Fluid Dynamics Computation with Multi-Block Structured Mesh

Authors: Chunye Gong, Ming Tie, Jie Liu, Weimin Bao, Xinbiao Gan, Shengguo Li, Bo Yang, Xuguang Chen, Tiaojie Xiao, Yang Sun

Abstract:

Large-scale CFD simulation relies on high-performance parallel computing, and load balance is the key factor affecting parallel efficiency. This paper focuses on the load-balancing problem of parallel CFD simulation with structured mesh. A mathematical model for this load-balancing problem is presented. A genetic algorithm, its fitness computation, and a two-level code are designed, together with an optimal selector, a robust operator, and a local optimization operator. The properties of the presented genetic algorithm are discussed in depth. The effects of the optimal selector, robust operator, and local optimization operator are demonstrated by experiments. The experimental results on different test sets, DLR-F4, and aircraft design applications show that the presented load-balancing algorithm is robust, converges quickly, and is useful in real engineering problems.

Keywords: genetic algorithm, load-balancing algorithm, optimal variation, local optimization

Procedia PDF Downloads 138
2865 A Study on Design for Parallel Test Based on Embedded System

Authors: Zheng Sun, Weiwei Cui, Xiaodong Ma, Hongxin Jin, Dongpao Hong, Jinsong Yang, Jingyi Sun

Abstract:

With the improvement of the performance and complexity of modern equipment, automatic test systems (ATS) have become widely used for condition monitoring and fault diagnosis. However, a conventional ATS mainly works in serial mode and cannot test several pieces of equipment at the same time. That leads to low test efficiency and ATS redundancy. Especially when a large amount of equipment is under test, a conventional ATS cannot meet the requirement of efficient testing. To reduce the support resources and increase test efficiency, we propose in this paper a design method for parallel testing based on an embedded system. Firstly, we put forward the general framework of the parallel test system, which contains a central management system (CMS) and several distributed test subsystems (DTS). Then we give a detailed design of the system. For the hardware, we use an embedded architecture to design the DTS. For the software, we use a test program set to improve test adaptation. With the parallel test system deployed, the time to test five devices is equal to the time previously needed to test one device. Compared with the conventional test system, the proposed test system is smaller and improves testing efficiency. This is of great significance for putting equipment into operation swiftly. Finally, we take an industrial control system as an example to verify the effectiveness of the proposed method. The result shows that the method is reasonable and that efficiency is improved by up to 500%.
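The relation between "five devices in the time of one" and the quoted 500% figure can be sketched with a simple timing model. The sketch assumes every device takes the same time to test and that coordination overhead is zero; both are simplifying assumptions, and the function names are hypothetical.

```python
def serial_test_time(devices: int, time_per_device: float) -> float:
    """Total time when devices are tested one after another."""
    return devices * time_per_device


def parallel_test_time(devices: int, time_per_device: float, subsystems: int) -> float:
    """Total time when up to `subsystems` devices are tested at once."""
    if subsystems < 1:
        raise ValueError("at least one test subsystem is required")
    batches = -(-devices // subsystems)  # ceiling division
    return batches * time_per_device


def efficiency_ratio(devices: int, time_per_device: float, subsystems: int) -> float:
    """Serial time divided by parallel time (1.0 means no gain)."""
    return (serial_test_time(devices, time_per_device)
            / parallel_test_time(devices, time_per_device, subsystems))


if __name__ == "__main__":
    # Five devices on five subsystems: the ratio is 5.0, i.e. 500%.
    print(efficiency_ratio(5, 10.0, 5))
```

Under this model the gain drops when the device count is not a multiple of the number of subsystems, for example six devices on five subsystems give a ratio of 3.0.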

Keywords: parallel test, embedded system, automatic test system (ATS), central management system (CMS), distributed test subsystems (DTS)

Procedia PDF Downloads 264
2864 People Vote with Their Feet: The 'Parallel Polis' in South Africa as a Reaction to the Neo-Patrimonial State

Authors: A. Kok

Abstract:

The South African experience of the general upsurge in protest movements internationally is characterised by a tension between a neo-patrimonial state on the one hand, and a society with growing middle-class needs and interests on the other. This tension translates into local community service delivery protests – often violent in nature – that have been steadily increasing in number since 2008, student uprisings that have reached their height in October 2015, and various continuing local social #MustFall movements that are geared towards addressing government corruption and transforming neo-liberal structures. As a result, growing citizen (and non-citizen) revolt in South Africa has seen the (i) creeping securitization of the neo-patrimonial state and (ii) the 'top-down' misuse of a current 'bottom-up' people’s ideology, decoloniality, in an attempt by a faction in the ruling party (representing the neo-patrimonial state) to legitimize its actions and consolidate its power. The neo-patrimonial state’s creeping securitization and ideological positioning lead to a further mistrust of public institutions, people’s disengagement with traditional politics, and the creation of a 'parallel polis' by citizens and non-citizens that bypasses the official and oftentimes corrupt structures of the state. By applying the concept 'parallel polis' – originally developed by Václav Benda in connection with the movement Charter 77 in former Czechoslovakia – to a South African case study, it is illustrated that, even in the absence of overt oppression and the use of terror by a ruling elite, entrenched neo-patrimonialism can be potent enough to fuel the creation of various independent parallel public spheres (or, as a whole, understood as a 'parallel polis') to bypass dysfunctional state channels. A flourishing parallel polis offers possibilities for political, social and economic renewal. This is especially relevant in the consolidation of South Africa’s relatively young democracy.

Keywords: decoloniality, neo-patrimonialism, 'parallel polis', protest movements, South Africa, state securitization

Procedia PDF Downloads 182
2863 Synthesis of Balanced 3-RRR Planar Parallel Manipulators

Authors: Arakelian Vigen, Geng Jing, Le Baron Jean-Paul

Abstract:

The paper deals with the design of parallel manipulators with balanced inertia forces and moments. The balancing of the resultant of the inertia forces of 3-RRR planar parallel manipulators is carried out through mass redistribution and centre of mass acceleration minimization. The proposed balancing technique is achieved in two steps: first, an optimal redistribution of the masses of the input links is accomplished, which ensures the similarity of the end-effector trajectory and the trajectory of the manipulator's common centre of mass; then, optimal trajectory planning of the end-effector with a 'bang-bang' profile is carried out. In this way, minimizing the magnitude of the acceleration of the manipulator's centre of mass minimizes the shaking force. To minimize the resultant of the inertia moments (shaking moment), active balancing via an inertia flywheel is applied. However, in this case, the active balancing is quite different from previous applications because it provides only a partial cancellation of the shaking moment due to the incomplete balancing of the shaking force.

Keywords: dynamic balancing, inertia force minimization, inertia moment minimization, 3-RRR planar parallel manipulator

Procedia PDF Downloads 430
2862 Continuous-Time Analysis and Performance Assessment for Digital Control of High-Frequency Switching Synchronous DC-DC Converter

Authors: Rihab Hamdi, Amel Hadri Hamida, Ouafae Bennis, Sakina Zerouali

Abstract:

This paper presents a performance analysis and robustness assessment of a digitally controlled DC-DC buck converter with three cells connected in parallel, operating in continuous conduction mode (CCM) and subject to variations in supply parameters and load disturbances. The control strategy relies on continuous-time averaged modeling of the high-frequency switching converter. The methodology is to model the complete design procedure, with regard to the existence of an instantaneous current operating point for designing the digital closed loop, in the same continuous-time domain. Moreover, the adopted approach includes a digital voltage control (DVC) technique that accounts for digital control delays and sampling effects, with the aim of improving efficiency and dynamic response and preventing generally undesired phenomena. The results obtained under load, input, and reference changes clearly demonstrate an excellent dynamic response of the proposed technique as well as stability under any operating conditions, with fast and smooth tracking of the specified output voltage. Simulation studies in the MATLAB/Simulink environment are performed to verify the concept.

Keywords: continuous conduction mode, digital control, parallel multi-cells converter, performance analysis, power electronics

Procedia PDF Downloads 120
2861 Unsteady Three-Dimensional Adaptive Spatial-Temporal Multi-Scale Direct Simulation Monte Carlo Solver to Simulate Rarefied Gas Flows in Micro/Nano Devices

Authors: Mirvat Shamseddine, Issam Lakkis

Abstract:

We present an efficient, three-dimensional parallel multi-scale Direct Simulation Monte Carlo (DSMC) algorithm for the simulation of unsteady rarefied gas flows in micro/nanosystems. The algorithm employs a novel spatiotemporal adaptivity scheme. The scheme performs a fully dynamic multi-level grid adaption based on the gradients of flow macro-parameters and an automatic temporal adaptation. The computational domain consists of a hierarchical octree-based Cartesian grid representation of the flow domain and a triangular mesh for the solid object surfaces. The hybrid mesh, combined with the spatiotemporal adaptivity scheme, allows for increased flexibility and efficient data management, rendering the framework suitable for efficient particle-tracing and dynamic grid refinement and coarsening. The parallel algorithm is optimized to run DSMC simulations of strongly unsteady, non-equilibrium flows over multiple cores. The presented method is validated by comparing with benchmark studies and then employed to improve the design of micro-scale hotwire thermal sensors in rarefied gas flows.

Keywords: DSMC, oct-tree hierarchical grid, ray tracing, spatial-temporal adaptivity scheme, unsteady rarefied gas flows

Procedia PDF Downloads 278
2860 A TFETI Domain Decomposition Solver for von Mises Elastoplasticity Model with Combination of Linear Isotropic-Kinematic Hardening

Authors: Martin Cermak, Stanislav Sysala

Abstract:

In this paper, we present an efficient parallel implementation of elastoplastic problems based on the TFETI (Total Finite Element Tearing and Interconnecting) domain decomposition method. This approach allows us to solve this nonlinear problem in parallel on supercomputers, decrease the solution time, and compute problems with millions of DOFs. In our approach, we consider an associated elastoplastic model with the von Mises plastic criterion and a combination of linear isotropic-kinematic hardening laws. This model is discretized by the implicit Euler method in time and by the finite element method in space. We consider the system of nonlinear equations with a strongly semismooth and strongly monotone operator. The semismooth Newton method is applied to solve this nonlinear system. The corresponding linearized problems arising in the Newton iterations are solved in parallel by the above-mentioned TFETI method. The implementation is realized in our in-house MatSol packages developed in MATLAB.

Keywords: isotropic-kinematic hardening, TFETI, domain decomposition, parallel solution

Procedia PDF Downloads 385
2859 An Evolutionary Approach for Automated Optimization and Design of Vivaldi Antennas

Authors: Sahithi Yarlagadda

Abstract:

Antenna design is constrained by mathematical and geometrical parameters. Although diverse antenna structures with a wide range of feeds exist, many geometries remain to be tried, and they cannot be fitted into predefined computational methods. Antenna design and optimization are well suited to an evolutionary algorithmic approach, since the weights of the antenna parameters depend directly on geometric characteristics. An evolutionary algorithm can be explained simply: given a quality function to be maximized, we randomly create a set of candidate solutions (elements of the function's domain) and apply the quality function as an abstract fitness measure. Based on this fitness, some of the better candidates are chosen to seed the next generation by applying recombination and permutation to them. In the conventional approach, the quality function is the same in every iteration. However, antenna parameters and geometries are too varied to fit into a single function. Therefore, the weight coefficients are obtained for all possible antenna electrical parameters and geometries, and the variation is learnt by mining the data obtained for an optimized algorithm. The weight and covariant coefficients of the corresponding parameters are logged for learning and future use as datasets. This paper drafts an approach for obtaining the requirements needed to study and systematize the evolutionary approach to automated antenna design, using our past work on the Vivaldi antenna as the test candidate. Antenna parameters such as gain and directivity are directly constrained by geometries, materials, and dimensions. The design equations are evaluated for all possible conditions to obtain the maxima and minima for a given frequency band. The boundary conditions are thus obtained prior to implementation, easing the optimization. The implementation mainly aims to study the practical computational, processing, and design complexities incurred during simulation. HFSS is chosen for simulations and results.
MATLAB is used for the computations, the generation of combinations, and data logging. MATLAB is also used to apply machine learning algorithms and to plot the data when designing the algorithm. The number of combinations is too large to be tested manually, so the HFSS API is used to call HFSS functions from MATLAB itself. MATLAB's parallel processing toolbox is used to run multiple simulations in parallel. The aim is to develop an add-in for antenna design software such as HFSS or CST, or a standalone application, to optimize pre-identified common parameters of the wide range of antennas available. In this paper, we use MATLAB to calculate Vivaldi antenna parameters such as slot line characteristic impedance, stripline impedance, slot line width, flare aperture size, and dielectric; K-means and a Hamming window are applied to obtain the best test parameters. The HFSS API is used to calculate the radiation, bandwidth, directivity, and efficiency, and the data are logged for applying the evolutionary genetic algorithm in MATLAB. The paper demonstrates the computational weights and the machine learning approach for automated optimization of the Vivaldi antenna.
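The generic evolutionary loop described in this abstract (random candidates, a quality function as fitness, selection of the better candidates, and variation to seed the next generation) might be sketched in Python as follows. Everything here is an assumption made for illustration: the function name `evolve`, the scalar search space, the averaging recombination, and the Gaussian mutation used as the variation operator. It does not call HFSS or MATLAB and does not model any antenna.

```python
import random


def evolve(quality, bounds, pop_size=20, generations=50, seed=0):
    """Maximize `quality` over a scalar interval with a simple evolutionary loop."""
    rng = random.Random(seed)
    low, high = bounds
    # Randomly create the initial set of candidate solutions.
    population = [rng.uniform(low, high) for _ in range(pop_size)]
    for _ in range(generations):
        # Rank candidates by fitness, best first, and keep the better half.
        ranked = sorted(population, key=quality, reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = 0.5 * (a + b)                        # recombination (assumed: averaging)
            child += rng.gauss(0.0, 0.05 * (high - low))  # variation (assumed: Gaussian mutation)
            children.append(min(high, max(low, child)))   # clamp to the search interval
        population = parents + children
    return max(population, key=quality)


# Hypothetical quality function with its maximum at x = 3.
best = evolve(lambda x: -(x - 3.0) ** 2, bounds=(0.0, 10.0))
print(best)
```

Keeping the parents in the next generation means the best fitness never gets worse from one generation to the next. In the workflow the abstract describes, each quality evaluation would be an electromagnetic simulation, which is why running evaluations in parallel matters.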

Keywords: machine learning, Vivaldi, evolutionary algorithm, genetic algorithm

Procedia PDF Downloads 83
2858 New Series Input Parallel Output LLC DC/DC Converter with the Input Voltage Balancing Capacitor for the Electric System of Electric Vehicles

Authors: Kang Hyun Yi

Abstract:

This paper presents a new parallel output LLC DC/DC converter for electric vehicles. An electric vehicle has two batteries: a high voltage battery for the powertrain and a low voltage battery for the vehicle electric system. The low voltage battery is charged from the high voltage battery, so a DC/DC converter with a high voltage input and a high current output is needed. Therefore, a new LLC converter with input voltage compensation is proposed for this high voltage input, low voltage output conversion. The proposed circuit consists of two LLC converters whose inputs are connected in series across the powertrain battery and whose outputs are connected in parallel to the low voltage battery for the vehicle electric system, because both the powertrain battery voltage and the electric power required by the vehicle are becoming high. In addition, an input series voltage compensation capacitor is used to balance the input current of the two LLC converters. The proposed converter offers equal electrical stress on the semiconductor parts and the reactive components, high efficiency, and good heat dissipation.

Keywords: electric vehicle, LLC DC/DC converter, input voltage balancing, parallel output

Procedia PDF Downloads 1021
2857 Design and Fabrication of an Electrostatically Actuated Parallel-Plate Mirror by 3D-Printer

Authors: J. Mizuno, S. Takahashi

Abstract:

In this paper, the design and fabrication of an actuated parallel-plate mirror using a 3D-printer are described. The mirror and electrode layers are fabricated separately and assembled thereafter. The alignment is performed by dowel pin-hole pairs fabricated on the respective layers. The electrodes are formed on the surface of the electrode layer by Au ion sputtering using a suitable mask, which is also fabricated by a 3D-printer. For grounding the mirror layer, the whole surface except the contact area with the electrode paths is Au ion sputtered. 3D-printers are widely used for creating 3D models or mock-ups. The authors have recently proposed that these models can perform electromechanical functions, such as actuation, by suitably masking them and then applying a metallization process. Since the smallest possible fabrication size is on the order of sub-millimeters, these electromechanical devices are named by the authors SMEMS (Sub-Milli Electro-Mechanical Systems) devices. The proposed mirror described in this paper, which consists of parallel-plate electrostatic actuators, is also one type of SMEMS device. In addition, SMEMS fabrication is environmentally clean compared to MEMS (Micro Electro-Mechanical Systems) fabrication processes because no hazardous chemicals or gases are utilized.

Keywords: MEMS, parallel-plate mirror, SMEMS, 3D-printer

Procedia PDF Downloads 409
2856 GPU Accelerated Fractal Image Compression for Medical Imaging in Parallel Computing Platform

Authors: Md. Enamul Haque, Abdullah Al Kaisan, Mahmudur R. Saniat, Aminur Rahman

Abstract:

In this paper, we have implemented both sequential and parallel versions of fractal image compression algorithms, using the CUDA (Compute Unified Device Architecture) programming model to parallelize the program on the Graphics Processing Unit. The algorithms are applied to medical images, which are highly self-similar. There are several improvements in the implementation of the algorithm as well. Fractal image compression is based on the self-similarity of an image, meaning that the majority of its regions resemble one another. We take this opportunity to implement the compression algorithm and monitor its effect using both parallel and sequential implementations. Fractal compression has the property of a high compression rate and a dimensionless scheme. The fractal compression scheme has two stages: encoding and decoding. Encoding is computationally very expensive, whereas decoding is less so. The application of fractal compression to medical images would allow much higher compression ratios to be obtained. In addition, fractal magnification, an inseparable feature of fractal compression, would be very useful in presenting the reconstructed image in a highly readable form. However, like all irreversible methods, fractal compression is connected with the problem of information loss, which is especially troublesome in medical imaging. A very time-consuming encoding process, which can last several hours, is another bothersome drawback of fractal compression.

Keywords: accelerated GPU, CUDA, parallel computing, fractal image compression

Procedia PDF Downloads 300
2855 Discrete Breeding Swarm for Cost Minimization of Parallel Job Shop Scheduling Problem

Authors: Tarek Aboueldahab, Hanan Farag

Abstract:

The Parallel Job Shop Scheduling Problem (JSP) is a multi-objective, multi-constraint NP optimization problem. Traditional Artificial Intelligence techniques have been widely used; however, they can become trapped in a local minimum without reaching the optimum solution, so we propose a hybrid Artificial Intelligence (AI) model in which a Discrete Breeding Swarm (DBS) is added to traditional Artificial Intelligence to avoid this trapping. This model is applied to the cost minimization of the Car Sequencing and Operator Allocation (CSOA) problem. The practical experiment shows that our model outperforms other techniques in cost minimization.

Keywords: parallel job shop scheduling problem, artificial intelligence, discrete breeding swarm, car sequencing and operator allocation, cost minimization

Procedia PDF Downloads 150
2854 Modal FDTD Method for Wave Propagation Modeling Customized for Parallel Computing

Authors: H. Samadiyeh, R. Khajavi

Abstract:

A new FD-based procedure, the modal finite difference method (MFDM), is proposed for seismic wave propagation modeling, in which the simulation is carried out in the modal space. The method employs eigenvalues of a characteristic matrix formed by appropriate time-space FD stencils. Since MFD runs for different modes are totally independent of each other, MFDM can easily be parallelized, and the parallel algorithm is considerably simplified. No domain-decomposition procedure or inter-core data exchange is required. More important is the possibility of skipping the processing of less-significant modes, which enables one to adjust the procedure to the level of accuracy needed. Thus, in addition to considerable ease of parallel programming, computation and storage costs are significantly reduced. The efficiency of the method is demonstrated by some numerical examples.
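The two properties highlighted in this abstract, independent per-mode runs and the option to skip less-significant modes, can be illustrated with a generic modal time-stepping sketch in Python. This is not the authors' MFDM: the decoupled oscillator model, the central-difference update, the names `step_mode` and `modal_solve`, the threshold `tol`, and the sample amplitudes and frequencies are all assumptions made for illustration.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def step_mode(task):
    """Advance one modal amplitude q'' = -omega^2 q with a central-difference scheme."""
    q0, omega, dt, n_steps = task
    # Start from rest: a Taylor step gives the amplitude at t = dt.
    q_prev = q0
    q = q0 * (1.0 - 0.5 * (omega * dt) ** 2)
    for _ in range(n_steps - 1):
        q_prev, q = q, 2.0 * q - q_prev - (omega * dt) ** 2 * q
    return q


def modal_solve(amplitudes, omegas, dt, n_steps, tol=0.0):
    """Run every retained mode as an independent task; modes with |q0| <= tol are skipped."""
    kept = [k for k, q0 in enumerate(amplitudes) if abs(q0) > tol]
    tasks = [(amplitudes[k], omegas[k], dt, n_steps) for k in kept]
    # No task reads another task's data, so no exchange between workers is needed.
    with ThreadPoolExecutor() as pool:
        results = list(pool.map(step_mode, tasks))
    return dict(zip(kept, results))


# Hypothetical modal data: the second mode is negligible and gets skipped.
final = modal_solve([1.0, 1e-9], [2.0 * math.pi, 4.0 * math.pi],
                    dt=0.001, n_steps=1000, tol=1e-6)
print(final)
```

A thread pool is used here only to show the task structure. For real speedups on CPU-bound work in Python, a process pool, MPI ranks, or GPU threads (the keywords of this entry mention both MPI and GPU) would take the place of the threads.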

Keywords: Finite Difference Method, Graphics Processing Unit (GPU), Message Passing Interface (MPI), Modal, Wave propagation

Procedia PDF Downloads 269
2853 Enhancement of Natural Convection Heat Transfer within Closed Enclosure Using Parallel Fins

Authors: F. A. Gdhaidh, K. Hussain, H. S. Qi

Abstract:

A numerical study of natural convection heat transfer in a water-filled cavity has been carried out in 3D for a single-phase liquid cooling system that uses an array of parallel plate fins mounted to one wall of the cavity. The heat source represents a computer CPU with dimensions of 37.5×37.5 mm mounted on a substrate. A cold plate, used as a heat sink, is installed on the opposite vertical end of the enclosure. The air flow inside the computer case is created by an exhaust fan. A turbulent air flow is assumed and the k-ε model is applied. The fins are installed on the substrate to enhance the heat transfer. The applied power ranges from 15 to 40 W. In order to determine the thermal behaviour of the cooling system, the effects of the heat input and the number of parallel plate fins are investigated. The results illustrate that as the fin number increases, the maximum heat source temperature decreases. However, when the fin number reaches a critical value, the temperature starts to increase because the fins are too closely spaced and obstruct the water flow. The introduction of parallel plate fins reduces the maximum heat source temperature by 10% compared to the case without fins. The cooling system maintains the maximum chip temperature at 64.68℃ at a heat input of 40 W, which is much lower than the recommended chip temperature limit of 85℃, and hence the performance of the CPU is enhanced.

Keywords: chips limit temperature, closed enclosure, natural convection, parallel plate, single phase liquid

Procedia PDF Downloads 242
2852 Chemical Fingerprinting of Complex Samples With the Aid of Parallel Outlet Flow Chromatography

Authors: Xavier A. Conlan

Abstract:

Speed of analysis is a significant limitation of current high-performance liquid chromatography/mass spectrometry (HPLC/MS) and ultra-high-pressure liquid chromatography (UHPLC)/MS systems, both of which are used in many forensic investigations. The flow rate limitations of MS detection require a compromise in the chromatographic flow rate, which in turn reduces throughput and, when using modern columns, reduces separation efficiency. Commonly, this restriction is combated through the post-column splitting of flow prior to entry into the mass spectrometer. However, this results in a loss of sensitivity and a loss in efficiency due to the post-extra column dead volume. A new chromatographic column format known as 'parallel segmented flow' involves the splitting of eluent flow within the column outlet end fitting, and in this study we present its application in order to interrogate the provenience of methamphetamine samples with mass spectrometry detection. Using parallel segmented flow, column flow rates as high as 3 mL/min were employed in the analysis of amino acids without post-column splitting to the mass spectrometer. Furthermore, when parallel segmented flow chromatography columns were employed, the sensitivity was more than twice that of conventional systems with post-column splitting when the same volume of mobile phase was passed through the detector. These findings suggest that this type of column technology will particularly enhance the capabilities of modern LC/MS, enabling both high-throughput and sensitive mass spectral detection.

Keywords: chromatography, mass spectrometry, methamphetamine, parallel segmented outlet flow column, forensic sciences

Procedia PDF Downloads 460
2851 CFD Simulations to Study the Cooling Effects of Different Greening Modifications

Authors: An-Shik Yang, Chih-Yung Wen, Chiang-Ho Cheng, Yu-Hsuan Juan

Abstract:

The objective of this study is to conduct computational fluid dynamics (CFD) simulations to evaluate the cooling efficacy of vegetation planted in a public park in Taipei, Taiwan. The impacts on urban microclimates of a park renewal that adds three pavilions and supplementary green areas were probed, and the simulated results revealed that a park with a higher green coverage ratio (GCR) tended to experience a better cooling effect. These findings can be used to explore the effects of different greening modifications on urban environments for achieving effective thermal comfort in urban public spaces.

Keywords: CFD simulations, Green Coverage Ratio, Urban heat island, Urban Public Park

Procedia PDF Downloads 449
2850 Design of Active Power Filters for Harmonics on Power System and Reducing Harmonic Currents

Authors: Düzgün Akmaz, Hüseyin Erişti

Abstract:

In the last few years, harmonics have become more common with the increasing use of nonlinear loads, and they are an ever-increasing problem for line systems. This situation significantly affects power quality and causes large losses in the network. An efficient way to solve these problems is to provide harmonic compensation through parallel active power filters. Many methods can be used in the control systems of the parallel active power filters that provide the compensation. These methods strongly affect the performance of the active power filters, so the choice of control method is significant. In this study, the Fourier analysis (FA) control method and the synchronous reference frame (SRF) control method are discussed. Both control methods are designed to eliminate harmonics and perform reactive power compensation, and they are implemented and tested in the MATLAB/Simulink package. The results of the two methods are compared.
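The idea behind a Fourier-analysis-based reference for a parallel active power filter can be sketched in Python: extract the fundamental component of the load current over one period, and treat the remainder as the harmonic current the filter should inject in opposition. The sketch below is a simplified illustration under stated assumptions, not the authors' Simulink controller. The function names, the sample count `N`, and the load current (a 10 A fundamental plus a 2 A fifth harmonic) are hypothetical, and reactive power compensation is not covered.

```python
import math


def fourier_coefficients(samples, k):
    """Cosine and sine coefficients of harmonic k; samples must span one fundamental period."""
    n = len(samples)
    a = 2.0 / n * sum(s * math.cos(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    b = 2.0 / n * sum(s * math.sin(2.0 * math.pi * k * i / n) for i, s in enumerate(samples))
    return a, b


def compensation_reference(samples):
    """Load current minus its fundamental component, i.e. the harmonic content."""
    n = len(samples)
    a1, b1 = fourier_coefficients(samples, 1)
    fundamental = [a1 * math.cos(2.0 * math.pi * i / n) + b1 * math.sin(2.0 * math.pi * i / n)
                   for i in range(n)]
    return [s - f for s, f in zip(samples, fundamental)]


# Hypothetical distorted load current sampled over one period.
N = 200
load = [10.0 * math.sin(2.0 * math.pi * i / N) + 2.0 * math.sin(2.0 * math.pi * 5 * i / N)
        for i in range(N)]
reference = compensation_reference(load)
print(math.hypot(*fourier_coefficients(load, 5)))
```

Subtracting `reference` from the load current leaves only the fundamental, which is the current the source would then have to supply. A practical controller would run this over a sliding window synchronized to the grid frequency.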

Keywords: parallel active power filters, harmonic compensation, power quality, harmonics

Procedia PDF Downloads 414