Search results for: data parallelism
Commenced in January 2007
Frequency: Monthly
Edition: International
Paper Count: 25273

25273 Grammatical Parallelism in the Qurʼān

Authors: Yehudit Dror

Abstract:

Parallelism, or as it is called in Arabic, al-muqābala, occupies a central position in the rhetorical discipline of ʻilm al-bayān. Parallelism is used as a figure of textual ornamentation or embellishment and can be divided into several types based on its semantics and its formative structure. Parallelism in Arabic has received considerable attention from Arab rhetoricians, which makes it possible to understand its essence: its types, structure and meaning. However, there are some lacunae in their descriptions concerning the function and thematic restrictions of parallelism in the Qurʼān. In my presentation, which focuses on grammatical parallelism, where the two stichoi of the parallelism are the same with respect to syntax and morphology, I will show that parallelism plays important roles in the textual arrangement; it may, for example, conclude a thematic section, indicate a turning point in the text or clarify what has been said previously. In addition, it will be shown that parallelism is not used randomly in the Qurʼān but rather is restricted to repeated themes which carry the most important messages of the Qurʼān, such as God's Might or behavioral patterns of the believers and the non-believers; or it can be used as a stylistic device.

Keywords: grammatical parallelism, half-line, symmetry, Koran

Procedia PDF Downloads 335
25272 Collision Detection Algorithm Based on Data Parallelism

Authors: Zhen Peng, Baifeng Wu

Abstract:

Modern computing technology has entered the era of parallel computing, with a trend toward sustainable and scalable parallelism. Single Instruction Multiple Data (SIMD) is an important way to follow this trend: it can harness more and more computing power by increasing the number of processor cores without modifying the program. Meanwhile, in the fields of scientific computing and engineering design, many computation-intensive applications face the challenge of increasingly large amounts of data. Data parallel computing will be an important way to further improve the performance of these applications. In this paper, we take accurate collision detection in building information modeling as an example. We demonstrate a model for constructing a data parallel algorithm. According to the model, a complex object is decomposed into sets of simple objects; collision detection among complex objects is converted into collision detection among simple objects. The resulting algorithm is a typical SIMD algorithm, and its advantages in parallelism and scalability are unparalleled compared with traditional algorithms.
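The decomposition step can be illustrated with a minimal Python sketch. It assumes axis-aligned boxes as the "simple objects" (the abstract does not name the primitive), and the function and variable names are hypothetical. Every pair test runs the same comparison on different data, which is the property that suits SIMD hardware; here the pairs are simply evaluated one after another.

```python
from itertools import product

# Assumption: a simple object is an axis-aligned box stored as
# (min_x, min_y, min_z, max_x, max_y, max_z). The paper may use other primitives.

def boxes_overlap(a, b):
    """Identical test for every pair: the boxes must overlap on all three axes."""
    return all(a[i] <= b[i + 3] and b[i] <= a[i + 3] for i in range(3))

def complex_objects_collide(parts_a, parts_b):
    """Two complex objects collide if any pair of their simple parts overlaps."""
    # Each pair test is independent of the others, so the pairs could be
    # distributed over SIMD lanes or GPU threads instead of this sequential loop.
    return any(boxes_overlap(a, b) for a, b in product(parts_a, parts_b))

# Hypothetical building elements, each decomposed into boxes.
wall = [(0, 0, 0, 1, 4, 3), (1, 0, 0, 5, 1, 3)]
pipe = [(0.5, 2, 1, 6, 2.5, 1.5)]
duct = [(2, 2, 0, 3, 3, 1)]

print(complex_objects_collide(wall, pipe))
print(complex_objects_collide(wall, duct))
```

With these sample boxes the pipe passes through the first wall segment, while the duct touches neither segment.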

Keywords: data parallelism, collision detection, single instruction multiple data, building information modeling, continuous scalability

Procedia PDF Downloads 291
25271 Exploring SSD Suitable Allocation Schemes Incompliance with Workload Patterns

Authors: Jae Young Park, Hwansu Jung, Jong Tae Kim

Abstract:

Whether data has been well parallelized is an important factor in Solid-State Drive (SSD) performance. SSD parallelization is affected by the allocation scheme, which is therefore directly connected to SSD performance. The representative allocation schemes are dynamic allocation and static allocation. Dynamic allocation is more adaptive in exploiting write-operation parallelism, while static allocation is better at read-operation parallelism. It is therefore hard to select the appropriate allocation scheme when the workload mixes read and write operations. We simulated several mixed data patterns and analyzed the results to help make the right choice for better performance. The results show that static allocation is more suitable when the data arrival interval is long enough for prior operations to finish and the workload is continuously read-intensive. Dynamic allocation performs best for write performance and random data patterns.

Keywords: dynamic allocation, NAND flash based SSD, SSD parallelism, static allocation

Procedia PDF Downloads 340
25270 A Study on the Influence of Planet Pin Parallelism Error to Load Sharing Factor

Authors: Kyung Min Kang, Peng Mou, Dong Xiang, Yong Yang, Gang Shen

Abstract:

In this paper, planet pin parallelism error, which is one of the manufacturing errors of the planet carrier, is employed as the main variable influencing the planet load sharing factor. This error is categorized into two groups: (i) pin parallelism error with rotation about the axis perpendicular to the tangent of the base circle of the gear (x-axis rotation in this paper) and (ii) pin parallelism error with rotation about the tangent axis of the base circle of the gear (y-axis rotation in this paper). For this study, the planetary gear system of a 1.5 MW wind turbine is used, and a pure torsional rigid-body model of this planetary gear is built using SolidWorks and MSC.ADAMS. Based on the quantified parallelism error and the simulation model, a dynamics simulation of the planetary gear is carried out to obtain dynamic mesh load results for each type of error, and the load sharing factor is calculated from the mesh load results. A load sharing factor formula and suggestions for planetary reliability design are proposed in the conclusion of this study.

Keywords: planetary gears, planet load sharing, MSC. ADAMS, parallelism error

Procedia PDF Downloads 400
25269 A Sub-Scalar Approach to the MIPS Architecture

Authors: Kumar Sambhav Pandey, Anamika Singh

Abstract:

Continuous research in the field of computer architecture basically aims at accelerating computational speed and gaining enhanced performance. In this era of superscalar designs, the sub-scalar concept has not gained enough attention for improving computational performance. In this paper, we present a sub-scalar approach that utilizes the parallelism present within the data during processing. The main idea is to split the data into individual smaller entities, which are then processed with a defined, known set of instructions. This sub-scalar approach to the MIPS architecture can bring significant improvement in computational speedup. MIPS-I is the basic design taken into consideration for the development of a sub-scalar MIPS64 that increases instruction-level parallelism (ILP) and resource utilization.

Keywords: dataword, MIPS, processor, sub-scalar

Procedia PDF Downloads 548
25268 Quantifying Parallelism of Vectors Is the Quantification of Distributed N-Party Entanglement

Authors: Shreya Banerjee, Prasanta K. Panigrahi

Abstract:

The three-way distributive entanglement is shown to be related to the parallelism of vectors. Using a measurement-based approach, a set of 2-dimensional vectors is formed, representing the post-measurement states of one of the parties. These vectors originate at the same point and have an angular distance between them. The area spanned by a pair of such vectors is a measure of the entanglement of formation. This leads to a geometrical manifestation of the 3-tangle in 2 dimensions, from an inequality in the area which generalizes for n qubits to reveal that the n-tangle also has a planar structure. Quantifying the genuine n-party entanglement in every 1|(n − 1) bi-partition, it is shown that the genuine n-way entanglement does not manifest in the n-tangle. A new quantity geometrically similar to the 3-tangle is then introduced that represents the genuine n-way entanglement. Extending the formalism to 3 qutrits, the nonlocality without entanglement can be seen to arise from a condition under which the post-measurement state vectors of a separable state show parallelism. A connection to a nontrivial sum uncertainty relation analogous to the Maccone and Pati uncertainty relation is then presented using a decomposition of the post-measurement state vectors along the directions parallel and perpendicular to the pre-measurement state vectors. This study opens a novel way to understand multiparty entanglement in qubit and qudit systems.

Keywords: Geometry of quantum entanglement, Multipartite and distributive entanglement, Parallelism of vectors, Tangle

Procedia PDF Downloads 155
25267 Parallel Version of Reinhard’s Color Transfer Algorithm

Authors: Abhishek Bhardwaj, Manish Kumar Bajpai

Abstract:

An image, with its content and color scheme, presents an effective mode of information sharing and processing. By changing its color scheme, users discover different visions and prospects. This phenomenon of color transfer is used by social media and other entertainment channels. The algorithm of Reinhard et al. was the first to solve the color transfer problem. In this paper, we make this algorithm more efficient by introducing domain parallelism among different processors. We also comment on the factors that affect the speedup for this problem. Finally, by analyzing the experimental data, we claim to propose a novel and efficient parallel version of Reinhard’s algorithm.

Keywords: Reinhard et al’s algorithm, color transferring, parallelism, speedup

Procedia PDF Downloads 614
25266 Relevance of the Variation in the Angulation of Palatal Throat Form to the Orientation of the Occlusal Plane- A Cephalometric Study

Authors: Sanath Kumar Shetty, Sanya Sinha, K. Kamalakanth Shenoy

Abstract:

The posterior reference for the ala-tragal line is a cause of confusion, with different authors suggesting different locations: the superior, middle or inferior part of the tragus. This study was conducted on 200 subjects to evaluate whether any correlation exists between the variation in angulation of the palatal throat form and the relative parallelism of the occlusal plane to the ala-tragal line at different tragal levels. A custom-made Occlusal Plane Analyzer was used to check the parallelism between the ala-tragal line and the occlusal plane. A lateral cephalogram was taken for each subject to measure the angulation of the palatal throat form. Fisher’s exact test was used to evaluate the correlation between the angulation of the palatal throat form and the relative parallelism of the occlusal plane to the ala-tragal line. Also, a classification was formulated for the palatal throat form, based on the confidence interval. From the results of the study, the inferior part, middle part and superior part of the tragus were seen as the reference points in 49.5%, 32% and 18.5% of the subjects respectively. Class I palatal throat form (41 to 50 degrees), Class II palatal throat form (below 41 degrees) and Class III palatal throat form (above 50 degrees) were seen in 42%, 43% and 15% of the subjects respectively. It was also concluded that there is no significant correlation between the variation in the angulation of the palatal throat form and the relative parallelism of the occlusal plane to the ala-tragal line.

Keywords: Ala-Tragal line, occlusal plane, palatal throat form, cephalometry

Procedia PDF Downloads 311
25265 Tetrad Field and Torsion Vectors in Schwarzschild Solution

Authors: M. A. Bakry, Aryn T. Shafeek

Abstract:

In this article, absolute parallelism geometry is used to study the torsional gravitational field, and the tetrad fields, torsion vector, and torsion scalar of Schwarzschild space are obtained. The new solution of the torsional gravitational field is a generalization of the Schwarzschild solution in the context of general relativity. The results are applied to planetary orbits.

Keywords: absolute parallelism geometry, tetrad fields, torsion vectors, torsion scalar

Procedia PDF Downloads 143
25264 Detecting the Edge of Multiple Images in Parallel

Authors: Prakash K. Aithal, U. Dinesh Acharya, Rajesh Gopakumar

Abstract:

An edge is a variation of brightness in an image. Edge detection is useful in many application areas, such as finding forests and rivers in a satellite image or detecting a broken bone in a medical image. The paper discusses finding the edges of multiple aerial images in parallel. The proposed work was tested on 38 images: 37 color images and one monochrome image. The time taken to process N images in parallel is equivalent to the time taken to process one image sequentially. The proposed method achieves pixel-level parallelism as well as image-level parallelism.
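The two levels of parallelism can be sketched in plain Python. This is only a structural illustration under stated assumptions: the central-difference gradient, the threshold value and the names `edge_map` and `edge_maps` are hypothetical, and the abstract's keywords suggest the authors work with OpenCL and MPI rather than Python threads. Each pixel's result depends only on its neighbours in the input (pixel-level independence), and each image is handed to a separate worker (image-level parallelism).

```python
from concurrent.futures import ThreadPoolExecutor

def edge_map(image, threshold=50):
    """Mark interior pixels whose central-difference gradient exceeds threshold."""
    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Each output pixel reads only the input image, so all pixels
            # could be computed independently (pixel-level parallelism).
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            edges[y][x] = 1 if abs(gx) + abs(gy) > threshold else 0
    return edges

def edge_maps(images, workers=4):
    """Process several images concurrently (image-level parallelism)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(edge_map, images))

# Two tiny grayscale test images: a vertical brightness step and a flat field.
step = [[0, 0, 255, 255] for _ in range(4)]
flat = [[10] * 4 for _ in range(4)]
results = edge_maps([step, flat])
for row in results[0]:
    print(row)
```

In CPython, threads running pure Python code do not execute simultaneously, so this sketch shows the task structure rather than a real speedup; a process pool or a GPU kernel would be the usual way to get one.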

Keywords: edge detection, multicore, gpu, opencl, mpi

Procedia PDF Downloads 480
25263 Efficient DNN Training on Heterogeneous Clusters with Pipeline Parallelism

Authors: Lizhi Ma, Dan Liu

Abstract:

Pipeline parallelism has been widely used to accelerate distributed deep learning to alleviate GPU memory bottlenecks and to ensure that models can be trained and deployed smoothly under limited graphics memory conditions. However, in highly heterogeneous distributed clusters, traditional model partitioning methods are not able to achieve load balancing. The overlap of communication and computation is also a big challenge. In this paper, HePipe is proposed, an efficient pipeline parallel training method for highly heterogeneous clusters. According to the characteristics of the neural network model pipeline training task, oriented to the 2-level heterogeneous cluster computing topology, a training method based on the 2-level stage division of neural network modeling and partitioning is designed to improve the parallelism. Additionally, a multi-forward 1F1B scheduling strategy is designed to accelerate the training time of each stage by executing the computation units in advance to maximize the overlap between the forward propagation communication and backward propagation computation. Finally, a dynamic recomputation strategy based on task memory requirement prediction is proposed to improve the fitness ratio of task and memory, which improves the throughput of the cluster and solves the memory shortfall problem caused by memory differences in heterogeneous clusters. The empirical results show that HePipe improves the training speed by 1.6×−2.2× over the existing asynchronous pipeline baselines.

Keywords: pipeline parallelism, heterogeneous cluster, model training, 2-level stage partitioning

Procedia PDF Downloads 19
25262 Software Transactional Memory in a Dynamic Programming Language at Virtual Machine Level

Authors: Szu-Kai Hsu, Po-Ching Lin

Abstract:

As more and more multi-core processors emerge, the traditional sequential programming paradigm no longer suffices. Yet only a few modern dynamic programming languages can leverage this advantage. Ruby, for example, despite its wide adoption, only includes threads as a simple parallel primitive. The global virtual machine lock of the official Ruby runtime makes it impossible to exploit full parallelism. Though various alternative Ruby implementations do eliminate the global virtual machine lock, they only provide developers with dated locking mechanisms for data synchronization. However, traditional locking mechanisms are error-prone by nature. Software Transactional Memory (STM) is one of the promising alternatives. This paper introduces a new virtual machine, GobiesVM, which provides a native software transactional memory based solution for dynamic programming languages to exploit parallelism. We also propose a simplified variation of the Transactional Locking II algorithm. The empirical results of our experiments show that support for STM at the virtual machine level enables developers to write straightforward code without compromising parallelism or sacrificing thread safety. Existing source code requires minimal or even no modification, which allows developers to easily switch their legacy codebase to a parallel environment. The performance evaluations of GobiesVM also indicate that the difference between sequential and parallel execution is significant.

Keywords: global interpreter lock, ruby, software transactional memory, virtual machine

Procedia PDF Downloads 287
25261 Automatic Tuning for a Systemic Model of Banking Originated Losses (SYMBOL) Tool on Multicore

Authors: Ronal Muresano, Andrea Pagano

Abstract:

Nowadays, mathematical and statistical applications are developed with increasing complexity and accuracy. However, this precision and complexity mean that applications need more computational power in order to execute faster. In this sense, multicore environments play an important role in improving and optimizing the execution time of these applications. These environments allow more parallelism to be included inside the node. However, taking advantage of this parallelism is not an easy task, because we have to deal with problems such as core communication, data locality, memory sizes (cache and RAM), synchronization, and data dependencies in the model. These issues become more important when we wish to improve the application’s performance and scalability. Hence, this paper describes an optimization method for the Systemic Model of Banking Originated Losses (SYMBOL) tool developed by the European Commission; the method is based on analyzing the application's weaknesses in order to exploit the advantages of multicore. All these improvements are made in an automatic and transparent manner with the aim of improving the performance metrics of our tool. Finally, experimental evaluations show the effectiveness of our new optimized version, with which we have achieved a considerable improvement in execution time. In the best case tested, the time was reduced by around 96% between the original serial version and the automatic parallel version.

Keywords: algorithm optimization, bank failures, OpenMP, parallel techniques, statistical tool

Procedia PDF Downloads 370
25260 Scheduling Algorithm Based on Load-Aware Queue Partitioning in Heterogeneous Multi-Core Systems

Authors: Hong Kai, Zhong Jun Jie, Chen Lin Qi, Wang Chen Guang

Abstract:

Current scheduling algorithms suffer from inefficient global scheduling parallelism and from local scheduling parallelism that is prone to processor starvation. To address this issue, this paper proposes a load-aware queue partitioning scheduling strategy that first allocates the queues according to the number of processor cores, calculates the load factor to specify the load queue capacity, and then assigns the awaiting nodes to the appropriate perceptual queues according to the precursor nodes and the communication and computation overhead. At the same time, real-time computation of the load factor can effectively prevent a processor from being starved for a long time. Experimental comparison with two classical algorithms shows a certain improvement in both performance metrics: scheduling length and task speedup ratio.

Keywords: load-aware, scheduling algorithm, perceptual queue, heterogeneous multi-core

Procedia PDF Downloads 148
25259 The Language Use of Middle Eastern Freedom Activists' Speeches: A Gender Perspective

Authors: Sulistyaningtyas

Abstract:

Examining the role of Middle Eastern freedom activists’ speech from a gender perspective is considered noteworthy because society in the Middle East is patriarchal. This research aims to examine the language use of Middle Eastern freedom activists’ speeches from a gender perspective. The data sources are speech videos of male and female Middle Eastern freedom activists. In analyzing the data, the theories employed are Language Style from Gender Perspective and The Language for Speech. The results reveal that there are sets of spoken-language differences between male and female speakers. In using the language for speech, both male and female speakers produce metaphor, euphemism, the ‘rule of three’, parallelism, and pronouns with random frequency, which cannot be separated by gender. Moreover, it cannot be concluded that one gender has more potential than the other to influence the audience when delivering a speech. Other factors, particularly non-verbal ones, also affect how a speech can influence the audience.

Keywords: gender perspective, language use, Middle Eastern freedom activists, speech

Procedia PDF Downloads 423
25258 Performance Analysis and Optimization for Diagonal Sparse Matrix-Vector Multiplication on Machine Learning Unit

Authors: Qiuyu Dai, Haochong Zhang, Xiangrong Liu

Abstract:

Diagonal sparse matrix-vector multiplication is a well-studied topic in the fields of scientific computing and big data processing. However, when diagonal sparse matrices are stored in DIA format, there can be a significant number of padded zero elements and scattered points, which can lead to a degradation in the performance of the current DIA kernel. This can also lead to excessive consumption of computational and memory resources. In order to address these issues, the authors propose the DIA-Adaptive scheme and its kernel, which leverages the parallel instruction sets on MLU. The researchers analyze the effect of allocating a varying number of threads, clusters, and hardware architectures on the performance of SpMV using different formats. The experimental results indicate that the proposed DIA-Adaptive scheme performs well and offers excellent parallelism.
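The padding problem described above can be seen in a minimal Python sketch of DIA storage and a plain SpMV over it. The sketch assumes a square matrix and a row-aligned layout in which entry `i` of the diagonal with offset `k` holds `A[i][i + k]`; the helper names are hypothetical, and it does not reproduce the DIA-Adaptive scheme or any MLU instruction.

```python
def to_dia(dense):
    """Convert a square dense matrix to (offsets, data) in a row-aligned DIA layout."""
    n = len(dense)
    # One stored diagonal per distinct offset k = column - row that holds a nonzero.
    offsets = sorted({j - i for i in range(n) for j in range(n) if dense[i][j] != 0})
    # Positions that fall outside the matrix are padded with zeros.
    data = [[dense[i][i + k] if 0 <= i + k < n else 0 for i in range(n)]
            for k in offsets]
    return offsets, data

def dia_spmv(offsets, data, x):
    """Compute y = A x from the DIA representation."""
    n = len(x)
    y = [0] * n
    for k, diag in zip(offsets, data):
        # Rows are independent within a diagonal, so this inner loop is the
        # part a vector unit or a group of threads would execute in parallel.
        for i in range(n):
            j = i + k
            if 0 <= j < n:
                y[i] += diag[i] * x[j]
    return y

A = [[4, 1, 0, 0],
     [0, 5, 2, 0],
     [0, 0, 6, 3],
     [7, 0, 0, 8]]
offsets, data = to_dia(A)
stored = sum(len(d) for d in data)
nonzeros = sum(1 for row in A for v in row if v != 0)
print(offsets, stored, nonzeros)
print(dia_spmv(offsets, data, [1, 2, 3, 4]))
```

The single scattered entry in the bottom-left corner forces a whole extra diagonal to be stored, so 12 values are kept for 8 nonzeros; that overhead is the kind of waste the abstract attributes to the plain DIA format.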

Keywords: adaptive method, DIA, diagonal sparse matrices, MLU, sparse matrix-vector multiplication

Procedia PDF Downloads 136
25257 Rapid Algorithm for GPS Signal Acquisition

Authors: Fabricio Costa Silva, Samuel Xavier de Souza

Abstract:

A Global Positioning System (GPS) receiver is responsible for determining position, velocity and timing information by using satellite information. To get this information, it is necessary to combine an incoming signal with a locally generated one. The procedure called acquisition needs to find two pieces of information: the frequency and the phase of the incoming signal. This is very time consuming, so there are several techniques to reduce the computational complexity, but each of them puts design issues in conflict. In this paper, we present a method that reduces the computational complexity by reducing the search space and parallelizing the search.

Keywords: GPS, acquisition, complexity, parallelism

Procedia PDF Downloads 539
25256 Rapid Parallel Algorithm for GPS Signal Acquisition

Authors: Fabricio Costa Silva, Samuel Xavier de Souza

Abstract:

A Global Positioning System (GPS) receiver is responsible for determining position, velocity and timing information by using satellite information. To get this information, it is necessary to combine an incoming signal with a locally generated one. The procedure called acquisition needs to find two pieces of information: the frequency and the phase of the incoming signal. This is very time consuming, so there are several techniques to reduce the computational complexity, but each of them puts design issues in conflict. In this paper, we present a method that reduces the computational complexity by reducing the search space and parallelizing the search.
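The two-dimensional search can be sketched as a grid of independent cells, one per candidate (frequency, code phase) pair. The following Python sketch is a toy model under stated assumptions: a 16-chip ±1 code invented for the example, frequency expressed in whole cycles per code period, no noise, and hypothetical function names. It shows the exhaustive search only and does not reproduce the authors' search-space reduction.

```python
import cmath
from itertools import product

def correlate_cell(signal, code, freq_bin, phase):
    """Correlation magnitude for one (frequency bin, code phase) cell."""
    n = len(signal)
    acc = 0
    for t in range(n):
        # Remove the candidate carrier, then multiply by the shifted local code.
        carrier = cmath.exp(-2j * cmath.pi * freq_bin * t / n)
        acc += signal[t] * carrier * code[(t - phase) % n]
    return abs(acc)

def acquire(signal, code, freq_bins):
    """Return the (freq_bin, phase) cell with the largest correlation."""
    # Cells do not depend on each other, so they could be evaluated in parallel.
    cells = product(freq_bins, range(len(code)))
    return max(cells, key=lambda c: correlate_cell(signal, code, *c))

# Made-up spreading code and a noiseless received signal with
# a code phase of 5 chips and a frequency offset of 2 bins.
code = [1, 1, 1, -1, -1, 1, -1, 1, 1, -1, -1, -1, 1, -1, 1, 1]
n = len(code)
signal = [code[(t - 5) % n] * cmath.exp(2j * cmath.pi * 2 * t / n) for t in range(n)]

print(acquire(signal, code, range(0, 4)))
```

Because every cell is an independent correlation, shrinking the grid and distributing the remaining cells across workers are complementary ways to cut acquisition time, which matches the two levers the abstract names.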

Keywords: GPS, acquisition, low complexity, parallelism

Procedia PDF Downloads 503
25255 Quantitative Analysis of Multiprocessor Architectures for Radar Signal Processing

Authors: Deepak Kumar, Debasish Deb, Reena Mamgain

Abstract:

Radar signal processing requires high number-crunching capability. Most often this is achieved using a multiprocessor platform. Though a multiprocessor platform provides the capability to meet real-time computational challenges, its architecture, along with the mapping of the algorithm onto the architecture, plays a vital role in using the platform efficiently. To this end, along with standard performance metrics, a few additional metrics are defined which help in evaluating the multiprocessor platform together with the algorithm mapping. A generic multiprocessor architecture cannot suit all processing requirements. The most suitable architecture for a given problem is decided depending on the system requirements and the type of algorithms used. In the paper, we study different architectures and quantify the different performance metrics, which enables comparison of the architectures on their merits. We also carried out a case study of different architectures and their efficiency depending on whether parallelism is exploited on the algorithm, the data, or both.

Keywords: radar signal processing, multiprocessor architecture, efficiency, load imbalance, buffer requirement, pipeline, parallel, hybrid, cluster of processors (COPs)

Procedia PDF Downloads 413
25254 DNA PLA: A Nano-Biotechnological Programmable Device

Authors: Hafiz Md. HasanBabu, Khandaker Mohammad Mohi Uddin, Md. IstiakJaman Ami, Rahat Hossain Faisal

Abstract:

Computing in biomolecular programming is performed through different types of reactions. Proteins and nucleic acids are used to store the information generated by biomolecular programming. DNA (Deoxyribonucleic Acid) can be used to build a molecular computing system and operating system because of its predictable molecular behavior. A DNA device has clear advantages over conventional devices when applied to problems that can be divided into separate, non-sequential tasks. The reason is that DNA strands can hold a great deal of data in memory and conduct multiple operations at once, thus solving decomposable problems much faster. A Programmable Logic Array (PLA) is a programmable device with programmable AND operations and OR operations. In this paper, a DNA PLA is designed by different molecular operations using DNA molecules with the proposed algorithms. The molecular PLA could take advantage of DNA's physical properties to store information and perform calculations. These include extremely dense information storage, enormous parallelism, and extraordinary energy efficiency.

Keywords: biological systems, DNA computing, parallel computing, programmable logic array, PLA, DNA

Procedia PDF Downloads 130
25253 Modified Montgomery for RSA Cryptosystem

Authors: Rupali Verma, Maitreyee Dutta, Renu Vig

Abstract:

Encryption and decryption in RSA are done by modular exponentiation, which is achieved by repeated modular multiplication. Hence, the efficiency of modular multiplication directly determines the efficiency of the RSA cryptosystem. This paper designs a Modified Montgomery Modular multiplication in which the addition of operands is computed by a 4:2 compressor. The basic logic operations in the addition are partitioned over two iterations so that parallel computations are performed. This reduces the critical path delay of the proposed Montgomery design. The proposed design and RSA are implemented on Virtex 2 and Virtex 5 FPGAs. The two factors, partitioning and parallelism, have improved the frequency and throughput of the proposed design.
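The relation between exponentiation and repeated Montgomery multiplication can be shown with a short software sketch in Python. It implements textbook Montgomery reduction with R chosen as a power of two, plus left-to-right square-and-multiply; the function names are hypothetical, and the 4:2 compressor, the two-iteration partitioning and the FPGA datapath from the abstract are hardware details that this sketch does not model. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
def montgomery_setup(n):
    """Precompute constants for an odd modulus n."""
    k = n.bit_length()
    r = 1 << k                       # R = 2^k > n, so division by R is a shift
    n_prime = (-pow(n, -1, r)) % r   # satisfies n * n_prime ≡ -1 (mod R)
    return k, r, n_prime

def mont_mul(a, b, n, k, r, n_prime):
    """Return a * b * R^-1 mod n for 0 <= a, b < n, without dividing by n."""
    t = a * b
    m = ((t & (r - 1)) * n_prime) & (r - 1)   # m = (t mod R) * n_prime mod R
    u = (t + m * n) >> k                      # exact division by R
    return u - n if u >= n else u             # single conditional subtraction

def mod_exp(base, exponent, n):
    """Compute base^exponent mod n by repeated Montgomery multiplication."""
    k, r, n_prime = montgomery_setup(n)
    x = (base * r) % n      # base in Montgomery form
    acc = r % n             # the value 1 in Montgomery form
    for bit in bin(exponent)[2:]:
        acc = mont_mul(acc, acc, n, k, r, n_prime)      # square
        if bit == "1":
            acc = mont_mul(acc, x, n, k, r, n_prime)    # multiply
    return mont_mul(acc, 1, n, k, r, n_prime)           # leave Montgomery form

# Toy RSA parameters (far too small for real use): n = 61 * 53.
n, e, d = 3233, 17, 2753
cipher = mod_exp(65, e, n)
print(cipher == pow(65, e, n), mod_exp(cipher, d, n))
```

Every step of the exponentiation loop is a call to `mont_mul`, which is why shortening the critical path of that one operation, as the paper proposes in hardware, speeds up the whole of RSA.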

Keywords: RSA, montgomery modular multiplication, 4:2 compressor, FPGA

Procedia PDF Downloads 414
25252 Solving a Micromouse Maze Using an Ant-Inspired Algorithm

Authors: Rolando Barradas, Salviano Soares, António Valente, José Alberto Lencastre, Paulo Oliveira

Abstract:

This article reviews Ant Colony Optimization, a nature-inspired algorithm, and its implementation in the Scratch/m-Block programming environment. Ant Colony Optimization belongs to the Swarm Intelligence-based algorithms, a subset of biologically inspired algorithms. The starting point is a problem in which one has a maze and needs to find a path to its center and return to the starting position. This is similar to an ant looking for a path to a food source and returning to its nest. Starting with the implementation of a simple wall-follower simulator, the proposed solution uses a dynamic graphical interface that allows young students to observe the ants’ movement while the algorithm optimizes the routes to the maze’s center. Interface usability, data structures, and the conversion of algorithmic language to Scratch syntax were some of the details addressed during this implementation. This gives young students an easier way to understand the computational concepts of sequences, loops, parallelism, data, events, and conditionals, as they are used throughout the implemented algorithms. Future work includes simulation results with real contest mazes and two different pheromone update methods, and a comparison with the optimized results of the winners of each edition of the contest. It will also include the creation of a Digital Twin relating the virtual simulator to a real micromouse in a full-size maze. The first test results show that the algorithm found the same optimized solutions as the winners of each edition of the Micromouse contest, making this a good solution for maze pathfinding.

Keywords: nature inspired algorithms, scratch, micromouse, problem-solving, computational thinking

Procedia PDF Downloads 126
25251 A Study of Mandarin Ba Constructions from the Perspective of Event Structure

Authors: Changyin Zhou

Abstract:

Ba constructions are a special type of construction in Chinese. Their syntactic behaviors are closely related to their event structural properties. The existing analysis, which treats the semantic function of Ba as causative, has difficulty accounting for the discrepancy between Ba constructions and their corresponding constructions without Ba in expressing causativity. This paper holds that Ba in Ba constructions is a functional category expressing affectedness. The affectedness expressed by Ba can be positive or negative. The functional category Ba expressing negative affectedness has the semantic property of being 'expected'. The precondition of a Ba construction is the boundedness of the event concerned. This paper, holding that motion events and change-of-state events are parallel, proposes a syntactic model based on the notions of boundedness and affectedness, discusses the transformations between Ba constructions and the related resultative constructions, and derives the various Ba constructions concerned.

Keywords: affectedness, Ba constructions, boundedness, event structure, resultative constructions

Procedia PDF Downloads 422
25250 Core Number Optimization Based Scheduler to Order/Mapp Simulink Application

Authors: Asma Rebaya, Imen Amari, Kaouther Gasmi, Salem Hasnaoui

Abstract:

Over the last few years, the number of cores in digital signal processors and general-purpose processors has increased spectacularly. Concurrently, significant research has been done to benefit from this high degree of parallelism. Indeed, this research focuses on providing efficient scheduling of hardware/software systems onto multicore architectures. The scheduling process consists of statically choosing one core to execute each task and specifying an execution order for the application tasks. In this paper, we describe an efficient scheduler that calculates the optimal number of cores required to schedule an application, gives a heuristic scheduling solution and evaluates its cost. Our results are evaluated and compared with the Preesm scheduler's results, and we show that our scheduler allows better scheduling in terms of latency, computation time and number of cores.

Keywords: computation time, hardware/software system, latency, optimization, multi-cores platform, scheduling

Procedia PDF Downloads 284
25249 Data Transformations in Data Envelopment Analysis

Authors: Mansour Mohammadpour

Abstract:

Data transformation refers to the modification of any point in a data set by a mathematical function. When applying transformations, the measurement scale of the data is modified. Data transformations are commonly employed to turn data into the appropriate form, which can serve various functions in the quantitative analysis of the data. This study addresses the investigation of the use of data transformations in Data Envelopment Analysis (DEA). Although data transformations are important options for analysis, they do fundamentally alter the nature of the variable, making the interpretation of the results somewhat more complex.

Keywords: data transformation, data envelopment analysis, undesirable data, negative data

Procedia PDF Downloads 24
25248 A Parallel Computation Based on GPU Programming for a 3D Compressible Fluid Flow Simulation

Authors: Sugeng Rianto, P.W. Arinto Yudi, Soemarno Muhammad Nurhuda

Abstract:

Computing a 3D compressible fluid flow for a virtual environment with haptic interaction is a non-trivial problem. The main difficulty is achieving good performance while balancing visualization, tactile feedback interaction, and computation. In this paper, we describe our computational approach based on parallel programming on a GPU. The 3D fluid flow solvers were developed for smoke dispersion simulation by combining cubic interpolated propagation (CIP) based fluid flow solvers with the parallelism and programmability of the GPU. The fluid flow solver is built on a GPU-CPU message passing scheme to allow rapid development of haptic feedback modes for fluid dynamic data. A fast solution is obtained by applying the CIP fluid flow solvers. With this scheme, multiphase fluid flow equations can be solved simultaneously. To further accelerate the computation, the Navier-Stokes equations (NSEs) are packed into the channels of texels, and the computation is performed on pixels, which can be regarded as a grid of cells. Therefore, despite the complexity of the obstacle geometry, multiple vertices and pixels can be processed in parallel. The data are also shared in global memory so that the CPU can control the haptics in providing kinaesthetic interaction and feeling. The results show that GPU-based parallel computation provides effective simulation of a compressible fluid flow model for real-time interaction in 3D computer graphics on a PC platform. This report shows the feasibility of a new approach to solving the compressible fluid flow equations on the GPU. The experimental tests showed that compressible fluid flowing around various obstacles, with haptic interaction on a few model obstacles, can be simulated effectively and efficiently at a reasonable frame rate with realistic visualization. These results confirm that good performance and a balance between visualization, tactile feedback interaction, and computation can be achieved.

Keywords: CIP, compressible fluid, GPU programming, parallel computation, real-time visualisation

Procedia PDF Downloads 432
25247 The Novel of 'the Adventure of the Secrets': Character in Postmodern Labyrinth, the Problem of Time and Subject

Authors: Nargiz Ismayilova

Abstract:

In Kamal Abdulla's "The Adventure of Mysteries", the plot develops along two parallel lines. While reading the work, the future looks hazy against the background of the present and the past. In particular, it is impossible to predict the end of the work. This can be considered a success of the author. The novel reflects the features of postmodernism. It is characterized by a richness of intertwined plots, themes, meta-submission, and devices (fiction) typical of postmodern prose technique. The introduction and progress of the work take the reader to a place that is unrecognizable and unknown to him but at the same time very familiar. The parts of the novel, divided into chapters, force the reader to distinguish mystical repetitions from the artistic circulation of reality. This directly prompts the reader to think. Intertextual communication and the variety of fiction, intelligence, and informativeness determine the perspective of the exemplary reader. As is well known, “postmodern novels, which often use intertextual communication and superstructure techniques, focus on expression rather than on the subject, and benefit from history by combining fiction with historical facts, are able to attract attention with their extraordinary foreign fiction.”

Keywords: Kamal Abdulla, postmodernism, parallelism, labyrinth, comparison, novel

Procedia PDF Downloads 181
25246 Portable and Parallel Accelerated Development Method for Field-Programmable Gate Array (FPGA)-Central Processing Unit (CPU)- Graphics Processing Unit (GPU) Heterogeneous Computing

Authors: Nan Hu, Chao Wang, Xi Li, Xuehai Zhou

Abstract:

The field-programmable gate array (FPGA) has been widely adopted in the high-performance computing domain. In recent years, embedded systems-on-a-chip (SoCs) have come to contain coarse-grained multi-core CPUs (central processing units) and mobile GPUs (graphics processing units) that can be used as general-purpose accelerators. The motivation is that algorithms with various parallel characteristics can be efficiently mapped to a heterogeneous architecture that couples these three processors. The CPU and GPU offload some computationally intensive tasks from the FPGA to reduce resource consumption and lower the overall cost of the system. However, in common present-day scenarios, applications use only one type of accelerator because development approaches that support collaboration among heterogeneous processors face challenges. Therefore, a systematic approach is needed that offers write-once-run-anywhere portability and high execution performance of the modules mapped to various architectures, and that facilitates design space exploration. In this paper, a servant-execution-flow model is proposed for the abstraction of the cooperation of the heterogeneous processors, which supports task partition, communication and synchronization. At its first run, the intermediate language represented by the data flow diagram can generate the executable code of the target processor or can be converted into high-level programming languages. The instantiation parameters efficiently control the relationship between the modules and computational units, including the mapping of two hierarchical processing units and the adjustment of data-level parallelism. An embedded system of a three-dimensional waveform oscilloscope is selected as a case study. The performance of algorithms such as contrast stretching is analyzed with implementations on various combinations of these processors. The experimental results show that the heterogeneous computing system, using less than 35% of the resources, achieves performance similar to that of the pure FPGA implementation with approximately the same energy efficiency.

Keywords: FPGA-CPU-GPU collaboration, design space exploration, heterogeneous computing, intermediate language, parameterized instantiation

Procedia PDF Downloads 118
25245 Parallelization of Random Accessible Progressive Streaming of Compressed 3D Models over Web

Authors: Aayushi Somani, Siba P. Samal

Abstract:

Three-dimensional (3D) meshes are data structures that store geometric information about an object or scene, generally in the form of vertices and edges. Current laser scanning and other geometric data acquisition technologies produce high-resolution samples, which lead to high-resolution meshes. While high-resolution meshes give better rendering quality and are therefore used often, processing and storing them is currently resource-intensive. At the same time, web applications for data processing have become ubiquitous owing to their accessibility. For 3D meshes, the advancement of 3D web technologies such as WebGL and WebVR has enabled high-fidelity rendering of huge meshes. However, there is a gap in the ability to stream huge meshes to a native client and browser application due to high network latency. There is also an inherent delay in loading WebGL pages due to large and complex models. The focus of our work is to identify the challenges faced when such meshes are streamed to and processed on hand-held devices, which have limited resources. One solution conventionally used in the graphics community to alleviate resource limitations is mesh compression. We take a two-step approach to random-access progressive compression and its parallel implementation. The first step partitions the original mesh into multiple sub-meshes; we then apply data parallelism to compress these sub-meshes. Threaded decompression logic is then implemented inside the web browser engine by modifying the WebGL implementation in the Chromium open source engine. This concept could change the way e-commerce and virtual reality technology work on consumer electronic devices. Objects can be compressed on the server and transmitted over the network. Progressive decompression can then be performed on the client device, and the result rendered. The multiple views currently used on e-commerce sites to show the same product from different angles can be replaced by a single progressive model for a smoother user experience. The approach can also be used in WebVR for common activities such as virtual reality shopping, watching movies, and playing games. Our experiments and comparison with existing techniques show encouraging results in terms of latency (compressed size is ~10-15% of the original mesh), processing time (20-22% increase over serial implementation) and quality of user experience in the web browser.
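
The partition-then-compress step can be sketched as follows. This is not the authors' progressive mesh codec: it uses general-purpose zlib compression on raw vertex coordinates purely to show the data-parallel structure, in which each sub-mesh is compressed independently and can later be decompressed on its own (random access). The function names, the contiguous partitioning rule and the sample vertices are assumptions for illustration.

```python
import struct
import zlib
from concurrent.futures import ThreadPoolExecutor

def partition(vertices, n_parts):
    """Split a vertex list into at most n_parts contiguous sub-meshes."""
    size = max(1, -(-len(vertices) // n_parts))  # ceiling division
    return [vertices[i:i + size] for i in range(0, len(vertices), size)]

def compress_chunk(chunk):
    """Pack (x, y, z) triples as little-endian float32 and compress them."""
    raw = b"".join(struct.pack("<3f", *v) for v in chunk)
    return zlib.compress(raw)

def decompress_chunk(blob):
    """Inverse of compress_chunk; works on one sub-mesh in isolation."""
    raw = zlib.decompress(blob)
    return [struct.unpack_from("<3f", raw, off) for off in range(0, len(raw), 12)]

def compress_mesh(vertices, n_parts=4):
    """Compress every sub-mesh with the same routine, one worker per part."""
    parts = partition(vertices, n_parts)
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        return list(pool.map(compress_chunk, parts))  # order is preserved

# Hypothetical vertices, chosen to be exactly representable as float32.
vertices = [(float(i), i * 0.5, 0.25) for i in range(10)]
blobs = compress_mesh(vertices, n_parts=4)
print(len(blobs), decompress_chunk(blobs[1]))
```

Because each blob is self-contained, a client could request and decode only the sub-meshes it needs, and could decode several of them on separate threads.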

Keywords: 3D compression, 3D mesh, 3D web, chromium, client-server architecture, e-commerce, level of details, parallelization, progressive compression, WebGL, WebVR

Procedia PDF Downloads 170
25244 Modified Bat Algorithm for Economic Load Dispatch Problem

Authors: Daljinder Singh, J.S.Dhillon, Balraj Singh

Abstract:

According to the no free lunch theorem, no single search technique can perform best in all conditions. Optimization methods can be an attractive choice for solving optimization problems, as they may offer advantages such as robust and reliable performance, global search capability, low information requirements, ease of implementation, parallelism, and no requirement for a differentiable and continuous objective function. To balance exploration and exploitation and to further enhance the performance of the bat algorithm, this paper proposes a modified bat algorithm that adds a search procedure based on the bat's previous experience. The proposed algorithm is used to solve the economic load dispatch (ELD) problem. Practical constraints such as valve-point loading, along with power balance constraints and generator limits, are considered. The variable elimination method is used to handle the power demand constraint. The proposed algorithm is tested on various ELD problems. The results show that the proposed algorithm performs better in the majority of the ELD problems considered and is on par with existing algorithms for some of the problems.
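
To make the constraint handling concrete, the sketch below evaluates one candidate dispatch. It assumes the commonly used ELD cost model, a quadratic fuel cost plus a rectified-sine valve-point term, and applies variable elimination by computing the last unit's output from the power balance instead of treating it as a free variable. Transmission losses are neglected, the coefficient values are hypothetical, and the bat algorithm's search itself is not shown; the abstract does not give the authors' exact formulation.

```python
import math

def fuel_cost(p, a, b, c, e, f, p_min):
    """Quadratic fuel cost plus the valve-point term |e * sin(f * (p_min - p))|."""
    return a * p * p + b * p + c + abs(e * math.sin(f * (p_min - p)))

def evaluate_dispatch(free_outputs, units, demand):
    """Evaluate a dispatch in which the last unit is eliminated as a variable.

    free_outputs: outputs chosen by the optimizer for all units except the last.
    The last unit's output is fixed by the power balance (losses neglected), so
    the demand constraint always holds; only the generator limits need checking.
    Returns (outputs, total cost, True if every unit is within its limits).
    """
    outputs = list(free_outputs) + [demand - sum(free_outputs)]
    cost = sum(
        fuel_cost(p, u["a"], u["b"], u["c"], u["e"], u["f"], u["p_min"])
        for p, u in zip(outputs, units)
    )
    within_limits = all(u["p_min"] <= p <= u["p_max"] for p, u in zip(outputs, units))
    return outputs, cost, within_limits

# Hypothetical two-unit system (coefficients are made up for illustration).
units = [
    {"a": 0.01, "b": 2.0, "c": 10.0, "e": 100.0, "f": 0.05, "p_min": 50.0, "p_max": 200.0},
    {"a": 0.02, "b": 1.5, "c": 20.0, "e": 80.0, "f": 0.04, "p_min": 50.0, "p_max": 200.0},
]
outputs, cost, ok = evaluate_dispatch([100.0], units, demand=250.0)
print(outputs, round(cost, 2), ok)
```

A search procedure would then vary only `free_outputs`; a limit violation on the eliminated unit could be handled with a penalty term, in line with the "penalty method" keyword below.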

Keywords: bat algorithm, economic load dispatch, penalty method, variable elimination method

Procedia PDF Downloads 461