Search results for: lower bound of processor elements

2863 An Implementation of MacMahon's Partition Analysis in Ordering the Lower Bound of Processing Elements for the Algorithm of LU Decomposition

Authors: Halil Snopce, Ilir Spahiu, Lavdrim Elmazi

Abstract:

A lot of Scientific and Engineering problems require the solution of large systems of linear equations of the form bAx in an effective manner. LU-Decomposition offers good choices for solving this problem. Our approach is to find the lower bound of processing elements needed for this purpose. Here is used the so called Omega calculus, as a computational method for solving problems via their corresponding Diophantine relation. From the corresponding algorithm is formed a system of linear diophantine equalities using the domain of computation which is given by the set of lattice points inside the polyhedron. Then is run the Mathematica program DiophantineGF.m. This program calculates the generating function from which is possible to find the number of solutions to the system of Diophantine equalities, which in fact gives the lower bound for the number of processors needed for the corresponding algorithm. There is given a mathematical explanation of the problem as well. Keywordsgenerating function, lattice points in polyhedron, lower bound of processor elements, system of Diophantine equationsand : calculus.

Keywords: generating function, lattice points in polyhedron, lower bound of processor elements, system of Diophantine equations and calculus.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1475

2862 An Efficient Algorithm for Reliability Lower Bound of Distributed Systems

Authors: Mohamed H. S. Mohamed, Yang Xiao-zong, Liu Hong-wei, Wu Zhi-bo

Abstract:

The reliability of distributed systems and computer networks have been modeled by a probabilistic network or a graph G. Computing the residual connectedness reliability (RCR), denoted by R(G), under the node fault model is very useful, but is an NP-hard problem. Since it may need exponential time of the network size to compute the exact value of R(G), it is important to calculate its tight approximate value, especially its lower bound, at a moderate calculation time. In this paper, we propose an efficient algorithm for reliability lower bound of distributed systems with unreliable nodes. We also applied our algorithm to several typical classes of networks to evaluate the lower bounds and show the effectiveness of our algorithm.

Keywords: Distributed systems, probabilistic network, residual connectedness reliability, lower bound.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1683

2861 Volatility of Cu, Ni, Cr, Co, Pb, and As in Fluidised-Bed Combustion Chamber in Relation to Their Modes of Occurrence in Coal

Authors: L. Bartoňová, Z. Klika

Abstract:

Modes of occurrence of Pb, As, Cr, Co, Cu, and Ni in bituminous coal and lignite were determined by means of sequential extraction using NH4OAc, HCl, HF and HNO3 extraction solutions. Elemental affinities obtained were then evaluated in relation to volatility of these elements during the combustion of these coals in two circulating fluidised-bed power stations. It was found out that higher percentage of the elements bound in silicates brought about lower volatility, while higher elemental proportion with monosulphides association (or bound as exchangeable ion) resulted in higher volatility. The only exception was the behavior of arsenic, whose volatility depended on amount of limestone added during the combustion process (as desulphurisation additive) rather than to its association in coal.

Keywords: Coal combustion, sequential extraction, trace elements, volatility.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1792

2860 Lower Bound of Time Span Product for a General Class of Signals in Fractional Fourier Domain

Authors: Sukrit Shankar, Chetana Shanta Patsa, Jaydev Sharma

Abstract:

Fractional Fourier Transform is a generalization of the classical Fourier Transform which is often symbolized as the rotation in time- frequency plane. Similar to the product of time and frequency span which provides the Uncertainty Principle for the classical Fourier domain, there has not been till date an Uncertainty Principle for the Fractional Fourier domain for a generalized class of finite energy signals. Though the lower bound for the product of time and Fractional Fourier span is derived for the real signals, a tighter lower bound for a general class of signals is of practical importance, especially for the analysis of signals containing chirps. We hence formulate a mathematical derivation that gives the lower bound of time and Fractional Fourier span product. The relation proves to be utmost importance in taking the Fractional Fourier Transform with adaptive time and Fractional span resolutions for a varied class of complex signals.

Keywords: Fractional Fourier Transform, uncertainty principle, Fractional Fourier Span, amplitude, phase.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1194

2859 Some New Inequalities for Eigenvalues of the Hadamard Product and the Fan Product of Matrices

Authors: Jing Li, Guang Zhou

Abstract:

Let A and B be nonnegative matrices. A new upper bound on the spectral radius ρ(A◦B) is obtained. Meanwhile, a new lower bound on the smallest eigenvalue q(AB) for the Fan product, and a new lower bound on the minimum eigenvalue q(B ◦A−1) for the Hadamard product of B and A−1 of two nonsingular M-matrices A and B are given. Some results of comparison are also given in theory. To illustrate our results, numerical examples are considered.

Keywords: Hadamard product, Fan product; nonnegative matrix, M-matrix, Spectral radius, Minimum eigenvalue, 1-path cover.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1897

2858 Evaluating the Impact of Replacement Policies on the Cache Performance and Energy Consumption in Different Multicore Embedded Systems

Authors: Sajjad Rostami-Sani, Mojtaba Valinataj, Amir-Hossein Khojir-Angasi

Abstract:

The cache has an important role in the reduction of access delay between a processor and memory in high-performance embedded systems. In these systems, the energy consumption is one of the most important concerns, and it will become more important with smaller processor feature sizes and higher frequencies. Meanwhile, the cache system dissipates a significant portion of energy compared to the other components of a processor. There are some elements that can affect the energy consumption of the cache such as replacement policy and degree of associativity. Due to these points, it can be inferred that selecting an appropriate configuration for the cache is a crucial part of designing a system. In this paper, we investigate the effect of different cache replacement policies on both cache’s performance and energy consumption. Furthermore, the impact of different Instruction Set Architectures (ISAs) on cache’s performance and energy consumption has been investigated.

Keywords: L1-cache, energy consumption, replacement policy, Instruction set architecture, multicore processor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 962

2857 Concept of a Pseudo-Lower Bound Solution for Reinforced Concrete Slabs

Authors: M. De Filippo, J. S. Kuang

Abstract:

In construction industry, reinforced concrete (RC) slabs represent fundamental elements of buildings and bridges. Different methods are available for analysing the structural behaviour of slabs. In the early ages of last century, the yield-line method has been proposed to attempt to solve such problem. Simple geometry problems could easily be solved by using traditional hand analyses which include plasticity theories. Nowadays, advanced finite element (FE) analyses have mainly found their way into applications of many engineering fields due to the wide range of geometries to which they can be applied. In such cases, the application of an elastic or a plastic constitutive model would completely change the approach of the analysis itself. Elastic methods are popular due to their easy applicability to automated computations. However, elastic analyses are limited since they do not consider any aspect of the material behaviour beyond its yield limit, which turns to be an essential aspect of RC structural performance. Furthermore, their applicability to non-linear analysis for modeling plastic behaviour gives very reliable results. Per contra, this type of analysis is computationally quite expensive, i.e. not well suited for solving daily engineering problems. In the past years, many researchers have worked on filling this gap between easy-to-implement elastic methods and computationally complex plastic analyses. This paper aims at proposing a numerical procedure, through which a pseudo-lower bound solution, not violating the yield criterion, is achieved. The advantages of moment distribution are taken into account, hence the increase in strength provided by plastic behaviour is considered. The lower bound solution is improved by detecting over-yielded moments, which are used to artificially rule the moment distribution among the rest of the non-yielded elements. The proposed technique obeys Nielsen’s yield criterion. The outcome of this analysis provides a simple, yet accurate, and non-time-consuming tool of predicting the lower-bound solution of the collapse load of RC slabs. By using this method, structural engineers can find the fracture patterns and ultimate load bearing capacity. The collapse triggering mechanism is found by detecting yield-lines. An application to the simple case of a square clamped slab is shown, and a good match was found with the exact values of collapse load.

Keywords: Computational mechanics, lower bound method, reinforced concrete slabs, yield-line.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1096

2856 A Novel Implementation of Application Specific Instruction-set Processor (ASIP) using Verilog

Authors: Kamaraju.M, Lal Kishore.K, Tilak.A.V.N

Abstract:

The general purpose processors that are used in embedded systems must support constraints like execution time, power consumption, code size and so on. On the other hand an Application Specific Instruction-set Processor (ASIP) has advantages in terms of power consumption, performance and flexibility. In this paper, a 16-bit Application Specific Instruction-set processor for the sensor data transfer is proposed. The designed processor architecture consists of on-chip transmitter and receiver modules along with the processing and controlling units to enable the data transmission and reception on a single die. The data transfer is accomplished with less number of instructions as compared with the general purpose processor. The ASIP core operates at a maximum clock frequency of 1.132GHz with a delay of 0.883ns and consumes 569.63mW power at an operating voltage of 1.2V. The ASIP is implemented in Verilog HDL using the Xilinx platform on Virtex4.

Keywords: ASIP, Data transfer, Instruction set, Processor

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2068

2855 Design of Multi-disease Diagnosis Processor using Hypernetworks Technique

Authors: Jae-Yeon Song, Seung-Yerl Lee, Kyu-Yeul Wang, Byung-Soo Kim, Sang-Seol Lee, Seong-Seob Shin, Jae-Young Choi, Chong Ho Lee, Jeahyun Park, Duck-Jin Chung

Abstract:

In this paper, we propose disease diagnosis hardware architecture by using Hypernetworks technique. It can be used to diagnose 3 different diseases (SPECT Heart, Leukemia, Prostate cancer). Generally, the disparate diseases require specified diagnosis hardware model for each disease. Using similarities of three diseases diagnosis processor, we design diagnosis processor that can diagnose three different diseases. Our proposed architecture that is combining three processors to one processor can reduce hardware size without decrease of the accuracy.

Keywords: Diagnosis processor, Hypernetworks, Leukemia, Mask, Prostate cancer, SPECT Heart data

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1363

2854 A Hyper-Domain Image Watermarking Method based on Macro Edge Block and Wavelet Transform for Digital Signal Processor

Authors: Yi-Pin Hsu, Shin-Yu Lin

Abstract:

In order to protect original data, watermarking is first consideration direction for digital information copyright. In addition, to achieve high quality image, the algorithm maybe can not run on embedded system because the computation is very complexity. However, almost nowadays algorithms need to build on consumer production because integrator circuit has a huge progress and cheap price. In this paper, we propose a novel algorithm which efficient inserts watermarking on digital image and very easy to implement on digital signal processor. In further, we select a general and cheap digital signal processor which is made by analog device company to fit consumer application. The experimental results show that the image quality by watermarking insertion can achieve 46 dB can be accepted in human vision and can real-time execute on digital signal processor.

Keywords: watermarking, digital signal processor, embedded system

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1249

2853 Computing the Loop Bound in Iterative Data Flow Graphs Using Natural Token Flow

Authors: Ali Shatnawi

Abstract:

Signal processing applications which are iterative in nature are best represented by data flow graphs (DFG). In these applications, the maximum sampling frequency is dependent on the topology of the DFG, the cyclic dependencies in particular. The determination of the iteration bound, which is the reciprocal of the maximum sampling frequency, is critical in the process of hardware implementation of signal processing applications. In this paper, a novel technique to compute the iteration bound is proposed. This technique is different from all previously proposed techniques, in the sense that it is based on the natural flow of tokens into the DFG rather than the topology of the graph. The proposed algorithm has lower run-time complexity than all known algorithms. The performance of the proposed algorithm is illustrated through analytical analysis of the time complexity, as well as through simulation of some benchmark problems.

Keywords: Data flow graph, Iteration period bound, Rateoptimalscheduling, Recursive DSP algorithms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2562

2852 Lower Bounds of Some Small Ramsey Numbers

Authors: Decha Samana, Vites Longani

Abstract:

For positive integer s and t, the Ramsey number R(s, t) is the least positive integer n such that for every graph G of order n, either G contains Ks as a subgraph or G contains Kt as a subgraph. We construct the circulant graphs and use them to obtain lower bounds of some small Ramsey numbers.

Keywords: Lower bound, Ramsey numbers, Graphs, Distance line.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1374

2851 Limit Analysis of FGM Circular Plates Subjected to Arbitrary Rotational Symmetric Loads

Authors: Kargarnovin M.H., Faghidian S. A, Arghavani J.

Abstract:

The limit load carrying capacity of functionally graded materials (FGM) circular plates subjected to an arbitrary rotationally symmetric loading has been computed. It is provided that the plate material behaves rigid perfectly plastic and obeys either the Square or the Tresca yield criterion. To this end the upper and lower bound principles of limit analysis are employed to determine the exact value for the limiting load. The correctness of the result are verified and finally limiting loads for two examples namely; through radius and through thickness FGM circular plates with simply supported edges are calculated, respectively and moreover, the values of critical loading factor are determined.

Keywords: Circular plate, FGM circular plate, Limit analysis, Lower and Upper bound theorems.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1595

2850 Jeffrey's Prior for Unknown Sinusoidal Noise Model via Cramer-Rao Lower Bound

Authors: Samuel A. Phillips, Emmanuel A. Ayanlowo, Rasaki O. Olanrewaju, Olayode Fatoki

Abstract:

This paper employs the Jeffrey's prior technique in the process of estimating the periodograms and frequency of sinusoidal model for unknown noisy time variants or oscillating events (data) in a Bayesian setting. The non-informative Jeffrey's prior was adopted for the posterior trigonometric function of the sinusoidal model such that Cramer-Rao Lower Bound (CRLB) inference was used in carving-out the minimum variance needed to curb the invariance structure effect for unknown noisy time observational and repeated circular patterns. An average monthly oscillating temperature series measured in degree Celsius (0C) from 1901 to 2014 was subjected to the posterior solution of the unknown noisy events of the sinusoidal model via Markov Chain Monte Carlo (MCMC). It was not only deduced that two minutes period is required before completing a cycle of changing temperature from one particular degree Celsius to another but also that the sinusoidal model via the CRLB-Jeffrey's prior for unknown noisy events produced a miniature posterior Maximum A Posteriori (MAP) compare to a known noisy events.

Keywords: Cramer-Rao Lower Bound (CRLB), Jeffrey's prior, Sinusoidal, Maximum A Posteriori (MAP), Markov Chain Monte Carlo (MCMC), Periodograms.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 658

2849 Calculation of Wave Function at the Origin (WFO) for Heavy Mesons by Numerical Solving of the Schrodinger Equation

Authors: M. Momeni Feyli

Abstract:

Many recent high energy physics calculations involving charm and beauty invoke wave function at the origin (WFO) for the meson bound state. Uncertainties of charm and beauty quark masses and different models for potentials governing these bound states require a simple numerical algorithm for evaluation of the WFO's for these bound states. We present a simple algorithm for this propose which provides WFO's with high precision compared with similar ones already obtained in the literature.

Keywords: Mesons, Bound states, Schrodinger equation, Nonrelativistic quark model.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1504

2848 An Adversarial Construction of Instability Bounds in LIS Networks

Authors: Dimitrios Koukopoulos

Abstract:

In this work, we study the impact of dynamically changing link slowdowns on the stability properties of packetswitched networks under the Adversarial Queueing Theory framework. Especially, we consider the Adversarial, Quasi-Static Slowdown Queueing Theory model, where each link slowdown may take on values in the two-valued set of integers {1, D} with D > 1 which remain fixed for a long time, under a (w, ¤ü)-adversary. In this framework, we present an innovative systematic construction for the estimation of adversarial injection rate lower bounds, which, if exceeded, cause instability in networks that use the LIS (Longest-in- System) protocol for contention-resolution. In addition, we show that a network that uses the LIS protocol for contention-resolution may result in dropping its instability bound at injection rates ¤ü > 0 when the network size and the high slowdown D take large values. This is the best ever known instability lower bound for LIS networks.

Keywords: Network stability, quality of service, adversarial queueing theory, greedy scheduling protocols.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1229

2847 Spin-Dependent Transport Signatures of Bound States: From Finger to Top Gates

Authors: Yun-Hsuan Yu, Chi-Shung Tang, Nzar Rauf Abdullah, Vidar Gudmundsson

Abstract:

Spin-orbit gap feature in energy dispersion of one-dimensional devices is revealed via strong spin-orbit interaction (SOI) effects under Zeeman field. We describe the utilization of a finger-gate or a top-gate to control the spin-dependent transport characteristics in the SOI-Zeeman influenced split-gate devices by means of a generalized spin-mixed propagation matrix method. For the finger-gate system, we find a bound state in continuum for incident electrons within the ultra-low energy regime. For the top-gate system, we observe more bound-state features in conductance associated with the formation of spin-associated hole-like or electron-like quasi-bound states around band thresholds, as well as hole bound states around the reverse point of the energy dispersion. We demonstrate that the spin-dependent transport behavior of a top-gate system is similar to that of a finger-gate system only if the top-gate length is less than the effective Fermi wavelength.

Keywords: Spin-orbit, Zeeman, top-gate, finger-gate, bound state.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 951

2846 A Systematic Construction of Instability Bounds in LIS Networks

Authors: Dimitrios Koukopoulos

Abstract:

In this work, we study the impact of dynamically changing link slowdowns on the stability properties of packetswitched networks under the Adversarial Queueing Theory framework. Especially, we consider the Adversarial, Quasi-Static Slowdown Queueing Theory model, where each link slowdown may take on values in the two-valued set of integers {1, D} with D > 1 which remain fixed for a long time, under a (w, p)-adversary. In this framework, we present an innovative systematic construction for the estimation of adversarial injection rate lower bounds, which, if exceeded, cause instability in networks that use the LIS (Longest-in- System) protocol for contention-resolution. In addition, we show that a network that uses the LIS protocol for contention-resolution may result in dropping its instability bound at injection rates p > 0 when the network size and the high slowdown D take large values. This is the best ever known instability lower bound for LIS networks.

Keywords: Parallel computing, network stability, adversarial queuing theory, greedy scheduling protocols.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1415

2845 Improving the Performances of the nMPRA Architecture by Implementing Specific Functions in Hardware

Authors: Ionel Zagan, Vasile Gheorghita Gaitan

Abstract:

Minimizing the response time to asynchronous events in a real-time system is an important factor in increasing the speed of response and an interesting concept in designing equipment fast enough for the most demanding applications. The present article will present the results regarding the validation of the nMPRA (Multi Pipeline Register Architecture) architecture using the FPGA Virtex-7 circuit. The nMPRA concept is a hardware processor with the scheduler implemented at the processor level; this is done without affecting a possible bus communication, as is the case with the other CPU solutions. The implementation of static or dynamic scheduling operations in hardware and the improvement of handling interrupts and events by the real-time executive described in the present article represent a key solution for eliminating the overhead of the operating system functions. The nMPRA processor is capable of executing a preemptive scheduling, using various algorithms without a software scheduler. Therefore, we have also presented various scheduling methods and algorithms used in scheduling the real-time tasks.

Keywords: nMPRA architecture, pipeline processor, preemptive scheduling, real-time system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 880

2844 An Approximation Method for Three Quark Systems in the Hyper-Spherical Approach

Authors: B. Rezaei, G. R. Boroun, M. Abdolmaleki

Abstract:

The bound state energy of three quark systems is studied in the framework of a non- relativistic spin independent phenomenological model. The hyper- spherical coordinates are considered for the solution this system. According to Jacobi coordinate, we determined the bound state energy for (uud) and (ddu) quark systems, as quarks are flavorless mass, and it is restrict that choice potential at low and high range in nucleon bag for a bound state.

Keywords: Adiabatic expansion, grand angular momentum, binding energy, perturbation, baryons.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1433

2843 Bounds on Reliability of Parallel Computer Interconnection Systems

Authors: Ranjan Kumar Dash, Chita Ranjan Tripathy

Abstract:

The evaluation of residual reliability of large sized parallel computer interconnection systems is not practicable with the existing methods. Under such conditions, one must go for approximation techniques which provide the upper bound and lower bound on this reliability. In this context, a new approximation method for providing bounds on residual reliability is proposed here. The proposed method is well supported by two algorithms for simulation purpose. The bounds on residual reliability of three different categories of interconnection topologies are efficiently found by using the proposed method

Keywords: Parallel computer network, reliability, probabilisticgraph, interconnection networks.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1344

2842 The Fluid Limit of the Critical Processor Sharing Tandem Queue

Authors: Amal Ezzidani, Abdelghani Ben Tahar, Mohamed Hanini

Abstract:

A sequence of finite tandem queue is considered for this study. Each one has a single server, which operates under the egalitarian processor sharing discipline. External customers arrive at each queue according to a renewal input process and having a general service times distribution. Upon completing service, customers leave the current queue and enter to the next. Under mild assumptions, including critical data, we prove the existence and the uniqueness of the fluid solution. For asymptotic behavior, we provide necessary and sufficient conditions for the invariant state and the convergence to this invariant state. In the end, we establish the convergence of a correctly normalized state process to a fluid limit characterized by a system of algebraic and integral equations.

Keywords: Fluid Limit, fluid model, measure valued process, processor sharing, tandem queue.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 477

2841 Upper Bound of the Generalize p-Value for the Behrens-Fisher Problem with a Known Ratio of Variances

Authors: Rada Somkhuean, Suparat Niwitpong, Sa-aat Niwitpong

Abstract:

This paper presents the generalized p-values for testing the Behrens-Fisher problem when a ratio of variance is known. We also derive a closed form expression of the upper bound of the proposed generalized p-value.

Keywords: Generalized p-value, hypothesis testing, ratio of variances, upper bound.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1233

2840 Applying Branch-and-Bound and Petri Net Methods in Solving the Two-Sided Assembly Line Balancing Problem

Authors: Nai-Chieh Wei, I-Ming Chao, Chin-Jung Liuand, Hong Long Chen

Abstract:

This paper combines the branch-and-bound method and the petri net to solve the two-sided assembly line balancing problem, thus facilitating effective branching and pruning of tasks. By integrating features of the petri net, such as reachability graph and incidence matrix, the propose method can support the branch-and-bound to effectively reduce poor branches with systematic graphs. Test results suggest that using petri net in the branching process can effectively guide the system trigger process, and thus, lead to consistent results.

Keywords: Branch-and-Bound Method, Petri Net, Two-Sided Assembly Line Balancing Problem.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1918

2839 A Systematic Approach for Finding Hamiltonian Cycles with a Prescribed Edge in Crossed Cubes

Authors: Jheng-Cheng Chen, Chia-Jui Lai, Chang-Hsiung Tsai,

Abstract:

The crossed cube is one of the most notable variations of hypercube, but some properties of the former are superior to those of the latter. For example, the diameter of the crossed cube is almost the half of that of the hypercube. In this paper, we focus on the problem embedding a Hamiltonian cycle through an arbitrary given edge in the crossed cube. We give necessary and sufficient condition for determining whether a given permutation with n elements over Zn generates a Hamiltonian cycle pattern of the crossed cube. Moreover, we obtain a lower bound for the number of different Hamiltonian cycles passing through a given edge in an n-dimensional crossed cube. Our work extends some recently obtained results.

Keywords: Interconnection network, Hamiltonian, crossed cubes, prescribed edge.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1528

2838 Implementing High Performance VPN Router using Cavium-s CN2560 Security Processor

Authors: Sang Su Lee, Sang Woo Lee, Yong Sung Jeon, Ki Young Kim

Abstract:

IPsec protocol[1] is a set of security extensions developed by the IETF and it provides privacy and authentication services at the IP layer by using modern cryptography. In this paper, we describe both of H/W and S/W architectures of our router system, SRS-10. The system is designed to support high performance routing and IPsec VPN. Especially, we used Cavium-s CN2560 processor to implement IPsec processing in inline-mode.

Keywords: IP, router, VPN, IPsec.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2034

2837 Parallel-computing Approach for FFT Implementation on Digital Signal Processor (DSP)

Authors: Yi-Pin Hsu, Shin-Yu Lin

Abstract:

An efficient parallel form in digital signal processor can improve the algorithm performance. The butterfly structure is an important role in fast Fourier transform (FFT), because its symmetry form is suitable for hardware implementation. Although it can perform a symmetric structure, the performance will be reduced under the data-dependent flow characteristic. Even though recent research which call as novel memory reference reduction methods (NMRRM) for FFT focus on reduce memory reference in twiddle factor, the data-dependent property still exists. In this paper, we propose a parallel-computing approach for FFT implementation on digital signal processor (DSP) which is based on data-independent property and still hold the property of low-memory reference. The proposed method combines final two steps in NMRRM FFT to perform a novel data-independent structure, besides it is very suitable for multi-operation-unit digital signal processor and dual-core system. We have applied the proposed method of radix-2 FFT algorithm in low memory reference on TI TMSC320C64x DSP. Experimental results show the method can reduce 33.8% clock cycles comparing with the NMRRM FFT implementation and keep the low-memory reference property.

Keywords: Parallel-computing, FFT, low-memory reference, TIDSP.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2199

2836 Optimization of SAD Algorithm on VLIW DSP

Authors: Hui-Jae You, Sun-Tae Chung, Souhwan Jung

Abstract:

SAD (Sum of Absolute Difference) algorithm is heavily used in motion estimation which is computationally highly demanding process in motion picture encoding. To enhance the performance of motion picture encoding on a VLIW processor, an efficient implementation of SAD algorithm on the VLIW processor is essential. SAD algorithm is programmed as a nested loop with a conditional branch. In VLIW processors, loop is usually optimized by software pipelining, but researches on optimal scheduling of software pipelining for nested loops, especially nested loops with conditional branches are rare. In this paper, we propose an optimal scheduling and implementation of SAD algorithm with conditional branch on a VLIW DSP processor. The proposed optimal scheduling first transforms the nested loop with conditional branch into a single loop with conditional branch with consideration of full utilization of ILP capability of the VLIW processor and realization of earlier escape from the loop. Next, the proposed optimal scheduling applies a modulo scheduling technique developed for single loop. Based on this optimal scheduling strategy, optimal implementation of SAD algorithm on TMS320C67x, a VLIW DSP is presented. Through experiments on TMS320C6713 DSK, it is shown that H.263 encoder with the proposed SAD implementation performs better than other H.263 encoder with other SAD implementations, and that the code size of the optimal SAD implementation is small enough to be appropriate for embedded environments.

Keywords: Optimal implementation, SAD algorithm, VLIW, TMS320C6713.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2346

2835 Performance Analysis of Digital Signal Processors Using SMV Benchmark

Authors: Erh-Wen Hu, Cyril S. Ku, Andrew T. Russo, Bogong Su, Jian Wang

Abstract:

Unlike general-purpose processors, digital signal processors (DSP processors) are strongly application-dependent. To meet the needs for diverse applications, a wide variety of DSP processors based on different architectures ranging from the traditional to VLIW have been introduced to the market over the years. The functionality, performance, and cost of these processors vary over a wide range. In order to select a processor that meets the design criteria for an application, processor performance is usually the major concern for digital signal processing (DSP) application developers. Performance data are also essential for the designers of DSP processors to improve their design. Consequently, several DSP performance benchmarks have been proposed over the past decade or so. However, none of these benchmarks seem to have included recent new DSP applications. In this paper, we use a new benchmark that we recently developed to compare the performance of popular DSP processors from Texas Instruments and StarCore. The new benchmark is based on the Selectable Mode Vocoder (SMV), a speech-coding program from the recent third generation (3G) wireless voice applications. All benchmark kernels are compiled by the compilers of the respective DSP processors and run on their simulators. Weighted arithmetic mean of clock cycles and arithmetic mean of code size are used to compare the performance of five DSP processors. In addition, we studied how the performance of a processor is affected by code structure, features of processor architecture and optimization of compiler. The extensive experimental data gathered, analyzed, and presented in this paper should be helpful for DSP processor and compiler designers to meet their specific design goals.

Keywords: digital signal processors, DSP benchmark, instruction level parallelism, modified cyclomatic complexity, performance analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1610

2834 Processor Scheduling on Parallel Computers

Authors: Mohammad S. Laghari, Gulzar A. Khuwaja

Abstract:

Many problems in computer vision and image processing present potential for parallel implementations through one of the three major paradigms of geometric parallelism, algorithmic parallelism and processor farming. Static process scheduling techniques are used successfully to exploit geometric and algorithmic parallelism, while dynamic process scheduling is better suited to dealing with the independent processes inherent in the process farming paradigm. This paper considers the application of parallel or multi-computers to a class of problems exhibiting spatial data characteristic of the geometric paradigm. However, by using processor farming paradigm, a dynamic scheduling technique is developed to suit the MIMD structure of the multi-computers. A hybrid scheme of scheduling is also developed and compared with the other schemes. The specific problem chosen for the investigation is the Hough transform for line detection.

Keywords: Hough transforms, parallel computer, parallel paradigms, scheduling.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1651