Search results for: VLSI hardware.

309 Neural Network Implementation Using FPGA: Issues and Application

Authors: A. Muthuramalingam, S. Himavathi, E. Srinivasan

Abstract:

.Hardware realization of a Neural Network (NN), to a large extent depends on the efficient implementation of a single neuron. FPGA-based reconfigurable computing architectures are suitable for hardware implementation of neural networks. FPGA realization of ANNs with a large number of neurons is still a challenging task. This paper discusses the issues involved in implementation of a multi-input neuron with linear/nonlinear excitation functions using FPGA. Implementation method with resource/speed tradeoff is proposed to handle signed decimal numbers. The VHDL coding developed is tested using Xilinx XC V50hq240 Chip. To improve the speed of operation a lookup table method is used. The problems involved in using a lookup table (LUT) for a nonlinear function is discussed. The percentage saving in resource and the improvement in speed with an LUT for a neuron is reported. An attempt is also made to derive a generalized formula for a multi-input neuron that facilitates to estimate approximately the total resource requirement and speed achievable for a given multilayer neural network. This facilitates the designer to choose the FPGA capacity for a given application. Using the proposed method of implementation a neural network based application, namely, a Space vector modulator for a vector-controlled drive is presented

Keywords: FPGA implementation, multi-input neuron, neural network, nn based space vector modulator.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 4384

308 Low-Cost Mechatronic Design of an Omnidirectional Mobile Robot

Authors: S. Cobos-Guzman

Abstract:

This paper presents the results of a mechatronic design based on a 4-wheel omnidirectional mobile robot that can be used in indoor logistic applications. The low-level control has been selected using two open-source hardware (Raspberry Pi 3 Model B+ and Arduino Mega 2560) that control four industrial motors, four ultrasound sensors, four optical encoders, a vision system of two cameras, and a Hokuyo URG-04LX-UG01 laser scanner. Moreover, the system is powered with a lithium battery that can supply 24 V DC and a maximum current-hour of 20Ah.The Robot Operating System (ROS) has been implemented in the Raspberry Pi and the performance is evaluated with the selection of the sensors and hardware selected. The mechatronic system is evaluated and proposed safe modes of power distribution for controlling all the electronic devices based on different tests. Therefore, based on different performance results, some recommendations are indicated for using the Raspberry Pi and Arduino in terms of power, communication, and distribution of control for different devices. According to these recommendations, the selection of sensors is distributed in both real-time controllers (Arduino and Raspberry Pi). On the other hand, the drivers of the cameras have been implemented in Linux and a python program has been implemented to access the cameras. These cameras will be used for implementing a deep learning algorithm to recognize people and objects. In this way, the level of intelligence can be increased in combination with the maps that can be obtained from the laser scanner.

Keywords: Autonomous, indoor robot, mechatronic, omnidirectional robot.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 541

307 Interconnect Analysis of a Novel Multiplexer Based Full-Adder Cell for Power and Propagation Delay Optimizations

Authors: G.Ramana Murthy, C.Senthilpari, P.Velrajkumar, Lim Tien Sze

Abstract:

The proposed multiplexer-based novel 1-bit full adder cell is schematized by using DSCH2 and its layout is generated by using microwind VLSI CAD tool. The adder cell layout interconnect analysis is performed by using BSIM4 layout analyzer. The adder circuit is compared with other six existing adder circuits for parametric analysis. The proposed adder cell gives better performance than the other existing six adder circuits in terms of power, propagation delay and PDP. The proposed adder circuit is further analyzed for interconnect analysis, which gives better performance than other adder circuits in terms of layout thickness, width and height.

Keywords: Full Adder, Interconnect Analysis, Low-Power, Multiplexer, Propagation Delay, Parametric Analysis.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1525

306 An Automated Test Setup for the Characterization of Antenna in CATR

Authors: Faisal Amin, Abdul Mueed, Xu Jiadong

Abstract:

This paper describes the development of a fully automated measurement software for antenna radiation pattern measurements in a Compact Antenna Test Range (CATR). The CATR has a frequency range from 2-40 GHz and the measurement hardware includes a Network Analyzer for transmitting and Receiving the microwave signal and a Positioner controller to control the motion of the Styrofoam column. The measurement process includes Calibration of CATR with a Standard Gain Horn (SGH) antenna followed by Gain versus angle measurement of the Antenna under test (AUT). The software is designed to control a variety of microwave transmitter / receiver and two axis Positioner controllers through the standard General Purpose interface bus (GPIB) interface. Addition of new Network Analyzers is supported through a slight modification of hardware control module. Time-domain gating is implemented to remove the unwanted signals and get the isolated response of AUT. The gated response of the AUT is compared with the calibration data in the frequency domain to obtain the desired results. The data acquisition and processing is implemented in Agilent VEE and Matlab. A variety of experimental measurements with SGH antennas were performed to validate the accuracy of software. A comparison of results with existing commercial softwares is presented and the measured results are found to be within .2 dBm.

Keywords: Antenna measurement, calibration, time-domain gating, VNA, Positioner controller

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1951

305 Estimation of Attenuation and Phase Delay in Driving Voltage Waveform of an Ultra-High-Speed Image Sensor by Dimensional Analysis

Authors: V. T. S. Dao, T. G. Etoh, C. Vo Le, H. D. Nguyen, K. Takehara, T. Akino, K. Nishi

Abstract:

We present an explicit expression to estimate driving voltage attenuation through RC networks representation of an ultrahigh- speed image sensor. Elmore delay metric for a fundamental RC chain is employed as the first-order approximation. By application of dimensional analysis to SPICE simulation data, we found a simple expression that significantly improves the accuracy of the approximation. Estimation error of the resultant expression for uniform RC networks is less than 2%. Similarly, another simple closed-form model to estimate 50 % delay through fundamental RC networks is also derived with sufficient accuracy. The framework of this analysis can be extended to address delay or attenuation issues of other VLSI structures.

Keywords: Dimensional Analysis, Elmore model, RC network, Signal Attenuation, Ultra-High-Speed Image Sensor.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1403

304 Skin Effect: A Natural Phenomenon for Minimization of Ground Bounce in VLSI RC Interconnect

Authors: Shilpi Lavania

Abstract:

As the frequency of operation has attained a range of GHz and signal rise time continues to increase interconnect technology is suffering due to various high frequency effects as well as ground bounce problem. In some recent studies a high frequency effect i.e. skin effect has been modeled and its drawbacks have been discussed. This paper strives to make an impression on the advantage side of modeling skin effect for interconnect line. The proposed method has considered a CMOS with RC interconnect. Delay and noise considering ground bounce problem and with skin effect are discussed. The simulation results reveal an advantage of considering skin effect for minimization of ground bounce problem during the working of the model. Noise and delay variations with temperature are also presented.

Keywords: Interconnect, Skin effect, Ground Bounce, Delay, Noise.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3109

303 Design Techniques and Implementation of Low Power High-Throughput Discrete Wavelet Transform Tilters for JPEG 2000 Standard

Authors: Grigorios D. Dimitroulakos, N. D. Zervas, N. Sklavos, Costas E. Goutis

Abstract:

In this paper, the implementation of low power, high throughput convolutional filters for the one dimensional Discrete Wavelet Transform and its inverse are presented. The analysis filters have already been used for the implementation of a high performance DWT encoder [15] with minimum memory requirements for the JPEG 2000 standard. This paper presents the design techniques and the implementation of the convolutional filters included in the JPEG2000 standard for the forward and inverse DWT for achieving low-power operation, high performance and reduced memory accesses. Moreover, they have the ability of performing progressive computations so as to minimize the buffering between the decomposition and reconstruction phases. The experimental results illustrate the filters- low power high throughput characteristics as well as their memory efficient operation.

Keywords: Discrete Wavelet Transform; JPEG2000 standard; VLSI design; Low Power-Throughput-optimized filters

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1255

302 Identification of Promising Infant Clusters to Obtain Improved Block Layout Designs

Authors: Mustahsan Mir, Ahmed Hassanin, Mohammed A. Al-Saleh

Abstract:

The layout optimization of building blocks of unequal areas has applications in many disciplines including VLSI floorplanning, macrocell placement, unequal-area facilities layout optimization, and plant or machine layout design. A number of heuristics and some analytical and hybrid techniques have been published to solve this problem. This paper presents an efficient high-quality building-block layout design technique especially suited for solving large-size problems. The higher efficiency and improved quality of optimized solutions are made possible by introducing the concept of Promising Infant Clusters in a constructive placement procedure. The results presented in the paper demonstrate the improved performance of the presented technique for benchmark problems in comparison with published heuristic, analytic, and hybrid techniques.

Keywords: Block layout problem, building-block layout design, CAD, optimization, search techniques.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1220

301 A Design of Elliptic Curve Cryptography Processor Based on SM2 over GF(p)

Authors: Shiji Hu, Lei Li, Wanting Zhou, Daohong Yang

Abstract:

The data encryption is the foundation of today’s communication. On this basis, to improve the speed of data encryption and decryption is always an important goal for high-speed applications. This paper proposed an elliptic curve crypto processor architecture based on SM2 prime field. Regarding hardware implementation, we optimized the algorithms in different stages of the structure. For modulo operation on finite field, we proposed an optimized improvement of the Karatsuba-Ofman multiplication algorithm and shortened the critical path through the pipeline structure in the algorithm implementation. Based on SM2 recommended prime field, a fast modular reduction algorithm is used to reduce 512-bit data obtained from the multiplication unit. The radix-4 extended Euclidean algorithm was used to realize the conversion between the affine coordinate system and the Jacobi projective coordinate system. In the parallel scheduling point operations on elliptic curves, we proposed a three-level parallel structure of point addition and point double based on the Jacobian projective coordinate system. Combined with the scalar multiplication algorithm, we added mutual pre-operation to the point addition and double point operation to improve the efficiency of the scalar point multiplication. The proposed ECC hardware architecture was verified and implemented on Xilinx Virtex-7 and ZYNQ-7 platforms, and each 256-bit scalar multiplication operation took 0.275ms. The performance for handling scalar multiplication is 32 times that of CPU (dual-core ARM Cortex-A9).

Keywords: Elliptic curve cryptosystems, SM2, modular multiplication, point multiplication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 194

300 Motion Area Estimated Motion Estimation with Triplet Search Patterns for H.264/AVC

Authors: T. Song, T. Shimamoto

Abstract:

In this paper a fast motion estimation method for H.264/AVC named Triplet Search Motion Estimation (TS-ME) is proposed. Similar to some of the traditional fast motion estimation methods and their improved proposals which restrict the search points only to some selected candidates to decrease the computation complexity, proposed algorithm separate the motion search process to several steps but with some new features. First, proposed algorithm try to search the real motion area using proposed triplet patterns instead of some selected search points to avoid dropping into the local minimum. Then, in the localized motion area a novel 3-step motion search algorithm is performed. Proposed search patterns are categorized into three rings on the basis of the distance from the search center. These three rings are adaptively selected by referencing the surrounding motion vectors to early terminate the motion search process. On the other hand, computation reduction for sub pixel motion search is also discussed considering the appearance probability of the sub pixel motion vector. From the simulation results, motion estimation speed improved by a factor of up to 38 when using proposed algorithm than that of the reference software of H.264/AVC with ignorable picture quality loss.

Keywords: Motion estimation, VLSI, image processing, search patterns

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1308

299 Phase Error Accumulation Methodology for On-Chip Cell Characterization

Authors: Chang Soo Kang, In Ho Im, Sergey Churayev, Timour Paltashev

Abstract:

This paper describes the design of new method of propagation delay measurement in micro and nanostructures during characterization of ASIC standard library cell. Providing more accuracy timing information about library cell to the design team we can improve a quality of timing analysis inside of ASIC design flow process. Also, this information could be very useful for semiconductor foundry team to make correction in technology process. By comparison of the propagation delay in the CMOS element and result of analog SPICE simulation. It was implemented as digital IP core for semiconductor manufacturing process. Specialized method helps to observe the propagation time delay in one element of the standard-cell library with up-to picoseconds accuracy and less. Thus, the special useful solutions for VLSI schematic to parameters extraction, basic cell layout verification, design simulation and verification are announced.

Keywords: phase error accumulation methodology, gatepropagation delay, Processor Testing, MEMS Testing

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1480

298 Architecture of Large-Scale Systems

Authors: Arne Koschel, Irina Astrova, Elena Deutschkämer, Jacob Ester, Johannes Feldmann

Abstract:

In this paper various techniques in relation to large-scale systems are presented. At first, explanation of large-scale systems and differences from traditional systems are given. Next, possible specifications and requirements on hardware and software are listed. Finally, examples of large-scale systems are presented.

Keywords: Distributed file systems, cashing, large scale systems, MapReduce algorithm, NoSQL databases.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3022

297 A Review on WEB Resources in Teaching of Geotechnical Engineering

Authors: Amin Chegenizadeh, Hamid Nikraz

Abstract:

The use of computer hardware and software in education and training dates to the early 1940s, when American researchers developed flight simulators which used analog computers to generate simulated onboard instrument data.Computer software is widely used to help engineers and undergraduate student solve their problems quickly and more accurately. This paper presents the list of computer software in geotechnical engineering.

Keywords: Geotechnical, Teaching, Courseware

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1719

296 GPS Navigator for Blind Walking in a Campus

Authors: Rangsipan Marukatat, Pongmanat Manaspaibool, Benjawan Khaiprapay, Pornpimon Plienjai

Abstract:

We developed a GPS-based navigation device for the blind, with audio guidance in Thai language. The device is composed of simple and inexpensive hardware components. Its user interface is quite simple. It determines optimal routes to various landmarks in our university campus by using heuristic search for the next waypoints. We tested the device and made note of its limitations and possible extensions.

Keywords: Blind, global positioning system (GPS), navigation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2429

295 Leakage Reduction ONOFIC Approach for Deep Submicron VLSI Circuits Design

Authors: Vijay Kumar Sharma, Manisha Pattanaik, Balwinder Raj

Abstract:

Minimizations of power dissipation, chip area with higher circuit performance are the necessary and key parameters in deep submicron regime. The leakage current increases sharply in deep submicron regime and directly affected the power dissipation of the logic circuits. In deep submicron region the power dissipation as well as high performance is the crucial concern since increasing importance of portable systems. Number of leakage reduction techniques employed to reduce the leakage current in deep submicron region but they have some trade-off to control the leakage current. ONOFIC approach gives an excellent agreement between power dissipation and propagation delay for designing the efficient CMOS logic circuits. In this article ONOFIC approach is compared with LECTOR technique and output results show that ONOFIC approach significantly reduces the power dissipation and enhance the speed of the logic circuits. The lower power delay product is the big outcome of this approach and makes it an influential leakage reduction technique.

Keywords: Deep submicron, Leakage Current, LECTOR, ONOFIC, Power Delay Product

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2477

294 Efficient Semi-Systolic Finite Field Multiplier Using Redundant Basis

Authors: Hyun-Ho Lee, Kee-Won Kim

Abstract:

The arithmetic operations over GF(2m) have been extensively used in error correcting codes and public-key cryptography schemes. Finite field arithmetic includes addition, multiplication, division and inversion operations. Addition is very simple and can be implemented with an extremely simple circuit. The other operations are much more complex. The multiplication is the most important for cryptosystems, such as the elliptic curve cryptosystem, since computing exponentiation, division, and computing multiplicative inverse can be performed by computing multiplication iteratively. In this paper, we present a parallel computation algorithm that operates Montgomery multiplication over finite field using redundant basis. Also, based on the multiplication algorithm, we present an efficient semi-systolic multiplier over finite field. The multiplier has less space and time complexities compared to related multipliers. As compared to the corresponding existing structures, the multiplier saves at least 5% area, 50% time, and 53% area-time (AT) complexity. Accordingly, it is well suited for VLSI implementation and can be easily applied as a basic component for computing complex operations over finite field, such as inversion and division operation.

Keywords: Finite field, Montgomery multiplication, systolic array, cryptography.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1622

293 A High Level Implementation of a High Performance Data Transfer Interface for NoC

Authors: Mansi Jhamb, R. K. Sharma, A. K. Gupta

Abstract:

The distribution of a single global clock across a chip has become the major design bottleneck for high performance VLSI systems owing to the power dissipation, process variability and multicycle cross-chip signaling. A Network-on-Chip (NoC) architecture partitioned into several synchronous blocks has become a promising approach for attaining fine-grain power management at the system level. In a NoC architecture the communication between the blocks is handled asynchronously. To interface these blocks on a chip operating at different frequencies, an asynchronous FIFO interface is inevitable. However, these asynchronous FIFOs are not required if adjacent blocks belong to the same clock domain. In this paper, we have designed and analyzed a 16-bit asynchronous micropipelined FIFO of depth four, with the awareness of place and route on an FPGA device. We have used a commercially available Spartan 3 device and designed a high speed implementation of the asynchronous 4-phase micropipeline. The asynchronous FIFO implemented on the FPGA device shows 76 Mb/s throughput and a handshake cycle of 109 ns for write and 101.3 ns for read at the simulation under the worst case operating conditions (voltage = 0.95V) on a working chip at the room temperature.

Keywords: Asynchronous, FIFO, FPGA, GALS, Network-on- Chip (NoC), VHDL.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 2018

292 A Novel Recursive Multiplierless Algorithm for 2-D DCT

Authors: V.K.Ananthashayana, Geetha.K.S

Abstract:

In this paper, a recursive algorithm for the computation of 2-D DCT using Ramanujan Numbers is proposed. With this algorithm, the floating-point multiplication is completely eliminated and hence the multiplierless algorithm can be implemented using shifts and additions only. The orthogonality of the recursive kernel is well maintained through matrix factorization to reduce the computational complexity. The inherent parallel structure yields simpler programming and hardware implementation and provides log 1 2 3 2 N N-N+ additions and N N 2 log 2 shifts which is very much less complex when compared to other recent multiplierless algorithms.

Keywords: DCT, Multilplerless, Ramanujan Number, Recursive.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1525

291 Game-Tree Simplification by Pattern Matching and Its Acceleration Approach using an FPGA

Authors: Suguru Ochiai, Toru Yabuki, Yoshiki Yamaguchi, Yuetsu Kodama

Abstract:

In this paper, we propose a Connect6 solver which adopts a hybrid approach based on a tree-search algorithm and image processing techniques. The solver must deal with the complicated computation and provide high performance in order to make real-time decisions. The proposed approach enables the solver to be implemented on a single Spartan-6 XC6SLX45 FPGA produced by XILINX without using any external devices. The compact implementation is achieved through image processing techniques to optimize a tree-search algorithm of the Connect6 game. The tree search is widely used in computer games and the optimal search brings the best move in every turn of a computer game. Thus, many tree-search algorithms such as Minimax algorithm and artificial intelligence approaches have been widely proposed in this field. However, there is one fundamental problem in this area; the computation time increases rapidly in response to the growth of the game tree. It means the larger the game tree is, the bigger the circuit size is because of their highly parallel computation characteristics. Here, this paper aims to reduce the size of a Connect6 game tree using image processing techniques and its position symmetric property. The proposed solver is composed of four computational modules: a two-dimensional checkmate strategy checker, a template matching module, a skilful-line predictor, and a next-move selector. These modules work well together in selecting next moves from some candidates and the total amount of their circuits is small. The details of the hardware design for an FPGA implementation are described and the performance of this design is also shown in this paper.

Keywords: Connect6, pattern matching, game-tree reduction, hardware direct computation

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1946

290 Electronic Tool that Helps in Learning How to Play a Flute

Authors: Galeano R. Katherine, Rincon L. David, Luengas C. Lely

Abstract:

This paper describes the development of an electronic instrument that looks like a flute, which is able to sense the basic musical notes being executed by a specific user. The principal function of the instrument is to teach how to play a flute. This device will generate a significant academic impact, in a field of virtual reality interactive that combine art and technology. With this example is expected to contribute in research and implementation of teaching devices around the world.

Keywords: Flute, Hardware, Learning, Virtual Reality.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1632

289 A New Hardware Implementation of Manchester Line Decoder

Authors: Ibrahim A. Khorwat, Nabil Naas

Abstract:

In this paper, we present a simple circuit for Manchester decoding and without using any complicated or programmable devices. This circuit can decode 90kbps of transmitted encoded data; however, greater than this transmission rate can be decoded if high speed devices were used. We also present a new method for extracting the embedded clock from Manchester data in order to use it for serial-to-parallel conversion. All of our experimental measurements have been done using simulation.

Keywords: High threshold level, level segregation, lowthreshold level, smoothing circuit synchronization..

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3753

288 Implementation of Parallel Interface for Microprocessor Trainer

Authors: Moe Moe Htun, Khin Htar Nwe

Abstract:

In this paper, parallel interface for microprocessor trainer was implemented. A programmable parallel–port device such as the IC 8255A is initialized for simple input or output and for handshake input or output by choosing kinds of modes. The hardware connections and the programs can be used to interface microprocessor trainer and a personal computer by using IC 8255A. The assembly programs edited on PC-s editor can be downloaded to the trainer.

Keywords: Parallel I/O ports, parallel interface, trainer, two 8255 ICs.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3142

287 Depth Camera Aided Dead-Reckoning Localization of Autonomous Mobile Robots in Unstructured Global Navigation Satellite System Denied Environments

Authors: David L. Olson, Stephen B. H. Bruder, Adam S. Watkins, Cleon E. Davis

Abstract:

In global navigation satellite system (GNSS) denied settings, such as indoor environments, autonomous mobile robots are often limited to dead-reckoning navigation techniques to determine their position, velocity, and attitude (PVA). Localization is typically accomplished by employing an inertial measurement unit (IMU), which, while precise in nature, accumulates errors rapidly and severely degrades the localization solution. Standard sensor fusion methods, such as Kalman filtering, aim to fuse precise IMU measurements with accurate aiding sensors to establish a precise and accurate solution. In indoor environments, where GNSS and no other a priori information is known about the environment, effective sensor fusion is difficult to achieve, as accurate aiding sensor choices are sparse. However, an opportunity arises by employing a depth camera in the indoor environment. A depth camera can capture point clouds of the surrounding floors and walls. Extracting attitude from these surfaces can serve as an accurate aiding source, which directly combats errors that arise due to gyroscope imperfections. This configuration for sensor fusion leads to a dramatic reduction of PVA error compared to traditional aiding sensor configurations. This paper provides the theoretical basis for the depth camera aiding sensor method, initial expectations of performance benefit via simulation, and hardware implementation thus verifying its veracity. Hardware implementation is performed on the Quanser Qbot 2™ mobile robot, with a Vector-Nav VN-200™ IMU and Kinect™ camera from Microsoft.

Keywords: Autonomous mobile robotics, dead reckoning, depth camera, inertial navigation, Kalman filtering, localization, sensor fusion.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 667

286 Performance Analysis and Optimization for Diagonal Sparse Matrix-Vector Multiplication on Machine Learning Unit

Authors: Qiuyu Dai, Haochong Zhang, Xiangrong Liu

Abstract:

Efficient matrix-vector multiplication with diagonal sparse matrices is pivotal in a multitude of computational domains, ranging from scientific simulations to machine learning workloads. When encoded in the conventional Diagonal (DIA) format, these matrices often induce computational overheads due to extensive zero-padding and non-linear memory accesses, which can hamper the computational throughput, and elevate the usage of precious compute and memory resources beyond necessity. The ’DIA-Adaptive’ approach, a methodological enhancement introduced in this paper, confronts these challenges head-on by leveraging the advanced parallel instruction sets embedded within Machine Learning Units (MLUs). This research presents a thorough analysis of the DIA-Adaptive scheme’s efficacy in optimizing Sparse Matrix-Vector Multiplication (SpMV) operations. The scope of the evaluation extends to a variety of hardware architectures, examining the repercussions of distinct thread allocation strategies and cluster configurations across multiple storage formats. A dedicated computational kernel, intrinsic to the DIA-Adaptive approach, has been meticulously developed to synchronize with the nuanced performance characteristics of MLUs. Empirical results, derived from rigorous experimentation, reveal that the DIA-Adaptive methodology not only diminishes the performance bottlenecks associated with the DIA format but also exhibits pronounced enhancements in execution speed and resource utilization. The analysis delineates a marked improvement in parallelism, showcasing the DIA-Adaptive scheme’s ability to adeptly manage the interplay between storage formats, hardware capabilities, and algorithmic design. The findings suggest that this approach could set a precedent for accelerating SpMV tasks, thereby contributing significantly to the broader domain of high-performance computing and data-intensive applications.

Keywords: Adaptive method, DIA, diagonal sparse matrices, MLU, sparse matrix-vector multiplication.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 172

285 Experimental Testbed to Compare 4G and 5G Industrial IoT Connections in Simulated Based Control System

Authors: Andrea Gelmini

Abstract:

This paper considers the advent of 5G and the use of it in a Based Control System (BCS), posing as a basic concept the question of what the real differences and practical improvements are compared to 4G. To this purpose, a testbed hardware simulator has been designed and built where identical machines with the same sensors and management systems will communicate with different radio access network connections. This allows an objective statistical comparison of performance on the real functioning and improvement of the infrastructure with the Industrial Internet of Things (IIoT) connected to it.

Keywords: 4G, 5G, BCS, eSIM, IIoT, SCADA, Testbed.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 289

284 A Test Methodology to Measure the Open-Loop Voltage Gain of an Operational Amplifier

Authors: Maninder Kaur Gill, Alpana Agarwal

Abstract:

It is practically not feasible to measure the open-loop voltage gain of the operational amplifier in the open loop configuration. It is because the open-loop voltage gain of the operational amplifier is very large. In order to avoid the saturation of the output voltage, a very small input should be given to operational amplifier which is not possible to be measured practically by a digital multimeter. A test circuit for measurement of open loop voltage gain of an operational amplifier has been proposed and verified using simulation tools as well as by experimental methods on breadboard. The main advantage of this test circuit is that it is simple, fast, accurate, cost effective, and easy to handle even on a breadboard. The test circuit requires only the device under test (DUT) along with resistors. This circuit has been tested for measurement of open loop voltage gain for different operational amplifiers. The underlying goal is to design testable circuits for various analog devices that are simple to realize in VLSI systems, giving accurate results and without changing the characteristics of the original system. The DUTs used are LM741CN and UA741CP. For LM741CN, the simulated gain and experimentally measured gain (average) are calculated as 89.71 dB and 87.71 dB, respectively. For UA741CP, the simulated gain and experimentally measured gain (average) are calculated as 101.15 dB and 105.15 dB, respectively. These values are found to be close to the datasheet values.

Keywords: Device under test, open-loop voltage gain, operational amplifier, test circuit.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3289

283 Overview of Multi-Chip Alternatives for 2.5D and 3D Integrated Circuit Packagings

Authors: Ching-Feng Chen, Ching-Chih Tsai

Abstract:

With the size of the transistor gradually approaching the physical limit, it challenges the persistence of Moore’s Law due to such issues of the short channel effect and the development of the high numerical aperture (NA) lithography equipment. In the context of the ever-increasing technical requirements of portable devices and high-performance computing (HPC), relying on the law continuation to enhance the chip density will no longer support the prospects of the electronics industry. Weighing the chip’s power consumption-performance-area-cost-cycle time to market (PPACC) is an updated benchmark to drive the evolution of the advanced wafer nanometer (nm). The advent of two and half- and three-dimensional (2.5 and 3D)- Very-Large-Scale Integration (VLSI) packaging based on Through Silicon Via (TSV) technology has updated the traditional die assembly methods and provided the solution. This overview investigates the up-to-date and cutting-edge packaging technologies for 2.5D and 3D integrated circuits (IC) based on the updated transistor structure and technology nodes. We conclude that multi-chip solutions for 2.5D and 3D IC packaging can prolong Moore’s Law.

Keywords: Moore’s Law, High Numerical Aperture, Power Consumption-Performance-Area-Cost-Cycle Time to Market, PPACC, 2.5 and 3D-Very-Large-Scale Integration Packaging, Through Silicon Vi.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 177

282 Unconditionally Secure Quantum Payment System

Authors: Essam Al-Daoud

Abstract:

A potentially serious problem with current payment systems is that their underlying hard problems from number theory may be solved by either a quantum computer or unanticipated future advances in algorithms and hardware. A new quantum payment system is proposed in this paper. The suggested system makes use of fundamental principles of quantum mechanics to ensure the unconditional security without prior arrangements between customers and vendors. More specifically, the new system uses Greenberger-Home-Zeilinger (GHZ) states and Quantum Key Distribution to authenticate the vendors and guarantee the transaction integrity.

Keywords: Bell state, GHZ state, Quantum key distribution, Quantum payment system.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1529

281 A Novel FFT-Based Frequency Offset Estimator for OFDM Systems

Authors: Mahdi Masoumi, Mehrdad Ardebilipoor, Seyed Aidin Bassam

Abstract:

This paper proposes a novel frequency offset (FO) estimator for orthogonal frequency division multiplexing. Simplicity is most significant feature of this algorithm and can be repeated to achieve acceptable accuracy. Also fractional and integer part of FO is estimated jointly with use of the same algorithm. To do so, instead of using conventional algorithms that usually use correlation function, we use DFT of received signal. Therefore, complexity will be reduced and we can do synchronization procedure by the same hardware that is used to demodulate OFDM symbol. Finally, computer simulation shows that the accuracy of this method is better than other conventional methods.

Keywords: DFT, Estimator, Frequency Offset, IEEE802.11a, OFDM.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 1470

280 Design of Low Power and High Speed Digital IIR Filter in 45nm with Optimized CSA for Digital Signal Processing Applications

Authors: G. Ramana Murthy, C. Senthilpari, P. Velrajkumar, Lim Tien Sze

Abstract:

In this paper, a design methodology to implement low-power and high-speed 2nd order recursive digital Infinite Impulse Response (IIR) filter has been proposed. Since IIR filters suffer from a large number of constant multiplications, the proposed method replaces the constant multiplications by using addition/subtraction and shift operations. The proposed new 6T adder cell is used as the Carry-Save Adder (CSA) to implement addition/subtraction operations in the design of recursive section IIR filter to reduce the propagation delay. Furthermore, high-level algorithms designed for the optimization of the number of CSA blocks are used to reduce the complexity of the IIR filter. The DSCH3 tool is used to generate the schematic of the proposed 6T CSA based shift-adds architecture design and it is analyzed by using Microwind CAD tool to synthesize low-complexity and high-speed IIR filters. The proposed design outperforms in terms of power, propagation delay, area and throughput when compared with MUX-12T, MCIT-7T based CSA adder filter design. It is observed from the experimental results that the proposed 6T based design method can find better IIR filter designs in terms of power and delay than those obtained by using efficient general multipliers.

Keywords: CSA Full Adder, Delay unit, IIR filter, Low-Power, PDP, Parametric Analysis, Propagation Delay, Throughput, VLSI.

Procedia APA BibTeX Chicago EndNote Harvard JSON MLA RIS XML ISO 690 PDF Downloads 3784