Search results for: hardware accelerator unit
2802 Providing Reliability, Availability and Scalability Support for Quick Assist Technology Cryptography on the Cloud
Authors: Songwu Shen, Garrett Drysdale, Veerendranath Mannepalli, Qihua Dai, Yuan Wang, Yuli Chen, David Qian, Utkarsh Kakaiya
Abstract:
Hardware accelerator has been a promising solution to reduce the cost of cloud data centers. This paper investigates the QoS enhancement of the acceleration of an important datacenter workload: the webserver (or proxy) that faces high computational consumption originated from secure sockets layer (SSL) or transport layer security (TLS) procession in the cloud environment. Our study reveals that for the accelerator maintenance cases—need to upgrade driver/firmware or hardware reset due to hardware hang; we still can provide cryptography services by switching to software during maintenance phase and then switching back to accelerator after maintenance. The switching is seamless to server application such as Nginx that runs inside a VM on top of the server. To achieve this high availability goal, we propose a comprehensive fallback solution based on Intel® QuickAssist Technology (QAT). This approach introduces an architecture that involves the collaboration between physical function (PF) and virtual function (VF), and collaboration among VF, OpenSSL, and web application Nginx. The evaluation shows that our solution could provide high reliability, availability, and scalability (RAS) of hardware cryptography service in a 7x24x365 manner in the cloud environment.Keywords: accelerator, cryptography service, RAS, secure sockets layer/transport layer security, SSL/TLS, virtualization fallback architecture
Procedia PDF Downloads 1532801 Analysis of Lightweight Register Hardware Threat
Authors: Yang Luo, Beibei Wang
Abstract:
In this paper, we present a design methodology of lightweight register transfer level (RTL) hardware threat implemented based on a MAX II FPGA platform. The dynamic power consumed by the toggling of the various bit of registers as well as the dynamic power consumed per unit of logic circuits were analyzed. The hardware threat was designed taking advantage of the differences in dynamic power consumed per unit of logic circuits to hide the transfer information. The experiment result shows that the register hardware threat was successfully implemented by using different dynamic power consumed per unit of logic circuits to hide the key information of DES encryption module. It needs more than 100000 sample curves to reduce the background noise by comparing the sample space when it completely meets the time alignment requirement. In additional, an external trigger signal is playing a very important role to detect the hardware threat in this experiment.Keywords: side-channel analysis, hardware Trojan, register transfer level, dynamic power
Procedia PDF Downloads 2772800 A Survey of Field Programmable Gate Array-Based Convolutional Neural Network Accelerators
Authors: Wei Zhang
Abstract:
With the rapid development of deep learning, neural network and deep learning algorithms play a significant role in various practical applications. Due to the high accuracy and good performance, Convolutional Neural Networks (CNNs) especially have become a research hot spot in the past few years. However, the size of the networks becomes increasingly large scale due to the demands of the practical applications, which poses a significant challenge to construct a high-performance implementation of deep learning neural networks. Meanwhile, many of these application scenarios also have strict requirements on the performance and low-power consumption of hardware devices. Therefore, it is particularly critical to choose a moderate computing platform for hardware acceleration of CNNs. This article aimed to survey the recent advance in Field Programmable Gate Array (FPGA)-based acceleration of CNNs. Various designs and implementations of the accelerator based on FPGA under different devices and network models are overviewed, and the versions of Graphic Processing Units (GPUs), Application Specific Integrated Circuits (ASICs) and Digital Signal Processors (DSPs) are compared to present our own critical analysis and comments. Finally, we give a discussion on different perspectives of these acceleration and optimization methods on FPGA platforms to further explore the opportunities and challenges for future research. More helpfully, we give a prospect for future development of the FPGA-based accelerator.Keywords: deep learning, field programmable gate array, FPGA, hardware accelerator, convolutional neural networks, CNN
Procedia PDF Downloads 1262799 Operator Optimization Based on Hardware Architecture Alignment Requirements
Authors: Qingqing Gai, Junxing Shen, Yu Luo
Abstract:
Due to the hardware architecture characteristics, some operators tend to acquire better performance if the input/output tensor dimensions are aligned to a certain minimum granularity, such as convolution and deconvolution commonly used in deep learning. Furthermore, if the requirements are not met, the general strategy is to pad with 0 to satisfy the requirements, potentially leading to the under-utilization of the hardware resources. Therefore, for the convolution and deconvolution whose input and output channels do not meet the minimum granularity alignment, we propose to transfer the W-dimensional data to the C-dimension for computation (W2C) to enable the C-dimension to meet the hardware requirements. This scheme also reduces the number of computations in the W-dimension. Although this scheme substantially increases computation, the operator’s speed can improve significantly. It achieves remarkable speedups on multiple hardware accelerators, including Nvidia Tensor cores, Qualcomm digital signal processors (DSPs), and Huawei neural processing units (NPUs). All you need to do is modify the network structure and rearrange the operator weights offline without retraining. At the same time, for some operators, such as the Reducemax, we observe that transferring the Cdimensional data to the W-dimension(C2W) and replacing the Reducemax with the Maxpool can accomplish acceleration under certain circumstances.Keywords: convolution, deconvolution, W2C, C2W, alignment, hardware accelerator
Procedia PDF Downloads 1032798 On the Design of Electronic Control Unitsfor the Safety-Critical Vehicle Applications
Authors: Kyung-Jung Lee, Hyun-Sik Ahn
Abstract:
This paper suggests a design methodology for the hardware and software of the Electronic Control Unit (ECU) of safety-critical vehicle applications such as braking and steering. The architecture of the hardware is a high integrity system such that it incorporates a high performance 32-bit CPU and a separate Peripheral Control-Processor (PCP) together with an external watchdog CPU. Communication between the main CPU and the PCP is executed via a common area of RAM and events on either processor which are invoked by interrupts. Safety-related software is also implemented to provide a reliable, self-testing computing environment for safety critical and high integrity applications. The validity of the design approach is shown by using the Hardware-in-the-Loop Simulation (HILS) for Electric Power Steering (EPS) systems which consists of the EPS mechanism, the designed ECU, and monitoring tools.Keywords: electronic control unit, electric power steering, functional safety, hardware-in-the-loop simulation
Procedia PDF Downloads 2932797 Efficient Field-Oriented Motor Control on Resource-Constrained Microcontrollers for Optimal Performance without Specialized Hardware
Authors: Nishita Jaiswal, Apoorv Mohan Satpute
Abstract:
The increasing demand for efficient, cost-effective motor control systems in the automotive industry has driven the need for advanced, highly optimized control algorithms. Field-Oriented Control (FOC) has established itself as the leading approach for motor control, offering precise and dynamic regulation of torque, speed, and position. However, as energy efficiency becomes more critical in modern applications, implementing FOC on low-power, cost-sensitive microcontrollers pose significant challenges due to the limited availability of computational and hardware resources. Currently, most solutions rely on high-performance 32-bit microcontrollers or Application-Specific Integrated Circuits (ASICs) equipped with Floating Point Units (FPUs) and Hardware Accelerated Units (HAUs). These advanced platforms enable rapid computation and simplify the execution of complex control algorithms like FOC. However, these benefits come at the expense of higher costs, increased power consumption, and added system complexity. These drawbacks limit their suitability for embedded systems with strict power and budget constraints, where achieving energy and execution efficiency without compromising performance is essential. In this paper, we present an alternative approach that utilizes optimized data representation and computation techniques on a 16-bit microcontroller without FPUs or HAUs. By carefully optimizing data point formats and employing fixed-point arithmetic, we demonstrate how the precision and computational efficiency required for FOC can be maintained in resource-constrained environments. This approach eliminates the overhead performance associated with floating-point operations and hardware acceleration, providing a more practical solution in terms of cost, scalability and improved execution time efficiency, allowing faster response in motor control applications. Furthermore, it enhances system design flexibility, making it particularly well-suited for applications that demand stringent control over power consumption and costs.Keywords: field-oriented control, fixed-point arithmetic, floating point unit, hardware accelerator unit, motor control systems
Procedia PDF Downloads 102796 Embedded Test Framework: A Solution Accelerator for Embedded Hardware Testing
Authors: Arjun Kumar Rath, Titus Dhanasingh
Abstract:
Embedded product development requires software to test hardware functionality during development and finding issues during manufacturing in larger quantities. As the components are getting integrated, the devices are tested for their full functionality using advanced software tools. Benchmarking tools are used to measure and compare the performance of product features. At present, these tests are based on a variety of methods involving varying hardware and software platforms. Typically, these tests are custom built for every product and remain unusable for other variants. A majority of the tests goes undocumented, not updated, unusable when the product is released. To bridge this gap, a solution accelerator in the form of a framework can address these issues for running all these tests from one place, using an off-the-shelf tests library in a continuous integration environment. There are many open-source test frameworks or tools (fuego. LAVA, AutoTest, KernelCI, etc.) designed for testing embedded system devices, with each one having several unique good features, but one single tool and framework may not satisfy all of the testing needs for embedded systems, thus an extensible framework with the multitude of tools. Embedded product testing includes board bring-up testing, test during manufacturing, firmware testing, application testing, and assembly testing. Traditional test methods include developing test libraries and support components for every new hardware platform that belongs to the same domain with identical hardware architecture. This approach will have drawbacks like non-reusability where platform-specific libraries cannot be reused, need to maintain source infrastructure for individual hardware platforms, and most importantly, time is taken to re-develop test cases for new hardware platforms. These limitations create challenges like environment set up for testing, scalability, and maintenance. A desirable strategy is certainly one that is focused on maximizing reusability, continuous integration, and leveraging artifacts across the complete development cycle during phases of testing and across family of products. To get over the stated challenges with the conventional method and offers benefits of embedded testing, an embedded test framework (ETF), a solution accelerator, is designed, which can be deployed in embedded system-related products with minimal customizations and maintenance to accelerate the hardware testing. Embedded test framework supports testing different hardwares including microprocessor and microcontroller. It offers benefits such as (1) Time-to-Market: Accelerates board brings up time with prepacked test suites supporting all necessary peripherals which can speed up the design and development stage(board bring up, manufacturing and device driver) (2) Reusability-framework components isolated from the platform-specific HW initialization and configuration makes the adaptability of test cases across various platform quick and simple (3) Effective build and test infrastructure with multiple test interface options and preintegrated with FUEGO framework (4) Continuos integration - pre-integrated with Jenkins which enabled continuous testing and automated software update feature. Applying the embedded test framework accelerator throughout the design and development phase enables to development of the well-tested systems before functional verification and improves time to market to a large extent.Keywords: board diagnostics software, embedded system, hardware testing, test frameworks
Procedia PDF Downloads 1432795 Dual-Rail Logic Unit in Double Pass Transistor Logic
Authors: Hamdi Belgacem, Fradi Aymen
Abstract:
In this paper we present a low power, low cost differential logic unit (LU). The proposed LU receives dual-rail inputs and generates dual-rail outputs. The proposed circuit can be used in Arithmetic and Logic Units (ALU) of processor. It can be also dedicated for self-checking applications based on dual duplication code. Four logic functions as well as their inverses are implemented within a single Logic Unit. The hardware overhead for the implementation of the proposed LU is lower than the hardware overhead required for standard LU implemented with standard CMOS logic style. This new implementation is attractive as fewer transistors are required to implement important logic functions. The proposed differential logic unit can perform 8 Boolean logical operations by using only 16 transistors. Spice simulations using a 32 nm technology was utilized to evaluate the performance of the proposed circuit and to prove its acceptable electrical behaviour.Keywords: differential logic unit, double pass transistor logic, low power CMOS design, low cost CMOS design
Procedia PDF Downloads 4502794 High Level Synthesis of Canny Edge Detection Algorithm on Zynq Platform
Authors: Hanaa M. Abdelgawad, Mona Safar, Ayman M. Wahba
Abstract:
Real-time image and video processing is a demand in many computer vision applications, e.g. video surveillance, traffic management and medical imaging. The processing of those video applications requires high computational power. Therefore, the optimal solution is the collaboration of CPU and hardware accelerators. In this paper, a Canny edge detection hardware accelerator is proposed. Canny edge detection is one of the common blocks in the pre-processing phase of image and video processing pipeline. Our presented approach targets offloading the Canny edge detection algorithm from processing system (PS) to programmable logic (PL) taking the advantage of High Level Synthesis (HLS) tool flow to accelerate the implementation on Zynq platform. The resulting implementation enables up to a 100x performance improvement through hardware acceleration. The CPU utilization drops down and the frame rate jumps to 60 fps of 1080p full HD input video stream.Keywords: high level synthesis, canny edge detection, hardware accelerators, computer vision
Procedia PDF Downloads 4782793 Ion Thruster Grid Lifetime Assessment Based on Its Structural Failure
Authors: Juan Li, Jiawen Qiu, Yuchuan Chu, Tianping Zhang, Wei Meng, Yanhui Jia, Xiaohui Liu
Abstract:
This article developed an ion thruster optic system sputter erosion depth numerical 3D model by IFE-PIC (Immersed Finite Element-Particle-in-Cell) and Mont Carlo method, and calculated the downstream surface sputter erosion rate of accelerator grid; Compared with LIPS-200 life test data, the results of the numerical model are in reasonable agreement with the measured data. Finally, we predict the lifetime of the 20cm diameter ion thruster via the erosion data obtained with the model. The ultimate result demonstrates that under normal operating condition, the erosion rate of the grooves wears on the downstream surface of the accelerator grid is 34.6μm⁄1000h, which means the conservative lifetime until structural failure occurring on the accelerator grid is 11500 hours.Keywords: ion thruster, accelerator gird, sputter erosion, lifetime assessment
Procedia PDF Downloads 5602792 Optimal Beam for Accelerator Driven Systems
Authors: M. Paraipan, V. M. Javadova, S. I. Tyutyunnikov
Abstract:
The concept of energy amplifier or accelerator driven system (ADS) involves the use of a particle accelerator coupled with a nuclear reactor. The accelerated particle beam generates a supplementary source of neutrons, which allows the subcritical functioning of the reactor, and consequently a safe exploitation. The harder neutron spectrum realized ensures a better incineration of the actinides. The almost generalized opinion is that the optimal beam for ADS is represented by protons with energy around 1 GeV (gigaelectronvolt). In the present work, a systematic analysis of the energy gain for proton beams with energy from 0.5 to 3 GeV and ion beams from deuteron to neon with energies between 0.25 and 2 AGeV is performed. The target is an assembly of metallic U-Pu-Zr fuel rods in a bath of lead-bismuth eutectic coolant. The rods length is 150 cm. A beryllium converter with length 110 cm is used in order to maximize the energy released in the target. The case of a linear accelerator is considered, with a beam intensity of 1.25‧10¹⁶ p/s, and a total accelerator efficiency of 0.18 for proton beam. These values are planned to be achieved in the European Spallation Source project. The energy gain G is calculated as the ratio between the energy released in the target to the energy spent to accelerate the beam. The energy released is obtained through simulation with the code Geant4. The energy spent is calculating by scaling from the data about the accelerator efficiency for the reference particle (proton). The analysis concerns the G values, the net power produce, the accelerator length, and the period between refueling. The optimal energy for proton is 1.5 GeV. At this energy, G reaches a plateau around a value of 8 and a net power production of 120 MW (megawatt). Starting with alpha, ion beams have a higher G than 1.5 GeV protons. A beam of 0.25 AGeV(gigaelectronvolt per nucleon) ⁷Li realizes the same net power production as 1.5 GeV protons, has a G of 15, and needs an accelerator length 2.6 times lower than for protons, representing the best solution for ADS. Beams of ¹⁶O or ²⁰Ne with energy 0.75 AGeV, accelerated in an accelerator with the same length as 1.5 GeV protons produce approximately 900 MW net power, with a gain of 23-25. The study of the evolution of the isotopes composition during irradiation shows that the increase in power production diminishes the period between refueling. For a net power produced of 120 MW, the target can be irradiated approximately 5000 days without refueling, but only 600 days when the net power reaches 1 GW (gigawatt).Keywords: accelerator driven system, ion beam, electrical power, energy gain
Procedia PDF Downloads 1392791 Application Research of Stilbene Crystal for the Measurement of Accelerator Neutron Sources
Authors: Zhao Kuo, Chen Liang, Zhang Zhongbing, Ruan Jinlu. He Shiyi, Xu Mengxuan
Abstract:
Stilbene, C₁₄H₁₂, is well known as one of the most useful organic scintillators for pulse shape discrimination (PSD) technique for its good scintillation properties. An on-line acquisition system and an off-line acquisition system were developed with several CAMAC standard plug-ins, NIM plug-ins, neutron/γ discriminating plug-in named 2160A and a digital oscilloscope with high sampling rate respectively for which stilbene crystals and photomultiplier tube detectors (PMT) as detector for accelerator neutron sources measurement carried out in China Institute of Atomic Energy. Pulse amplitude spectrums and charge amplitude spectrums were real-time recorded after good neutron/γ discrimination whose best PSD figure-of-merits (FoMs) are 1.756 for D-D accelerator neutron source and 1.393 for D-T accelerator neutron source. The probability of neutron events in total events was 80%, and neutron detection efficiency was 5.21% for D-D accelerator neutron sources, which were 50% and 1.44% for D-T accelerator neutron sources after subtracting the background of scattering observed by the on-line acquisition system. Pulse waveform signals were acquired by the off-line acquisition system randomly while the on-line acquisition system working. The PSD FoMs obtained by the off-line acquisition system were 2.158 for D-D accelerator neutron sources and 1.802 for D-T accelerator neutron sources after waveform digitization off-line processing named charge integration method for just 1000 pulses. In addition, the probabilities of neutron events in total events obtained by the off-line acquisition system matched very well with the probabilities of the on-line acquisition system. The pulse information recorded by the off-line acquisition system could be repetitively used to adjust the parameters or methods of PSD research and obtain neutron charge amplitude spectrums or pulse amplitude spectrums after digital analysis with a limited number of pulses. The off-line acquisition system showed equivalent or better measurement effects compared with the online system with a limited number of pulses which indicated a feasible method based on stilbene crystals detectors for the measurement of prompt neutrons neutron sources like prompt accelerator neutron sources emit a number of neutrons in a short time.Keywords: stilbene crystal, accelerator neutron source, neutron / γ discrimination, figure-of-merits, CAMAC, waveform digitization
Procedia PDF Downloads 1862790 Advanced Mechatronic Design of Robot Manipulator Using Hardware-In-The-Loop Simulation
Authors: Reza Karami, Ali Akbar Ebrahimi
Abstract:
This paper discusses concurrent engineering of robot manipulators, based on the Holistic Concurrent Design (HCD) methodology and by using a hardware-in-the-loop simulation platform. The methodology allows for considering numerous design variables with different natures concurrently. It redefines the ultimate goal of design based on the notion of satisfaction, resulting in the simplification of the multi-objective constrained optimization process. It also formalizes the effect of designer’s subjective attitude in the process. To enhance modeling efficiency for both computation and accuracy, a hardware-in-the-loop simulation platform is used, which involves physical joint modules and the control unit in addition to the software modules. This platform is implemented in the HCD design architecture to reliably evaluate the design attributes and performance super criterion during the design process. The resulting overall architecture is applied to redesigning kinematic, dynamic and control parameters of an industrial robot manipulator.Keywords: concurrent engineering, hardware-in-the-loop simulation, robot manipulator, multidisciplinary systems, mechatronics
Procedia PDF Downloads 4532789 On-Chip Sensor Ellipse Distribution Method and Equivalent Mapping Technique for Real-Time Hardware Trojan Detection and Location
Authors: Longfei Wang, Selçuk Köse
Abstract:
Hardware Trojan becomes great concern as integrated circuit (IC) technology advances and not all manufacturing steps of an IC are accomplished within one company. Real-time hardware Trojan detection is proven to be a feasible way to detect randomly activated Trojans that cannot be detected at testing stage. On-chip sensors serve as a great candidate to implement real-time hardware Trojan detection, however, the optimization of on-chip sensors has not been thoroughly investigated and the location of Trojan has not been carefully explored. On-chip sensor ellipse distribution method and equivalent mapping technique are proposed based on the characteristics of on-chip power delivery network in this paper to address the optimization and distribution of on-chip sensors for real-time hardware Trojan detection as well as to estimate the location and current consumption of hardware Trojan. Simulation results verify that hardware Trojan activation can be effectively detected and the location of a hardware Trojan can be efficiently estimated with less than 5% error for a realistic power grid using our proposed methods. The proposed techniques therefore lay a solid foundation for isolation and even deactivation of hardware Trojans through accurate location of Trojans.Keywords: hardware trojan, on-chip sensor, power distribution network, power/ground noise
Procedia PDF Downloads 3882788 A Multipurpose Inertial Electrostatic Magnetic Confinement Fusion for Medical Isotopes Production
Authors: Yasser R. Shaban
Abstract:
A practical multipurpose device for medical isotopes production is most wanted for clinical centers and researches. Unfortunately, the major supply of these radioisotopes currently comes from aging sources, and there is a great deal of uneasiness in the domestic market. There are also many cases where the cost of certain radioisotopes is too high for their introduction on a commercial scale even though the isotopes might have great benefits for society. The medical isotopes such as radiotracers PET (Positron Emission Tomography), Technetium-99 m, and Iodine-131, Lutetium-177 by is feasible to be generated by a single unit named IEMC (Inertial Electrostatic Magnetic Confinement). The IEMC fusion vessel is the upgrading unit of the Inertial Electrostatic Confinement IEC fusion vessel. Comprehensive experimental works on IEC were carried earlier with promising results. The principle of inertial electrostatic magnetic confinement IEMC fusion is based on forcing the binary fuel ions to interact in the opposite directions in ions cyclotrons orbits with different kinetic energies in order to have equal compression (forces) and with different ion cyclotron frequency ω in order to increase the rate of intersection. The IEMC features greater fusion volume than IEC by several orders of magnitude. The particles rate from the IEMC approach are projected to be 8.5 x 10¹¹ (p/s), ~ 0.2 microampere proton, for D/He-3 fusion reaction and 4.2 x 10¹² (n/s) for D/T fusion reaction. The projected values of particles yield (neutrons and protons) are suitable for medical isotope productions on-site by a single unit without any change in the fusion vessel but only the fuel gas. The PET radiotracers are usually produced on-site by medical ion accelerator whereas Technetium-99m (Tc-99m) is usually produced off-site from the irradiation facilities of nuclear power plants. Typically, hospitals receive molybdenum-99 isotope container; the isotope decays to Tc-99mwith half-life time 2.75 days. Even though the projected current from IEMC is lesser than the proton current from the medical ion accelerator but still the IEMC vessel is simpler, and reduced in components and power consumption which add a new value of populating the PET radiotracers in most clinical centers. On the other hand, the projected neutrons flux from the IEMC is lesser than the thermal neutron flux at the irradiation facilities of nuclear power plants, but in the IEMC case the productions of Technetium-99m is suggested to be at the resonance region of which the resonance integral cross section is two orders of magnitude higher than the thermal flux. Thus it can be said the net activity from both is evened. Besides, the particle accelerator cannot be considered a multipurpose particles production unless a significant change is made to the accelerator to change from neutrons mode to protons mode or vice versa. In conclusion, the projected fusion yield from IEMC is a straightforward since slightly change in the primer IEC and ion source is required.Keywords: electrostatic versus magnetic confinement fusion vessel, ion source, medical isotopes productions, neutron activation
Procedia PDF Downloads 3422787 Numerical Solution Speedup of the Laplace Equation Using FPGA Hardware
Authors: Abbas Ebrahimi, Mohammad Zandsalimy
Abstract:
The main purpose of this study is to investigate the feasibility of using FPGA (Field Programmable Gate Arrays) chips as alternatives for the conventional CPUs to accelerate the numerical solution of the Laplace equation. FPGA is an integrated circuit that contains an array of logic blocks, and its architecture can be reprogrammed and reconfigured after manufacturing. Complex circuits for various applications can be designed and implemented using FPGA hardware. The reconfigurable hardware used in this paper is an SoC (System on a Chip) FPGA type that integrates both microprocessor and FPGA architectures into a single device. In the present study the Laplace equation is implemented and solved numerically on both reconfigurable hardware and CPU. The precision of results and speedups of the calculations are compared together. The computational process on FPGA, is up to 20 times faster than a conventional CPU, with the same data precision. An analytical solution is used to validate the results.Keywords: accelerating numerical solutions, CFD, FPGA, hardware definition language, numerical solutions, reconfigurable hardware
Procedia PDF Downloads 3792786 Design and Implementation of a Fan Coil Unit Controller Based on the Duty Ratio Fuzzy Method
Authors: Liang Zhao, Jili Zhang, Kai Li
Abstract:
A microcontroller-based fan coil unit (FCU) fuzzy controller is designed and implemented in this paper. The controller employs the concept of duty ratio on the electric valve control, which could make full use of the cooling and dehumidifying capacity of the FCU when the valve is off. The traditional control method and its limitations are analyzed. The hardware and software design processes are introduced in detail. The experimental results show that the proposed method is more energy efficient compared to the traditional controlling strategy. Furthermore, a more comfortable room condition could be achieved by the proposed method. The proposed low-cost FCU fuzzy controller deserves to be widely used in engineering applications.Keywords: fan coil unit, duty ratio, fuzzy controller, experiment
Procedia PDF Downloads 3362785 Simulation and Hardware Implementation of Data Communication Between CAN Controllers for Automotive Applications
Authors: R. M. Kalayappan, N. Kathiravan
Abstract:
In automobile industries, Controller Area Network (CAN) is widely used to reduce the system complexity and inter-task communication. Therefore, this paper proposes the hardware implementation of data frame communication between one controller to other. The CAN data frames and protocols will be explained deeply, here. The data frames are transferred without any collision or corruption. The simulation is made in the KEIL vision software to display the data transfer between transmitter and receiver in CAN. ARM7 micro-controller is used to transfer data’s between the controllers in real time. Data transfer is verified using the CRO.Keywords: control area network (CAN), automotive electronic control unit, CAN 2.0, industry
Procedia PDF Downloads 3972784 Performances of the Double-Crystal Setup at CERN SPS Accelerator for Physics beyond Colliders Experiments
Authors: Andrii Natochii
Abstract:
We are currently presenting the recent results from the CERN accelerator facilities obtained in the frame of the UA9 Collaboration. The UA9 experiment investigates how a tiny silicon bent crystal (few millimeters long) can be used for various high-energy physics applications. Due to the huge electrostatic field (tens of GV/cm) between crystalline planes, there is a probability for charged particles, impinging the crystal, to be trapped in the channeling regime. It gives a possibility to steer a high intensity and momentum beam by bending the crystal: channeled particles will follow the crystal curvature and deflect on the certain angle (from tens microradians for LHC to few milliradians for SPS energy ranges). The measurements at SPS, performed in 2017 and 2018, confirmed that the protons deflected by the first crystal, inserted in the primary beam halo, can be caught and channeled by the second crystal. In this configuration, we measure the single pass deflection efficiency of the second crystal and prove our opportunity to perform the fixed target experiment at SPS accelerator (LHC in the future).Keywords: channeling, double-crystal setup, fixed target experiment, Timepix detector
Procedia PDF Downloads 1482783 Cortex-M3 Based Virtual Platform Implementation for Software Development
Authors: Jun Young Moon, Hyeonggeon Lee, Jong Tae Kim
Abstract:
In this paper, we present Cortex-M3 based virtual platform which can virtualize wearable hardware platform and evaluate hardware performance. Cortex-M3 is very popular microcontroller in wearable devices, hardware sensors and display devices. This platform can be used to implement software layer for specific hardware architecture. By using the proposed platform the software development process can be parallelized with hardware development process. We present internal mechanism to implement the proposed virtual platform and describe how to use the proposed platform to develop software by using case study which is low cost wearable device that uses Cortex-M3.Keywords: electronic system level design, software development, virtual platform, wearable device
Procedia PDF Downloads 3732782 Hardware for Genetic Algorithm
Authors: Fariborz Ahmadi, Reza Tati
Abstract:
Genetic algorithm is a soft computing method that works on set of solutions. These solutions are called chromosome and the best one is the absolute solution of the problem. The main problem of this algorithm is that after passing through some generations, it may be produced some chromosomes that had been produced in some generations ago that causes reducing the convergence speed. From another respective, most of the genetic algorithms are implemented in software and less works have been done on hardware implementation. Our work implements genetic algorithm in hardware that doesn’t produce chromosome that have been produced in previous generations. In this work, most of genetic operators are implemented without producing iterative chromosomes and genetic diversity is preserved. Genetic diversity causes that not only do not this algorithm converge to local optimum but also reaching to global optimum. Without any doubts, proposed approach is so faster than software implementations. Evaluation results also show the proposed approach is faster than hardware ones.Keywords: hardware, genetic algorithm, computer science, engineering
Procedia PDF Downloads 5052781 A Low-Area Fully-Reconfigurable Hardware Design of Fast Fourier Transform System for 3GPP-LTE Standard
Authors: Xin-Yu Shih, Yue-Qu Liu, Hong-Ru Chou
Abstract:
This paper presents a low-area and fully-reconfigurable Fast Fourier Transform (FFT) hardware design for 3GPP-LTE communication standard. It can fully support 32 different FFT sizes, up to 2048 FFT points. Besides, a special processing element is developed for making reconfigurable computing characteristics possible, while first-in first-out (FIFO) scheduling scheme design technique is proposed for hardware-friendly FIFO resource arranging. In a synthesis chip realization via TSMC 40 nm CMOS technology, the hardware circuit only occupies core area of 0.2325 mm2 and dissipates 233.5 mW at maximal operating frequency of 250 MHz.Keywords: reconfigurable, fast Fourier transform (FFT), single-path delay feedback (SDF), 3GPP-LTE
Procedia PDF Downloads 2762780 Development of the Accelerator Applied to an Early Stage High-Strength Shotcrete
Authors: Ayanori Sugiyama, Takahisa Hanei, Yasuhide Higo
Abstract:
Domestic demand for the construction of tunnels has been increasing in recent years in Japan. To meet this demand, various construction materials and construction methods have been developed to attain higher strength, reduction of negative impact on the environment and improvement for working conditions. In this report, we would like to introduce the newly developed shotcrete with superior hardening properties which were tested through the actual machine scale and its workability and strength development were evaluated. As a result, this new tunnel construction method was found to achieve higher workability and quicker strength development in only a couple of minutes.Keywords: accelerator, shotcrete, tunnel, high-strength
Procedia PDF Downloads 3172779 Optimal Opportunistic Maintenance Policy for a Two-Unit System
Authors: Nooshin Salari, Viliam Makis, Jane Doe
Abstract:
This paper presents a maintenance policy for a system consisting of two units. Unit 1 is gradually deteriorating and is subject to soft failure. Unit 2 has a general lifetime distribution and is subject to hard failure. Condition of unit 1 of the system is monitored periodically and it is considered as failed when its deterioration level reaches or exceeds a critical level N. At the failure time of unit 2 system is considered as failed, and unit 2 will be correctively replaced by the next inspection epoch. Unit 1 or 2 are preventively replaced when deterioration level of unit 1 or age of unit 2 exceeds the related preventive maintenance (PM) levels. At the time of corrective or preventive replacement of unit 2, there is an opportunity to replace unit 1 if its deterioration level reaches the opportunistic maintenance (OM) level. If unit 2 fails in an inspection interval, system stops operating although unit 1 has not failed. A mathematical model is derived to find the preventive and opportunistic replacement levels for unit 1 and preventive replacement age for unit 2, that minimize the long run expected average cost per unit time. The problem is formulated and solved in the semi-Markov decision process (SMDP) framework. Numerical example is provided to illustrate the performance of the proposed model and the comparison of the proposed model with an optimal policy without opportunistic maintenance level for unit 1 is carried out.Keywords: condition-based maintenance, opportunistic maintenance, preventive maintenance, two-unit system
Procedia PDF Downloads 1982778 Design and Thermal Simulation Analysis of the Chinese Accelerator Driven Sub-Critical System Injector-I Cryomodule
Authors: Rui-Xiong Han, Rui Ge, Shao-Peng Li, Lin Bian, Liang-Rui Sun, Min-Jing Sang, Rui Ye, Ya-Ping Liu, Xiang-Zhen Zhang, Jie-Hao Zhang, Zhuo Zhang, Jian-Qing Zhang, Miao-Fu Xu
Abstract:
The Chinese Accelerator Driven Sub-critical system (C-ADS) uses a high-energy proton beam to bombard the metal target and generate neutrons to deal with the nuclear waste. The Chinese ADS proton linear has two 0~10 MeV injectors and one 10~1500 MeV superconducting linac. Injector-I is studied by the Institute of High Energy Physics (IHEP) under construction in the Beijing, China. The linear accelerator consists of two accelerating cryomodules operating at the temperature of 2 Kelvin. This paper describes the structure and thermal performances analysis of the cryomodule. The analysis takes into account all the main contributors (support posts, multilayer insulation, current leads, power couplers, and cavities) to the static and dynamic heat load at various cryogenic temperature levels. The thermal simulation analysis of the cryomodule is important theory foundation of optimization and commissioning.Keywords: C-ADS, cryomodule, structure, thermal simulation, static heat load, dynamic heat load
Procedia PDF Downloads 4002777 Embedded Semantic Segmentation Network Optimized for Matrix Multiplication Accelerator
Authors: Jaeyoung Lee
Abstract:
Autonomous driving systems require high reliability to provide people with a safe and comfortable driving experience. However, despite the development of a number of vehicle sensors, it is difficult to always provide high perceived performance in driving environments that vary from time to season. The image segmentation method using deep learning, which has recently evolved rapidly, provides high recognition performance in various road environments stably. However, since the system controls a vehicle in real time, a highly complex deep learning network cannot be used due to time and memory constraints. Moreover, efficient networks are optimized for GPU environments, which degrade performance in embedded processor environments equipped simple hardware accelerators. In this paper, a semantic segmentation network, matrix multiplication accelerator network (MMANet), optimized for matrix multiplication accelerator (MMA) on Texas instrument digital signal processors (TI DSP) is proposed to improve the recognition performance of autonomous driving system. The proposed method is designed to maximize the number of layers that can be performed in a limited time to provide reliable driving environment information in real time. First, the number of channels in the activation map is fixed to fit the structure of MMA. By increasing the number of parallel branches, the lack of information caused by fixing the number of channels is resolved. Second, an efficient convolution is selected depending on the size of the activation. Since MMA is a fixed, it may be more efficient for normal convolution than depthwise separable convolution depending on memory access overhead. Thus, a convolution type is decided according to output stride to increase network depth. In addition, memory access time is minimized by processing operations only in L3 cache. Lastly, reliable contexts are extracted using the extended atrous spatial pyramid pooling (ASPP). The suggested method gets stable features from an extended path by increasing the kernel size and accessing consecutive data. In addition, it consists of two ASPPs to obtain high quality contexts using the restored shape without global average pooling paths since the layer uses MMA as a simple adder. To verify the proposed method, an experiment is conducted using perfsim, a timing simulator, and the Cityscapes validation sets. The proposed network can process an image with 640 x 480 resolution for 6.67 ms, so six cameras can be used to identify the surroundings of the vehicle as 20 frame per second (FPS). In addition, it achieves 73.1% mean intersection over union (mIoU) which is the highest recognition rate among embedded networks on the Cityscapes validation set.Keywords: edge network, embedded network, MMA, matrix multiplication accelerator, semantic segmentation network
Procedia PDF Downloads 1282776 Lightweight Hardware Firewall for Embedded System Based on Bus Transactions
Authors: Ziyuan Wu, Yulong Jia, Xiang Zhang, Wanting Zhou, Lei Li
Abstract:
The Internet of Things (IoT) is a rapidly evolving field involving a large number of interconnected embedded devices. In the design of embedded System-on-Chip (SoC), the key issues are power consumption, performance, and security. However, the easy-to-implement software and untrustworthy third-party IP cores may threaten the safety of hardware assets. Considering that illegal access and malicious attacks against SoC resources pass through the bus that integrates IPs, we propose a Lightweight Hardware Firewall (LHF) to protect SoC, which monitors and disallows the offending bus transactions based on physical addresses. Furthermore, under the LHF architecture, this paper refines two types of firewalls: Destination Hardware Firewall (DHF) and Source Hardware Firewall (SHF). The former is oriented to fine-grained detection and configuration, whose core technology is based on the method of dynamic grading units. In addition, we design the SHF based on static entries to achieve lightweight. Finally, we evaluate the hardware consumption of the proposed method by both Field-Programmable Gate Array (FPGA) and IC. Compared with the exciting efforts, LHF introduces a bus latency of zero clock cycles for every read or write transaction implemented on Xilinx Kintex-7 FPGAs. Meanwhile, the DC synthesis results based on TSMC 90nm show that the area is reduced by about 25% compared with the previous method.Keywords: IoT, security, SoC, bus architecture, lightweight hardware firewall, FPGA
Procedia PDF Downloads 592775 Individual Actuators of a Car-Like Robot with Back Trailer
Authors: Tarek El-Derini, Ahmed El-Shenawy
Abstract:
This paper presents the hardware implemented and validation for a special system to assist the unprofessional users of car with back trailers. The system consists of two platforms; the front car platform (C) and the trailer platform (T). The main objective is to control the Trailer platform using the actuators found in the front platform (c). The mobility of the platform (C) is investigated and inverse and forward kinematics model is obtained for both platforms (C) and (T). The system is simulated using Matlab M-file and the simulation examples results illustrated the system performance. The system is constructed with a hardware setup for the front and trailer platform. The hardware experimental results and the simulated examples outputs showed the validation of the hardware setup.Keywords: kinematics, modeling, robot, MATLAB
Procedia PDF Downloads 4412774 Hardware Error Analysis and Severity Characterization in Linux-Based Server Systems
Authors: Nikolaos Georgoulopoulos, Alkis Hatzopoulos, Konstantinos Karamitsios, Konstantinos Kotrotsios, Alexandros I. Metsai
Abstract:
In modern server systems, business critical applications run in different types of infrastructure, such as cloud systems, physical machines and virtualization. Often, due to high load and over time, various hardware faults occur in servers that translate to errors, resulting to malfunction or even server breakdown. CPU, RAM and hard drive (HDD) are the hardware parts that concern server administrators the most regarding errors. In this work, selected RAM, HDD and CPU errors, that have been observed or can be simulated in kernel ring buffer log files from two groups of Linux servers, are investigated. Moreover, a severity characterization is given for each error type. Better understanding of such errors can lead to more efficient analysis of kernel logs that are usually exploited for fault diagnosis and prediction. In addition, this work summarizes ways of simulating hardware errors in RAM and HDD, in order to test the error detection and correction mechanisms of a Linux server.Keywords: hardware errors, Kernel logs, Linux servers, RAM, hard disk, CPU
Procedia PDF Downloads 1522773 Neutron Contamination in 18 MV Medical Linear Accelerator
Authors: Onur Karaman, A. Gunes Tanir
Abstract:
Photon radiation therapy used to treat cancer is one of the most important methods. However, photon beam collimator materials in Linear Accelerator (LINAC) head generally contains heavy elements is used and the interaction of bremsstrahlung photon with such heavy nuclei, the neutron can be produced inside the treatment rooms. In radiation therapy, neutron contamination contributes to the risk of secondary malignancies in patients, also physicians working in this field. Since the neutron is more dangerous than photon, it is important to determine neutron dose during radiotherapy treatment. In this study, it is aimed to analyze the effect of field size, distance from axis and depth on the amount of in-field and out-field neutron contamination for ElektaVmat accelerator with 18 MV nominal energy. The photon spectra at the distance of 75, 150, 225, 300 cm from target and on the isocenter of beam were scored for 5x5, 10x10, 20x20, 30x30 and 40x40 cm2 fields. Results demonstrated that the neutron spectra and dose are dependent on field size and distances. Beyond 225 cm of isocenter, the dependence of the neutron dose on field size is minimal. As a result, it is concluded that as the open field increases, neutron dose determined decreases. It is important to remember that when treating with high energy photons, the dose from contamination neutrons must be considered as it is much greater than the photon dose.Keywords: radiotherapy, neutron contamination, linear accelerators, photon
Procedia PDF Downloads 347