Application-Specific Instruction Sets Processor with Implicit Registers to Improve Register Bandwidth

Authors: Ginhsuan Li, Chiuyun Hung, Desheng Chen, Yiwen Wang

Abstract:

Application-Specific Instruction-set Processors (ASIPs) have become an important design choice for embedded systems because they offer runtime flexibility that custom ASIC solutions cannot provide. One major bottleneck in maximizing ASIP performance is the limited data bandwidth between the General Purpose Register File (GPRF) and the Application-Specific Instructions (ASIs). This paper presents Implicit Registers (IRs) to provide the required data bandwidth. An ASI input/output model is proposed to formulate the overhead of the additional data transfers between the GPRF and the IRs; based on this model, an IR allocation algorithm improves performance by minimizing the number of extra data transfer instructions. Experimental results show a speedup of up to 3.33x compared with the results obtained without IRs.
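
To make the transfer overhead concrete, the following is a minimal sketch of a toy port model, not the paper's actual ASI Input/Output model or allocation algorithm. It assumes that operands beyond the GPRF read ports must be moved into IRs one instruction at a time, and that results beyond the write ports must be moved back the same way. The names `ASI`, `extra_transfers`, `in_ir`, and `kept_in_ir` are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ASI:
    """Hypothetical description of one application-specific instruction."""
    name: str
    inputs: tuple    # names of the values the ASI reads
    outputs: tuple   # names of the values the ASI produces


def extra_transfers(asi, read_ports, write_ports,
                    in_ir=frozenset(), kept_in_ir=frozenset()):
    """Count the extra move instructions one ASI needs (toy model, an assumption).

    Inputs already held in an implicit register (in_ir) cost nothing.
    Of the remaining inputs, up to read_ports come straight from the GPRF;
    each one beyond that needs one GPRF-to-IR move.
    Outputs that stay in an IR for a later ASI (kept_in_ir) cost nothing.
    Of the remaining outputs, up to write_ports go straight to the GPRF;
    each one beyond that needs one IR-to-GPRF move.
    """
    pending_in = [v for v in asi.inputs if v not in in_ir]
    pending_out = [v for v in asi.outputs if v not in kept_in_ir]
    moves_in = max(0, len(pending_in) - read_ports)
    moves_out = max(0, len(pending_out) - write_ports)
    return moves_in + moves_out


if __name__ == "__main__":
    mac4 = ASI("mac4", inputs=("a", "b", "c", "d"), outputs=("y",))
    # Two read ports, one write port, nothing preloaded: two inputs need moves.
    print(extra_transfers(mac4, read_ports=2, write_ports=1))  # 2
    # If an earlier ASI left c and d in IRs, no extra moves are needed.
    print(extra_transfers(mac4, 2, 1, in_ir=frozenset({"c", "d"})))  # 0
```

Under this simplified view, an allocation that keeps values in IRs across consecutive ASIs lowers the move count, which is the quantity the abstract says the allocation algorithm minimizes.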

Keywords: Application-Specific Instruction-set Processors, data bandwidth, configurable processor, implicit register.

Digital Object Identifier (DOI): doi.org/10.5281/zenodo.1081267

